This article is more than one year old. Older articles may contain outdated content. Check that the information in the page has not become incorrect since its publication.
If you have deployed Kubernetes pods with CPU and/or memory resources specified, you may have noticed that changing the resource values involves restarting the pod. This has been a disruptive operation for running workloads... until now.
In Kubernetes v1.27, we have added a new alpha feature that allows users
to resize CPU/memory resources allocated to pods without restarting the
containers. To facilitate this, the resources field in a pod's containers
now allow mutation for cpu and memory resources. They can be changed
simply by patching the running pod spec.
This also means that resources field in the pod spec can no longer be
relied upon as an indicator of the pod's actual resources. Monitoring tools
and other such applications must now look at new fields in the pod's status.
Kubernetes queries the actual CPU and memory requests and limits enforced on
the running containers via a CRI (Container Runtime Interface) API call to the
runtime, such as containerd, which is responsible for running the containers.
The response from container runtime is reflected in the pod's status.
In addition, a new restartPolicy for resize has been added. It gives users
control over how their containers are handled when resources are resized.
Besides the addition of resize policy in the pod's spec, a new field named
allocatedResources has been added to containerStatuses in the pod's status.
This field reflects the node resources allocated to the pod's containers.
In addition, a new field called resources has been added to the container's
status. This field reflects the actual resource requests and limits configured
on the running containers as reported by the container runtime.
Lastly, a new field named resize has been added to the pod's status to show the
status of the last requested resize. A value of Proposed is an acknowledgement
of the requested resize and indicates that request was validated and recorded. A
value of InProgress indicates that the node has accepted the resize request
and is in the process of applying the resize request to the pod's containers.
A value of Deferred means that the requested resize cannot be granted at this
time, and the node will keep retrying. The resize may be granted when other pods
leave and free up node resources. A value of Infeasible is a signal that the
node cannot accommodate the requested resize. This can happen if the requested
resize exceeds the maximum resources the node can ever allocate for a pod.
Here are a few examples where this feature may be useful:
In order to use this feature in v1.27, the InPlacePodVerticalScaling
feature gate must be enabled. A local cluster with this feature enabled
can be started as shown below:
root@vbuild:~/go/src/k8s.io/kubernetes# FEATURE_GATES=InPlacePodVerticalScaling=true ./hack/local-up-cluster.sh
go version go1.20.2 linux/arm64
+++ [0320 13:52:02] Building go targets for linux/arm64
k8s.io/kubernetes/cmd/kubectl (static)
k8s.io/kubernetes/cmd/kube-apiserver (static)
k8s.io/kubernetes/cmd/kube-controller-manager (static)
k8s.io/kubernetes/cmd/cloud-controller-manager (non-static)
k8s.io/kubernetes/cmd/kubelet (non-static)
...
...
Logs:
/tmp/etcd.log
/tmp/kube-apiserver.log
/tmp/kube-controller-manager.log
/tmp/kube-proxy.log
/tmp/kube-scheduler.log
/tmp/kubelet.log
To start using your cluster, you can open up another terminal/tab and run:
export KUBECONFIG=/var/run/kubernetes/admin.kubeconfig
cluster/kubectl.sh
Alternatively, you can write to the default kubeconfig:
export KUBERNETES_PROVIDER=local
cluster/kubectl.sh config set-cluster local --server=https://localhost:6443 --certificate-authority=/var/run/kubernetes/server-ca.crt
cluster/kubectl.sh config set-credentials myself --client-key=/var/run/kubernetes/client-admin.key --client-certificate=/var/run/kubernetes/client-admin.crt
cluster/kubectl.sh config set-context local --cluster=local --user=myself
cluster/kubectl.sh config use-context local
cluster/kubectl.sh
Once the local cluster is up and running, Kubernetes users can schedule pods with resources, and resize the pods via kubectl. An example of how to use this feature is illustrated in the following demo video.
In this scenario, developers or development teams write their code locally but build and test their code in Kubernetes pods with consistent configs that reflect production use. Such pods need minimal resources when the developers are writing code, but need significantly more CPU and memory when they build their code or run a battery of tests. This use case can leverage in-place pod resize feature (with a little help from eBPF) to quickly resize the pod's resources and avoid kernel OOM (out of memory) killer from terminating their processes.
This KubeCon North America 2022 conference talk illustrates the use case.
Some Java applications may need significantly more CPU during initialization than what is needed during normal process operation time. If such applications specify CPU requests and limits suited for normal operation, they may suffer from very long startup times. Such pods can request higher CPU values at the time of pod creation, and can be resized down to normal running needs once the application has finished initializing.
This feature enters v1.27 at alpha stage. Below are a few known issues users may encounter:
InProgress state, and resources field in the pod's
status are never updated even though the new resources may have been enacted
on the running containers.This feature is a result of the efforts of a very collaborative Kubernetes community. Here's a little shoutout to just a few of the many many people that contributed countless hours of their time and helped make this happen.
And a big thanks to my very supportive management Dr. Xiaoning Ding and Dr. Ying Xiong for their patience and encouragement.