This article is more than one year old. Older articles may contain outdated content. Check that the information in the page has not become incorrect since its publication.
Kubernetes 1.25 brings cgroup v2 to GA (general availability), letting the kubelet use the latest container resource management capabilities.
Effective resource management is a critical aspect of Kubernetes. This involves managing the finite resources in your nodes, such as CPU, memory, and storage.
cgroups are a Linux kernel capability that establish resource management functionality like limiting CPU usage or setting memory limits for running processes.
When you use the resource management capabilities in Kubernetes, such as configuring requests and limits for Pods and containers, Kubernetes uses cgroups to enforce your resource requests and limits.
The Linux kernel offers two versions of cgroups: cgroup v1 and cgroup v2.
cgroup v2 is the latest version of the Linux cgroup API. cgroup v2 provides a unified control system with enhanced resource management capabilities.
cgroup v2 has been in development in the Linux Kernel since 2016 and in recent years has matured across the container ecosystem. With Kubernetes 1.25, cgroup v2 support has graduated to general availability.
Many recent releases of Linux distributions have switched over to cgroup v2 by default so it's important that Kubernetes continues to work well on these new updated distros.
cgroup v2 offers several improvements over cgroup v1, such as the following:
Some Kubernetes features exclusively use cgroup v2 for enhanced resource management and isolation. For example, the MemoryQoS feature improves memory utilization and relies on cgroup v2 functionality to enable it. New resource management features in the kubelet will also take advantage of the new cgroup v2 features moving forward.
Many Linux distributions are switching to cgroup v2 by default; you might start using it the next time you update the Linux version of your control plane and nodes!
Using a Linux distribution that uses cgroup v2 by default is the recommended method. Some of the popular Linux distributions that use cgroup v2 include the following:
To check if your distribution uses cgroup v2 by default, refer to Check your cgroup version or consult your distribution's documentation.
If you're using a managed Kubernetes offering, consult your provider to determine how they're adopting cgroup v2, and whether you need to take action.
To use cgroup v2 with Kubernetes, you must meet the following requirements:
The kubelet and container runtime use a cgroup driver to set cgroup parameters. When using cgroup v2, it's strongly recommended that both the kubelet and your container runtime use the systemd cgroup driver, so that there's a single cgroup manager on the system. To configure the kubelet and the container runtime to use the driver, refer to the systemd cgroup driver documentation.
When you run Kubernetes with a Linux distribution that enables cgroup v2, the kubelet should automatically adapt without any additional configuration required, as long as you meet the requirements.
In most cases, you won't see a difference in the user experience when you switch to using cgroup v2 unless your users access the cgroup file system directly.
If you have applications that access the cgroup file system directly, either on the node or from inside a container, you must update the applications to use the cgroup v2 API instead of the cgroup v1 API.
Scenarios in which you might need to update to cgroup v2 include the following:
Your feedback is always welcome! SIG Node meets regularly and are available in
the #sig-node channel in the Kubernetes Slack, or
using the SIG mailing list.
cgroup v2 has had a long journey and is a great example of open source community collaboration across the industry because it required work across the stack, from the Linux Kernel to systemd to various container runtimes, and (of course) Kubernetes.
We would like to thank Giuseppe Scrivano who initiated cgroup v2 support in Kubernetes, and reviews and leadership from the SIG Node community including chairs Dawn Chen and Derek Carr.
We'd also like to thank the maintainers of container runtimes like Docker, containerd and CRI-O, and the maintainers of components like cAdvisor and runc, libcontainer, which underpin many container runtimes. Finally, this wouldn't have been possible without support from systemd and upstream Linux Kernel maintainers.
It's a team effort!