We're excited to announce alpha support for changed block tracking. This enhances the Kubernetes storage ecosystem by giving CSI storage drivers an efficient way to identify changed blocks in PersistentVolume snapshots. With a driver that supports the feature, you can benefit from faster and more resource-efficient backup operations.
If you're eager to try this feature, you can skip to the Getting Started section.
Changed block tracking enables storage systems to identify and track modifications at the block level between snapshots, eliminating the need to scan entire volumes during backup operations. The improvement is a change to the Container Storage Interface (CSI), and also to the storage support in Kubernetes itself. With the alpha feature enabled, your cluster can:
For Kubernetes users managing large datasets, this API enables significantly more efficient backup processes. Backup applications can now focus only on the blocks that have changed, rather than processing entire volumes.
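To illustrate why a changed-block list helps, here is a minimal sketch of an incremental backup over in-memory data. It does not call any Kubernetes or CSI API: the function `apply_changed_blocks`, the 4-byte `BLOCK_SIZE`, and the hard-coded `changed` offsets are all hypothetical stand-ins for what a backup application would obtain from a changed-block query.

```python
from typing import List

BLOCK_SIZE = 4  # bytes per block; unrealistically small so the example stays readable


def apply_changed_blocks(previous_backup: bytes, snapshot: bytes,
                         changed_offsets: List[int]) -> bytes:
    """Build a new backup image from the previous one plus the changed blocks only."""
    image = bytearray(previous_backup)
    for offset in changed_offsets:
        # Only the blocks reported as changed are read from the new snapshot.
        image[offset:offset + BLOCK_SIZE] = snapshot[offset:offset + BLOCK_SIZE]
    return bytes(image)


previous_backup = b"AAAABBBBCCCCDDDD"
new_snapshot = b"AAAAXXXXCCCCYYYY"
changed = [4, 12]  # assumed result of a changed-block query (byte offsets)

new_backup = apply_changed_blocks(previous_backup, new_snapshot, changed)
bytes_transferred = len(changed) * BLOCK_SIZE
```

In this toy case the backup reads 8 bytes instead of all 16, and the resulting image matches the new snapshot. The saving grows with volume size when only a small fraction of blocks change between snapshots.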
As Kubernetes adoption grows for stateful workloads managing critical data, the need for efficient backup solutions becomes increasingly important. Traditional full backup approaches face challenges with:
The Changed Block Tracking API addresses these challenges by providing native Kubernetes support for incremental backup capabilities through the CSI interface.
The implementation consists of three primary components:
If you're an author of a storage integration with Kubernetes and want to support the changed block tracking feature, you must meet the following requirements:
Implement CSI RPCs: Storage providers need to implement the SnapshotMetadata service as defined in the CSI specifications protobuf. This service requires server-side streaming implementations for the following RPCs:
GetMetadataAllocated: For identifying allocated blocks in a snapshot
GetMetadataDelta: For determining changed blocks between two snapshots
Storage backend capabilities: Ensure the storage backend has the capability to track and report block-level changes.
Deploy external components: Integrate with the external-snapshot-metadata sidecar to expose the snapshot metadata service.
Register custom resource: Register the SnapshotMetadataService resource using a CustomResourceDefinition and create a SnapshotMetadataService custom resource that advertises the availability of the metadata service and provides connection details.
Support error handling: Implement proper error handling for these RPCs according to the CSI specification requirements.
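The difference between the two RPCs can be sketched with a toy model. The code below is not the CSI API or a driver implementation: `metadata_allocated` and `metadata_delta` are hypothetical Python generators that only mimic the kind of result each RPC streams back, as (byte offset, size) pairs. Two simplifications are assumed: a block counts as "allocated" if it holds any non-zero byte, and the delta is found by comparing bytes, whereas a real storage backend would report changes it has tracked rather than rescanning data.

```python
from typing import Iterator, Tuple

BLOCK_SIZE = 4  # bytes per block; tiny for illustration only
Extent = Tuple[int, int]  # (byte_offset, size_bytes)


def blocks(data: bytes) -> Iterator[Tuple[int, bytes]]:
    """Split a snapshot image into (offset, chunk) pairs of BLOCK_SIZE bytes."""
    for offset in range(0, len(data), BLOCK_SIZE):
        yield offset, data[offset:offset + BLOCK_SIZE]


def metadata_allocated(snapshot: bytes) -> Iterator[Extent]:
    """Yield an extent for every block that holds data (toy rule: any non-zero byte)."""
    for offset, chunk in blocks(snapshot):
        if chunk != bytes(len(chunk)):
            yield offset, len(chunk)


def metadata_delta(base: bytes, target: bytes) -> Iterator[Extent]:
    """Yield an extent for every block that differs between two equally sized snapshots."""
    for (offset, old), (_, new) in zip(blocks(base), blocks(target)):
        if old != new:
            yield offset, len(new)


base_snapshot = b"AAAA" + bytes(4) + b"CCCC" + bytes(4)
target_snapshot = b"AAAA" + b"BBBB" + b"CCCC" + bytes(4)

allocated = list(metadata_allocated(target_snapshot))
delta = list(metadata_delta(base_snapshot, target_snapshot))
```

Here `allocated` covers the three blocks that hold data in the target snapshot, while `delta` contains only the one block written after the base snapshot was taken. Both functions are generators, which loosely mirrors the server-side streaming shape of the RPCs.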
A backup solution looking to leverage this feature must:
Set up authentication: The backup application must provide a Kubernetes ServiceAccount token when using the Kubernetes SnapshotMetadataService API. Appropriate access grants, such as RBAC RoleBindings, must be established to authorize the backup application ServiceAccount to obtain such tokens.
Implement streaming client-side code: Develop clients that implement the streaming gRPC APIs defined in the schema.proto file. Specifically:
GetMetadataAllocated and GetMetadataDelta methods
SnapshotMetadataResponse message format with proper error handling
The external-snapshot-metadata GitHub repository provides a convenient iterator support package to simplify client implementation.
Handle large dataset streaming: Design clients to efficiently handle large streams of block metadata that could be returned for volumes with significant changes.
Optimize backup processes: Modify backup workflows to use the changed block metadata so that only changed blocks are identified and transferred, reducing both backup duration and resource consumption.
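The last two points can be sketched together. The code below is a minimal illustration, not the gRPC client or the iterator package from the external-snapshot-metadata repository: `fake_stream`, `coalesce`, and `copy_changed` are hypothetical helpers, and the stream is simulated with plain Python lists. It assumes extents arrive ordered by ascending byte offset. The idea is to process the stream lazily, merge adjacent extents so that fewer and larger reads are issued, and copy only those ranges.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Extent = Tuple[int, int]  # (byte_offset, size_bytes)


def fake_stream(batches: List[List[Extent]]) -> Iterator[Extent]:
    """Stand-in for a streamed response: yields extents one batch at a time."""
    for batch in batches:
        yield from batch


def coalesce(extents: Iterable[Extent]) -> Iterator[Extent]:
    """Merge adjacent extents from an offset-ordered stream.

    Only one pending extent is held in memory, so the full stream is never buffered.
    """
    current: Optional[Extent] = None
    for offset, size in extents:
        if current is not None and current[0] + current[1] == offset:
            current = (current[0], current[1] + size)  # extend the pending range
        else:
            if current is not None:
                yield current
            current = (offset, size)
    if current is not None:
        yield current


def copy_changed(source: bytes, target: bytearray, extents: Iterable[Extent]) -> int:
    """Copy only the listed ranges from source into target; return bytes copied."""
    copied = 0
    for offset, size in coalesce(extents):
        target[offset:offset + size] = source[offset:offset + size]
        copied += size
    return copied


batches = [[(0, 4), (4, 4)], [(8, 4), (20, 4)]]  # made-up extents in two batches
source = bytes(range(24))   # pretend snapshot contents
target = bytearray(24)      # pretend backup destination, initially zeroed
copied = copy_changed(source, target, fake_stream(batches))
```

With these made-up extents, four reported blocks collapse into two ranges and 16 of the 24 bytes are copied. A real client would replace `fake_stream` with the streamed responses from the metadata service and would also need to handle stream errors and retries.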
To use changed block tracking in your cluster:
external-snapshot-metadata sidecar
The API provides two main functions:
GetMetadataAllocated: Lists blocks allocated in a single snapshot
GetMetadataDelta: Lists blocks changed between two snapshots
Depending on feedback and adoption, the Kubernetes developers hope to promote the CSI Snapshot Metadata implementation to beta in a future release.
For those interested in trying out this new feature:
external-snapshot-metadata
This project, like all of Kubernetes, is the result of hard work by many contributors from diverse backgrounds working together. On behalf of SIG Storage, I would like to offer a huge thank you to the contributors who helped review the design and implementation of the project, including but not limited to the following:
Thanks also to everyone who has contributed to the project, including others who helped review the KEP and the CSI spec PR.
For those interested in getting involved with the design and development of CSI or any part of the Kubernetes Storage system, join the Kubernetes Storage Special Interest Group (SIG). We always welcome new contributors.
The SIG also holds regular Data Protection Working Group meetings. New attendees are welcome to join our discussions.