New to KubeDB? Please start here.
Tiered Storage
This tutorial will show you how to use KubeDB to run Kafka with Tiered Storage. Kafka Tiered Storage is a feature that separates hot and cold data by storing recent Kafka log segments on local broker disks and automatically offloading older segments to remote object storage (like S3, GCS, or Azure Blob).
Before You Begin
At first, you need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using kind.
Now, install the KubeDB operator in your cluster following the steps here.
To keep things isolated, this tutorial uses a separate namespace called demo throughout.
$ kubectl create namespace demo
namespace/demo created
$ kubectl get namespace
NAME STATUS AGE
demo Active 9s
Note: YAML files used in this tutorial are stored in examples/kafka/tiered-storage/ folder in GitHub repository kubedb/docs.
Note: This tutorial only works with Kafka version 4.0.0 and onwards.
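You can check which Kafka versions your installed operator supports (assuming the KubeDB catalog is installed in your cluster; pick 4.0.0 or later):

```shell
# List Kafka versions registered in the KubeDB catalog
kubectl get kafkaversion
```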
Create Secret for S3
Before creating a Kafka cluster with S3 tiered storage, you need to create a secret containing the AWS access key and secret key. Here’s an example secret:
apiVersion: v1
kind: Secret
metadata:
name: aws-secret
namespace: demo
type: Opaque
stringData:
accessKeyId: YOUR_ACCESS_KEY_ID
secretAccessKey: YOUR_SECRET_ACCESS_KEY
Apply the secret:
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2026.2.26/docs/examples/kafka/tiered-storage/kafka-s3-tiered-secret.yaml
secret/aws-secret created
Create a Kafka Cluster with S3-compatible Tiered Storage
Here is an example Kafka CR that uses Tiered Storage with S3 compatible storage:
apiVersion: kubedb.com/v1
kind: Kafka
metadata:
name: kafka-prod-tiered
namespace: demo
spec:
version: 4.0.0
tieredStorage:
provider: s3
s3:
bucket: kafka
endpoint: http://minio.demo.svc.cluster.local:80
region: us-east-1
secretName: aws-secret
prefix: tiered-storage-demo/
topology:
broker:
replicas: 3
storage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: standard
controller:
replicas: 3
storage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: standard
storageType: Durable
deletionPolicy: WipeOut
Here,
- `spec.tieredStorage` specifies the tiered storage configuration for the Kafka cluster.
- `spec.tieredStorage.provider` specifies the tiered storage provider. Here, it is set to `s3`.
- `spec.tieredStorage.s3` specifies the S3-compatible storage configuration.
- `spec.tieredStorage.s3.bucket` specifies the S3 bucket name.
- `spec.tieredStorage.s3.endpoint` specifies the S3 endpoint URL.
- `spec.tieredStorage.s3.region` specifies the S3 region.
- `spec.tieredStorage.s3.secretName` specifies the name of the secret that contains the S3 access key and secret key.
- `spec.tieredStorage.s3.prefix` specifies the prefix for the S3 objects.
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2026.2.26/docs/examples/kafka/tiered-storage/kafka-s3-tiered.yaml
kafka.kubedb.com/kafka-prod-tiered created
$ kubectl get kafka -n demo -w
NAME TYPE VERSION STATUS AGE
kafka-prod-tiered kubedb.com/v1 4.0.0 Provisioning 2s
kafka-prod-tiered kubedb.com/v1 4.0.0 Provisioning 4s
.
.
kafka-prod-tiered kubedb.com/v1 4.0.0 Ready 112s
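Once the CR reports Ready, you can sanity-check the pods KubeDB created for the brokers and controllers (the label selector below is an assumption based on common KubeDB conventions):

```shell
# Broker and controller pods for the tiered cluster
kubectl get pods -n demo -l app.kubernetes.io/instance=kafka-prod-tiered
```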
Exec into one of the broker pods and run the following commands to create a tiered-storage-enabled topic and insert some data into it:
$ kubectl exec -n demo -it kafka-prod-tiered-broker-0 -- bash
root@kafka-prod-tiered-broker-0:/# kafka-topics.sh \
--bootstrap-server localhost:9092 \
--create \
--config remote.storage.enable=true \
--config retention.ms=-1 \
--config segment.bytes=1048576 \
--config retention.bytes=104857600 \
--config local.retention.bytes=1 \
--partitions 1 \
--replication-factor 1 \
--topic topic1 \
--command-config config/clientauth.properties
topic1 created
root@kafka-prod-tiered-broker-0:/# kafka-producer-perf-test.sh \
--producer-props bootstrap.servers=localhost:9092 \
--topic topic1 \
--num-records 10000 \
--record-size 512 \
--throughput 1000 \
--producer.config config/clientauth.properties
4998 records sent, 999.2 records/sec (0.49 MB/sec), 13.5 ms avg latency, 526.0 ms max latency.
10000 records sent, 999.3 records/sec (0.49 MB/sec), 8.51 ms avg latency, 526.00 ms max latency, 4 ms 50th, 50 ms 95th, 92 ms 99th, 92 ms 99.9th.
Here, we created the topic with `local.retention.bytes=1`, which forces Kafka to offload segments to remote tiered storage as soon as possible. You can check the S3 bucket to see the offloaded segments.
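To confirm the retention settings took effect on the topic, you can describe its configuration from inside the broker pod using the standard Kafka CLI (same client config file as above):

```shell
# Shows remote.storage.enable, local.retention.bytes, etc. for topic1
kafka-configs.sh \
  --bootstrap-server localhost:9092 \
  --entity-type topics \
  --entity-name topic1 \
  --describe \
  --command-config config/clientauth.properties
```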
In this example, we are using an S3-compatible storage (MinIO). You can verify the offloaded segments using the `mc` (MinIO Client) command:
mc ls --recursive local/kafka
Example output:
[2026-02-19 23:42:14 +06] 2.0KiB STANDARD tiered-storage-demo/topic1-FzanGxsCRj6eR4xkkImQ9g/0/00000000000000000000-1qL0uiXzTrWBm-07BoNlnw.indexes
[2026-02-19 23:42:14 +06] 1016KiB STANDARD tiered-storage-demo/topic1-FzanGxsCRj6eR4xkkImQ9g/0/00000000000000000000-1qL0uiXzTrWBm-07BoNlnw.log
[2026-02-19 23:42:14 +06] 736B STANDARD tiered-storage-demo/topic1-FzanGxsCRj6eR4xkkImQ9g/0/00000000000000000000-1qL0uiXzTrWBm-07BoNlnw.rsm-manifest
...
These files confirm that older Kafka log segments have been offloaded to MinIO (remote storage).
Now check the local log directory inside the broker pod:
ls -lh /var/log/kafka/0/topic1-0
Example output:
total 104K
-rw-r--r-- 1 kafka kafka 10M Feb 19 17:42 00000000000000009844.index
-rw-r--r-- 1 kafka kafka 81K Feb 19 17:42 00000000000000009844.log
-rw-r--r-- 1 kafka kafka 56 Feb 19 17:42 00000000000000009844.snapshot
-rw-r--r-- 1 kafka kafka 10M Feb 19 17:42 00000000000000009844.timeindex
-rw-r--r-- 1 kafka kafka 8 Feb 19 17:41 leader-epoch-checkpoint
-rw-r--r-- 1 kafka kafka 43 Feb 19 17:41 partition.metadata
Notice that older segments are no longer present locally — they exist only in remote storage.
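Even though the early segments are gone locally, the topic's earliest offset should still be 0, because those records remain readable from remote storage. You can check this from inside the broker pod (`kafka-get-offsets.sh` ships with Kafka 3.0+):

```shell
# --time -2 asks for the earliest available offset per partition
kafka-get-offsets.sh \
  --bootstrap-server localhost:9092 \
  --topic topic1 \
  --time -2 \
  --command-config config/clientauth.properties
```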
To consume data that has been offloaded, run the following inside the broker pod:
kafka-console-consumer.sh \
--bootstrap-server localhost:9092 \
--topic topic1 \
--from-beginning \
--timeout-ms 15000 \
--consumer.config config/clientauth.properties
Since the requested offsets are no longer available on local disk, Kafka must retrieve them from remote storage.
What Happens Internally
- The consumer requests offset 0.
- The broker checks local storage for the required segment.
- The segment is not found locally.
- The broker fetches the segment from MinIO (remote storage).
- The broker uses the remote log index cache.
- The data is served to the consumer transparently.
The entire process is handled automatically, and the consumer is unaware whether the data came from local or remote storage.
Note: You can set `local.retention.ms` instead of `local.retention.bytes` to offload segments based on time.
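Topic retention can also be changed after creation with the standard Kafka CLI. For example, to keep segments on local disk for 10 minutes before they become eligible for deletion once offloaded, run the following inside a broker pod (the 600000 ms value is just an illustration):

```shell
# Switch topic1 to time-based local retention (10 minutes)
kafka-configs.sh \
  --bootstrap-server localhost:9092 \
  --entity-type topics \
  --entity-name topic1 \
  --alter \
  --add-config local.retention.ms=600000 \
  --command-config config/clientauth.properties
```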
Create a Kafka Cluster with Azure-compatible Tiered Storage
Here is an example Kafka CR that uses Tiered Storage with Azure compatible storage:
apiVersion: kubedb.com/v1
kind: Kafka
metadata:
name: kafka-prod-tiered
namespace: demo
spec:
version: 4.0.0
tieredStorage:
provider: azure
azure:
container: kafka
secretName: azure-secret
prefix: tiered-storage-demo/
storageAccount: demo-account
topology:
broker:
replicas: 3
storage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: standard
controller:
replicas: 3
storage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: standard
storageType: Durable
deletionPolicy: WipeOut
Here,
- `spec.tieredStorage` specifies the tiered storage configuration for the Kafka cluster.
- `spec.tieredStorage.provider` specifies the tiered storage provider. Here, it is set to `azure`.
- `spec.tieredStorage.azure` specifies the Azure-compatible storage configuration.
- `spec.tieredStorage.azure.container` specifies the Azure container name.
- `spec.tieredStorage.azure.secretName` specifies the name of the secret that contains the Azure storage account key.
- `spec.tieredStorage.azure.prefix` specifies the prefix for the Azure blobs.
- `spec.tieredStorage.azure.storageAccount` specifies the Azure storage account name.
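The CR above references a secret named `azure-secret`, which must exist before the cluster is created. As a sketch modeled on the S3 example (the `stringData` key name below is an assumption, not confirmed by this tutorial; check the KubeDB reference for the exact keys expected), it can look like:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: azure-secret
  namespace: demo
type: Opaque
stringData:
  # Key name is an assumption -- consult the KubeDB reference
  accountKey: YOUR_AZURE_STORAGE_ACCOUNT_KEY
```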
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2026.2.26/docs/examples/kafka/tiered-storage/kafka-azure-tiered.yaml
kafka.kubedb.com/kafka-prod-tiered created
$ kubectl get kafka -n demo -w
NAME TYPE VERSION STATUS AGE
kafka-prod-tiered kubedb.com/v1 4.0.0 Provisioning 2s
kafka-prod-tiered kubedb.com/v1 4.0.0 Provisioning 4s
.
.
kafka-prod-tiered kubedb.com/v1 4.0.0 Ready 112s
Create a Kafka Cluster with GCS-compatible Tiered Storage
Here is an example Kafka CR that uses Tiered Storage with GCS compatible storage:
apiVersion: kubedb.com/v1
kind: Kafka
metadata:
name: kafka-prod-tiered
namespace: demo
spec:
version: 4.0.0
tieredStorage:
provider: gcs
gcs:
bucket: test-bucket
secretName: gcs-secret
prefix: tiered-storage-demo/
topology:
broker:
replicas: 3
storage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: standard
controller:
replicas: 3
storage:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
storageClassName: standard
storageType: Durable
deletionPolicy: WipeOut
Here,
- `spec.tieredStorage` specifies the tiered storage configuration for the Kafka cluster.
- `spec.tieredStorage.provider` specifies the tiered storage provider. Here, it is set to `gcs`.
- `spec.tieredStorage.gcs` specifies the GCS-compatible storage configuration.
- `spec.tieredStorage.gcs.bucket` specifies the GCS bucket name.
- `spec.tieredStorage.gcs.secretName` specifies the name of the secret that contains the GCS service account key.
- `spec.tieredStorage.gcs.prefix` specifies the prefix for the GCS objects.
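The CR above references a secret named `gcs-secret`, which must exist before the cluster is created. As a sketch (the `stringData` key name below is an assumption, not confirmed by this tutorial; check the KubeDB reference for the exact keys expected), it can look like:

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: gcs-secret
  namespace: demo
type: Opaque
stringData:
  # Key name is an assumption -- consult the KubeDB reference.
  # The value is a GCP service account key in JSON format.
  sa.json: |
    PASTE_YOUR_SERVICE_ACCOUNT_KEY_JSON_HERE
```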
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2026.2.26/docs/examples/kafka/tiered-storage/kafka-gcs-tiered.yaml
kafka.kubedb.com/kafka-prod-tiered created
$ kubectl get kafka -n demo -w
NAME TYPE VERSION STATUS AGE
kafka-prod-tiered kubedb.com/v1 4.0.0 Provisioning 2s
kafka-prod-tiered kubedb.com/v1 4.0.0 Provisioning 4s
.
.
kafka-prod-tiered kubedb.com/v1 4.0.0 Ready 112s
Next Steps
- Quickstart Kafka with KubeDB Operator.
- Quickstart ConnectCluster with KubeDB Operator.
- Use kubedb cli to manage databases like kubectl for Kubernetes.
- Detail concepts of ConnectCluster object.
- Want to hack on KubeDB? Check our contribution guidelines.