Backup and Restore Cassandra database using KubeStash
KubeStash allows you to back up and restore Cassandra databases. It supports backups for Cassandra instances running in Standalone and Cluster configurations. KubeStash makes managing your Cassandra backups and restorations more straightforward and efficient.
This guide will show you how to back up and restore your Cassandra databases using KubeStash.
Before You Begin
- At first, you need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using Minikube or Kind.
- Install KubeDB in your cluster following the steps here.
- Install KubeStash in your cluster following the steps here.
- Install the KubeStash kubectl plugin following the steps here.
- If you are not familiar with how KubeStash backs up and restores Cassandra databases, please check the following guide here.
You should be familiar with the following KubeStash concepts:
To keep everything isolated, we are going to use a separate namespace called demo throughout this tutorial.
$ kubectl create ns demo
namespace/demo created
Note: YAML files used in this tutorial are stored in docs/guides/cassandra/backup/kubestash/logical/examples directory of kubedb/docs repository.
Backup Cassandra
KubeStash supports backups for Cassandra instances across different configurations, including Standalone and Cluster setups. In this demonstration, we’ll focus on a Cassandra database running in Cluster mode. The backup and restore process is similar for Standalone configurations.
This section will demonstrate how to back up a Cassandra database. Here, we are going to deploy a Cassandra database using KubeDB. Then, we are going to back up this database into an S3 bucket. Finally, we are going to restore the backed up data into another Cassandra database.
Create Cassandra License Secret
A Cassandra license is required to create the Cassandra database. Ensure that you have acquired a license, then pass it to the database through a Secret.
Deploy Sample Cassandra Database
Let’s deploy a sample Cassandra database and insert some data into it.
Create Cassandra CR:
Below is the YAML of the sample Cassandra CR that we are going to create for this tutorial:
apiVersion: kubedb.com/v1alpha2
kind: Cassandra
metadata:
  name: cas-sample
  namespace: demo
spec:
  version: 5.0.3
  configuration:
  topology:
    rack:
      - name: r0
        replicas: 2
        podTemplate:
          spec:
            containers:
              - name: cassandra
                resources:
                  limits:
                    memory: 2Gi
                    cpu: 2
                  requests:
                    memory: 1Gi
                    cpu: 1
        storage:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
        storageType: Durable
  deletionPolicy: WipeOut
Here,
- spec.version is the name of the CassandraVersion CRD where the docker images are specified. In this tutorial, a Cassandra 5.0.3 database is going to be created.
- spec.topology specifies that the database will run in cluster mode. If this field is nil, it will run in standalone mode.
- spec.storageType specifies the type of storage that will be used for the Cassandra database. It can be Durable or Ephemeral. The default value of this field is Durable. If Ephemeral is used, KubeDB will create the Cassandra database using an EmptyDir volume. In this case, you don’t have to specify the spec.storage field. This is useful for testing purposes.
- spec.deletionPolicy gives flexibility whether to nullify (reject) the delete operation of the Cassandra CR, or which resources KubeDB should keep or delete when you delete the Cassandra CR. If the admission webhook is enabled, it prevents users from deleting the database as long as spec.deletionPolicy is set to DoNotTerminate. Learn details of all DeletionPolicy here.
Note: The spec.storage section is used to create a PVC for the database pod. It will create a PVC with the storage size specified in the storage.resources.requests field. Don’t specify limits here. A PVC does not get resized automatically.
Create the above Cassandra CR,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.8.31/docs/guides/cassandra/backup/kubestash/logical/examples/cas-sample.yaml
cassandra.kubedb.com/cas-sample created
KubeDB will deploy a Cassandra database according to the above specification. It will also create the necessary Secrets and Services to access the database.
Let’s check if the database is ready to use,
$ kubectl get cassandras.kubedb.com -n demo
NAME TYPE VERSION STATUS AGE
cas-sample kubedb.com/v1alpha2 5.0.3 Ready 3m6s
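Rather than re-running the command above until the status changes, the wait can be scripted. The following is a minimal sketch, assuming kubectl v1.23 or newer (for `kubectl wait --for=jsonpath`) and assuming the STATUS column is backed by the `.status.phase` field. The function name `wait_for_db_ready` is hypothetical.

```shell
# Hypothetical helper: block until a KubeDB Cassandra object reports Ready.
# Assumption: the STATUS column shown above comes from .status.phase.
wait_for_db_ready() {
  # $1 = database name, $2 = namespace, $3 = timeout (defaults to 10m)
  kubectl wait "cassandras.kubedb.com/$1" -n "$2" \
    --for=jsonpath='{.status.phase}'=Ready \
    --timeout="${3:-10m}"
}
```

For this tutorial the call would be `wait_for_db_ready cas-sample demo`.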
The database is Ready. Verify that KubeDB has created a Secret and a Service for this database using the following commands,
$ kubectl get secret -n demo -l=app.kubernetes.io/instance=cas-sample
NAME TYPE DATA AGE
cas-sample-auth kubernetes.io/basic-auth 2 3m33s
cas-sample-config Opaque 1 3m33s
$ kubectl get service -n demo -l=app.kubernetes.io/instance=cas-sample
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cas-sample ClusterIP 10.96.77.149 <none> 9042/TCP,7000/TCP,7199/TCP,7001/TCP 3m57s
cas-sample-rack-r0-pods ClusterIP None <none> 9042/TCP,7000/TCP,7199/TCP,7001/TCP 3m57s
Here, we have to use service cas-sample and secret cas-sample-auth to connect with the database. KubeDB creates an AppBinding CR that holds the necessary information to connect with the database.
Verify AppBinding:
Verify that the AppBinding has been created successfully using the following command,
$ kubectl get appbindings -n demo
NAME TYPE VERSION AGE
cas-sample kubedb.com/cassandra 5.0.3 4m23s
Let’s check the YAML of the above AppBinding,
$ kubectl get appbindings -n demo cas-sample -o yaml
apiVersion: appcatalog.appscode.com/v1alpha1
kind: AppBinding
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kubedb.com/v1alpha2","kind":"Cassandra","metadata":{"annotations":{},"name":"cas-sample","namespace":"demo"},"spec":{"configuration":null,"deletionPolicy":"WipeOut","topology":{"rack":[{"name":"r0","podTemplate":{"spec":{"containers":[{"name":"cassandra","resources":{"limits":{"cpu":2,"memory":"2Gi"},"requests":{"cpu":1,"memory":"1Gi"}}}]}},"replicas":2,"storage":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}}},"storageType":"Durable"}]},"version":"5.0.3"}}
  creationTimestamp: "2025-07-28T05:04:35Z"
  generation: 1
  labels:
    app.kubernetes.io/component: database
    app.kubernetes.io/instance: cas-sample
    app.kubernetes.io/managed-by: kubedb.com
    app.kubernetes.io/name: cassandras.kubedb.com
  name: cas-sample
  namespace: demo
  ownerReferences:
  - apiVersion: kubedb.com/v1alpha2
    blockOwnerDeletion: true
    controller: true
    kind: Cassandra
    name: cas-sample
    uid: de9c3313-c9f2-4235-8f84-3d9a92d22503
  resourceVersion: "1844"
  uid: f04d76e2-1f90-4475-8ee5-e6fdfe80079e
spec:
  appRef:
    apiGroup: kubedb.com
    kind: Cassandra
    name: cas-sample
    namespace: demo
  clientConfig:
    service:
      name: cas-sample
      port: 9042
      scheme: http
  secret:
    name: cas-sample-auth
  type: kubedb.com/cassandra
  version: 5.0.3
KubeStash uses the AppBinding CR to connect with the target database. It requires the following fields to be set in the AppBinding’s .spec section.
- .spec.clientConfig.service.name specifies the name of the Service that connects to the database.
- .spec.secret specifies the name of the Secret that holds the necessary credentials to access the database.
- .spec.type specifies the type of the app that this AppBinding is pointing to. A KubeDB-generated AppBinding follows the format <app group>/<app resource type>.
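To inspect only these fields instead of reading the full YAML, a JSONPath query can be used. This is a small sketch; the function name `show_appbinding_fields` and the `key=value` output layout are hypothetical choices, while the field paths are the ones listed above.

```shell
# Hypothetical helper: print the AppBinding fields that KubeStash reads,
# one "key=value" pair per line.
show_appbinding_fields() {
  # $1 = AppBinding name, $2 = namespace
  kubectl get appbinding "$1" -n "$2" \
    -o jsonpath='service={.spec.clientConfig.service.name}{"\n"}secret={.spec.secret.name}{"\n"}type={.spec.type}{"\n"}'
}
```

Calling `show_appbinding_fields cas-sample demo` against the AppBinding shown above should print the Service name, the Secret name, and the type, each on its own line.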
Insert Sample Data:
Now, we are going to exec into one of the database Pods and create some sample data. At first, find out the database Pods using the following command,
$ kubectl get pods -n demo --selector="app.kubernetes.io/instance=cas-sample"
NAME READY STATUS RESTARTS AGE
cas-sample-rack-r0-0 1/1 Running 0 5m28s
cas-sample-rack-r0-1 1/1 Running 0 4m28s
Then, copy the username and password of the database to access the cqlsh shell.
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.username}'| base64 -d
admin
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.password}'| base64 -d
gkebeP3HJbxubvCM
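The two lookups above can be wrapped so that the credentials never have to be copied by hand. The sketch below is only an illustration: the function names `read_secret_field` and `open_cqlsh` are hypothetical, and the password is still passed to cqlsh with `-p`, so this keeps it out of your terminal scrollback but does not address the cqlsh command-line warning.

```shell
# Hypothetical helper: decode one field of a Kubernetes Secret.
read_secret_field() {
  # $1 = Secret name, $2 = key under .data, $3 = namespace
  kubectl get secret "$1" -n "$3" -o jsonpath="{.data.$2}" | base64 -d
}

# Hypothetical helper: open cqlsh in a Pod using the decoded credentials.
open_cqlsh() {
  # $1 = Pod name, $2 = Secret name, $3 = namespace
  cas_user=$(read_secret_field "$2" username "$3")
  cas_pass=$(read_secret_field "$2" password "$3")
  kubectl exec -it -n "$3" "$1" -- cqlsh -u "$cas_user" -p "$cas_pass"
}
```

With the names used in this tutorial, the call would be `open_cqlsh cas-sample-rack-r0-0 cas-sample-auth demo`.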
Now, let’s exec into any Pod and enter the cqlsh shell to create a keyspace and a table,
$ kubectl exec -it -n demo cas-sample-rack-r0-0 -- cqlsh -u admin -p gkebeP3HJbxubvCM
Defaulted container "cassandra" out of: cassandra, cassandra-init (init), medusa-init (init)
Warning: Using a password on the command line interface can be insecure.
Recommendation: use the credentials file to securely provide the password.
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.2.0 | Cassandra 5.0.3 | CQL spec 3.4.7 | Native protocol v5]
Use HELP for help.
admin@cqlsh>
admin@cqlsh> CREATE KEYSPACE kubedb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
admin@cqlsh> USE kubedb;
admin@cqlsh:kubedb> CREATE TABLE users (
... id UUID PRIMARY KEY,
... name TEXT,
... email TEXT
... );
admin@cqlsh:kubedb> INSERT INTO kubedb.users (id, name, email) VALUES (uuid(), 'demo_name1', '[email protected]');
admin@cqlsh:kubedb> INSERT INTO kubedb.users (id, name, email) VALUES (uuid(), 'demo_name2', '[email protected]');
admin@cqlsh:kubedb> SELECT * FROM kubedb.users;
id | email | name
--------------------------------------+------------------+------------
e778de6b-5a71-447b-b015-4c9e0b62bfd6 | [email protected] | demo_name1
17dd25bd-749f-476b-a29e-f9ae97820224 | [email protected] | demo_name2
(2 rows)
admin@cqlsh:kubedb> exit
Now, we are ready to backup the database.
Prepare Backend
We are going to store our backed up data into an S3 bucket. We have to create a Secret with the necessary credentials and a BackupStorage CR to use this backend. If you want to use a different backend, please read the respective backend configuration doc from here.
Create Secret:
Let’s create a secret called medusa-cred with access credentials to our desired S3 bucket,
$ kubectl create secret generic -n demo medusa-cred \
--from-file=./AWS_ACCESS_KEY_ID \
--from-file=./AWS_SECRET_ACCESS_KEY
secret/medusa-cred created
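The command above reads the credentials from two files in the current directory, so those files must exist before it is run. A minimal sketch of preparing them follows; the function name `write_s3_cred_files` is hypothetical. It uses `printf '%s'` so that no trailing newline ends up in the Secret values.

```shell
# Hypothetical helper: write the two credential files without a trailing
# newline, so the Secret values match the keys exactly.
write_s3_cred_files() {
  # $1 = target directory, $2 = access key ID, $3 = secret access key
  printf '%s' "$2" > "$1/AWS_ACCESS_KEY_ID"
  printf '%s' "$3" > "$1/AWS_SECRET_ACCESS_KEY"
}
```

For example, `write_s3_cred_files . '<your-access-key-id>' '<your-secret-access-key>'` creates both files in the current directory; the placeholders stand for your own credentials.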
Create BackupStorage:
Now, create a BackupStorage using this secret. Below is the YAML of BackupStorage CR we are going to create,
apiVersion: storage.kubestash.com/v1alpha1
kind: BackupStorage
metadata:
  name: s3-storage
  namespace: demo
spec:
  storage:
    provider: s3
    s3:
      bucket: anisur
      prefix: medusa-jul
      secretName: medusa-cred
      region: us-east-1
  usagePolicy:
    allowedNamespaces:
      from: All
  default: true
  deletionPolicy: Delete
Let’s create the BackupStorage we have shown above,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.8.31/docs/guides/cassandra/backup/kubestash/logical/examples/backupstorage.yaml
backupstorage.storage.kubestash.com/s3-storage created
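Before moving on, it can help to confirm that KubeStash was able to initialize the backend. The sketch below assumes the BackupStorage object exposes its phase at `.status.phase` and that a usable backend reports `Ready`; the function name `backupstorage_is_ready` is hypothetical.

```shell
# Hypothetical helper: succeed only if the BackupStorage reports phase Ready.
# Assumption: the phase is exposed at .status.phase.
backupstorage_is_ready() {
  # $1 = BackupStorage name, $2 = namespace
  phase=$(kubectl get backupstorage "$1" -n "$2" -o jsonpath='{.status.phase}')
  [ "$phase" = "Ready" ]
}
```

A possible use is `backupstorage_is_ready s3-storage demo && echo "backend is ready"`.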
Now, we are ready to backup our database to our desired backend.
Create RetentionPolicy:
Now, let’s create a RetentionPolicy to specify how the old Snapshots should be cleaned up.
Below is the YAML of the RetentionPolicy object that we are going to create,
apiVersion: storage.kubestash.com/v1alpha1
kind: RetentionPolicy
metadata:
  name: demo-retention
  namespace: demo
spec:
  default: true
  failedSnapshots:
    last: 2
  maxRetentionPeriod: 2mo
  successfulSnapshots:
    last: 5
  usagePolicy:
    allowedNamespaces:
      from: All
Let’s create the above RetentionPolicy,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.8.31/docs/guides/cassandra/backup/kubestash/logical/examples/retentionpolicy.yaml
retentionpolicy.storage.kubestash.com/demo-retention created
Backup
We have to create a BackupConfiguration targeting the cas-sample Cassandra database. Then, KubeStash will create a CronJob for each session to take periodic backups of that database.
Create BackupConfiguration:
Below is the YAML for BackupConfiguration CR to backup the cas-sample database that we have deployed earlier,
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
  name: sample-cas-backup
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Cassandra
    namespace: demo
    name: cas-sample
  backends:
    - name: s3-backend
      storageRef:
        namespace: demo
        name: s3-storage
      retentionPolicy:
        name: demo-retention
        namespace: demo
  sessions:
    - name: frequent-backup
      scheduler:
        schedule: "*/5 * * * *"
        jobTemplate:
          backoffLimit: 1
      repositories:
        - name: s3-cassandra-repo
          backend: s3-backend
          directory: /cas
      addon:
        name: cassandra-addon
        tasks:
          - name: logical-backup
- .spec.sessions[*].scheduler.schedule specifies that we want to back up the database at 5 minute intervals.
- .spec.target refers to the targeted cas-sample Cassandra database that we created earlier.
Let’s create the BackupConfiguration CR that we have shown above,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.8.31/docs/guides/cassandra/backup/kubestash/logical/examples/backupconfiguration.yaml
backupconfiguration.core.kubestash.com/sample-cas-backup created
Verify Backup Setup Successful
If everything goes well, the phase of the BackupConfiguration should be Ready. The Ready phase indicates that the backup setup is successful. Let’s verify the Phase of the BackupConfiguration,
$ kubectl get backupconfiguration -n demo
NAME PHASE PAUSED AGE
sample-cas-backup Ready 107s
Additionally, we can verify that the Repository specified in the BackupConfiguration has been created using the following command,
$ kubectl get repo -n demo
NAME INTEGRITY SNAPSHOT-COUNT SIZE PHASE LAST-SUCCESSFUL-BACKUP AGE
s3-cassandra-repo 1 0 B Ready 2m15s 2m48s
KubeStash keeps the backup for Repository YAMLs. If we navigate to the S3 bucket, we will see the Repository YAML stored in the demo/cassandra directory.
Verify CronJob:
It will also create a CronJob with the schedule specified in spec.sessions[*].scheduler.schedule field of BackupConfiguration CR.
Verify that the CronJob has been created using the following command,
$ kubectl get cronjob -n demo
NAME SCHEDULE TIMEZONE SUSPEND ACTIVE LAST SCHEDULE AGE
trigger-sample-cas-backup-frequent-backup */5 * * * * <none> False 0 47s 2m39s
Verify BackupSession:
KubeStash triggers an instant backup as soon as the BackupConfiguration is ready. After that, backups are scheduled according to the specified schedule.
$ kubectl get backupsession -n demo -w
NAME INVOKER-TYPE INVOKER-NAME PHASE DURATION AGE
sample-cas-backup-frequent-backup-1753682588 BackupConfiguration sample-cas-backup Succeeded 2m2s 2m59s
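Because a new BackupSession is created on every scheduled run, it is often useful to pick out the most recent one. The following sketch relies only on standard kubectl flags (`--sort-by` and `-o name`); the function name `latest_backupsession` is hypothetical.

```shell
# Hypothetical helper: print the name of the newest BackupSession in a namespace.
latest_backupsession() {
  # $1 = namespace
  kubectl get backupsession -n "$1" \
    --sort-by=.metadata.creationTimestamp -o name | tail -n 1
}
```

The result can be passed straight back to kubectl, for example `kubectl get "$(latest_backupsession demo)" -n demo`.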
We can see from the above output that the backup session has succeeded. Now, we are going to verify whether the backed up data has been stored in the backend.
Verify Backup:
Once a backup is complete, KubeStash will update the respective Repository CR to reflect the backup. Check that the repository s3-cassandra-repo has been updated by the following command,
$ kubectl get repository -n demo s3-cassandra-repo
NAME INTEGRITY SNAPSHOT-COUNT SIZE PHASE LAST-SUCCESSFUL-BACKUP AGE
s3-cassandra-repo 1 0 B Ready 3m46s 4m19s
At this moment we have one Snapshot. Run the following command to check the respective Snapshot which represents the state of a backup run for an application.
$ kubectl get snapshots -n demo -l=kubestash.com/repo-name=s3-cassandra-repo
NAME REPOSITORY SESSION SNAPSHOT-TIME DELETION-POLICY PHASE AGE
s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588 s3-cassandra-repo frequent-backup 2025-07-28T06:03:08Z Delete Succeeded 4m12s
Note: KubeStash creates a Snapshot with the following labels:
- kubestash.com/app-ref-kind: <target-kind>
- kubestash.com/app-ref-name: <target-name>
- kubestash.com/app-ref-namespace: <target-namespace>
- kubestash.com/repo-name: <repository-name>
These labels can be used to watch only the Snapshots related to our target Database or Repository.
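As an illustration of filtering by these labels, the sketch below builds a selector for one target database and lists the matching Snapshots. Both function names are hypothetical, and the sketch assumes by default that the Snapshots live in the same namespace as the target, as they do in this tutorial.

```shell
# Hypothetical helper: build a label selector from the Snapshot labels above.
snapshot_selector() {
  # $1 = target kind, $2 = target name, $3 = target namespace
  printf 'kubestash.com/app-ref-kind=%s,kubestash.com/app-ref-name=%s,kubestash.com/app-ref-namespace=%s' "$1" "$2" "$3"
}

# Hypothetical helper: list only the Snapshots that belong to one target.
# $4 = namespace holding the Snapshots (assumed equal to $3 when omitted).
list_target_snapshots() {
  kubectl get snapshots -n "${4:-$3}" -l "$(snapshot_selector "$1" "$2" "$3")"
}
```

For the database in this guide, the call would be `list_target_snapshots Cassandra cas-sample demo`.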
If we check the YAML of the Snapshot, we can find the information about the backed up components of the Database.
$ kubectl get snapshots -n demo s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588 -oyaml
apiVersion: storage.kubestash.com/v1alpha1
kind: Snapshot
metadata:
  annotations:
    kubedb.com/db-version: 5.0.3
  creationTimestamp: "2025-07-28T06:03:08Z"
  finalizers:
  - kubestash.com/cleanup
  generation: 1
  labels:
    kubestash.com/app-ref-kind: Cassandra
    kubestash.com/app-ref-name: cas-sample
    kubestash.com/app-ref-namespace: demo
    kubestash.com/repo-name: s3-cassandra-repo
  name: s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588
  namespace: demo
  ownerReferences:
  - apiVersion: storage.kubestash.com/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Repository
    name: s3-cassandra-repo
    uid: 2e408d1b-081a-4f8b-8c94-c7856f267411
  resourceVersion: "8506"
  uid: 222c3109-62bf-42a5-a8e3-ecab3180f4c7
spec:
  appRef:
    apiGroup: kubedb.com
    kind: Cassandra
    name: cas-sample
    namespace: demo
  backupSession: sample-cas-backup-frequent-backup-1753682588
  deletionPolicy: Delete
  repository: s3-cassandra-repo
  session: frequent-backup
  snapshotID: 01K17T1CVZMMQBRKKJWRPPBAPS
  type: FullBackup
  version: v1
status:
  components:
    dump:
      driver: Medusa
      duration: 0s
      medusaStats:
        backupName: s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588
      path: repository/v1/frequent-backup/dump
      phase: Succeeded
  conditions:
  - lastTransitionTime: "2025-07-28T06:03:08Z"
    message: Recent snapshot list updated successfully
    reason: SuccessfullyUpdatedRecentSnapshotList
    status: "True"
    type: RecentSnapshotListUpdated
  - lastTransitionTime: "2025-07-28T06:05:07Z"
    message: Metadata uploaded to backend successfully
    reason: SuccessfullyUploadedSnapshotMetadata
    status: "True"
    type: SnapshotMetadataUploaded
  phase: Succeeded
  snapshotTime: "2025-07-28T06:03:08Z"
  totalComponents: 1
  verificationStatus: NotVerified
Now, if we navigate to the S3 bucket, we will see the backed up data stored in the demo/cassandra/repository/v1/frequent-backup/dump directory. KubeStash also keeps the backup for Snapshot YAMLs, which can be found in the demo/dep/snapshots directory.
Note: KubeStash stores all dumped data encrypted in the backup directory, meaning it remains unreadable until decrypted.
Restore
In this section, we are going to restore the database from the backup we have taken in the previous section. We are going to deploy a new database and initialize it from the backup.
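The BackupConfiguration created earlier runs every five minutes, so you may prefer to stop scheduled backups while a restore is in progress. The sketch below assumes that the PAUSED column shown earlier is backed by a boolean `spec.paused` field on the BackupConfiguration; the function name `set_backup_paused` is hypothetical.

```shell
# Hypothetical helper: pause or resume a BackupConfiguration.
# Assumption: the PAUSED column is backed by a boolean spec.paused field.
set_backup_paused() {
  # $1 = BackupConfiguration name, $2 = namespace, $3 = true|false
  kubectl patch backupconfiguration "$1" -n "$2" \
    --type merge -p "{\"spec\":{\"paused\":$3}}"
}
```

Under that assumption, `set_backup_paused sample-cas-backup demo true` pauses the schedule and the same call with `false` resumes it.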
Deploy Restored Database:
Now, we have to deploy the restored database in the same way as we deployed the original cas-sample database.
Create RestoreSession:
Now, we need to create a RestoreSession CR pointing to the targeted Cassandra database.
Below is the YAML of the RestoreSession object that we are going to create to restore the backed up data into the target Cassandra database.
apiVersion: core.kubestash.com/v1alpha1
kind: RestoreSession
metadata:
  name: restore-sample-cassandra
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Cassandra
    namespace: demo
    name: cas-sample
  dataSource:
    repository: s3-cassandra-repo
    snapshot: latest
  addon:
    name: cassandra-addon
    tasks:
      - name: logical-backup-restore
Here,
- .spec.target refers to the Cassandra object into which we want to restore the backed up data (cas-sample in the YAML above).
- .spec.dataSource.repository specifies the Repository object that holds the backed up data.
- .spec.dataSource.snapshot specifies that we want to restore from the latest Snapshot.
Let’s create the RestoreSession object we have shown above,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.8.31/docs/guides/cassandra/backup/kubestash/logical/examples/restoresession.yaml
restoresession.core.kubestash.com/restore-sample-cassandra created
Once you have created the RestoreSession object, KubeStash will create a restore Job. Run the following command to watch the phase of the RestoreSession object,
$ kubectl get restoresession -n demo
NAME REPOSITORY PHASE DURATION AGE
restore-sample-cassandra s3-cassandra-repo Running 100s
The Succeeded phase means that the restore process has been completed successfully.
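Since the output above still shows Running, the check has to be repeated until the phase changes. A small polling sketch follows; the function name `await_restore` is hypothetical, and it assumes the phase is exposed at `.status.phase` and that an unsuccessful restore reports `Failed`.

```shell
# Hypothetical helper: poll a RestoreSession until it reaches a terminal phase.
# Assumptions: phase lives at .status.phase; a failed restore reports "Failed".
await_restore() {
  # $1 = name, $2 = namespace, $3 = max polls (default 60),
  # $4 = seconds between polls (default 10)
  tries=0
  phase=""
  while [ "$tries" -lt "${3:-60}" ]; do
    phase=$(kubectl get restoresession "$1" -n "$2" -o jsonpath='{.status.phase}')
    case "$phase" in
      Succeeded) return 0 ;;
      Failed) echo "restore failed" >&2; return 1 ;;
    esac
    tries=$((tries + 1))
    sleep "${4:-10}"
  done
  echo "timed out; last phase: $phase" >&2
  return 2
}
```

For this tutorial the call would be `await_restore restore-sample-cassandra demo`. Polling is used here instead of a single blocking wait so that a failed restore is reported immediately rather than after a timeout.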
Verify Restored Data:
In this section, we are going to verify whether the desired data has been restored successfully. We are going to connect to the database server and check whether the database and the table we created earlier in the original database are restored.
At first, check if the database is in the Ready state using the following command,
$ kubectl get cassandra -n demo cas-sample
NAME TYPE VERSION STATUS AGE
cas-sample kubedb.com/v1alpha2 5.0.3 Ready 136m
Now, find out the database Pod by the following command,
$ kubectl get pods -n demo --selector="app.kubernetes.io/instance=cas-sample"
NAME READY STATUS RESTARTS AGE
cas-sample-rack-r0-0 1/1 Running 0 137m
cas-sample-rack-r0-1 1/1 Running 0 136m
Then, copy the username and password of the database to access the cqlsh shell.
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.username}'| base64 -d
admin
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.password}'| base64 -d
gkebeP3HJbxubvCM
Now, let’s exec into any Pod and enter the cqlsh shell to access the previously created table,
$ kubectl exec -it -n demo cas-sample-rack-r0-0 -- cqlsh -u admin -p gkebeP3HJbxubvCM
Defaulted container "cassandra" out of: cassandra, cassandra-init (init), medusa-init (init)
Warning: Using a password on the command line interface can be insecure.
Recommendation: use the credentials file to securely provide the password.
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.2.0 | Cassandra 5.0.3 | CQL spec 3.4.7 | Native protocol v5]
Use HELP for help.
admin@cqlsh> SELECT * FROM kubedb.users;
id | email | name
--------------------------------------+------------------+------------
e778de6b-5a71-447b-b015-4c9e0b62bfd6 | [email protected] | demo_name1
17dd25bd-749f-476b-a29e-f9ae97820224 | [email protected] | demo_name2
(2 rows)
So, from the above output, we can see that the users table and the rows we created earlier in the original database have been restored successfully.
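The same check can be run without opening an interactive session by using the `-e` option of cqlsh, which executes a single statement and exits. This is a minimal sketch; the function name `run_cql` is hypothetical, and the username and password are the values decoded from the cas-sample-auth Secret.

```shell
# Hypothetical helper: run one CQL statement non-interactively in a Pod.
run_cql() {
  # $1 = Pod, $2 = namespace, $3 = user, $4 = password, $5 = CQL statement
  kubectl exec -n "$2" "$1" -- cqlsh -u "$3" -p "$4" -e "$5"
}
```

For example, `run_cql cas-sample-rack-r0-0 demo admin "$CAS_PASS" 'SELECT * FROM kubedb.users;'`, where `CAS_PASS` is a hypothetical variable holding the decoded password.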
Cleanup
To clean up the Kubernetes resources created by this tutorial, run:
kubectl delete backupconfigurations.core.kubestash.com -n demo sample-cas-backup
kubectl delete restoresessions.core.kubestash.com -n demo restore-sample-cassandra
kubectl delete retentionpolicies.storage.kubestash.com -n demo demo-retention
kubectl delete backupstorage -n demo s3-storage
kubectl delete secret -n demo medusa-cred
kubectl delete cassandra -n demo cas-sample






























