Backup and Restore Cassandra database using KubeStash
KubeStash allows you to back up and restore Cassandra databases. It supports backups for Cassandra instances running in Standalone and Cluster configurations, and it makes managing your Cassandra backups and restorations more straightforward and efficient.
This guide shows how to back up and restore your Cassandra databases using KubeStash.
Before You Begin
- At first, you need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using Minikube or Kind.
- Install KubeDB in your cluster following the steps here.
- Install KubeStash in your cluster following the steps here.
- Install the KubeStash kubectl plugin following the steps here.
- If you are not familiar with how KubeStash backs up and restores Cassandra databases, please check the following guide here.
You should be familiar with the following KubeStash
concepts:
To keep everything isolated, we are going to use a separate namespace called demo
throughout this tutorial.
$ kubectl create ns demo
namespace/demo created
Note: YAML files used in this tutorial are stored in docs/guides/cassandra/backup/kubestash/logical/examples directory of kubedb/docs repository.
Backup Cassandra
KubeStash supports backups for Cassandra instances across different configurations, including Standalone and Cluster setups. In this demonstration, we’ll focus on a Cassandra database running in Cluster mode. The backup and restore process is similar for Standalone configurations.
This section will demonstrate how to back up a Cassandra database. Here, we are going to deploy a Cassandra database using KubeDB. Then, we are going to back up this database into an S3 bucket. Finally, we are going to restore the backed up data into another Cassandra database.
Create Cassandra License Secret
A Cassandra license is required to create a Cassandra database. Ensure that you have acquired a license, and then pass it to the database through a Secret.
Deploy Sample Cassandra Database
Let’s deploy a sample Cassandra
database and insert some data into it.
Create Cassandra CR:
Below is the YAML of a sample Cassandra CR that we are going to create for this tutorial:
apiVersion: kubedb.com/v1alpha2
kind: Cassandra
metadata:
  name: cas-sample
  namespace: demo
spec:
  version: 5.0.3
  configuration:
  topology:
    rack:
      - name: r0
        replicas: 2
        podTemplate:
          spec:
            containers:
              - name: cassandra
                resources:
                  limits:
                    memory: 2Gi
                    cpu: 2
                  requests:
                    memory: 1Gi
                    cpu: 1
        storage:
          accessModes:
            - ReadWriteOnce
          resources:
            requests:
              storage: 1Gi
        storageType: Durable
  deletionPolicy: WipeOut
Here,
- spec.version is the name of the CassandraVersion CRD where the docker images are specified. In this tutorial, a Cassandra 5.0.3 database is going to be created.
- spec.topology specifies that the database will run in cluster mode. If this field is nil, the database will run in standalone mode.
- spec.storageType specifies the type of storage that will be used for the Cassandra database. It can be Durable or Ephemeral. The default value of this field is Durable. If Ephemeral is used, KubeDB will create the Cassandra database using an EmptyDir volume. In this case, you don’t have to specify the spec.storage field. This is useful for testing purposes.
- spec.deletionPolicy gives flexibility whether to nullify (reject) the delete operation of the Cassandra CR or which resources KubeDB should keep or delete when you delete the Cassandra CR. If the admission webhook is enabled, it prevents users from deleting the database as long as spec.deletionPolicy is set to DoNotTerminate. Learn details of all DeletionPolicy here.
Note: The spec.storage section is used to create the PVC for the database pod. It will create a PVC with the storage size specified in the storage.resources.requests field. Don’t specify limits here. The PVC does not get resized automatically.
Create the above Cassandra
CR,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.7.31/docs/guides/cassandra/backup/kubestash/logical/examples/cas-sample.yaml
cassandra.kubedb.com/cas-sample created
KubeDB will deploy a Cassandra database according to the above specification. It will also create the necessary Secrets
and Services
to access the database.
Let’s check if the database is ready to use,
$ kubectl get cassandras.kubedb.com -n demo
NAME TYPE VERSION STATUS AGE
cas-sample kubedb.com/v1alpha2 5.0.3 Ready 3m6s
The database is Ready. Verify that KubeDB has created a Secret and a Service for this database using the following commands,
$ kubectl get secret -n demo -l=app.kubernetes.io/instance=cas-sample
NAME TYPE DATA AGE
cas-sample-auth kubernetes.io/basic-auth 2 3m33s
cas-sample-config Opaque 1 3m33s
$ kubectl get service -n demo -l=app.kubernetes.io/instance=cas-sample
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
cas-sample ClusterIP 10.96.77.149 <none> 9042/TCP,7000/TCP,7199/TCP,7001/TCP 3m57s
cas-sample-rack-r0-pods ClusterIP None <none> 9042/TCP,7000/TCP,7199/TCP,7001/TCP 3m57s
Here, we have to use service cas-sample
and secret cas-sample-auth
to connect with the database. KubeDB
creates an AppBinding CR that holds the necessary information to connect with the database.
Verify AppBinding:
Verify that the AppBinding
has been created successfully using the following command,
$ kubectl get appbindings -n demo
NAME TYPE VERSION AGE
cas-sample kubedb.com/cassandra 5.0.3 4m23s
Let’s check the YAML of the above AppBinding,
$ kubectl get appbindings -n demo cas-sample -o yaml
apiVersion: appcatalog.appscode.com/v1alpha1
kind: AppBinding
metadata:
  annotations:
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"kubedb.com/v1alpha2","kind":"Cassandra","metadata":{"annotations":{},"name":"cas-sample","namespace":"demo"},"spec":{"configuration":null,"deletionPolicy":"WipeOut","topology":{"rack":[{"name":"r0","podTemplate":{"spec":{"containers":[{"name":"cassandra","resources":{"limits":{"cpu":2,"memory":"2Gi"},"requests":{"cpu":1,"memory":"1Gi"}}}]}},"replicas":2,"storage":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"1Gi"}}},"storageType":"Durable"}]},"version":"5.0.3"}}
  creationTimestamp: "2025-07-28T05:04:35Z"
  generation: 1
  labels:
    app.kubernetes.io/component: database
    app.kubernetes.io/instance: cas-sample
    app.kubernetes.io/managed-by: kubedb.com
    app.kubernetes.io/name: cassandras.kubedb.com
  name: cas-sample
  namespace: demo
  ownerReferences:
  - apiVersion: kubedb.com/v1alpha2
    blockOwnerDeletion: true
    controller: true
    kind: Cassandra
    name: cas-sample
    uid: de9c3313-c9f2-4235-8f84-3d9a92d22503
  resourceVersion: "1844"
  uid: f04d76e2-1f90-4475-8ee5-e6fdfe80079e
spec:
  appRef:
    apiGroup: kubedb.com
    kind: Cassandra
    name: cas-sample
    namespace: demo
  clientConfig:
    service:
      name: cas-sample
      port: 9042
      scheme: http
  secret:
    name: cas-sample-auth
  type: kubedb.com/cassandra
  version: 5.0.3
KubeStash uses the AppBinding CR to connect with the target database. It requires the following fields to be set in the AppBinding’s .spec section.
- .spec.clientConfig.service.name specifies the name of the Service that connects to the database.
- .spec.secret specifies the name of the Secret that holds the necessary credentials to access the database.
- .spec.type specifies the type of the app that this AppBinding is pointing to. A KubeDB generated AppBinding follows the format <app group>/<app resource type>.
Insert Sample Data:
Now, we are going to exec into one of the database pods and create some sample data. At first, find out the database Pod using the following command,
$ kubectl get pods -n demo --selector="app.kubernetes.io/instance=cas-sample"
NAME READY STATUS RESTARTS AGE
cas-sample-rack-r0-0 1/1 Running 0 5m28s
cas-sample-rack-r0-1 1/1 Running 0 4m28s
And copy the username and password of the database to access the cqlsh shell.
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.username}'| base64 -d
admin
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.password}'| base64 -d
gkebeP3HJbxubvCM
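The two lookups above can be wrapped in a small helper so that the credentials land in shell variables instead of being copied by hand. This is a minimal sketch, not part of KubeDB or KubeStash: the function name secret_field and the variable names CAS_USER and CAS_PASS are hypothetical, and it assumes kubectl and base64 are available on your workstation.

```shell
# Print one decoded field of a Kubernetes Secret.
# Usage: secret_field <namespace> <secret-name> <key>
# (secret_field is a hypothetical helper name used only in this guide.)
secret_field() {
  kubectl get secret -n "$1" "$2" -o jsonpath="{.data.$3}" | base64 -d
}

# Example usage (commented out so that sourcing this file does not contact a cluster):
#   CAS_USER="$(secret_field demo cas-sample-auth username)"
#   CAS_PASS="$(secret_field demo cas-sample-auth password)"
```

Keeping the values in variables avoids leaving the password in your terminal scrollback when you reuse it in later commands.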
Now, let’s exec into any Pod and open the cqlsh shell to create a keyspace and a table,
$ kubectl exec -it -n demo cas-sample-rack-r0-0 -- cqlsh -u admin -p gkebeP3HJbxubvCM
Defaulted container "cassandra" out of: cassandra, cassandra-init (init), medusa-init (init)
Warning: Using a password on the command line interface can be insecure.
Recommendation: use the credentials file to securely provide the password.
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.2.0 | Cassandra 5.0.3 | CQL spec 3.4.7 | Native protocol v5]
Use HELP for help.
admin@cqlsh>
admin@cqlsh> CREATE KEYSPACE kubedb WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };
admin@cqlsh> USE kubedb;
admin@cqlsh:kubedb> CREATE TABLE users (
... id UUID PRIMARY KEY,
... name TEXT,
... email TEXT
... );
admin@cqlsh:kubedb> INSERT INTO kubedb.users (id, name, email) VALUES (uuid(), 'demo_name1', '[email protected]');
admin@cqlsh:kubedb> INSERT INTO kubedb.users (id, name, email) VALUES (uuid(), 'demo_name2', '[email protected]');
admin@cqlsh:kubedb> SELECT * FROM kubedb.users;
id | email | name
--------------------------------------+------------------+------------
e778de6b-5a71-447b-b015-4c9e0b62bfd6 | [email protected] | demo_name1
17dd25bd-749f-476b-a29e-f9ae97820224 | [email protected] | demo_name2
(2 rows)
admin@cqlsh:kubedb> exit
Now, we are ready to backup the database.
Prepare Backend
We are going to store our backed up data in an S3 bucket. We have to create a Secret with the necessary credentials and a BackupStorage CR to use this backend. If you want to use a different backend, please read the respective backend configuration doc from here.
Create Secret:
Let’s create a secret called medusa-cred
with access credentials to our desired S3 bucket,
$ kubectl create secret generic -n demo medusa-cred \
--from-file=./AWS_ACCESS_KEY_ID \
--from-file=./AWS_SECRET_ACCESS_KEY
secret/medusa-cred created
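The --from-file flags above expect two files named AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY in the current directory; kubectl uses each file name as the Secret key and the file content as its value. The sketch below shows one way to prepare those files. The helper name write_s3_cred_files is hypothetical, and the key values are supplied by you.

```shell
# Write the two credential files read by `kubectl create secret generic --from-file`.
# Usage: write_s3_cred_files <access-key-id> <secret-access-key> [directory]
# printf '%s' writes no trailing newline, which would otherwise become part of the Secret value.
# (write_s3_cred_files is a hypothetical helper name used only in this guide.)
write_s3_cred_files() {
  printf '%s' "$1" > "${3:-.}/AWS_ACCESS_KEY_ID"
  printf '%s' "$2" > "${3:-.}/AWS_SECRET_ACCESS_KEY"
}

# Example usage (commented out; substitute your own credentials):
#   write_s3_cred_files '<your-access-key-id>' '<your-secret-access-key>'
```

Remove the two files once the Secret has been created so that the credentials do not stay on disk.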
Create BackupStorage:
Now, create a BackupStorage
using this secret. Below is the YAML of BackupStorage
CR we are going to create,
apiVersion: storage.kubestash.com/v1alpha1
kind: BackupStorage
metadata:
  name: s3-storage
  namespace: demo
spec:
  storage:
    provider: s3
    s3:
      bucket: anisur
      prefix: medusa-jul
      secretName: medusa-cred
      region: us-east-1
  usagePolicy:
    allowedNamespaces:
      from: All
  default: true
  deletionPolicy: Delete
Let’s create the BackupStorage we have shown above,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.7.31/docs/guides/cassandra/backup/kubestash/logical/examples/backupstorage.yaml
backupstorage.storage.kubestash.com/s3-storage created
Now, we are ready to backup our database to our desired backend.
Create RetentionPolicy:
Now, let’s create a RetentionPolicy
to specify how the old Snapshots should be cleaned up.
Below is the YAML of the RetentionPolicy
object that we are going to create,
apiVersion: storage.kubestash.com/v1alpha1
kind: RetentionPolicy
metadata:
  name: demo-retention
  namespace: demo
spec:
  default: true
  failedSnapshots:
    last: 2
  maxRetentionPeriod: 2mo
  successfulSnapshots:
    last: 5
  usagePolicy:
    allowedNamespaces:
      from: All
Let’s create the above RetentionPolicy,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.7.31/docs/guides/cassandra/backup/kubestash/logical/examples/retentionpolicy.yaml
retentionpolicy.storage.kubestash.com/demo-retention created
Backup
We have to create a BackupConfiguration targeting the cas-sample Cassandra database. Then, KubeStash will create a CronJob for each session to take periodic backups of that database.
Create BackupConfiguration:
Below is the YAML for BackupConfiguration
CR to backup the cas-sample
database that we have deployed earlier,
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
  name: sample-cas-backup
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Cassandra
    namespace: demo
    name: cas-sample
  backends:
    - name: s3-backend
      storageRef:
        namespace: demo
        name: s3-storage
      retentionPolicy:
        name: demo-retention
        namespace: demo
  sessions:
    - name: frequent-backup
      scheduler:
        schedule: "*/5 * * * *"
        jobTemplate:
          backoffLimit: 1
      repositories:
        - name: s3-cassandra-repo
          backend: s3-backend
          directory: /cas
      addon:
        name: cassandra-addon
        tasks:
          - name: logical-backup
- .spec.sessions[*].scheduler.schedule specifies that we want to back up the database at 5-minute intervals.
- .spec.target refers to the cas-sample Cassandra database that we created earlier.
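To make the "*/5 * * * *" schedule concrete, the sketch below lists the minutes of each hour at which a */N step in the minute field of a cron expression fires. It only illustrates standard cron step semantics; cron_step_minutes is a hypothetical helper name and is not something KubeStash provides.

```shell
# List the minutes (0-59) matched by a "*/N" step in the minute field of a cron expression.
# Usage: cron_step_minutes <N>
# The function body runs in a subshell, so its variables do not leak into the caller.
cron_step_minutes() (
  [ "$1" -gt 0 ] || exit 1   # a step of 0 is not a valid cron step
  out=""
  m=0
  while [ "$m" -lt 60 ]; do
    out="${out:+$out }$m"
    m=$((m + $1))
  done
  printf '%s\n' "$out"
)

cron_step_minutes 5   # prints: 0 5 10 15 20 25 30 35 40 45 50 55
```

With the schedule used in this guide, a scheduled backup therefore starts twelve times per hour, in addition to the instant backup that KubeStash triggers when the BackupConfiguration becomes ready.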
Let’s create the BackupConfiguration
CR that we have shown above,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.7.31/docs/guides/cassandra/backup/kubestash/logical/examples/backupconfiguration.yaml
backupconfiguration.core.kubestash.com/sample-cas-backup created
Verify Backup Setup Successful
If everything goes well, the phase of the BackupConfiguration should be Ready. The Ready phase indicates that the backup setup is successful. Let’s verify the Phase of the BackupConfiguration,
$ kubectl get backupconfiguration -n demo
NAME                PHASE   PAUSED   AGE
sample-cas-backup   Ready            107s
Additionally, we can verify that the Repository
specified in the BackupConfiguration
has been created using the following command,
$ kubectl get repo -n demo
NAME                INTEGRITY   SNAPSHOT-COUNT   SIZE   PHASE   LAST-SUCCESSFUL-BACKUP   AGE
s3-cassandra-repo               1                0 B    Ready   2m15s                    2m48s
KubeStash keeps the backup for Repository YAMLs. If we navigate to the S3 bucket, we will see the Repository YAML stored in the medusa-jul/cas directory, that is, the BackupStorage prefix followed by the directory configured for the Repository.
Verify CronJob:
It will also create a CronJob
with the schedule specified in spec.sessions[*].scheduler.schedule
field of BackupConfiguration
CR.
Verify that the CronJob
has been created using the following command,
$ kubectl get cronjob -n demo
NAME SCHEDULE TIMEZONE SUSPEND ACTIVE LAST SCHEDULE AGE
trigger-sample-cas-backup-frequent-backup */5 * * * * <none> False 0 47s 2m39s
Verify BackupSession:
KubeStash triggers an instant backup as soon as the BackupConfiguration
is ready. After that, backups are scheduled according to the specified schedule.
$ kubectl get backupsession -n demo -w
NAME INVOKER-TYPE INVOKER-NAME PHASE DURATION AGE
sample-cas-backup-frequent-backup-1753682588 BackupConfiguration sample-cas-backup Succeeded 2m2s 2m59s
We can see from the above output that the backup session has succeeded. Now, we are going to verify whether the backed up data has been stored in the backend.
Verify Backup:
Once a backup is complete, KubeStash will update the respective Repository CR to reflect the backup. Check that the repository s3-cassandra-repo has been updated by the following command,
$ kubectl get repository -n demo s3-cassandra-repo
NAME                INTEGRITY   SNAPSHOT-COUNT   SIZE   PHASE   LAST-SUCCESSFUL-BACKUP   AGE
s3-cassandra-repo               1                0 B    Ready   3m46s                    4m19s
At this moment we have one Snapshot. Run the following command to check the respective Snapshot, which represents the state of a backup run for an application.
$ kubectl get snapshots -n demo -l=kubestash.com/repo-name=s3-cassandra-repo
NAME REPOSITORY SESSION SNAPSHOT-TIME DELETION-POLICY PHASE AGE
s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588 s3-cassandra-repo frequent-backup 2025-07-28T06:03:08Z Delete Succeeded 4m12s
Note: KubeStash creates a Snapshot with the following labels:
- kubestash.com/app-ref-kind: <target-kind>
- kubestash.com/app-ref-name: <target-name>
- kubestash.com/app-ref-namespace: <target-namespace>
- kubestash.com/repo-name: <repository-name>
These labels can be used to watch only the Snapshots related to our target Database or Repository.
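The app-ref labels listed above can be combined into a single comma-separated label selector, which kubectl treats as a logical AND. The following is a minimal sketch; snapshot_selector is a hypothetical helper name, and the label keys are the ones shown in the note.

```shell
# Build a label selector that matches the Snapshots of one backup target.
# Usage: snapshot_selector <target-kind> <target-name> <target-namespace>
# (snapshot_selector is a hypothetical helper name used only in this guide.)
snapshot_selector() {
  printf 'kubestash.com/app-ref-kind=%s,kubestash.com/app-ref-name=%s,kubestash.com/app-ref-namespace=%s\n' \
    "$1" "$2" "$3"
}

# Example usage (commented out so that sourcing this file does not contact a cluster):
#   kubectl get snapshots -n demo -l "$(snapshot_selector Cassandra cas-sample demo)"
```

Filtering by target rather than by repository is convenient when the same database is backed up into more than one Repository.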
If we check the YAML of the Snapshot, we can find the information about the backed up components of the Database.
$ kubectl get snapshots -n demo s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588 -oyaml
apiVersion: storage.kubestash.com/v1alpha1
kind: Snapshot
metadata:
  annotations:
    kubedb.com/db-version: 5.0.3
  creationTimestamp: "2025-07-28T06:03:08Z"
  finalizers:
  - kubestash.com/cleanup
  generation: 1
  labels:
    kubestash.com/app-ref-kind: Cassandra
    kubestash.com/app-ref-name: cas-sample
    kubestash.com/app-ref-namespace: demo
    kubestash.com/repo-name: s3-cassandra-repo
  name: s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588
  namespace: demo
  ownerReferences:
  - apiVersion: storage.kubestash.com/v1alpha1
    blockOwnerDeletion: true
    controller: true
    kind: Repository
    name: s3-cassandra-repo
    uid: 2e408d1b-081a-4f8b-8c94-c7856f267411
  resourceVersion: "8506"
  uid: 222c3109-62bf-42a5-a8e3-ecab3180f4c7
spec:
  appRef:
    apiGroup: kubedb.com
    kind: Cassandra
    name: cas-sample
    namespace: demo
  backupSession: sample-cas-backup-frequent-backup-1753682588
  deletionPolicy: Delete
  repository: s3-cassandra-repo
  session: frequent-backup
  snapshotID: 01K17T1CVZMMQBRKKJWRPPBAPS
  type: FullBackup
  version: v1
status:
  components:
    dump:
      driver: Medusa
      duration: 0s
      medusaStats:
        backupName: s3-cassandra-repo-sample-cas-backup-frequent-backup-1753682588
      path: repository/v1/frequent-backup/dump
      phase: Succeeded
  conditions:
  - lastTransitionTime: "2025-07-28T06:03:08Z"
    message: Recent snapshot list updated successfully
    reason: SuccessfullyUpdatedRecentSnapshotList
    status: "True"
    type: RecentSnapshotListUpdated
  - lastTransitionTime: "2025-07-28T06:05:07Z"
    message: Metadata uploaded to backend successfully
    reason: SuccessfullyUploadedSnapshotMetadata
    status: "True"
    type: SnapshotMetadataUploaded
  phase: Succeeded
  snapshotTime: "2025-07-28T06:03:08Z"
  totalComponents: 1
  verificationStatus: NotVerified
Now, if we navigate to the S3 bucket, we will see the backed up data stored in the medusa-jul/cas/repository/v1/frequent-backup/dump directory. KubeStash also keeps the backup for Snapshot YAMLs, which can be found in the medusa-jul/cas/snapshots directory.
Note: KubeStash stores all dumped data encrypted in the backup directory, meaning it remains unreadable until decrypted.
Restore
In this section, we are going to restore the database from the backup we have taken in the previous section. We are going to deploy a new database and initialize it from the backup.
Deploy Restored Database:
Now, we have to deploy the restored database similarly as we have deployed the original cas-sample
database.
Create RestoreSession:
Now, we need to create a RestoreSession object pointing to the targeted Cassandra database.
Below is the YAML of the RestoreSession object that we are going to create to restore the backed up data into the database provisioned by the Cassandra object named in .spec.target (cas-sample in this manifest).
apiVersion: core.kubestash.com/v1alpha1
kind: RestoreSession
metadata:
  name: restore-sample-cassandra
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Cassandra
    namespace: demo
    name: cas-sample
  dataSource:
    repository: s3-cassandra-repo
    snapshot: latest
  addon:
    name: cassandra-addon
    tasks:
      - name: logical-backup-restore
Here,
- .spec.target refers to the Cassandra object into which we want to restore the backup data (cas-sample in the manifest above).
- .spec.dataSource.repository specifies the Repository object that holds the backed up data.
- .spec.dataSource.snapshot specifies that we want to restore from the latest Snapshot.
Let’s create the RestoreSession object we have shown above,
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2025.7.31/docs/guides/cassandra/backup/kubestash/logical/examples/restoresession.yaml
restoresession.core.kubestash.com/restore-sample-cassandra created
Once you have created the RestoreSession object, KubeStash will create a restore Job. Run the following command to watch the phase of the RestoreSession object,
$ kubectl get restoresession -n demo
NAME                       REPOSITORY          PHASE     DURATION   AGE
restore-sample-cassandra   s3-cassandra-repo   Running              100s
The Succeeded phase means that the restore process has been completed successfully.
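Instead of re-running the command by hand until the phase changes, the check can be scripted. The sketch below polls a resource until it reaches a wanted phase. It assumes that the PHASE column is read from the .status.phase field and that a failed run reports the phase Failed; both are assumptions about the KubeStash resources, and wait_for_phase is a hypothetical helper name.

```shell
# Poll a resource until .status.phase equals the wanted value, or give up.
# Usage: wait_for_phase <resource> <name> <namespace> <wanted-phase> [attempts] [sleep-seconds]
# Assumptions: the phase lives in .status.phase, and "Failed" marks a failed run.
# The function body runs in a subshell, so `exit` only ends the function.
wait_for_phase() (
  attempts="${5:-60}"
  pause="${6:-5}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    phase="$(kubectl get "$1" "$2" -n "$3" -o jsonpath='{.status.phase}' 2>/dev/null)"
    echo "phase: ${phase:-<unknown>}"
    if [ "$phase" = "$4" ]; then exit 0; fi
    if [ "$phase" = "Failed" ]; then exit 1; fi
    i=$((i + 1))
    sleep "$pause"
  done
  exit 1   # wanted phase not reached within the allowed attempts
)

# Example usage (commented out so that sourcing this file does not contact a cluster):
#   wait_for_phase restoresession restore-sample-cassandra demo Succeeded
```

The same helper can be pointed at the BackupConfiguration or a BackupSession from the earlier sections, provided those resources expose their phase in the same field.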
Verify Restored Data:
In this section, we are going to verify whether the desired data has been restored successfully. We are going to connect to the database server and check whether the database and the table we created earlier in the original database are restored.
At first, check if the database has gone into Ready
state by the following command,
$ kubectl get cassandra -n demo cas-sample
NAME TYPE VERSION STATUS AGE
cas-sample kubedb.com/v1alpha2 5.0.3 Ready 136m
Now, find out the database Pod
by the following command,
$ kubectl get pods -n demo --selector="app.kubernetes.io/instance=cas-sample"
NAME READY STATUS RESTARTS AGE
cas-sample-rack-r0-0 1/1 Running 0 137m
cas-sample-rack-r0-1 1/1 Running 0 136m
And then copy the username and password of the admin user to access the cqlsh shell.
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.username}'| base64 -d
admin
$ kubectl get secret -n demo cas-sample-auth -o jsonpath='{.data.password}'| base64 -d
gkebeP3HJbxubvCM
Now, let’s exec into any Pod and open the cqlsh shell to access the previously created table,
$ kubectl exec -it -n demo cas-sample-rack-r0-0 -- cqlsh -u admin -p gkebeP3HJbxubvCM
Defaulted container "cassandra" out of: cassandra, cassandra-init (init), medusa-init (init)
Warning: Using a password on the command line interface can be insecure.
Recommendation: use the credentials file to securely provide the password.
Connected to Test Cluster at 127.0.0.1:9042
[cqlsh 6.2.0 | Cassandra 5.0.3 | CQL spec 3.4.7 | Native protocol v5]
Use HELP for help.
admin@cqlsh> SELECT * FROM kubedb.users;
id | email | name
--------------------------------------+------------------+------------
e778de6b-5a71-447b-b015-4c9e0b62bfd6 | [email protected] | demo_name1
17dd25bd-749f-476b-a29e-f9ae97820224 | [email protected] | demo_name2
(2 rows)
So, from the above output, we can see that the users table we created earlier in the original database has been restored successfully.
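The same check can be run without opening an interactive session by passing the statement to cqlsh through its -e (execute) option. This is a minimal sketch: run_cql is a hypothetical helper name, the container name cassandra is taken from the kubectl exec output above, and the password still travels on the command line, so cqlsh prints the same warning as before.

```shell
# Run a single CQL statement inside a database pod and print the result.
# Usage: run_cql <namespace> <pod> <user> <password> <statement>
# (run_cql is a hypothetical helper name used only in this guide.)
run_cql() {
  kubectl exec -n "$1" "$2" -c cassandra -- cqlsh -u "$3" -p "$4" -e "$5"
}

# Example usage (commented out so that sourcing this file does not contact a cluster):
#   run_cql demo cas-sample-rack-r0-0 admin '<password>' 'SELECT * FROM kubedb.users;'
```

A non-interactive query like this is easier to drop into a post-restore smoke test than a manual cqlsh session.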
Cleanup
To clean up the Kubernetes resources created by this tutorial, run:
kubectl delete backupconfigurations.core.kubestash.com -n demo sample-cas-backup
kubectl delete restoresessions.core.kubestash.com -n demo restore-sample-cassandra
kubectl delete retentionpolicies.storage.kubestash.com -n demo demo-retention
kubectl delete backupstorage -n demo s3-storage
kubectl delete secret -n demo medusa-cred
kubectl delete cassandras.kubedb.com -n demo cas-sample