This is an Enterprise-only feature. Install Stash Enterprise Edition to try it. You can use a KubeDB Enterprise license to install Stash Enterprise Edition: database backup with Stash is already included in the KubeDB Enterprise license, so you don’t need a separate license for Stash.
MongoDB Backup & Restore Overview
KubeDB uses Stash to back up and restore databases. Stash by AppsCode is a cloud-native data backup and recovery solution for Kubernetes workloads. Stash uses restic to securely back up stateful applications to any cloud or on-prem storage backend (for example, S3, GCS, Azure Blob Storage, MinIO, NetApp, or Dell EMC).
Fig: Backup KubeDB Databases Using Stash
How Backup Works
The following diagram shows how Stash backs up a MongoDB database. Open the image in a new tab to see the enlarged version.
Fig: MongoDB Backup Overview
The backup process consists of the following steps:
First, a user creates a Secret containing the access credentials for the backend where the backed-up data will be stored.
Then, the user creates a Repository object that specifies the backend information along with the Secret that holds the credentials to access the backend.
Then, the user creates a BackupConfiguration object targeting the AppBinding object of the desired database. The BackupConfiguration object also specifies the Task to use to back up the database.
The Stash operator watches for BackupConfiguration objects.
Once the Stash operator finds a BackupConfiguration object, it creates a CronJob with the schedule specified in the BackupConfiguration object to trigger backups periodically.
At the next scheduled slot, the CronJob triggers a backup by creating a BackupSession object.
The Stash operator also watches for BackupSession objects.
When it finds a BackupSession object, it resolves the respective Task and Function and prepares a Job definition for the backup.
Then, it creates the Job to back up the targeted database.
The backup Job reads the information needed to connect to the database from the AppBinding object. It also reads the backend information and access credentials from the Repository object and the storage Secret, respectively.
Then, the Job dumps the targeted database and uploads the output to the backend. Stash pipes the output of the dump command to the upload process, so the backup Job does not need a large volume to hold the entire dump output.
Finally, when the backup is complete, the Job sends Prometheus metrics to the Pushgateway running inside the Stash operator pod. It also updates the BackupSession and Repository status to reflect the outcome of the backup.
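The two objects the user creates in the steps above can be sketched as plain Python dictionaries. This is only an illustration: the field names follow the Stash v1alpha1/v1beta1 API as commonly documented and should be checked against your installed version, and every object name, the GCS bucket, the retention policy, and the Task name (a placeholder, since Task names are version-specific) are assumptions made for the example.

```python
import json


def repository(name, namespace, bucket, prefix, secret_name):
    """Repository: backend location plus the Secret holding its credentials."""
    return {
        "apiVersion": "stash.appscode.com/v1alpha1",
        "kind": "Repository",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "backend": {
                # A GCS backend is assumed here; other backends use other keys.
                "gcs": {"bucket": bucket, "prefix": prefix},
                "storageSecretName": secret_name,
            }
        },
    }


def backup_configuration(name, namespace, app_binding, repo_name, task, schedule):
    """BackupConfiguration: schedule, Task, Repository, and AppBinding target."""
    return {
        "apiVersion": "stash.appscode.com/v1beta1",
        "kind": "BackupConfiguration",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "schedule": schedule,              # cron expression used for the CronJob
            "task": {"name": task},            # Task that knows how to dump MongoDB
            "repository": {"name": repo_name},
            "target": {
                "ref": {
                    "apiVersion": "appcatalog.appscode.com/v1alpha1",
                    "kind": "AppBinding",
                    "name": app_binding,
                }
            },
            # Illustrative retention policy, not taken from this page.
            "retentionPolicy": {"name": "keep-last-5", "keepLast": 5, "prune": True},
        },
    }


# All names below are hypothetical.
repo = repository("gcs-repo", "demo", "example-bucket",
                  "/demo/mongodb/sample-mongodb", "gcs-secret")
backup = backup_configuration(
    "sample-mongodb-backup", "demo",
    app_binding="sample-mongodb",
    repo_name=repo["metadata"]["name"],
    task="mongodb-backup-x.y.z",   # placeholder: use the Task for your MongoDB version
    schedule="*/5 * * * *",
)

for manifest in (repo, backup):
    print(json.dumps(manifest, indent=2))
```

The BackupConfiguration refers to the Repository and the AppBinding by name only, which is why the operator can resolve them later when a BackupSession is created.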
Backup Different MongoDB Configurations
This section shows how backup works for different MongoDB configurations.
Standalone MongoDB
For a standalone MongoDB database, the backup Job dumps the database directly using mongodump and pipes the output to the backup process.
Fig: Standalone MongoDB Backup
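The streaming idea behind this (dump output flows straight into the upload process instead of landing on disk) can be sketched with two chained subprocesses. The helper `pipe` is hypothetical and is not Stash code. The demo uses small Python stand-ins for the two commands so it runs anywhere; the comment shows what a real pairing of mongodump and restic might look like, which is an assumption about the tools and not a description of Stash's exact invocation.

```python
import subprocess
import sys


def pipe(producer_cmd, consumer_cmd):
    """Stream the producer's stdout into the consumer's stdin.

    Returns (producer_exit_code, consumer_exit_code, consumer_stdout).
    Nothing is written to disk; data flows through an OS pipe.
    """
    producer = subprocess.Popen(producer_cmd, stdout=subprocess.PIPE)
    consumer = subprocess.Popen(
        consumer_cmd, stdin=producer.stdout, stdout=subprocess.PIPE
    )
    # Close our copy of the pipe so the producer sees a broken pipe
    # if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    return producer.wait(), consumer.returncode, output


# A real pairing might look like this (assumed, not taken from Stash):
#   pipe(["mongodump", "--uri", uri, "--archive"],
#        ["restic", "backup", "--stdin", "--stdin-filename", "dump"])

# Stand-ins: the "dump" emits 100000 bytes, the "uploader" counts them.
fake_dump = [sys.executable, "-c", "import sys; sys.stdout.write('x' * 100000)"]
fake_upload = [sys.executable, "-c", "import sys; print(len(sys.stdin.read()))"]

dump_rc, upload_rc, received = pipe(fake_dump, fake_upload)
print(dump_rc, upload_rc, received.decode().strip())
```

Checking both exit codes matters in a pipeline like this: a failed dump can still produce a successful upload of truncated data.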
MongoDB ReplicaSet Cluster
For a MongoDB ReplicaSet cluster, Stash takes the backup from one of the secondary replicas. The backup process consists of the following steps:
Identify a secondary replica.
Lock the secondary replica.
Backup the secondary replica.
Unlock the secondary replica.
Fig: MongoDB ReplicaSet Cluster Backup
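The four steps above can be sketched as follows. `Member` is a hypothetical in-memory stand-in for a replica set member, and `dump` is any callable that performs the actual backup; neither is part of Stash. The point of the sketch is the `try`/`finally`, which keeps a failed dump from leaving the secondary locked. This page does not say what happens when a ReplicaSet has no secondary, so the sketch simply raises an error in that case.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """Hypothetical stand-in for one replica set member."""
    name: str
    state: str                      # "PRIMARY" or "SECONDARY"
    locked: bool = False
    log: list = field(default_factory=list)

    def lock(self):
        self.locked = True
        self.log.append("lock")

    def unlock(self):
        self.locked = False
        self.log.append("unlock")


def backup_replica_set(members, dump):
    """Identify a secondary, lock it, back it up, and always unlock it."""
    secondary = next((m for m in members if m.state == "SECONDARY"), None)
    if secondary is None:
        # Assumption of this sketch: fail instead of guessing a fallback.
        raise LookupError("no secondary replica available")
    secondary.lock()
    try:
        secondary.log.append("backup")
        return dump(secondary)
    finally:
        secondary.unlock()


members = [Member("mgo-rs-0", "PRIMARY"), Member("mgo-rs-1", "SECONDARY")]
result = backup_replica_set(members, lambda m: f"dump-of-{m.name}")
print(result, members[1].log)
```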
MongoDB Sharded Cluster
For a MongoDB sharded cluster, Stash backs up the individual shards as well as the config server. It takes the backup from a secondary replica of each shard and of the config server. If there is no secondary replica, Stash takes the backup from the primary replica. The backup process consists of the following steps:
Disable balancer.
Lock config server.
Identify a secondary replica for each shard.
Lock the secondary replica.
Run backup on the secondary replica.
Unlock the secondary replica.
Unlock config server.
Enable balancer.
Fig: MongoDB Sharded Cluster Backup
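The nesting of these steps (every "lock" or "disable" is undone in reverse order, even on failure) maps naturally onto nested context managers. The sketch below only records the order of operations in a list; the names `step`, `pick_member`, and `backup_sharded_cluster`, and the member names, are hypothetical. It follows the listed steps literally, so the dump of the config server itself, which the list does not place explicitly, is left out.

```python
from contextlib import contextmanager


@contextmanager
def step(log, enter, leave):
    """Record `enter` now and `leave` on exit, even if the body fails."""
    log.append(enter)
    try:
        yield
    finally:
        log.append(leave)


def pick_member(members):
    """Prefer a secondary replica; fall back to the primary if there is none."""
    for member in members:
        if member["state"] == "SECONDARY":
            return member["name"]
    return next(m["name"] for m in members if m["state"] == "PRIMARY")


def backup_sharded_cluster(shards, log):
    with step(log, "disable balancer", "enable balancer"):
        with step(log, "lock config server", "unlock config server"):
            for shard, members in shards.items():
                target = pick_member(members)
                with step(log, f"lock {target}", f"unlock {target}"):
                    log.append(f"backup {shard} from {target}")


shards = {
    "shard0": [{"name": "shard0-0", "state": "PRIMARY"},
               {"name": "shard0-1", "state": "SECONDARY"}],
    "shard1": [{"name": "shard1-0", "state": "PRIMARY"}],  # no secondary
}
events = []
backup_sharded_cluster(shards, events)
for event in events:
    print(event)
```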
How Restore Process Works
The following diagram shows how Stash restores backed-up data into a MongoDB database. Open the image in a new tab to see the enlarged version.
Fig: MongoDB Restore Process Overview
The restore process consists of the following steps:
First, a user creates a RestoreSession object targeting the AppBinding of the database into which the backed-up data will be restored. The RestoreSession also specifies the Repository object that holds the backend information and the Task to use to restore the target.
The Stash operator watches for RestoreSession objects.
Once it finds a RestoreSession object, it resolves the respective Task and Function and prepares a Job definition for the restore.
Then, it creates the Job to restore the target.
The Job reads the information needed to connect to the database from the respective AppBinding object. It also reads the backend information and access credentials from the Repository object and the storage Secret, respectively.
Then, the Job downloads the backed-up data from the backend and injects it into the desired database. Stash pipes the downloaded data to the respective database tool, so the restore Job does not need a large volume to hold the entire backup.
Finally, when the restore process is complete, the Job sends Prometheus metrics to the Pushgateway and updates the RestoreSession status to reflect the completion of the restore.
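The RestoreSession that starts this process can be sketched as a Python dictionary. As with any illustration of a manifest, treat the details as assumptions: the field names follow the Stash v1beta1 API as commonly documented, the `rules` entry that selects the latest snapshot is an illustrative choice, and all object names and the Task name are placeholders.

```python
import json


def restore_session(name, namespace, app_binding, repo_name, task,
                    snapshots=("latest",)):
    """RestoreSession: which Repository to read, which Task to run, where to restore."""
    return {
        "apiVersion": "stash.appscode.com/v1beta1",
        "kind": "RestoreSession",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "task": {"name": task},
            "repository": {"name": repo_name},
            "target": {
                "ref": {
                    "apiVersion": "appcatalog.appscode.com/v1alpha1",
                    "kind": "AppBinding",
                    "name": app_binding,
                }
            },
            # Which snapshot(s) to restore; "latest" is assumed here.
            "rules": [{"snapshots": list(snapshots)}],
        },
    }


# All names below are hypothetical.
restore = restore_session(
    "sample-mongodb-restore", "demo",
    app_binding="restored-mongodb",
    repo_name="gcs-repo",
    task="mongodb-restore-x.y.z",  # placeholder: use the Task for your MongoDB version
)
print(json.dumps(restore, indent=2))
```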
Restoring Different MongoDB Configurations
This section shows how the restore process works for different MongoDB configurations.
Standalone MongoDB
For a standalone MongoDB database, the restore Job downloads the backed-up data from the backend and pipes it to the mongorestore command, which inserts the data into the desired MongoDB database.
Fig: Standalone MongoDB Restore
MongoDB ReplicaSet Cluster
For a MongoDB ReplicaSet cluster, Stash identifies the primary replica and restores into it.
Fig: MongoDB ReplicaSet Cluster Restore
MongoDB Sharded Cluster
For a MongoDB sharded cluster, Stash identifies the primary replica of each shard and of the config server, and restores the respective backed-up data into each of them.
Fig: MongoDB Sharded Cluster Restore
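Identifying the primary of each component can be sketched as below. The input documents are shaped like the output of MongoDB's `replSetGetStatus` command, whose `members` entries carry `name` and `stateStr` fields. Whether Stash uses this command is an assumption, and `find_primary`, `restore_targets`, and the host names are hypothetical.

```python
def find_primary(status):
    """Return the name of the PRIMARY member in a replSetGetStatus-style document."""
    for member in status.get("members", []):
        if member.get("stateStr") == "PRIMARY":
            return member["name"]
    raise LookupError("no PRIMARY member found")


def restore_targets(components):
    """Map each component (config server, shards) to the primary to restore into."""
    return {name: find_primary(status) for name, status in components.items()}


# Hypothetical cluster: one config server replica set and two shards.
cluster = {
    "configsvr": {"members": [
        {"name": "cfg-0:27017", "stateStr": "SECONDARY"},
        {"name": "cfg-1:27017", "stateStr": "PRIMARY"},
    ]},
    "shard0": {"members": [
        {"name": "shard0-0:27017", "stateStr": "PRIMARY"},
        {"name": "shard0-1:27017", "stateStr": "SECONDARY"},
    ]},
    "shard1": {"members": [
        {"name": "shard1-0:27017", "stateStr": "PRIMARY"},
    ]},
}

targets = restore_targets(cluster)
for component, primary in targets.items():
    print(f"restore {component} backup into {primary}")
```

A plain ReplicaSet cluster is the single-component case of the same mapping.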
Next Steps
Back up a standalone MongoDB database using Stash by following the guide from here.
Back up a MongoDB ReplicaSet cluster using Stash by following the guide from here.
Back up a sharded MongoDB cluster using Stash by following the guide from here.