KubeDB MongoDB - Continuous Archiving and Point-in-time Recovery

This guide will show you how to use KubeDB to provision a MongoDB database with continuous archiving and point-in-time recovery.

Before You Begin

To begin, you need a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using kind.

Now,

  • Install KubeDB operator in your cluster following the steps here.
  • Install KubeStash operator in your cluster following the steps here.
  • Install SideKick in your cluster following the steps here.
  • Install External-snapshotter in your cluster following the steps here, if you don’t already have a csi-driver available in the cluster.

To keep things isolated, we will use a separate namespace called demo throughout this tutorial.

$ kubectl create ns demo
namespace/demo created

Note: The YAML files used in this tutorial are stored in the mg-archiver-demo repository.

Continuous archiving

Continuous archiving involves making regular copies (or “archives”) of the MongoDB transaction log (oplog) files. To ensure continuous archiving to a remote location, we need to prepare BackupStorage, RetentionPolicy, and MongoDBArchiver objects for the KubeDB managed MongoDB database.

BackupStorage

BackupStorage is a CR provided by KubeStash that can manage storage from various providers like GCS, S3, and more.

apiVersion: storage.kubestash.com/v1alpha1
kind: BackupStorage
metadata:
  name: gcs-storage
  namespace: demo
spec:
  storage:
    provider: gcs
    gcs:
      bucket: kubestash-qa
      prefix: mg
      secret: gcs-secret
  usagePolicy:
    allowedNamespaces:
      from: All
  deletionPolicy: WipeOut # One of: WipeOut, Delete

For S3-compatible buckets, the .spec.storage section will look like this:

provider: s3
s3:
  endpoint: us-east-1.linodeobjects.com
  bucket: arnob
  region: us-east-1
  prefix: ya
  secret: linode-secret 

$ kubectl apply -f https://raw.githubusercontent.com/kubedb/mg-archiver-demo/master/gke/backupstorage.yaml
backupstorage.storage.kubestash.com/gcs-storage created

Secret for BackupStorage

You need to create a Secret holding the credentials for your cloud bucket. Here are examples.

For GCS :

kubectl create secret generic -n demo gcs-secret \
  --from-literal=GOOGLE_PROJECT_ID=<your-project-id> \
  --from-file=./GOOGLE_SERVICE_ACCOUNT_JSON_KEY

For S3 :

kubectl create secret generic -n demo s3-secret \
    --from-file=./AWS_ACCESS_KEY_ID \
    --from-file=./AWS_SECRET_ACCESS_KEY

$ kubectl apply -f https://raw.githubusercontent.com/kubedb/mg-archiver-demo/master/gke/storage-secret.yaml
secret/gcs-secret created
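
To confirm a Secret holds the expected keys without printing the values, you can describe it:

$ kubectl describe secret -n demo gcs-secret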

Retention policy

RetentionPolicy is a CR provided by KubeStash that allows you to set how long you’d like to retain the backup data.

apiVersion: storage.kubestash.com/v1alpha1
kind: RetentionPolicy
metadata:
  name: mongodb-retention-policy
  namespace: demo
spec:
  maxRetentionPeriod: "30d"
  successfulSnapshots:
    last: 5
  failedSnapshots:
    last: 2

$ kubectl apply -f https://raw.githubusercontent.com/kubedb/mg-archiver-demo/master/common/retention-policy.yaml
retentionpolicy.storage.kubestash.com/mongodb-retention-policy created
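
You can confirm the policy was created (the plural resource name retentionpolicies is assumed from the RetentionPolicy CRD):

$ kubectl get retentionpolicies.storage.kubestash.com -n demo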

Ensure VolumeSnapshotClass

$ kubectl get volumesnapshotclasses
NAME                    DRIVER               DELETIONPOLICY   AGE
longhorn-snapshot-vsc   driver.longhorn.io   Delete           7d22h

If there is none, try installing Longhorn or any other CSI driver that provides a VolumeSnapshotClass.

$ helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace
...
...
$ kubectl get pod -n longhorn-system

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: longhorn-snapshot-vsc
driver: driver.longhorn.io
deletionPolicy: Delete
parameters:
  type: snap
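
Then apply the VolumeSnapshotClass manifest (assuming you saved it locally as longhorn-vsc.yaml; the file name is illustrative):

$ kubectl apply -f longhorn-vsc.yaml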

If you already have a CSI driver installed in your cluster, you need to reference it in the .driver section. Here is an example for GKE:

apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshotClass
metadata:
  name: gke-vsc
driver: pd.csi.storage.gke.io
deletionPolicy: Delete
$ kubectl apply -f https://raw.githubusercontent.com/kubedb/mg-archiver-demo/master/gke/volume-snapshot-class.yaml
volumesnapshotclass.snapshot.storage.k8s.io/gke-vsc unchanged

MongoDBArchiver

MongoDBArchiver is a CR provided by KubeDB for managing the archiving of MongoDB oplog files and performing volume-level backups.

apiVersion: archiver.kubedb.com/v1alpha1
kind: MongoDBArchiver
metadata:
  name: mongodbarchiver-sample
  namespace: demo
spec:
  pause: false
  databases:
    namespaces:
      from: "Same"
    selector:
      matchLabels:
        archiver: "true"
  retentionPolicy:
    name: mongodb-retention-policy
    namespace: demo
  encryptionSecret:
    name: encrypt-secret
    namespace: demo
  fullBackup:
    driver: VolumeSnapshotter
    task:
      params:
        volumeSnapshotClassName: gke-vsc  # change it accordingly
    scheduler:
      successfulJobsHistoryLimit: 1
      failedJobsHistoryLimit: 1
      schedule: "*/50 * * * *"
    sessionHistoryLimit: 2
  manifestBackup:
    scheduler:
      successfulJobsHistoryLimit: 1
      failedJobsHistoryLimit: 1
      schedule: "*/5 * * * *"
    sessionHistoryLimit: 2
  backupStorage:
    ref:
      name: gcs-storage
      namespace: demo

EncryptionSecret

apiVersion: v1
kind: Secret
type: Opaque
metadata:
  name: encrypt-secret
  namespace: demo
stringData:
  RESTIC_PASSWORD: "changeit"

$ kubectl create -f https://raw.githubusercontent.com/kubedb/mg-archiver-demo/master/common/encrypt-secret.yaml
$ kubectl create -f https://raw.githubusercontent.com/kubedb/mg-archiver-demo/master/common/archiver.yaml
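
Once created, you can verify the archiver object exists (the plural resource name mongodbarchivers is assumed from the MongoDBArchiver CRD):

$ kubectl get mongodbarchivers.archiver.kubedb.com -n demo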

Deploy MongoDB

So far, we are ready with the setup for continuous archiving. Now we deploy a MongoDB database that refers to the MongoDB archiver object through its labels:

apiVersion: kubedb.com/v1alpha2
kind: MongoDB
metadata:
  name: mg-rs
  namespace: demo
  labels:
    archiver: "true"
spec:
  version: "4.4.26"
  replicaSet:
    name: "rs"
  replicas: 3
  podTemplate:
    spec:
      resources:
        requests:
          cpu: "500m"
          memory: "500Mi"
  storage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi

The archiver: "true" label is important here, because that is how we specify that continuous archiving should be done for this database; it matches the .spec.databases selector of the MongoDBArchiver above.
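
Since the archiver selects databases by label, you can list the databases it will pick up using a plain kubectl label selector (output omitted):

$ kubectl get mg -n demo -l archiver=true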

$ kubectl get pod -n demo
NAME                                                  READY   STATUS      RESTARTS   AGE
mg-rs-0                                               2/2     Running     0          8m30s
mg-rs-1                                               2/2     Running     0          7m32s
mg-rs-2                                               2/2     Running     0          6m34s
mg-rs-backup-full-backup-1702457252-lvcbn             0/1     Completed   0          65s
mg-rs-backup-manifest-backup-1702457110-fjpw5         0/1     Completed   0          3m28s
mg-rs-backup-manifest-backup-1702457253-f4chq         0/1     Completed   0          65s
mg-rs-sidekick                                        1/1     Running     0          5m29s
trigger-mg-rs-backup-manifest-backup-28374285-rdcfq   0/1     Completed   0          3m38s

The mg-rs-sidekick pod is responsible for uploading the oplog files. The mg-rs-backup-full-backup-* pods perform the volume-level backups of the MongoDB database, and the mg-rs-backup-manifest-backup-* pods back up the manifests related to the MongoDB object.

Validate BackupConfiguration and VolumeSnapshot

$ kubectl get backupstorage,backupconfigurations,backupsession,volumesnapshots -A

NAMESPACE   NAME                                              PROVIDER   DEFAULT   DELETION-POLICY   TOTAL-SIZE   PHASE   AGE
demo        backupstorage.storage.kubestash.com/gcs-storage   gcs                  WipeOut           3.292 KiB    Ready   11m

NAMESPACE   NAME                                                  PHASE   PAUSED   AGE
demo        backupconfiguration.core.kubestash.com/mg-rs-backup   Ready            6m45s

NAMESPACE   NAME                                                                       INVOKER-TYPE          INVOKER-NAME   PHASE       DURATION   AGE
demo        backupsession.core.kubestash.com/mg-rs-backup-full-backup-1702457252       BackupConfiguration   mg-rs-backup   Succeeded              2m20s
demo        backupsession.core.kubestash.com/mg-rs-backup-manifest-backup-1702457110   BackupConfiguration   mg-rs-backup   Succeeded              4m43s
demo        backupsession.core.kubestash.com/mg-rs-backup-manifest-backup-1702457253   BackupConfiguration   mg-rs-backup   Succeeded              2m20s

NAMESPACE   NAME                                                      READYTOUSE   SOURCEPVC         SOURCESNAPSHOTCONTENT   RESTORESIZE   SNAPSHOTCLASS   SNAPSHOTCONTENT                                    CREATIONTIME   AGE
demo        volumesnapshot.snapshot.storage.k8s.io/mg-rs-1702457262   true         datadir-mg-rs-1                           1Gi           gke-vsc         snapcontent-87f1013f-cd7e-4153-b245-da9552d2e44f   2m7s           2m11s

Insert Data and Switch Oplog

After each oplog switch, the oplog files will be uploaded to the backup storage.
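
You can optionally inspect the archived files in the bucket (a sketch, assuming gsutil is configured with read access to the kubestash-qa bucket from the BackupStorage above; the exact layout under the mg prefix is managed by KubeStash):

$ gsutil ls -r gs://kubestash-qa/mg/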

$ kubectl exec -it -n demo mg-rs-0 bash
kubectl exec [POD] [COMMAND] is DEPRECATED and will be removed in a future version. Use kubectl exec [POD] -- [COMMAND] instead.
Defaulted container "mongodb" out of: mongodb, replication-mode-detector, copy-config (init)
mongodb@mg-rs-0:/$ 
mongodb@mg-rs-0:/$ mongo -u root -p $MONGO_INITDB_ROOT_PASSWORD 
MongoDB shell version v4.4.26
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("4a51b9fc-a26c-487b-848d-341cf5512c86") }
MongoDB server version: 4.4.26
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
	https://docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
	https://community.mongodb.com
---
The server generated these startup warnings when booting: 
        2023-12-13T08:40:40.423+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
---
rs:PRIMARY> show dbs
admin          0.000GB
config         0.000GB
kubedb-system  0.000GB
local          0.000GB
rs:PRIMARY> use pink_floyd
switched to db pink_floyd
rs:PRIMARY> db.songs.insert({"name":"shine on you crazy diamond"})
WriteResult({ "nInserted" : 1 })
rs:PRIMARY> show collections
songs
rs:PRIMARY> db.songs.find()
{ "_id" : ObjectId("657970c1f965be0513c7f4d7"), "name" : "shine on you crazy diamond" }
rs:PRIMARY> 

At this point, we have a document in our newly created collection songs in the database pink_floyd.

Point-in-time Recovery

Point-In-Time Recovery allows you to restore a MongoDB database to a specific point in time using the archived transaction logs. This is particularly useful in scenarios where you need to recover to a state just before a specific error or data corruption occurred. Let’s say our DBA accidentally drops the pink_floyd database, and we want to restore it.

rs:PRIMARY> use pink_floyd
switched to db pink_floyd

rs:PRIMARY> db.dropDatabase()
{
	"dropped" : "pink_floyd",
	"ok" : 1,
	"$clusterTime" : {
		"clusterTime" : Timestamp(1702457742, 2),
		"signature" : {
			"hash" : BinData(0,"QFpwWOtec/NdQ0iKKyFCx9Jz8/A="),
			"keyId" : NumberLong("7311996497896144901")
		}
	},
	"operationTime" : Timestamp(1702457742, 2)
}

The time 1702457742 is a Unix timestamp; in human-readable form it is Wed Dec 13 2023 08:55:42 GMT+0000. We can’t restore from a full backup alone, since no full backup was performed at exactly this point, so we choose a specific time just before this timestamp (for example, 08:55:30) as the point to restore to.
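
You can compute the RFC3339 value to use as the recoveryTimestamp yourself (GNU date shown; on macOS/BSD, date -u -r 1702457742 gives the same instant):

$ date -u -d @1702457742 +"%Y-%m-%dT%H:%M:%SZ"
2023-12-13T08:55:42Z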

Restore MongoDB

apiVersion: kubedb.com/v1alpha2
kind: MongoDB
metadata:
  name: mg-rs-restored
  namespace: demo
spec:
  version: "4.4.26"
  replicaSet:
    name: "rs"
  replicas: 3
  podTemplate:
    spec:
      resources:
        requests:
          cpu: "500m"
          memory: "500Mi"
  storage:
    accessModes:
      - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
  init:
    archiver:
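      # pick a moment just before the accidental drop (08:55:42 UTC in this demo)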
      recoveryTimestamp: "2023-12-13T08:55:30Z"
      encryptionSecret:
        name: encrypt-secret
        namespace: demo
      fullDBRepository:
        name: mg-rs-full
        namespace: demo
      manifestRepository:
        name: mg-rs-manifest
        namespace: demo
  terminationPolicy: WipeOut

$ kubectl apply -f restore.yaml
mongodb.kubedb.com/mg-rs-restored created

Check for Restored MongoDB

$ kubectl get pods -n demo | grep restore
mg-rs-restored-0                                      2/2     Running     0          4m43s
mg-rs-restored-1                                      2/2     Running     0          3m52s
mg-rs-restored-2                                      2/2     Running     0          2m59s
mg-rs-restored-manifest-restorer-2qb46                0/1     Completed   0          4m58s
mg-rs-restored-wal-restorer-nkxfl                     0/1     Completed   0          41s
$ kubectl get mg -n demo
NAME             VERSION   STATUS   AGE
mg-rs-restored   4.4.26    Ready    5m47s

Validating data on Restored MongoDB

$ kubectl exec -it -n demo mg-rs-restored-0 bash
mongodb@mg-rs-restored-0:/$ mongo -u root -p $MONGO_INITDB_ROOT_PASSWORD 
MongoDB shell version v4.4.26
connecting to: mongodb://127.0.0.1:27017/?compressors=disabled&gssapiServiceName=mongodb
Implicit session: session { "id" : UUID("50d3fc74-bffc-4c97-a1e6-a2ea63cb88e1") }
MongoDB server version: 4.4.26
Welcome to the MongoDB shell.
For interactive help, type "help".
For more comprehensive documentation, see
	https://docs.mongodb.com/
Questions? Try the MongoDB Developer Community Forums
	https://community.mongodb.com
---
The server generated these startup warnings when booting: 
        2023-12-13T09:05:42.205+00:00: Using the XFS filesystem is strongly recommended with the WiredTiger storage engine. See http://dochub.mongodb.org/core/prodnotes-filesystem
---
rs:PRIMARY> show dbs
admin          0.000GB
config         0.000GB
kubedb-system  0.000GB
local          0.000GB
pink_floyd     0.000GB
rs:PRIMARY> use pink_floyd
switched to db pink_floyd
rs:PRIMARY> show collections
songs
rs:PRIMARY> db.songs.find()
{ "_id" : ObjectId("657970c1f965be0513c7f4d7"), "name" : "shine on you crazy diamond" }

So we were able to successfully recover from the disaster; the restored database contains the document we inserted before the drop.

Cleaning up

To cleanup the Kubernetes resources created by this tutorial, run:

kubectl delete -n demo mg/mg-rs
kubectl delete -n demo mg/mg-rs-restored
kubectl delete -n demo backupstorage/gcs-storage
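kubectl delete -n demo mongodbarchiver/mongodbarchiver-sample
kubectl delete -n demo retentionpolicy/mongodb-retention-policy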
kubectl delete ns demo

Next Steps