New to KubeDB? Please start here.
Backup & Restore Elasticsearch Using Stash
KubeDB uses Stash to back up and restore databases. Stash by AppsCode is a cloud native data backup and recovery solution for Kubernetes workloads. Stash uses restic to securely back up stateful applications to any cloud or on-prem storage backend (for example, S3, GCS, Azure Blob storage, Minio, NetApp, Dell EMC etc.).
How Backup Works
The following diagram shows how Stash takes a backup of an Elasticsearch database. Open the image in a new tab to see the enlarged version.
The backup process consists of the following steps:
1. At first, a user creates a Secret with the access credentials of the backend where the backed up data will be stored.
2. Then, she creates a Repository crd that specifies the backend information along with the Secret that holds the credentials to access the backend.
3. Then, she creates a BackupConfiguration crd targeting the AppBinding crd of the desired database. The BackupConfiguration object also specifies the Task to use to back up the database.
4. Stash operator watches for BackupConfiguration crds.
5. Once Stash operator finds a BackupConfiguration crd, it creates a CronJob with the schedule specified in the BackupConfiguration object to trigger backups periodically.
6. On the next scheduled slot, the CronJob triggers a backup by creating a BackupSession crd.
7. Stash operator also watches for BackupSession crds.
8. When it finds a BackupSession object, it resolves the respective Task and Function and prepares a Job definition for the backup.
9. Then, it creates the Job to back up the targeted database.
10. The backup Job reads the information needed to connect to the database from the AppBinding crd. It also reads the backend information and access credentials from the Repository crd and the storage Secret respectively.
11. Then, the Job dumps the targeted database and uploads the output to the backend. Stash stores the dumped files temporarily before uploading them to the backend. Hence, you should provide a PVC template in the spec.interimVolumeTemplate field of the BackupConfiguration crd to hold those dumped files temporarily.
12. Finally, when the backup is completed, the Job sends Prometheus metrics to the Pushgateway running inside the Stash operator pod. It also updates the BackupSession and Repository status to reflect the backup procedure.
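The two objects a user creates in the steps above can be sketched as follows. This is a minimal illustration, not a manifest taken from this guide: the namespace, object names, bucket, prefix, Task name, schedule and storage size are all hypothetical placeholders, and the field layout assumes the Stash v1alpha1/v1beta1 API, so check it against the Stash version installed in your cluster. The retentionPolicy block is not discussed above; it is included on the assumption that Stash expects one. The sketch uses a GCS backend purely as an example.

```python
import json

# Hypothetical placeholder; use the namespace of your database.
NAMESPACE = "demo"


def repository(name, bucket, prefix, secret_name):
    """Repository: backend location plus the Secret holding its credentials."""
    return {
        "apiVersion": "stash.appscode.com/v1alpha1",
        "kind": "Repository",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {
            "backend": {
                "gcs": {"bucket": bucket, "prefix": prefix},
                "storageSecretName": secret_name,
            }
        },
    }


def backup_configuration(name, app_binding, repo_name, task_name, schedule):
    """BackupConfiguration: target AppBinding, Task, schedule and interim PVC."""
    return {
        "apiVersion": "stash.appscode.com/v1beta1",
        "kind": "BackupConfiguration",
        "metadata": {"name": name, "namespace": NAMESPACE},
        "spec": {
            "schedule": schedule,  # cron expression used for the CronJob
            "task": {"name": task_name},
            "repository": {"name": repo_name},
            "target": {
                "ref": {
                    "apiVersion": "appcatalog.appscode.com/v1alpha1",
                    "kind": "AppBinding",
                    "name": app_binding,
                }
            },
            # PVC template for the temporary dump files (size is a guess).
            "interimVolumeTemplate": {
                "metadata": {"name": name + "-tmp-storage"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": "1Gi"}},
                },
            },
            # Assumed to be required; not covered by the steps above.
            "retentionPolicy": {"name": "keep-last-5", "keepLast": 5, "prune": True},
        },
    }


def as_list(*objects):
    """Wrap several objects in a v1 List so they can be applied together."""
    return {"apiVersion": "v1", "kind": "List", "items": list(objects)}


if __name__ == "__main__":
    repo = repository("gcs-repo", "my-bucket", "/demo/sample-elasticsearch", "gcs-secret")
    backup = backup_configuration(
        "sample-elasticsearch-backup",
        "sample-elasticsearch",
        "gcs-repo",
        "elasticsearch-backup-7.3.2",
        "*/5 * * * *",
    )
    print(json.dumps(as_list(repo, backup), indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed output can be saved to a file and passed to kubectl apply -f. The storage Secret from the first step is referenced only by name here and has to exist beforehand.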
How Restore Process Works
The following diagram shows how Stash restores backed up data into an Elasticsearch database. Open the image in a new tab to see the enlarged version.
The restore process consists of the following steps:
1. At first, a user creates a RestoreSession crd targeting the AppBinding of the desired database where the backed up data will be restored. It also specifies the Repository crd which holds the backend information and the Task to use to restore the target.
2. Stash operator watches for RestoreSession objects.
3. Once it finds a RestoreSession object, it resolves the respective Task and Function and prepares a Job definition for the restore.
4. Then, it creates the Job to restore the target.
5. The Job reads the information needed to connect to the database from the respective AppBinding crd. It also reads the backend information and access credentials from the Repository crd and the storage Secret respectively.
6. Then, the Job downloads the backed up data from the backend and inserts it into the desired database. Stash stores the downloaded files temporarily before inserting them into the targeted database. Hence, you should provide a PVC template in the spec.interimVolumeTemplate field of the RestoreSession crd to hold those restored files temporarily.
7. Finally, when the restore process is completed, the Job sends Prometheus metrics to the Pushgateway and updates the RestoreSession status to reflect restore completion.
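The RestoreSession described in the steps above can be sketched in the same way. Again, this is only an illustration under assumptions: every name, the namespace, the Task name and the storage size are hypothetical placeholders, and the field layout assumes the Stash v1beta1 API, which may differ in your installed version.

```python
import json


def restore_session(name, namespace, app_binding, repo_name, task_name, storage="1Gi"):
    """RestoreSession: target AppBinding, source Repository, Task, interim PVC."""
    return {
        "apiVersion": "stash.appscode.com/v1beta1",
        "kind": "RestoreSession",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "task": {"name": task_name},
            "repository": {"name": repo_name},
            "target": {
                "ref": {
                    "apiVersion": "appcatalog.appscode.com/v1alpha1",
                    "kind": "AppBinding",
                    "name": app_binding,
                }
            },
            # PVC template for the temporarily downloaded files (size is a guess).
            "interimVolumeTemplate": {
                "metadata": {"name": name + "-tmp-storage"},
                "spec": {
                    "accessModes": ["ReadWriteOnce"],
                    "resources": {"requests": {"storage": storage}},
                },
            },
        },
    }


if __name__ == "__main__":
    # All argument values are hypothetical placeholders.
    session = restore_session(
        "sample-elasticsearch-restore",
        "demo",
        "sample-elasticsearch",
        "gcs-repo",
        "elasticsearch-restore-7.3.2",
    )
    print(json.dumps(session, indent=2))
```

A real manifest typically also selects which snapshot to restore. That part is left out here because the steps above do not cover it.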
Next Steps
- Back up your Elasticsearch databases using Stash by following the guide from here.
- Configure a generic backup template for all the Elasticsearch databases of your cluster using Stash Auto-backup by following the guide from here.
- Customize the backup & restore process for your cluster by following the guides from here.