New to KubeDB? Please start here.
ZooKeeper Backup & Restore Overview
KubeDB also uses KubeStash to back up and restore databases. KubeStash by AppsCode is a cloud-native data backup and recovery solution for Kubernetes workloads and databases. KubeStash uses restic to securely back up stateful applications to any cloud or on-prem storage backend (for example, S3, GCS, Azure Blob Storage, MinIO, NetApp, Dell EMC, etc.).
How Backup Works
The following diagram shows how KubeStash takes a backup of a ZooKeeper database. Open the image in a new tab to see the enlarged version.
The backup process consists of the following steps:
- At first, a user creates a Secret. This Secret holds the credentials to access the backend where the backed-up data will be stored.
- Then, the user creates a BackupStorage custom resource that specifies the backend information, along with the Secret containing the credentials needed to access the backend.
- The KubeStash operator watches for BackupStorage custom resources. When it finds a BackupStorage object, it initializes the BackupStorage by uploading the metadata.yaml file to the specified backend.
- Next, the user creates a BackupConfiguration custom resource that specifies the target database, addon information (including backup tasks), backup schedules, storage backends for storing the backup data, and other additional settings.
- The KubeStash operator watches for BackupConfiguration objects.
- Once the KubeStash operator finds a BackupConfiguration object, it creates a Repository with the information specified in the BackupConfiguration.
- The KubeStash operator watches for Repository custom resources. When it finds the Repository object, it initializes the Repository by uploading the repository.yaml file to the spec.sessions[*].repositories[*].directory path specified in the BackupConfiguration.
- Then, it creates a CronJob for each session with the schedule specified in the BackupConfiguration to trigger backups periodically.
- The KubeStash operator triggers an instant backup as soon as the BackupConfiguration is ready. Backups are otherwise triggered by the CronJob based on the specified schedule.
- The KubeStash operator watches for BackupSession custom resources.
- When it finds a BackupSession object, it creates a Snapshot custom resource for each Repository specified in the BackupConfiguration.
- Then, it resolves the respective Addon and Function and prepares a backup Job definition.
- Then, it creates the Job to back up the targeted ZooKeeper database.
- The backup Job reads the information needed to connect to the database (e.g. auth secret, port) from the AppBinding CR. It also reads the backend information and access credentials from the BackupStorage CR, the storage Secret, and the Repository path.
- Then, the Job dumps the targeted ZooKeeper database and uploads the output to the backend. KubeStash pipes the output of the dump command to the uploading process. Hence, the backup Job does not require a large volume to hold the entire dump output.
- After the backup process is completed, the backup Job updates the status.components[dump] field of the Snapshot resources with backup information of the target ZooKeeper database.
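The piping described in the dump step can be illustrated with a minimal Python sketch. This is not KubeStash code: `stream_copy`, `CHUNK_SIZE`, and the in-memory streams are hypothetical stand-ins for the dump command's output and the uploading process. The sketch only shows why forwarding fixed-size chunks keeps the space needed by the Job independent of the total dump size.

```python
import io

CHUNK_SIZE = 64 * 1024  # hypothetical chunk size; not a KubeStash setting


def stream_copy(source, sink, chunk_size=CHUNK_SIZE):
    """Copy source to sink one chunk at a time.

    Returns (total_bytes, largest_chunk) so the caller can confirm that
    no more than chunk_size bytes were held in memory at once.
    """
    total = 0
    largest = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # empty bytes object signals end of stream
            break
        sink.write(chunk)
        total += len(chunk)
        largest = max(largest, len(chunk))
    return total, largest


# Stand-ins: "dump" plays the dump output, "backend" plays the upload target.
dump = io.BytesIO(b"znode-data " * 100_000)
backend = io.BytesIO()
total, largest = stream_copy(dump, backend)
print(f"copied {total} bytes, at most {largest} bytes buffered at a time")
```

In a real Job the two ends would be process pipes rather than in-memory buffers, but the bounded-buffer idea is the same.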
How Restore Process Works
The following diagram shows how KubeStash restores backed-up data into a ZooKeeper database. Open the image in a new tab to see the enlarged version.
The restore process consists of the following steps:
- At first, a user creates a ZooKeeper database where the data will be restored, or the user can use the same ZooKeeper database.
- Then, the user creates a RestoreSession custom resource that specifies the target database where the backed-up data will be restored, addon information (including restore tasks), the target snapshot to be restored, the Repository containing that snapshot, and other additional settings.
- The KubeStash operator watches for RestoreSession custom resources.
- When it finds a RestoreSession custom resource, it resolves the respective Addon and Function and prepares a restore Job definition.
- Then, it creates the Job to restore the target.
- The Job reads the information needed to connect to the database from the respective AppBinding CR. It also reads the backend information and access credentials from the Repository CR and the storage Secret, respectively.
- Then, the Job downloads the backed-up data from the backend and injects it into the desired database. KubeStash pipes the downloaded data to the respective database tool, which injects it into the database. Hence, the restore Job does not require a large volume to hold the entire downloaded backup.
- Finally, when the restore process is completed, the Job updates the status.components[*] field of the RestoreSession with restore information of the target database.
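To make the pieces of a RestoreSession more concrete, the following Python sketch assembles a manifest-like dictionary with the items listed above: target database, Repository, snapshot, and addon with its restore task. The field names are assumptions based on the KubeStash core.kubestash.com/v1alpha1 API and should be verified against the installed CRDs. The helper `restore_session_manifest`, the addon name `zookeeper-addon`, the task name `logical-backup-restore`, and all object names are hypothetical and used for illustration only.

```python
import json


def restore_session_manifest(name, namespace, database, repository,
                             snapshot="latest"):
    """Build a RestoreSession-like manifest as a plain dict (illustrative)."""
    return {
        "apiVersion": "core.kubestash.com/v1alpha1",
        "kind": "RestoreSession",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Target database where the backed-up data will be restored.
            "target": {
                "apiGroup": "kubedb.com",
                "kind": "ZooKeeper",
                "namespace": namespace,
                "name": database,
            },
            # Repository and snapshot to restore from.
            "dataSource": {"repository": repository, "snapshot": snapshot},
            # Addon and restore task; both names are assumptions.
            "addon": {
                "name": "zookeeper-addon",
                "tasks": [{"name": "logical-backup-restore"}],
            },
        },
    }


manifest = restore_session_manifest(
    name="restore-sample-zookeeper",     # hypothetical object names
    namespace="demo",
    database="restored-zookeeper",
    repository="sample-zookeeper-repo",
)
print(json.dumps(manifest, indent=2))
```

The printed JSON can be compared against a real RestoreSession from the step-by-step guide before being applied to a cluster.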
Next Steps
- Back up a ZooKeeper database using KubeStash by following the guide here.