Customizing Backup and Restore Process
KubeStash provides rich customization support for the backup and restore process to meet the requirements of various cluster configurations. This guide shows some examples of these customizations.
KubeStash uses the neo4j-admin database backup command under the hood to take backups and the neo4j-admin database restore command to restore them. By default, it backs up every database of the instance (including the system database). The customizations shown below are passed through the addon.tasks[*].params and addon.jobTemplate.spec sections of the BackupConfiguration and RestoreSession CRs.
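As an orientation aid, the fragment below sketches where these two sections sit inside the addon block. It is only a skeleton assembled from the examples in this guide, not a complete manifest: in a BackupConfiguration the addon block lives under spec.sessions[*], while in a RestoreSession it lives directly under spec.

```yaml
# Skeleton only: empty maps mark where the customizations from this guide go.
addon:
  name: neo4j-addon
  tasks:
    - name: logical-backup   # logical-backup-restore in a RestoreSession
      params: {}             # task-level parameters, e.g. neo4jAdminArgs, databases, from, seedServerName
  jobTemplate:
    spec: {}                 # job-level settings, e.g. securityContext, resources, volumes, volumeMounts
```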
Customizing Backup Process
This section shows how to customize the backup process: passing arguments to the backup command, backing up specific databases, taking the backup from a specific server, running the backup job as a specific user, and setting resource requests and limits for the backup job.
Passing arguments to the backup process
KubeStash Neo4j addon uses the neo4j-admin database backup command for backups. You can pass any extra arguments supported by this command through the neo4jAdminArgs parameter under the addon.tasks[*].params section. To pass multiple arguments, provide them as a comma-separated list.
The below example shows how you can pass --keep-failed=true and --parallel-recovery=true to the backup command.
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
  name: sample-neo4j-backup
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: sample-neo4j
  backends:
    - name: s3-backend
      storageRef:
        namespace: demo
        name: s3-storage
      retentionPolicy:
        name: demo-retention
        namespace: demo
  sessions:
    - name: frequent-backup
      scheduler:
        schedule: "*/5 * * * *"
        jobTemplate:
          backoffLimit: 1
      repositories:
        - name: s3-neo4j-repo
          backend: s3-backend
          directory: /neo4j
      addon:
        name: neo4j-addon
        tasks:
          - name: logical-backup
            params:
              neo4jAdminArgs: "--keep-failed=true,--parallel-recovery=true"
Passing a target database to the backup process
By default, KubeStash Neo4j addon backs up every database of the instance (the * selector). If you want to back up only a specific set of databases, you can specify them using the databases parameter under the addon.tasks[*].params section. Provide the database names as a comma-separated list.
The below example shows how you can back up only the neo4j and movies databases.
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
  name: sample-neo4j-backup
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: sample-neo4j
  backends:
    - name: s3-backend
      storageRef:
        namespace: demo
        name: s3-storage
      retentionPolicy:
        name: demo-retention
        namespace: demo
  sessions:
    - name: frequent-backup
      scheduler:
        schedule: "*/5 * * * *"
        jobTemplate:
          backoffLimit: 1
      repositories:
        - name: s3-neo4j-repo
          backend: s3-backend
          directory: /neo4j
      addon:
        name: neo4j-addon
        tasks:
          - name: logical-backup
            params:
              databases: "neo4j,movies"
WARNING: Make sure that the databases you provide already exist in the target instance before taking the backup.
Backing up from a specific server
By default, KubeStash takes the backup from the server resolved through the database AppBinding (over the backup port 6362). If you want to take the backup from a specific server, for example to offload the backup load from the leader to a particular replica, you can set the from parameter under the addon.tasks[*].params section to the desired server address.
The below example shows how you can take the backup from the sample-neo4j-2 pod.
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
  name: sample-neo4j-backup
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: sample-neo4j
  backends:
    - name: s3-backend
      storageRef:
        namespace: demo
        name: s3-storage
      retentionPolicy:
        name: demo-retention
        namespace: demo
  sessions:
    - name: frequent-backup
      scheduler:
        schedule: "*/5 * * * *"
        jobTemplate:
          backoffLimit: 1
      repositories:
        - name: s3-neo4j-repo
          backend: s3-backend
          directory: /neo4j
      addon:
        name: neo4j-addon
        tasks:
          - name: logical-backup
            params:
              from: "sample-neo4j-2.demo.svc:6362"
The from address must point to a server that exposes the backup port (6362).
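The databases, from, and neo4jAdminArgs parameters from the preceding sections are all set under the same params map, so it seems reasonable to set them together in one task. The fragment below is a minimal sketch of the addon block only, assuming the addon accepts all three parameters at once; that combination is not confirmed by this guide, so verify it on a test cluster before relying on it. The rest of the BackupConfiguration is the same as in the examples above.

```yaml
# Fragment of spec.sessions[*] in a BackupConfiguration (assumed combination).
addon:
  name: neo4j-addon
  tasks:
    - name: logical-backup
      params:
        databases: "neo4j,movies"             # back up only these databases
        from: "sample-neo4j-2.demo.svc:6362"  # take the backup from this server
        neo4jAdminArgs: "--keep-failed=true"  # extra neo4j-admin database backup argument
```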
Running backup job as a specific user
If your cluster requires running the backup job as a specific user, you can set the securityContext field under the addon.jobTemplate.spec section. The below example shows how you can run the backup job as the neo4j user (runAsUser: 7474).
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
  name: sample-neo4j-backup
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: sample-neo4j
  backends:
    - name: s3-backend
      storageRef:
        namespace: demo
        name: s3-storage
      retentionPolicy:
        name: demo-retention
        namespace: demo
  sessions:
    - name: frequent-backup
      scheduler:
        schedule: "*/5 * * * *"
        jobTemplate:
          backoffLimit: 1
      repositories:
        - name: s3-neo4j-repo
          backend: s3-backend
          directory: /neo4j
      addon:
        name: neo4j-addon
        jobTemplate:
          spec:
            securityContext:
              runAsUser: 7474
              runAsGroup: 7474
        tasks:
          - name: logical-backup
Specifying Memory/CPU limit/request for the backup job
To set the memory and CPU requests and limits for your backup job, specify the resources field under the addon.jobTemplate.spec section.
apiVersion: core.kubestash.com/v1alpha1
kind: BackupConfiguration
metadata:
  name: sample-neo4j-backup
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: sample-neo4j
  backends:
    - name: s3-backend
      storageRef:
        namespace: demo
        name: s3-storage
      retentionPolicy:
        name: demo-retention
        namespace: demo
  sessions:
    - name: frequent-backup
      scheduler:
        schedule: "*/5 * * * *"
        jobTemplate:
          backoffLimit: 1
      repositories:
        - name: s3-neo4j-repo
          backend: s3-backend
          directory: /neo4j
      addon:
        name: neo4j-addon
        jobTemplate:
          spec:
            resources:
              requests:
                cpu: "200m"
                memory: "1Gi"
              limits:
                cpu: "200m"
                memory: "1Gi"
        tasks:
          - name: logical-backup
You can configure additional runtime settings for backup jobs within the addon.jobTemplate.spec section. For further details, please refer to the reference.
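Because securityContext and resources both live under addon.jobTemplate.spec, the two settings from the previous sections can be placed side by side. The fragment below is a sketch of the addon block only and reuses the values from the examples above; treat those values as placeholders rather than recommendations.

```yaml
# Fragment of spec.sessions[*] in a BackupConfiguration.
addon:
  name: neo4j-addon
  jobTemplate:
    spec:
      securityContext:       # run the backup job as the neo4j user
        runAsUser: 7474
        runAsGroup: 7474
      resources:             # requests and limits for the backup job
        requests:
          cpu: "200m"
          memory: "1Gi"
        limits:
          cpu: "200m"
          memory: "1Gi"
  tasks:
    - name: logical-backup
```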
Customizing Restore Process
KubeStash uses the neo4j-admin database restore command during the restore process and then seeds the restored store into the cluster from a single bootstrap pod. In this section, we are going to show how you can restore a specific snapshot, restore specific databases, pass arguments to the restore process, run the restore job as a specific user, etc.
Note: For a clustered Neo4j restore, KubeStash restores the store files into the seed pod’s data volume and then bootstraps the other replicas from it. Therefore, in every RestoreSession you need to set the seedServerName parameter to the target seed pod (e.g. restored-neo4j-0) and mount that pod’s data PVC into the restore Job as shown in the examples below.
Restore specific snapshot
You can also restore a specific snapshot. First, list the available snapshots as shown below:
$ kubectl get snapshots.storage.kubestash.com -n demo -l=kubestash.com/repo-name=s3-neo4j-repo
NAME                                                           REPOSITORY      SESSION           SNAPSHOT-TIME          DELETION-POLICY   PHASE       AGE
s3-neo4j-repo-sample-neo4j-backup-frequent-backup-1725257849   s3-neo4j-repo   frequent-backup   2024-09-02T06:18:01Z   Delete            Succeeded   15m
s3-neo4j-repo-sample-neo4j-backup-frequent-backup-1725258000   s3-neo4j-repo   frequent-backup   2024-09-02T06:20:00Z   Delete            Succeeded   13m
s3-neo4j-repo-sample-neo4j-backup-frequent-backup-1725258300   s3-neo4j-repo   frequent-backup   2024-09-02T06:25:00Z   Delete            Succeeded   8m34s
s3-neo4j-repo-sample-neo4j-backup-frequent-backup-1725258600   s3-neo4j-repo   frequent-backup   2024-09-02T06:30:00Z   Delete            Succeeded   3m34s
The below example shows how you can pass a specific snapshot name in the .spec.dataSource section.
apiVersion: core.kubestash.com/v1alpha1
kind: RestoreSession
metadata:
  name: sample-neo4j-restore
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: restored-neo4j
  dataSource:
    repository: s3-neo4j-repo
    snapshot: s3-neo4j-repo-sample-neo4j-backup-frequent-backup-1725258000
  addon:
    name: neo4j-addon
    tasks:
      - name: logical-backup-restore
        params:
          seedServerName: "restored-neo4j-0" # Neo4j Pod Name
    jobTemplate:
      spec:
        volumes:
          - name: data
            persistentVolumeClaim:
              claimName: data-restored-neo4j-0 # PVC Name
        volumeMounts:
          - mountPath: /data
            name: data
            subPath: data
        securityContext:
          runAsNonRoot: true
          runAsUser: 7474
Restoring specific databases
A snapshot contains every database that was backed up. If you want to restore only a subset of them, you can specify the database names using the databases parameter under the addon.tasks[*].params section. Provide the database names as a comma-separated list.
Note: The system database is never restored, regardless of the value of databases. If you omit the databases parameter, all user databases from the snapshot are restored.
The below example shows how you can restore only the movies database.
apiVersion: core.kubestash.com/v1alpha1
kind: RestoreSession
metadata:
  name: sample-neo4j-restore
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: restored-neo4j
  dataSource:
    repository: s3-neo4j-repo
    snapshot: latest
  addon:
    name: neo4j-addon
    tasks:
      - name: logical-backup-restore
        params:
          seedServerName: "restored-neo4j-0" # Neo4j Pod Name
          databases: "movies"
    jobTemplate:
      spec:
        volumes:
          - name: data
            persistentVolumeClaim:
              claimName: data-restored-neo4j-0 # PVC Name
        volumeMounts:
          - mountPath: /data
            name: data
            subPath: data
        securityContext:
          runAsNonRoot: true
          runAsUser: 7474
Passing arguments to the restore process
You can pass any extra arguments supported by the neo4j-admin database restore command through the neo4jAdminArgs parameter under the addon.tasks[*].params section. To pass multiple arguments, provide them as a comma-separated list.
A common use case is restoring into a database that already exists. Passing --overwrite-destination=true tells KubeStash to stop and drop the existing database before restoring it.
apiVersion: core.kubestash.com/v1alpha1
kind: RestoreSession
metadata:
  name: sample-neo4j-restore
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: restored-neo4j
  dataSource:
    repository: s3-neo4j-repo
    snapshot: latest
  addon:
    name: neo4j-addon
    tasks:
      - name: logical-backup-restore
        params:
          seedServerName: "restored-neo4j-0" # Neo4j Pod Name
          neo4jAdminArgs: "--overwrite-destination=true"
    jobTemplate:
      spec:
        volumes:
          - name: data
            persistentVolumeClaim:
              claimName: data-restored-neo4j-0 # PVC Name
        volumeMounts:
          - mountPath: /data
            name: data
            subPath: data
        securityContext:
          runAsNonRoot: true
          runAsUser: 7474
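The databases and neo4jAdminArgs parameters are both task-level params, so a restore that overwrites only one existing database could plausibly set both. The fragment below is a sketch of the tasks block only, assuming the addon accepts the two parameters together; this guide does not confirm that combination, so try it on a non-production cluster first. The jobTemplate section stays the same as in the example above.

```yaml
# Fragment of spec.addon in a RestoreSession (assumed combination).
tasks:
  - name: logical-backup-restore
    params:
      seedServerName: "restored-neo4j-0"              # Neo4j Pod Name
      databases: "movies"                             # restore only this database
      neo4jAdminArgs: "--overwrite-destination=true"  # replace the existing copy
```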
Specifying Memory/CPU limit/request for the restore job
Similar to the backup process, you can also provide the resources field under the addon.jobTemplate.spec section to limit the Memory/CPU for your restore job.
apiVersion: core.kubestash.com/v1alpha1
kind: RestoreSession
metadata:
  name: sample-neo4j-restore
  namespace: demo
spec:
  target:
    apiGroup: kubedb.com
    kind: Neo4j
    namespace: demo
    name: restored-neo4j
  dataSource:
    repository: s3-neo4j-repo
    snapshot: latest
  addon:
    name: neo4j-addon
    tasks:
      - name: logical-backup-restore
        params:
          seedServerName: "restored-neo4j-0" # Neo4j Pod Name
    jobTemplate:
      spec:
        resources:
          requests:
            cpu: "200m"
            memory: "1Gi"
          limits:
            cpu: "200m"
            memory: "1Gi"
        volumes:
          - name: data
            persistentVolumeClaim:
              claimName: data-restored-neo4j-0 # PVC Name
        volumeMounts:
          - mountPath: /data
            name: data
            subPath: data
        securityContext:
          runAsNonRoot: true
          runAsUser: 7474
You can configure additional runtime settings for restore jobs within the addon.jobTemplate.spec section. For further details, please refer to the reference.































