You are looking at the documentation of a prior release. To read the documentation of the latest release, please
visit here.
New to KubeDB? Please start here.
Snapshot and Restore Using S3 Repository Plugin
The S3 repository plugin adds support for using AWS S3 as a repository for Snapshot/Restore. It also works with S3 compatible other mediums such as Linode Object Storage.
For the demo, we are going to show you how to snapshot a KubeDB managed Elasticsearch and restore data from previously taken snapshot.
Before You Begin
At first, you need to have a Kubernetes cluster, and the kubectl
command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using kind.
Now, install the KubeDB operator in your cluster following the steps here.
To keep things isolated, this tutorial uses a separate namespace called demo
throughout this tutorial.
$ kubectl create namespace demo
namespace/demo created
$ kubectl get namespace
NAME STATUS AGE
demo Active 9s
Note: YAML files used in this tutorial are stored in guides/elasticsearch/quickstart/overview/yamls folder in GitHub repository kubedb/docs
Create S3 Compatible Storage
We are going to use the Linode Object Storage which is S3 compatible. But you can any S3 compatible storage which suits you best. Let’s create a sample-s3-bucket
to store snapshot and later restore from it.
You also need to create access_key
and secret_key
so that your Elasticsearch Cluster can connect to the bucket.
Deploy Elasticsearch Cluster and Populate Data
For the demo, we are going to use Elasticsearch docker images from KubeDB distribution with the pre-installed S3 repository plugin.
Secure Client Settings
To make the plugin works we need to create a k8s secret with the Elastisearch secure settings:
apiVersion: v1
kind: Secret
metadata:
name: es-secure-settings
namespace: demo
stringData:
password: strong-password
s3.client.default.access_key: 6BU5GFIIUC2********
s3.client.default.secret_key: DD1FS5NAiPf********
N.B.: Here, the
password
is the ElasticsearchKEYSTROE_PASSWORD
, if you do not provide it, default to empty string (""
).
Let’s create the k8s secret with secure settings:
$ kubectl apply -f secure-settings-secret.yaml
secret/es-secure-settings created
In S3 Client Settings, If you do not configure the endpoint
, it default to s3.amazonaws.com
. Since we are using Linode Bucket instead of AWS S3, we need to configure the endpoint too. Let’s create another secret with custom client configurations:
apiVersion: v1
kind: Secret
metadata:
name: es-custom-config
namespace: demo
stringData:
elasticsearch.yml: |-
s3.client.default.endpoint: us-east-1.linodeobjects.com
N.B.: In Elasticsearch, only secure setting goes to
elasticsearch.keystore
, others are put intoelasticsearch.yml
config file. That’s why two different k8s secrets are used.
Let’s create the k8s secret with custom configurations:
$ kubectl apply -f custom-configuration.yaml
secret/es-custom-config created
Deploy Elasticsearch Cluster
Now that we have deployed our configuration secrets, it’s time to deploy our Elasticsearch instance.
apiVersion: kubedb.com/v1
kind: Elasticsearch
metadata:
name: sample-es
namespace: demo
spec:
# Custom configuration, which will update elasticsearch.yml
configSecret:
name: es-custom-config
# Secure settings which will be stored in elasticsearch.keystore
secureConfigSecret:
name: es-secure-settings
enableSSL: true
# we are using ElasticsearchVersion with pre-installed s3 repository plugin
version: xpack-8.11.1
storageType: Durable
replicas: 3
storage:
storageClassName: "standard"
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 1Gi
Let’s deploy the Elasticsearch and wait for it to become ready to use:
$ kubectl apply -f https://github.com/kubedb/docs/raw/v2024.8.21/docs/guides/elasticsearch/plugins-backup/s3-repository/yamls/elasticsearch.yaml
elasticsearch.kubedb.com/sample-es created
$ kubectl get es -n demo -w
NAME VERSION STATUS AGE
sample-es xpack-8.11.1 0s
sample-es xpack-8.11.1 Provisioning 19s
sample-es xpack-8.11.1 Ready 41s
Populate Data
To connect to our Elasticsearch cluster, let’s port-forward the Elasticsearch service to local machine:
$ kubectl port-forward -n demo svc/sample-es 9200
Forwarding from 127.0.0.1:9200 -> 9200
Forwarding from [::1]:9200 -> 9200
Keep it like that and switch to another terminal window:
$ export ELASTIC_USER=$(kubectl get secret -n demo sample-es-elastic-cred -o jsonpath='{.data.username}' | base64 -d)
$ export ELASTIC_PASSWORD=$(kubectl get secret -n demo sample-es-elastic-cred -o jsonpath='{.data.password}' | base64 -d)
$ curl -XGET -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/_cluster/health?pretty"
{
"cluster_name" : "sample-es",
"status" : "green",
"timed_out" : false,
"number_of_nodes" : 3,
"number_of_data_nodes" : 3,
"active_primary_shards" : 1,
"active_shards" : 2,
"relocating_shards" : 0,
"initializing_shards" : 0,
"unassigned_shards" : 0,
"delayed_unassigned_shards" : 0,
"number_of_pending_tasks" : 0,
"number_of_in_flight_fetch" : 0,
"task_max_waiting_in_queue_millis" : 0,
"active_shards_percent_as_number" : 100.0
}
So, our cluster status is green. Let’s create some indices with dummy data:
$ curl -XPOST -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/products/_doc?pretty" -H 'Content-Type: application/json' -d '
{
"name": "KubeDB",
"vendor": "AppsCode Inc.",
"description": "Database Operator for Kubernetes"
}
'
$ curl -XPOST -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/companies/_doc?pretty" -H 'Content-Type: application/json' -d '
{
"name": "AppsCode Inc.",
"mission": "Accelerate the transition to Containers by building a Kubernetes-native Data Platform",
"products": ["KubeDB", "Stash", "KubeVault", "Kubeform", "ByteBuilders"]
}
'
Now, let’s verify that the indexes have been created successfully.
$ curl -XGET -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/_cat/indices?v&s=index&pretty"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .geoip_databases oiaZfJA8Q5CihQon0oR8hA 1 1 42 0 81.6mb 40.8mb
green open companies GuGisWJ8Tkqnq8vhREQ2-A 1 1 1 0 11.5kb 5.7kb
green open products wyu-fImDRr-Hk_GXVF7cDw 1 1 1 0 10.6kb 5.3kb
Repository Settings
The s3 repository type supports a number of settings to customize how data is stored in S3. These can be specified when creating the repository.
Let’s create the _snapshot
repository sample_s3_repo
with our bucket name sample-s3-bucket
:
$ curl -k -X PUT -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/_snapshot/sample_s3_repo?pretty" -H 'Content-Type: application/json' -d'
{
"type": "s3",
"settings": {
"bucket": "sample-s3-bucket"
}
}
'
{
"acknowledged" : true
}
We’ve successfully created our repository. Ready to take our first snapshot.
Create a Snapshot
A repository can contain multiple snapshots of the same cluster. Snapshots are identified by unique names within the cluster. For more details, visit Create a snapshot.
$ curl -k -X PUT -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/_snapshot/sample_s3_repo/snapshot_1?wait_for_completion=true&pretty"
{
"snapshot" : {
"snapshot" : "snapshot_1",
"uuid" : "JKoF5sgtS3WPBQ8A_OvWbw",
"repository" : "sample_s3_repo",
"version_id" : 7140099,
"version" : "7.14.0",
"indices" : [
".geoip_databases",
"companies",
"products"
],
"data_streams" : [ ],
"include_global_state" : true,
"state" : "SUCCESS",
"start_time" : "2021-08-24T14:45:38.930Z",
"start_time_in_millis" : 1629816338930,
"end_time" : "2021-08-24T14:46:16.946Z",
"end_time_in_millis" : 1629816376946,
"duration_in_millis" : 38016,
"failures" : [ ],
"shards" : {
"total" : 3,
"failed" : 0,
"successful" : 3
},
"feature_states" : [
{
"feature_name" : "geoip",
"indices" : [
".geoip_databases"
]
}
]
}
}
We’ve successfully taken our first snapshot.
Delete Data and Restore a Snapshot
Let’s delete all the indices:
$ curl -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" -X DELETE "https://localhost:9200/_all?pretty"
{
"acknowledged" : true
}
List and varify the deletion:
$ curl -XGET -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/_cat/indices?v&s=index&pretty"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .geoip_databases oiaZfJA8Q5CihQon0oR8hA 1 1 42 0 81.6mb 40.8mb
For more details about restore, visit Restore a snapshot.
Let’s restore the data from our snapshot_1
:
$ curl -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" -X POST "https://localhost:9200/_snapshot/sample_s3_repo/snapshot_1/_restore?pretty" -H 'Content-Type: application/json' -d'
{
"indices": "companies,products"
}
'
{
"accepted" : true
}
We’ve successfully restored our indices.
N.B.: We only wanted to restore the indices we created, but if you want to overwrite everything with the snapshot data, you can do it by setting
include_global_state
totrue
while restoring.
Varify Data
To varify our data, let’s list the indices:
$ curl -XGET -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/_cat/indices?v&s=index&pretty"
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
green open .geoip_databases oiaZfJA8Q5CihQon0oR8hA 1 1 42 0 81.6mb 40.8mb
green open companies drsv-5tvQwCcte7bkUT0uQ 1 1 1 0 11.7kb 5.8kb
green open products 7TXoXy5kRFiVgZDuyqffQA 1 1 1 0 10.6kb 5.3kb
Check the content inside:
$ curl -XGET -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/products/_search?pretty"
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "products",
"_type" : "_doc",
"_id" : "36SEeHsBS6UMHADkEvJw",
"_score" : 1.0,
"_source" : {
"name" : "KubeDB",
"vendor" : "AppsCode Inc.",
"description" : "Database Operator for Kubernetes"
}
}
]
}
}
$ curl -XGET -k -u "$ELASTIC_USER:$ELASTIC_PASSWORD" "https://localhost:9200/companies/_search?pretty"
{
"took" : 3,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 1,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "companies",
"_type" : "_doc",
"_id" : "4KSFeHsBS6UMHADkGvL5",
"_score" : 1.0,
"_source" : {
"name" : "AppsCode Inc.",
"mission" : "Accelerate the transition to Containers by building a Kubernetes-native Data Platform",
"products" : [
"KubeDB",
"Stash",
"KubeVault",
"Kubeform",
"ByteBuilders"
]
}
}
]
}
}
So, we have successfully retored our data from the snapshot.