Deploy Elasticsearch via Kubernetes Elasticsearch Operator


In today’s environment of containerized applications and orchestration, effectively deploying and maintaining databases such as Elasticsearch demands a flexible and effective strategy. The Kubernetes Elasticsearch Operator simplifies the task of setting up, maintaining, and scaling Elasticsearch databases within a Kubernetes cluster. This article covers the basics of installing an Elasticsearch database through the Elasticsearch Kubernetes Operator, examining its advantages and characteristics with step-by-step instructions. With this approach, you gain greater adaptability and automation in provisioning and managing your Elasticsearch databases, while also following containerized-infrastructure best practices.

Kubernetes is a revolutionary open-source platform that simplifies the deployment and management of containerized applications. With Kubernetes, you can effortlessly launch, scale, and oversee applications, whether they operate on a single machine or span a multi-cloud environment.

Kubernetes simplifies the management of numerous containers by automating essential tasks like load balancing, real-time scaling, and ensuring application resilience through automatic recovery. When it’s time to release a new version of your application, Kubernetes handles updates, minimizing downtime and reducing the potential for errors.

With a simple declarative configuration, you specify your desired application behavior, and Kubernetes ensures the cluster follows that specification. This allows you to concentrate on developing exceptional applications while Kubernetes guarantees their reliable and efficient operation. It also simplifies provisioning and troubleshooting, empowering you to confidently tackle the complexity of application deployment.
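As a concrete illustration of that declarative model, here is a minimal Deployment manifest (an illustrative example, not part of this tutorial): you declare three replicas of a container, and Kubernetes continuously reconciles the cluster toward that state.

```yaml
# Illustrative only: a minimal declarative Deployment. Kubernetes keeps
# three replicas of this container running and recreates any that fail.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
      - name: nginx
        image: nginx:1.25
```

You never script *how* to start or restart the pods; you only state the desired count, and the control loop does the rest.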

Why Elasticsearch in Kubernetes

Elasticsearch is an open-source, highly scalable search engine designed to handle massive amounts of data. Its features include full-text search, structured search, analytics, and log management, and it can be applied to a wide range of applications and use cases. Elasticsearch is an excellent choice for enterprises that need to manage and search through massive amounts of information in real time, since it provides fast and accurate search results.

Elasticsearch employs a distributed architecture that enables horizontal scaling across multiple nodes, ensuring it can effectively manage extensive data loads while ensuring continuous accessibility. In addition to its distributed design, Elasticsearch supports a flexible data format, enabling you to store and index various data types, including text, numerical data, and geospatial information, both structured and unstructured.

Integrating Elasticsearch within a Kubernetes environment offers a powerful combination that brings a host of advantages. It allows for the seamless management of Elasticsearch clusters at scale, ensuring optimal resource utilization and high availability, all within the robust orchestration framework of Kubernetes. Kubernetes simplifies the deployment and scaling of Elasticsearch instances, making it easier to adapt to evolving data demands. Additionally, it provides a unified platform for handling both application and data infrastructure, streamlining operations and reducing complexity. This integration enhances the overall efficiency and resilience of Elasticsearch deployments, facilitating real-time data processing and search capabilities within Kubernetes clusters, making it a formidable solution for modern data-driven applications.

Deploy Elasticsearch on Kubernetes

Pre-requisites

We have to set up the environment to deploy Elasticsearch on Kubernetes using a Kubernetes Elasticsearch operator. A running Kubernetes cluster and a fundamental understanding of Elasticsearch are required to follow this tutorial. Here, we are going to create our Kubernetes cluster using kind. Additionally, you need to install Helm on your workstation.

In this article, we will use the Kubernetes Elasticsearch operator KubeDB to deploy Elasticsearch on Kubernetes. We must have KubeDB installed in our Kubernetes cluster, and installing it requires a license. We can get a free enterprise license through the AppsCode License Server by providing our Kubernetes cluster ID. Run the following command to get the cluster ID.

$ kubectl get ns kube-system -o jsonpath='{.metadata.uid}'
6c08dcb8-8440-4388-849f-1f2b590b731e

After we provide the necessary data, the license server will email us with a “license.txt” file attached. Run the following command to install KubeDB.

$ helm install kubedb oci://ghcr.io/appscode-charts/kubedb \
  --version v2023.12.11 \
  --namespace kubedb --create-namespace \
  --set-file global.license=/path/to/the/license.txt \
  --wait --burst-limit=10000 --debug

Verify the installation with the following command:

$ kubectl get pods --all-namespaces -l "app.kubernetes.io/instance=kubedb"
NAMESPACE   NAME                                            READY   STATUS    RESTARTS   AGE
kubedb      kubedb-kubedb-autoscaler-b99bbb595-kr96z        1/1     Running   0          2m33s
kubedb      kubedb-kubedb-dashboard-69b476d689-tglfl        1/1     Running   0          2m33s
kubedb      kubedb-kubedb-ops-manager-7c6c7b6c97-9h25q      1/1     Running   0          2m33s
kubedb      kubedb-kubedb-provisioner-66f9cb9b99-pfrzb      1/1     Running   0          2m33s
kubedb      kubedb-kubedb-schema-manager-7654cd4457-d7s7j   1/1     Running   0          2m33s
kubedb      kubedb-kubedb-webhook-server-f89bc674b-g7lcb    1/1     Running   0          2m33s

Once every pod status is Running, we can go on to the next stage.
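That Running check can also be scripted. Below is a small helper (illustrative only, not part of KubeDB or kubectl) that parses a `kubectl get pods` table like the one above and confirms every pod is in the Running state:

```python
def all_running(kubectl_output: str) -> bool:
    """Return True if every row of a `kubectl get pods --all-namespaces`
    table reports STATUS == Running (the fourth column)."""
    rows = kubectl_output.strip().splitlines()[1:]  # skip the header row
    return all(row.split()[3] == "Running" for row in rows)

# Sample table taken from the output shown above (truncated to one row).
sample = """\
NAMESPACE   NAME                                         READY   STATUS    RESTARTS   AGE
kubedb      kubedb-kubedb-provisioner-66f9cb9b99-pfrzb   1/1     Running   0          2m33s
"""
print(all_running(sample))  # True
```

You could feed it the live command output from a wrapper script in CI, failing the pipeline if any operator pod is still pending.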

Create a Namespace

Now we’ll create a new namespace in which we will deploy Elasticsearch. To create a namespace, we can use the following command:

$ kubectl create namespace es-demo
namespace/es-demo created

Deploy Elasticsearch via Kubernetes Elasticsearch operator

We need to create a YAML configuration to deploy Elasticsearch on Kubernetes. We will apply the YAML below:

apiVersion: kubedb.com/v1alpha2
kind: Elasticsearch
metadata:
  name: es-topology-cluster
  namespace: es-demo
spec:
  enableSSL: true 
  version: xpack-8.11.1
  storageType: Durable
  topology:
    master:
      replicas: 2
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 512Mi
    data:
      replicas: 2
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 512Mi
    ingest:
      replicas: 2
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 512Mi

You can see the detailed YAML specification in the Kubernetes Elasticsearch documentation.

We will save this YAML configuration to es-topology-cluster.yaml, then create the Elasticsearch object:

$ kubectl apply -f es-topology-cluster.yaml
elasticsearch.kubedb.com/es-topology-cluster created

If all the above steps are handled correctly and Elasticsearch is deployed, you will see the following objects created:

$ kubectl get all -n es-demo
NAME                               READY   STATUS    RESTARTS   AGE
pod/es-topology-cluster-data-0     1/1     Running   0          4m
pod/es-topology-cluster-data-1     1/1     Running   0          4m
pod/es-topology-cluster-ingest-0   1/1     Running   0          4m
pod/es-topology-cluster-ingest-1   1/1     Running   0          4m
pod/es-topology-cluster-master-0   1/1     Running   0          4m
pod/es-topology-cluster-master-1   1/1     Running   0          4m

NAME                                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
service/es-topology-cluster          ClusterIP   10.8.11.95   <none>        9200/TCP   4m
service/es-topology-cluster-master   ClusterIP   None         <none>        9300/TCP   4m
service/es-topology-cluster-pods     ClusterIP   None         <none>        9200/TCP   4m

NAME                                          READY   AGE
statefulset.apps/es-topology-cluster-data     2/2     4m
statefulset.apps/es-topology-cluster-ingest   2/2     4m
statefulset.apps/es-topology-cluster-master   2/2     4m

NAME                                                     TYPE                       VERSION   AGE
appbinding.appcatalog.appscode.com/es-topology-cluster   kubedb.com/elasticsearch   8.11.1    4m

NAME                                           VERSION       STATUS   AGE
elasticsearch.kubedb.com/es-topology-cluster   xpack-8.11.1   Ready    4m

We have successfully deployed Elasticsearch on Kubernetes via the Kubernetes Elasticsearch operator. Now, we will connect to the Elasticsearch database, insert some sample data, and verify that our Elasticsearch is usable. First, check the database status:

$ kubectl get elasticsearch -n es-demo es-topology-cluster
NAME                  VERSION       STATUS   AGE
es-topology-cluster   xpack-8.11.1   Ready    4m

Insert sample data to the Elasticsearch database

Now, we will create a few indexes in Elasticsearch. When the Elasticsearch YAML is deployed, the Kubernetes Elasticsearch operator creates a governing Service named after the Elasticsearch object itself. Using this Service, we will port-forward to the database from our local workstation and establish a connection. After that, we’ll add some data to Elasticsearch.

Port-forward the Service

KubeDB creates a few Services to connect with the database. Let’s see the Services created by KubeDB for our Elasticsearch:

$ kubectl get service -n es-demo -l=app.kubernetes.io/instance=es-topology-cluster
NAME                         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
es-topology-cluster          ClusterIP   10.8.11.95   <none>        9200/TCP   4m
es-topology-cluster-master   ClusterIP   None         <none>        9300/TCP   4m
es-topology-cluster-pods     ClusterIP   None         <none>        9200/TCP   4m

Here, we are going to use the es-topology-cluster Service to connect with the database. Now, let’s port-forward the es-topology-cluster Service.

$ kubectl port-forward -n es-demo svc/es-topology-cluster 9200
Forwarding from 127.0.0.1:9200 -> 9200
Forwarding from [::1]:9200 -> 9200

Export the Credentials

The Kubernetes Elasticsearch operator creates several Secrets for the database. Let’s list the Secrets for our es-topology-cluster.

$ kubectl get secret -n es-demo -l=app.kubernetes.io/instance=es-topology-cluster
NAME                                              TYPE                       DATA   AGE
es-topology-cluster-apm-system-cred               kubernetes.io/basic-auth   2      5m
es-topology-cluster-beats-system-cred             kubernetes.io/basic-auth   2      5m
es-topology-cluster-ca-cert                       kubernetes.io/tls          2      5m
es-topology-cluster-client-cert                   kubernetes.io/tls          3      5m
es-topology-cluster-config                        Opaque                     1      5m
es-topology-cluster-elastic-cred                  kubernetes.io/basic-auth   2      5m
es-topology-cluster-http-cert                     kubernetes.io/tls          3      5m
es-topology-cluster-kibana-system-cred            kubernetes.io/basic-auth   2      5m
es-topology-cluster-logstash-system-cred          kubernetes.io/basic-auth   2      5m
es-topology-cluster-remote-monitoring-user-cred   kubernetes.io/basic-auth   2      5m
es-topology-cluster-transport-cert                kubernetes.io/tls          3      5m

Now, we can connect to the database with any of the Secrets that have the suffix cred. Here, we will use es-topology-cluster-elastic-cred, which contains the admin-level credentials for the database.

$ kubectl get secret -n es-demo es-topology-cluster-elastic-cred -o jsonpath='{.data.username}' | base64 -d
elastic
$ kubectl get secret -n es-demo es-topology-cluster-elastic-cred -o jsonpath='{.data.password}' | base64 -d
q)UC;l!!euEnk.ZZ

We will now use curl to post some sample data into Elasticsearch. The -k flag disables certificate verification, since the cluster uses self-signed certificates; use it for testing purposes only.

$ curl -XPOST -k --user 'elastic:q)UC;l!!euEnk.ZZ' "https://localhost:9200/music/_doc?pretty" -H 'Content-Type: application/json' -d'
                  {
                      "Artist": "Bon Jovi",
                      "Song": "Its My Life"
                  }
                  '
{
  "_index" : "music",
  "_id" : "oyO52YYBpvYWEhqZyPph",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
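Under the hood, curl’s --user flag is plain HTTP Basic authentication. A short sketch of the Authorization header it sends (the helper name is ours, not an Elasticsearch API):

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the HTTP Basic Authorization header that
    curl's --user 'name:password' flag sends."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

# Using the credentials decoded from the Secret above:
print(basic_auth_header("elastic", "q)UC;l!!euEnk.ZZ"))
```

Any HTTP client that can set this header (requests, a browser extension, etc.) can talk to the port-forwarded endpoint the same way curl does.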

Now, let’s verify that the index has been created successfully.

$ curl -XGET -k --user 'elastic:q)UC;l!!euEnk.ZZ' "https://localhost:9200/_cat/indices?v&s=index&pretty"
health status index         uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   kubedb-system zeIvl0RAQ6uhhcvW5GoWfg   1   1          1            5      834kb        404.9kb
green  open   music         _dAwOpXMSK2eHmc2Un7MNA   1   1          1            0     10.6kb          5.3kb
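The `_cat/indices?v` endpoint returns a whitespace-separated table, which is convenient to script against. A small parsing helper (illustrative only, not part of any official Elasticsearch client) that turns it into dictionaries for automated health checks:

```python
def parse_cat_indices(text: str) -> list[dict]:
    """Parse `_cat/indices?v` output: the first line holds the column
    names; each following line is one index row."""
    lines = text.strip().splitlines()
    headers = lines[0].split()
    return [dict(zip(headers, line.split())) for line in lines[1:]]

# Sample row taken from the output above.
sample = """\
health status index  uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   music  _dAwOpXMSK2eHmc2Un7MNA   1   1          1            0     10.6kb          5.3kb
"""
rows = parse_cat_indices(sample)
print(rows[0]["health"], rows[0]["index"])  # green music
```

From there you can assert, for example, that every index reports green health before promoting a deployment.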

Also, let’s verify the data in the indexes:

$ curl -XGET -k --user 'elastic:q)UC;l!!euEnk.ZZ' "https://localhost:9200/music/_search?pretty"
{
  "took" : 11,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "music",
        "_id" : "oyO52YYBpvYWEhqZyPph",
        "_score" : 1.0,
        "_source" : {
          "Artist" : "Bon Jovi",
          "Song" : "Its My Life"
        }
      }
    ]
  }
}
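The documents themselves sit nested under `hits.hits[]._source` in the search response. A helper (illustrative only) that pulls them out of a response already parsed from JSON:

```python
def extract_sources(response: dict) -> list[dict]:
    """Return the _source documents from an Elasticsearch search response."""
    return [hit["_source"] for hit in response["hits"]["hits"]]

# A trimmed-down version of the search response shown above.
response = {
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "hits": [{"_index": "music", "_id": "oyO52YYBpvYWEhqZyPph",
                  "_source": {"Artist": "Bon Jovi", "Song": "Its My Life"}}],
    }
}
print(extract_sources(response))  # [{'Artist': 'Bon Jovi', 'Song': 'Its My Life'}]
```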

We’ve successfully deployed Elasticsearch on Kubernetes via the Kubernetes Elasticsearch operator KubeDB and inserted some sample data into it.

Elasticsearch on Kubernetes: Best Practices

To ensure the stability of your application while running Elasticsearch on Kubernetes with the Kubernetes Elasticsearch operator, there are some best practices you should follow:

  • Use a Dashboard like Kibana: Deploy Kibana alongside Elasticsearch for real-time performance insights and data visualization. Secure Kibana with access controls and encryption, and leverage its dashboards and visualization tools to monitor Elasticsearch cluster health and gain valuable application performance insights.

  • High Availability: Ensure high availability by utilizing Elasticsearch’s data replication features, distributing data across multiple nodes for redundancy. Kubernetes pod anti-affinity rules help prevent single points of failure, and using a load balancer evenly distributes traffic among nodes.

  • Backup and Recovery: Prioritize backup and recovery with regular data backups using Elasticsearch snapshots or available backup and recovery solutions. Store backups in separate locations or cloud storage for disaster recovery, and routinely test data restoration processes for reliability.

  • Monitoring & Security: Implement comprehensive monitoring using tools like Prometheus, Grafana, or Elasticsearch’s monitoring features to track cluster health and performance metrics. Strengthen security with RBAC and robust authentication mechanisms, and apply Kubernetes network policies to restrict traffic to Elasticsearch pods.
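As an illustration of the anti-affinity point above, here is a hedged sketch of a podAntiAffinity rule that spreads pods with the same instance label across nodes. The label selector is an assumption on our part; check the labels KubeDB actually sets with `kubectl get pods -n es-demo --show-labels` before using anything like this.

```yaml
# Illustrative only: keep Elasticsearch pods of the same instance on
# different nodes so one node failure cannot take out all replicas.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/instance: es-topology-cluster
      topologyKey: kubernetes.io/hostname
```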

Conclusion

Elasticsearch is a powerful, open-source search and analytics engine renowned for its ability to handle large and diverse datasets with speed and precision. You have now successfully installed an Elasticsearch database on Kubernetes using the Kubernetes Elasticsearch operator, which is suitable for a variety of use cases. You can visit the official Elasticsearch documentation for further details. Managing databases, whether on-premises or in the cloud, requires extensive knowledge and consistent practice. KubeDB provides a full support solution to ensure that your database management fulfills performance and uptime requirements. Whether your database infrastructure is hosted on-site, geographically distributed, or utilizing cloud services or database-as-a-service providers, KubeDB offers assistance in managing the entire process in a production-grade environment.

