Deploy OpenSearch via Kubernetes OpenSearch Operator


Running a search engine such as OpenSearch in a containerized environment calls for a flexible, automated approach to deployment and upkeep. The Kubernetes OpenSearch Operator streamlines configuring, maintaining, and scaling OpenSearch clusters within a Kubernetes environment. This guide covers the fundamentals of deploying OpenSearch with the OpenSearch Kubernetes Operator: its benefits, its features, and step-by-step instructions. By adopting this solution, you can bring more flexibility and automation to the provisioning and administration of your OpenSearch clusters while following best practices for containerized infrastructure.

Kubernetes is an open-source platform that streamlines the entire workflow for containerized applications. It lets you deploy, scale, and manage applications, whether they run on a single machine or are distributed across a multi-cloud environment.

Kubernetes streamlines the administration of multiple containers by automating critical functions such as load balancing, dynamic scaling, and ensuring application robustness with automatic recovery mechanisms. When introducing a new version of your application, Kubernetes takes charge of the update process, minimizing downtime and mitigating the risk of errors.

With a simple declarative configuration, you specify the desired state of your application, and Kubernetes works continuously to keep the running system in line with that specification. This lets you concentrate on developing your applications while Kubernetes handles their reliable and efficient operation. It also simplifies provisioning and troubleshooting, so you can tackle the complexity of application deployment with confidence.

Why OpenSearch in Kubernetes

OpenSearch is an open-source, highly scalable search engine built for processing large volumes of data. With features such as full-text search, structured search, analytics, and logging, OpenSearch is applicable across a wide range of applications and use cases. It is particularly useful for enterprises with substantial real-time data management and search requirements, where fast and precise search results matter.

OpenSearch facilitates horizontal scaling across multiple nodes, ensuring efficient handling of large data loads while maintaining continuous accessibility. Alongside its distributed design, OpenSearch accommodates a versatile data format, allowing the storage and indexing of diverse data types, such as text, numerical data, and geospatial information, whether structured or unstructured.

Integrating OpenSearch within a Kubernetes environment offers a powerful combination that brings a host of advantages. It allows for the seamless management of OpenSearch clusters at scale, ensuring optimal resource utilization and high availability, all within the robust orchestration framework of Kubernetes. Kubernetes simplifies the deployment and scaling of OpenSearch instances, making it easier to adapt to evolving data demands. Additionally, it provides a unified platform for handling both application and data infrastructure, streamlining operations and reducing complexity. This integration enhances the overall efficiency and resilience of OpenSearch deployments, facilitating real-time data processing and search capabilities within Kubernetes clusters, making it a formidable solution for modern data-driven applications.

Deploying OpenSearch on Kubernetes

Pre-requisites

Before deploying OpenSearch with a Kubernetes OpenSearch operator, we need to set up the environment. This tutorial requires a running Kubernetes cluster and a basic understanding of OpenSearch. Here, we will create our Kubernetes cluster using Kind. You also need Helm installed on your workstation, since we use it to install the operator into the cluster.
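The cluster setup described above can be sketched as a small shell function. This is a minimal sketch rather than a required procedure: the function name create_kind_cluster and the cluster name os-demo are assumptions made for illustration, and it presumes that kind, kubectl, and helm are already on your PATH.

```shell
# Sketch: create a local Kind cluster and confirm that kubectl and Helm are usable.
# The function name and the cluster name passed to it are illustrative assumptions.
create_kind_cluster() {
  local name="$1"
  kind create cluster --name "$name" || return 1
  # Kind registers the kubectl context as "kind-<name>".
  kubectl cluster-info --context "kind-$name" || return 1
  # Helm is a client-side tool; this only confirms that it is installed.
  helm version --short
}

# Example call (commented out so that sourcing this snippet has no side effects):
# create_kind_cluster os-demo
```

If the function returns successfully, the cluster is reachable and Helm is available for the KubeDB installation step.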

In this article, we will use the Kubernetes OpenSearch operator KubeDB to deploy OpenSearch on Kubernetes, so KubeDB must be installed in our Kubernetes cluster. KubeDB supports the official Elasticsearch by Elastic and OpenSearch by AWS, as well as other open-source distributions such as SearchGuard and OpenDistro. All of these distributions are managed through the Elasticsearch CR of KubeDB, which is why the OpenSearch manifest later in this guide uses kind: Elasticsearch. To set up KubeDB in our Kubernetes cluster, we require a license. Through the Appscode License Server, we can get a free enterprise license. We must provide our Kubernetes cluster ID to obtain a license. Run the following command to get the cluster ID.

$ kubectl get ns kube-system -o jsonpath='{.metadata.uid}'
6c08dcb8-8440-4388-849f-1f2b590b731e

After we provide the necessary data, the license server will email us a “license.txt” file as an attachment. Run the following command to install KubeDB.

$ helm install kubedb oci://ghcr.io/appscode-charts/kubedb \
  --version v2023.12.11 \
  --namespace kubedb --create-namespace \
  --set-file global.license=/path/to/the/license.txt \
  --wait --burst-limit=10000 --debug

Verify the installation with the following command:

$ kubectl get pods --all-namespaces -l "app.kubernetes.io/instance=kubedb"
NAMESPACE   NAME                                            READY   STATUS    RESTARTS   AGE
kubedb      kubedb-kubedb-autoscaler-8685b5f5f8-kwh9r       1/1     Running   0          2m38s
kubedb      kubedb-kubedb-dashboard-677448dff8-ggrz6        1/1     Running   0          2m38s
kubedb      kubedb-kubedb-ops-manager-f4d869f54-xbtd7       1/1     Running   0          2m38s
kubedb      kubedb-kubedb-provisioner-778795d79-zbn74       1/1     Running   0          2m38s
kubedb      kubedb-kubedb-schema-manager-64f9cc9445-vwfsk   1/1     Running   0          2m38s
kubedb      kubedb-kubedb-webhook-server-85cb5f5fdb-jtpgt   1/1     Running   0          2m38s

Once every pod shows the Running status, we can move on to the next stage.
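Instead of re-running the command above until every pod is up, the check can be scripted with kubectl wait. The sketch below assumes the kubedb namespace and the app.kubernetes.io/instance=kubedb label used in the installation step; the function name wait_for_kubedb and the default timeout are illustrative choices.

```shell
# Sketch: block until every KubeDB operator pod reports the Ready condition.
# The function name and the default timeout are illustrative assumptions.
wait_for_kubedb() {
  local timeout="${1:-300s}"
  kubectl wait pods \
    --namespace kubedb \
    --selector "app.kubernetes.io/instance=kubedb" \
    --for=condition=Ready \
    --timeout="$timeout"
}

# Example call (commented out so that sourcing this snippet has no side effects):
# wait_for_kubedb 300s
```

kubectl wait exits with a non-zero status if the timeout expires first, so the function can gate later steps in a script.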

Create a Namespace

Now we’ll create a new namespace in which we will deploy OpenSearch. To create a namespace, we can use the following command:

$ kubectl create namespace os-demo
namespace/os-demo created

Deploy OpenSearch via Kubernetes OpenSearch operator

To deploy OpenSearch on Kubernetes, we need to create a YAML configuration. We will apply the following YAML:

apiVersion: kubedb.com/v1alpha2
kind: Elasticsearch
metadata:
  name: os-cluster
  namespace: os-demo
spec:
  enableSSL: true 
  version: opensearch-2.11.1
  storageType: Durable
  topology:
    master:
      replicas: 2
      resources:
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
    data:
      replicas: 2
      resources:
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
    ingest:
      replicas: 2
      resources:
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
  terminationPolicy: WipeOut

You can see the detailed yaml specifications in the Kubernetes OpenSearch documentation.

We will save this YAML configuration as os-cluster.yaml and then create the OpenSearch object.

$ kubectl apply -f os-cluster.yaml
elasticsearch.kubedb.com/os-cluster created

If all the above steps complete correctly and OpenSearch is deployed, you will see that the following objects have been created:

$ kubectl get all -n os-demo
NAME                      READY   STATUS    RESTARTS   AGE
pod/os-cluster-data-0     1/1     Running   0          4m37s
pod/os-cluster-data-1     1/1     Running   0          2m39s
pod/os-cluster-ingest-0   1/1     Running   0          4m47s
pod/os-cluster-ingest-1   1/1     Running   0          2m42s
pod/os-cluster-master-0   1/1     Running   0          4m42s
pod/os-cluster-master-1   1/1     Running   0          2m36s

NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/os-cluster          ClusterIP   10.96.99.212   <none>        9200/TCP   4m55s
service/os-cluster-master   ClusterIP   None           <none>        9300/TCP   4m55s
service/os-cluster-pods     ClusterIP   None           <none>        9200/TCP   4m55s

NAME                                 READY   AGE
statefulset.apps/os-cluster-data     2/2     4m37s
statefulset.apps/os-cluster-ingest   2/2     4m47s
statefulset.apps/os-cluster-master   2/2     4m42s

NAME                                            TYPE                       VERSION   AGE
appbinding.appcatalog.appscode.com/os-cluster   kubedb.com/elasticsearch   2.8.0     4m37s

NAME                                  VERSION             STATUS   AGE
elasticsearch.kubedb.com/os-cluster   opensearch-2.11.1   Ready    4m55s

We have successfully deployed OpenSearch to Kubernetes via the Kubernetes OpenSearch operator. Next, we will connect to the OpenSearch database, insert some sample data, and verify that the deployment is usable. First, check the database status:

$ kubectl get es -n os-demo os-cluster
NAME         VERSION             STATUS   AGE
os-cluster   opensearch-2.11.1   Ready    4m59s
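The status check above can also be polled from a script. The following sketch assumes that the STATUS column is backed by the .status.phase field of the Elasticsearch object; the function name wait_for_es_ready and its retry parameters are hypothetical.

```shell
# Sketch: poll the Elasticsearch object until its phase is "Ready".
# Assumption: the STATUS column shown by "kubectl get es" comes from .status.phase.
# Arguments: namespace, object name, max attempts (default 60), delay in seconds (default 5).
wait_for_es_ready() {
  local ns="$1" name="$2" tries="${3:-60}" delay="${4:-5}" phase="" i=0
  while [ "$i" -lt "$tries" ]; do
    phase="$(kubectl get es -n "$ns" "$name" -o jsonpath='{.status.phase}' 2>/dev/null)"
    if [ "$phase" = "Ready" ]; then
      echo "$name is Ready"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $name (last phase: ${phase:-unknown})" >&2
  return 1
}

# Example call (commented out so that sourcing this snippet has no side effects):
# wait_for_es_ready os-demo os-cluster
```

The function returns 0 as soon as the phase is Ready and 1 if the attempts run out, which makes it suitable as a gate before inserting data.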

Insert sample data to the OpenSearch database

Now, we will create an index in OpenSearch. When the OpenSearch YAML is deployed, the Kubernetes OpenSearch operator creates a governing Service with the same name as the OpenSearch object itself. Using this Service, we will port-forward to the database from our local workstation and establish a connection. After that, we’ll add some data to OpenSearch.

Port-forward the Service

KubeDB creates a few Services for connecting to the database. Let’s see the Services created by KubeDB for our OpenSearch:

$ kubectl get service -n os-demo -l=app.kubernetes.io/instance=os-cluster
NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
os-cluster          ClusterIP   10.96.220.157   <none>        9200/TCP   5m
os-cluster-master   ClusterIP   None            <none>        9300/TCP   5m
os-cluster-pods     ClusterIP   None            <none>        9200/TCP   5m

Here, we are going to use the os-cluster Service to connect with the database. Now, let’s port-forward the os-cluster Service.

$ kubectl port-forward -n os-demo svc/os-cluster 9200
Forwarding from 127.0.0.1:9200 -> 9200
Forwarding from [::1]:9200 -> 9200

Export the Credentials

The Kubernetes OpenSearch operator creates several Secrets for the database. Let’s list the Secrets for our os-cluster.

$ kubectl get secret -n os-demo -l=app.kubernetes.io/instance=os-cluster
NAME                              TYPE                       DATA   AGE
os-cluster-admin-cert             kubernetes.io/tls          3      5m
os-cluster-admin-cred             kubernetes.io/basic-auth   2      5m
os-cluster-ca-cert                kubernetes.io/tls          2      5m
os-cluster-client-cert            kubernetes.io/tls          3      5m
os-cluster-config                 Opaque                     3      5m
os-cluster-http-cert              kubernetes.io/tls          3      5m
os-cluster-kibanaro-cred          kubernetes.io/basic-auth   2      5m
os-cluster-kibanaserver-cred      kubernetes.io/basic-auth   2      5m
os-cluster-logstash-cred          kubernetes.io/basic-auth   2      5m
os-cluster-readall-cred           kubernetes.io/basic-auth   2      5m
os-cluster-snapshotrestore-cred   kubernetes.io/basic-auth   2      5m
os-cluster-transport-cert         kubernetes.io/tls          3      5m

Now, we can connect to the database with any of these Secrets whose names end with the suffix cred. Here, we will use os-cluster-admin-cred, which contains the admin-level credentials for the database.

$ kubectl get secret -n os-demo os-cluster-admin-cred -o jsonpath='{.data.username}' | base64 -d
admin
$ kubectl get secret -n os-demo os-cluster-admin-cred -o jsonpath='{.data.password}' | base64 -d
t;gmkX(o!4DuU6XP
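To avoid pasting the password into every command, the two lookups above can be wrapped in a helper and stored in shell variables. This is a minimal sketch: the function name get_secret_field and the variable names OS_USER and OS_PASS are hypothetical, chosen only for illustration.

```shell
# Sketch: read one key from a Kubernetes Secret and base64-decode it.
# Arguments: namespace, Secret name, key under .data (for example username or password).
get_secret_field() {
  local ns="$1" secret="$2" key="$3"
  kubectl get secret -n "$ns" "$secret" -o jsonpath="{.data.$key}" | base64 -d
}

# Example (commented out): keep the credentials in variables rather than in shell history.
# OS_USER="$(get_secret_field os-demo os-cluster-admin-cred username)"
# OS_PASS="$(get_secret_field os-demo os-cluster-admin-cred password)"
# curl -k --user "$OS_USER:$OS_PASS" "https://localhost:9200/_cluster/health?pretty"
```

Quoting "$OS_USER:$OS_PASS" in double quotes keeps special characters in the generated password, such as ; or (, from being interpreted by the shell.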

We will now use curl to post some sample data into OpenSearch. Because the cluster uses self-signed certificates, we pass the -k flag to skip certificate verification; this is acceptable for testing purposes only.

$ curl -XPOST -k --user 'admin:t;gmkX(o!4DuU6XP' "https://localhost:9200/music/_doc?pretty" -H 'Content-Type: application/json' -d'
{
    "Artist": "Backstreet Boys",
    "Song": "Show Me The Meaning"
}
'
{
  "_index" : "music",
  "_id" : "MRIPuYsBGygDWO9F_G9o",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}

Now, let’s verify that the index has been created successfully.

$ curl -XGET -k --user 'admin:t;gmkX(o!4DuU6XP' "https://localhost:9200/_cat/indices?v&s=index&pretty"
health status index                        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .opendistro_security         MtD1G8t7SCKHdRdgESbglw   1   1         10            0    120.8kb         75.4kb
green  open   .opensearch-observability    5miOoG23QQ2tQKJYDlDV1A   1   1          0            0       416b           208b
green  open   kubedb-system                cL0sZYAaTEa7MeE_OYVXcg   1   1          1          270      1.3mb        706.3kb
green  open   music                        7jmr68IFT9S5s0W_2IaP1g   1   1          1            0      9.3kb          4.6kb
green  open   security-auditlog-2023.11.10 EbBSYaTATuaiE7efHLFaKA   1   1         12            0    346.9kb        173.2kb

Also, let’s verify the data in the indexes:

$ curl -XGET -k --user 'admin:t;gmkX(o!4DuU6XP' "https://localhost:9200/music/_search?pretty"
{
  "took" : 93,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "music",
        "_id" : "MRIPuYsBGygDWO9F_G9o",
        "_score" : 1.0,
        "_source" : {
          "Artist" : "Backstreet Boys",
          "Song" : "Show Me The Meaning"
        }
      }
    ]
  }
}
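The search above returns every document in the index. To look up a specific document, a match query can be sent in the request body. The sketch below only builds that body; the helper name build_match_query is hypothetical, and the example call assumes the port-forward from the earlier step is still running and uses a placeholder for the password.

```shell
# Sketch: build the JSON body for a simple match query.
# The field and text are interpolated verbatim, so this is only suitable for
# values that contain no double quotes or backslashes.
build_match_query() {
  local field="$1" text="$2"
  printf '{"query":{"match":{"%s":"%s"}}}' "$field" "$text"
}

# Example (commented out; replace <password> with the value read from the Secret):
# curl -XGET -k --user 'admin:<password>' "https://localhost:9200/music/_search?pretty" \
#   -H 'Content-Type: application/json' \
#   -d "$(build_match_query Artist Backstreet)"
```

For arbitrary user input, a JSON-aware tool would be a safer way to build the body than string interpolation.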

We’ve successfully deployed OpenSearch to Kubernetes via the Kubernetes OpenSearch operator KubeDB and inserted some sample data into it.

OpenSearch on Kubernetes: Best Practices

To ensure the robustness and reliability of your application when leveraging OpenSearch on Kubernetes through the Kubernetes OpenSearch operator, there are some best practices that you should follow:

  • Dashboard Integration: Deploy OpenSearch Dashboards alongside your OpenSearch cluster to access real-time performance insights and efficient data visualization. Secure OpenSearch Dashboards by implementing access controls and encryption. Leverage the dashboard features to monitor the health of your OpenSearch cluster and extract valuable performance insights for your application.

  • High Availability: Ensure high availability by leveraging OpenSearch’s built-in data replication capabilities. Distribute data across multiple nodes to ensure redundancy and resilience. Implement load balancing to evenly distribute traffic among nodes.

  • Backup and Recovery: Prioritize backup and recovery by taking regular data backups with OpenSearch snapshots or other compatible backup tools. Safeguard backups by storing them in separate locations or in cloud storage, and run disaster recovery drills to confirm that the process works. Regularly test data restoration procedures so that they can be relied on in critical scenarios.

  • Monitoring & Security: Implement a robust monitoring strategy using tools such as Prometheus, Grafana, or OpenSearch’s native monitoring features. Keep a close eye on cluster health and performance metrics to proactively address potential issues. Strengthen security by incorporating Role-Based Access Control (RBAC) and robust authentication mechanisms. Enforce Kubernetes network policies to secure communication between OpenSearch pods and maintain a resilient security posture.
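As a starting point for the health monitoring mentioned above, the cluster health endpoint reports a status of green, yellow, or red. The sketch below extracts that value from the response; the helper name cluster_status is hypothetical, and the example call assumes the port-forward and admin credentials from the earlier steps.

```shell
# Sketch: read a _cluster/health JSON response on stdin and print its "status" value.
# Uses sed for a lightweight extraction rather than a full JSON parser.
cluster_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Example (commented out; replace <password> with the value read from the Secret):
# curl -s -k --user 'admin:<password>' "https://localhost:9200/_cluster/health" | cluster_status
```

A yellow status typically means that some replica shards are unassigned, and red means that at least one primary shard is unavailable, so a script can alert on anything other than green.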

Conclusion

OpenSearch is a robust open-source search and analytics engine known for handling large and varied datasets with speed and accuracy. You have now successfully deployed an OpenSearch database on Kubernetes using the Kubernetes OpenSearch operator, a versatile solution suitable for various applications. Additional details can be found in the official OpenSearch documentation. Managing databases, whether on-premises or in cloud environments, demands substantial expertise and ongoing commitment. KubeDB provides a complete support solution to help your database management meet performance and uptime requirements. Whether your database infrastructure is hosted on-site, spread across geographical regions, or built on cloud services or database-as-a-service providers, KubeDB helps you manage the complete process in a production-grade environment.
