In the current landscape of containerized applications and orchestration, deploying and maintaining databases like OpenSearch requires a flexible and efficient approach. The Kubernetes OpenSearch operator streamlines configuring, maintaining, and scaling OpenSearch databases within a Kubernetes environment. This guide covers the fundamentals of installing OpenSearch databases using the Kubernetes OpenSearch operator, including its benefits, features, and step-by-step instructions. By adopting this solution, you gain flexibility and automation in provisioning and administering your OpenSearch databases while adhering to best practices for containerized infrastructure.
Kubernetes is an open-source platform that streamlines the entire workflow for containerized applications. It lets you deploy, scale, and manage applications, whether they run on a single machine or are distributed across a multi-cloud environment.
Kubernetes streamlines the administration of multiple containers by automating critical functions such as load balancing, dynamic scaling, and ensuring application robustness with automatic recovery mechanisms. When introducing a new version of your application, Kubernetes takes charge of the update process, minimizing downtime and mitigating the risk of errors.
With a simple declarative configuration, you specify your desired application behavior, and Kubernetes works to keep the running state in line with that specification. This allows you to concentrate on developing your applications while Kubernetes handles their reliable and efficient operation. It also simplifies provisioning and troubleshooting, so you can confidently tackle the complexity of application deployment.
OpenSearch is an open-source, highly scalable search engine built for processing large volumes of data. With features including full-text search, structured search, analytics, and logging, OpenSearch is applicable across a wide range of applications and use cases. It is particularly advantageous for enterprises with substantial real-time data management and search requirements, and it stands out for its ability to deliver fast and precise search results.
OpenSearch facilitates horizontal scaling across multiple nodes, ensuring efficient handling of large data loads while maintaining continuous accessibility. Alongside its distributed design, OpenSearch accommodates a versatile data format, allowing the storage and indexing of diverse data types, such as text, numerical data, and geospatial information, whether structured or unstructured.
Integrating OpenSearch within a Kubernetes environment offers a powerful combination that brings a host of advantages. It allows for the seamless management of OpenSearch clusters at scale, ensuring optimal resource utilization and high availability, all within the robust orchestration framework of Kubernetes. Kubernetes simplifies the deployment and scaling of OpenSearch instances, making it easier to adapt to evolving data demands. Additionally, it provides a unified platform for handling both application and data infrastructure, streamlining operations and reducing complexity. This integration enhances the overall efficiency and resilience of OpenSearch deployments, facilitating real-time data processing and search capabilities within Kubernetes clusters, making it a formidable solution for modern data-driven applications.
We first have to set up the environment to deploy OpenSearch on Kubernetes using a Kubernetes OpenSearch operator. This tutorial requires a running Kubernetes cluster and a basic understanding of OpenSearch. Here, we are going to create our Kubernetes cluster using Kind. You also need Helm installed, since we use it to install the operator into the cluster.
In this article, we will use the Kubernetes OpenSearch operator KubeDB to deploy OpenSearch on Kubernetes, so KubeDB must be installed in our Kubernetes cluster. KubeDB supports the official Elasticsearch by Elastic and OpenSearch by AWS, as well as other open-source distributions such as SearchGuard and OpenDistro. All of these distributions are supported under KubeDB’s Elasticsearch CR. To set up KubeDB in our Kubernetes cluster, we require a license. Through the Appscode License Server, we can get a free enterprise license. We must provide our Kubernetes cluster ID to obtain a license. Run the following command to get the cluster ID:
$ kubectl get ns kube-system -o jsonpath='{.metadata.uid}'
6c08dcb8-8440-4388-849f-1f2b590b731e
After we provide the necessary data, the license server will email us a “license.txt” file as an attachment. Run the following command to install KubeDB:
$ helm install kubedb oci://ghcr.io/appscode-charts/kubedb \
  --version v2023.12.11 \
  --namespace kubedb --create-namespace \
  --set-file global.license=/path/to/the/license.txt \
  --wait --burst-limit=10000 --debug
Verify the installation with the following command:
$ kubectl get pods --all-namespaces -l "app.kubernetes.io/instance=kubedb"
NAMESPACE   NAME                                            READY   STATUS    RESTARTS   AGE
kubedb      kubedb-kubedb-autoscaler-8685b5f5f8-kwh9r       1/1     Running   0          2m38s
kubedb      kubedb-kubedb-dashboard-677448dff8-ggrz6        1/1     Running   0          2m38s
kubedb      kubedb-kubedb-ops-manager-f4d869f54-xbtd7       1/1     Running   0          2m38s
kubedb      kubedb-kubedb-provisioner-778795d79-zbn74       1/1     Running   0          2m38s
kubedb      kubedb-kubedb-schema-manager-64f9cc9445-vwfsk   1/1     Running   0          2m38s
kubedb      kubedb-kubedb-webhook-server-85cb5f5fdb-jtpgt   1/1     Running   0          2m38s
We can move on to the next step once every pod’s status is Running.
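Rather than re-running the command until every pod shows Running, the check can be scripted. The following is a minimal sketch, assuming kubectl is already pointed at the cluster and KubeDB was installed into the kubedb namespace as above; the function name wait_for_kubedb and the default timeout are illustrative choices, not part of KubeDB.

```shell
# Block until every pod of the KubeDB release reports the Ready condition,
# or fail once the timeout expires. The function name and the default
# timeout are illustrative, not part of KubeDB.
wait_for_kubedb() {
  kubectl wait pods \
    --namespace kubedb \
    --selector "app.kubernetes.io/instance=kubedb" \
    --for=condition=Ready \
    --timeout="${1:-300s}"
}

# Example call (needs access to the cluster):
#   wait_for_kubedb 5m
```

The function returns a non-zero exit status if any pod is still not Ready when the timeout expires, which makes it usable as a gate in a setup script.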
Now we’ll create a new namespace in which we will deploy OpenSearch. To create a namespace, we can use the following command:
$ kubectl create namespace os-demo
namespace/os-demo created
We need to create a YAML configuration to deploy OpenSearch on Kubernetes. We will apply the YAML below:
apiVersion: kubedb.com/v1alpha2
kind: Elasticsearch
metadata:
  name: os-cluster
  namespace: os-demo
spec:
  enableSSL: true
  version: opensearch-2.11.1
  storageType: Durable
  topology:
    master:
      replicas: 2
      resources:
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
    data:
      replicas: 2
      resources:
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
    ingest:
      replicas: 2
      resources:
      storage:
        storageClassName: "standard"
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 1Gi
  terminationPolicy: WipeOut
You can see the detailed yaml specifications in the Kubernetes OpenSearch documentation.
We will save this YAML configuration to os-cluster.yaml. Then create the above OpenSearch object.
$ kubectl apply -f os-cluster.yaml
elasticsearch.kubedb.com/os-cluster created
If all the above steps are handled correctly and the OpenSearch is deployed, you will see that the following objects are created:
$ kubectl get all -n os-demo
NAME                      READY   STATUS    RESTARTS   AGE
pod/os-cluster-data-0     1/1     Running   0          4m37s
pod/os-cluster-data-1     1/1     Running   0          2m39s
pod/os-cluster-ingest-0   1/1     Running   0          4m47s
pod/os-cluster-ingest-1   1/1     Running   0          2m42s
pod/os-cluster-master-0   1/1     Running   0          4m42s
pod/os-cluster-master-1   1/1     Running   0          2m36s

NAME                        TYPE        CLUSTER-IP     EXTERNAL-IP   PORT(S)    AGE
service/os-cluster          ClusterIP   10.96.99.212   <none>        9200/TCP   4m55s
service/os-cluster-master   ClusterIP   None           <none>        9300/TCP   4m55s
service/os-cluster-pods     ClusterIP   None           <none>        9200/TCP   4m55s

NAME                                 READY   AGE
statefulset.apps/os-cluster-data     2/2     4m37s
statefulset.apps/os-cluster-ingest   2/2     4m47s
statefulset.apps/os-cluster-master   2/2     4m42s

NAME                                            TYPE                       VERSION   AGE
appbinding.appcatalog.appscode.com/os-cluster   kubedb.com/elasticsearch   2.8.0     4m37s

NAME                                  VERSION             STATUS   AGE
elasticsearch.kubedb.com/os-cluster   opensearch-2.11.1   Ready    4m55s
We have successfully deployed OpenSearch to Kubernetes via the Kubernetes OpenSearch operator. Now, we will connect to the OpenSearch database, insert some sample data, and verify that our OpenSearch deployment is usable. First, check the database status:
$ kubectl get es -n os-demo os-cluster
NAME         VERSION             STATUS   AGE
os-cluster   opensearch-2.11.1   Ready    4m59s
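In a script, the same readiness check can block until the database is up instead of being polled by hand. This is a sketch under two assumptions: kubectl is version 1.23 or newer (needed for --for=jsonpath), and the STATUS column above is backed by the object's .status.phase field. The function name wait_for_opensearch and the default timeout are hypothetical.

```shell
# Wait for a KubeDB Elasticsearch object to report phase "Ready".
# Assumes kubectl 1.23+ and that STATUS maps to .status.phase.
# Arguments: object name, namespace, optional timeout (default 10m).
wait_for_opensearch() {
  kubectl wait "elasticsearch.kubedb.com/$1" \
    --namespace "$2" \
    --for=jsonpath='{.status.phase}'=Ready \
    --timeout="${3:-10m}"
}

# Example call (needs access to the cluster):
#   wait_for_opensearch os-cluster os-demo
```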
Now, we will create a few indexes in OpenSearch. When the OpenSearch YAML is deployed, the Kubernetes OpenSearch operator creates a Service with the same name as the OpenSearch object. Using this Service, we will port-forward to the database from our local workstation and establish a connection. After that, we’ll add some data to OpenSearch.
KubeDB creates a few Services to connect with the database. Let’s see the Services created by KubeDB for our OpenSearch:
$ kubectl get service -n os-demo -l=app.kubernetes.io/instance=os-cluster
NAME                TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
os-cluster          ClusterIP   10.96.220.157   <none>        9200/TCP   5m
os-cluster-master   ClusterIP   None            <none>        9300/TCP   5m
os-cluster-pods     ClusterIP   None            <none>        9200/TCP   5m
Here, we are going to use the os-cluster Service to connect with the database. Now, let’s port-forward the os-cluster Service.
$ kubectl port-forward -n os-demo svc/os-cluster 9200
Forwarding from 127.0.0.1:9200 -> 9200
Forwarding from [::1]:9200 -> 9200
The Kubernetes OpenSearch operator creates some Secrets for the database. Let’s list the Secrets for our os-cluster.
$ kubectl get secret -n os-demo -l=app.kubernetes.io/instance=os-cluster
NAME                              TYPE                       DATA   AGE
os-cluster-admin-cert             kubernetes.io/tls          3      5m
os-cluster-admin-cred             kubernetes.io/basic-auth   2      5m
os-cluster-ca-cert                kubernetes.io/tls          2      5m
os-cluster-client-cert            kubernetes.io/tls          3      5m
os-cluster-config                 Opaque                     3      5m
os-cluster-http-cert              kubernetes.io/tls          3      5m
os-cluster-kibanaro-cred          kubernetes.io/basic-auth   2      5m
os-cluster-kibanaserver-cred      kubernetes.io/basic-auth   2      5m
os-cluster-logstash-cred          kubernetes.io/basic-auth   2      5m
os-cluster-readall-cred           kubernetes.io/basic-auth   2      5m
os-cluster-snapshotrestore-cred   kubernetes.io/basic-auth   2      5m
os-cluster-transport-cert         kubernetes.io/tls          3      5m
Now, we can connect to the database with any of these Secrets whose names end in -cred. Here, we will use os-cluster-admin-cred, which contains the admin-level credentials to connect with the database.
$ kubectl get secret -n os-demo os-cluster-admin-cred -o jsonpath='{.data.username}' | base64 -d
admin
$ kubectl get secret -n os-demo os-cluster-admin-cred -o jsonpath='{.data.password}' | base64 -d
t;gmkX(o!4DuU6XP
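To avoid copying the password by hand into later commands, the two lookups can be wrapped in a small helper and stored in shell variables. This is only a sketch: the function name get_cred and the variable names OS_USER and OS_PASS are hypothetical, and the Secret name and namespace are the ones used in this tutorial.

```shell
# Read one key ("username" or "password") from the admin credential
# Secret and base64-decode it. The helper name is illustrative.
get_cred() {
  kubectl get secret -n os-demo os-cluster-admin-cred \
    -o jsonpath="{.data.$1}" | base64 -d
}

# Example usage (needs access to the cluster):
#   OS_USER="$(get_cred username)"
#   OS_PASS="$(get_cred password)"
#   curl -k --user "$OS_USER:$OS_PASS" "https://localhost:9200/?pretty"
```

Quoting "$OS_USER:$OS_PASS" in double quotes keeps special characters in the generated password from being interpreted by the shell.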
We will now use curl to post some sample data into OpenSearch. Use the -k flag to disable attempts to verify self-signed certificates for testing purposes.
$ curl -XPOST -k --user 'admin:t;gmkX(o!4DuU6XP' "https://localhost:9200/music/_doc?pretty" -H 'Content-Type: application/json' -d'
{
    "Artist": "Backstreet Boys",
    "Song": "Show Me The Meaning"
}
'
{
  "_index" : "music",
  "_id" : "MRIPuYsBGygDWO9F_G9o",
  "_version" : 1,
  "result" : "created",
  "_shards" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "_seq_no" : 0,
  "_primary_term" : 1
}
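When more than one document needs to be inserted, the _bulk endpoint accepts several in a single request. The sketch below assumes the port-forward from earlier is still running; the function names bulk_body and bulk_index are hypothetical, and the two documents are made-up sample data in the same shape as the one above.

```shell
# Emit a newline-delimited JSON body for the _bulk endpoint: one action
# line followed by one document line per record. The documents are
# made-up sample data.
bulk_body() {
  printf '%s\n' \
    '{ "index": { "_index": "music" } }' \
    '{ "Artist": "Backstreet Boys", "Song": "I Want It That Way" }' \
    '{ "index": { "_index": "music" } }' \
    '{ "Artist": "Bon Jovi", "Song": "Always" }'
}

# Send the body using "user:password" credentials passed as $1.
# --data-binary keeps the newlines intact, which the bulk API requires.
bulk_index() {
  bulk_body | curl -k --user "$1" -XPOST \
    "https://localhost:9200/_bulk?pretty" \
    -H 'Content-Type: application/x-ndjson' \
    --data-binary @-
}

# Example call (needs the port-forward to be active):
#   bulk_index 'admin:<password>'
```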
Now, let’s verify that the index has been created successfully.
$ curl -XGET -k --user 'admin:t;gmkX(o!4DuU6XP' "https://localhost:9200/_cat/indices?v&s=index&pretty"
health status index                        uuid                   pri rep docs.count docs.deleted store.size pri.store.size
green  open   .opendistro_security         MtD1G8t7SCKHdRdgESbglw   1   1         10            0    120.8kb         75.4kb
green  open   .opensearch-observability    5miOoG23QQ2tQKJYDlDV1A   1   1          0            0       416b           208b
green  open   kubedb-system                cL0sZYAaTEa7MeE_OYVXcg   1   1          1          270      1.3mb        706.3kb
green  open   music                        7jmr68IFT9S5s0W_2IaP1g   1   1          1            0      9.3kb          4.6kb
green  open   security-auditlog-2023.11.10 EbBSYaTATuaiE7efHLFaKA   1   1         12            0    346.9kb        173.2kb
Also, let’s verify the data in the indexes:
$ curl -XGET -k --user 'admin:t;gmkX(o!4DuU6XP' "https://localhost:9200/music/_search?pretty"
{
  "took" : 93,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 1,
      "relation" : "eq"
    },
    "max_score" : 1.0,
    "hits" : [
      {
        "_index" : "music",
        "_id" : "MRIPuYsBGygDWO9F_G9o",
        "_score" : 1.0,
        "_source" : {
          "Artist" : "Backstreet Boys",
          "Song" : "Show Me The Meaning"
        }
      }
    ]
  }
}
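The search above returns every document in the index. To filter by a field, a match query can be sent in the request body. This is a minimal sketch; the function names match_query and search_music are hypothetical, and the body is built with plain string formatting, so it is only suitable for values that contain no quotes or backslashes.

```shell
# Build a match query for one field. Values are inserted verbatim, so
# this sketch does not handle quotes or backslashes in the arguments.
match_query() {
  printf '{ "query": { "match": { "%s": "%s" } } }\n' "$1" "$2"
}

# Search the music index. $1 is "user:password", $2 the field name,
# $3 the text to match.
search_music() {
  match_query "$2" "$3" | curl -k --user "$1" -XGET \
    "https://localhost:9200/music/_search?pretty" \
    -H 'Content-Type: application/json' \
    --data @-
}

# Example call (needs the port-forward to be active):
#   search_music 'admin:<password>' Artist Backstreet
```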
We’ve successfully deployed OpenSearch to Kubernetes via the Kubernetes OpenSearch operator KubeDB and inserted some sample data into it.
To ensure the robustness and reliability of your application when leveraging OpenSearch on Kubernetes through the Kubernetes OpenSearch operator, there are some best practices that you should follow:
Dashboard Integration: Deploy OpenSearch Dashboards alongside your OpenSearch cluster to access real-time performance insights and efficient data visualization. Secure OpenSearch Dashboards by implementing access controls and encryption. Leverage the dashboard features to monitor the health of your OpenSearch cluster and extract valuable performance insights for your application.
High Availability: Ensure high availability by leveraging OpenSearch’s built-in data replication capabilities. Distribute data across multiple nodes to ensure redundancy and resilience. Implement load balancing to evenly distribute traffic among nodes.
Backup and Recovery: Prioritize backup and recovery by regularly taking data backups using OpenSearch snapshots or other compatible backup tools. Safeguard backups by storing them in separate locations or in cloud storage, and run disaster recovery drills to confirm that recovery works. Regularly test data restoration procedures to ensure they are effective when it matters.
Monitoring & Security: Implement a robust monitoring strategy using tools such as Prometheus, Grafana, or OpenSearch’s native monitoring features. Keep a close eye on cluster health and performance metrics to proactively address potential issues. Strengthen security by incorporating Role-Based Access Control (RBAC) and robust authentication mechanisms. Enforce Kubernetes network policies to secure communication between OpenSearch pods and maintain a resilient security posture.
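As a small starting point for the health monitoring mentioned above, the cluster health endpoint can be polled from a script. The sketch below assumes the port-forward used earlier in this guide; the function names health_status and cluster_is_green are hypothetical, and the sed expression is a simple text match rather than a full JSON parser.

```shell
# Extract the value of the "status" field from cluster health JSON
# read on stdin. A simple text match, not a full JSON parser.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# Succeed only if the cluster reports "green". $1 is "user:password".
cluster_is_green() {
  status="$(curl -sk --user "$1" "https://localhost:9200/_cluster/health" | health_status)"
  [ "$status" = "green" ]
}

# Example call (needs the port-forward to be active):
#   cluster_is_green 'admin:<password>' && echo "cluster is healthy"
```

A status of yellow means some replica shards are unassigned and red means at least one primary shard is unavailable, so treating anything other than green as a failure is a conservative choice for a multi-node cluster like the one deployed here.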
OpenSearch is a robust open-source search and analytics engine known for its capability to handle extensive and varied datasets with speed and accuracy. You have now successfully deployed an OpenSearch database on Kubernetes using the Kubernetes OpenSearch operator, a versatile solution suitable for various applications. Additional details can be found in the official OpenSearch documentation. Managing databases, whether they are located on-premises or in cloud environments, demands a substantial understanding and ongoing commitment. KubeDB provides a full support solution to ensure that your database management fulfills performance and uptime requirements. Regardless of whether your database infrastructure is localized on-site, spread across diverse geographical regions, or relies on cloud services or database-as-a-service providers, KubeDB offers indispensable support in managing the complete process within a production-grade environment.