New to KubeDB? Please start here.

Monitoring Apache Druid with KubeDB

KubeDB has native support for monitoring via Prometheus. You can use builtin Prometheus scraper or Prometheus operator to monitor KubeDB managed databases. This tutorial will show you how database monitoring works with KubeDB and how to configure Database crd to enable monitoring.

Overview

KubeDB uses Prometheus exporter images to export Prometheus metrics for respective databases. As KubeDB supports Druid versions in KRaft mode, and the officially recognized exporter image doesn’t expose metrics for them yet - KubeDB managed Druid instances use JMX Exporter instead. This exporter is intended to be run as a Java Agent inside Druid container, exposing a HTTP server and serving metrics of the local JVM. To Following diagram shows the logical flow of database monitoring with KubeDB.

  Database Monitoring Flow

When a user creates a Druid crd with spec.monitor section configured, KubeDB operator provisions the respective Druid cluster while running the exporter as a Java agent inside the druid containers. It also creates a dedicated stats service with name {database-crd-name}-stats for monitoring. Prometheus server can scrape metrics using this stats service.

Configure Monitoring

In order to enable monitoring for a database, you have to configure spec.monitor section. KubeDB provides following options to configure spec.monitor section:

FieldTypeUses
spec.monitor.agentRequiredType of the monitoring agent that will be used to monitor this database. It can be prometheus.io/builtin or prometheus.io/operator.
spec.monitor.prometheus.exporter.portOptionalPort number where the exporter side car will serve metrics.
spec.monitor.prometheus.exporter.argsOptionalArguments to pass to the exporter sidecar.
spec.monitor.prometheus.exporter.envOptionalList of environment variables to set in the exporter sidecar container.
spec.monitor.prometheus.exporter.resourcesOptionalResources required by exporter sidecar container.
spec.monitor.prometheus.exporter.securityContextOptionalSecurity options the exporter should run with.
spec.monitor.prometheus.serviceMonitor.labelsOptionalLabels for ServiceMonitor crd.
spec.monitor.prometheus.serviceMonitor.intervalOptionalInterval at which metrics should be scraped.

Sample Configuration

A sample YAML for TLS secured Druid crd with spec.monitor section configured to enable monitoring with Prometheus operator is shown below.

apiVersion: kubedb.com/v1alpha2
kind: Druid
metadata:
  name: druid-with-monitoring
  namespace: demo
spec:
  version: 28.0.1
  deepStorage:
    type: s3
    configSecret:
      name: deep-storage-config
  topology:
    routers:
      replicas: 1
  monitor:
    agent: prometheus.io/operator
    prometheus:
      serviceMonitor:
        labels:
          release: prometheus
        interval: 10s
  deletionPolicy: WipeOut

Create External Dependency (Deep Storage)

Before proceeding further, we need to prepare deep storage, which is one of the external dependency of Druid and used for storing the segments. It is a storage mechanism that Apache Druid does not provide. Amazon S3, Google Cloud Storage, or Azure Blob Storage, S3-compatible storage (like Minio), or HDFS are generally convenient options for deep storage.

In this tutorial, we will run a minio-server as deep storage in our local kind cluster using minio-operator and create a bucket named druid in it, which the deployed druid database will use.


$ helm repo add minio https://operator.min.io/
$ helm repo update minio
$ helm upgrade --install --namespace "minio-operator" --create-namespace "minio-operator" minio/operator --set operator.replicaCount=1

$ helm upgrade --install --namespace "demo" --create-namespace druid-minio minio/tenant \
--set tenant.pools[0].servers=1 \
--set tenant.pools[0].volumesPerServer=1 \
--set tenant.pools[0].size=1Gi \
--set tenant.certificate.requestAutoCert=false \
--set tenant.buckets[0].name="druid" \
--set tenant.pools[0].name="default"

Now we need to create a Secret named deep-storage-config. It contains the necessary connection information using which the druid database will connect to the deep storage.

apiVersion: v1
kind: Secret
metadata:
  name: deep-storage-config
  namespace: demo
stringData:
  druid.storage.type: "s3"
  druid.storage.bucket: "druid"
  druid.storage.baseKey: "druid/segments"
  druid.s3.accessKey: "minio"
  druid.s3.secretKey: "minio123"
  druid.s3.protocol: "http"
  druid.s3.enablePathStyleAccess: "true"
  druid.s3.endpoint.signingRegion: "us-east-1"
  druid.s3.endpoint.url: "http://myminio-hl.demo.svc.cluster.local:9000/"

Let’s create the deep-storage-config Secret shown above:

$ kubectl create -f https://github.com/kubedb/docs/raw/v2024.11.18/docs/guides/druid/monitoring/yamls/deep-storage-config.yaml
secret/deep-storage-config created

Let’s deploy the above druid example by the following command:

$ kubectl create -f https://github.com/kubedb/docs/raw/v2024.11.18/docs/guides/druid/monitoring/yamls/druid-with-monitoring.yaml
druid.kubedb.com/druid created

Here, we have specified that we are going to monitor this server using Prometheus operator through spec.monitor.agent: prometheus.io/operator. KubeDB will create a ServiceMonitor crd in databases namespace and this ServiceMonitor will have release: prometheus label.

Next Steps