New to KubeDB? Please start here.
Streaming Replication
Streaming Replication provides asynchronous replication to one or more standby servers.
Before You Begin
First, you need a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with it. If you do not already have a cluster, you can create one using kind.
Now, install the KubeDB CLI on your workstation and the KubeDB operator in your cluster following the steps here.
To keep things isolated, this tutorial uses a separate namespace called demo throughout.
$ kubectl create ns demo
namespace/demo created
Note: YAML files used in this tutorial are stored in the docs/examples/postgres folder of the GitHub repository kubedb/docs.
Create PostgreSQL with Streaming replication
The example below shows a KubeDB Postgres object configured for Streaming Replication:
apiVersion: kubedb.com/v1
kind: Postgres
metadata:
  name: ha-postgres
  namespace: demo
spec:
  version: "13.13"
  replicas: 3
  storageType: Durable
  storage:
    storageClassName: "standard"
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
In this example:
- This Postgres object creates three PostgreSQL servers, indicated by the replicas field.
- One server will be primary and the other two will be warm standby servers, the default for spec.standbyMode.
What is Streaming Replication
Streaming Replication allows a standby server to stay more up-to-date by continuously shipping and applying WAL (XLOG) records. The standby connects to the primary, which streams WAL records to the standby as they’re generated, without waiting for the WAL file to be filled.
Streaming Replication is asynchronous by default. As a result, there is a small delay between committing a transaction in the primary and the changes becoming visible in the standby.
Streaming Replication setup
The following parameters are set in postgresql.conf for both the primary and the standby servers:
wal_level = replica
max_wal_senders = 99
wal_keep_segments = 32
Here,
- wal_keep_segments specifies the minimum number of past log file segments kept in the pg_xlog directory (named pg_wal since PostgreSQL 10).
And the following are set in recovery.conf for the standby servers:
standby_mode = on
trigger_file = '/tmp/pg-failover-trigger'
recovery_target_timeline = 'latest'
primary_conninfo = 'application_name=$HOSTNAME host=$PRIMARY_HOST'
Here,
- trigger_file is the file whose creation triggers a standby to take over as the primary server.
- $PRIMARY_HOST holds the name of the Kubernetes Service that targets the primary server.
Now create this Postgres object with Streaming Replication support:
$ kubectl create -f https://github.com/kubedb/docs/raw/v2024.12.18/docs/examples/postgres/clustering/ha-postgres.yaml
postgres.kubedb.com/ha-postgres created
The KubeDB operator creates three Pods, each running a PostgreSQL server.
$ kubectl get pods -n demo --selector="app.kubernetes.io/instance=ha-postgres" --show-labels
NAME READY STATUS RESTARTS AGE LABELS
ha-postgres-0 1/1 Running 0 20s controller-revision-hash=ha-postgres-6b7998ccfd,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres,kubedb.com/role=primary,petset.kubernetes.io/pod-name=ha-postgres-0
ha-postgres-1 1/1 Running 0 16s controller-revision-hash=ha-postgres-6b7998ccfd,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres,kubedb.com/role=replica,petset.kubernetes.io/pod-name=ha-postgres-1
ha-postgres-2 1/1 Running 0 10s controller-revision-hash=ha-postgres-6b7998ccfd,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres,kubedb.com/role=replica,petset.kubernetes.io/pod-name=ha-postgres-2
Here,
- Pod ha-postgres-0 is serving as the primary server, indicated by the label kubedb.com/role=primary.
- Pods ha-postgres-1 and ha-postgres-2 are both serving as standby servers, indicated by the label kubedb.com/role=replica.
Two Services are also created for the Postgres object ha-postgres.
$ kubectl get svc -n demo --selector="app.kubernetes.io/instance=ha-postgres"
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
ha-postgres ClusterIP 10.102.19.49 <none> 5432/TCP 4m
ha-postgres-replicas ClusterIP 10.97.36.117 <none> 5432/TCP 4m
$ kubectl get svc -n demo --selector="app.kubernetes.io/instance=ha-postgres" -o=custom-columns=NAME:.metadata.name,SELECTOR:.spec.selector
NAME SELECTOR
ha-postgres map[app.kubernetes.io/name:postgreses.kubedb.com app.kubernetes.io/instance:ha-postgres kubedb.com/role:primary]
ha-postgres-replicas map[app.kubernetes.io/name:postgreses.kubedb.com app.kubernetes.io/instance:ha-postgres]
Here,
- Service ha-postgres targets Pod ha-postgres-0, which is the primary server, by selector app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres,kubedb.com/role=primary.
- Service ha-postgres-replicas targets all Pods (ha-postgres-0, ha-postgres-1 and ha-postgres-2) with label app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres.
These standby servers are asynchronous warm standby servers. That means you can only connect to the primary server.
Now connect to the primary server Pod ha-postgres-0 using pgAdmin installed in the quickstart tutorial.
Connection information:
- Host name/address: you can use any of these
  - Service: ha-postgres.demo
  - Pod IP: ($ kubectl get pods ha-postgres-0 -n demo -o yaml | grep podIP)
- Port: 5432
- Maintenance database: postgres
- Username: Run the following command to get the username,
$ kubectl get secrets -n demo ha-postgres-auth -o jsonpath='{.data.\POSTGRES_USER}' | base64 -d
postgres
- Password: Run the following command to get the password,
$ kubectl get secrets -n demo ha-postgres-auth -o jsonpath='{.data.\POSTGRES_PASSWORD}' | base64 -d
MHRrOcuyddfh3YpU
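The username and password commands differ only in the key they read, so they can be wrapped in one small shell function. This is a sketch rather than anything shipped with KubeDB: the function name pg_secret_field is hypothetical, and it assumes the Secret and key names used in this tutorial and that the key can be addressed in the jsonpath expression without the leading backslash.

```shell
# Hypothetical helper (not part of KubeDB): print one decoded field of a Secret.
# Usage: pg_secret_field <secret-name> <key> [namespace]
pg_secret_field() {
  # Secret data is stored base64-encoded, so decode it before printing.
  # The namespace defaults to the tutorial's "demo" namespace.
  kubectl get secrets -n "${3:-demo}" "$1" -o jsonpath="{.data.$2}" | base64 -d
}
```

For example, pg_secret_field ha-postgres-auth POSTGRES_USER would be expected to print the username, and the same call with POSTGRES_PASSWORD the password.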
You can check pg_stat_replication to see which servers are currently streaming from the primary.
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_port | backend_start | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
89 | 10 | postgres | ha-postgres-2 | 172.17.0.8 | 35306 | 2018-02-09 04:27:11.674828+00 | streaming | 0/5000060 | 0/5000060 | 0/5000060 | 0/5000060 | 0 | async |
90 | 10 | postgres | ha-postgres-1 | 172.17.0.7 | 42400 | 2018-02-09 04:27:13.716104+00 | streaming | 0/5000060 | 0/5000060 | 0/5000060 | 0/5000060 | 0 | async |
Here, both ha-postgres-1 and ha-postgres-2 are streaming asynchronously from the primary server.
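The location columns in this output are WAL positions, printed as two hexadecimal numbers separated by a slash (the high and low 32 bits of a 64-bit byte offset). Subtracting a standby's replay location from the sent location gives a rough measure of how far that standby is behind. The sketch below does that subtraction in the shell; lsn_to_bytes and lsn_lag_bytes are hypothetical helper names, not part of PostgreSQL or KubeDB.

```shell
# Hypothetical helpers: convert a WAL location such as 0/5000060 to a byte
# offset, and compute the distance in bytes between two locations.
lsn_to_bytes() {
  hi="${1%/*}"   # hex digits before the slash (high 32 bits)
  lo="${1#*/}"   # hex digits after the slash (low 32 bits)
  echo $(( (0x$hi << 32) + 0x$lo ))
}

lsn_lag_bytes() {
  # $1: location on the primary (for example sent_location)
  # $2: location reached by the standby (for example replay_location)
  echo $(( $(lsn_to_bytes "$1") - $(lsn_to_bytes "$2") ))
}

# Values from the first row above: sent and replayed positions are equal,
# so this prints 0.
lsn_lag_bytes 0/5000060 0/5000060
```

Inside PostgreSQL the same subtraction is available as pg_xlog_location_diff() (renamed pg_wal_lsn_diff() in PostgreSQL 10).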
Lease Duration
Get the Postgres object at this point.
$ kubectl get pg -n demo ha-postgres -o yaml
apiVersion: kubedb.com/v1
kind: Postgres
metadata:
  creationTimestamp: "2019-02-07T12:14:05Z"
  finalizers:
  - kubedb.com
  generation: 2
  name: ha-postgres
  namespace: demo
  resourceVersion: "44966"
  selfLink: /apis/kubedb.com/v1/namespaces/demo/postgreses/ha-postgres
  uid: dcf6d96a-2ad1-11e9-9d44-080027154f61
spec:
  authSecret:
    name: ha-postgres-auth
  leaderElection:
    leaseDurationSeconds: 15
    renewDeadlineSeconds: 10
    retryPeriodSeconds: 2
  podTemplate:
    controller: {}
    metadata: {}
    spec:
      resources: {}
  replicas: 3
  storage:
    accessModes:
    - ReadWriteOnce
    dataSource: null
    resources:
      requests:
        storage: 1Gi
    storageClassName: standard
  storageType: Durable
  deletionPolicy: Halt
  version: "10.2-v5"
status:
  observedGeneration: 2$4213139756412538772
  phase: Running
There are three fields under spec.leaderElection of the Postgres object. These values define how fast the leader election can happen.
- leaseDurationSeconds: This is the duration in seconds that non-leader candidates will wait to force acquire leadership. This is measured against time of last observed ack. Default 15 secs.
- renewDeadlineSeconds: This is the duration in seconds that the acting master will retry refreshing leadership before giving up. Normally, LeaseDuration * 2 / 3. Default 10 secs.
- retryPeriodSeconds: This is the duration in seconds the LeaderElector clients should wait between tries of actions. Normally, LeaseDuration / 3. Default 2 secs.
If the cluster machines are powerful, you can reduce these times. But do not make them too small, or Postgres will restart very often.
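The ratios quoted for the three fields can be written out as simple arithmetic. The sketch below applies them to a chosen lease duration; the function name election_timings is hypothetical, and rounding down with integer division is an assumption made for the example. Note that the default retryPeriodSeconds of 2 is smaller than what the LeaseDuration / 3 rule of thumb gives for a 15-second lease.

```shell
# Hypothetical helper: derive the other two timings from a lease duration,
# using the ratios quoted above (integer division, rounding down).
election_timings() {
  echo "leaseDurationSeconds=$1"
  echo "renewDeadlineSeconds=$(( $1 * 2 / 3 ))"
  echo "retryPeriodSeconds=$(( $1 / 3 ))"
}

# For the default lease of 15 seconds this prints 15, 10 and 5.
election_timings 15
```

The computed values are only a starting point; the values actually applied are whatever is set under spec.leaderElection.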
Automatic failover
If the primary server fails, one of the standby servers will take over and serve as primary.
Delete Pod ha-postgres-0 to see the failover behavior.
$ kubectl delete pod -n demo ha-postgres-0
$ kubectl get pods -n demo --selector="app.kubernetes.io/instance=ha-postgres" --show-labels
NAME READY STATUS RESTARTS AGE LABELS
ha-postgres-0 1/1 Running 0 10s controller-revision-hash=ha-postgres-b8b4b5fc4,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres,kubedb.com/role=replica,petset.kubernetes.io/pod-name=ha-postgres-0
ha-postgres-1 1/1 Running 0 52m controller-revision-hash=ha-postgres-b8b4b5fc4,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres,kubedb.com/role=primary,petset.kubernetes.io/pod-name=ha-postgres-1
ha-postgres-2 1/1 Running 0 51m controller-revision-hash=ha-postgres-b8b4b5fc4,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=ha-postgres,kubedb.com/role=replica,petset.kubernetes.io/pod-name=ha-postgres-2
Here,
- Pod ha-postgres-1 is now serving as the primary server.
- Pods ha-postgres-0 and ha-postgres-2 are both serving as standby servers.
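Because the role is exposed as a Pod label, the current primary can also be looked up directly instead of reading the full label list. A minimal sketch, assuming the label keys shown in the output above; the function name current_primary is hypothetical.

```shell
# Hypothetical helper: print the name of the Pod currently labelled as primary.
# Usage: current_primary <postgres-name> [namespace]
current_primary() {
  # Select Pods of the given Postgres object whose role label is "primary".
  # The namespace defaults to the tutorial's "demo" namespace.
  kubectl get pods -n "${2:-demo}" \
    --selector="app.kubernetes.io/instance=$1,kubedb.com/role=primary" \
    -o jsonpath='{.items[*].metadata.name}'
}
```

After the failover above, current_primary ha-postgres would be expected to print ha-postgres-1.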
And the result from pg_stat_replication:
postgres=# select * from pg_stat_replication;
pid | usesysid | usename | application_name | client_addr | client_port | backend_start | state | sent_location | write_location | flush_location | replay_location | sync_priority | sync_state |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
57 | 10 | postgres | ha-postgres-0 | 172.17.0.6 | 52730 | 2018-02-09 04:33:06.051716+00 | streaming | 0/7000060 | 0/7000060 | 0/7000060 | 0/7000060 | 0 | async |
58 | 10 | postgres | ha-postgres-2 | 172.17.0.8 | 42824 | 2018-02-09 04:33:09.762168+00 | streaming | 0/7000060 | 0/7000060 | 0/7000060 | 0/7000060 | 0 | async |
You can see here that ha-postgres-0 and ha-postgres-2 are now streaming asynchronously from ha-postgres-1, our primary server.
Streaming Replication with hot standby
Streaming Replication also works with one or more hot standby servers.
apiVersion: kubedb.com/v1
kind: Postgres
metadata:
  name: hot-postgres
  namespace: demo
spec:
  version: "13.13"
  replicas: 3
  standbyMode: Hot
  storageType: Durable
  storage:
    storageClassName: "standard"
    accessModes:
    - ReadWriteOnce
    resources:
      requests:
        storage: 1Gi
In this example:
- This Postgres object creates three PostgreSQL servers, indicated by the replicas field.
- One server will be primary and the other two will be hot standby servers, as instructed by spec.standbyMode.
Hot standby setup
The following parameter is set in postgresql.conf for the standby servers:
hot_standby = on
Here,
- hot_standby specifies that standby server will act as hot standby.
Now create this Postgres object:
$ kubectl create -f https://github.com/kubedb/docs/raw/v2024.12.18/docs/examples/postgres/clustering/hot-postgres.yaml
postgres "hot-postgres" created
The KubeDB operator creates three Pods, each running a PostgreSQL server.
$ kubectl get pods -n demo --selector="app.kubernetes.io/instance=hot-postgres" --show-labels
NAME READY STATUS RESTARTS AGE LABELS
hot-postgres-0 1/1 Running 0 1m controller-revision-hash=hot-postgres-6c48cfb5bb,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=hot-postgres,kubedb.com/role=primary,petset.kubernetes.io/pod-name=hot-postgres-0
hot-postgres-1 1/1 Running 0 1m controller-revision-hash=hot-postgres-6c48cfb5bb,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=hot-postgres,kubedb.com/role=replica,petset.kubernetes.io/pod-name=hot-postgres-1
hot-postgres-2 1/1 Running 0 48s controller-revision-hash=hot-postgres-6c48cfb5bb,app.kubernetes.io/name=postgreses.kubedb.com,app.kubernetes.io/instance=hot-postgres,kubedb.com/role=replica,petset.kubernetes.io/pod-name=hot-postgres-2
Here,
- Pod hot-postgres-0 is serving as the primary server, indicated by the label kubedb.com/role=primary.
- Pods hot-postgres-1 and hot-postgres-2 are both serving as standby servers, indicated by the label kubedb.com/role=replica.
These standby servers are asynchronous hot standby servers.
That means you can connect to both the primary and the standby servers, but the hot standby servers only accept read-only queries.
Now connect to one of our hot standby servers, Pod hot-postgres-2, using pgAdmin installed in the quickstart tutorial.
Connection information:
- Host name/address: you can use any of these
  - Service: hot-postgres-replicas.demo
  - Pod IP: ($ kubectl get pods hot-postgres-2 -n demo -o yaml | grep podIP)
- Port: 5432
- Maintenance database: postgres
- Username: Run the following command to get the username,
$ kubectl get secrets -n demo hot-postgres-auth -o jsonpath='{.data.\POSTGRES_USER}' | base64 -d
postgres
- Password: Run the following command to get the password,
$ kubectl get secrets -n demo hot-postgres-auth -o jsonpath='{.data.\POSTGRES_PASSWORD}' | base64 -d
ZZgjjQMUdKJYy1W9
Try to create a database (a write operation):
postgres=# CREATE DATABASE standby;
ERROR: cannot execute CREATE DATABASE in a read-only transaction
The write operation fails, but the standby can execute the following read query (on PostgreSQL 10 and later this function is named pg_last_wal_receive_lsn()):
postgres=# select pg_last_xlog_receive_location();
pg_last_xlog_receive_location
-------------------------------
0/7000220
So you can connect to a hot standby, and it only accepts read-only queries.
Cleaning up
To clean up the Kubernetes resources created by this tutorial, run:
$ kubectl patch -n demo pg/ha-postgres pg/hot-postgres -p '{"spec":{"deletionPolicy":"WipeOut"}}' --type="merge"
$ kubectl delete -n demo pg/ha-postgres pg/hot-postgres
$ kubectl delete ns demo
Next Steps
- Monitor your PostgreSQL database with KubeDB using built-in Prometheus.
- Monitor your PostgreSQL database with KubeDB using Prometheus operator.
- Want to hack on KubeDB? Check our contribution guidelines.