Managing PostgreSQL Clusters

Create/destroy PostgreSQL clusters, scale existing clusters, and clone clusters.

Quick Reference

Action           Command                        Description
Create Cluster   bin/pgsql-add <cls>            Create a new PostgreSQL cluster
Expand Cluster   bin/pgsql-add <cls> <ip...>    Add replica to existing cluster
Shrink Cluster   bin/pgsql-rm <cls> <ip...>     Remove instance from cluster
Remove Cluster   bin/pgsql-rm <cls>             Destroy entire PostgreSQL cluster
Reload Service   bin/pgsql-svc <cls> [ip...]    Reload cluster load balancer config
Reload HBA       bin/pgsql-hba <cls> [ip...]    Reload cluster HBA access rules
Clone Cluster    -                              Clone via standby cluster or PITR

For other management tasks, see: HA Management, Manage Users, Manage Databases.


Create Cluster

To create a new PostgreSQL cluster, first define the cluster in the inventory, then bring its nodes under Pigsty management:

bin/node-add  <cls>     # Add nodes in group <cls>
./node.yml  -l <cls>    # Use Ansible playbook to add nodes in group <cls>
bin/pgsql-add pg-test   # Add nodes in pg-test group, runs ./node.yml -l pg-test

Once the nodes are managed, create the cluster by executing the pgsql.yml playbook on the <cls> group:

bin/pgsql-add <cls>     # Create PostgreSQL cluster <cls>
./pgsql.yml -l <cls>    # Use Ansible playbook to create PostgreSQL cluster <cls>
bin/pgsql-add pg-test   # Create pg-test cluster

Example: Create 3-node PG cluster pg-test
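For reference, the inventory definition for such a 3-node cluster follows the same shape used throughout this page:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars: { pg_cluster: pg-test }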


Expand Cluster

To add a new replica to an existing PostgreSQL cluster, add the instance definition to the inventory under all.children.<cls>.hosts:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary } # existing member
    10.10.10.12: { pg_seq: 2, pg_role: replica } # existing member
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- new member
  vars: { pg_cluster: pg-test }

Scaling out is similar to creating a cluster. First, add the new node to Pigsty (see Add Node):

bin/node-add <ip>       # Add node with IP <ip>
./node.yml -l <ip>      # Use Ansible playbook to add node <ip>
bin/node-add 10.10.10.13    # Add node 10.10.10.13, runs ./node.yml -l 10.10.10.13

Then run the following against the new node to scale out (this installs the PGSQL module on the new node with the same pg_cluster):

bin/pgsql-add <cls> <ip>  # Add node <ip> to cluster
./pgsql.yml -l <ip>       # Core: Use Ansible playbook to install PGSQL module on <ip>
bin/pgsql-add pg-test 10.10.10.13   # Scale out pg-test with node 10.10.10.13

After scaling out, you should Reload Service to add the new member to the load balancer.
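Continuing the scale-out above, that reload is a single command (covered in detail under Reload Service below):

bin/pgsql-svc pg-test         # refresh the load balancer config so the new member starts receiving traffic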

Example: Add replica 10.10.10.13 to 2-node cluster pg-test


Shrink Cluster

To remove a replica from an existing PostgreSQL cluster, remove its instance definition from the inventory under all.children.<cls>.hosts.

First, uninstall the PGSQL module from the target node by executing pgsql-rm.yml on <ip>:

bin/pgsql-rm <cls> <ip>   # Remove PostgreSQL instance on <ip> from cluster <cls>
./pgsql-rm.yml -l <ip>    # Use Ansible playbook to remove PostgreSQL instance on <ip>
bin/pgsql-rm pg-test 10.10.10.13  # Remove 10.10.10.13 from pg-test cluster

After removing the PGSQL module, you can optionally remove the node from Pigsty (see Remove Node):

bin/node-rm <ip>          # Remove node <ip> from Pigsty management
./node-rm.yml -l <ip>     # Use Ansible playbook to remove node <ip>
bin/node-rm 10.10.10.13   # Remove node 10.10.10.13 from Pigsty

After scaling in, remove the instance definition from the inventory, then Reload Service to remove it from the load balancer:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- remove after execution
  vars: { pg_cluster: pg-test }

Example: Remove replica 10.10.10.13 from 3-node cluster pg-test


Remove Cluster

To destroy a cluster, uninstall the PGSQL module from all of its nodes by executing pgsql-rm.yml on the <cls> group:

bin/pgsql-rm <cls>        # Destroy entire PostgreSQL cluster <cls>
./pgsql-rm.yml -l <cls>   # Use Ansible playbook to destroy cluster <cls>
bin/pgsql-rm pg-test      # Destroy pg-test cluster

After destroying PGSQL, you can optionally remove the nodes from Pigsty (see Remove Node); skip this step if other services still run on them:

bin/node-rm <cls>         # Remove all nodes in group <cls> from Pigsty
./node-rm.yml -l <cls>    # Use Ansible playbook to remove nodes in group <cls>
bin/node-rm pg-test       # Remove all pg-test nodes from Pigsty

After removal, delete the entire cluster definition from the inventory:

pg-test: # remove this cluster definition group
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars: { pg_cluster: pg-test }

Example: Destroy 3-node PG cluster pg-test

Note: If pg_safeguard is enabled for the cluster (or set to true globally), pgsql-rm.yml will abort to prevent accidental removal; override it on the playbook command line to force removal. By default, the cluster's backup repository is deleted along with the cluster. To preserve backups (e.g., when using a centralized repo), set pg_rm_backup=false:

./pgsql-rm.yml -l pg-meta -e pg_safeguard=false    # force remove protected cluster pg-meta
./pgsql-rm.yml -l pg-meta -e pg_rm_backup=false    # preserve backup repo during removal

Reload Service

PostgreSQL clusters expose services via HAProxy on host nodes. When service definitions change, instance weights change, or cluster membership changes (e.g., scale out/scale in, switchover/failover), reload services to update load balancer config.

To reload the service config on an entire cluster or on specific instances, execute the pg_service subtask of pgsql.yml on <cls> or <ip>:

bin/pgsql-svc <cls>           # Reload service config for entire cluster <cls>
bin/pgsql-svc <cls> <ip...>   # Reload service config for specific instances
./pgsql.yml -l <cls> -t pg_service -e pg_reload=true        # Reload entire cluster
./pgsql.yml -l <ip>  -t pg_service -e pg_reload=true        # Reload specific instance
bin/pgsql-svc pg-test                 # Reload pg-test cluster service config
bin/pgsql-svc pg-test 10.10.10.13     # Reload pg-test 10.10.10.13 instance service config

Note: If you are using a dedicated load balancer cluster (pg_service_provider), only reloading the cluster primary updates the LB config.
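For orientation only, a rough sketch of what such a setup can look like in the cluster definition; the group name infra and the comment are assumptions to verify against the pg_service_provider parameter reference:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
  vars:
    pg_cluster: pg-test
    pg_service_provider: infra    # assumption: serve pg-test services from HAProxy on the infra group instead of local nodes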

Example: Reload pg-test cluster service config

Example: Reload PG Service to Remove Instance



Reload HBA

When HBA configs (pg_hba_rules / pgb_hba_rules) change, reload the HBA rules to apply them. If you have role-specific HBA rules, or rules whose IP ranges reference cluster member aliases, reload HBA after a switchover or after scaling.
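For context, a minimal sketch of what role-specific entries in pg_hba_rules can look like; the user dbuser_report and the exact field values are illustrative assumptions, not defaults:

pg_hba_rules:
  - { user: all, db: all, addr: intra, auth: pwd, title: 'allow intranet password access' }
  - { user: dbuser_report, db: all, addr: intra, auth: pwd, role: offline, title: 'reporting access on offline instances only' }   # hypothetical role-specific rule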

To reload the PostgreSQL and Pgbouncer HBA rules on an entire cluster or on specific instances, execute the HBA subtasks of pgsql.yml on <cls> or <ip>:

bin/pgsql-hba <cls>           # Reload HBA rules for entire cluster <cls>
bin/pgsql-hba <cls> <ip...>   # Reload HBA rules for specific instances
./pgsql.yml -l <cls> -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true   # Reload entire cluster
./pgsql.yml -l <ip>  -t pg_hba,pg_reload,pgbouncer_hba,pgbouncer_reload -e pg_reload=true   # Reload specific instance
bin/pgsql-hba pg-test                 # Reload pg-test cluster HBA rules
bin/pgsql-hba pg-test 10.10.10.13     # Reload pg-test 10.10.10.13 instance HBA rules

Example: Reload pg-test cluster HBA rules


Config Cluster

PostgreSQL configuration parameters are managed by Patroni. Initial parameters are specified by the Patroni config template; after cluster initialization, the configuration is stored in etcd and dynamically managed and synced by Patroni. Most Patroni parameters can be modified via patronictl. Other parameters (e.g., the etcd DCS config, log and REST API settings) can be updated via playbook subtasks. For example, when etcd cluster membership changes, refresh the Patroni config:

./pgsql.yml -l pg-test -t pg_conf                   # Update Patroni config file
ansible pg-test -b -a 'systemctl reload patroni'    # Reload Patroni service
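For the patronictl route mentioned above, a minimal sketch using the pg alias (the same alias that appears in the standby-cluster examples below):

pg edit-config pg-test        # edit dynamic cluster config stored in the DCS (opens an editor and asks for confirmation)
pg reload pg-test             # reload members for parameters that do not require a restart
pg restart pg-test            # restart members when the changed parameter requires a restart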

You can override Patroni-managed defaults at different levels: specify params per instance, specify params per user, or specify params per database.


Clone Cluster

There are two ways to clone a cluster: via a Standby Cluster, or via Point-in-Time Recovery (PITR). The former is simple and has no dependencies but only clones the latest state; the latter requires a centralized backup repository (e.g., MinIO) but can clone to any point within the retention period.

Method            Pros                      Cons                          Use Cases
Standby Cluster   Simple, no dependencies   Only clones latest state      DR, read-write separation, migration
PITR              Recover to any point      Requires centralized backup   Undo mistakes, data audit

Clone via Standby Cluster

A standby cluster continuously syncs from its upstream cluster via streaming replication, which is the simplest cloning method. Specify pg_upstream on the new cluster's primary to automatically pull data from the upstream.

# pg-test is the original cluster
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars: { pg_cluster: pg-test }

# pg-test2 is standby cluster (clone) of pg-test
pg-test2:
  hosts:
    10.10.10.12: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.11 }  # specify upstream
    10.10.10.13: { pg_seq: 2, pg_role: replica }
  vars: { pg_cluster: pg-test2 }

Create standby cluster with:

bin/pgsql-add pg-test2    # Create standby cluster, auto-clone from upstream pg-test
./pgsql.yml -l pg-test2   # Use Ansible playbook to create standby cluster

The standby cluster follows its upstream and keeps data in sync. It can be promoted to an independent cluster at any time:

Example: Promote Standby to Independent Cluster

Using Config Cluster, remove the standby_cluster section to promote:

$ pg edit-config pg-test2
-standby_cluster:
-  create_replica_methods:
-  - basebackup
-  host: 10.10.10.11
-  port: 5432

Apply these changes? [y/N]: y

After promotion, pg-test2 becomes an independent cluster that accepts writes, forked from pg-test.
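To double-check the result, listing the cluster with the same pg alias should now show the promoted instance as leader:

pg list pg-test2              # the former standby leader should appear as the cluster leader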

Example: Change Replication Upstream

If a switchover occurs on the upstream cluster, change the standby cluster's upstream via Config Cluster:

$ pg edit-config pg-test2

 standby_cluster:
   create_replica_methods:
   - basebackup
-  host: 10.10.10.11     # <--- old upstream
+  host: 10.10.10.14     # <--- new upstream
   port: 5432

Apply these changes? [y/N]: y

Clone via PITR

Point-in-Time Recovery (PITR) allows recovery to any point within the backup retention period. It requires a centralized backup repository (MinIO/S3), but it is more powerful.

To clone via PITR, add the pg_pitr parameter specifying the recovery target:

# Clone new cluster pg-meta2 from pg-meta backup
pg-meta2:
  hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta2
    pg_pitr:
      cluster: pg-meta                    # Recover from pg-meta backup
      time: '2025-01-10 10:00:00+00'      # Recover to specific time

Execute the clone with the pgsql-pitr.yml playbook:

./pgsql-pitr.yml -l pg-meta2    # Clone pg-meta2 from pg-meta backup
# Specify PITR options via command line
./pgsql-pitr.yml -l pg-meta2 -e '{"pg_pitr": {"cluster": "pg-meta", "time": "2025-01-10 10:00:00+00"}}'

PITR supports multiple recovery target types:

Target Type   Example                          Description
Time          time: "2025-01-10 10:00:00+00"   Recover to specific timestamp
XID           xid: "250000"                    Recover to before/after txn
Name          name: "before_migration"         Recover to named restore point
LSN           lsn: "0/4001C80"                 Recover to specific WAL pos
Latest        type: "latest"                   Recover to end of WAL archive
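For illustration, the other target types plug into pg_pitr the same way as the time-based example above (values are placeholders taken from the table):

pg_pitr: { cluster: pg-meta, name: "before_migration" }    # recover to a named restore point
pg_pitr: { cluster: pg-meta, lsn: "0/4001C80" }            # recover to a specific WAL position
pg_pitr: { cluster: pg-meta, type: "latest" }              # recover to the end of the WAL archive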

For detailed PITR usage, see the Restore Operations documentation.