HA Drill: 2/3 Failure
When two nodes (majority) in a classic 3-node HA deployment fail simultaneously, automatic failover becomes impossible. Time for some manual intervention - let’s roll up our sleeves!
First, assess the status of the failed nodes. If they can be restored quickly, prioritize bringing them back online. Otherwise, initiate the Emergency Response Protocol.
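A quick triage sketch, using the example IPs that appear later on this page (adapt the addresses and service names to your own inventory):
ping -c 3 10.10.10.10                                 # is the failed node reachable at all?
ssh 10.10.10.10 'systemctl is-active patroni etcd'    # can we log in, and are the HA services up?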
The Emergency Response Protocol assumes your management node is down, leaving only a single database node alive. In this “last man standing” scenario, here’s the fastest recovery path:
- Adjust HAProxy configuration to direct traffic to the primary
- Stop Patroni and manually promote the PostgreSQL replica to primary
Adjusting HAProxy Configuration
If you’re accessing the cluster through means other than HAProxy, you can skip this part (lucky you!). If you’re using HAProxy to access your database cluster, you’ll need to adjust the load balancer configuration to manually direct read/write traffic to the primary.
- Edit /etc/haproxy/<pg_cluster>-primary.cfg, where <pg_cluster> is your PostgreSQL cluster name (e.g., pg-meta)
- Comment out the health check configurations
- Comment out the server entries for the two failed nodes, keeping only the current primary
listen pg-meta-primary
bind *:5433
mode tcp
maxconn 5000
balance roundrobin
# Comment out these four health check lines
#option httpchk # <---- remove this
#option http-keep-alive # <---- remove this
#http-check send meth OPTIONS uri /primary # <---- remove this
#http-check expect status 200 # <---- remove this
default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
server pg-meta-1 10.10.10.10:6432 check port 8008 weight 100
# Comment out the failed nodes
#server pg-meta-2 10.10.10.11:6432 check port 8008 weight 100 <---- comment this
#server pg-meta-3 10.10.10.12:6432 check port 8008 weight 100 <---- comment this
Don’t rush to systemctl reload haproxy just yet - we’ll do that after promoting the primary.
This configuration bypasses Patroni’s health checks and directs write traffic straight to our soon-to-be primary.
Manual Replica Promotion
SSH into the target server, switch to the dbsu user, execute a CHECKPOINT to flush dirty buffers to disk, stop Patroni, restart PostgreSQL, and perform the promotion:
sudo su - postgres # Switch to database dbsu user
psql -c 'checkpoint; checkpoint;' # Double CHECKPOINT for good luck (and clean buffers)
sudo systemctl stop patroni # Bid farewell to Patroni
pg-restart # Restart PostgreSQL
pg-promote # Time for a promotion!
psql -c 'SELECT pg_is_in_recovery();' # 'f' means we're primary - mission accomplished!
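pg-restart and pg-promote are convenience aliases shipped on Pigsty nodes; if they aren’t available in your shell, the plain PostgreSQL equivalents below should achieve the same thing (a sketch, assuming the default /pg/data data directory and pg_ctl on the dbsu’s PATH):
pg_ctl -D /pg/data restart              # restart PostgreSQL outside Patroni's control
pg_ctl -D /pg/data promote              # promote the replica to primary
psql -c 'SELECT pg_is_in_recovery();'   # 'f' confirms the promotion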
If you modified the HAProxy config earlier, now’s the time to systemctl reload haproxy and direct traffic to our new primary.
systemctl reload haproxy # Route write traffic to our new primary
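A quick way to confirm that traffic through HAProxy now lands on a writable primary (a sketch; <dbuser> is a placeholder for any database user that can authenticate over TCP):
psql -h 127.0.0.1 -p 5433 -U <dbuser> -d postgres -c 'SELECT pg_is_in_recovery();'   # <dbuser> is a placeholder; expect 'f'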
Preventing Split-Brain
After stopping the bleeding, priority #2 is preventing split-brain: we need to ensure the other two servers don’t come back online and start a civil war with our current primary.
The simple approach:
- Pull the plug (power/network) on the other two servers - ensure they can’t surprise us with an unexpected comeback (a service-level alternative is sketched after this list)
- Update application connection strings to point directly to our lone survivor primary
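If the failed servers are still reachable over some out-of-band channel, a softer, service-level version of “pulling the plug” is to make sure Patroni cannot start PostgreSQL on them at boot. A minimal sketch, assuming Pigsty’s standard systemd unit names:
systemctl disable --now patroni    # stop Patroni and keep it from starting on boot
systemctl disable --now haproxy    # optional: stop these nodes from routing any traffic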
Next steps depend on your situation:
- A: The two servers have temporary issues (network/power outage) and can be restored in place
- B: The two servers are permanently dead (hardware failure) and need to be decommissioned
Recovery from Temporary Failure
If the other two servers can be restored, follow these steps:
- Handle one failed server at a time, prioritizing the management/INFRA node
- Start the failed server and immediately stop Patroni (sketched below)
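The point of stopping Patroni right away is to bring the OS and ETCD back without letting Patroni start a competing PostgreSQL instance. A minimal sketch, assuming Pigsty’s standard unit names:
systemctl stop patroni               # keep Patroni from managing PostgreSQL for now
systemctl status etcd --no-pager     # on ETCD nodes: confirm etcd is back so quorum can re-form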
Once ETCD quorum is restored, start Patroni on the surviving server (current primary) to take control of PostgreSQL and reclaim cluster leadership. Put Patroni in maintenance mode:
systemctl restart patroni
pg pause <pg_cluster>
On the other two instances, create the /pg/data/standby.signal file as the postgres user to mark them as replicas, then start Patroni:
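# As the postgres user, mark this instance as a standby before restarting Patroni
# (a sketch; /pg/data is Pigsty's default data directory):
sudo -iu postgres touch /pg/data/standby.signal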
systemctl restart patroni
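To see the members and their roles before resuming, you can list the cluster with patronictl; on Pigsty nodes the pg alias used above wraps patronictl, so something like the following should work (treat the exact invocation as an assumption for your environment):
pg list <pg_cluster>        # e.g. pg list pg-meta; expect one leader and two replicas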
After confirming Patroni cluster identity/roles are correct, exit maintenance mode:
pg resume <pg_cluster>
Recovery from Permanent Failure
After a permanent failure, first recover the ~/pigsty directory on the management node - particularly the crucial pigsty.yml and files/pki/ca/ca.key files.
No backup of these files? You might need to deploy a fresh Pigsty and migrate your existing cluster over via a backup cluster.
Pro tip: Keep your pigsty directory under version control (Git). Learn from this experience - future you will thank present you.
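A minimal way to start doing that, assuming git is available on the management node (the remote URL is a placeholder, and remember that files/pki contains private keys, so any remote must be private and access-controlled):
cd ~/pigsty
git init
git add pigsty.yml files/pki/
git commit -m 'backup pigsty config and CA'
# git remote add origin git@example.com:ops/pigsty-backup.git   # placeholder remote
# git push -u origin HEAD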
Config Repair
Use your surviving node as the new management node. Copy the ~/pigsty directory there and adjust the configuration.
For example, replace the default management node 10.10.10.10 with the surviving node 10.10.10.12:
all:
  vars:
    admin_ip: 10.10.10.12              # New management node IP
    node_etc_hosts: [10.10.10.12 h.pigsty a.pigsty p.pigsty g.pigsty sss.pigsty]
    infra_portal: {}                   # Update other configs referencing old admin_ip
  children:
    infra:                             # Adjust Infra cluster
      hosts:
        # 10.10.10.10: { infra_seq: 1 }             # Old Infra node
        10.10.10.12: { infra_seq: 3 }               # New Infra node
    etcd:                              # Adjust ETCD cluster
      hosts:
        #10.10.10.10: { etcd_seq: 1 }               # Comment out failed node
        #10.10.10.11: { etcd_seq: 2 }               # Comment out failed node
        10.10.10.12: { etcd_seq: 3 }                # Keep survivor
      vars:
        etcd_cluster: etcd
    pg-meta:                           # Adjust PGSQL cluster config
      hosts:
        #10.10.10.10: { pg_seq: 1, pg_role: primary }
        #10.10.10.11: { pg_seq: 2, pg_role: replica }
        #10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
        10.10.10.12: { pg_seq: 3, pg_role: primary , pg_offline_query: true }
      vars:
        pg_cluster: pg-meta
ETCD Repair
Reset ETCD to a single-node cluster:
./etcd.yml -e etcd_safeguard=false -e etcd_clean=true
Follow ETCD Config Reload to adjust ETCD endpoint references.
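After the reset, it’s worth confirming that the single-node ETCD cluster is healthy. A sketch, assuming etcdctl is on the PATH and already configured with this cluster’s endpoint and TLS credentials (Pigsty normally exports these for the admin user, but treat that as an assumption):
etcdctl member list        # should show exactly one member: the surviving node
etcdctl endpoint health    # the endpoint should report healthy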
INFRA Repair
If the surviving node lacks the INFRA module, configure and install it:
./infra.yml -l 10.10.10.12
Fix monitoring on the current node:
./node.yml -t node_monitor
PGSQL Repair
./pgsql.yml -t pg_conf # Regenerate PG config
systemctl reload patroni # Reload Patroni config on survivor
After module repairs, follow the standard scale-out procedure to add new nodes and restore HA.
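As a rough sketch of that scale-out, where 10.10.10.13 stands in for a hypothetical replacement node you have already added to pigsty.yml:
./node.yml  -l 10.10.10.13     # bring the replacement node under Pigsty management (hypothetical IP)
./pgsql.yml -l 10.10.10.13     # deploy a new pg-meta replica onto it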