FAQ

Frequently asked questions about the Pigsty ETCD DCS module

What is the impact of ETCD failure?

ETCD availability is critical for the PGSQL cluster's HA, which is why multiple ETCD nodes are used. With a 3-node ETCD cluster, if one node is down, the other two can still function normally; with a 5-node ETCD cluster, two node failures can be tolerated. If more than half of the ETCD nodes are down, the ETCD cluster and its service become unavailable. Before Patroni 3.0, this would lead to a global PGSQL outage: all primaries would be demoted and reject write requests.
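
To check whether your ETCD cluster currently has quorum, you can query member health and status from the admin node. A minimal sketch, assuming the etcdctl environment (endpoints and TLS certificates) is already configured there, as the other commands in this FAQ also assume:

etcdctl endpoint health --cluster -w table   # health of every etcd member
etcdctl endpoint status --cluster -w table   # leader, term, and raft index per member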

Since Pigsty 2.0, Patroni 3.0's DCS failsafe mode is enabled by default. It will LOCK the PGSQL cluster status if the ETCD cluster becomes unavailable, as long as all PGSQL members are still accessible from the primary.

The PGSQL cluster can still function normally, but you must recover the ETCD cluster as soon as possible, since you cannot configure the PGSQL cluster through Patroni while etcd is down.
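
You can verify that failsafe mode is enabled in Patroni's dynamic configuration, and turn it on if it is not. A minimal sketch using patronictl; the config path /pg/bin/patroni.yml is an assumption, adjust it to wherever your Patroni config lives:

patronictl -c /pg/bin/patroni.yml show-config | grep failsafe_mode            # check current setting (path is an assumption)
patronictl -c /pg/bin/patroni.yml edit-config -s failsafe_mode=true --force   # enable it if it is not set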


How to use an existing external etcd cluster?

The hard-coded inventory group etcd is used as the DCS server list for PGSQL. You can either initialize those nodes with etcd.yml, or treat the group as a reference to an existing external etcd cluster.

To use an existing external etcd cluster, define its members in the etcd group as usual, and make sure the existing etcd cluster's certificates are signed by the same self-signed CA that Pigsty uses for PGSQL.
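
For example, you can check that the external etcd server certificate chains to the same CA. A minimal sketch; both file paths below are assumptions and will differ per deployment:

openssl verify -CAfile /etc/pki/ca.crt /etc/etcd/server.crt   # should print "OK" if the cert is signed by this CA (paths are assumptions)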


How to add a new member to an existing etcd cluster?

Check Add a member to etcd cluster

etcdctl member add <etcd-?> --learner=true --peer-urls=https://<new_ins_ip>:2380 # on admin node
./etcd.yml -l <new_ins_ip> -e etcd_init=existing                                 # init new etcd member
etcdctl member promote <new_ins_server_id>                                       # on admin node
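
After promotion, you can confirm that the new instance has joined as a full voting member. A minimal sketch, run on the admin node:

etcdctl member list -w table   # the new member should appear with IS LEARNER = false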

How to remove a member from an existing etcd cluster?

Check Remove member from etcd cluster

etcdctl member remove <etcd_server_id>   # kick member out of the cluster (on admin node)
./etcd.yml -l <ins_ip> -t etcd_purge     # purge etcd instance
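
You can then confirm that the member is gone and that the remaining cluster is healthy; you will typically also want to remove the instance from the etcd group in your inventory so later playbook runs do not recreate it. A minimal sketch, run on the admin node:

etcdctl member list -w table                   # the removed member should no longer be listed
etcdctl endpoint health --cluster -w table     # remaining members should report healthy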
