Operations SOP Index
Getting Started Path
| Order | Question | Entry |
|---|---|---|
| 1 | What modules does Pigsty include? | Modular Architecture, PGSQL Architecture, PGSQL Cluster Model |
| 2 | How do I get it running first? | Quick Start, Web UI, PostgreSQL Quick Start |
| 3 | How should I read the config file? | Declarative Configuration, Configuration Guide, Configuration Parameters |
| 4 | What should I prepare for production? | Planning, Preparation, Administration Model |
| 5 | How do I deploy a multi-node cluster? | Production Deployment, Playbooks, PGSQL Playbooks |
| 6 | How do I operate databases daily? | PGSQL Administration, Cluster Management, User Management, Database Management |
| 7 | How do I validate reliability? | PostgreSQL HA, Patroni Management, Backup & Restore, Restore Operations |
Task Index
| Task | Read First | Operation Entry |
|---|---|---|
| Prepare servers, disks, networks, VIPs | Preparation, Planning, Linux Compatibility | Production Deployment |
| Prepare SSH, sudo, and admin users | Administration Model | Production Deployment |
| Build a local or cloud sandbox | Sandbox | Vagrant, Terraform |
| Single-node trial | Quick Start | ./configure -g, ./deploy.yml |
| Multi-node production deployment | Deployment, Production Deployment | ./deploy.yml, ./pgsql.yml |
| Deploy in an offline environment | Offline Installation | Repository Management |
| Choose a config template | Config Templates, Template List | ./configure -c <template> |
| Plan cluster, database, and user names | PGSQL Cluster Model | pg_cluster, pg_databases, pg_users |
| Create a PostgreSQL cluster | Cluster Instance Config | Cluster Management, ./pgsql.yml -l <cluster> |
| Add business users | User/Role Config | User Management, ./pgsql-user.yml -l <cluster> |
| Add business databases | Database Config | Database Management, ./pgsql-db.yml -l <cluster> |
| Configure access endpoints | Service/Access | pg_services, pg_default_services |
| Modify HBA | HBA Config | HBA Management |
| Switchover | Patroni Management | patronictl switchover |
| HA drills | PostgreSQL HA, RPO, RTO | 3-of-2 Failure Drill |
| Configure VIP | HA Service Access | Configure PG VIP |
| Configure backup policy | Backup Policy | Backup Admin Commands |
| Perform PITR | Point-in-Time Recovery | Restore Operations |
| Recover dropped data, tables, or databases | Drop Recovery | Manual Recovery |
| Clone or fork a cluster | Clone Database Cluster | Fork Instance |
| Use MinIO for backups | MINIO Module | MinIO Config, Backup Repository |
| View monitoring and alerts | Monitoring System | PGSQL Monitoring, PGSQL Dashboards |
| Troubleshoot database failures | PGSQL FAQ | Troubleshooting, Component Management |
| Scale PostgreSQL clusters | Cluster Instance Config | Cluster Management |
| Upgrade PostgreSQL | Version Upgrade | Kernel Versions |
| Install or enable extensions | Extensions | Extension Management |
| Migrate existing databases | Data Migration | Migration Playbook |
| Harden security | Security Considerations | Access Control, CA & Certificates |
| Manage domains and web entrypoints | Domain Management | Nginx Management |
| Maintain infrastructure | INFRA Administration | infra.yml, infra-rm.yml |
| Maintain Etcd | ETCD Config | ETCD Management, ETCD FAQ |
| Deploy app templates | Applications | Docker Module, ./app.yml |
Preparation And Deployment
For production deployment, start with Planning and Preparation. These two pages cover node count, disks, filesystems, networks, VIPs, domains, and software sources.
After machines are ready, read Administration Model: admin users, passwordless SSH, sudo, reachability, and firewall handling are covered there. Check system versions and architectures in Linux Compatibility.
Use Quick Start for the first installation. Use Production Deployment for multi-node production environments. If there is no internet access, read Offline Installation and Repository Management.
Do not overthink template choice at the beginning: use meta for single-node default deployment; ha/trio for three-node HA; ha/full for more complete HA; ha/safe when consistency matters; and ha/dual or ha/simu when resources are tight.
Naming And Configuration
First distinguish three names: cluster name, database name, and service name.
pg_cluster is the top-level name Pigsty uses to manage a PostgreSQL cluster. It affects instance names, service names, backup stanzas, monitoring labels, and many file paths. It is not a display name that can be casually changed. See PGSQL Cluster Model for naming rules, Cluster Instance Config for instance roles, and Service/Access for service names and connection entrypoints.
Database names and user names are logical objects inside PostgreSQL. See Database Config and Database Management for databases; User/Role Config and User Management for users and roles; and Access Control plus ACL Config for the privilege model.
In practice, use lowercase letters, digits, and hyphens for cluster names, such as pg-meta, pg-test, and pg-user-prod. Use snake_case for database object names. Avoid non-ASCII names, spaces, mixed case, and SQL keywords. For more background, read Database Cluster Management Concepts and Entity Naming Rules and PostgreSQL Convention (2024 Edition).
Follow one habit for config changes: edit pigsty.yml first, then run the corresponding playbook. See Declarative Configuration and Configuration Guide for config structure; Configuration Parameters and Parameter List for parameter meanings; Playbooks and Playbook List for playbook entrypoints.
Daily Operations
The main entrypoint for database operations is PGSQL Administration.
| Operation | Documentation |
|---|---|
| Create, scale, shrink, retire, and clone clusters | Cluster Management |
| Create, modify, and remove business users | User Management |
| Create, modify, drop, and rebuild databases | Database Management |
| Refresh and troubleshoot HBA | HBA Management |
| View HA status, switchover, restart, and reinitialize replicas | Patroni Management |
| Manage connection pools | Pgbouncer Management |
| Start and stop PostgreSQL, Patroni, Pgbouncer, and Exporter | Component Management |
| Manage backups, checks, cleanup, and restore | pgBackRest Management |
| Configure scheduled backup, vacuum, analyze, and other tasks | Crontab |
| Upgrade versions and extensions | Version Upgrade, Extension Management |
For background reading, see Routine PostgreSQL Maintenance.
High Availability Drills
To understand HA, start with PostgreSQL HA. Do not only ask “can it fail over automatically”; also read RPO and RTO: the former is the maximum acceptable data loss, and the latter is the time to restore service.
For access-layer behavior, see HA Service Access and Service/Access. For component relationships, see PGSQL Architecture. For Etcd’s role, see ETCD Config.
Drill entrypoints are concentrated in three places: Patroni Management for planned switchover; Component Management for service status; and 3-of-2 Failure Drill for extreme failures. If you need VIP, read Configure PG VIP.
Background article: How Should PostgreSQL High Availability Be Done?.
Backup And Recovery
For PITR, read Point-in-Time Recovery first, then Mechanism, Architecture, Tradeoffs, and Scenarios.
For configuration and maintenance, see Backup & Restore, Backup Policy, Backup Mechanism, Backup Repository, and Backup Admin Commands.
When actually restoring, use Restore Operations for the automatic path and Manual Recovery for drills. For dropped data, tables, or databases, see Drop Recovery. If you do not want to touch the original cluster directly, start with Clone Database Cluster or Fork Instance.
Before recovery, confirm at least four things: the target timestamp or restore point is clear; backup and WAL are continuous; business writes have stopped; and you know whether you are restoring in place or first pulling up a new validation cluster.
Background articles: Overview of Backup and Recovery Methods and PgBackRest2 Documentation.
Monitoring And Troubleshooting
For monitoring overview, see Monitoring System. For entrypoints and domains, see Web UI. For database metrics, logs, and alerts, see PGSQL Monitoring and PGSQL Dashboards.
Monitoring for non-database modules is documented separately: INFRA Monitoring, NODE Monitoring, ETCD Monitoring, and MINIO Monitoring.
Start troubleshooting with PGSQL FAQ, then Troubleshooting. For connection authentication issues, see HBA Management; for HA status issues, see Patroni Management; for process state issues, see Component Management.
General PostgreSQL troubleshooting articles: Routine PG Server Logging Configuration, Macro Query Optimization with pg_stat_statements, Incident File: PostgreSQL Transaction ID Wraparound, Finding Fake Indexes, and Table Bloat Cleanup.
Scaling, Upgrades, And Migration
For capacity and topology design, see Planning, Preparation, and PGSQL Cluster Model.
When scaling by module, see Cluster Management for PGSQL; NODE Management for NODE; ETCD Management for ETCD; MINIO Management for MINIO; INFRA Administration for INFRA; and REDIS Management for REDIS.
For PostgreSQL upgrades, see Version Upgrade and Kernel Versions. For extensions, see Extensions, Extension Management, Extension Repository, and Package Aliases.
For migrating existing PostgreSQL databases, see Data Migration and PGSQL Migration Playbook. For low-downtime migration ideas, see Migration without Downtime.
If you need horizontal scaling, then read Citus Cluster Deployment and Citus Kernel Branch.
Security And Entrypoints
For deployment security, start with Security Considerations. For the security model, see Security Compliance. For PostgreSQL privileges, see Access Control and ACL Config; for authentication rules, see HBA Config and HBA Management.
For certificates, see CA & Certificates. For domains, Nginx, and web entrypoints, see Domain Management and Nginx Management.
For production, at minimum change default passwords, tighten HBA, clearly separate business users from admin users, and confirm backup repository retention, encryption, and access permissions.
Application Access
Before applications connect to databases, read Service/Access and PostgreSQL Quick Start. For connection pool behavior, see Pgbouncer Management.
When using Pigsty-managed databases and deploying stateless applications, see Application Templates and Docker Module.
Common Mistakes
| Mistake | Where To Read |
|---|---|
Treating pg_cluster as a casually changeable display name | PGSQL Cluster Model |
| Confusing database names, cluster names, and service names | Naming And Configuration, Service/Access |
| Deploying only the primary without restore drills | Manual Recovery, Restore Operations |
| Assuming HA always means zero data loss | RPO, RTO |
| Doing the first failover drill directly in production | Sandbox, 3-of-2 Failure Drill |
| Ignoring Etcd | ETCD Module, ETCD FAQ |
| Checking backup success without verifying restore | Backup & Restore, Clone Database Cluster |
| Changing HBA, certificates, or service entrypoints without a rollback path | Security Compliance, HBA Management, Nginx Management |
Further Reading
| Topic | Article |
|---|---|
| Naming and entity model | Database Cluster Management Concepts and Entity Naming Rules |
| PostgreSQL conventions | PostgreSQL Convention (2024 Edition) |
| High availability | How Should PostgreSQL High Availability Be Done? |
| Backup and recovery | Overview of Backup and Recovery Methods, PgBackRest2 Documentation |
| Routine maintenance | Routine PostgreSQL Maintenance |
| Connection pooling | Pgbouncer Quick Start |
| Query and workload | Macro Query Optimization with pg_stat_statements, PostgreSQL KPIs |
| Logging and incidents | Routine PG Server Logging Configuration, Incident File: PostgreSQL Transaction ID Wraparound |
| Ecosystem and extensions | PostgreSQL Is Eating the Database World, Pig: Package Manager for PostgreSQL Kernels and Extensions |
Feedback
Was this page helpful?
Thanks for the feedback! Please let us know how we can improve.
Sorry to hear that. Please let us know how we can improve.