Point-in-Time Recovery

Pigsty utilizes pgBackRest for PostgreSQL point-in-time recovery, allowing users to roll back to any point within the backup policy limits.

Overview

You can roll back your cluster to any point in time, avoiding data loss caused by software defects and human errors.

Pigsty’s PostgreSQL clusters come with an automatically configured Point in Time Recovery (PITR) solution, based on the backup component pgBackRest and the optional object storage repository MinIO.

High Availability solutions can address hardware failures, but they are powerless against data deletions/overwrites/deletions caused by software defects and human errors. For such scenarios, Pigsty offers an out-of-the-box Point in Time Recovery (PITR) capability, enabled by default without any additional configuration.

Pigsty provides you with the default configuration for base backups and WAL archiving, allowing you to use local directories and disks, or dedicated MinIO clusters or S3 object storage services to store backups and achieve off-site disaster recovery. When using local disks, by default, you retain the ability to recover to any point in time within the past day. When using MinIO or S3, by default, you retain the ability to recover to any point in time within the past week. As long as storage space permits, you can keep a recoverable time span as long as desired, based on your budget.


What problems does Point in Time Recovery (PITR) solve?

  • Enhanced disaster recovery capability: RPO reduces from ∞ to a few MBs, RTO from ∞ to a few hours/minutes.
  • Ensures data security: Data Integrity among C/I/A: avoids data consistency issues caused by accidental deletions.
  • Ensures data security: Data Availability among C/I/A: provides a safety net for “permanently unavailable” disasters.
Singleton Strategy Event RTO RPO
Do nothing Crash Permanently lost All lost
Basic backup Crash Depends on backup size and bandwidth (a few hours) Loss of data after the last backup (a few hours to days)
Basic backup + WAL Archiving Crash Depends on backup size and bandwidth (a few hours) Loss of data not yet archived (a few dozen MBs)

What are the costs of Point in Time Recovery?

  • Reduced C in data security: Confidentiality, creating additional leakage points, requiring extra protection for backups.
  • Additional resource consumption: local storage or network traffic/bandwidth costs, usually not a problem.
  • Increased complexity cost: users need to invest in backup management.

Limitations of Point in Time Recovery

If PITR is the only method for fault recovery, the RTO and RPO metrics are inferior compared to High Availability solutions, and it’s usually best to use both in combination.

  • RTO: With only a single machine + PITR, recovery time depends on backup size and network/disk bandwidth, ranging from tens of minutes to several hours or days.
  • RPO: With only a single machine + PITR, a crash might result in the loss of a small amount of data, as one or several WAL log segments might not yet be archived, losing between 16 MB to several dozen MBs of data.

Apart from PITR, you can also use Delayed Clusters in Pigsty to address data deletion or alteration issues caused by human errors or software defects.


How does PITR works?

Point in Time Recovery allows you to roll back your cluster to any “specific moment” in the past, avoiding data loss caused by software defects and human errors. To achieve this, two key preparations are necessary: Base Backups and WAL Archiving. Having a Base Backup allows users to restore the database to the state at the time of the backup, while having WAL Archiving from a certain base backup enables users to restore the database to any point in time after the base backup.

fig-10-02.png

For a detailed principle, refer to: Base Backups and Point in Time Recovery; for specific operations, refer to PGSQL Management: Backup and Restore.

Base Backups

Pigsty uses pgBackRest to manage PostgreSQL backups. pgBackRest will initialize an empty repository on all cluster instances, but it will only use the repository on the primary instance.

pgBackRest supports three backup modes: Full Backup, Incremental Backup, and Differential Backup, with the first two being the most commonly used. A Full Backup takes a complete physical snapshot of the database cluster at a current moment, while an Incremental Backup records the differences between the current database cluster and the last full backup.

Pigsty provides a wrapper command for backups: /pg/bin/pg-backup [full|incr]. You can make base backups periodically as needed through Crontab or any other task scheduling system.

WAL Archiving

By default, Pigsty enables WAL archiving on the primary instance of the cluster and continuously pushes WAL segment files to the backup repository using the pgbackrest command-line tool.

pgBackRest automatically manages the required WAL files and promptly cleans up expired backups and their corresponding WAL archive files according to the backup retention policy.

If you do not need PITR functionality, you can disable WAL archiving by configuring the cluster: archive_mode: off, and remove node_crontab to stop periodic backup tasks.


Implementation

By default, Pigsty provides two preset backup strategies: using the local filesystem backup repository by default, where a full backup is taken daily to ensure users can roll back to any point within a day at any time. The alternative strategy uses a dedicated MinIO cluster or S3 storage for backups, with a full backup on Monday and incremental backups daily, keeping two weeks of backups and WAL archives by default.

Pigsty uses pgBackRest to manage backups, receive WAL archives, and perform PITR. The backup repository can be flexibly configured (pgbackrest_repo): by default, it uses the local filesystem (local) of the primary instance, but it can also use other disk paths, or the optional MinIO service (minio) and cloud-based S3 services.

pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repo with local posix fs
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retention full backups by count
    retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
  minio:                          # optional minio repo for pgbackrest
    type: s3                      # minio is s3-compatible, so s3 is used
    s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    s3_uri_style: path            # use path style uri for minio rather than host style
    path: /pgbackrest             # minio backup path, default is `/pgbackrest`
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    retention_full_type: time     # retention full backup by time on minio repo
    retention_full: 14            # keep full backup for last 14 days

Pigsty has two built-in backup options: local file system repository with daily full backups or dedicated MinIO/S3 storage with weekly full and daily incremental backups, retaining two weeks’ worth by default.

The target repositories in Pigsty parameter pgbackrest_repo are translated into repository definitions in the /etc/pgbackrest/pgbackrest.conf configuration file. For example, if you define a West US region S3 repository for cold backups, you could use the following reference configuration.

s3:    # ------> /etc/pgbackrest/pgbackrest.conf
  repo1-type: s3                                   # ----> repo1-type=s3
  repo1-s3-region: us-west-1                       # ----> repo1-s3-region=us-west-1
  repo1-s3-endpoint: s3-us-west-1.amazonaws.com    # ----> repo1-s3-endpoint=s3-us-west-1.amazonaws.com
  repo1-s3-key: '<your_access_key>'                # ----> repo1-s3-key=<your_access_key>
  repo1-s3-key-secret: '<your_secret_key>'         # ----> repo1-s3-key-secret=<your_secret_key>
  repo1-s3-bucket: pgsql                           # ----> repo1-s3-bucket=pgsql
  repo1-s3-uri-style: host                         # ----> repo1-s3-uri-style=host
  repo1-path: /pgbackrest                          # ----> repo1-path=/pgbackrest
  repo1-bundle: y                                  # ----> repo1-bundle=y
  repo1-cipher-type: aes-256-cbc                   # ----> repo1-cipher-type=aes-256-cbc
  repo1-cipher-pass: pgBackRest                    # ----> repo1-cipher-pass=pgBackRest
  repo1-retention-full-type: time                  # ----> repo1-retention-full-type=time
  repo1-retention-full: 90                         # ----> repo1-retention-full=90

Recovery

You can use the following encapsulated commands for Point in Time Recovery of the PostgreSQL database cluster.

By default, Pigsty uses incremental, differential, parallel recovery, allowing you to restore to a specified point in time as quickly as possible.

pg-pitr                                 # restore to wal archive stream end (e.g. used in case of entire DC failure)
pg-pitr -i                              # restore to the time of latest backup complete (not often used)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (in case of drop db, drop table)
pg-pitr --name="my-restore-point"       # restore TO a named restore point create by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore right BEFORE a LSN
pg-pitr --xid="1234567" -X -P           # restore right BEFORE a specific transaction id, then promote
pg-pitr --backup=latest                 # restore to latest backup set
pg-pitr --backup=20221108-105325        # restore to a specific backup set, which can be checked with pgbackrest info

pg-pitr                                 # pgbackrest --stanza=pg-meta restore
pg-pitr -i                              # pgbackrest --stanza=pg-meta --type=immediate restore
pg-pitr -t "2022-12-30 14:44:44+08"     # pgbackrest --stanza=pg-meta --type=time --target="2022-12-30 14:44:44+08" restore
pg-pitr -n "my-restore-point"           # pgbackrest --stanza=pg-meta --type=name --target=my-restore-point restore
pg-pitr -b 20221108-105325F             # pgbackrest --stanza=pg-meta --type=name --set=20221230-120101F restore
pg-pitr -l "0/7C82CB8" -X               # pgbackrest --stanza=pg-meta --type=lsn --target="0/7C82CB8" --target-exclusive restore
pg-pitr -x 1234567 -X -P                # pgbackrest --stanza=pg-meta --type=xid --target="0/7C82CB8" --target-exclusive --target-action=promote restore

During PITR, you can observe the LSN point status of the cluster using the Pigsty monitoring system to determine if it has successfully restored to the specified time point, transaction point, LSN point, or other points.

pitr


Last modified 2024-03-07: fix en docs typo (805ca44)