
Pigsty Document v2.7

“PostgreSQL In Great STYle”: Postgres, Infras, Graphics, Service, Toolbox, it’s all Yours.

—— Battery-Included, Local-First PostgreSQL Distribution as an Open-Source RDS Alternative

Repo | Demo | Blog | CN Blog | Discuss | Discord | GPTs | Roadmap | 中文文档

Get Started with the latest v2.7.0 release: curl -fsSL https://get.pigsty.cc/i | bash


Setup: Install | Offline Install | Preparation | Configuration | Playbook | Provision | Security | FAQ

Modules: PGSQL | INFRA | NODE | ETCD | MINIO | REDIS | MONGO | DOCKER | APP

About: Features | History | Event | Community | Privacy Policy | License | Sponsor | Support & Subscription

Concept: Architecture | Cluster Model | Local CA | IaC | HA | PITR | Service Access | Access Control

Reference: Compatibility | Parameters | Extensions | FHS | Comparison | Cost | Glossary

1 - About Pigsty

Learn about Pigsty itself: features, values, history, license, privacy policy, events, and news.

1.1 - Features

Why use Pigsty? The features and values of Pigsty

“PostgreSQL In Great STYle”: Postgres, Infras, Graphics, Service, Toolbox, it’s all Yours.

—— Battery-Included, Local-First PostgreSQL Distribution as an Open-Source RDS Alternative


Values


Overview

  • Battery-Included RDS: Delivers production-ready PostgreSQL services, versions 12-16, on Linux x86, spanning everything from the kernel to a full RDS distribution.
  • Plentiful Extensions: Integrates 220+ extensions, providing turnkey capabilities for time-series, geospatial, full-text search, vector workloads, and more!
  • Flexible Architecture: Compose Redis/Etcd/MinIO/Mongo modules on nodes, monitor existing clusters and remote RDS, or self-host Supabase.
  • Stunning Observability: Leverages the modern Prometheus/Grafana observability stack to provide unmatched database insights.
  • Proven Reliability: Self-healing HA architecture with automatic failover and uninterrupted client access, plus pre-configured PITR.
  • Great Maintainability: Declarative API, GitOps-ready, foolproof design, Database/Infra-as-Code, and management SOPs that seal away complexity!
  • Sound Security: Nothing to worry about regarding database security, as long as your hardware & credentials are safe.
  • Versatile Application: Lots of applications work well with PostgreSQL. Run them with a single command using Docker.
  • Open Source & Free: Pigsty is free & open-source software under AGPLv3. It was built for PostgreSQL with love.

Pigsty is built upon industry best practices.

Pigsty can be used in different scenarios:

  • Run HA PostgreSQL RDS for production usage, with PostGIS, TimescaleDB, Citus, Hydra, etc…
  • Run AI infra stack with pgvector and PostgresML.
  • Develop low-code apps with self-hosted Supabase, FerretDB, and NocoDB.
  • Run various business software & apps with docker-compose templates.
  • Run demos & data apps, analyze data, and visualize them with ECharts panels.

pigsty-home.jpg


Battery-Included RDS

Run production-grade RDS for PostgreSQL on your own machine in 10 minutes!

While PostgreSQL shines as a database kernel, it excels as a Relational Database Service (RDS) with Pigsty’s touch.

Pigsty is compatible with PostgreSQL 12-16 and runs seamlessly on EL 7/8/9, Debian 11/12, Ubuntu 20/22, and similar OS distributions. It integrates the kernel with a rich set of extensions and provides everything essential for a production-ready RDS: an entire infrastructure runtime coupled with fully automated deployment playbooks, with everything bundled for offline installation without Internet connectivity.

You can go from a fresh node to a production-ready state effortlessly and deploy a top-tier PostgreSQL RDS service in a mere 10 minutes. Pigsty will tune parameters to your hardware, handling everything from kernel, extensions, pooling, load balancing, high availability, monitoring & logging, backups & PITR, security, and more! All you need to do is run the command and connect with the given URL.
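
For reference, here is a minimal sketch of what a single-node pigsty.yml inventory may look like after running the configure script. The group and parameter names (infra, etcd, pg-meta, infra_seq, etcd_seq, pg_seq, pg_role, pg_cluster, admin_ip) follow the Pigsty configuration reference, but treat this as an illustrative outline rather than a drop-in config, and compare it with the template generated for your own environment.

all:
  children:
    infra:                              # infrastructure: local repo, Prometheus, Grafana, Loki, ...
      hosts: { 10.10.10.10: { infra_seq: 1 } }
    etcd:                               # DCS used by the HA PostgreSQL cluster
      hosts: { 10.10.10.10: { etcd_seq: 1 } }
      vars: { etcd_cluster: etcd }
    pg-meta:                            # a single-node PostgreSQL cluster named pg-meta
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-meta }
  vars:
    admin_ip: 10.10.10.10               # the admin node's IP address
# after reviewing the config, apply it with the install playbook: ./install.yml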

pigsty-arch.jpg


Plentiful Extensions

Harness the might of the most advanced Open-Source RDBMS in the world!

PostgreSQL has a unique extension ecosystem. Pigsty seamlessly integrates these powerful extensions, delivering turnkey distributed solutions for time-series, geospatial, and vector capabilities.

Pigsty boasts over 220 PostgreSQL extensions and maintains some not found in the official PGDG repositories. Rigorous testing ensures flawless integration for core extensions: leverage PostGIS for geospatial data, TimescaleDB for time-series analysis, Citus for horizontal scale-out, PGVector for AI embeddings, Apache AGE for graph data, ParadeDB for full-text search, and Hydra, DuckDB FDW, and pg_analytics for OLAP workloads!

You can also run self-hosted Supabase & PostgresML with Pigsty-managed HA PostgreSQL. If you want to add your own extension, feel free to suggest it or compile it yourself.
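
As an illustration (not a complete configuration), extensions are typically declared right in the cluster definition: packages to install via pg_extensions, libraries to preload via pg_libs, and per-database extensions under pg_databases. Package aliases differ across OS distributions and Pigsty releases, so verify the names against the extension list for your version.

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_extensions: [ postgis, timescaledb, pgvector ]        # packages to install (aliases are version/OS dependent)
    pg_libs: 'timescaledb, pg_stat_statements, auto_explain' # extensions that must be preloaded
    pg_databases:
      - name: meta
        extensions:                                          # extensions to create in this database
          - { name: postgis, schema: public }
          - { name: vector }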

pigsty-extension.jpg


Flexible Architecture

Modular design, composable deployment, Redis/MinIO/Etcd/Mongo support, and monitoring of existing PG & RDS

All functionality is abstracted as Modules that can be freely composed for different scenarios. INFRA gives you a modern observability stack, while NODE can be used for host monitoring. Installing the PGSQL module on multiple nodes will automatically form an HA cluster.

You can also have dedicated ETCD clusters for distributed consensus and MinIO clusters for backup storage. REDIS is also supported, since it works well with PostgreSQL. You can reuse Pigsty's infra and extend it with your own modules (e.g. GPSQL, KAFKA, MONGO, MYSQL…).
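
As a rough sketch, composing modules is mostly a matter of adding inventory groups. The group and parameter names below (etcd_seq, minio_seq, redis_node, redis_instances) follow the corresponding module documentation, but the exact variables can differ between releases, so consult the module pages before copying this.

etcd:            # a dedicated 3-node DCS cluster
  hosts:
    10.10.10.11: { etcd_seq: 1 }
    10.10.10.12: { etcd_seq: 2 }
    10.10.10.13: { etcd_seq: 3 }
  vars: { etcd_cluster: etcd }

minio:           # optional S3-compatible object storage for backups
  hosts: { 10.10.10.21: { minio_seq: 1 } }
  vars: { minio_cluster: minio }

redis-meta:      # a redis cluster managed by the REDIS module
  hosts: { 10.10.10.31: { redis_node: 1, redis_instances: { 6379: {} } } }
  vars: { redis_cluster: redis-meta }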

Moreover, Pigsty’s INFRA module can be used alone — ideal for monitoring hosts, databases, or cloud RDS.

pigsty-sandbox.jpg


Stunning Observability

Unparalleled monitoring system based on modern observability stack and open-source best-practice!

Pigsty will automatically monitor any newly deployed components such as Node, Docker, HAProxy, Postgres, Patroni, Pgbouncer, Redis, Minio, and itself. There are 30+ default dashboards and pre-configured alerting rules, which will upgrade your system’s observability to a whole new level. Of course, it can be used as your application monitoring infrastructure too.

There are 3K+ metrics describing every aspect of your environment, from the topmost overview dashboard down to individual tables, indexes, functions, and sequences. As a result, you gain complete insight into the past, present, and future.

Check the dashboard gallery and public demo for more details.

pigsty-dashboard.jpg


Proven Reliability

Pigsty has pre-configured HA & PITR for PostgreSQL to ensure your database service is always reliable.

Hardware failures are covered by self-healing HA architecture powered by patroni, etcd, and haproxy, which will perform auto failover in case of leader failure (RTO < 30s), and there will be no data loss (RPO = 0) in sync mode. Moreover, with the self-healing traffic control proxy, the client may not even notice a switchover/replica failure.

Software failures, human errors, and data center failures are covered by cold backups & PITR, implemented with pgBackRest. It allows you to travel back to any point in your database’s history, as long as your storage capacity allows. You can store backups on a local backup disk, in the built-in MinIO cluster, or in an S3 service.
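
The sketch below shows how these behaviors are typically toggled in a cluster definition: a synchronous (“crit”) tuning template for RPO = 0, and a MinIO backup repository for PITR. The parameter names (pg_conf, pgbackrest_method) follow the Pigsty parameter reference; confirm the available templates and repository definitions in your version before relying on them.

pg-prod:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
  vars:
    pg_cluster: pg-prod
    pg_conf: crit.yml            # tuning template with synchronous replication (RPO = 0)
    pgbackrest_method: minio     # keep base backups & WAL archives in the MinIO repo instead of the local disk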

Large organizations have used Pigsty for several years. One of the largest deployments has 25K CPU cores and 220+ massive PostgreSQL instances. In the past three years, there have been dozens of hardware failures & incidents, but the overall availability remains several nines (99.999% +).

pigsty-ha.png


Great Maintainability

Infra as Code, Database as Code, declarative API & idempotent playbooks; GitOps works like a charm.

Pigsty provides a declarative interface: Describe everything in a config file, and Pigsty operates it to the desired state. It works like Kubernetes CRDs & Operators but for databases and infrastructures on any nodes: bare metal or virtual machines.

To create clusters/databases/users/extensions, expose services, or add replicas, all you need to do is modify the cluster definition and run the idempotent playbook. Databases & nodes are tuned automatically according to their hardware specs, and monitoring & alerting is pre-configured. As a result, database administration becomes much more manageable.
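
For example, a three-node HA cluster is described roughly like this (a simplified sketch following the conventions in the configuration docs); adding or removing a replica is just a one-line change followed by re-running the playbook:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }    # add/remove lines like this one to resize the cluster
  vars:
    pg_cluster: pg-test
    pg_users:     [ { name: dbuser_test, password: DBUser.Test, roles: [ dbrole_readwrite ] } ]
    pg_databases: [ { name: test, owner: dbuser_test } ]
# apply the definition with the idempotent playbook: ./pgsql.yml -l pg-test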

Pigsty has a full-featured sandbox powered by Vagrant: a pre-configured 1-node or 4-node environment for testing & demonstration purposes. You can also provision the required IaaS resources from cloud vendors with Terraform templates.

pigsty-iac.jpg


Sound Security

Nothing to worry about regarding database security, as long as your hardware & credentials are safe.

Pigsty uses SSL for API & network traffic, encryption for passwords & backups, HBA rules for hosts & clients, and access control for users & objects.

Pigsty has an easy-to-use, fine-grained, and fully customizable access control framework based on roles, privileges, and HBA rules. It has four default roles (read-only, read-write, admin for DDL, offline for ETL) and four default users (dbsu, replicator, monitor, and admin). Newly created database objects will have proper default privileges for those roles, and client access is restricted by a set of HBA rules that follow the least-privilege principle.

Your entire network communication can be secured with SSL. Pigsty will automatically create a self-signed CA and issue certs for that. Database credentials are encrypted with the scram-sha-256 algorithm, and cold backups are encrypted with the AES-256 algorithm when using MinIO/S3. Admin Pages and dangerous APIs are protected with HTTPS, and access is restricted from specific admin/infra nodes.
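
To make this concrete, business users and HBA rules are declared alongside the cluster definition. The default role names (dbrole_readonly, dbrole_readwrite, dbrole_admin, dbrole_offline) come from Pigsty’s access control model, while the specific users and rule below are hypothetical examples:

pg_users:
  - { name: dbuser_app, password: DBUser.App, roles: [ dbrole_readwrite ] }   # hypothetical business user
  - { name: dbuser_etl, password: DBUser.ETL, roles: [ dbrole_offline ] }     # hypothetical offline/ETL user
pg_hba_rules:
  - { user: dbuser_app, db: all, addr: intra, auth: pwd, title: 'allow app user from intranet with password' }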

pigsty-acl.jpg


Versatile Application

Lots of applications work well with PostgreSQL. Run them in one command with docker.

The database is usually the trickiest part of most software. Since Pigsty already provides the RDS, it is handy to have a series of Docker templates to run software in stateless mode and persist their data with Pigsty-managed HA PostgreSQL (or Redis, MinIO), including Gitlab, Gitea, Wiki.js, NocoDB, Odoo, Jira, Confluence, Harbor, Mastodon, Discourse, and Keycloak.

Pigsty also provides a toolset to help you manage your database and build data applications in a low-code fashion: PGAdmin4, PGWeb, ByteBase, PostgREST, Kong, and higher-level “databases” that use Postgres as the underlying storage, such as EdgeDB, FerretDB, and Supabase. And since you already have Grafana & Postgres, you can quickly build an interactive data application demo with them. In addition, advanced visualization can be achieved with the built-in ECharts panel.
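
As one illustrative example, a stateless FerretDB container can be pointed at a Pigsty-managed PostgreSQL with a few lines of Docker Compose. The image name and the FERRETDB_POSTGRESQL_URL variable follow FerretDB’s own documentation, and the connection string is a placeholder for your actual database user and service address:

# docker-compose.yml (illustrative sketch)
services:
  ferretdb:
    image: ghcr.io/ferretdb/ferretdb        # per FerretDB docs; pin a concrete version in practice
    restart: unless-stopped
    ports: [ "27017:27017" ]                # expose the MongoDB wire protocol
    environment:
      # point FerretDB at a Pigsty-managed HA PostgreSQL database (placeholder DSN)
      FERRETDB_POSTGRESQL_URL: postgres://dbuser_meta:DBUser.Meta@10.10.10.10:5432/meta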

pigsty-app.jpg


Open Source & Free

Pigsty is a free & open source software under AGPLv3. It was built for PostgreSQL with love.

Pigsty allows you to run production-grade RDS on your own hardware without the burden of hiring database experts. You can achieve the same or even better reliability, performance, and maintainability at only 5% ~ 40% of the cost of cloud RDS for PostgreSQL, which may make your RDS cheaper than plain ECS instances.

There will be no vendor lock-in, annoying license fee, and node/CPU/core limit. You can have as many RDS as possible and run them as long as possible. All your data belongs to you and is under your control.

Pigsty is free software under AGPLv3. It’s free of charge, but beware that freedom is not free, so use it at your own risk! It’s not very difficult, and we are glad to help. For those enterprise users who seek professional consulting services, we do have a subscription for that.

pigsty-price.jpg

1.2 - Modules

This section lists the available feature modules within Pigsty, as well as planned future modules.

Core Modules

Pigsty offers four CORE modules, which are essential for providing PostgreSQL service:

  • PGSQL : An autonomous, highly available PostgreSQL cluster powered by Patroni, Pgbouncer, HAproxy, PgBackrest, and others.
  • INFRA : Local software repository, Prometheus, Grafana, Loki, AlertManager, PushGateway, Blackbox Exporter, etc.
  • NODE : Adjusts the node to the desired state, covering hostname, time zone, NTP, ssh, sudo, haproxy, docker, promtail, and keepalived.
  • ETCD : Distributed key-value store, serving as the DCS (Distributed Consensus System) for the highly available Postgres cluster: consensus leadership election, configuration management, service discovery.

Extended Modules

Pigsty offers four OPTIONAL modules, which are not necessary for the core functionality but can enhance the capabilities of PostgreSQL:

  • MINIO: S3-compatible simple object storage server, serving as an optional PostgreSQL database backup repository with production deployment support and monitoring.
  • REDIS: Redis server, a high-performance data structure server supporting standalone master-slave, sentinel, and cluster mode production deployments, with comprehensive monitoring support.
  • MONGO: Native deployment support for FerretDB, adding MongoDB wire-protocol API compatibility to PostgreSQL!
  • DOCKER: Docker Daemon service, allowing users to easily deploy containerized stateless software tool templates, adding various functionalities.

Pilot Modules

Pigsty includes some pilot and planned functional modules. If you’re interested, you may consider trying them out and providing us with suggestions and feedback:

  • DUCK: Pigsty by default includes DuckDB in the offline software package, offering powerful standalone embedded analytics capabilities (Beta).
  • CLOUD: Pigsty plans to use SealOS to provide out-of-the-box production-grade Kubernetes deployment and monitoring support (Beta).
  • MYSQL: Pigsty is researching adding high-availability deployment support for MySQL as an optional extension feature (Alpha).
  • GPSQL: Pigsty plans to support production-level deployment of Greenplum (Alpha).
  • KAFKA: Pigsty plans to offer message queue support (Draft).

Docker Modules

Pigsty also provides some services that can be quickly deployed through the DOCKER module, such as various second-order “databases” that use PostgreSQL as the actual state storage:

  • Supabase: An open-source alternative to Firebase based on PostgreSQL. Pigsty provides the necessary extensions in EL series operating systems to allow you to quickly set up Supabase.
  • FerretDB: An open-source MongoDB alternative based on PostgreSQL. You can quickly deploy containerized FerretDB using Docker Compose.
  • NocoDB: An open-source AirTable alternative based on PostgreSQL, offering low-code application development capabilities. If you need a Web Excel, consider using NocoDB.
  • Metabase: Enables quick in-database data analysis with a friendly web interface toolbox. When you need to explore data in PostgreSQL, consider using Metabase.

Monitoring Other Databases

Pigsty’s INFRA module can be used independently as a plug-and-play monitoring infrastructure for other nodes or existing PostgreSQL databases (a configuration sketch follows the list below):

  • Existing PostgreSQL services: Pigsty can monitor external, non-Pigsty managed PostgreSQL services, still providing relatively complete monitoring support.
  • RDS PG: Cloud vendor-provided PostgreSQL RDS services, treated as standard external Postgres instances for monitoring.
  • PolarDB: Alibaba Cloud’s cloud-native database, treated as an external PostgreSQL 11 / 14 instance for monitoring.
  • KingBase: A trusted domestic database provided by People’s University of China, treated as an external PostgreSQL 12 instance for monitoring.
  • Greenplum / YMatrixDB: Currently monitored as horizontally partitioned PostgreSQL clusters.
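
Here is a rough sketch of how an external or cloud PostgreSQL instance is registered for monitoring: remote instances are listed under pg_exporters on an infra node, keyed by a local exporter port. The field names follow the PGSQL monitoring documentation, but the connection details (hosts, credentials) below are hypothetical, so adapt them to your environment.

infra:
  hosts: { 10.10.10.10: { infra_seq: 1 } }
  vars:
    pg_exporters:                 # run local pg_exporter processes that scrape remote instances
      20001: { pg_cluster: pg-foo, pg_seq: 1, pg_host: 10.10.10.21 }                       # an external self-hosted PG
      20002: { pg_cluster: pg-bar, pg_seq: 1, pg_host: pg-bar.example.rds.amazonaws.com }  # a cloud RDS instance (hypothetical host)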

Moreover, Pigsty is planning to support monitoring other types of database systems:

  • MySQL: Pigsty currently offers preliminary support for MySQL monitoring (Alpha).
  • Kafka: Pigsty plans to provide monitoring support for Kafka (planned).
  • MongoDB: Pigsty plans to provide monitoring support for MongoDB (planned).

1.3 - Roadmap

The Pigsty project roadmap, including new features, development plans, and versioning & release policy.

New Features on the Radar

  • Maintaining Deb packages for Pigsty extensions
  • ARM architecture support for infrastructure components
  • Adding more extensions to PostgreSQL
  • A command-line tool that’s actually good
  • More pre-configured scenario-based templates
  • Migrating package repositories and download sources entirely to Cloudflare
  • Deploying and monitoring high-availability Kubernetes clusters with SealOS!
  • Support for PostgreSQL 17 alpha
  • Loki and Promtail seem a bit off; could VictoriaLogs and Vector step up?
  • Swapping Prometheus storage for VictoriaMetrics to handle time-series data
  • Monitoring deployments of MySQL databases
  • Monitoring databases within Kubernetes
  • Offering a richer variety of Docker application templates

Here are our Issues and Roadmap.


Versioning Policy

Pigsty employs semantic versioning, denoted as <major version>.<minor version>.<patch>. Alpha/Beta/RC versions are indicated with a suffix, such as -a1, -b1, -rc1.

Major updates signify foundational changes and a plethora of new features; minor updates typically introduce new features, software package version updates, and minor API changes, while patch updates are meant for bug fixes and documentation improvements.

Pigsty plans to release a major update annually, with minor updates usually following the rhythm of PostgreSQL minor releases, aiming to catch up within a month after a new PostgreSQL version is released, typically resulting in 4 - 6 minor updates annually. For a complete release history, refer to Release Notes.

1.4 - History

The Origin and Motivation Behind the Pigsty Project, Its Historical Development, and Future Goals and Visions.

Origin Story

The Pigsty project kicked off between 2018 and 2019, originating from Tantan, a dating app akin to China’s Tinder, now acquired by Momo. Tantan, a startup with a Nordic vibe, was founded by a team of Swedish engineers. Renowned for their tech sophistication, they chose PostgreSQL and Go as their core tech stack. Tantan’s architecture, inspired by Instagram, revolves around PostgreSQL. They managed to scale to millions of daily active users, millions of TPS, and hundreds of TBs of data using PostgreSQL exclusively. Almost all business logic was implemented using PG stored procedures, including recommendation algorithms with 100ms latency!

This unconventional development approach, deeply leveraging PostgreSQL features, demanded exceptional engineering and DBA skills. Pigsty emerged from these real-world, high-standard database cluster scenarios as an open-source project encapsulating our top-tier PostgreSQL expertise and best practices.


Dev Journey

Initially, Pigsty didn’t have the vision, objectives, or scope it has today. It was meant to be a PostgreSQL monitoring system for our own use. After evaluating every available option (open-source, commercial, cloud-based: Datadog, pgwatch, and more), none met our observability bar. So we took matters into our own hands, creating a system based on Grafana and Prometheus, which became the precursor to Pigsty. As a monitoring system, it was remarkably effective, solving countless management issues.

Eventually, developers wanted the same monitoring capabilities on their local dev machines. We used Ansible to write provisioning scripts, transitioning from a one-off setup to reusable software. New features allowed users to quickly set up local DevBoxes or production servers with Vagrant and Terraform, automating PostgreSQL and monitoring system deployment through Infra as Code.

We then redesigned the production PostgreSQL architecture, introducing Patroni and pgBackRest for high availability and point-in-time recovery. We developed a zero-downtime migration strategy based on logical replication, performing rolling updates across 200 database clusters to the latest major version using blue-green deployments. These capabilities were integrated into Pigsty.

Pigsty, built for our use, reflects our understanding of our needs, avoiding shortcuts. The greatest benefit of “eating our own dog food” is being both developers and users, deeply understanding and not compromising on our requirements.

We tackled one problem after another, incorporating solutions into Pigsty. Its role evolved from a monitoring system to a ready-to-use PostgreSQL distribution. At this stage, we decided to open-source Pigsty, initiating a series of technical talks and promotions, attracting feedback from users across various industries.


Full-time Startup

In 2022, Pigsty secured seed funding from Dr. Qi’s MiraclePlus S22 (formerly YC China), enabling me to work on it full-time. As an open-source project, Pigsty has thrived. In the two years since going full-time, its GitHub stars skyrocketed from a few hundred to 2,400, and on OSSRank, Pigsty ranks 37th among PostgreSQL ecosystem projects.

Originally only compatible with CentOS 7, Pigsty now supports all major Linux distros and PostgreSQL versions 12 - 16, integrating 220+ extensions from the ecosystem. I’ve personally compiled, packaged, and maintained some extensions not found in the official PGDG repositories.

Pigsty’s identity has evolved from a PostgreSQL distribution to an open-source cloud database alternative, directly competing with entire cloud database services offered by cloud providers.


Cloud Rebel

Public cloud vendors like AWS, Azure, GCP, and Aliyun offer many conveniences to startups but are proprietary and lock users into high-cost infra rentals.

We believe that top-notch database services should be as accessible as the top-notch database kernel (PostgreSQL) itself, not confined to costly rentals from cyber lords.

Cloud agility and elasticity are great, but they should be open-source, local-first, and affordable. We envision a cloud computing universe with an open-source solution, returning control to users without sacrificing the benefits of the cloud.

Thus, we’re leading the “cloud-exit” movement in China, rebelling against public cloud norms to reshape industry values.


Our Vision

We’d like to see a world where everyone has the real ability to use top-tier services freely, not just view the world from the pens provided by a few public cloud providers.

This is what Pigsty aims to achieve —— a superior, open-source, free RDS alternative. Enabling users to deploy a database service better than cloud RDS with just one click, anywhere (including on cloud servers).

Pigsty is a comprehensive enhancement for PostgreSQL and a spicy satire on cloud RDS. We offer “the Simple Data Stack”, which consists of PostgreSQL, Redis, MinIO, and more optional modules.

Pigsty is entirely open-source and free, sustained through consulting and sponsorship. A well-built system might run for years without issues, but when database problems arise, they’re serious. Often, expert advice can turn a dire situation around, and we offer such services to clients in need—a fairer and more rational model.


About Me

I’m Feng Ruohang, the creator of Pigsty. I’ve developed most of Pigsty’s code solo, with the community contributing specific features.

Unique individuals create unique works —— I hope Pigsty can be one of those creations.

If you are interested in the author, here’s my personal website: https://vonng.com/en/



1.5 - Event & News

Latest activity, event, and news about Pigsty and PostgreSQL.

Latest Event

The name of this project always makes me grin: PIGSTY is actually an acronym, standing for Postgres In Great STYle! It’s a Postgres distribution that includes lots of components and tools out of the box in areas like availability, deployment, and observability. The latest release pushes everything up to Postgres 16.2 standards and introduces new ParadeDB and DuckDB FDW extensions.


Release Event

Version Time Description Release
v2.6.0 2024-02-29 PG 16 as default version, ParadeDB & DuckDB v2.6.0
v2.5.1 2023-12-01 Routine update, pg16 major extensions v2.5.1
v2.5.0 2023-10-24 Ubuntu/Debian Support: bullseye, bookworm, jammy, focal v2.5.0
v2.4.1 2023-09-24 Supabase/PostgresML support, graphql, jwt, pg_net, vault v2.4.1
v2.4.0 2023-09-14 PG16, RDS Monitor, New Extensions v2.4.0
v2.3.1 2023-09-01 PGVector with HNSW, PG16 RC1, Chinese Docs, Bug Fix v2.3.1
v2.3.0 2023-08-20 PGSQL/REDIS Update, NODE VIP, Mongo/FerretDB, MYSQL Stub v2.3.0
v2.2.0 2023-08-04 Dashboard & Provision overhaul, UOS compatibility v2.2.0
v2.1.0 2023-06-10 PostgreSQL 12 ~ 16beta support v2.1.0
v2.0.2 2023-03-31 Add pgvector support and fix MinIO CVE v2.0.2
v2.0.1 2023-03-21 v2 Bug Fix, security enhance and bump grafana version v2.0.1
v2.0.0 2023-02-28 Compatibility Security Maintainability Enhancement v2.0.0
v1.5.1 2022-06-18 Grafana Security Hotfix v1.5.1
v1.5.0 2022-05-31 Docker Applications v1.5.0
v1.4.1 2022-04-20 Bug fix & Full translation of English documents. v1.4.1
v1.4.0 2022-03-31 MatrixDB Support, Separated INFRA, NODES, PGSQL, REDIS v1.4.0
v1.3.0 2021-11-30 PGCAT Overhaul & PGSQL Enhancement & Redis Support Beta v1.3.0
v1.2.0 2021-11-03 Upgrade default Postgres to 14, monitoring existing pg v1.2.0
v1.1.0 2021-10-12 HomePage, JupyterLab, PGWEB, Pev2 & Pgbadger v1.1.0
v1.0.0 2021-07-26 v1 GA, Monitoring System Overhaul v1.0.0
v0.9.0 2021-04-04 Pigsty GUI, CLI, Logging Integration v0.9.0
v0.8.0 2021-03-28 Service Provision v0.8.0
v0.7.0 2021-03-01 Monitor only deployment v0.7.0
v0.6.0 2021-02-19 Architecture Enhancement v0.6.0
v0.5.0 2021-01-07 Database Customize Template v0.5.0
v0.4.0 2020-12-14 PostgreSQL 13 Support, Official Documentation v0.4.0
v0.3.0 2020-10-22 Provisioning Solution GA v0.3.0
v0.2.0 2020-07-10 PGSQL Monitoring v6 GA v0.2.0
v0.1.0 2020-06-20 Validation on Testing Environment v0.1.0
v0.0.5 2020-08-19 Offline Installation Mode v0.0.5
v0.0.4 2020-07-27 Refactor playbooks into ansible roles v0.0.4
v0.0.3 2020-06-22 Interface enhancement v0.0.3
v0.0.2 2020-04-30 First Commit v0.0.2
v0.0.1 2019-05-15 POC v0.0.1

Conferences and Talks

Date Type Event Topic
2023-12-20 Live Debate Open Source Musings, Episode 7 Cloud Up or Down, Harvesting Users or Reducing Costs?
2023-11-24 Tech Conference Vector Databases in the Era of Large Models Roundtable Discussion: The New Future of Vector Databases in the Era of Large Models
2023-09-08 Exclusive Interview Motianlun Notable Figures Interview Feng Ruohang: A Tech Fanatic Who Doesn’t Want to Be Just a Meme Maker Isn’t a Good Open Source Founder
2023-08-16 Tech Conference DTCC 2023 DBA Night: The Open Source Licensing Issue of PostgreSQL vs MySQL
2023-08-09 Live Debate Open Source Musings, Episode 1 MySQL vs PostgreSQL, Who is the World’s Number One?
2023-07-01 Tech Conference SACC 2023 Workshop 8: FinOps Practices: Cloud Cost Management and Optimization
2023-05-12 Offline Event PostgreSQL China Community Wenzhou Offline Salon PG With DB4AI: Vector Database PGVECTOR & AI4DB: Autonomous Driving Database Pigsty
2023-04-08 Tech Conference Database Carnival 2023 A Better Open Source RDS Alternative: Pigsty
2023-04-01 Tech Conference PostgreSQL China Community Xi’an Offline Salon Best Practices for High Availability and Disaster Recovery in PG
2023-03-23 Public Livestream Bytebase x Pigsty Best Practices for Managing PostgreSQL: Bytebase x Pigsty
2023-03-04 Tech Conference PostgreSQL China Tech Conference Bombarding RDS, Release of Pigsty v2.0
2023-02-01 Tech Conference DTCC 2022 Open Source RDS Alternatives: Out-of-the-Box, Self-Driving Database Edition Pigsty
2022-07-21 Live Debate Can Open Source Fight Back Against Cloud Cannibalization? Can Open Source Fight Back Against Cloud Cannibalization?
2022-07-04 Exclusive Interview Creators Speak Post-90s, Quitting Job to Entrepreneur, Aiming to Outperform Cloud Databases
2022-06-28 Public Livestream Beth’s Roundtable SQL Review Best Practices
2022-06-12 Public Roadshow MiraclePlus S22 Demo Day Cost-Effective Database Edition Pigsty
2022-06-05 Video Livestream PG Chinese Community Livestream Sharing Quick Start with New Features of Pigsty v1.5 & Building Production Clusters

1.6 - Community

Pigsty is Built in Public. We have an active community on GitHub.

The Pigsty community already offers free WeChat/Discord/Telegram Q&A Office Hours, and we are also happy to provide more free value-added services to our supporters.


GitHub

Our GitHub Repo: https://github.com/Vonng/pigsty , welcome to watch and star us.

Everyone is welcome to submit new Issues or create Pull Requests, propose feature suggestions, and contribute to Pigsty.

Star History Chart

Beware that for Pigsty documentation issues, please submit an Issue in the github.com/Vonng/pigsty.cc repository.


Community

Telegram: https://t.me/joinchat/gV9zfZraNPM3YjFh

Discord: https://discord.gg/j5pG8qfKxU

WeChat: Search pigsty-cc and join the User Group.

We have a GPT for Pigsty documentation Q&A: https://chat.openai.com/g/g-y0USNfoXJ-pigsty-consul

You can also contact me with email: [email protected]


Ask for Help

When having trouble with Pigsty, you can ask the community for help. Provide enough info & context; here’s a template:

What happened? (REQUIRED)

Pigsty Version & OS Version (REQUIRED)

$ grep version  pigsty.yml 

$ cat /etc/os-release

If you are using a cloud provider, please tell us which cloud provider and what operating system image you are using.

If you have customized and modified the environment after installing the bare OS, or have specific security rules and firewall configurations in your WAN, please also tell us when troubleshooting.

Pigsty Config File (REQUIRED)

Don’t forget to remove sensitive information like passwords, etc…

cat ~/pigsty/pigsty.yml

What did you expect to happen?

Please describe what you expected to happen.

How to reproduce it?

Please tell us as much detail as possible about how to reproduce the problem.

Monitoring Screenshots

If you are using pigsty monitoring system, you can paste RELEVANT screenshots here.

Error Log

Please copy and paste any RELEVANT log output. Do not paste something like “Failed to start xxx service”

  • Syslog: /var/log/messages (rhel) or /var/log/syslog (debian)
  • Postgres: /pg/log/postgres/*
  • Patroni: /pg/log/patroni/*
  • Pgbouncer: /pg/log/pgbouncer/*
  • Pgbackrest: /pg/log/pgbackrest/*
journalctl -u patroni
journalctl -u <service name>

Have you tried the Issue & FAQ?

Anything else we need to know?

The more information and context you provide, the more likely we are to be able to help you solve the problem.




1.7 - License

Pigsty is open sourced under AGPLv3; here are the details about permissions, limitations, conditions, and exemptions.

Pigsty is open sourced under the AGPLv3 license, which is a copyleft license.


Summary

Pigsty uses the AGPLv3 license, which is a strong copyleft license that requires you to also distribute the source code of your derivative works under the same license. If you distribute Pigsty, you must make the source code available under the same license, and you must make it clear that the source code is available.

Permissions:

  • Commercial use
  • Modification
  • Distribution
  • Patent use
  • Private use

Limitations:

  • Liability
  • Warranty

Conditions:

  • License and copyright notice
  • State changes
  • Disclose source
  • Network use is distribution
  • Same license

Beware that the Pigsty official website is also open sourced under the CC BY 4.0 license.


Exemptions

While employing the AGPLv3 license for Pigsty, we extend exemptions to common end users under terms akin to the Apache 2.0 license. Common end users are defined as all entities except public cloud and database service vendors.

These users may utilize Pigsty for commercial activities and service provision without licensing concerns. Our subscriptions include written guarantees of these terms for additional assurance.

We encourage cloud & database vendors adhering to AGPLv3 to use Pigsty for derivative works and to contribute to the community.


BOM Inventory

Related software and open source project:

Module Software Name License
PGSQL PostgreSQL PostgreSQL License (BSD-Like)
PGSQL pgbouncer ISC License
PGSQL patroni MIT License
PGSQL pgbackrest MIT License
PGSQL vip-manager BSD 2-Clause License
PGSQL pg_exporter Apache License 2.0
NODE node_exporter Apache License 2.0
NODE haproxy HAPROXY’s License (GPLv2)
NODE keepalived MIT License
INFRA Grafana, Loki GNU Affero General Public License v3.0
INFRA Prometheus Apache License 2.0
INFRA DNSMASQ GPLv2 / GPLv3
INFRA Ansible GNU General Public License v3.0
ETCD etcd Apache License 2.0
MINIO MinIO GNU Affero General Public License v3.0
REDIS Redis Redis License (3-clause BSD)
REDIS Redis Exporter MIT License
MONGO FerretDB Apache License 2.0
DOCKER docker-ce Apache License 2.0
CLOUD Sealos Apache License 2.0
DUCKDB DuckDB MIT
External Vagrant Business Source License 1.1
External Terraform Business Source License 1.1
External Virtualbox GPLv2

PostgreSQL Extensions:

Extension Name License
PostGIS PostGIS (GPLv2+)
Citus AGPLv3
TimescaleDB Timescale
PGVector PostgreSQL (BSD-Like)
pg_repack BSD 3-Clause
wal2json BSD 3-Clause
scws BSD-Like
zhparser BSD-Like
pg_roaringbitmap Apache-2.0
pg_tle Apache-2.0
pgsql-http MIT
pgsql-gzip MIT
pgjwt MIT
vault Apache-2.0
pointcloud Apache-Like
imgsmlr Postgres License
pg_similarity BSD-Like
pg_bigm BSD-Like
hydra AGPLv3
pg_net Apache-2.0
pg_filedump GPLv2
age Apache-2.0
duckdb_fdw MIT
pg_sparse Postgres License
pg_graphql Apache-2.0
pgml MIT
pg_search AGPLv3
pg_analytics AGPLv3
pg_jsonschema Apache-2.0
wrappers Apache-2.0
pgmq PostgreSQL
pg_tier Apache-2.0
pg_vectorize PostgreSQL
pg_later PostgreSQL
pg_idkit Apache-2.0
plprql Apache-2.0
pgsmcrypto MIT
pg_tiktoken Apache-2.0
pgdd MIT
plv8 BSD 2-Clause Like
pg_tde MIT
md5hash BSD Like?
pg_dirtyread BSD 3-Clause Like

Content

                    GNU AFFERO GENERAL PUBLIC LICENSE
                       Version 3, 19 November 2007

 Copyright (C) 2007 Free Software Foundation, Inc. <https://fsf.org/>
 Everyone is permitted to copy and distribute verbatim copies
 of this license document, but changing it is not allowed.

                            Preamble

  The GNU Affero General Public License is a free, copyleft license for
software and other kinds of works, specifically designed to ensure
cooperation with the community in the case of network server software.

  The licenses for most software and other practical works are designed
to take away your freedom to share and change the works.  By contrast,
our General Public Licenses are intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users.

  When we speak of free software, we are referring to freedom, not
price.  Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.

  Developers that use our General Public Licenses protect your rights
with two steps: (1) assert copyright on the software, and (2) offer
you this License which gives you legal permission to copy, distribute
and/or modify the software.

  A secondary benefit of defending all users' freedom is that
improvements made in alternate versions of the program, if they
receive widespread use, become available for other developers to
incorporate.  Many developers of free software are heartened and
encouraged by the resulting cooperation.  However, in the case of
software used on network servers, this result may fail to come about.
The GNU General Public License permits making a modified version and
letting the public access it on a server without ever releasing its
source code to the public.

  The GNU Affero General Public License is designed specifically to
ensure that, in such cases, the modified source code becomes available
to the community.  It requires the operator of a network server to
provide the source code of the modified version running there to the
users of that server.  Therefore, public use of a modified version, on
a publicly accessible server, gives the public access to the source
code of the modified version.

  An older license, called the Affero General Public License and
published by Affero, was designed to accomplish similar goals.  This is
a different license, not a version of the Affero GPL, but Affero has
released a new version of the Affero GPL which permits relicensing under
this license.

  The precise terms and conditions for copying, distribution and
modification follow.

                       TERMS AND CONDITIONS

  0. Definitions.

  "This License" refers to version 3 of the GNU Affero General Public License.

  "Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.

  "The Program" refers to any copyrightable work licensed under this
License.  Each licensee is addressed as "you".  "Licensees" and
"recipients" may be individuals or organizations.

  To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy.  The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.

  A "covered work" means either the unmodified Program or a work based
on the Program.

  To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy.  Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.

  To "convey" a work means any kind of propagation that enables other
parties to make or receive copies.  Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.

  An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License.  If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.

  1. Source Code.

  The "source code" for a work means the preferred form of the work
for making modifications to it.  "Object code" means any non-source
form of a work.

  A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.

  The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form.  A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.

  The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities.  However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work.  For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.

  The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.

  The Corresponding Source for a work in source code form is that
same work.

  2. Basic Permissions.

  All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met.  This License explicitly affirms your unlimited
permission to run the unmodified Program.  The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work.  This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.

  You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force.  You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright.  Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.

  Conveying under any other circumstances is permitted solely under
the conditions stated below.  Sublicensing is not allowed; section 10
makes it unnecessary.

  3. Protecting Users' Legal Rights From Anti-Circumvention Law.

  No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.

  When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.

  4. Conveying Verbatim Copies.

  You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.

  You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.

  5. Conveying Modified Source Versions.

  You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:

    a) The work must carry prominent notices stating that you modified
    it, and giving a relevant date.

    b) The work must carry prominent notices stating that it is
    released under this License and any conditions added under section
    7.  This requirement modifies the requirement in section 4 to
    "keep intact all notices".

    c) You must license the entire work, as a whole, under this
    License to anyone who comes into possession of a copy.  This
    License will therefore apply, along with any applicable section 7
    additional terms, to the whole of the work, and all its parts,
    regardless of how they are packaged.  This License gives no
    permission to license the work in any other way, but it does not
    invalidate such permission if you have separately received it.

    d) If the work has interactive user interfaces, each must display
    Appropriate Legal Notices; however, if the Program has interactive
    interfaces that do not display Appropriate Legal Notices, your
    work need not make them do so.

  A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit.  Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.

  6. Conveying Non-Source Forms.

  You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:

    a) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by the
    Corresponding Source fixed on a durable physical medium
    customarily used for software interchange.

    b) Convey the object code in, or embodied in, a physical product
    (including a physical distribution medium), accompanied by a
    written offer, valid for at least three years and valid for as
    long as you offer spare parts or customer support for that product
    model, to give anyone who possesses the object code either (1) a
    copy of the Corresponding Source for all the software in the
    product that is covered by this License, on a durable physical
    medium customarily used for software interchange, for a price no
    more than your reasonable cost of physically performing this
    conveying of source, or (2) access to copy the
    Corresponding Source from a network server at no charge.

    c) Convey individual copies of the object code with a copy of the
    written offer to provide the Corresponding Source.  This
    alternative is allowed only occasionally and noncommercially, and
    only if you received the object code with such an offer, in accord
    with subsection 6b.

    d) Convey the object code by offering access from a designated
    place (gratis or for a charge), and offer equivalent access to the
    Corresponding Source in the same way through the same place at no
    further charge.  You need not require recipients to copy the
    Corresponding Source along with the object code.  If the place to
    copy the object code is a network server, the Corresponding Source
    may be on a different server (operated by you or a third party)
    that supports equivalent copying facilities, provided you maintain
    clear directions next to the object code saying where to find the
    Corresponding Source.  Regardless of what server hosts the
    Corresponding Source, you remain obligated to ensure that it is
    available for as long as needed to satisfy these requirements.

    e) Convey the object code using peer-to-peer transmission, provided
    you inform other peers where the object code and Corresponding
    Source of the work are being offered to the general public at no
    charge under subsection 6d.

  A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.

  A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling.  In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage.  For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product.  A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.

  "Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source.  The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.

  If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information.  But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).

  The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed.  Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.

  Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.

  7. Additional Terms.

  "Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law.  If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.

  When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it.  (Additional permissions may be written to require their own
removal in certain cases when you modify the work.)  You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.

  Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:

    a) Disclaiming warranty or limiting liability differently from the
    terms of sections 15 and 16 of this License; or

    b) Requiring preservation of specified reasonable legal notices or
    author attributions in that material or in the Appropriate Legal
    Notices displayed by works containing it; or

    c) Prohibiting misrepresentation of the origin of that material, or
    requiring that modified versions of such material be marked in
    reasonable ways as different from the original version; or

    d) Limiting the use for publicity purposes of names of licensors or
    authors of the material; or

    e) Declining to grant rights under trademark law for use of some
    trade names, trademarks, or service marks; or

    f) Requiring indemnification of licensors and authors of that
    material by anyone who conveys the material (or modified versions of
    it) with contractual assumptions of liability to the recipient, for
    any liability that these contractual assumptions directly impose on
    those licensors and authors.

  All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10.  If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term.  If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.

  If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.

  Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.

  8. Termination.

  You may not propagate or modify a covered work except as expressly
provided under this License.  Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).

  However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.

  Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.

  Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License.  If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.

  9. Acceptance Not Required for Having Copies.

  You are not required to accept this License in order to receive or
run a copy of the Program.  Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance.  However,
nothing other than this License grants you permission to propagate or
modify any covered work.  These actions infringe copyright if you do
not accept this License.  Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.

  10. Automatic Licensing of Downstream Recipients.

  Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License.  You are not responsible
for enforcing compliance by third parties with this License.

  An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations.  If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.

  You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License.  For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.

  11. Patents.

  A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based.  The
work thus licensed is called the contributor's "contributor version".

  A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version.  For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.

  Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.

  In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement).  To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.

  If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients.  "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.

  If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.

  A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License.  You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.

  Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.

  12. No Surrender of Others' Freedom.

  If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License.  If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all.  For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.

  13. Remote Network Interaction; Use with the GNU General Public License.

  Notwithstanding any other provision of this License, if you modify the
Program, your modified version must prominently offer all users
interacting with it remotely through a computer network (if your version
supports such interaction) an opportunity to receive the Corresponding
Source of your version by providing access to the Corresponding Source
from a network server at no charge, through some standard or customary
means of facilitating copying of software.  This Corresponding Source
shall include the Corresponding Source for any work covered by version 3
of the GNU General Public License that is incorporated pursuant to the
following paragraph.

  Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU General Public License into a single
combined work, and to convey the resulting work.  The terms of this
License will continue to apply to the part which is the covered work,
but the work with which it is combined will remain governed by version
3 of the GNU General Public License.

  14. Revised Versions of this License.

  The Free Software Foundation may publish revised and/or new versions of
the GNU Affero General Public License from time to time.  Such new versions
will be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.

  Each version is given a distinguishing version number.  If the
Program specifies that a certain numbered version of the GNU Affero General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation.  If the Program does not specify a version number of the
GNU Affero General Public License, you may choose any version ever published
by the Free Software Foundation.

  If the Program specifies that a proxy can decide which future
versions of the GNU Affero General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.

  Later license versions may give you additional or different
permissions.  However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.

  15. Disclaimer of Warranty.

  THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW.  EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE.  THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU.  SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

  16. Limitation of Liability.

  IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.

  17. Interpretation of Sections 15 and 16.

  If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.

                     END OF TERMS AND CONDITIONS

            How to Apply These Terms to Your New Programs

  If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.

  To do so, attach the following notices to the program.  It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.

    Copyright (C) 2018-2024  Ruohang Feng, Author of Pigsty

    This program is free software: you can redistribute it and/or modify
    it under the terms of the GNU Affero General Public License as published by
    the Free Software Foundation, either version 3 of the License, or
    (at your option) any later version.

    This program is distributed in the hope that it will be useful,
    but WITHOUT ANY WARRANTY; without even the implied warranty of
    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
    GNU Affero General Public License for more details.

    You should have received a copy of the GNU Affero General Public License
    along with this program.  If not, see <https://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

  If your software can interact with users remotely through a computer
network, you should also make sure that it provides a way for users to
get its source.  For example, if your program is a web application, its
interface could display a "Source" link that leads users to an archive
of the code.  There are many ways you could offer source, and different
solutions will be better for different programs; see section 13 for the
specific requirements.

  You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU AGPL, see
<https://www.gnu.org/licenses/>.

1.8 - Sponsorship

Sponsors and Investors of Pigsty, Thank You for Your Support of This Project!

Investor

Pigsty is funded by MiraclePlus (formerly YC China), S22 Batch.

Thanks to MiraclePlus and Dr. Qi for their support.


Sponsorship

Pigsty is a free & open-source software nurtured by the passion of PostgreSQL community members.

If our work has helped you, please consider sponsoring or supporting our project. Every contribution counts, and spreading the word is also a form of support:

  • Make Donation & Sponsor us.
  • Share your experiences and use cases of Pigsty through articles, lectures, and videos.
  • Allow us to mention your organization in our list of Pigsty users.
  • Nominate/Recommend our project and services to your friends, colleagues, and clients in need.
  • Follow our WeChat Column and share technical articles with your friends.

1.9 - Privacy Policy

How we process your data & protect your privacy in this website and pigsty’s software.

Pigsty Software

When you install the Pigsty software, if you use offline packages in a network-isolated environment, we will not receive any data about you.

If you choose to install online, then when downloading relevant software packages, our server or the servers of our cloud providers will automatically log the visiting machine’s IP address and/or hostname, as well as the name of the software package you downloaded, in the logs.

We will not share this information with other organizations unless legally required to do so.

The domain name used by Pigsty is: pigsty.io


Pigsty Website

When you visit our website, our servers automatically log your IP address and/or host name.

We store information such as your email address, name, and locality only if you decide to send us such information by completing a survey or registering as a user on one of our sites.

We collect this information to help us improve the content of our sites, customize the layout of our web pages and to contact people for technical and support purposes. We will not share your email address with other organisations unless required by law.

This website uses Google Analytics, a web analytics service provided by Google, Inc. (“Google”). Google Analytics uses “cookies”, which are text files placed on your computer, to help the website analyze how users use the site.

The information generated by the cookie about your use of the website (including your IP address) will be transmitted to and stored by Google on servers in the United States. Google will use this information for the purpose of evaluating your use of the website, compiling reports on website activity for website operators and providing other services relating to website activity and internet usage. Google may also transfer this information to third parties where required to do so by law, or where such third parties process the information on Google’s behalf. Google will not associate your IP address with any other data held by Google. You may refuse the use of cookies by selecting the appropriate settings on your browser, however please note that if you do this you may not be able to use the full functionality of this website. By using this website, you consent to the processing of data about you by Google in the manner and for the purposes set out above.

If you have any questions or comments about this policy, or to request the deletion of personal data, you can contact us at [email protected].

1.10 - Service

Pigsty has professional support which provides services & consulting to cover corner-cases!

Pigsty is a battery-included PostgreSQL database distribution, a local-first alternative to RDS/cloud database services, allowing users to run a fully-featured local RDS service at a hardware cost of just a few core-months. The software itself is completely open-source and free. If it has helped you, please consider sponsoring us.

Although Pigsty is designed to replace manual database operations with database autopilot software, even the best software can only solve some problems. There will always be infrequent issues, and problems beyond technology itself, that require expert intervention. Therefore, we also offer professional service subscriptions.


Available Plans

Pigsty OSS
Free!
But no Warranty

PG: 16

OS: 3 major distros

  • EL 8.9
  • Debian 12
  • Ubuntu 22.04

Modules: Core Modules

SLA: Not Available

Community Support

Pigsty Basic
7,000 $ / year
or 700 $/month

PG: 15, 16

OS: 5 major distros

  • EL 7.9 / 8.9 / 9.3
  • Ubuntu 20.04 / 22.04
  • Debian 11 / 12

Modules: All Available

SLA: 5 x 8 (<48h)

Basic Support

  • Bug Fix
  • Security Patch
  • Failure Analysis
  • Upgrade Path

Pigsty Professional
20,000 $ / year
or 2,000 $/month

PG: 14, 15, 16

OS: 5 major distros, all minor versions

  • EL 7.x / 8.x / 9.x
  • Ubuntu 20.x / 22.x
  • Debian 11.x / 12.x

Modules: All Available

SLA: 5 x 8 (<4h)

Professional Consulting

  • Bug Fix
  • Security Patch
  • Failure Analysis
  • Upgrade Path
  • DBA Consulting

Pigsty Enterprise
50,000 $ / year
or 5,000 $/month

PG: 12 - 16

OS: Bespoke

  • EL, Debian, Ubuntu
  • UOS / Anolis / Cloud
  • ARM64, bespoke

Modules: All Available

SLA: 7 x 24 (on-call)

Enterprise Service

  • Bug Fix
  • Security Patch
  • Failure Analysis
  • Upgrade Path
  • DBA Consulting
  • Arch Review
  • Outage on-call


Service Subscription

| Plan | Open Source | Standard | Pro | Enterprise |
|------|-------------|----------|-----|------------|
| Who | Self-sufficient OSS gurus & developers | Elite tech teams seeking assurance | A suitable choice for common users | Critical scenarios requiring a strict SLA |
| Price (Year) | Free under AGPLv3 | 7,000 $ / Year | 20,000 $ / Year | 50,000 $ / Year |
| Price (Month) | Free under AGPLv3 | 700 $ / Month | 2,000 $ / Month | 5,000 $ / Month |
| Node Size | Unlimited | <= 5 | <= 15 | <= 40 |
| Consult | Community Support: Groups, Issues, Discussions | Bug Fix & Security Patch, Failure Analysis, Upgrade Path | Bug Fix & Security Patch, Failure Analysis, Upgrade Path, DBA Consulting | Bug Fix & Security Patch, Failure Analysis, Upgrade Path, DBA Consulting, Arch Review, Outage on-call |
| Service | - | One-time setup (< 1 day) | 2 free expert days / Year | 4 free expert days / Year |
| PG Support | Latest major version: PG 16 | Last 2 versions: PG 15, 16 | Last 3 versions: PG 14, 15, 16 | Lifecycle versions: PG 12 - 16 |
| OS Support | EL 8.9, Ubuntu 22.04, Debian 12 | EL 7.9 / 8.9 / 9.3, Ubuntu 22.04 / 20.04, Debian 11 / 12 | EL 7.x / 8.x / 9.x, Ubuntu 22.x / 20.x, Debian 11.x / 12.x | EL / Debian / Ubuntu, UOS / Anolis / CloudOS, and bespoke… |
| Arch Support | x86_64 | x86_64 | x86_64 | x86_64, arm64 / aarch64 |
| Modules | PGSQL, INFRA, NODE, ETCD, MINIO, REDIS | PGSQL, INFRA, NODE, ETCD, MINIO, REDIS + All Other Modules | PGSQL, INFRA, NODE, ETCD, MINIO, REDIS + All Other Modules | PGSQL, INFRA, NODE, ETCD, MINIO, REDIS + All Other Modules |
| SLA | - | 5 x 8 (<48h) | 5 x 8 (<4h) | 7 x 24 (on-call) |
| Cost Like | Compare to RDS & DBA | 12 vCPU RDS, part-time operators | 40 vCPU RDS, ??% Junior DBA | 100 vCPU RDS, ??% DBA |

Pigsty Pro offers an expanded range of features, supporting a wider variety of operating system distributions, PostgreSQL major versions, and a richer set of extension plugins. It also includes offline software packages tailored for each OS minor version to ensure optimal compatibility.

Pigsty subscriptions operate on an annual payment model, providing users with an annual license for the Pigsty commercial version. This includes access to the latest software versions and upgrade paths released within the year, along with comprehensive consulting, Q&A, and service support. A larger scale implies more complex scenarios, more issues, and a higher chance of failure events; thus, each subscription comes with a node scale limit. For example, if you are using the Pro subscription and manage more than 15 nodes, you will need to pay an additional subscription fee for each node beyond the limit (10,000 RMB per node).

Pigsty’s pricing strategy ensures value for money: you can immediately obtain top-notch DBA database architecture solutions and management best practices, all backed by consulting, Q&A, and service support, at a cost that is highly competitive compared to finding and hiring a rare database guru or using cloud RDS.

If you have the following needs, please consider our subscription:

  • Running databases in critical scenarios and needing strict SLA guarantees.
  • Seeking backup for Pigsty and PostgreSQL-related issues.
  • Wanting guidance on best practices for PostgreSQL/Pigsty production environments.
  • Needing experts to help interpret monitoring charts, analyze and locate performance bottlenecks and fault root causes, and provide opinions.
  • Planning a database architecture that meets security, disaster recovery, and compliance requirements based on existing resources and business needs.
  • Needing to migrate other databases to PostgreSQL, or migrate and transform legacy instances.
  • Building an observability system, data dashboard, and visualization application based on the Prometheus/Grafana tech stack.
  • Seeking support for domestic trusted OS distros / domestic ARM chip architectures, along with Chinese / localized interface support.
  • Moving off the cloud and seeking an open-source alternative to RDS for PostgreSQL - a cloud-neutral, vendor-lock-in-free solution.
  • Seeking professional technical support for Redis/ETCD/MinIO/Greenplum/Citus/TimescaleDB.
  • Wanting to avoid the restrictions of Pigsty’s own AGPLv3 license, so that derivative works are not forced to adopt the same open-source license for secondary development and rebranding.
  • Considering selling Pigsty as SaaS/PaaS/DBaaS, or providing technical / consulting services based on this distribution.

Service subscriptions are divided into two different levels, Standard Service Agreement, and Enterprise Service Agreement, as shown in the table below:

Commercial support contact: Email: [email protected], WeChat: pigsty-cc / RuohangFeng


Miscellaneous

We offer retail expert days that can be used for database architecting, failure analysis, postmortem, troubleshooting, performance analysis, problem-solving, teaching, and training, which can be purchased as needed.

  • Top Expert: 3,000 $ / day
  • Senior Expert: 2,000 ¥ / day

The above prices are exclusive of taxes. The minimum unit is half a day, less than that will be charged as half a day. Price is doubled outside regular working hours (5x8), and it’s tripled on public holidays. Pricing & Discount may vary depending on the industry and the technical level of the client’s team.

Expert days need to be arranged at least one day in advance. Emergency failure response is not covered by expert days and is only available to subscribed customers.

We offer teaching and training services on PostgreSQL, priced as follows:

  • PostgreSQL Application Development: 1 x expert day, up to 20 people.
  • PostgreSQL Management & Operation: 1 x expert day, up to 20 people.
  • PostgreSQL Kernel Architecture: 2 x expert day, up to 10 people.

We offer deployment consulting and architecting services, priced as follows:

  • Planning a deployment solution based on your existing resources and needs.
  • 150 $/h, at least one hour per case, remote only. Delivery includes the pigsty.yml file.

2 - Getting Started

Get Pigsty up & running based on your resources and needs: preparation, provision, download, configure, and installation

2.1 - Installation

How to install Pigsty?

Install Pigsty with 4 steps: Prepare, Download, Configure and Install.

Also check offline installation if you don’t have the Internet access.


Short Version

Prepare a fresh Linux x86_64 node that runs a compatible OS, then run the following command as a sudo-able user:

curl -L https://get.pigsty.cc/install | bash

It will download the Pigsty source to your home directory; then run the following to finish the installation:

cd ~/pigsty   # get pigsty source and entering dir
./bootstrap   # download bootstrap pkgs & ansible [optional]
./configure   # pre-check and config templating   [optional] 
./install.yml # install pigsty according to pigsty.yml

A Pigsty singleton node will be ready, with the web interface on port 80/443 and the Postgres service on port 5432. You can add more nodes to Pigsty and deploy modules on them.
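
To verify the installation, you can hit the web portal and the default meta database. The credentials below are the defaults documented in the Interface section; adjust them if you changed the defaults:

curl -I http://10.10.10.10        # Nginx web portal (replace with your node's IP)
psql postgres://dbuser_dba:[email protected]/meta -c 'SELECT version();'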

Example: Online Singleton Installation on Ubuntu 22.04:

asciicast

Example: Install with Offline Package (EL8)

asciicast


Prepare

Check Preparation for a complete guide of resource preparation.

Pigsty supports the Linux kernel and x86_64/amd64 arch. It can run on any node: bare metal, virtual machines, or VM-like containers, but a static IPv4 address is required. The minimum spec is 1C1G; it is recommended to use bare metal or VMs with at least 2C4G. There is no upper limit, and node parameters will be auto-tuned.

We recommend using fresh RockyLinux 8.9 or Ubuntu 22.04.3 as underlying operating systems. For a complete list of supported operating systems, please refer to Compatibility.

Public key ssh access to localhost and NOPASSWD sudo privilege are required to perform the installation. Try not to use the root user. If you wish to manage more nodes, those nodes need to be ssh / sudo accessible from your current admin node & admin user.

Pigsty relies on Ansible to execute playbooks. You have to install the ansible and jmespath packages first to run the install procedure. This can be done with the following commands, or through the bootstrap procedure, especially when you do not have Internet access.

sudo dnf install -y ansible python3.11-jmespath python3-cryptography   # EL 8 / 9
sudo yum install -y ansible   # EL 7 does not need to install jmespath explicitly
sudo apt install -y ansible python3-jmespath                           # Debian / Ubuntu
brew install ansible                                                   # macOS
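
Afterwards, you can verify that Ansible and the jmespath dependency are available, for example:

ansible --version               # confirm ansible is installed
ansible localhost -m ping       # ad-hoc self-test against localhost
python3 -c 'import jmespath'    # confirm the jmespath python module is importable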

Download

You can get & extract pigsty source via the following command:

curl -fsSL https://get.pigsty.cc/install | bash
Download Example Output
$ bash -c "$(curl -fsSL https://get.pigsty.cc/install)"
[v2.7.0] ===========================================
$ curl -fsSL https://pigsty.cc/install | bash
[Site] https://pigsty.io
[Demo] https://demo.pigsty.cc
[Repo] https://github.com/Vonng/pigsty
[Docs] https://pigsty.io/docs/setup/install
[Download] ===========================================
[ OK ] version = v2.7.0 (from default)
curl -fSL https://get.pigsty.cc/v2.7.0/pigsty-v2.7.0.tgz -o /tmp/pigsty-v2.7.0.tgz
########################################################################### 100.0%
[ OK ] md5sums = some_random_md5_hash_value_here_  /tmp/pigsty-v2.7.0.tgz
[Install] ===========================================
[ OK ] install = /home/vagrant/pigsty, from /tmp/pigsty-v2.7.0.tgz
[Resource] ===========================================
[HINT] rocky 8  have [OPTIONAL] offline package available: https://pigsty.io/docs/setup/offline
curl -fSL https://github.com/Vonng/pigsty/releases/download/v2.7.0/pigsty-pkg-v2.7.0.el8.x86_64.tgz -o /tmp/pkg.tgz
curl -fSL https://get.pigsty.cc/v2.7.0/pigsty-pkg-v2.7.0.el8.x86_64.tgz -o /tmp/pkg.tgz # or use alternative CDN
[TodoList] ===========================================
cd /home/vagrant/pigsty
./bootstrap      # [OPTIONAL] install ansible & use offline package
./configure      # [OPTIONAL] preflight-check and config generation
./install.yml    # install pigsty modules according to your config.
[Complete] ===========================================

If you don’t have Internet access, check offline installation for details. You can download the source tarball with the following links and upload it with scp, ftp, etc…

curl https://get.pigsty.cc/v2.7.0/pigsty-v2.7.0.tgz -o pigsty.tgz
tar -xvf pigsty.tgz -C ~ ; cd ~/pigsty ;  # download manually and extract to home dir

You can also use git to download the Pigsty source. Please make sure to check out a specific version before using.

git clone https://github.com/Vonng/pigsty; cd pigsty;  git checkout v2.7.0;

Configure

configure will create a pigsty.yml config file according to your env.

This procedure is OPTIONAL if you know how to configure pigsty manually.

./configure # interactive-wizard, ask for IP address
./configure [-i|--ip <ipaddr>] [-m|--mode <name>]  # give primary IP & config mode 
            [-r|--region <default|china|europe>]   # choose upstream repo region
            [-n|--non-interactive]                 # skip interactive wizard
            [-x|--proxy]                           # write proxy env to config 
Configure Example Output
$ ./configure
configure pigsty v2.7.0 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = rpm,dnf
[ OK ] vendor = rocky (Rocky Linux)
[ OK ] version = 8 (8.9)
[ OK ] sudo = vagrant ok
[ OK ] ssh = [email protected] ok
[WARN] Multiple IP address candidates found:
    (1) 192.168.121.165	    inet 192.168.121.165/24 brd 192.168.121.255 scope global dynamic noprefixroute eth0
    (2) 10.10.10.8	    inet 10.10.10.8/24 brd 10.10.10.255 scope global noprefixroute eth1
[ IN ] INPUT primary_ip address (of current meta node, e.g 10.10.10.10):
=> 10.10.10.8     # input the primary IP address itself, if there's multiple candidates
[ OK ] primary_ip = 10.10.10.8 (from input)
[ OK ] admin = [email protected] ok
[ OK ] config = el @ 10.10.10.8 [tiny -> oltp]
[ OK ] configure pigsty done
proceed with ./install.yml

  • -m|--mode: Generate config from templates according to mode: (auto|demo|sec|citus|el|el7|ubuntu|prod...)
  • -i|--ip: Replace IP address placeholder 10.10.10.10 with your primary ipv4 address of current node.
  • -r|--region: Set upstream repo mirror according to region (default|china|europe)
  • -n|--non-interactive: skip the interactive wizard and use default / argument values
  • -x|--proxy: write the current proxy env to the config proxy_env (http_proxy / HTTP_PROXY, HTTPS_PROXY, ALL_PROXY, NO_PROXY)

When -n|--non-interactive is specified, you have to provide a primary IP address with -i|--ip <ipaddr> in case of multiple IP addresses, since there’s no default value for the primary IP address in this case.

If your machine’s network interface has multiple IP addresses, you’ll need to explicitly specify a primary IP address for the current node using -i|--ip <ipaddr>, or provide it during the interactive inquiry. The address should be a static IP address, and you should avoid using any public IP address.
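
For example, a non-interactive run on a node whose primary IP is 10.10.10.10 (the placeholder address used throughout the docs):

./configure -i 10.10.10.10 -n                    # non-interactive, explicit primary IP
./configure -i 10.10.10.10 -m el -r default -n   # also pin the config mode and repo region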

You can check and modify the generated config file ~/pigsty/pigsty.yml before installation.


Install

Run the install.yml playbook to perform a full installation on the current node:

./install.yml    # install everything in one-pass
Installation Output Example
[vagrant@meta pigsty]$ ./install.yml

PLAY [IDENTITY] ********************************************************************************************************************************

TASK [node_id : get node fact] *****************************************************************************************************************
changed: [10.10.10.12]
changed: [10.10.10.11]
changed: [10.10.10.13]
changed: [10.10.10.10]
...
...
PLAY RECAP **************************************************************************************************************************************************************************
10.10.10.10                : ok=288  changed=215  unreachable=0    failed=0    skipped=64   rescued=0    ignored=0
10.10.10.11                : ok=263  changed=194  unreachable=0    failed=0    skipped=88   rescued=0    ignored=1
10.10.10.12                : ok=263  changed=194  unreachable=0    failed=0    skipped=88   rescued=0    ignored=1
10.10.10.13                : ok=153  changed=121  unreachable=0    failed=0    skipped=53   rescued=0    ignored=1
localhost                  : ok=3    changed=0    unreachable=0    failed=0    skipped=4    rescued=0    ignored=0

It’s a standard Ansible playbook; you can have fine-grained control with Ansible options:

  • -l: limit execution targets
  • -t: limit execution tasks
  • -e: passing extra args
  • -i: use another config
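
For example (a hedged sketch; the task tag used with -t below is hypothetical and depends on the playbook you run):

./install.yml -l 10.10.10.10                 # -l: limit execution to a given host or group
./install.yml -e pg_clean=true               # -e: pass an extra parameter (force removing existing databases)
./pgsql.yml   -l pg-test -t pg_hba           # -t: run only tasks with a given tag (tag name is hypothetical)
./node.yml    -i files/pigsty/rpmbuild.yml   # -i: use another config inventory file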

Interface

Once installed, you’ll have four core modules, PGSQL, INFRA, NODE, and ETCD, installed on the current node.

The PGSQL module provides a PostgreSQL singleton which can be accessed via:

psql postgres://dbuser_dba:[email protected]/meta     # DBA / superuser (via IP)
psql postgres://dbuser_meta:[email protected]/meta   # business admin, read / write / ddl
psql postgres://dbuser_view:DBUser.View@pg-meta/meta       # read-only user

The INFRA module gives you an entire modern observability stack, exposed by Nginx on port 80 / 443.

Several services are exposed by Nginx (configured by infra_portal):

| Component | Port | Domain | Comment | Public Demo |
|-----------|------|--------|---------|-------------|
| Nginx | 80/443 | h.pigsty | Web Service Portal, Repo | home.pigsty.cc |
| AlertManager | 9093 | a.pigsty | Alert Aggregator | a.pigsty.cc |
| Grafana | 3000 | g.pigsty | Grafana Dashboard Home | demo.pigsty.cc |
| Prometheus | 9090 | p.pigsty | Prometheus Web UI | p.pigsty.cc |

Grafana dashboards (g.pigsty, port 3000) default credentials: user admin / pass pigsty

pigsty-home.jpg

You can access these web UIs directly via IP + port, but the common best practice is to access them through Nginx and distinguish them via domain names. You’ll need to configure DNS records, or use local static records (/etc/hosts), for that.


How to access Pigsty Web UI by domain name?

There are several options:

  1. Resolve internet domain names through a DNS service provider, suitable for systems accessible from the public internet.
  2. Configure internal network DNS server resolution records for internal domain name resolution.
  3. Modify the local machine’s /etc/hosts file to add static resolution records.

We recommend the third method for common users. On the machine (which runs the browser), add the following record into /etc/hosts (sudo required) or C:\Windows\System32\drivers\etc\hosts in Windows:

<your_public_ip_address>  h.pigsty a.pigsty p.pigsty g.pigsty

You have to use the external IP address of the node here.


How to configure server side domain names?

The server-side domain name is configured with Nginx. If you want to replace the default domain name, simply enter the domain you wish to use in the parameter infra_portal. When you access the Grafana monitoring homepage via http://g.pigsty, it is actually accessed through the Nginx proxy to Grafana’s WebUI:

http://g.pigsty -> http://10.10.10.10:80 (nginx) -> http://10.10.10.10:3000 (grafana)

If nginx_sslmode is set to enabled or enforced, you can trust the self-signed CA cert files/pki/ca/ca.crt to use HTTPS in your browser.


How to use HTTPS in Pigsty WebUI?

Pigsty will generate self-signed certs for Nginx. If you wish to access it via HTTPS without a warning, here are some options:

  • Apply for and add real certs from a trusted CA, such as Let’s Encrypt
  • Trust the generated CA cert (files/pki/ca/ca.crt) as a root CA in your OS and browser
  • Type thisisunsafe in Chrome to suppress the warning
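
For example, running from the pigsty source directory on a machine that can resolve g.pigsty, a quick command-line check against the self-signed CA might look like this (a sketch, assuming the default CA path and domain):

curl -I --cacert files/pki/ca/ca.crt https://g.pigsty    # trust the self-signed CA for this request only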

More

You can deploy & monitor more clusters with pigsty: add more nodes to pigsty.yml and run corresponding playbooks:

bin/node-add   pg-test      # init 3 nodes of cluster pg-test
bin/pgsql-add  pg-test      # init HA PGSQL Cluster pg-test
bin/redis-add  redis-ms     # init redis cluster redis-ms

Remember that most modules require the [NODE] module to be installed first. Check modules for details.

PGSQL, INFRA, NODE, ETCD, MINIO, REDIS, MONGO, DOCKER, ……

2.2 - Offline Install

How to install pigsty without Internet access? How to make your own offline packages.

Pigsty’s Standard Installation process requires Internet access, but production database servers are often isolated from the Internet.

Therefore, Pigsty offers an offline installation feature, allowing you to complete the installation and deployment in an environment without internet access.

If you have internet access, downloading the pre-made Offline Package in advance can help speed up the installation process and enhance the certainty and reliability of the installation.


Short Version

You have to download the Pigsty source tarball and the Offline Package in addition to the Standard Installation.

| Package | GitHub Release | CDN Mirror |
|---------|----------------|------------|
| Pigsty Source Tarball | pigsty-v2.7.0.tgz | pigsty-v2.7.0.tgz |
| EL8 Offline Package | pigsty-pkg-v2.7.0.el8.x86_64.tgz | pigsty-pkg-v2.7.0.el8.x86_64.tgz |
| Debian 12 Offline Package | pigsty-pkg-v2.7.0.debian12.x86_64.tgz | pigsty-pkg-v2.7.0.debian12.x86_64.tgz |
| Ubuntu 22.04 Offline Package | pigsty-pkg-v2.7.0.ubuntu22.x86_64.tgz | pigsty-pkg-v2.7.0.ubuntu22.x86_64.tgz |

You can also download them with curl:

VERSION=v2.7.0   # pigsty version
DISTRO=el8       # available distro: el7, el8, el9, debian11, debian12, ubuntu20, ubuntu22
curl "https://get.pigsty.cc/${VERSION}/pigsty-${VERSION}.tgz"                      -o ~/pigsty.tgz  # source tarball : ~/pigsty.tgz
curl "https://get.pigsty.cc/${VERSION}/pigsty-pkg-${VERSION}.${DISTRO}.x86_64.tgz" -o /tmp/pkg.tgz  # offline package: /tmp/pkg.tgz

Then upload them to the isolated admin node, put the offline package at /tmp/pkg.tgz, and enable it through the bootstrap procedure.

./bootstrap                 # you can install the pre-packed offline-package with '-y' flag, if you have Internet access
./configure; ./install.yml  # then continue with the standard configuration and installation tasks
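
For example, assuming the isolated admin node is reachable at 10.10.10.10 over your internal network, you might stage the artifacts like this before running bootstrap there:

scp ~/pigsty.tgz 10.10.10.10:~/pigsty.tgz      # upload the source tarball to the admin user's home
scp /tmp/pkg.tgz 10.10.10.10:/tmp/pkg.tgz      # upload the offline package to the expected path
ssh 10.10.10.10 'tar -xvf ~/pigsty.tgz -C ~'   # extract the source to ~/pigsty on that node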

What is offline installation?

Pigsty will download the required rpm/deb packages from the upstream yum/apt repo during the installation procedure and build a local software repo (located at /www/pigsty by default).

The local repo is served by Nginx and serves all nodes in this environment/deployment, including itself.

There are three main benefits to using a local repo:

  1. It can avoid repetitive download requests and traffic consumption, significantly speeding up the installation and improving its reliability.
  2. It will take a snapshot of available software versions, ensuring the consistency of the software versions installed across nodes in the deployment environment.
  3. The built local software repo can be packaged as a whole tarball and copied to an isolated environment with the same operating system for offline installation.

The principle of offline installation is: first, complete the Standard Installation process on a node with the same operating system and internet access. Then, take a snapshot of the built local repo (/www/pigsty) and pack it to make the Offline Package, and then deliver it to the isolated environment for use.

When the Pigsty installation procedure finds that the local repo already exists, it will enter offline install mode and install from the built local repo rather than from upstream.

Pigsty will skip the process of downloading and building the local software source from the internet and complete the entire installation process using the local software repo without the need for internet access.


Offline Package

The offline package is a tarball made with tar and gzip, placed under /tmp/pkg.tgz and extracted to /www/pigsty for use.

Pigsty has pre-packed offline packages for the 3 primary supported OS distros, which let you skip downloading & repo building if you are using the exact same operating system.

https://github.com/Vonng/pigsty/releases/download/v2.7.0/pigsty-pkg-v2.7.0.el8.x86_64.tgz       # EL 8.9 (Green Obsidian)
https://github.com/Vonng/pigsty/releases/download/v2.7.0/pigsty-pkg-v2.7.0.debian12.x86_64.tgz  # Debian 12    (bookworm)
https://github.com/Vonng/pigsty/releases/download/v2.7.0/pigsty-pkg-v2.7.0.ubuntu22.x86_64.tgz  # Ubuntu 22.04 (jammy)
https://get.pigsty.cc/v2.7.0/pigsty-pkg-v2.7.0.el8.x86_64.tgz      # (Green Obsidian)
https://get.pigsty.cc/v2.7.0/pigsty-pkg-v2.7.0.debian12.x86_64.tgz # (bookworm, 12.4)
https://get.pigsty.cc/v2.7.0/pigsty-pkg-v2.7.0.ubuntu22.x86_64.tgz # (jammy, 22.04.3)

Bootstrap

Pigsty needs ansible to run the playbooks. You have to install Ansible even without Internet access.

Luckily, ansible and its dependencies are already included in Pigsty’s offline package. The process of extracting and installing ansible from the offline package is known as Bootstrap.

You have to download the offline package and place it under /tmp/pkg.tgz, then run the bootstrap command:

./bootstrap       # extract /tmp/pkg.tgz and install ansible

The bootstrap script will extract /tmp/pkg.tgz into /www/pigsty, set up a local file repo, and install ansible from it.
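
After bootstrap completes, you can sanity-check the result; as reported in the bootstrap output, the repo file is /etc/yum.repos.d/pigsty-local.repo on EL systems and /etc/apt/sources.list.d/pigsty-local.list on Debian/Ubuntu:

ls /www/pigsty | head                     # local repo content extracted from /tmp/pkg.tgz
cat /etc/yum.repos.d/pigsty-local.repo    # the generated local repo file (EL example)
ansible --version                         # ansible installed from the local repo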

Bootstrap Procedure Logic
  1. Check preconditions.

  2. Check whether the local repo exists:

    • Y -> extract to /www/pigsty and create a repo file to enable it
    • N -> download the offline package from the Internet? (if applicable)
      • Y -> download from GitHub / CDN, then extract & enable it
      • N -> add a basic OS upstream repo file manually?
        • Y -> add it according to region / version
        • N -> leave it to the user’s default configuration

    Either way, there is now an available repo for installing Ansible.
    Precedence: local pkg.tgz > downloaded pkg.tgz > upstream > user-provided

  3. Install boot utils from the available repo:
    • el7/8/9: ansible createrepo_c unzip wget yum-utils sshpass
    • el8 extra: ansible python3.11-jmespath python3-cryptography createrepo_c unzip wget dnf-utils sshpass modulemd-tools
    • el9 extra: ansible python3-jmespath python3.11-jmespath createrepo_c unzip wget dnf-utils sshpass modulemd-tools
    • ubuntu/debian: ansible python3-jmespath dpkg-dev unzip wget sshpass acl

  4. Check ansible availability.

Example: Download offline package from the Internet (EL8)

On a RockyLinux 8.9 node with internet access, during the ./bootstrap process, if neither the local repo /www/pigsty nor the offline package /tmp/pkg.tgz is available, the script will prompt you to download the offline package. Simply reply y to proceed, or use ./bootstrap -y to reply “yes” automatically.

[vagrant@build-el8 pigsty]$ ./bootstrap
bootstrap pigsty v2.7.0 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = rpm,dnf
[ OK ] vendor = rocky (Rocky Linux)
[ OK ] version = 8 (8.9)
[ OK ] sudo = vagrant ok
[ OK ] EL 8.9 has pre-packed offline package available:
       https://get.pigsty.cc/v2.7.0/pigsty-pkg-v2.7.0.el8.x86_64.tgz
[ IN ] offline package not exist on /tmp/pkg.tgz, download? (y/n):
=> y
[ OK ] $ curl https://get.pigsty.cc/v2.7.0/pigsty-pkg-v2.7.0.el8.x86_64.tgz -o /tmp/pkg.tgz

... (yum install progress)

[ OK ] repo = extract from /tmp/pkg.tgz
[ OK ] repo file = use /etc/yum.repos.d/pigsty-local.repo
[WARN] rpm cache = updating, make take a while
[ OK ] repo cache = created
[ OK ] install el8 utils

...(yum install output)

Installed:
  createrepo_c-0.17.7-6.el8.x86_64  createrepo_c-libs-0.17.7-6.el8.x86_64 drpm-0.4.1-3.el8.x86_64 modulemd-tools-0.7-8.el8.noarch python3-createrepo_c-0.17.7-6.el8.x86_64 python3-libmodulemd-2.13.0-1.el8.x86_64
  python3-pyyaml-3.12-12.el8.x86_64 sshpass-1.09-4.el8.x86_64             unzip-6.0-46.el8.x86_64

...(yum install output)                                                                                                                18/18

Installed:
  ansible-8.3.0-1.el8.noarch                      ansible-core-2.15.3-1.el8.x86_64                          git-core-2.39.3-1.el8_8.x86_64                        mpdecimal-2.5.1-3.el8.x86_64
  python3-cffi-1.11.5-6.el8.x86_64                python3-cryptography-3.2.1-7.el8_9.x86_64                 python3-jmespath-0.9.0-11.el8.noarch                  python3-pycparser-2.14-14.el8.noarch
  python3.11-3.11.5-1.el8_9.x86_64                python3.11-cffi-1.15.1-1.el8.x86_64                       python3.11-cryptography-37.0.2-5.el8.x86_64           python3.11-jmespath-1.0.1-1.el8.noarch
  python3.11-libs-3.11.5-1.el8_9.x86_64           python3.11-pip-wheel-22.3.1-4.el8_9.1.noarch              python3.11-ply-3.11-1.el8.noarch                      python3.11-pycparser-2.20-1.el8.noarch
  python3.11-pyyaml-6.0-1.el8.x86_64              python3.11-setuptools-wheel-65.5.1-2.el8.noarch

Complete!
[ OK ] ansible = ansible [core 2.15.3]
[ OK ] boostrap pigsty complete
proceed with ./configure

Example: Bootstrap from a locally downloaded offline package (Debian 12)

On a Debian 12 node without Internet access, the user has pre-downloaded the offline package and uploaded it to the expected path on the node.

If the offline package exists at /tmp/pkg.tgz, the bootstrap process will use it directly, and the output will be like:

vagrant@debian12:~/pigsty$ ./bootstrap
bootstrap pigsty v2.7.0 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = deb,apt
[ OK ] vendor = debian (Debian GNU/Linux)
[ OK ] version = 12 (12)
[ OK ] sudo = vagrant ok
[ OK ] cache = /tmp/pkg.tgz exists
[ OK ] repo = extract from /tmp/pkg.tgz
[ OK ] repo file = use /etc/apt/sources.list.d/pigsty-local.list
[WARN] apt cache = updating, make take a while

...(apt install output)

[ OK ] ansible = ansible [core 2.14.3]
[ OK ] boostrap pigsty complete
proceed with ./configure

Example: Bootstrap from the Internet (Ubuntu 20.04)

On an Ubuntu 20.04 node with internet access, since Pigsty does not provide an official offline package for this OS major version, the online installation is performed. In this case, the bootstrap process will use the available upstream repo to install ansible and its dependencies using yum or apt:

vagrant@ubuntu20:~/pigsty$ ./bootstrap
bootstrap pigsty v2.7.0 begin
[ OK ] region = china
[ OK ] kernel = Linux
[ OK ] machine = x86_64
[ OK ] package = deb,apt
[ OK ] vendor = ubuntu (Ubuntu)
[ OK ] version = 20 (20.04)
[ OK ] sudo = vagrant ok
[WARN] ubuntu 20 focal does not have corresponding offline package, use online install
[WARN] cache = missing and skip download
[WARN] repo = skip (/tmp/pkg.tgz not exists)
[ OK ] repo file = add ubuntu focal china upstream
[WARN] apt cache = updating, make take a while

...(apt update/install output)

[ OK ] ansible = ansible 2.9.6
[ OK ] boostrap pigsty complete
proceed with ./configure


Make Offline Package

Pigsty has a bin/cache script to make offline packages.

Run this script on installed nodes; it will compress the /www/pigsty repo dir into /tmp/pkg.tgz tarball.

bin/cache [version=v2.6.0]
          [pkg_path=/tmp/pkg.tgz]
          [repo_dir=/www/pigsty]

You can upload that /tmp/pkg.tgz to isolated production nodes for use with the same bootstrap procedure.

sudo rm -rf /www/pigsty            # remove obsolete files
sudo tar -xf /tmp/pkg.tgz -C /www  # extract to nginx dir
Make Offline Package Example Output

Example: make an offline package on freshly installed Ubuntu 22.04 with bin/cache:

vagrant@ubuntu22:~/pigsty$ bin/cache
[ OK ] create offline package on ubuntu22
[ OK ] pkg type = deb
[ OK ] repo dir = /www/pigsty
[ OK ] copy /www/pigsty to /tmp/pigsty-build/pigsty
[ OK ] grafana plugins = found, overwrite /tmp/pigsty-build/pigsty/plugins.tgz
knightss27-weathermap-panel  marcusolsson-dynamictext-panel	marcusolsson-json-datasource	marcusolsson-treemap-panel  volkovlabs-form-panel	 volkovlabs-image-panel
marcusolsson-calendar-panel  marcusolsson-hourly-heatmap-panel	marcusolsson-static-datasource	volkovlabs-echarts-panel    volkovlabs-grapi-datasource  volkovlabs-variable-panel
[ OK ] ubuntu 22 = no packages needs to be cleansed
[ OK ] package = making pigsty-pkg-v2.6.0.ubuntu22.x86_64.tgz

pigsty/
pigsty/liblua5.2-0_5.2.4-2_amd64.deb
.................. # Lot's of packages
pigsty/distro-info-data_0.52ubuntu0.6_all.deb

[ OK ] package = finish pigsty-pkg-v2.6.0.ubuntu22.x86_64.tgz 48f9cb2dd2cabb61b115bfe2ac9e002d
-rw-rw-r-- 1 vagrant vagrant 1.2G Feb 29 03:09 /tmp/pkg.tgz
scp ubuntu22:/tmp/pkg.tgz v2.6.0/pigsty-pkg-v2.6.0.ubuntu22.x86_64.tgz


2.3 - Configuration

Describe database and infrastructure as code using declarative Configuration

Pigsty treats Infra & Database as Code. You can describe the infrastructure & database clusters through a declarative interface. All your essential work is to describe your needs in the inventory, then materialize them with a simple idempotent playbook.


Inventory

Each pigsty deployment has a corresponding config inventory. It could be stored in a local git-managed YAML file, or dynamically generated from a CMDB or any other Ansible-compatible source. Pigsty uses a monolithic YAML config file as the default config inventory: pigsty.yml, located in the pigsty home directory.

The inventory consists of two parts: global vars & multiple group definitions. You can define new clusters with inventory groups (all.children), and describe infra and set global default parameters for clusters with global vars (all.vars). It may look like this:

all:                  # Top-level object: all
  vars: {...}         # Global Parameters
  children:           # Group Definitions
    infra:            # Group Definition: 'infra'
      hosts: {...}        # Group Membership: 'infra'
      vars:  {...}        # Group Parameters: 'infra'
    etcd:    {...}    # Group Definition: 'etcd'
    pg-meta: {...}    # Group Definition: 'pg-meta'
    pg-test: {...}    # Group Definition: 'pg-test'
    redis-test: {...} # Group Definition: 'redis-test'
    # ...

There are lots of config examples under files/pigsty


Cluster

Each group may represent a cluster, which could be a Node cluster, PostgreSQL cluster, Redis cluster, Etcd cluster, or Minio cluster, etc. They all use the same format: group vars & hosts. You can define cluster members with all.children.<cls>.hosts and describe the cluster with cluster parameters in all.children.<cls>.vars. Here is an example of a 3-node PostgreSQL HA cluster named pg-test:

pg-test:   # Group Name
  vars:    # Group Vars (Cluster Parameters)
    pg_cluster: pg-test
  hosts:   # Group Host (Cluster Membership)
    10.10.10.11: { pg_seq: 1, pg_role: primary } # Host1
    10.10.10.12: { pg_seq: 2, pg_role: replica } # Host2
    10.10.10.13: { pg_seq: 3, pg_role: offline } # Host3

You can also define parameters for a specific host, also known as host vars. They override group vars and global vars, and are usually used for assigning identities to nodes & database instances.


Parameter

Global vars, Group vars, and Host vars are dict objects consisting of a series of K-V pairs. Each pair is a named Parameter consisting of a string name as the key and a value of one of five types: boolean, string, number, array, or object. Check parameter reference for detailed syntax & semantics.

Every parameter has a proper default value except for mandatory IDENTITY PARAMETERS; they are used as identifiers and must be set explicitly, such as pg_cluster, pg_role, and pg_seq.

Parameters can be specified & overridden with the following precedence.

Playbook Args  >  Host Vars  >  Group Vars  >  Global Vars  >  Defaults

For example:

  • Force removing existing databases with Playbook CLI Args -e pg_clean=true
  • Override an instance role with Instance Level Parameter pg_role on Host Vars
  • Override a cluster name with Cluster Level Parameter pg_cluster on Group Vars.
  • Specify global NTP servers with Global Parameter node_ntp_servers on Global Vars
  • If no pg_version is set, it will use the default value from role implementation (16 by default)
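
As a hedged example of this precedence, assuming the pg-test cluster defined above, a playbook argument overrides any pg_version set at the host, group, or global level:

./pgsql.yml -l pg-test -e pg_version=15   # CLI arg beats host / group / global vars and the default (16)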

Template

There are numerous preset config templates for different scenarios under the files/pigsty directory.

During the configure process, you can specify a template with the -m parameter. Otherwise, a single-node installation config template will be automatically selected based on your OS distribution.

Although Pigsty no longer officially supports some older OS distros, you can still use the corresponding templates for those older major OS versions to perform online installation.


Switch Config Inventory

To use a different config inventory, you can copy & paste the content into the pigsty.yml file in the home dir as needed.

You can also explicitly specify the config inventory file to use when executing Ansible playbooks by using the -i command-line parameter, for example:

./node.yml -i files/pigsty/rpmbuild.yml    # use another file as config inventory, rather than the default pigsty.yml

If you want to modify the default config inventory filename, you can change the inventory parameter in the ansible.cfg file in the home dir to point to your own inventory file path. This allows you to run the ansible-playbook command without explicitly specifying the -i parameter.
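
For reference, a minimal sketch of the relevant ansible.cfg setting (the actual file shipped with Pigsty contains more options than shown here):

[defaults]
inventory = pigsty.yml     # change this to point at your own inventory file path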

Pigsty allows you to use a database (CMDB) as a dynamic configuration source instead of a static configuration file. Pigsty provides three convenient scripts:

  • bin/inventory_load: Loads the content of the pigsty.yml into the local PostgreSQL database (meta.pigsty)
  • bin/inventory_cmdb: Switches the configuration source to the local PostgreSQL database (meta.pigsty)
  • bin/inventory_conf: Switches the configuration source to the local static configuration file pigsty.yml
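
A minimal usage sketch, assuming the scripts are run from the pigsty home dir with their default arguments and that the meta.pigsty database from the default installation is available:

bin/inventory_load    # load pigsty.yml into the local CMDB (meta.pigsty)
bin/inventory_cmdb    # switch the config source to the CMDB
bin/inventory_conf    # switch back to the static pigsty.yml file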

Reference

Pigsty has 280+ parameters; check Parameter for details.

| Module | Section | Description | Count |
|--------|---------|-------------|-------|
| INFRA | META | Pigsty Metadata | 4 |
| INFRA | CA | Self-Signed CA | 3 |
| INFRA | INFRA_ID | Infra Portals & Identity | 2 |
| INFRA | REPO | Local Software Repo | 9 |
| INFRA | INFRA_PACKAGE | Infra Packages | 2 |
| INFRA | NGINX | Nginx Web Server | 7 |
| INFRA | DNS | DNSMASQ Nameserver | 3 |
| INFRA | PROMETHEUS | Prometheus Stack | 18 |
| INFRA | GRAFANA | Grafana Stack | 6 |
| INFRA | LOKI | Loki Logging Service | 4 |
| NODE | NODE_ID | Node Identity Parameters | 5 |
| NODE | NODE_DNS | Node domain names & resolver | 6 |
| NODE | NODE_PACKAGE | Node Repo & Packages | 5 |
| NODE | NODE_TUNE | Node Tuning & Kernel features | 10 |
| NODE | NODE_ADMIN | Admin User & Credentials | 7 |
| NODE | NODE_TIME | Node Timezone, NTP, Crontabs | 5 |
| NODE | NODE_VIP | Node Keepalived L2 VIP | 8 |
| NODE | HAPROXY | HAProxy the load balancer | 10 |
| NODE | NODE_EXPORTER | Node Monitoring Agent | 3 |
| NODE | PROMTAIL | Promtail logging Agent | 4 |
| DOCKER | DOCKER | Docker Daemon | 4 |
| ETCD | ETCD | ETCD DCS Cluster | 10 |
| MINIO | MINIO | MINIO S3 Object Storage | 15 |
| REDIS | REDIS | Redis the key-value NoSQL cache | 20 |
| PGSQL | PG_ID | PG Identity Parameters | 11 |
| PGSQL | PG_BUSINESS | PG Business Object Definition | 12 |
| PGSQL | PG_INSTALL | Install PG Packages & Extensions | 10 |
| PGSQL | PG_BOOTSTRAP | Init HA PG Cluster with Patroni | 39 |
| PGSQL | PG_PROVISION | Create in-database objects | 9 |
| PGSQL | PG_BACKUP | Set Backup Repo with pgBackRest | 5 |
| PGSQL | PG_SERVICE | Exposing service, bind vip, dns | 9 |
| PGSQL | PG_EXPORTER | PG Monitor agent for Prometheus | 15 |

2.4 - Preparation

How to prepare the nodes, network, OS distros, admin user, ports, and permissions for Pigsty.

Node

Pigsty supports the Linux kernel and x86_64/amd64 arch, applicable to any node.

A “node” refers to a resource that is SSH accessible and offers a bare OS environment, such as a physical machine, a virtual machine, or an OS container equipped with systemd and sshd.

Deploying Pigsty requires at least 1 node. The minimum spec requirement is 1C1G, but it is recommended to use at least 2C4G, with no upper limit: parameters will automatically optimize and adapt.

For demos, personal sites, devbox, or standalone monitoring infra, 1-2 nodes are recommended, while at least 3 nodes are suggested for an HA PostgreSQL cluster. For critical scenarios, 4-5 nodes are advisable.


Network

Pigsty requires nodes to use static IPv4 addresses, which means you should explicitly assign your nodes a specific fixed IP address rather than using DHCP-assigned addresses.

The IP address used by a node should be the primary IP address for internal network communications and will serve as the node’s unique identifier.

If you wish to use the optional Node VIP and PG VIP features, ensure all nodes are located within an L2 network.

Your firewall policy should ensure the required ports are open between nodes. For a detailed list of ports required by different modules, refer to Node: Ports.


Operating System

Pigsty supports various Linux OS. We recommend using RockyLinux 8.9 or Ubuntu 22.04.3 as the default OS for installing Pigsty.

Pigsty supports RHEL (7,8,9), Debian (11,12), Ubuntu (20,22), and many other compatible OS distros. Check Compatibility For a complete list of compatible OS distros.

When deploying on multiple nodes, we strongly recommend using the same version of the OS distro and the Linux kernel on all nodes.

We strongly recommend using a clean, minimally installed OS environment with en_US set as the primary language.


Admin User

You’ll need an “admin user” on all nodes where Pigsty is meant to be deployed: an OS user with nopass ssh login and nopass sudo permissions.

Passwordless sudo is required to execute commands during the installation process, such as installing packages and configuring system settings.


SSH Permission

In addition to nopass sudo privilege, Pigsty also requires the admin user to have nopass ssh login privilege (login via ssh key).

For a single-host installation, this means the admin user on the local node should be able to log in to the host itself via ssh without a password.

If your Pigsty deployment involves multiple nodes, this means the admin user on the admin node should be able to log in to all nodes managed by Pigsty (including the local node) via ssh without a password, and execute sudo commands without a password as well.

During the configure procedure, if your current admin user does not have any SSH key, it will attempt to address this issue by generating a new id_rsa key pair and adding it to the local ~/.ssh/authorized_keys file to ensure local SSH login capability for the local admin user.
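
If you prefer to set this up manually, a minimal sketch with standard OpenSSH commands, run as the admin user on the node:

ssh-keygen -t rsa -b 4096 -N '' -f ~/.ssh/id_rsa   # generate a key pair if you don't have one
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys    # authorize it for local ssh login
chmod 0600 ~/.ssh/authorized_keys
ssh 127.0.0.1 'sudo ls'                            # verify nopass ssh login + nopass sudo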

By default, Pigsty creates an admin user dba (uid=88) on all managed nodes. If you are already using this user, we recommend that you change the node_admin_username to a new username with a different uid, or disable it using the node_admin_enabled parameter.


SSH Accessibility

If your environment has some restrictions on SSH access, such as a bastion server or ad hoc firewall rules that prevent simple SSH access via ssh <ip>, consider using SSH aliases.

For example, if there’s a node with IP 10.10.10.10 that can not be accessed directly via ssh but can be accessed via an ssh alias meta defined in ~/.ssh/config, then you can configure the ansible_host parameter for that node in the inventory to specify the SSH Alias on the host level:

nodes:    
  hosts:  # 10.10.10.10 can not be accessed directly via ssh, but can be accessed via ssh alias 'meta'
    10.10.10.10: { ansible_host: meta }

If the ssh alias does not meet your requirement, there are a plethora of custom ssh connection parameters that can bring fine-grained control over SSH connection behavior.
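
For example, a sketch using standard Ansible connection variables (ansible_user, ansible_port, and ansible_ssh_private_key_file are Ansible built-ins; the values here are placeholders, not Pigsty defaults):

nodes:
  hosts:
    10.10.10.10:
      ansible_host: meta                                # connect via the ssh alias 'meta'
      ansible_port: 24022                               # non-default ssh port
      ansible_user: admin                               # ssh as this user
      ansible_ssh_private_key_file: ~/.ssh/pigsty_key   # use a dedicated private key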

If the following cmd can be successfully executed on the admin node by the admin user, it means that the target node’s admin user is properly configured.

ssh <ip|alias> 'sudo ls'

Software

Pigsty requires ansible on the admin node to initiate control. For the singleton meta installation, that is the node itself; Ansible is not required on common managed nodes.

The bootstrap procedure will make every effort to do this for you. But you can always choose to install Ansible manually. The process of manually installing Ansible varies with different OS distros / major versions (usually involving an additional weak dependency jmespath):

sudo dnf install -y ansible python3.11-jmespath   # EL 8/9 (dnf)
sudo yum install -y ansible                       # EL 7 (yum), no need to install jmespath explicitly
sudo apt install -y ansible python3-jmespath      # Debian / Ubuntu (apt)
brew install ansible                              # macOS (Homebrew)

To install Pigsty, you also need to prepare the Pigsty source package. You can directly download a specific version from the GitHub Release page or use the following command to obtain the latest stable version:

curl -L https://get.pigsty.cc/install  | bash

If your env does not have Internet access, consider using the offline packages, which are pre-packed for different OS distros, and can be downloaded from the GitHub Release page.

2.5 - Planning

Prepare required resources according to your needs.

2.6 - Playbooks

Pigsty implements module controllers with idempotent Ansible playbooks; here is some essential information you need to know about them.

Playbooks are used in Pigsty to install modules on nodes.

To run playbooks, just treat them as executables. e.g. run with ./install.yml.


Playbooks

Here are default playbooks included in Pigsty.

Playbook Function
install.yml Install Pigsty on current node in one-pass
infra.yml Init pigsty infrastructure on infra nodes
infra-rm.yml Remove infrastructure components from infra nodes
node.yml Init node for pigsty, tune node into desired status
node-rm.yml Remove node from pigsty
pgsql.yml Init HA PostgreSQL clusters, or adding new replicas
pgsql-rm.yml Remove PostgreSQL cluster, or remove replicas
pgsql-user.yml Add new business user to existing PostgreSQL cluster
pgsql-db.yml Add new business database to existing PostgreSQL cluster
pgsql-monitor.yml Monitor remote postgres instance with local exporters
pgsql-migration.yml Generate Migration manual & scripts for existing PostgreSQL
redis.yml Init redis cluster/node/instance
redis-rm.yml Remove redis cluster/node/instance
etcd.yml Init etcd cluster (required for patroni HA DCS)
minio.yml Init minio cluster (optional for pgbackrest repo)
cert.yml Issue cert with pigsty self-signed CA (e.g. for pg clients)
docker.yml Install docker on nodes
mongo.yml Install Mongo/FerretDB on nodes

One-Pass Install

The special playbook install.yml is actually a composed playbook that installs everything in the current environment.


  playbook  / command / group         infra           nodes    etcd     minio     pgsql
[infra.yml] ./infra.yml [-l infra]   [+infra][+node] 
[node.yml]  ./node.yml                               [+node]  [+node]  [+node]   [+node]
[etcd.yml]  ./etcd.yml  [-l etcd ]                            [+etcd]
[minio.yml] ./minio.yml [-l minio]                                     [+minio]
[pgsql.yml] ./pgsql.yml                                                          [+pgsql]

Note that there’s a circular dependency between NODE and INFRA: to register a NODE to INFRA, the INFRA should already exist, while the INFRA module relies on NODE to work.

The solution is that the INFRA playbook will also install the NODE module on infra nodes, in addition to INFRA. Make sure the infra nodes are initialized first. If you really want to init all nodes, including infra nodes, in one pass, install.yml is the way to go.


Ansible

Playbooks require the ansible-playbook executable to run, which is included in the ansible rpm / deb package.

Pigsty will try its best to install ansible on the admin node during bootstrap.

You can install it yourself with yum|apt|brew install ansible; it is included in the default OS repos.

Knowledge about ansible is good but not required. Only four parameters need your attention:

  • -l|--limit <pattern> : Limit execution target on specific group/host/pattern (Where)
  • -t|--tags <tags>: Only run tasks with specific tags (What)
  • -e|--extra-vars <vars>: Extra command line arguments (How)
  • -i|--inventory <path>: Using another inventory file (Conf)

Designate Inventory

To use a different config inventory, you can copy & paste the content into the pigsty.yml file in the home dir as needed.

The active inventory file can be specified with the -i|--inventory <path> parameter when running Ansible playbooks.

./node.yml  -i files/pigsty/rpmbuild.yml    # use another file as config inventory, rather than the default pigsty.yml
./pgsql.yml -i files/pigsty/rpm.yml         # install pgsql module on machines defined in files/pigsty/rpm.yml
./redis.yml -i files/pigsty/redis.yml       # install redis module on machines defined in files/pigsty/redis.yml

If you wish to permanently modify the default config inventory filename, you can change the inventory parameter in the ansible.cfg
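
For example, a minimal sketch of ansible.cfg (the replacement path below is just an illustration):

[defaults]
inventory = files/pigsty/rpm.yml    # use this file instead of the default pigsty.yml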


Limit Host

The target of a playbook can be limited with -l|--limit <selector>.

Missing this value could be dangerous, since most playbooks will execute on all hosts. DO USE WITH CAUTION.

Here are some examples of host limit:

./pgsql.yml                 # run on all hosts (very dangerous!)
./pgsql.yml -l pg-test      # run on pg-test cluster
./pgsql.yml -l 10.10.10.10  # run on single host 10.10.10.10
./pgsql.yml -l pg-*         # run on host/group matching glob pattern `pg-*`
./pgsql.yml -l '10.10.10.11,&pg-test'     # run on 10.10.10.11 of group pg-test
./pgsql-rm.yml -l 'pg-test,!10.10.10.11'  # run on pg-test, except 10.10.10.11

Limit Tags

You can execute a subset of playbook with -t|--tags <tags>.

You can specify multiple tags in comma separated list, e.g. -t tag1,tag2.

If specified, tasks with given tags will be executed instead of entire playbook.

Here are some examples of task limit:

./pgsql.yml -t pg_clean    # cleanup existing postgres if necessary
./pgsql.yml -t pg_dbsu     # setup os user sudo for postgres dbsu
./pgsql.yml -t pg_install  # install postgres packages & extensions
./pgsql.yml -t pg_dir      # create postgres directories and setup fhs
./pgsql.yml -t pg_util     # copy utils scripts, setup alias and env
./pgsql.yml -t patroni     # bootstrap postgres with patroni
./pgsql.yml -t pg_user     # provision postgres business users
./pgsql.yml -t pg_db       # provision postgres business databases
./pgsql.yml -t pg_backup   # init pgbackrest repo & basebackup
./pgsql.yml -t pgbouncer   # deploy a pgbouncer sidecar with postgres
./pgsql.yml -t pg_vip      # bind vip to pgsql primary with vip-manager
./pgsql.yml -t pg_dns      # register dns name to infra dnsmasq
./pgsql.yml -t pg_service  # expose pgsql service with haproxy
./pgsql.yml -t pg_exporter # deploy pg_exporter, the PG monitor agent for Prometheus
./pgsql.yml -t pg_register # register postgres to pigsty infrastructure

# run multiple tasks: reload postgres & pgbouncer hba rules
./pgsql.yml -t pg_hba,pgbouncer_hba,pgbouncer_reload

# run multiple tasks: refresh haproxy config & reload it
./node.yml -t haproxy_config,haproxy_reload

Extra Vars

Extra command-line args can be passed via -e|--extra-vars KEY=VALUE.

They have the highest precedence over all other definitions.

Here are some examples of extra vars:

./node.yml -e ansible_user=admin -k -K   # run playbook as another user (with admin sudo password)
./pgsql.yml -e pg_clean=true             # force purging existing postgres when init a pgsql instance
./pgsql-rm.yml -e pg_uninstall=true      # explicitly uninstall rpm after postgres instance is removed
./redis.yml -l 10.10.10.10 -e redis_port=6379 -t redis  # init a specific redis instance: 10.10.10.10:6379
./redis-rm.yml -l 10.10.10.13 -e redis_port=6379        # remove a specific redis instance: 10.10.10.13:6379

Most playbooks are idempotent, which means that some deployment playbooks may erase existing databases and create new ones unless the protection option is turned on.

Please read the documentation carefully, proofread the commands several times, and operate with caution. The author is not responsible for any loss of databases due to misuse.

2.7 - Provisioning

Introduce the 4-node sandbox environment, and provision VMs with vagrant & terraform.

Pigsty runs on nodes, which are bare metals or virtual machines. You can prepare them manually, or use terraform & vagrant for provisioning.


Sandbox

Pigsty has a sandbox, which is a 4-node deployment with fixed IP addresses and other identifiers. Check demo.yml for details.

The sandbox consists of 4 nodes with fixed IP addresses: 10.10.10.10, 10.10.10.11, 10.10.10.12, 10.10.10.13.

There’s a primary singleton PostgreSQL cluster: pg-meta on the meta node, which can be used alone if you don’t care about PostgreSQL high availability.

  • meta 10.10.10.10 pg-meta pg-meta-1

There are 3 additional nodes in the sandbox, forming a 3-instance PostgreSQL HA cluster pg-test.

  • node-1 10.10.10.11 pg-test pg-test-1
  • node-2 10.10.10.12 pg-test pg-test-2
  • node-3 10.10.10.13 pg-test pg-test-3

Two optional L2 VIPs are bound to the primary instances of clusters pg-meta and pg-test:

  • 10.10.10.2 pg-meta
  • 10.10.10.3 pg-test

There’s also a 1-instance etcd cluster and a 1-instance minio cluster on the meta node.

pigsty-sandbox.jpg

You can run sandbox on local VMs or cloud VMs. Pigsty offers a local sandbox based on Vagrant (pulling up local VMs using Virtualbox or libvirt), and a cloud sandbox based on Terraform (creating VMs using the cloud vendor API).

  • Local sandbox can be run on your Mac/PC for free. Your Mac/PC should have at least 4C/8G to run the full 4-node sandbox.

  • Cloud sandbox can be easily created and shared. You will have to create a cloud account for that. VMs are created on-demand and can be destroyed with one command, which is also very cheap for a quick glance.


Vagrant

Vagrant can create local VMs according to specs in a declarative way. Check Vagrant Templates Intro for details

Vagrant will use VirtualBox as the default VM provider; however, libvirt, docker, Parallels Desktop, and VMware can also be used. We will use VirtualBox in this guide.

Installation

Make sure Vagrant and Virtualbox are installed and available on your OS.

If you are using macOS, you can use homebrew to install both of them with one command (reboot required). You can also use vagrant-libvirt on Linux.

/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
brew install vagrant virtualbox ansible   # Run on MacOS with one command, but only works on x86_64 Intel chips

Configuration

vagrant/Vagrantfile is a Ruby script file describing VM nodes. Here are some default specs of Pigsty.

Templates Shortcut Spec Comment
meta.rb v1 4C8G x 1 Single Meta Node
full.rb v4 2C4G + 1C2G x 3 Full 4 Nodes Sandbox Demo
el7.rb v7 2C4G + 1C2G x 3 EL7 3-node Testing Env
el8.rb v8 2C4G + 1C2G x 3 EL8 3-node Testing Env
el9.rb v9 2C4G + 1C2G x 3 EL9 3-node Testing Env
build.rb vb 2C4G x 3 3-Node EL7,8,9 Building Environment
check.rb vc 2C4G x 30 30 Node EL7-EL9 PG 12-16 Env
minio.rb vm 2C4G x 3 + Disk 3-Node MinIO/etcd Testing Env
prod.rb vp 45 nodes Prod simulation with 45 Nodes

Each spec file contains a Specs variable describing the VM nodes. For example, full.rb contains the 4-node sandbox specs.

Specs = [
  {"name" => "meta",   "ip" => "10.10.10.10", "cpu" => "2",  "mem" => "4096", "image" => "generic/rocky9" },
  {"name" => "node-1", "ip" => "10.10.10.11", "cpu" => "1",  "mem" => "2048", "image" => "generic/rocky9" },
  {"name" => "node-2", "ip" => "10.10.10.12", "cpu" => "1",  "mem" => "2048", "image" => "generic/rocky9" },
  {"name" => "node-3", "ip" => "10.10.10.13", "cpu" => "1",  "mem" => "2048", "image" => "generic/rocky9" },
]

You can switch specs with the vagrant/switch script, which will render the final Vagrantfile according to the spec.

cd ~/pigsty
vagrant/switch <spec>

vagrant/switch meta     # singleton meta        | alias:  `make v1`
vagrant/switch full     # 4-node sandbox        | alias:  `make v4`
vagrant/switch el7      # 3-node el7 test       | alias:  `make v7`
vagrant/switch el8      # 3-node el8 test       | alias:  `make v8`
vagrant/switch el9      # 3-node el9 test       | alias:  `make v9`
vagrant/switch prod     # prod simulation       | alias:  `make vp`
vagrant/switch build    # building environment  | alias:  `make vd`
vagrant/switch minio    # 3-node minio env
vagrant/switch check    # 30-node check env

Management

After describing the VM nodes with specs and generating the vagrant/Vagrantfile, you can create the VMs with the vagrant up command.

Pigsty templates will use your ~/.ssh/id_rsa[.pub] as the default ssh key for vagrant provisioning.

Make sure you have a valid ssh key pair before you start, you can generate one by: ssh-keygen -t rsa -b 2048

There are some makefile shortcuts that wrap the vagrant commands, you can use them to manage the VMs.

make         # = make start
make new     # destroy existing vm and create new ones
make ssh     # write VM ssh config to ~/.ssh/     (required)
make dns     # write VM DNS records to /etc/hosts (optional)
make start   # launch VMs and write ssh config    (up + ssh) 
make up      # launch VMs with vagrant up
make halt    # shutdown VMs (down,dw)
make clean   # destroy VMs (clean/del/destroy)
make status  # show VM status (st)
make pause   # pause VMs (suspend,pause)
make resume  # resume VMs (resume)
make nuke    # destroy all vm & volumes with virsh (if using libvirt) 

Shortcuts

You can create VMs with the following shortcuts:

make meta     # singleton meta
make full     # 4-node sandbox
make el7      # 3-node el7 test
make el8      # 3-node el8 test
make el9      # 3-node el9 test
make prod     # prod simulation
make build    # building environment
make minio    # 3-node minio env
make check    # 30-node check env
make meta  install   # create and install pigsty on 1-node singleton meta
make full  install   # create and install pigsty on 4-node sandbox
make prod  install   # create and install pigsty on 42-node KVM libvirt environment
make check install   # create and install pigsty on 30-node testing & validating environment
...

Terraform

Terraform is an open-source tool to practice ‘Infra as Code’: describe the cloud resources you want and create them with one command.

Pigsty has terraform templates for AWS, Aliyun, and Tencent Cloud, you can use them to create VMs on the cloud for Pigsty Demo.

Terraform can be easily installed with homebrew, too: brew install terraform. You will have to create a cloud account to obtain AccessKey and AccessSecret credentials to proceed.

The terraform/ dir has two example templates: one for AWS and one for Aliyun. You can adjust them to fit your needs, or modify them if you are using a different cloud vendor.

Take Aliyun as example:

cd terraform                          # goto the terraform dir
cp spec/aliyun.tf terraform.tf        # use aliyun template
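
Cloud credentials are usually supplied via the provider's environment variables; a hedged sketch for the Aliyun provider (the variable names follow the Terraform alicloud provider, and the values and region are placeholders):

export ALICLOUD_ACCESS_KEY='your-access-key'   # placeholder credential
export ALICLOUD_SECRET_KEY='your-secret-key'   # placeholder credential
export ALICLOUD_REGION='cn-beijing'            # placeholder region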

You have to perform terraform init before terraform apply:

terraform init      # install terraform provider: aliyun (required only for the first time)
terraform apply     # generate execution plans: create VMs, virtual segments/switches/security groups

After running apply and answering yes to the prompt, Terraform will create the VMs and configure the network for you.

The admin node ip address will be printed out at the end of the execution, you can ssh login and start pigsty installation.

2.8 - Security

Security considerations and best-practices in Pigsty

Pigsty already provides a secure-by-default authentication and access control model, which is sufficient for most scenarios.

pigsty-acl.jpg

But if you want to further strengthen the security of the system, the following suggestions are for your reference:


Confidentiality

Important Files

Secure your pigsty config inventory

  • pigsty.yml has highly sensitive information, including passwords, certificates, and keys.
  • You should limit access to admin/infra nodes, only accessible by the admin/dba users
  • Limit access to the git repo, if you are using git to manage your pigsty source.

Secure your CA private key and other certs

  • These files are very important, and will be generated under files/pki under pigsty source dir by default.
  • You should secure & backup them in a safe place periodically.

Passwords

Always change these passwords, DO NOT USE THE DEFAULT VALUES:

Please change the MinIO user secret key and the corresponding pgbackrest_repo references

If you are using remote backup method, secure backup with distinct passwords

  • Use aes-256-cbc for pgbackrest_repo.*.cipher_type
  • When setting a password, you can use ${pg_cluster} placeholder as part of the password to avoid using the same password.

Use advanced password encryption method for PostgreSQL

  • use pg_pwd_enc default scram-sha-256 instead of legacy md5

Enforce a strong pg password with the passwordcheck extension.

  • add $lib/passwordcheck to pg_libs to enforce password policy.

Encrypt remote backup with an encryption algorithm

Add an expiration date to biz user passwords.

  • You can set an expiry date for each user for compliance purposes.

  • Don’t forget to refresh these passwords periodically.

    - { name: dbuser_meta , password: Pleas3-ChangeThisPwd ,expire_in: 7300 ,pgbouncer: true ,roles: [ dbrole_admin ]    ,comment: pigsty admin user }
    - { name: dbuser_view , password: Make.3ure-Compl1ance  ,expire_in: 7300 ,pgbouncer: true ,roles: [ dbrole_readonly ] ,comment: read-only viewer for meta database }
    - { name: postgres     ,superuser: true  ,expire_in: 7300                        ,comment: system superuser }
    - { name: replicator ,replication: true  ,expire_in: 7300 ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
    - { name: dbuser_dba   ,superuser: true  ,expire_in: 7300 ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
    - { name: dbuser_monitor ,roles: [pg_monitor] ,expire_in: 7300 ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }
    

Do not log password-changing statements in the postgres log.

SET log_statement TO 'none';
ALTER USER "{{ user.name }}" PASSWORD '{{ user.password }}';
SET log_statement TO DEFAULT;

IP Addresses

Bind to specific IP addresses rather than all addresses for postgres/pgbouncer/patroni

  • The default pg_listen address is 0.0.0.0, which is all IPv4 addresses.
  • consider using pg_listen: '${ip},${vip},${lo}' to bind to specific addresses for better security.

Do not expose any port to the Internet; except 80/443, the infra portal

  • Grafana/Prometheus are bound to all IP addresses by default for convenience.
  • You can modify their bind configuration to listen on localhost/intranet IP and expose by Nginx.
  • Redis servers are bound to all IP addresses by default for convenience. You can change redis_bind_address to listen on intranet IP.
  • You can also implement it with the security group or firewall rules.

Limit postgres client access with HBA

  • There’s a security enhance config template: security.yml

Limit patroni admin access from the infra/admin node.


Network Traffic

  • Access Nginx with SSL and domain names

  • Secure Patroni REST API with SSL

    • patroni_ssl_enabled is disabled by default
    • Since it affects health checks and API invocation.
    • Note this is a global option, and you have to decide before deployment.
  • Secure Pgbouncer Client Traffic with SSL

    • pgbouncer_sslmode is disabled by default
    • Since it has a significant performance impact.

Integrity


Consistency

Use consistency-first mode for PostgreSQL.

  • Using the crit.yml template for pg_conf will trade some availability for the best consistency (see the sketch below).

Use node crit tuned template for better consistency

  • set node_tune to crit to reduce dirty page ratio.
  • Enable data checksum to detect silent data corruption.
    • pg_checksum is disabled by default, and enabled for crit.yml by default
    • This can be enabled later, which requires a full cluster scan/stop.
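
A hedged sketch of the cluster-level parameters involved (pg_conf, node_tune, and pg_checksum are Pigsty parameters; verify exact names and defaults against the parameter reference):

pg-test:
  hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-test
    pg_conf: crit.yml    # consistency-first postgres parameter template
    node_tune: crit      # consistency-first node tuning template
    pg_checksum: true    # enable data checksums at cluster init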

Audit

  • Enable log_connections and log_disconnections after the pg cluster bootstrap.
    • Audit incoming sessions; this is enabled in crit.yml by default.

Availability

  • Do not access the database directly via a fixed IP address; use VIP, DNS, HAProxy, or their combination.

    • Haproxy will handle the traffic control for the clients in case of failover/switchover.
  • Use enough nodes for serious production deployment.

    • You need at least three nodes (tolerate one node failure) to achieve production-grade high availability.
    • If you only have two nodes, you can tolerate the failure of the specific standby node.
    • If you have one node, use an external S3/MinIO for cold backup & wal archive storage.
  • Trade off between availability and consistency for PostgreSQL.

    • pg_rpo : trade-off between Availability and Consistency
    • pg_rto : trade-off between failure chance and impact
  • Use multiple infra nodes in serious production deployment (e.g., 1~3)

    • Usually, 2 ~ 3 is enough for a large production deployment.
  • Use enough etcd members and use odd numbers (1,3,5,7).

2.9 - FAQ

Frequently asked questions about download, setup, configuration, and installation in Pigsty.

If you have any unlisted questions or suggestions, please create an Issue or ask the community for help.


How to Get the Pigsty Source Package?

Use the following command to install Pigsty with one click: bash -c "$(curl -fsSL https://get.pigsty.cc/install)"

This command will automatically download the latest stable version pigsty.tgz and extract it to the ~/pigsty directory. You can also manually download a specific version of the Pigsty source code from the following locations.

If you need to install it in an environment without internet access, you can download it in advance in a networked environment and transfer it to the production server via scp/sftp or CDROM/USB.


How to Speed Up RPM Downloads from Upstream Repositories?

Consider using a local repository mirror, which can be configured with the repo_upstream parameter. You can choose region to use different mirror sites.

For example, you can set region = china, which will use the URL with the key china in the baseurl instead of default.

If some repositories are blocked by a firewall or the GFW, consider using proxy_env to bypass it.
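
A hedged sketch of both knobs in pigsty.yml (region and proxy_env are Pigsty parameters; the proxy address is a placeholder):

all:
  vars:
    region: china                           # prefer the 'china' baseurl entries in repo_upstream
    proxy_env:
      no_proxy: "localhost,127.0.0.1,10.0.0.0/8"
      http_proxy: http://127.0.0.1:12345    # placeholder proxy address
      https_proxy: http://127.0.0.1:12345   # placeholder proxy address
      all_proxy: http://127.0.0.1:12345     # placeholder proxy address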


How to resolve node package conflict?

Beware that Pigsty’s pre-built offline packages are tailored for specific minor versions of OS distros. Therefore, if the major.minor version of your OS distro does not precisely align, we advise against using the offline installation packages. Instead, follow the default installation procedure and download the packages directly from the upstream repo through the Internet, which will acquire the versions that exactly match your OS version.

If online installation doesn’t work for you, you can first try modifying the upstream software sources used by Pigsty. For example, in EL family operating systems, Pigsty’s default upstream sources use a major version placeholder $releasever, which resolves to specific major versions like 7, 8, 9. However, many operating system distributions offer a Vault, allowing you to use a package mirror for a specific version. Therefore, you could replace the front part of the repo_upstream parameter’s BaseURL with a specific Vault minor version repository, such as:

  • https://dl.rockylinux.org/pub/rocky/$releasever (Original BaseURL prefix, without vault)
  • https://vault.centos.org/7.6.1810/ (Using 7.6 instead of the default 7.9)
  • https://dl.rockylinux.org/vault/rocky/8.6/ (Using 8.6 instead of the default 8.9)
  • https://dl.rockylinux.org/vault/rocky/9.2/ (Using 9.2 instead of the default 9.3)

Make sure the vault URL path exists & is valid before replacing the old values. Beware that some repos like epel do not offer specific minor version subdirs. Upstream repos that support this approach include: base, updates, extras, centos-sclo, centos-sclo-rh, baseos, appstream, extras, crb, powertools, pgdg-common, pgdg1*

repo_upstream:
  - { name: pigsty-local   ,description: 'Pigsty Local'      ,module: local ,releases: [7,8,9] ,baseurl: { default: 'http://${admin_ip}/pigsty'  }} # used by intranet nodes
  - { name: pigsty-infra   ,description: 'Pigsty INFRA'      ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/infra/$basearch' ,china: 'https://repo.pigsty.cc/rpm/infra/$basearch' }}
  - { name: pigsty-pgsql   ,description: 'Pigsty PGSQL'      ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/pgsql/el$releasever.$basearch' ,china: 'https://repo.pigsty.cc/rpm/pgsql/el$releasever.$basearch' }}
  - { name: nginx          ,description: 'Nginx Repo'        ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://nginx.org/packages/centos/$releasever/$basearch/' }}
  - { name: docker-ce      ,description: 'Docker CE'         ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://download.docker.com/linux/centos/$releasever/$basearch/stable'        ,china: 'https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable'  ,europe: 'https://mirrors.xtom.de/docker-ce/linux/centos/$releasever/$basearch/stable' }}
  - { name: base           ,description: 'EL 7 Base'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/os/$basearch/'                    ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/os/$basearch/'           ,europe: 'https://mirrors.xtom.de/centos/$releasever/os/$basearch/'           }}
  - { name: updates        ,description: 'EL 7 Updates'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/updates/$basearch/'               ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/updates/$basearch/'      ,europe: 'https://mirrors.xtom.de/centos/$releasever/updates/$basearch/'      }}
  - { name: extras         ,description: 'EL 7 Extras'       ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/extras/$basearch/'                ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/extras/$basearch/'       ,europe: 'https://mirrors.xtom.de/centos/$releasever/extras/$basearch/'       }}
  - { name: epel           ,description: 'EL 7 EPEL'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/$basearch/'            ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/$basearch/'                ,europe: 'https://mirrors.xtom.de/epel/$releasever/$basearch/'                }}
  - { name: centos-sclo    ,description: 'EL 7 SCLo'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/sclo/'             ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/sclo/'              ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/sclo/'    }}
  - { name: centos-sclo-rh ,description: 'EL 7 SCLo rh'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/rh/'               ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/rh/'                ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/rh/'      }}
  - { name: baseos         ,description: 'EL 8+ BaseOS'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/BaseOS/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/BaseOS/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/BaseOS/$basearch/os/'     }}
  - { name: appstream      ,description: 'EL 8+ AppStream'   ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/AppStream/$basearch/os/'      ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/AppStream/$basearch/os/'       ,europe: 'https://mirrors.xtom.de/rocky/$releasever/AppStream/$basearch/os/'  }}
  - { name: extras         ,description: 'EL 8+ Extras'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/extras/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/extras/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/extras/$basearch/os/'     }}
  - { name: crb            ,description: 'EL 9 CRB'          ,module: node  ,releases: [    9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/CRB/$basearch/os/'            ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/CRB/$basearch/os/'             ,europe: 'https://mirrors.xtom.de/rocky/$releasever/CRB/$basearch/os/'        }}
  - { name: powertools     ,description: 'EL 8 PowerTools'   ,module: node  ,releases: [  8  ] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/PowerTools/$basearch/os/'     ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/PowerTools/$basearch/os/'      ,europe: 'https://mirrors.xtom.de/rocky/$releasever/PowerTools/$basearch/os/' }}
  - { name: epel           ,description: 'EL 8+ EPEL'        ,module: node  ,releases: [  8,9] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/Everything/$basearch/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/Everything/$basearch/'     ,europe: 'https://mirrors.xtom.de/epel/$releasever/Everything/$basearch/'     }}
  - { name: pgdg-common    ,description: 'PostgreSQL Common' ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-extras    ,description: 'PostgreSQL Extra'  ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-el8fix    ,description: 'PostgreSQL EL8FIX' ,module: pgsql ,releases: [  8  ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' } }
  - { name: pgdg-el9fix    ,description: 'PostgreSQL EL9FIX' ,module: pgsql ,releases: [    9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/'  ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' }}
  - { name: pgdg15         ,description: 'PostgreSQL 15'     ,module: pgsql ,releases: [7    ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/15/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg16         ,description: 'PostgreSQL 16'     ,module: pgsql ,releases: [  8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/16/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' }}
  - { name: timescaledb    ,description: 'TimescaleDB'       ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://packagecloud.io/timescale/timescaledb/el/$releasever/$basearch'  }}

After explicitly defining and overriding repo_upstream in the Pigsty configuration file (you may also clear the /www/pigsty/repo_complete flag), try the installation again. If the upstream software sources and their mirrors still do not solve the problem, you might consider replacing them with the operating system’s built-in software sources and attempt a direct installation from upstream once more.

Finally, if the above methods do not resolve the issue, consider removing conflicting packages from node_packages, infra_packages, pg_packages, pg_extensions, or remove or upgrade the conflicting packages on the existing system.


What does bootstrap do?

Bootstrap will check the environment, ask whether to download the offline packages, and make sure the essential tool ansible is installed by various means.

When you download the Pigsty source code, you can enter the directory and execute the bootstrap script. It will check if your node environment is ready, and if it does not find offline packages, it will ask if you want to download them from the internet if applicable.

You can choose y to use offline packages, which will make the installation procedure faster. You can also choose n to skip and download directly from the internet during the installation process, which will download the latest software versions and reduce the chance of RPM conflicts.


What does configure do?

Detect the environment, generate the configuration, enable the offline package (optional), and install the essential tool Ansible.

After downloading the Pigsty source package and unpacking it, you may have to execute ./configure to complete the environment configuration. This is optional if you already know how to configure Pigsty properly.

The configure procedure will detect your node environment and generate a pigsty config file: pigsty.yml for you.


What is the Pigsty config file?

pigsty.yml under the pigsty home dir is the default config file.

Pigsty uses a single config file pigsty.yml, to describe the entire environment, and you can define everything there. There are many config examples in files/pigsty for your reference.

You can pass the -i <path> to playbooks to use other configuration files. For example, you want to install redis according to another config: redis.yml:

./redis.yml -i files/pigsty/redis.yml

How to use the CMDB as config inventory

The default config file path is specified in ansible.cfg: inventory = pigsty.yml

You can switch to a dynamic CMDB inventory with bin/inventory_cmdb, and switch back to the local config file with bin/inventory_conf. You must also load the current config file inventory to CMDB with bin/inventory_load.

If CMDB is used, you must edit the inventory config from the database rather than the config file.


What is the IP address placeholder in the config file?

Pigsty uses 10.10.10.10 as a placeholder for the current node IP, which will be replaced with the primary IP of the current node during the configuration.

When configure detects multiple NICs with multiple IPs on the current node, the config wizard will prompt for the primary IP to be used, i.e., the IP used to access the node from the internal network. Note: do not use a public IP.

This IP will be used to replace 10.10.10.10 in the config file template.


Which parameters need your attention?

Usually, in a singleton installation, there is no need to make any adjustments to the config files.

Pigsty provides 265 config parameters to customize the entire infra/node/etcd/minio/pgsql. However, there are a few parameters that can be adjusted in advance if needed:

  • The domain names for accessing web service components are defined in infra_portal (some services can only be accessed via domain name through the Nginx proxy).
  • Pigsty assumes that a /data dir exists to hold all data; you can adjust these paths if the data disk mount point differs from this.
  • Don’t forget to change those passwords in the config file for your production deployment.

Installation


What was executed during installation?

When running make install, the ansible playbook install.yml will be invoked to install everything on all nodes.

Which will:

  • Install INFRA module on the current node.
  • Install NODE module on the current node.
  • Install ETCD module on the current node.
  • The MinIO module is optional, and will not be installed by default.
  • Install PGSQL module on the current node.

How to resolve RPM conflict?

There is a slight chance that RPM conflicts occur during node/infra/pgsql package installation.

The simplest way to resolve this is to install without offline packages, which will download directly from the upstream repo.

If there are only a few problematic RPM/DEB packages, you can use a trick to fix the yum/apt repo quickly:

rm -rf /www/pigsty/repo_complete    # delete the repo_complete flag file to mark this repo incomplete
rm -rf SomeBrokenPackages           # delete problematic RPM/DEB packages
./infra.yml -t repo_upstream        # write upstream repos. you can also use /etc/yum.repos.d/backup/*
./infra.yml -t repo_pkg             # download rpms according to your current OS

How to create local VMs with vagrant

The first time you use Vagrant to pull up a VM with a particular OS image, it will download the corresponding BOX.

Pigsty sandbox uses the generic/rocky9 image box by default, and Vagrant will download the box the first time the VM is started.

Using a proxy may increase the download speed. Box only needs to be downloaded once, and will be reused when recreating the sandbox.


RPMs error on Aliyun CentOS 7.9

Aliyun CentOS 7.9 server has DNS caching service nscd installed by default. Just remove it.

Aliyun’s CentOS 7.9 image has nscd installed by default, locking the glibc version, which can cause RPM dependency errors during installation.

"Error: Package: nscd-2.17-307.el7.1.x86_64 (@base)"

Run yum remove -y nscd on all nodes to resolve this issue; with Ansible, you can do this in a batch:

ansible all -b -a 'yum remove -y nscd'

RPMs error on Tencent Qcloud Rocky 9.1

Tencent Qcloud Rocky 9.1 requires extra annobin packages

./infra.yml -t repo_upstream      # add upstream repos
cd /www/pigsty;                   # download missing packages
repotrack annobin gcc-plugin-annobin libuser
./infra.yml -t repo_create        # create repo

Ansible command timeout (Timeout waiting for xxx)

The default ssh timeout for ansible commands is 10 seconds; some commands may take longer than that due to network latency or other reasons.

You can increase the timeout parameter in the ansible config file ansible.cfg:

[defaults]
timeout = 10 # change to 60,120 or more

3 - Concept

Learn about core concept about Pigsty: architecture, cluster models, infra, PG HA, PITR, and service access.

3.1 - Architecture

Pigsty’s modular architecture, compose modules in a declarative manner

Modular Architecture and Declarative Interface!

  • Pigsty deployment is described by config inventory and materialized with ansible playbooks.
  • Pigsty works on Linux x86_64 common nodes, i.e., bare metals or virtual machines.
  • Pigsty uses a modular design that can be freely composed for different scenarios.
  • The config controls where & how to install modules with parameters
  • The playbooks will adjust nodes into the desired status in an idempotent manner.

Modules

Pigsty uses a modular design, and there are six default modules: PGSQL, INFRA, NODE, ETCD, REDIS, and MINIO.

  • PGSQL: Autonomous ha Postgres cluster powered by Patroni, Pgbouncer, HAproxy, PgBackrest, etc…
  • INFRA: Local yum/apt repo, Prometheus, Grafana, Loki, AlertManager, PushGateway, Blackbox Exporter…
  • NODE: Tune node to desired state, name, timezone, NTP, ssh, sudo, haproxy, docker, promtail, keepalived
  • ETCD: Distributed key-value store, used as the DCS for highly available Postgres clusters.
  • REDIS: Redis servers in standalone master-replica, sentinel, cluster mode with Redis exporter.
  • MINIO: S3 compatible simple object storage server, can be used as an optional backup center for Postgres.

You can compose them freely in a declarative manner. If you want host monitoring, INFRA & NODE will suffice. ETCD and PGSQL are added for HA PG clusters; deploying them on multiple nodes will form an HA cluster. You can reuse the pigsty infra and develop your own modules, with the optional REDIS and MINIO modules as examples.

pigsty-sandbox.jpg


Singleton Meta

Pigsty will install on a single node (BareMetal / VirtualMachine) by default. The install.yml playbook will install INFRA, ETCD, PGSQL, and optional MINIO modules on the current node, which will give you a full-featured observability infrastructure (Prometheus, Grafana, Loki, AlertManager, PushGateway, BlackboxExporter, etc… ) and a battery-included PostgreSQL Singleton Instance (Named meta).

This node now has a self-monitoring system, visualization toolsets, and a Postgres database with autoconfigured PITR. You can use this node for devbox, testing, running demos, and doing data visualization & analysis. Or, furthermore, adding more nodes to it!

pigsty-arch.jpg


Monitoring

The installed singleton meta node can be used as an admin node and monitoring center, to take more nodes & database servers under its surveillance & control.

If you want to install the Prometheus / Grafana observability stack, Pigsty delivers the best practice for you! It has fine-grained dashboards for nodes & PostgreSQL. No matter whether these nodes or PostgreSQL servers are managed by Pigsty or not, you can have production-grade monitoring & alerting immediately with simple configuration.

pigsty-dashboard.jpg


HA PG Cluster

With Pigsty, you can have your own local production-grade HA PostgreSQL RDS as much as you want.

And to create such an HA PostgreSQL cluster, all you have to do is describe it & run the playbook:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars: { pg_cluster: pg-test }
$ bin/pgsql-add pg-test  # init cluster 'pg-test'

Which will give you the following cluster, with monitoring, replicas, and backups all set up.

pigsty-ha.png

Hardware failures are covered by the self-healing HA architecture powered by patroni, etcd, and haproxy, which will perform auto failover within 30 seconds in case of leader failure. With self-healing traffic control powered by haproxy, clients may not even notice a failure at all in case of a switchover or replica failure.

Software failures, human errors, and DC failures are covered by pgbackrest and optional MinIO clusters, which give you the ability to perform point-in-time recovery to any point in time (as long as your storage capacity allows).


Database as Code

Pigsty follows IaC & GitOPS philosophy: Pigsty deployment is described by declarative Config Inventory and materialized with idempotent playbooks.

The user describes the desired status with Parameters in a declarative manner, and the playbooks tune target nodes into that status in an idempotent manner. It’s like Kubernetes CRD & Operator but works on Bare Metals & Virtual Machines.

pigsty-iac.jpg

Take the default config snippet as an example, which describes a node 10.10.10.10 with modules INFRA, NODE, ETCD, and PGSQL installed.

# infra cluster for proxy, monitor, alert, etc...
infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

# minio cluster, s3 compatible object storage
minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

# etcd cluster for ha postgres DCS
etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

# postgres example cluster: pg-meta
pg-meta: { hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }, vars: { pg_cluster: pg-meta } }

To materialize it, use the following playbooks:

./infra.yml -l infra    # init infra module on group 'infra'
./etcd.yml  -l etcd     # init etcd module on group 'etcd'
./minio.yml -l minio    # init minio module on group 'minio'
./pgsql.yml -l pg-meta  # init pgsql module on group 'pg-meta'

It would be straightforward to perform regular administration tasks. For example, if you wish to add a new replica/database/user to an existing HA PostgreSQL cluster, all you need to do is add a host in config & run that playbook on it, such as:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <-- add new instance
  vars: { pg_cluster: pg-test }
$ bin/pgsql-add  pg-test 10.10.10.13

You can even manage many PostgreSQL Entities using this approach: User/Role, Database, Service, HBA Rules, Extensions, Schemas, etc…
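
For instance, after declaring a new business user or database under the cluster definition, a sketch of applying just that change with the pgsql-user.yml / pgsql-db.yml playbooks listed earlier (dbuser_app and app are placeholder names; the username/dbname extra-vars follow those playbooks' conventions):

./pgsql-user.yml -l pg-test -e username=dbuser_app   # create/update the user defined in the inventory
./pgsql-db.yml   -l pg-test -e dbname=app            # create/update the database defined in the inventory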

Check PGSQL Conf for details.

3.2 - Cluster Model

Pigsty abstracts different types of functionalities into modules & clusters.

PGSQL for production environments is organized in clusters, which are logical entities consisting of a set of database instances associated by primary-replica replication. Each database cluster is an autonomous serving unit consisting of at least one database instance (primary).


ER Diagram

Let’s get started with ER diagram. There are four types of core entities in Pigsty’s PGSQL module:

  • PGSQL Cluster: An autonomous PostgreSQL business unit, used as the top-level namespace for other entities.
  • PGSQL Service: A named abstraction of cluster ability, route traffics, and expose postgres services with node ports.
  • PGSQL Instance: A single postgres server which is a group of running processes & database files on a single node.
  • PGSQL Node: An abstraction of hardware resources, which can be bare metal, virtual machine, or even k8s pods.

pigsty-er.jpg

Naming Convention

  • The cluster name should be a valid domain name, without any dot: [a-zA-Z0-9-]+
  • Service name should be prefixed with cluster name, and suffixed with a single word such as primary, replica, offline, delayed, joined by -.
  • Instance name is prefixed with cluster name and suffixed with an integer, joined by -, e.g., ${cluster}-${seq}.
  • Node is identified by its IP address, and its hostname is usually the same as the instance name since they are 1:1 deployed.

Identity Parameter

Pigsty uses identity parameters to identify entities: PG_ID.

In addition to the node IP address, three parameters: pg_cluster, pg_role, and pg_seq are the minimum set of parameters necessary to define a postgres cluster. Take the sandbox testing cluster pg-test as an example:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars:
    pg_cluster: pg-test

The three members of the cluster are identified as follows.

cluster seq role host / ip instance service nodename
pg-test 1 primary 10.10.10.11 pg-test-1 pg-test-primary pg-test-1
pg-test 2 replica 10.10.10.12 pg-test-2 pg-test-replica pg-test-2
pg-test 3 replica 10.10.10.13 pg-test-3 pg-test-replica pg-test-3

There are:

  • One Cluster: The cluster is named as pg-test.
  • Two Roles: primary and replica.
  • Three Instances: The cluster consists of three instances: pg-test-1, pg-test-2, pg-test-3.
  • Three Nodes: The cluster is deployed on three nodes: 10.10.10.11, 10.10.10.12, and 10.10.10.13.
  • Four services:

And in the monitoring system (Prometheus/Grafana/Loki), corresponding metrics will be labeled with these identities:

pg_up{cls="pg-meta", ins="pg-meta-1", ip="10.10.10.10", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-1", ip="10.10.10.11", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-2", ip="10.10.10.12", job="pgsql"}
pg_up{cls="pg-test", ins="pg-test-3", ip="10.10.10.13", job="pgsql"}

3.3 - Monitor System

The architecture and implementation of Pigsty’s monitoring system, the service discovery details

3.4 - Self-Signed CA

Pigsty comes with a set of self-signed CA PKI for issuing SSL certs to encrypt network traffic.

Pigsty has some security best practices: encrypting network traffic with SSL and encrypting the Web interface with HTTPS.

To achieve this, Pigsty comes with a built-in local self-signed Certificate Authority (CA) for issuing SSL certificates to encrypt network communication.

By default, SSL and HTTPS are enabled but not enforced. For environments with higher security requirements, you can enforce the use of SSL and HTTPS.


Local CA

Pigsty, by default, generates a self-signed CA in the Pigsty source code directory (~/pigsty) on the ADMIN node during initialization. This CA is used when SSL, HTTPS, digital signatures, issuing database client certificates, and advanced security features are needed.

Hence, each Pigsty deployment uses a unique CA, and CAs from different Pigsty deployments do not trust each other.

The local CA consists of two files, typically located in the files/pki/ca directory:

  • ca.crt: The self-signed root CA certificate, which should be distributed and installed on all managed nodes for certificate verification.
  • ca.key: The CA private key, used to issue certificates and verify CA identity. It should be securely stored to prevent leaks!

Using an Existing CA

If you already have a CA public and private key infrastructure, Pigsty can also be configured to use an existing CA.

Simply place your CA public and private key files in the files/pki/ca directory.

files/pki/ca/ca.key     # The essential CA private key file, must exist; if not, a new one will be randomly generated by default
files/pki/ca/ca.crt     # If a certificate file is absent, Pigsty will automatically generate a new root certificate file from the CA private key

When Pigsty executes the install.yml and infra.yml playbooks for installation, if the ca.key private key file is found in the files/pki/ca directory, the existing CA will be used. The ca.crt file can be generated from the ca.key private key, so if there is no certificate file, Pigsty will automatically generate a new root certificate file from the CA private key.
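
For example (the source paths below are placeholders for wherever your existing CA lives):

mkdir -p files/pki/ca
cp /path/to/your/ca.key files/pki/ca/ca.key   # required: your existing CA private key
cp /path/to/your/ca.crt files/pki/ca/ca.crt   # optional: regenerated from ca.key if absent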


Trust CA

During the Pigsty installation, ca.crt is distributed to all nodes under the /etc/pki/ca.crt path by the node_ca task in the node.yml playbook.

The default paths for trusted CA root certificates differ between EL family and Debian family operating systems, hence the distribution path and update methods also vary.

# EL family: update the trust store with update-ca-trust
rm -rf /etc/pki/ca-trust/source/anchors/ca.crt
ln -s /etc/pki/ca.crt /etc/pki/ca-trust/source/anchors/ca.crt
/bin/update-ca-trust

# Debian family: update the trust store with update-ca-certificates
rm -rf /usr/local/share/ca-certificates/ca.crt
ln -s /etc/pki/ca.crt /usr/local/share/ca-certificates/ca.crt
/usr/sbin/update-ca-certificates

By default, Pigsty will issue HTTPS certificates for domain names used by web systems on infrastructure nodes, allowing you to access Pigsty’s web systems via HTTPS. If you do not want your browser on the client computer to display “untrusted CA certificate” messages, you can distribute ca.crt to the trusted certificate directory on the client computer.

You can double-click the ca.crt file to add it to the system keychain. For example, on macOS, you need to open “Keychain Access”, search for pigsty-ca, and then “trust” this root certificate.
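
If you prefer the command line on macOS, something like the following should also work (security is the stock macOS keychain tool; adjust the path to wherever you copied ca.crt):

sudo security add-trusted-cert -d -r trustRoot -k /Library/Keychains/System.keychain ca.crt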



Check Cert

Use the following command to view the contents of the Pigsty CA certificate

openssl x509 -text -in /etc/pki/ca.crt
Local CA Root Cert Content
Certificate:
    Data:
        Version: 3 (0x2)
        Serial Number:
            50:29:e3:60:96:93:f4:85:14:fe:44:81:73:b5:e1:09:2a:a8:5c:0a
        Signature Algorithm: sha256WithRSAEncryption
        Issuer: O=pigsty, OU=ca, CN=pigsty-ca
        Validity
            Not Before: Feb  7 00:56:27 2023 GMT
            Not After : Jan 14 00:56:27 2123 GMT
        Subject: O=pigsty, OU=ca, CN=pigsty-ca
        Subject Public Key Info:
            Public Key Algorithm: rsaEncryption
                Public-Key: (4096 bit)
                Modulus:
                    00:c1:41:74:4f:28:c3:3c:2b:13:a2:37:05:87:31:
                    ....
                    e6:bd:69:a5:5b:e3:b4:c0:65:09:6e:84:14:e9:eb:
                    90:f7:61
                Exponent: 65537 (0x10001)
        X509v3 extensions:
            X509v3 Subject Alternative Name: 
                DNS:pigsty-ca
            X509v3 Key Usage: 
                Digital Signature, Certificate Sign, CRL Sign
            X509v3 Basic Constraints: critical
                CA:TRUE, pathlen:1
            X509v3 Subject Key Identifier: 
                C5:F6:23:CE:BA:F3:96:F6:4B:48:A5:B1:CD:D4:FA:2B:BD:6F:A6:9C
    Signature Algorithm: sha256WithRSAEncryption
    Signature Value:
        89:9d:21:35:59:6b:2c:9b:c7:6d:26:5b:a9:49:80:93:81:18:
        ....
        9e:dd:87:88:0d:c4:29:9e
-----BEGIN CERTIFICATE-----
...
cXyWAYcvfPae3YeIDcQpng==
-----END CERTIFICATE-----

Issue Database Client Certs

If you wish to authenticate via client certificates, you can manually issue PostgreSQL client certificates using the local CA and the cert.yml playbook.

Set the certificate’s CN field to the database username:

./cert.yml -e cn=dbuser_dba
./cert.yml -e cn=dbuser_monitor

The issued certificates will default to being generated in the files/pki/misc/<cn>.{key,crt} path.
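
A hedged example of connecting with the issued cert via standard libpq SSL parameters (the host and database are placeholders; sslrootcert/sslcert/sslkey are libpq options, and the key file may need 0600 permissions):

psql "host=10.10.10.10 dbname=meta user=dbuser_dba sslmode=verify-ca sslrootcert=files/pki/ca/ca.crt sslcert=files/pki/misc/dbuser_dba.crt sslkey=files/pki/misc/dbuser_dba.key"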

3.5 - Infra as Code

Pigsty treat infra & database as code. Manage them in a declarative manner

Infra as Code, Database as Code, Declarative API & Idempotent Playbooks, GitOPS works like a charm.

Pigsty provides a declarative interface: Describe everything in a config file, and Pigsty operates it to the desired state with idempotent playbooks. It works like Kubernetes CRDs & Operators but for databases and infrastructures on any nodes: bare metal or virtual machines.


Declare Module

Take the default config snippet as an example, which describes a node 10.10.10.10 with modules INFRA, NODE, ETCD, and PGSQL installed.

# infra cluster for proxy, monitor, alert, etc...
infra: { hosts: { 10.10.10.10: { infra_seq: 1 } } }

# minio cluster, s3 compatible object storage
minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

# etcd cluster for ha postgres DCS
etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

# postgres example cluster: pg-meta
pg-meta: { hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }, vars: { pg_cluster: pg-meta } }

To materialize it, use the following playbooks:

./infra.yml -l infra    # init infra module on node 10.10.10.10
./etcd.yml  -l etcd     # init etcd  module on node 10.10.10.10
./minio.yml -l minio    # init minio module on node 10.10.10.10
./pgsql.yml -l pg-meta  # init pgsql module on node 10.10.10.10

Declare Cluster

You can declare the PGSQL module on multiple nodes, and form a cluster.

For example, to create a three-node HA cluster based on streaming replication, just add the following definition to the all.children section of the config file pigsty.yml:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars: { pg_cluster: pg-test }

Then create the cluster with the pgsql.yml playbook. To add the new instance 10.10.10.13 to an existing pg-test cluster, use the wrapper script:

$ bin/pgsql-add  pg-test 10.10.10.13

pigsty-iac.jpg

You can deploy different kinds of instance roles such as primary, replica, offline, delayed, sync standby, and different kinds of clusters, such as standby clusters, Citus clusters, and even Redis / MinIO / Etcd clusters.
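For instance, here is a sketch (with illustrative IPs) of a standby cluster that follows the pg-test primary via pg_upstream, including an offline replica for slow queries:

pg-test2:                     # standby cluster of pg-test
  hosts:
    10.10.10.21: { pg_seq: 1, pg_role: primary , pg_upstream: 10.10.10.11 }  # standby leader, replicates from the pg-test primary
    10.10.10.22: { pg_seq: 2, pg_role: offline }                             # offline replica for ETL / interactive queries
  vars: { pg_cluster: pg-test2 }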


Declare Cluster Internal

Not only can you define clusters declaratively, but you can also specify the databases, users, services, and HBA rules within each cluster. For example, the following configuration deeply customizes the default single-node pg-meta cluster.

It declares six business databases and seven business users, adds an extra standby service (a synchronous replica serving reads with zero replication delay), defines additional pg_hba rules, an L2 VIP address pointing to the cluster’s primary, and a customized backup strategy:

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary , pg_offline_query: true } }
  vars:
    pg_cluster: pg-meta
    pg_databases:                       # define business databases on this cluster, array of database definition
      - name: meta                      # REQUIRED, `name` is the only mandatory field of a database definition
        baseline: cmdb.sql              # optional, database sql baseline path, (relative path among ansible search path, e.g files/)
        pgbouncer: true                 # optional, add this database to pgbouncer database list? true by default
        schemas: [pigsty]               # optional, additional schemas to be created, array of schema names
        extensions:                     # optional, additional extensions to be installed: array of `{name[,schema]}`
          - { name: postgis , schema: public }
          - { name: timescaledb }
        comment: pigsty meta database   # optional, comment string for this database
        owner: postgres                # optional, database owner, postgres by default
        template: template1            # optional, which template to use, template1 by default
        encoding: UTF8                 # optional, database encoding, UTF8 by default. (MUST same as template database)
        locale: C                      # optional, database locale, C by default.  (MUST same as template database)
        lc_collate: C                  # optional, database collate, C by default. (MUST same as template database)
        lc_ctype: C                    # optional, database ctype, C by default.   (MUST same as template database)
        tablespace: pg_default         # optional, default tablespace, 'pg_default' by default.
        allowconn: true                # optional, allow connection, true by default. false will disable connect at all
        revokeconn: false              # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
        register_datasource: true      # optional, register this database to grafana datasources? true by default
        connlimit: -1                  # optional, database connection limit, default -1 disable limit
        pool_auth_user: dbuser_meta    # optional, all connection to this pgbouncer database will be authenticated by this user
        pool_mode: transaction         # optional, pgbouncer pool mode at database level, default transaction
        pool_size: 64                  # optional, pgbouncer pool size at database level, default 64
        pool_size_reserve: 32          # optional, pgbouncer pool size reserve at database level, default 32
        pool_size_min: 0               # optional, pgbouncer pool size min at database level, default 0
        pool_max_db_conn: 100          # optional, max database connections at database level, default 100
      - { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }
      - { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
      - { name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong the api gateway database }
      - { name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
      - { name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database }
    pg_users:                           # define business users/roles on this cluster, array of user definition
      - name: dbuser_meta               # REQUIRED, `name` is the only mandatory field of a user definition
        password: DBUser.Meta           # optional, password, can be a scram-sha-256 hash string or plain text
        login: true                     # optional, can log in, true by default  (new biz ROLE should be false)
        superuser: false                # optional, is superuser? false by default
        createdb: false                 # optional, can create database? false by default
        createrole: false               # optional, can create role? false by default
        inherit: true                   # optional, can this role use inherited privileges? true by default
        replication: false              # optional, can this role do replication? false by default
        bypassrls: false                # optional, can this role bypass row level security? false by default
        pgbouncer: true                 # optional, add this user to pgbouncer user-list? false by default (production user should be true explicitly)
        connlimit: -1                   # optional, user connection limit, default -1 disable limit
        expire_in: 3650                 # optional, now + n days when this role is expired (OVERWRITE expire_at)
        expire_at: '2030-12-31'         # optional, YYYY-MM-DD 'timestamp' when this role is expired  (OVERWRITTEN by expire_in)
        comment: pigsty admin user      # optional, comment string for this user/role
        roles: [dbrole_admin]           # optional, belonged roles. default roles are: dbrole_{admin,readonly,readwrite,offline}
        parameters: {}                  # optional, role level parameters with `ALTER ROLE SET`
        pool_mode: transaction          # optional, pgbouncer pool mode at user level, transaction by default
        pool_connlimit: -1              # optional, max database connections at user level, default -1 disable limit
      - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly], comment: read-only viewer for meta database}
      - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database   }
      - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database  }
      - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway   }
      - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service      }
      - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service    }
    pg_services:                        # extra services in addition to pg_default_services, array of service definition
      # standby service will route {ip|name}:5435 to sync replica's pgbouncer (5435->6432 standby)
      - name: standby                   # required, service name, the actual svc name will be prefixed with `pg_cluster`, e.g: pg-meta-standby
        port: 5435                      # required, service exposed port (work as kubernetes service node port mode)
        ip: "*"                         # optional, service bind ip address, `*` for all ip by default
        selector: "[]"                  # required, service member selector, use JMESPath to filter inventory
        dest: default                   # optional, destination port, default|postgres|pgbouncer|<port_number>, 'default' by default
        check: /sync                    # optional, health check url path, / by default
        backup: "[? pg_role == `primary`]"  # backup server selector
        maxconn: 3000                   # optional, max allowed front-end connection
        balance: roundrobin             # optional, haproxy load balance algorithm (roundrobin by default, other: leastconn)
        options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
    pg_hba_rules:
      - {user: dbuser_view , db: all ,addr: infra ,auth: pwd ,title: 'allow grafana dashboard access cmdb from infra nodes'}
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.2/24
    pg_vip_interface: eth1
    node_crontab:  # make a full backup 1 am everyday
      - '00 01 * * * postgres /pg/bin/pg-backup full'

Declare Access Control

You can also deeply customize Pigsty’s access control capabilities through declarative configuration. For example, the following configuration file provides deep security customization for the pg-meta cluster:

  • Utilizes a three-node core cluster template: crit.yml, to ensure data consistency is prioritized, with zero data loss during failover.
  • Enables L2 VIP, and restricts the database and connection pool listening addresses to three specific addresses: local loopback IP, internal network IP, and VIP.
  • The template enforces SSL for the Patroni API and for Pgbouncer, and the HBA rules mandate SSL for accessing the database cluster.
  • Additionally, the $libdir/passwordcheck extension is enabled in pg_libs to enforce a password strength security policy.

Lastly, a separate pg-meta-delay cluster is declared as a delayed replica of pg-meta from one hour ago, for use in emergency data deletion recovery.

pg-meta:      # 3 instance postgres cluster `pg-meta`
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary }
    10.10.10.11: { pg_seq: 2, pg_role: replica }
    10.10.10.12: { pg_seq: 3, pg_role: replica , pg_offline_query: true }
  vars:
    pg_cluster: pg-meta
    pg_conf: crit.yml
    pg_users:
      - { name: dbuser_meta , password: DBUser.Meta   , pgbouncer: true , roles: [ dbrole_admin ] , comment: pigsty admin user }
      - { name: dbuser_view , password: DBUser.Viewer , pgbouncer: true , roles: [ dbrole_readonly ] , comment: read-only viewer for meta database }
    pg_databases:
      - {name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
    pg_default_service_dest: postgres
    pg_services:
      - { name: standby ,src_ip: "*" ,port: 5435 , dest: default ,selector: "[]" , backup: "[? pg_role == `primary`]" }
    pg_vip_enabled: true
    pg_vip_address: 10.10.10.2/24
    pg_vip_interface: eth1
    pg_listen: '${ip},${vip},${lo}'
    patroni_ssl_enabled: true
    pgbouncer_sslmode: require
    pgbackrest_method: minio
    pg_libs: 'timescaledb, $libdir/passwordcheck, pg_stat_statements, auto_explain' # add passwordcheck extension to enforce strong password
    pg_default_roles:                 # default roles and users in postgres cluster
      - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
      - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
      - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly]               ,comment: role for global read-write access }
      - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite]  ,comment: role for object creation }
      - { name: postgres     ,superuser: true  ,expire_in: 7300                        ,comment: system superuser }
      - { name: replicator ,replication: true  ,expire_in: 7300 ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
      - { name: dbuser_dba   ,superuser: true  ,expire_in: 7300 ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
      - { name: dbuser_monitor ,roles: [pg_monitor] ,expire_in: 7300 ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }
    pg_default_hba_rules:             # postgres host-based auth rules by default
      - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
      - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
      - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: ssl   ,title: 'replicator replication from localhost'}
      - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: ssl   ,title: 'replicator replication from intranet' }
      - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: ssl   ,title: 'replicator postgres db from intranet' }
      - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
      - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: ssl   ,title: 'monitor from infra host with password'}
      - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
      - {user: '${admin}'   ,db: all         ,addr: world     ,auth: cert  ,title: 'admin @ everywhere with ssl & cert'   }
      - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: ssl   ,title: 'pgbouncer read/write via local socket'}
      - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: ssl   ,title: 'read/write biz user via password'     }
      - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: ssl   ,title: 'allow etl offline tasks from intranet'}
    pgb_default_hba_rules:            # pgbouncer host-based authentication rules
      - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
      - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
      - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: ssl   ,title: 'monitor access via intranet with pwd' }
      - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
      - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: ssl   ,title: 'admin access via intranet with pwd'   }
      - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
      - {user: 'all'        ,db: all         ,addr: intra     ,auth: ssl   ,title: 'allow all user intra access with pwd' }

# OPTIONAL delayed cluster for pg-meta
pg-meta-delay:                    # delayed instance for pg-meta (1 hour ago)
  hosts: { 10.10.10.13: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.10, pg_delay: 1h } }
  vars: { pg_cluster: pg-meta-delay }

Citus Distributed Cluster

Example: Citus Distributed Cluster: 5 Nodes
all:
  children:
    pg-citus0: # citus coordinator, pg_group = 0
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus0 , pg_group: 0 }
    pg-citus1: # citus data node 1
      hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus1 , pg_group: 1 }
    pg-citus2: # citus data node 2
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus2 , pg_group: 2 }
    pg-citus3: # citus data node 3, with an extra replica
      hosts:
        10.10.10.13: { pg_seq: 1, pg_role: primary }
        10.10.10.14: { pg_seq: 2, pg_role: replica }
      vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  vars:                               # global parameters for all citus clusters
    pg_mode: citus                    # pgsql cluster mode: citus
    pg_shard: pg-citus                # citus shard name: pg-citus
    patroni_citus_db: meta            # citus distributed database name
    pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
    pg_libs: 'citus, timescaledb, pg_stat_statements, auto_explain' # citus will be added by patroni automatically
    pg_extensions:
      - postgis34_${ pg_version }* timescaledb-2-postgresql-${ pg_version }* pgvector_${ pg_version }* citus_${ pg_version }*
    pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra        ,auth: ssl ,title: 'all user ssl access from intranet'  }

Redis Clusters

Example: Redis Cluster/Sentinel/Standalone
redis-ms: # redis classic primary & replica
  hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
  vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

redis-meta: # redis sentinel x 3
  hosts: { 10.10.10.11: { redis_node: 1 , redis_instances: { 26379: { } ,26380: { } ,26381: { } } } }
  vars:
    redis_cluster: redis-meta
    redis_password: 'redis.meta'
    redis_mode: sentinel
    redis_max_memory: 16MB
    redis_sentinel_monitor: # primary list for redis sentinel, use cls as name, primary ip:port
      - { name: redis-ms, host: 10.10.10.10, port: 6379 ,password: redis.ms, quorum: 2 }

redis-test: # redis native cluster: 3m x 3s
  hosts:
    10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
    10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
  vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }

Etcd Cluster

Example: ETCD 3 Node Cluster
etcd: # dcs service for postgres/patroni ha consensus
  hosts:  # 1 node for testing, 3 or 5 for production
    10.10.10.10: { etcd_seq: 1 }  # etcd_seq required
    10.10.10.11: { etcd_seq: 2 }  # assign from 1 ~ n
    10.10.10.12: { etcd_seq: 3 }  # odd number please
  vars: # cluster level parameter override roles/etcd
    etcd_cluster: etcd  # mark etcd cluster name etcd
    etcd_safeguard: false # safeguard against purging
    etcd_clean: true # purge etcd during init process

MinIO Cluster

Example: Minio 3 Node Deployment
minio:
  hosts:
    10.10.10.10: { minio_seq: 1 }
    10.10.10.11: { minio_seq: 2 }
    10.10.10.12: { minio_seq: 3 }
  vars:
    minio_cluster: minio
    minio_data: '/data{1...2}'        # use two disk per node
    minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node name pattern
    haproxy_services:
      - name: minio                     # [REQUIRED] service name, unique
        port: 9002                      # [REQUIRED] service port, unique
        options:
          - option httpchk
          - option http-keep-alive
          - http-check send meth OPTIONS uri /minio/health/live
          - http-check expect status 200
        servers:
          - { name: minio-1 ,ip: 10.10.10.10 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-2 ,ip: 10.10.10.11 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-3 ,ip: 10.10.10.12 , port: 9000 , options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }

3.6 - High Availability

Pigsty uses Patroni to achieve high availability for PostgreSQL, ensuring automatic failover.

Overview

Primary failure RTO ≈ 30s, RPO < 1MB; replica failure RTO ≈ 0 (resets current connections)

Pigsty’s PostgreSQL cluster has battery-included high availability powered by Patroni, Etcd, and HAProxy.

When you have two or more instances in the PostgreSQL cluster, it can self-heal from hardware failures without any further configuration: as long as any instance within the cluster survives, the cluster can still serve. Clients simply need to connect to any node in the cluster to obtain the full service, without worrying about replication topology changes.

By default, the recovery time objective (RTO) for primary failure is approximately 30s ~ 60s, and the recovery point objective (RPO) is < 1MB; for replica failure, RPO = 0 and RTO ≈ 0 (instantaneous). In consistency-first mode, zero data loss during failover is guaranteed: RPO = 0. These metrics can be tuned based on your actual hardware conditions and reliability requirements.

Pigsty incorporates an HAProxy load balancer for automatic traffic switching, offering multiple access methods for clients such as DNS/VIP/LVS. Failovers and switchovers are almost imperceptible to the business side apart from sporadic interruptions, meaning applications do not need connection string modifications or restarts.

pigsty-ha

What problems does High-Availability solve?

  • Elevates the Availability aspect of data safety (C/I/A) to a new height: RPO ≈ 0, RTO < 30s.
  • Enables seamless rolling maintenance capabilities, minimizing maintenance window requirements for great convenience.
  • Hardware failures can self-heal immediately without human intervention, allowing operations DBAs to sleep soundly.
  • Standbys can carry read-only requests, sharing the load with the primary to make full use of resources.

What are the costs of High Availability?

  • Infrastructure dependency: High availability relies on DCS (etcd/zk/consul) for consensus.
  • Increased entry barrier: A meaningful high-availability deployment environment requires at least three nodes.
  • Additional resource consumption: Each new standby consumes additional resources, which isn’t a major issue.
  • Significantly higher complexity costs: Backup costs significantly increase, requiring tools to manage complexity.

Limitations of High Availability

Since replication is real-time, all changes are immediately applied to the standby. Thus, high-availability solutions based on streaming replication cannot address data loss caused by human errors and software defects (e.g., DROP TABLE or DELETE). Such failures require Delayed Clusters or Point-In-Time Recovery using previous base backups and WAL archives.

| Strategy | RTO (Time to Recover) | RPO (Max Data Loss) |
|----------|-----------------------|---------------------|
| Standalone + Do Nothing | Permanent data loss, irrecoverable | Total data loss |
| Standalone + Basic Backup | Depends on backup size and bandwidth (hours) | Loss of data since last backup (hours to days) |
| Standalone + Basic Backup + WAL Archiving | Depends on backup size and bandwidth (hours) | Loss of last unarchived data (tens of MB) |
| Primary-Replica + Manual Failover | Dozens of minutes | Replication Lag (about 100KB) |
| Primary-Replica + Auto Failover | Within a minute | Replication Lag (about 100KB) |
| Primary-Replica + Auto Failover + Synchronous Commit | Within a minute | No data loss |

Implementation

In Pigsty, the high-availability architecture works as follows:

  • PostgreSQL uses standard streaming replication to set up physical standby databases. In case of a primary database failure, the standby takes over.
  • Patroni is responsible for managing PostgreSQL server processes and handles high-availability-related matters.
  • Etcd provides Distributed Configuration Store (DCS) capabilities and is used for leader election after a failure.
  • Patroni relies on Etcd to reach a consensus on cluster leadership and offers a health check interface to the outside.
  • HAProxy exposes cluster services externally and utilizes the Patroni health check interface to automatically route traffic to healthy nodes.
  • vip-manager offers an optional layer 2 VIP, retrieves leader information from Etcd, and binds the VIP to the node hosting the primary database.

Upon primary database failure, a new round of leader election is triggered. The healthiest standby in the cluster (with the highest LSN and least data loss) wins and is promoted to the new primary. After the promotion of the winning standby, read-write traffic is immediately routed to the new primary. The impact of a primary failure is temporary unavailability of write services: from the primary’s failure to the promotion of a new primary, write requests will be blocked or directly fail, typically lasting 15 to 30 seconds, usually not exceeding 1 minute.

When a standby fails, read-only traffic is routed to other standbys. If all standbys fail, the primary will eventually carry the read-only traffic. The impact of a standby failure is partial read-only query interruption: queries currently running on the failed standby will be aborted due to connection reset and immediately taken over by another available standby.

Failure detection is jointly performed by Patroni and Etcd. The cluster leader holds a lease; if the leader fails to renew the lease in time (10s) due to a failure, the lease is released, triggering a failover and a new round of cluster elections.

Even without any failures, you can still proactively perform a Switchover to change the primary of the cluster. In this case, write queries on the primary will be interrupted and immediately routed to the new primary for execution. This operation can typically be used for rolling maintenance/upgrades of the database server.
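For example, a hedged sketch of a planned switchover using patronictl (Pigsty wraps it behind the pg alias for the database superuser; the cluster name is illustrative):

patronictl list pg-test          # inspect cluster topology and replication status first
patronictl switchover pg-test    # interactively pick the new leader and confirm the switchover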


Trade Offs

The TTL can be tuned with pg_rto, which is 30s by default: increasing it leads to a longer failover wait time, while decreasing it raises the false-positive failover rate (e.g., due to network jitter).

Pigsty uses availability-first mode by default, which means that when the primary fails, it will try to fail over as soon as possible; data not yet replicated to the replica may be lost (usually ~100KB), and the maximum potential data loss is capped by pg_rpo, which is 1MB by default.

Recovery Time Objective (RTO) and Recovery Point Objective (RPO) are two parameters that need careful consideration when designing a high-availability cluster.

The default values of RTO and RPO used by Pigsty meet the reliability requirements for most scenarios. You can adjust them based on your hardware level, network quality, and business needs.

The maximum duration of unavailability during a failover is controlled by the pg_rto parameter, with a default value of 30s. Increasing it will lead to a longer duration of unavailability for write operations during primary failover, while decreasing it will increase the rate of false failovers (e.g., due to brief network jitters).

The upper limit of potential data loss is controlled by the pg_rpo parameter, defaulting to 1MB. Lowering this value can reduce the upper limit of data loss during failovers but also increases the likelihood of refusing automatic failovers due to insufficiently healthy standbys (too far behind).

Pigsty defaults to an availability-first mode, meaning that it will proceed with a failover as quickly as possible when the primary fails, and data not yet replicated to the standby might be lost (on a regular ten-gigabit network, replication lag is usually between a few KB and 100KB).

If you need zero data loss during failovers, you can use the crit.yml template, at the cost of some performance.


Parameters

pg_rto

name: pg_rto, type: int, level: C

Recovery time objective in seconds. This is used as the Patroni TTL value, 30s by default.

If a primary instance goes missing for longer than this, a new leader election is triggered.

Decreasing this value reduces the window of write unavailability during failover, but makes the cluster more sensitive to network jitter, thus increasing the chance of false-positive failover.

Configure it according to your network conditions and expectations to trade off between failover chance and impact. The default value is 30s, and it is populated into the following Patroni parameters:

# the TTL to acquire the leader lock (in seconds). Think of it as the length of time before initiation of the automatic failover process. Default value: 30
ttl: {{ pg_rto }}

# the number of seconds the loop will sleep. Default value: 10 , this is patroni check loop interval
loop_wait: {{ (pg_rto / 3)|round(0, 'ceil')|int }}

# timeout for DCS and PostgreSQL operation retries (in seconds). DCS or network issues shorter than this will not cause Patroni to demote the leader. Default value: 10
retry_timeout: {{ (pg_rto / 3)|round(0, 'ceil')|int }}

# the amount of time a primary is allowed to recover from failures before failover is triggered (in seconds), Max RTO: 2 loop wait + primary_start_timeout
primary_start_timeout: {{ (pg_rto / 3)|round(0, 'ceil')|int }}
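For instance, with the default pg_rto = 30, the derived Patroni settings work out as:

ttl                   = 30
loop_wait             = ceil(30 / 3) = 10
retry_timeout         = ceil(30 / 3) = 10
primary_start_timeout = ceil(30 / 3) = 10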

pg_rpo

name: pg_rpo, type: int, level: C

Recovery point objective in bytes, 1MiB by default.

The default value is 1048576, which tolerates at most 1MiB of data loss during failover.

When the primary is down and all replicas are lagging, you have to make a tough choice between Availability and Consistency:

  • Promote a replica to be the new primary and bring the system back online ASAP, at the price of an acceptable data loss (e.g., less than 1MB).
  • Wait for the primary to come back (which may never happen) or for human intervention, to avoid any data loss.

You can use the crit.yml template to ensure no data loss during failover, but it will sacrifice some performance.
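As an illustrative sketch (values chosen for demonstration, not as recommendations), both knobs can be set at the cluster level:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
  vars:
    pg_cluster: pg-test
    pg_rto: 60          # tolerate up to ~60s of write unavailability to reduce false positives on a jittery network
    pg_rpo: 16777216    # accept up to 16MiB of potential data loss before refusing automatic failover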

3.7 - Point-in-Time Recovery

Pigsty utilizes pgBackRest for PostgreSQL point-in-time recovery, allowing users to roll back to any point within the backup policy limits.

Overview

You can roll back your cluster to any point in time, avoiding data loss caused by software defects and human errors.

Pigsty’s PostgreSQL clusters come with an automatically configured Point in Time Recovery (PITR) solution, based on the backup component pgBackRest and the optional object storage repository MinIO.

High Availability solutions can address hardware failures, but they are powerless against data loss caused by software defects and human errors, such as accidentally deleting or overwriting data. For such scenarios, Pigsty offers an out-of-the-box Point in Time Recovery (PITR) capability, enabled by default without any additional configuration.

Pigsty provides you with the default configuration for base backups and WAL archiving, allowing you to use local directories and disks, or dedicated MinIO clusters or S3 object storage services to store backups and achieve off-site disaster recovery. When using local disks, by default, you retain the ability to recover to any point in time within the past day. When using MinIO or S3, by default, you retain the ability to recover to any point in time within the past week. As long as storage space permits, you can keep a recoverable time span as long as desired, based on your budget.


What problems does Point in Time Recovery (PITR) solve?

  • Enhanced disaster recovery capability: RPO reduces from ∞ to a few MBs, RTO from ∞ to a few hours/minutes.
  • Ensures data security: Data Integrity among C/I/A: avoids data consistency issues caused by accidental deletions.
  • Ensures data security: Data Availability among C/I/A: provides a safety net for “permanently unavailable” disasters.

| Singleton Strategy | Event | RTO | RPO |
|--------------------|-------|-----|-----|
| Do nothing | Crash | Permanently lost | All lost |
| Basic backup | Crash | Depends on backup size and bandwidth (a few hours) | Loss of data after the last backup (a few hours to days) |
| Basic backup + WAL Archiving | Crash | Depends on backup size and bandwidth (a few hours) | Loss of data not yet archived (a few dozen MBs) |

What are the costs of Point in Time Recovery?

  • Reduced Confidentiality (the C in C/I/A): backups create additional leakage points and require extra protection.
  • Additional resource consumption: local storage or network traffic/bandwidth costs, usually not a problem.
  • Increased complexity cost: users need to invest in backup management.

Limitations of Point in Time Recovery

If PITR is the only method for fault recovery, the RTO and RPO metrics are inferior compared to High Availability solutions, and it’s usually best to use both in combination.

  • RTO: With only a single machine + PITR, recovery time depends on backup size and network/disk bandwidth, ranging from tens of minutes to several hours or days.
  • RPO: With only a single machine + PITR, a crash might result in the loss of a small amount of data, as one or several WAL log segments might not yet be archived, losing between 16 MB to several dozen MBs of data.

Apart from PITR, you can also use Delayed Clusters in Pigsty to address data deletion or alteration issues caused by human errors or software defects.


How does PITR work?

Point in Time Recovery allows you to roll back your cluster to any “specific moment” in the past, avoiding data loss caused by software defects and human errors. To achieve this, two key preparations are necessary: Base Backups and WAL Archiving. Having a Base Backup allows users to restore the database to the state at the time of the backup, while having WAL Archiving from a certain base backup enables users to restore the database to any point in time after the base backup.

fig-10-02.png

For a detailed principle, refer to: Base Backups and Point in Time Recovery; for specific operations, refer to PGSQL Management: Backup and Restore.

Base Backups

Pigsty uses pgBackRest to manage PostgreSQL backups. pgBackRest will initialize an empty repository on all cluster instances, but it will only use the repository on the primary instance.

pgBackRest supports three backup modes: Full Backup, Incremental Backup, and Differential Backup, with the first two being the most commonly used. A Full Backup takes a complete physical snapshot of the database cluster at a current moment, while an Incremental Backup records the differences between the current database cluster and the last full backup.

Pigsty provides a wrapper command for backups: /pg/bin/pg-backup [full|incr]. You can make base backups periodically as needed through Crontab or any other task scheduling system.
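For example, a sketch of a weekly-full / daily-incremental schedule via node_crontab (illustrative times; adjust to your backup window):

node_crontab:
  - '00 01 * * 1 postgres /pg/bin/pg-backup full'            # full backup at 01:00 every Monday
  - '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup incr'  # incremental backup at 01:00 on other days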

WAL Archiving

By default, Pigsty enables WAL archiving on the primary instance of the cluster and continuously pushes WAL segment files to the backup repository using the pgbackrest command-line tool.

pgBackRest automatically manages the required WAL files and promptly cleans up expired backups and their corresponding WAL archive files according to the backup retention policy.

If you do not need PITR, you can disable WAL archiving by configuring the cluster with archive_mode: off, and remove the backup entries from node_crontab to stop periodic backup tasks.


Implementation

Pigsty provides two preset backup strategies: the default uses a local filesystem backup repository and takes a full backup every day, so you can roll back to any point within the past day; the alternative uses a dedicated MinIO cluster or S3 storage, with a full backup on Monday and incremental backups on the other days, keeping two weeks of backups and WAL archives by default.

Pigsty uses pgBackRest to manage backups, receive WAL archives, and perform PITR. The backup repository can be flexibly configured (pgbackrest_repo): by default, it uses the local filesystem (local) of the primary instance, but it can also use other disk paths, or the optional MinIO service (minio) and cloud-based S3 services.

pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repo with local posix fs
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retention full backups by count
    retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
  minio:                          # optional minio repo for pgbackrest
    type: s3                      # minio is s3-compatible, so s3 is used
    s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    s3_uri_style: path            # use path style uri for minio rather than host style
    path: /pgbackrest             # minio backup path, default is `/pgbackrest`
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    retention_full_type: time     # retention full backup by time on minio repo
    retention_full: 14            # keep full backup for last 14 days

Pigsty has two built-in backup options: local file system repository with daily full backups or dedicated MinIO/S3 storage with weekly full and daily incremental backups, retaining two weeks’ worth by default.

The target repositories in Pigsty parameter pgbackrest_repo are translated into repository definitions in the /etc/pgbackrest/pgbackrest.conf configuration file. For example, if you define a West US region S3 repository for cold backups, you could use the following reference configuration.

s3:    # ------> /etc/pgbackrest/pgbackrest.conf
  repo1-type: s3                                   # ----> repo1-type=s3
  repo1-s3-region: us-west-1                       # ----> repo1-s3-region=us-west-1
  repo1-s3-endpoint: s3-us-west-1.amazonaws.com    # ----> repo1-s3-endpoint=s3-us-west-1.amazonaws.com
  repo1-s3-key: '<your_access_key>'                # ----> repo1-s3-key=<your_access_key>
  repo1-s3-key-secret: '<your_secret_key>'         # ----> repo1-s3-key-secret=<your_secret_key>
  repo1-s3-bucket: pgsql                           # ----> repo1-s3-bucket=pgsql
  repo1-s3-uri-style: host                         # ----> repo1-s3-uri-style=host
  repo1-path: /pgbackrest                          # ----> repo1-path=/pgbackrest
  repo1-bundle: y                                  # ----> repo1-bundle=y
  repo1-cipher-type: aes-256-cbc                   # ----> repo1-cipher-type=aes-256-cbc
  repo1-cipher-pass: pgBackRest                    # ----> repo1-cipher-pass=pgBackRest
  repo1-retention-full-type: time                  # ----> repo1-retention-full-type=time
  repo1-retention-full: 90                         # ----> repo1-retention-full=90

Recovery

You can use the following encapsulated commands for Point in Time Recovery of the PostgreSQL database cluster.

By default, Pigsty uses incremental, differential, parallel recovery, allowing you to restore to a specified point in time as quickly as possible.

pg-pitr                                 # restore to wal archive stream end (e.g. used in case of entire DC failure)
pg-pitr -i                              # restore to the time of latest backup complete (not often used)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (in case of drop db, drop table)
pg-pitr --name="my-restore-point"       # restore TO a named restore point create by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore right BEFORE a LSN
pg-pitr --xid="1234567" -X -P           # restore right BEFORE a specific transaction id, then promote
pg-pitr --backup=latest                 # restore to latest backup set
pg-pitr --backup=20221108-105325        # restore to a specific backup set, which can be checked with pgbackrest info

pg-pitr                                 # pgbackrest --stanza=pg-meta restore
pg-pitr -i                              # pgbackrest --stanza=pg-meta --type=immediate restore
pg-pitr -t "2022-12-30 14:44:44+08"     # pgbackrest --stanza=pg-meta --type=time --target="2022-12-30 14:44:44+08" restore
pg-pitr -n "my-restore-point"           # pgbackrest --stanza=pg-meta --type=name --target=my-restore-point restore
pg-pitr -b 20221108-105325F             # pgbackrest --stanza=pg-meta --type=immediate --set=20221108-105325F restore
pg-pitr -l "0/7C82CB8" -X               # pgbackrest --stanza=pg-meta --type=lsn --target="0/7C82CB8" --target-exclusive restore
pg-pitr -x 1234567 -X -P                # pgbackrest --stanza=pg-meta --type=xid --target="1234567" --target-exclusive --target-action=promote restore

During PITR, you can observe the LSN point status of the cluster using the Pigsty monitoring system to determine if it has successfully restored to the specified time point, transaction point, LSN point, or other points.

pitr

3.8 - Services & Access

Pigsty employs HAProxy for service access, offering optional pgBouncer for connection pooling, and optional L2 VIP and DNS access.

Split read & write, route traffic to the right place, and achieve stable & reliable access to the PostgreSQL cluster.

Service is an abstraction to seal the details of the underlying cluster, especially during cluster failover/switchover.


Personal User

Service is meaningless to personal users. You can access the database with raw IP address or whatever method you like.

psql postgres://dbuser_dba:DBUser.DBA@10.10.10.10/meta      # dbsu direct connect
psql postgres://dbuser_meta:DBUser.Meta@10.10.10.10/meta    # default business admin user
psql postgres://dbuser_view:DBUser.View@pg-meta/meta       # default read-only user

Service Overview

We utilize a PostgreSQL database cluster based on replication in real-world production environments. Within the cluster, only one instance is the leader (primary) that can accept writes. Other instances (replicas) continuously fetch WAL from the leader to stay synchronized. Additionally, replicas can handle read-only queries and offload the primary in read-heavy, write-light scenarios. Thus, distinguishing between write and read-only requests is a common practice.

Moreover, we pool requests through a connection pooling middleware (Pgbouncer) for high-frequency, short-lived connections to reduce the overhead of connection and backend process creation. And, for scenarios like ETL and change execution, we need to bypass the connection pool and directly access the database servers. Furthermore, high-availability clusters may undergo failover during failures, causing a change in the cluster leadership. Therefore, the RW requests should be re-routed automatically to the new leader.

These varied requirements (read-write separation, pooling vs. direct connection, and client request failover) have led to the abstraction of the service concept.

Typically, a database cluster must provide this basic service:

  • Read-write service (primary): Can read and write to the database.

For production database clusters, at least these two services should be provided:

  • Read-write service (primary): writes data, carried only by the primary.
  • Read-only service (replica): reads data, carried by replicas, falling back to the primary if no replicas are available.

Additionally, there might be other services, such as:

  • Direct access service (default): Allows (admin) users to bypass the connection pool and directly access the database.
  • Offline replica service (offline): A dedicated replica that doesn’t handle online read traffic, used for ETL and analytical queries.
  • Synchronous replica service (standby): A read-only service with no replication delay, handled by synchronous standby/primary for read queries.
  • Delayed replica service (delayed): Accesses older data from the same cluster from a certain time ago, handled by delayed replicas.

Default Service

Pigsty will enable four default services for each PostgreSQL cluster:

| service | port | description |
|---------|------|-------------|
| primary | 5433 | pgbouncer read/write, connect to primary 5432 or 6432 |
| replica | 5434 | pgbouncer read-only, connect to replicas 5432/6432 |
| default | 5436 | admin or direct access to primary |
| offline | 5438 | OLAP, ETL, personal user, interactive queries |

Take the default pg-meta cluster as an example, you can access these services in the following ways:

psql postgres://dbuser_meta:DBUser.Meta@pg-meta:5433/meta   # pg-meta-primary : production read/write via primary pgbouncer(6432)
psql postgres://dbuser_meta:DBUser.Meta@pg-meta:5434/meta   # pg-meta-replica : production read-only via replica pgbouncer(6432)
psql postgres://dbuser_dba:DBUser.DBA@pg-meta:5436/meta     # pg-meta-default : Direct connect primary via primary postgres(5432)
psql postgres://dbuser_stats:DBUser.Stats@pg-meta:5438/meta # pg-meta-offline : Direct connect offline via offline postgres(5432)

pigsty-ha.png

Here the pg-meta domain name points to the cluster’s L2 VIP, which in turn points to the haproxy load balancer on the primary instance. It is responsible for routing traffic to different instances; check Access Services for details.


Primary Service

The primary service may be the most critical service for production usage.

It will route traffic to the primary instance, depending on pg_default_service_dest:

  • pgbouncer: route traffic to primary pgbouncer port (6432), which is the default behavior
  • postgres: route traffic to primary postgres port (5432) directly if you don’t want to use pgbouncer
- { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }

It means all cluster members are included in the primary service (selector: "[]"), but only the single instance that passes the health check (check: /primary) is used as the primary. Patroni guarantees that only one instance is primary at any time, so the primary service always routes traffic to THE primary instance.

Example: pg-test-primary haproxy config
listen pg-test-primary
    bind *:5433
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /primary
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-1 10.10.10.11:6432 check port 8008 weight 100
    server pg-test-3 10.10.10.13:6432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:6432 check port 8008 weight 100

Replica Service

The replica service is used for production read-only traffic.

There may be many more read-only queries than read-write queries in real-world scenarios. You may have many replicas.

The replica service will route traffic to Pgbouncer or postgres depending on pg_default_service_dest, just like the primary service.

- { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }

The replica service traffic will try to use common pg instances with pg_role = replica to alleviate the load on the primary instance as much as possible. It will try NOT to use instances with pg_role = offline to avoid mixing OLAP & OLTP queries as much as possible.

All cluster members are included in the replica service (selector: "[]") as long as they pass the read-only health check (check: /read-only). Primary and offline instances are used as backup servers, which take over only when all replica instances are down.

Example: pg-test-replica haproxy config
listen pg-test-replica
    bind *:5434
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /read-only
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-1 10.10.10.11:6432 check port 8008 weight 100 backup
    server pg-test-3 10.10.10.13:6432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:6432 check port 8008 weight 100

Default Service

The default service will route to primary postgres (5432) by default.

It is quite like the primary service, except that it always bypasses pgbouncer, regardless of pg_default_service_dest. This is useful for administrative connections, ETL writes, CDC (change data capture), etc.

- { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
Example: pg-test-default haproxy config
listen pg-test-default
    bind *:5436
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /primary
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-1 10.10.10.11:5432 check port 8008 weight 100
    server pg-test-3 10.10.10.13:5432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:5432 check port 8008 weight 100

Offline Service

The offline service routes traffic directly to a dedicated postgres instance.

That could be an instance with pg_role = offline, or an instance flagged with pg_offline_query.

If no such instance is found, it falls back to any replica instance. The bottom line is: it will never route traffic to the primary instance.

- { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}
listen pg-test-offline
    bind *:5438
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /replica
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-3 10.10.10.13:5432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:5432 check port 8008 weight 100 backup

Access Service

Pigsty exposes services with haproxy, which is enabled on all nodes by default.

HAProxy load balancers are idempotent within the same PG cluster: you can use ANY or ALL of them interchangeably.

The typical method is to access via the cluster domain name, which resolves to the cluster L2 VIP, or to all instance IP addresses in a round-robin manner.

Services can be implemented in different ways; you can even implement your own access method, such as L4 LVS, F5, etc., instead of haproxy.

pigsty-access.jpg

You can use different combinations of host & port; they provide PostgreSQL services in different ways.

Host

| type | sample | description |
|------|--------|-------------|
| Cluster Domain Name | pg-test | via cluster domain name (resolved by dnsmasq @ infra nodes) |
| Cluster VIP Address | 10.10.10.3 | via a L2 VIP address managed by vip-manager, bind to primary |
| Instance Hostname | pg-test-1 | access via any instance hostname (resolved by dnsmasq @ infra nodes) |
| Instance IP Address | 10.10.10.11 | access any instance ip address |

Port

Pigsty uses different ports to distinguish between pg services:

| port | service | type | description |
|------|---------|------|-------------|
| 5432 | postgres | database | Direct access to postgres server |
| 6432 | pgbouncer | middleware | Go through connection pool middleware before postgres |
| 5433 | primary | service | Access primary pgbouncer (or postgres) |
| 5434 | replica | service | Access replica pgbouncer (or postgres) |
| 5436 | default | service | Access primary postgres |
| 5438 | offline | service | Access offline postgres |

Combinations

# Access via cluster domain
postgres://test@pg-test:5432/test # DNS -> L2 VIP -> primary direct connection
postgres://test@pg-test:6432/test # DNS -> L2 VIP -> primary connection pool -> primary
postgres://test@pg-test:5433/test # DNS -> L2 VIP -> HAProxy -> Primary Connection Pool -> Primary
postgres://test@pg-test:5434/test # DNS -> L2 VIP -> HAProxy -> Replica Connection Pool -> Replica
postgres://dbuser_dba@pg-test:5436/test # DNS -> L2 VIP -> HAProxy -> Primary direct connection (for Admin)
postgres://dbuser_stats@pg-test:5438/test # DNS -> L2 VIP -> HAProxy -> offline direct connection (for ETL/personal queries)

# Direct access via cluster VIP
postgres://test@10.10.10.3:5432/test # L2 VIP -> Primary direct access
postgres://test@10.10.10.3:6432/test # L2 VIP -> Primary Connection Pool -> Primary
postgres://test@10.10.10.3:5433/test # L2 VIP -> HAProxy -> Primary Connection Pool -> Primary
postgres://test@10.10.10.3:5434/test # L2 VIP -> HAProxy -> Replica Connection Pool -> Replica
postgres://dbuser_dba@10.10.10.3:5436/test # L2 VIP -> HAProxy -> Primary direct connection (for Admin)
postgres://dbuser_stats@10.10.10.3:5438/test # L2 VIP -> HAProxy -> offline direct connect (for ETL/personal queries)

# Specify any cluster instance name directly
postgres://test@pg-test-1:5432/test # DNS -> Database Instance Direct Connect (singleton access)
postgres://test@pg-test-1:6432/test # DNS -> connection pool -> database
postgres://test@pg-test-1:5433/test # DNS -> HAProxy -> connection pool -> database read/write
postgres://test@pg-test-1:5434/test # DNS -> HAProxy -> connection pool -> database read-only
postgres://dbuser_dba@pg-test-1:5436/test # DNS -> HAProxy -> database direct connect
postgres://dbuser_stats@pg-test-1:5438/test # DNS -> HAProxy -> database offline read/write

# Directly specify any cluster instance IP access
postgres://test@10.10.10.11:5432/test # Database instance direct connection (directly specify instance, no automatic traffic distribution)
postgres://test@10.10.10.11:6432/test # Connection Pool -> Database
postgres://test@10.10.10.11:5433/test # HAProxy -> connection pool -> database read/write
postgres://test@10.10.10.11:5434/test # HAProxy -> connection pool -> database read-only
postgres://dbuser_dba@10.10.10.11:5436/test # HAProxy -> Database Direct Connections
postgres://dbuser_stats@10.10.10.11:5438/test # HAProxy -> database offline read-write

# Smart client automatic read/write separation (connection pooling)
postgres://test@10.10.10.11:6432,10.10.10.12:6432,10.10.10.13:6432/test?target_session_attrs=primary
postgres://test@10.10.10.11:6432,10.10.10.12:6432,10.10.10.13:6432/test?target_session_attrs=prefer-standby

3.9 - Access Control

Pigsty follows moderate security best practices: passwords & certs, built-in ACLs, encrypted network traffic, and cold backups.

Pigsty has a battery-included access control model based on Role System and Privileges.


Role System

Pigsty has a default role system consisting of four default roles and four default users:

| Role name | Attributes | Member of | Description |
|-----------|------------|-----------|-------------|
| dbrole_readonly | NOLOGIN | | role for global read-only access |
| dbrole_readwrite | NOLOGIN | dbrole_readonly | role for global read-write access |
| dbrole_admin | NOLOGIN | pg_monitor,dbrole_readwrite | role for object creation |
| dbrole_offline | NOLOGIN | | role for restricted read-only access |
| postgres | SUPERUSER | | system superuser |
| replicator | REPLICATION | pg_monitor,dbrole_readonly | system replicator |
| dbuser_dba | SUPERUSER | dbrole_admin | pgsql admin user |
| dbuser_monitor | | pg_monitor | pgsql monitor user |

pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

Default Roles

There are four default roles in pigsty:

  • Read Only (dbrole_readonly): Role for global read-only access
  • Read Write (dbrole_readwrite): Role for global read-write access, inherits dbrole_readonly.
  • Admin (dbrole_admin): Role for DDL commands, inherits dbrole_readwrite.
  • Offline (dbrole_offline): Role for restricted read-only access (offline instance)

Default roles are defined in pg_default_roles; changing the default roles is not recommended.

- { name: dbrole_readonly  , login: false , comment: role for global read-only access  }                            # production read-only role
- { name: dbrole_offline ,   login: false , comment: role for restricted read-only access (offline instance) }      # restricted-read-only role
- { name: dbrole_readwrite , login: false , roles: [dbrole_readonly], comment: role for global read-write access }  # production read-write role
- { name: dbrole_admin , login: false , roles: [pg_monitor, dbrole_readwrite] , comment: role for object creation } # production DDL change role
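Business users are typically granted one of these default roles through their pg_users definition, rather than raw table privileges. A minimal sketch (the user names and passwords below are hypothetical):

pg_users:                         # business users, granted pigsty default roles
  - { name: dbuser_app    ,password: DBUser.App    ,pgbouncer: true ,roles: [dbrole_readwrite] ,comment: hypothetical read-write app user }
  - { name: dbuser_report ,password: DBUser.Report ,pgbouncer: true ,roles: [dbrole_readonly]  ,comment: hypothetical read-only reporting user }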

Default Users

There are four default users in pigsty, too.

  • Superuser (postgres), the owner and creator of the cluster, same as the OS dbsu.
  • Replication user (replicator), the system user used for primary-replica replication.
  • Monitor user (dbuser_monitor), a user used to monitor database and connection pool metrics.
  • Admin user (dbuser_dba), the admin user who performs daily operations and database changes.

Default users’ usernames/passwords are defined with dedicated parameters (except for the dbsu password):

!> Remember to change these passwords in production deployments!

pg_dbsu: postgres                             # os user for the database
pg_replication_username: replicator           # system replication user
pg_replication_password: DBUser.Replicator    # system replication password
pg_monitor_username: dbuser_monitor           # system monitor user
pg_monitor_password: DBUser.Monitor           # system monitor password
pg_admin_username: dbuser_dba                 # system admin user
pg_admin_password: DBUser.DBA                 # system admin password

To define extra options, specify them in pg_default_roles:

- { name: postgres     ,superuser: true                                          ,comment: system superuser }
- { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
- { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
- { name: dbuser_monitor   ,roles: [pg_monitor, dbrole_readonly] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

Privileges

Pigsty has a battery-included privilege model that works with default roles.

  • All users have access to all schemas.
  • Read-Only users can read from all tables (SELECT, EXECUTE).
  • Read-Write users can write to all tables and run DML (INSERT, UPDATE, DELETE).
  • Admin users can create objects and run DDL (CREATE, USAGE, TRUNCATE, REFERENCES, TRIGGER).
  • Offline users are Read-Only users with limited access, restricted to offline instances (pg_role = 'offline' or pg_offline_query = true).
  • Objects created by admin users will have the correct privileges.
  • Default privileges are installed on all databases, including the template databases.
  • Database connect privilege is covered by the database definition.
  • CREATE privileges on databases & the public schema are revoked from PUBLIC by default.

Object Privilege

Default object privileges are defined in pg_default_privileges.

- GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
- GRANT SELECT     ON TABLES    TO dbrole_readonly
- GRANT SELECT     ON SEQUENCES TO dbrole_readonly
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
- GRANT USAGE      ON SCHEMAS   TO dbrole_offline
- GRANT SELECT     ON TABLES    TO dbrole_offline
- GRANT SELECT     ON SEQUENCES TO dbrole_offline
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
- GRANT INSERT     ON TABLES    TO dbrole_readwrite
- GRANT UPDATE     ON TABLES    TO dbrole_readwrite
- GRANT DELETE     ON TABLES    TO dbrole_readwrite
- GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
- GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
- GRANT TRUNCATE   ON TABLES    TO dbrole_admin
- GRANT REFERENCES ON TABLES    TO dbrole_admin
- GRANT TRIGGER    ON TABLES    TO dbrole_admin
- GRANT CREATE     ON SCHEMAS   TO dbrole_admin

Newly created objects will have the corresponding privileges when they are created by admin users.

The output of \ddp+ may look like:

   Type   | Access privileges
----------+----------------------
 function | =X
          | dbrole_readonly=X
          | dbrole_offline=X
          | dbrole_admin=X
 schema   | dbrole_readonly=U
          | dbrole_offline=U
          | dbrole_admin=UC
 sequence | dbrole_readonly=r
          | dbrole_offline=r
          | dbrole_readwrite=wU
          | dbrole_admin=rwU
 table    | dbrole_readonly=r
          | dbrole_offline=r
          | dbrole_readwrite=awd
          | dbrole_admin=arwdDxt

Default Privilege

ALTER DEFAULT PRIVILEGES allows you to set the privileges that will be applied to objects created in the future. It does not affect privileges assigned to already-existing objects, nor objects created by non-admin users.

Pigsty will use the following default privileges:

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_dbsu }} {{ priv }};
{% endfor %}

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_admin_username }} {{ priv }};
{% endfor %}

-- for additional business admin, they can SET ROLE to dbrole_admin
{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE "dbrole_admin" {{ priv }};
{% endfor %}

These will be rendered in pg-init-template.sql along with the ALTER DEFAULT PRIVILEGES statements for admin users.

These SQL commands will be executed on postgres & template1 during cluster bootstrap, and newly created databases will inherit them from template1 by default.

That is to say, to maintain the correct object privilege, you have to run DDL with admin users, which could be:

  1. {{ pg_dbsu }}, postgres by default
  2. {{ pg_admin_username }}, dbuser_dba by default
  3. Business admin user granted with dbrole_admin

It’s wise to use postgres as the global object owner to perform DDL changes. If you wish to create objects with a business admin user, YOU MUST USE SET ROLE dbrole_admin before running that DDL to maintain the correct privileges.

You can also run ALTER DEFAULT PRIVILEGES FOR ROLE <some_biz_admin> ... to grant default privileges to a business admin user, too.


Database Privilege

Database privilege is covered by database definition.

There are 3 database level privileges: CONNECT, CREATE, TEMP, and a special ‘privilege’: OWNERSHIP.

- name: meta         # required, `name` is the only mandatory field of a database definition
  owner: postgres    # optional, specify a database owner, {{ pg_dbsu }} by default
  allowconn: true    # optional, allow connection, true by default. false will disable connect at all
  revokeconn: false  # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
  • If owner exists, it will be used as database owner instead of default {{ pg_dbsu }}
  • If revokeconn is false, all users have the CONNECT privilege of the database, this is the default behavior.
  • If revokeconn is set to true explicitly:
    • CONNECT privilege of the database will be revoked from PUBLIC
    • CONNECT privilege will be granted to {{ pg_replication_username }}, {{ pg_monitor_username }} and {{ pg_admin_username }}
    • CONNECT privilege will be granted to database owner with GRANT OPTION

The revokeconn flag can be used for database access isolation: create a dedicated business user as the owner of each database, and set the revokeconn option for all of them.

Example: Database Isolation
pg-infra:
  hosts:
    10.10.10.40: { pg_seq: 1, pg_role: primary }
    10.10.10.41: { pg_seq: 2, pg_role: replica , pg_offline_query: true }
  vars:
    pg_cluster: pg-infra
    pg_users:
      - { name: dbuser_confluence, password: mc2iohos , pgbouncer: true, roles: [ dbrole_admin ] }
      - { name: dbuser_gitlab, password: sdf23g22sfdd , pgbouncer: true, roles: [ dbrole_readwrite ] }
      - { name: dbuser_jira, password: sdpijfsfdsfdfs , pgbouncer: true, roles: [ dbrole_admin ] }
    pg_databases:
      - { name: confluence , revokeconn: true, owner: dbuser_confluence , connlimit: 100 }
      - { name: gitlab , revokeconn: true, owner: dbuser_gitlab, connlimit: 100 }
      - { name: jira , revokeconn: true, owner: dbuser_jira , connlimit: 100 }

Create Privilege

Pigsty revokes the CREATE privilege on databases from PUBLIC by default for security reasons. This is also the default behavior since PostgreSQL 15.

The database owner has the full capability to adjust these privileges as they see fit.

4 - References

Detailed information and lists: supported OS distros, available modules, monitor metrics, extensions, cost comparison & analysis, glossary

4.1 - Compatibility

Supported operating systems, kernels & arch, PostgreSQL major versions and feature sets.

Overview

Pigsty recommends using the Linux kernel on the amd64 architecture, with RockyLinux 8.9, Debian 12, or Ubuntu 22.04 as the base OS.

Kernel Architecture Compatibility: Linux kernel, amd64 architecture (x86_64)

EL Distribution Support: EL7, EL8, EL9; (RHEL, Rocky, CentOS, Alma, Oracle, Anolis,…)

Debian Distribution Support: Ubuntu 22.04 jammy, 20.04 focal; Debian 12 bookworm and 11 bullseye.

Pigsty does not use any virtualization or containerization technologies, running directly on the bare OS. Supported operating systems include EL 7/8/9 (RHEL, Rocky, CentOS, Alma, Oracle, Anolis,…), Ubuntu 20.04 / 22.04 & Debian 11/12. EL is our long-term supported OS, while support for Ubuntu/Debian systems was introduced in the recent v2.5 version. The main difference between EL and Debian distributions is the significant variation in package names, as well as the default availability of PostgreSQL extensions.

We strongly recommend using RockyLinux 8.9 or Ubuntu 22.04.3 LTS as the OS for Pigsty. We have prepared offline packages for these specific minor versions of OS distros. This ensures a stable, reliable, and smooth installation even without internet access. Using other operating system distributions for the standard installation requires Internet access to download and build a local repo.

If you have advanced compatibility requirements, such as using specific operating system distributions, major versions, or minor versions, we offer advanced compatibility support options.


Kernel & Arch Support

Currently, Pigsty only supports the Linux kernel and the x86_64 / amd64 chip architecture.

macOS and Windows users can install Pigsty inside Linux virtual machines or containers. We provide Vagrant local sandbox support, allowing you to use virtualization software like Vagrant and VirtualBox to effortlessly bring up the deployment environment required by Pigsty on other operating systems.


EL Distribution Support

The EL series operating systems are Pigsty’s primary support target, including compatible distributions such as Red Hat Enterprise Linux, RockyLinux, CentOS, AlmaLinux, OracleLinux, Anolis, etc. Pigsty supports the latest three major versions: 7, 8, 9

  • EL9: RHEL, RockyLinux, AlmaLinux (Rocky 9.3 recommended)
  • EL8: RHEL, RockyLinux, AlmaLinux, Anolis (Rocky 8.9 recommended)
  • EL7: RHEL, CentOS 7.9 (CentOS 7.9 recommended)

Code   EL Distros                           Minor   PG16   PG15   PG14   PG13   PG12   Limitations
EL8    RHEL 8 / Rocky8 / Alma8 / Anolis8    8.9     ✔      ✔      ✔      ✔      ✔      Standard feature set
EL9    RHEL 9 / Rocky9 / Alma9              9.3     ✔      ✔      ✔      ✔      ✔      Missing pgxnclient, perf deps broken
EL7    RHEL7 / CentOS7                      7.9     ✘      ✔      ✔      ✔      ✔      PG16, supabase, pgml, duckdb_fdw,… unavailable

Debian Distribution Support

Pigsty supports Ubuntu / Debian series operating systems and their compatible distributions, currently supporting the two most recent LTS major versions, namely:

  • U22: Ubuntu 22.04 LTS jammy (22.04.3 Recommended)
  • U20: Ubuntu 20.04 LTS focal (20.04.6)
  • D12: Debian 12 bookworm (12.4)
  • D11: Debian 11 bullseye (11.8)

Code   Debian Distros          Minor     PG16   PG15   PG14   PG13   PG12   Limitations
U22    Ubuntu 22.04 (jammy)    22.04.3   ✔      ✔      ✔      ✔      ✔      Standard Debian series feature set
U20    Ubuntu 20.04 (focal)    20.04.6   ✔      ✔      ✔      ✔      ✔      Some extensions require online installation
D12    Debian 12 (bookworm)    12.4      ✔      ✔      ✔      ✔      ✔
D11    Debian 11 (bullseye)    11.8      ✔      ✔      ✔      ✔      ✔

Vagrant Boxes

When running Pigsty in local VMs, you might consider using the following operating system images in Vagrant, which are also the images used for Pigsty’s development, testing, and building.


Terraform Images

When deploying Pigsty on cloud servers, you might consider using the following operating system base images in Terraform, using Alibaba Cloud as an example:

  • CentOS 7.9 : centos_7_9_x64_20G_alibase_20231220.vhd
  • Rocky 8.9 : rockylinux_8_9_x64_20G_alibase_20231221.vhd
  • Rocky 9.3 : rockylinux_9_3_x64_20G_alibase_20231221.vhd
  • Ubuntu 20.04.3 : ubuntu_20_04_x64_20G_alibase_20231221.vhd
  • Ubuntu 22.04.6 : ubuntu_22_04_x64_20G_alibase_20231221.vhd
  • Debian 11.7 : debian_11_7_x64_20G_alibase_20230907.vhd
  • Debian 12.4 : debian_12_4_x64_20G_alibase_20231220.vhd
  • Anolis 8.8 : anolisos_8_8_x64_20G_rhck_alibase_20230804.vhd

References

4.2 - Parameters

Pigsty has 280+ parameters to describe every aspect of the environment & various modules

There are 280+ parameters in Pigsty describing all aspects of the deployment.

ID Name Module Section Type Level Comment
101 version INFRA META string G pigsty version string
102 admin_ip INFRA META ip G admin node ip address
103 region INFRA META enum G upstream mirror region: default,china,europe
104 proxy_env INFRA META dict G global proxy env when downloading packages
105 ca_method INFRA CA enum G create,recreate,copy, create by default
106 ca_cn INFRA CA string G ca common name, fixed as pigsty-ca
107 cert_validity INFRA CA interval G cert validity, 20 years by default
108 infra_seq INFRA INFRA_ID int I infra node identity, REQUIRED
109 infra_portal INFRA INFRA_ID dict G infra services exposed via portal
110 repo_enabled INFRA REPO bool G/I create a yum repo on this infra node?
111 repo_home INFRA REPO path G repo home dir, /www by default
112 repo_name INFRA REPO string G repo name, pigsty by default
113 repo_endpoint INFRA REPO url G access point to this repo by domain or ip:port
114 repo_remove INFRA REPO bool G/A remove existing upstream repo
115 repo_modules INFRA REPO string G/A which repo modules are installed in repo_upstream
116 repo_upstream INFRA REPO upstream[] G where to download upstream packages
117 repo_packages INFRA REPO string[] G which packages to be included
118 repo_url_packages INFRA REPO string[] G extra packages from url
120 infra_packages INFRA INFRA_PACKAGE string[] G packages to be installed on infra nodes
121 infra_packages_pip INFRA INFRA_PACKAGE string G pip installed packages for infra nodes
130 nginx_enabled INFRA NGINX bool G/I enable nginx on this infra node?
131 nginx_exporter_enabled INFRA NGINX bool G/I enable nginx_exporter on this infra node?
132 nginx_sslmode INFRA NGINX enum G nginx ssl mode? disable,enable,enforce
133 nginx_home INFRA NGINX path G nginx content dir, /www by default
134 nginx_port INFRA NGINX port G nginx listen port, 80 by default
135 nginx_ssl_port INFRA NGINX port G nginx ssl listen port, 443 by default
136 nginx_navbar INFRA NGINX index[] G nginx index page navigation links
140 dns_enabled INFRA DNS bool G/I setup dnsmasq on this infra node?
141 dns_port INFRA DNS port G dns server listen port, 53 by default
142 dns_records INFRA DNS string[] G dynamic dns records resolved by dnsmasq
150 prometheus_enabled INFRA PROMETHEUS bool G/I enable prometheus on this infra node?
151 prometheus_clean INFRA PROMETHEUS bool G/A clean prometheus data during init?
152 prometheus_data INFRA PROMETHEUS path G prometheus data dir, /data/prometheus by default
153 prometheus_sd_dir INFRA PROMETHEUS path G prometheus file service discovery directory
154 prometheus_sd_interval INFRA PROMETHEUS interval G prometheus target refresh interval, 5s by default
155 prometheus_scrape_interval INFRA PROMETHEUS interval G prometheus scrape & eval interval, 10s by default
156 prometheus_scrape_timeout INFRA PROMETHEUS interval G prometheus global scrape timeout, 8s by default
157 prometheus_options INFRA PROMETHEUS arg G prometheus extra server options
158 pushgateway_enabled INFRA PROMETHEUS bool G/I setup pushgateway on this infra node?
159 pushgateway_options INFRA PROMETHEUS arg G pushgateway extra server options
160 blackbox_enabled INFRA PROMETHEUS bool G/I setup blackbox_exporter on this infra node?
161 blackbox_options INFRA PROMETHEUS arg G blackbox_exporter extra server options
162 alertmanager_enabled INFRA PROMETHEUS bool G/I setup alertmanager on this infra node?
163 alertmanager_options INFRA PROMETHEUS arg G alertmanager extra server options
164 exporter_metrics_path INFRA PROMETHEUS path G exporter metric path, /metrics by default
165 exporter_install INFRA PROMETHEUS enum G how to install exporter? none,yum,binary
166 exporter_repo_url INFRA PROMETHEUS url G exporter repo file url if install exporter via yum
170 grafana_enabled INFRA GRAFANA bool G/I enable grafana on this infra node?
171 grafana_clean INFRA GRAFANA bool G/A clean grafana data during init?
172 grafana_admin_username INFRA GRAFANA username G grafana admin username, admin by default
173 grafana_admin_password INFRA GRAFANA password G grafana admin password, pigsty by default
174 grafana_plugin_cache INFRA GRAFANA path G path to grafana plugins cache tarball
175 grafana_plugin_list INFRA GRAFANA string[] G grafana plugins to be downloaded with grafana-cli
176 loki_enabled INFRA LOKI bool G/I enable loki on this infra node?
177 loki_clean INFRA LOKI bool G/A whether remove existing loki data?
178 loki_data INFRA LOKI path G loki data dir, /data/loki by default
179 loki_retention INFRA LOKI interval G loki log retention period, 15d by default
201 nodename NODE NODE_ID string I node instance identity, use hostname if missing, optional
202 node_cluster NODE NODE_ID string C node cluster identity, use ’nodes’ if missing, optional
203 nodename_overwrite NODE NODE_ID bool C overwrite node’s hostname with nodename?
204 nodename_exchange NODE NODE_ID bool C exchange nodename among play hosts?
205 node_id_from_pg NODE NODE_ID bool C use postgres identity as node identity if applicable?
210 node_write_etc_hosts NODE NODE_DNS bool G/C/I modify /etc/hosts on target node?
211 node_default_etc_hosts NODE NODE_DNS string[] G static dns records in /etc/hosts
212 node_etc_hosts NODE NODE_DNS string[] C extra static dns records in /etc/hosts
213 node_dns_method NODE NODE_DNS enum C how to handle dns servers: add,none,overwrite
214 node_dns_servers NODE NODE_DNS string[] C dynamic nameserver in /etc/resolv.conf
215 node_dns_options NODE NODE_DNS string[] C dns resolv options in /etc/resolv.conf
220 node_repo_modules NODE NODE_PACKAGE string C upstream repo to be added on node, local by default
221 node_repo_remove NODE NODE_PACKAGE bool C remove existing repo on node?
223 node_packages NODE NODE_PACKAGE string[] C packages to be installed on current nodes
224 node_default_packages NODE NODE_PACKAGE string[] G default packages to be installed on all nodes
230 node_disable_firewall NODE NODE_TUNE bool C disable node firewall? true by default
231 node_disable_selinux NODE NODE_TUNE bool C disable node selinux? true by default
232 node_disable_numa NODE NODE_TUNE bool C disable node numa, reboot required
233 node_disable_swap NODE NODE_TUNE bool C disable node swap, use with caution
234 node_static_network NODE NODE_TUNE bool C preserve dns resolver settings after reboot
235 node_disk_prefetch NODE NODE_TUNE bool C setup disk prefetch on HDD to increase performance
236 node_kernel_modules NODE NODE_TUNE string[] C kernel modules to be enabled on this node
237 node_hugepage_count NODE NODE_TUNE int C number of 2MB hugepage, take precedence over ratio
238 node_hugepage_ratio NODE NODE_TUNE float C node mem hugepage ratio, 0 disable it by default
239 node_overcommit_ratio NODE NODE_TUNE float C node mem overcommit ratio, 0 disable it by default
240 node_tune NODE NODE_TUNE enum C node tuned profile: none,oltp,olap,crit,tiny
241 node_sysctl_params NODE NODE_TUNE dict C sysctl parameters in k:v format in addition to tuned
250 node_data NODE NODE_ADMIN path C node main data directory, /data by default
251 node_admin_enabled NODE NODE_ADMIN bool C create an admin user on target node?
252 node_admin_uid NODE NODE_ADMIN int C uid and gid for node admin user
253 node_admin_username NODE NODE_ADMIN username C name of node admin user, dba by default
254 node_admin_ssh_exchange NODE NODE_ADMIN bool C exchange admin ssh key among node cluster
255 node_admin_pk_current NODE NODE_ADMIN bool C add current user’s ssh pk to admin authorized_keys
256 node_admin_pk_list NODE NODE_ADMIN string[] C ssh public keys to be added to admin user
260 node_timezone NODE NODE_TIME string C setup node timezone, empty string to skip
261 node_ntp_enabled NODE NODE_TIME bool C enable chronyd time sync service?
262 node_ntp_servers NODE NODE_TIME string[] C ntp servers in /etc/chrony.conf
263 node_crontab_overwrite NODE NODE_TIME bool C overwrite or append to /etc/crontab?
264 node_crontab NODE NODE_TIME string[] C crontab entries in /etc/crontab
270 vip_enabled NODE NODE_VIP bool C enable vip on this node cluster?
271 vip_address NODE NODE_VIP ip C node vip address in ipv4 format, required if vip is enabled
272 vip_vrid NODE NODE_VIP int C required, integer, 1-254, should be unique among same VLAN
273 vip_role NODE NODE_VIP enum I optional, master/backup, backup by default, use as init role
274 vip_preempt NODE NODE_VIP bool C/I optional, true/false, false by default, enable vip preemption
275 vip_interface NODE NODE_VIP string C/I node vip network interface to listen, eth0 by default
276 vip_dns_suffix NODE NODE_VIP string C node vip dns name suffix, empty string by default
277 vip_exporter_port NODE NODE_VIP port C keepalived exporter listen port, 9650 by default
280 haproxy_enabled NODE HAPROXY bool C enable haproxy on this node?
281 haproxy_clean NODE HAPROXY bool G/C/A cleanup all existing haproxy config?
282 haproxy_reload NODE HAPROXY bool A reload haproxy after config?
283 haproxy_auth_enabled NODE HAPROXY bool G enable authentication for haproxy admin page
284 haproxy_admin_username NODE HAPROXY username G haproxy admin username, admin by default
285 haproxy_admin_password NODE HAPROXY password G haproxy admin password, pigsty by default
286 haproxy_exporter_port NODE HAPROXY port C haproxy admin/exporter port, 9101 by default
287 haproxy_client_timeout NODE HAPROXY interval C client side connection timeout, 24h by default
288 haproxy_server_timeout NODE HAPROXY interval C server side connection timeout, 24h by default
289 haproxy_services NODE HAPROXY service[] C list of haproxy service to be exposed on node
290 node_exporter_enabled NODE NODE_EXPORTER bool C setup node_exporter on this node?
291 node_exporter_port NODE NODE_EXPORTER port C node exporter listen port, 9100 by default
292 node_exporter_options NODE NODE_EXPORTER arg C extra server options for node_exporter
293 promtail_enabled NODE PROMTAIL bool C enable promtail logging collector?
294 promtail_clean NODE PROMTAIL bool G/A purge existing promtail status file during init?
295 promtail_port NODE PROMTAIL port C promtail listen port, 9080 by default
296 promtail_positions NODE PROMTAIL path C promtail position status file path
401 docker_enabled NODE DOCKER bool C enable docker on this node?
402 docker_cgroups_driver NODE DOCKER enum C docker cgroup fs driver: cgroupfs,systemd
403 docker_registry_mirrors NODE DOCKER string[] C docker registry mirror list
404 docker_image_cache NODE DOCKER path C docker image cache dir, /tmp/docker by default
501 etcd_seq ETCD ETCD int I etcd instance identifier, REQUIRED
502 etcd_cluster ETCD ETCD string C etcd cluster & group name, etcd by default
503 etcd_safeguard ETCD ETCD bool G/C/A prevent purging running etcd instance?
504 etcd_clean ETCD ETCD bool G/C/A purging existing etcd during initialization?
505 etcd_data ETCD ETCD path C etcd data directory, /data/etcd by default
506 etcd_port ETCD ETCD port C etcd client port, 2379 by default
507 etcd_peer_port ETCD ETCD port C etcd peer port, 2380 by default
508 etcd_init ETCD ETCD enum C etcd initial cluster state, new or existing
509 etcd_election_timeout ETCD ETCD int C etcd election timeout, 1000ms by default
510 etcd_heartbeat_interval ETCD ETCD int C etcd heartbeat interval, 100ms by default
601 minio_seq MINIO MINIO int I minio instance identifier, REQUIRED
602 minio_cluster MINIO MINIO string C minio cluster name, minio by default
603 minio_clean MINIO MINIO bool G/C/A cleanup minio during init?, false by default
604 minio_user MINIO MINIO username C minio os user, minio by default
605 minio_node MINIO MINIO string C minio node name pattern
606 minio_data MINIO MINIO path C minio data dir(s), use {x…y} to specify multi drivers
607 minio_domain MINIO MINIO string G minio service domain name, sss.pigsty by default
608 minio_port MINIO MINIO port C minio service port, 9000 by default
609 minio_admin_port MINIO MINIO port C minio console port, 9001 by default
610 minio_access_key MINIO MINIO username C root access key, minioadmin by default
611 minio_secret_key MINIO MINIO password C root secret key, minioadmin by default
612 minio_extra_vars MINIO MINIO string C extra environment variables for minio server
613 minio_alias MINIO MINIO string G alias name for local minio deployment
614 minio_buckets MINIO MINIO bucket[] C list of minio bucket to be created
615 minio_users MINIO MINIO user[] C list of minio user to be created
701 redis_cluster REDIS REDIS string C redis cluster name, required identity parameter
702 redis_instances REDIS REDIS dict I redis instances definition on this redis node
703 redis_node REDIS REDIS int I redis node sequence number, node int id required
710 redis_fs_main REDIS REDIS path C redis main data mountpoint, /data by default
711 redis_exporter_enabled REDIS REDIS bool C install redis exporter on redis nodes?
712 redis_exporter_port REDIS REDIS port C redis exporter listen port, 9121 by default
713 redis_exporter_options REDIS REDIS string C/I cli args and extra options for redis exporter
720 redis_safeguard REDIS REDIS bool G/C/A prevent purging running redis instance?
721 redis_clean REDIS REDIS bool G/C/A purging existing redis during init?
722 redis_rmdata REDIS REDIS bool G/C/A remove redis data when purging redis server?
723 redis_mode REDIS REDIS enum C redis mode: standalone,cluster,sentinel
724 redis_conf REDIS REDIS string C redis config template path, except sentinel
725 redis_bind_address REDIS REDIS ip C redis bind address, empty string will use host ip
726 redis_max_memory REDIS REDIS size C/I max memory used by each redis instance
727 redis_mem_policy REDIS REDIS enum C redis memory eviction policy
728 redis_password REDIS REDIS password C redis password, empty string will disable password
729 redis_rdb_save REDIS REDIS string[] C redis rdb save directives, disable with empty list
730 redis_aof_enabled REDIS REDIS bool C enable redis append only file?
731 redis_rename_commands REDIS REDIS dict C rename redis dangerous commands
732 redis_cluster_replicas REDIS REDIS int C replica number for one master in redis cluster
733 redis_sentinel_monitor REDIS REDIS master[] C sentinel master list, works on sentinel cluster only
801 pg_mode PGSQL PG_ID enum C pgsql cluster mode: pgsql,citus,gpsql
802 pg_cluster PGSQL PG_ID string C pgsql cluster name, REQUIRED identity parameter
803 pg_seq PGSQL PG_ID int I pgsql instance seq number, REQUIRED identity parameter
804 pg_role PGSQL PG_ID enum I pgsql role, REQUIRED, could be primary,replica,offline
805 pg_instances PGSQL PG_ID dict I define multiple pg instances on node in {port:ins_vars} format
806 pg_upstream PGSQL PG_ID ip I repl upstream ip addr for standby cluster or cascade replica
807 pg_shard PGSQL PG_ID string C pgsql shard name, optional identity for sharding clusters
808 pg_group PGSQL PG_ID int C pgsql shard index number, optional identity for sharding clusters
809 gp_role PGSQL PG_ID enum C greenplum role of this cluster, could be master or segment
810 pg_exporters PGSQL PG_ID dict C additional pg_exporters to monitor remote postgres instances
811 pg_offline_query PGSQL PG_ID bool I set to true to enable offline query on this instance
820 pg_users PGSQL PG_BUSINESS user[] C postgres business users
821 pg_databases PGSQL PG_BUSINESS database[] C postgres business databases
822 pg_services PGSQL PG_BUSINESS service[] C postgres business services
823 pg_hba_rules PGSQL PG_BUSINESS hba[] C business hba rules for postgres
824 pgb_hba_rules PGSQL PG_BUSINESS hba[] C business hba rules for pgbouncer
831 pg_replication_username PGSQL PG_BUSINESS username G postgres replication username, replicator by default
832 pg_replication_password PGSQL PG_BUSINESS password G postgres replication password, DBUser.Replicator by default
833 pg_admin_username PGSQL PG_BUSINESS username G postgres admin username, dbuser_dba by default
834 pg_admin_password PGSQL PG_BUSINESS password G postgres admin password in plain text, DBUser.DBA by default
835 pg_monitor_username PGSQL PG_BUSINESS username G postgres monitor username, dbuser_monitor by default
836 pg_monitor_password PGSQL PG_BUSINESS password G postgres monitor password, DBUser.Monitor by default
837 pg_dbsu_password PGSQL PG_BUSINESS password G/C postgres dbsu password, empty string disable it by default
840 pg_dbsu PGSQL PG_INSTALL username C os dbsu name, postgres by default, better not change it
841 pg_dbsu_uid PGSQL PG_INSTALL int C os dbsu uid and gid, 26 for default postgres users and groups
842 pg_dbsu_sudo PGSQL PG_INSTALL enum C dbsu sudo privilege, none,limit,all,nopass. limit by default
843 pg_dbsu_home PGSQL PG_INSTALL path C postgresql home directory, /var/lib/pgsql by default
844 pg_dbsu_ssh_exchange PGSQL PG_INSTALL bool C exchange postgres dbsu ssh key among same pgsql cluster
845 pg_version PGSQL PG_INSTALL enum C postgres major version to be installed, 16 by default
846 pg_bin_dir PGSQL PG_INSTALL path C postgres binary dir, /usr/pgsql/bin by default
847 pg_log_dir PGSQL PG_INSTALL path C postgres log dir, /pg/log/postgres by default
848 pg_packages PGSQL PG_INSTALL string[] C pg packages to be installed, ${pg_version} will be replaced
849 pg_extensions PGSQL PG_INSTALL string[] C pg extensions to be installed, ${pg_version} will be replaced
850 pg_safeguard PGSQL PG_BOOTSTRAP bool G/C/A prevent purging running postgres instance? false by default
851 pg_clean PGSQL PG_BOOTSTRAP bool G/C/A purging existing postgres during pgsql init? true by default
852 pg_data PGSQL PG_BOOTSTRAP path C postgres data directory, /pg/data by default
853 pg_fs_main PGSQL PG_BOOTSTRAP path C mountpoint/path for postgres main data, /data by default
854 pg_fs_bkup PGSQL PG_BOOTSTRAP path C mountpoint/path for pg backup data, /data/backup by default
855 pg_storage_type PGSQL PG_BOOTSTRAP enum C storage type for pg main data, SSD,HDD, SSD by default
856 pg_dummy_filesize PGSQL PG_BOOTSTRAP size C size of /pg/dummy, hold 64MB disk space for emergency use
857 pg_listen PGSQL PG_BOOTSTRAP ip(s) C/I postgres/pgbouncer listen addresses, comma separated list
858 pg_port PGSQL PG_BOOTSTRAP port C postgres listen port, 5432 by default
859 pg_localhost PGSQL PG_BOOTSTRAP path C postgres unix socket dir for localhost connection
860 pg_namespace PGSQL PG_BOOTSTRAP path C top level key namespace in etcd, used by patroni & vip
861 patroni_enabled PGSQL PG_BOOTSTRAP bool C if disabled, no postgres cluster will be created during init
862 patroni_mode PGSQL PG_BOOTSTRAP enum C patroni working mode: default,pause,remove
863 patroni_port PGSQL PG_BOOTSTRAP port C patroni listen port, 8008 by default
864 patroni_log_dir PGSQL PG_BOOTSTRAP path C patroni log dir, /pg/log/patroni by default
865 patroni_ssl_enabled PGSQL PG_BOOTSTRAP bool G secure patroni RestAPI communications with SSL?
866 patroni_watchdog_mode PGSQL PG_BOOTSTRAP enum C patroni watchdog mode: automatic,required,off. off by default
867 patroni_username PGSQL PG_BOOTSTRAP username C patroni restapi username, postgres by default
868 patroni_password PGSQL PG_BOOTSTRAP password C patroni restapi password, Patroni.API by default
869 patroni_citus_db PGSQL PG_BOOTSTRAP string C citus database managed by patroni, postgres by default
870 pg_conf PGSQL PG_BOOTSTRAP enum C config template: oltp,olap,crit,tiny. oltp.yml by default
871 pg_max_conn PGSQL PG_BOOTSTRAP int C postgres max connections, auto will use recommended value
872 pg_shared_buffer_ratio PGSQL PG_BOOTSTRAP float C postgres shared buffer memory ratio, 0.25 by default, 0.1~0.4
873 pg_rto PGSQL PG_BOOTSTRAP int C recovery time objective in seconds, 30s by default
874 pg_rpo PGSQL PG_BOOTSTRAP int C recovery point objective in bytes, 1MiB at most by default
875 pg_libs PGSQL PG_BOOTSTRAP string C preloaded libraries, timescaledb,pg_stat_statements,auto_explain by default
876 pg_delay PGSQL PG_BOOTSTRAP interval I replication apply delay for standby cluster leader
877 pg_checksum PGSQL PG_BOOTSTRAP bool C enable data checksum for postgres cluster?
878 pg_pwd_enc PGSQL PG_BOOTSTRAP enum C passwords encryption algorithm: md5,scram-sha-256
879 pg_encoding PGSQL PG_BOOTSTRAP enum C database cluster encoding, UTF8 by default
880 pg_locale PGSQL PG_BOOTSTRAP enum C database cluster locale, C by default
881 pg_lc_collate PGSQL PG_BOOTSTRAP enum C database cluster collate, C by default
882 pg_lc_ctype PGSQL PG_BOOTSTRAP enum C database character type, en_US.UTF8 by default
890 pgbouncer_enabled PGSQL PG_BOOTSTRAP bool C if disabled, pgbouncer will not be launched on pgsql host
891 pgbouncer_port PGSQL PG_BOOTSTRAP port C pgbouncer listen port, 6432 by default
892 pgbouncer_log_dir PGSQL PG_BOOTSTRAP path C pgbouncer log dir, /pg/log/pgbouncer by default
893 pgbouncer_auth_query PGSQL PG_BOOTSTRAP bool C query postgres to retrieve unlisted business users?
894 pgbouncer_poolmode PGSQL PG_BOOTSTRAP enum C pooling mode: transaction,session,statement, transaction by default
895 pgbouncer_sslmode PGSQL PG_BOOTSTRAP enum C pgbouncer client ssl mode, disable by default
900 pg_provision PGSQL PG_PROVISION bool C provision postgres cluster after bootstrap
901 pg_init PGSQL PG_PROVISION string G/C provision init script for cluster template, pg-init by default
902 pg_default_roles PGSQL PG_PROVISION role[] G/C default roles and users in postgres cluster
903 pg_default_privileges PGSQL PG_PROVISION string[] G/C default privileges when created by admin user
904 pg_default_schemas PGSQL PG_PROVISION string[] G/C default schemas to be created
905 pg_default_extensions PGSQL PG_PROVISION extension[] G/C default extensions to be created
906 pg_reload PGSQL PG_PROVISION bool A reload postgres after hba changes
907 pg_default_hba_rules PGSQL PG_PROVISION hba[] G/C postgres default host-based authentication rules
908 pgb_default_hba_rules PGSQL PG_PROVISION hba[] G/C pgbouncer default host-based authentication rules
910 pgbackrest_enabled PGSQL PG_BACKUP bool C enable pgbackrest on pgsql host?
911 pgbackrest_clean PGSQL PG_BACKUP bool C remove pg backup data during init?
912 pgbackrest_log_dir PGSQL PG_BACKUP path C pgbackrest log dir, /pg/log/pgbackrest by default
913 pgbackrest_method PGSQL PG_BACKUP enum C pgbackrest repo method: local,minio,etc…
914 pgbackrest_repo PGSQL PG_BACKUP dict G/C pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
921 pg_weight PGSQL PG_SERVICE int I relative load balance weight in service, 100 by default, 0-255
922 pg_service_provider PGSQL PG_SERVICE string G/C dedicate haproxy node group name, or empty string for local nodes by default
923 pg_default_service_dest PGSQL PG_SERVICE enum G/C default service destination if svc.dest=‘default’
924 pg_default_services PGSQL PG_SERVICE service[] G/C postgres default service definitions
931 pg_vip_enabled PGSQL PG_SERVICE bool C enable a l2 vip for pgsql primary? false by default
932 pg_vip_address PGSQL PG_SERVICE cidr4 C vip address in <ipv4>/<mask> format, required if vip is enabled
933 pg_vip_interface PGSQL PG_SERVICE string C/I vip network interface to listen, eth0 by default
934 pg_dns_suffix PGSQL PG_SERVICE string C pgsql dns suffix, ’’ by default
935 pg_dns_target PGSQL PG_SERVICE enum C auto, primary, vip, none, or ad hoc ip
940 pg_exporter_enabled PGSQL PG_EXPORTER bool C enable pg_exporter on pgsql hosts?
941 pg_exporter_config PGSQL PG_EXPORTER string C pg_exporter configuration file name
942 pg_exporter_cache_ttls PGSQL PG_EXPORTER string C pg_exporter collector ttl stage in seconds, ‘1,10,60,300’ by default
943 pg_exporter_port PGSQL PG_EXPORTER port C pg_exporter listen port, 9630 by default
944 pg_exporter_params PGSQL PG_EXPORTER string C extra url parameters for pg_exporter dsn
945 pg_exporter_url PGSQL PG_EXPORTER pgurl C overwrite auto-generate pg dsn if specified
946 pg_exporter_auto_discovery PGSQL PG_EXPORTER bool C enable auto database discovery? enabled by default
947 pg_exporter_exclude_database PGSQL PG_EXPORTER string C csv of database that WILL NOT be monitored during auto-discovery
948 pg_exporter_include_database PGSQL PG_EXPORTER string C csv of database that WILL BE monitored during auto-discovery
949 pg_exporter_connect_timeout PGSQL PG_EXPORTER int C pg_exporter connect timeout in ms, 200 by default
950 pg_exporter_options PGSQL PG_EXPORTER arg C overwrite extra options for pg_exporter
951 pgbouncer_exporter_enabled PGSQL PG_EXPORTER bool C enable pgbouncer_exporter on pgsql hosts?
952 pgbouncer_exporter_port PGSQL PG_EXPORTER port C pgbouncer_exporter listen port, 9631 by default
953 pgbouncer_exporter_url PGSQL PG_EXPORTER pgurl C overwrite auto-generate pgbouncer dsn if specified
954 pgbouncer_exporter_options PGSQL PG_EXPORTER arg C overwrite extra options for pgbouncer_exporter

INFRA

Parameters about pigsty infrastructure components: local yum repo, nginx, dnsmasq, prometheus, grafana, loki, alertmanager, pushgateway, blackbox_exporter, etc…


META

This section contains some metadata of current pigsty deployments, such as version string, admin node IP address, repo mirror region and http(s) proxy when downloading packages.

version: v2.6.0                   # pigsty version string
admin_ip: 10.10.10.10             # admin node ip address
region: default                   # upstream mirror region: default,china,europe
proxy_env:                        # global proxy env when downloading packages
  no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.*,*.myqcloud.com,*.tsinghua.edu.cn"
  # http_proxy:  # set your proxy here: e.g http://user:[email protected]
  # https_proxy: # set your proxy here: e.g http://user:[email protected]
  # all_proxy:   # set your proxy here: e.g http://user:[email protected]

version

name: version, type: string, level: G

pigsty version string

default value: v2.6.0

It will be used for pigsty introspection & content rendering.

admin_ip

name: admin_ip, type: ip, level: G

admin node ip address

default value: 10.10.10.10

The node with this IP address will be treated as the admin node; it usually points to the first node where Pigsty is installed.

The default value 10.10.10.10 is a placeholder which will be replaced during the configure procedure.

This parameter is referenced by many other parameters: the exact string ${admin_ip} will be replaced with the actual admin_ip in those parameters.
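
For example, both the default repo_endpoint and the default infra_portal entries reference this placeholder:

repo_endpoint: http://${admin_ip}:80      # local repo access point, substituted at runtime
infra_portal:
  grafana : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" ,websocket: true }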

region

name: region, type: enum, level: G

upstream mirror region: default,china,europe

default value: default

If a region other than default is set, and there’s a corresponding entry in repo_upstream.[repo].baseurl, it will be used instead of default.

For example, if china is used, pigsty will use China mirrors designated in repo_upstream if applicable.
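
For example, a one-line override for deployments in mainland China:

region: china                     # prefer the china baseurl entries defined in repo_upstream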

proxy_env

name: proxy_env, type: dict, level: G

global proxy env when downloading packages

default value:

proxy_env: # global proxy env when downloading packages
  http_proxy: 'http://username:[email protected]'
  https_proxy: 'http://username:[email protected]'
  all_proxy: 'http://username:[email protected]'
  no_proxy: "localhost,127.0.0.1,10.0.0.0/8,192.168.0.0/16,*.pigsty,*.aliyun.com,mirrors.aliyuncs.com,mirrors.tuna.tsinghua.edu.cn,mirrors.zju.edu.cn"

It’s quite important to use http proxy in restricted production environment, or your Internet access is blocked (e.g. Mainland China)


CA

Self-Signed CA used by pigsty. It is required to support advanced security features.

ca_method: create                 # create,recreate,copy, create by default
ca_cn: pigsty-ca                  # ca common name, fixed as pigsty-ca
cert_validity: 7300d              # cert validity, 20 years by default

ca_method

name: ca_method, type: enum, level: G

available options: create,recreate,copy

default value: create

  • create: Create a new CA public-private key pair if not exists, use if exists
  • recreate: Always re-create a new CA public-private key pair
  • copy: Copy the existing CA public and private keys from local files/pki/ca, abort if missing

If you already have a pair of ca.crt and ca.key, put them under files/pki/ca and set ca_method to copy.
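
A minimal sketch for reusing an existing CA, assuming ca.crt and ca.key are already placed under files/pki/ca:

ca_method: copy                   # copy existing CA from files/pki/ca instead of creating a new one
ca_cn: pigsty-ca                  # keep the fixed common name
cert_validity: 7300d              # cert validity, 20 years by default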

ca_cn

name: ca_cn, type: string, level: G

ca common name; changing it is not recommended.

default value: pigsty-ca

you can check that with openssl x509 -text -in /etc/pki/ca.crt

cert_validity

name: cert_validity, type: interval, level: G

cert validity, 20 years by default, which is enough for most scenarios

default value: 7300d


INFRA_ID

Infrastructure identity and portal definition.

#infra_seq: 1                     # infra node identity, explicitly required
infra_portal:                     # infra services exposed via portal
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }

infra_seq

name: infra_seq, type: int, level: I

infra node identity, REQUIRED, no default value, you have to assign it explicitly.

infra_portal

name: infra_portal, type: dict, level: G

infra services exposed via portal.

default value will expose home, grafana, prometheus, alertmanager via nginx with corresponding domain names.

infra_portal:                     # infra services exposed via portal
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }

Each record consists of a key and a value: the key is the component name, and the value contains the domain (the external access domain name), the endpoint (the internally reachable address), and other options.

  • The name definition of the default record is fixed and referenced by other modules, so do not modify the default entry names.
  • The domain is the domain name that should be used for external access to this upstream server. domain names will be added to Nginx SSL cert SAN.
  • The endpoint is an internally reachable TCP port. and ${admin_ip} will be replaced with actual admin_ip in runtime.
  • If websocket is set to true, http protocol will be auto upgraded for ws connections.
  • If scheme is given (http or https), it will be used as part of proxy_pass URL.
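
For example, a hypothetical extra entry that proxies an internal HTTPS service with websocket support (the name, domain, and endpoint below are made up for illustration):

infra_portal:                     # keep the default entries, then append your own
  myapp : { domain: app.pigsty ,endpoint: "10.10.10.12:8443" ,scheme: https ,websocket: true }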

REPO

This section is about local software repo. Pigsty will create a local software repo (APT/YUM) when init an infra node.

In the initialization process, Pigsty will download all packages and their dependencies (specified by repo_packages) from the Internet upstream repo (specified by repo_upstream) to {{ nginx_home }} / {{ repo_name }} (default is /www/pigsty), and the total size of all dependent software is about 1GB or so.

When creating a local repo, Pigsty will skip the software download phase if the directory already exists and if there is a marker file named repo_complete in the dir.

If the download speed of some packages is too slow, you can set a download proxy via the proxy_env config entry to complete the first download, or directly download the pre-packaged offline package, which is essentially a local software repo built on the same operating system.

repo_enabled: true                # create a yum repo on this infra node?
repo_home: /www                   # repo home dir, `/www` by default
repo_name: pigsty                 # repo name, pigsty by default
repo_endpoint: http://${admin_ip}:80 # access point to this repo by domain or ip:port
repo_remove: true                 # remove existing upstream repo
repo_modules: infra,node,pgsql    # install upstream repo during repo bootstrap
repo_upstream:                    # where to download
  - { name: pigsty-local   ,description: 'Pigsty Local'      ,module: local ,releases: [7,8,9] ,baseurl: { default: 'http://${admin_ip}/pigsty'  }} # used by intranet nodes
  - { name: pigsty-infra   ,description: 'Pigsty INFRA'      ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/infra/$basearch' ,china: 'https://repo.pigsty.cc/rpm/infra/$basearch' }}
  - { name: pigsty-pgsql   ,description: 'Pigsty PGSQL'      ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/pgsql/el$releasever.$basearch' ,china: 'https://repo.pigsty.cc/rpm/pgsql/el$releasever.$basearch' }}
  - { name: nginx          ,description: 'Nginx Repo'        ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://nginx.org/packages/centos/$releasever/$basearch/' }}
  - { name: docker-ce      ,description: 'Docker CE'         ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://download.docker.com/linux/centos/$releasever/$basearch/stable'        ,china: 'https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable'  ,europe: 'https://mirrors.xtom.de/docker-ce/linux/centos/$releasever/$basearch/stable' }}
  - { name: base           ,description: 'EL 7 Base'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/os/$basearch/'                    ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/os/$basearch/'           ,europe: 'https://mirrors.xtom.de/centos/$releasever/os/$basearch/'           }}
  - { name: updates        ,description: 'EL 7 Updates'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/updates/$basearch/'               ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/updates/$basearch/'      ,europe: 'https://mirrors.xtom.de/centos/$releasever/updates/$basearch/'      }}
  - { name: extras         ,description: 'EL 7 Extras'       ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/extras/$basearch/'                ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/extras/$basearch/'       ,europe: 'https://mirrors.xtom.de/centos/$releasever/extras/$basearch/'       }}
  - { name: epel           ,description: 'EL 7 EPEL'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/$basearch/'            ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/$basearch/'                ,europe: 'https://mirrors.xtom.de/epel/$releasever/$basearch/'                }}
  - { name: centos-sclo    ,description: 'EL 7 SCLo'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/sclo/'             ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/sclo/'              ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/sclo/'    }}
  - { name: centos-sclo-rh ,description: 'EL 7 SCLo rh'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/rh/'               ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/rh/'                ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/rh/'      }}
  - { name: baseos         ,description: 'EL 8+ BaseOS'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/BaseOS/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/BaseOS/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/BaseOS/$basearch/os/'     }}
  - { name: appstream      ,description: 'EL 8+ AppStream'   ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/AppStream/$basearch/os/'      ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/AppStream/$basearch/os/'       ,europe: 'https://mirrors.xtom.de/rocky/$releasever/AppStream/$basearch/os/'  }}
  - { name: extras         ,description: 'EL 8+ Extras'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/extras/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/extras/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/extras/$basearch/os/'     }}
  - { name: crb            ,description: 'EL 9 CRB'          ,module: node  ,releases: [    9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/CRB/$basearch/os/'            ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/CRB/$basearch/os/'             ,europe: 'https://mirrors.xtom.de/rocky/$releasever/CRB/$basearch/os/'        }}
  - { name: powertools     ,description: 'EL 8 PowerTools'   ,module: node  ,releases: [  8  ] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/PowerTools/$basearch/os/'     ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/PowerTools/$basearch/os/'      ,europe: 'https://mirrors.xtom.de/rocky/$releasever/PowerTools/$basearch/os/' }}
  - { name: epel           ,description: 'EL 8+ EPEL'        ,module: node  ,releases: [  8,9] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/Everything/$basearch/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/Everything/$basearch/'     ,europe: 'https://mirrors.xtom.de/epel/$releasever/Everything/$basearch/'     }}
  - { name: pgdg-common    ,description: 'PostgreSQL Common' ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-extras    ,description: 'PostgreSQL Extra'  ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-el8fix    ,description: 'PostgreSQL EL8FIX' ,module: pgsql ,releases: [  8  ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' } }
  - { name: pgdg-el9fix    ,description: 'PostgreSQL EL9FIX' ,module: pgsql ,releases: [    9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/'  ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' }}
  - { name: pgdg15         ,description: 'PostgreSQL 15'     ,module: pgsql ,releases: [7    ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/15/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg16         ,description: 'PostgreSQL 16'     ,module: pgsql ,releases: [  8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/16/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' }}
  - { name: timescaledb    ,description: 'TimescaleDB'       ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://packagecloud.io/timescale/timescaledb/el/$releasever/$basearch'  }}
repo_packages:
  - ansible python3 python3-pip python3-virtualenv python3-requests python3-jmespath python3.11-jmespath python3.11-pip dnf-utils modulemd-tools createrepo_c sshpass # Distro & Boot
  - nginx dnsmasq etcd haproxy vip-manager pg_exporter pgbackrest_exporter                                                                                            # Pigsty Addons
  - grafana loki logcli promtail prometheus2 alertmanager pushgateway node_exporter blackbox_exporter nginx_exporter keepalived_exporter                              # Infra Packages
  - lz4 unzip bzip2 zlib yum pv jq git ncdu make patch bash lsof wget uuid tuned nvme-cli numactl grubby sysstat iotop htop rsync tcpdump perf flamegraph             # Node Tools 1
  - netcat socat ftp lrzsz net-tools ipvsadm bind-utils telnet audit ca-certificates openssl openssh-clients readline vim-minimal keepalived chrony                   # Node Tools 2
  - patroni patroni-etcd pgbouncer pgbadger pgbackrest pgloader pg_activity pg_filedump timescaledb-tools scws libduckdb pgFormatter # pgxnclient missing in el9      # PGSQL Common Tools
  - postgresql16* pg_repack_16* wal2json_16* passwordcheck_cracklib_16* pglogical_16* pg_cron_16* postgis34_16* timescaledb-2-postgresql-16* pgvector_16* citus_16*   # PGDG 16 Packages
  - pgml_16* pg_graphql_16 pg_net_16* pgsql-http_16* pgsql-gzip_16* vault_16 pgjwt_16 pg_tle_16* pg_roaringbitmap_16* pointcloud_16* zhparser_16* hydra_16* apache-age_16* duckdb_fdw_16* pg_sparse_16* pg_sparse_16* pg_bm25_16* pg_analytics_16*
  - orafce_16* mongo_fdw_16* tds_fdw_16* mysql_fdw_16 hdfs_fdw_16 sqlite_fdw_16 pgbouncer_fdw_16 powa_16* pg_stat_kcache_16* pg_stat_monitor_16* pg_qualstats_16 pg_track_settings_16 pg_wait_sampling_16 hll_16 pgaudit_16
  - plprofiler_16* plsh_16* pldebugger_16 plpgsql_check_16* pgtt_16 pgq_16* pgsql_tweaks_16 count_distinct_16 hypopg_16 timestamp9_16* semver_16* prefix_16* periods_16 ip4r_16 tdigest_16 pgmp_16 extra_window_functions_16 topn_16
  - pg_background_16 e-maj_16 pg_prioritize_16 pgcryptokey_16 logerrors_16 pg_top_16 pg_comparator_16 pg_ivm_16* pgsodium_16* pgfincore_16* ddlx_16 credcheck_16 safeupdate_16 pg_squeeze_16* pg_fkpart_16 pg_jobmon_16
  - pg_partman_16 pg_permissions_16 pgexportdoc_16 pgimportdoc_16 pg_statement_rollback_16* pg_hint_plan_16* pg_auth_mon_16 pg_checksums_16 pg_failover_slots_16 pg_readonly_16* pg_uuidv7_16* set_user_16* rum_16
  - redis_exporter mysqld_exporter mongodb_exporter docker-ce docker-compose-plugin redis minio mcli ferretdb duckdb sealos  # Miscellaneous Packages
  #- mysqlcompat_16 system_stats_16 multicorn2_16* plproxy_16 geoip_16 pgcopydb_16 pg_catcheck_16 pg_store_plans_16* postgresql-unit_16 # not available for PG 16 yet
repo_url_packages:
  - https://repo.pigsty.cc/etc/pev.html
  - https://repo.pigsty.cc/etc/chart.tgz
  - https://repo.pigsty.cc/etc/plugins.tgz

repo_enabled

name: repo_enabled, type: bool, level: G/I

create a yum repo on this infra node? default value: true

If you have multiple infra nodes, you can disable yum repo on other standby nodes to reduce Internet traffic.
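
A hypothetical sketch with two infra nodes, where only the first one builds the local repo:

infra:
  hosts:
    10.10.10.10: { infra_seq: 1 }                        # primary infra node, builds the local repo
    10.10.10.11: { infra_seq: 2 , repo_enabled: false }  # standby infra node, skips repo building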

repo_home

name: repo_home, type: path, level: G

repo home dir, /www by default

repo_name

name: repo_name, type: string, level: G

repo name, pigsty by default, it is not wise to change this value

repo_endpoint

name: repo_endpoint, type: url, level: G

access point to this repo by domain or ip:port, default value: http://${admin_ip}:80

If you have changed the nginx_port or nginx_ssl_port, or use a different infra node from admin node, please adjust this parameter accordingly.

The ${admin_ip} will be replaced with actual admin_ip during runtime.
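
For example, a hypothetical override when nginx listens on a non-default port:

nginx_port: 8080                           # nginx listening on 8080 instead of 80
repo_endpoint: http://${admin_ip}:8080     # keep the repo access point in sync with nginx_port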

repo_remove

name: repo_remove, type: bool, level: G/A

remove existing upstream repo, default value: true

If you want to keep existing upstream repo, set this value to false.

repo_modules

name: repo_modules, type: string, level: G/A

which repo modules are installed in repo_upstream, default value: infra,node,pgsql

This is a comma separated value string, it is used to filter entries in repo_upstream with corresponding module field.

For Ubuntu / Debian users, you can add redis to the list: infra,node,pgsql,redis
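
As a sketch, enabling the extra redis module only requires extending the list:

repo_modules: infra,node,pgsql,redis      # also build the local repo with redis module packages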

repo_upstream

name: repo_upstream, type: upstream[], level: G

where to download upstream packages, default values are for EL 7/8/9:

repo_upstream:                    # where to download
  - { name: pigsty-local   ,description: 'Pigsty Local'      ,module: local ,releases: [7,8,9] ,baseurl: { default: 'http://${admin_ip}/pigsty'  }} # used by intranet nodes
  - { name: pigsty-infra   ,description: 'Pigsty INFRA'      ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/infra/$basearch' ,china: 'https://repo.pigsty.cc/rpm/infra/$basearch' }}
  - { name: pigsty-pgsql   ,description: 'Pigsty PGSQL'      ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://repo.pigsty.io/rpm/pgsql/el$releasever.$basearch' ,china: 'https://repo.pigsty.cc/rpm/pgsql/el$releasever.$basearch' }}
  - { name: nginx          ,description: 'Nginx Repo'        ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://nginx.org/packages/centos/$releasever/$basearch/' }}
  - { name: docker-ce      ,description: 'Docker CE'         ,module: infra ,releases: [7,8,9] ,baseurl: { default: 'https://download.docker.com/linux/centos/$releasever/$basearch/stable'        ,china: 'https://mirrors.aliyun.com/docker-ce/linux/centos/$releasever/$basearch/stable'  ,europe: 'https://mirrors.xtom.de/docker-ce/linux/centos/$releasever/$basearch/stable' }}
  - { name: base           ,description: 'EL 7 Base'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/os/$basearch/'                    ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/os/$basearch/'           ,europe: 'https://mirrors.xtom.de/centos/$releasever/os/$basearch/'           }}
  - { name: updates        ,description: 'EL 7 Updates'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/updates/$basearch/'               ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/updates/$basearch/'      ,europe: 'https://mirrors.xtom.de/centos/$releasever/updates/$basearch/'      }}
  - { name: extras         ,description: 'EL 7 Extras'       ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/extras/$basearch/'                ,china: 'https://mirrors.tuna.tsinghua.edu.cn/centos/$releasever/extras/$basearch/'       ,europe: 'https://mirrors.xtom.de/centos/$releasever/extras/$basearch/'       }}
  - { name: epel           ,description: 'EL 7 EPEL'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/$basearch/'            ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/$basearch/'                ,europe: 'https://mirrors.xtom.de/epel/$releasever/$basearch/'                }}
  - { name: centos-sclo    ,description: 'EL 7 SCLo'         ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/sclo/'             ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/sclo/'              ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/sclo/'    }}
  - { name: centos-sclo-rh ,description: 'EL 7 SCLo rh'      ,module: node  ,releases: [7    ] ,baseurl: { default: 'http://mirror.centos.org/centos/$releasever/sclo/$basearch/rh/'               ,china: 'https://mirrors.aliyun.com/centos/$releasever/sclo/$basearch/rh/'                ,europe: 'https://mirrors.xtom.de/centos/$releasever/sclo/$basearch/rh/'      }}
  - { name: baseos         ,description: 'EL 8+ BaseOS'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/BaseOS/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/BaseOS/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/BaseOS/$basearch/os/'     }}
  - { name: appstream      ,description: 'EL 8+ AppStream'   ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/AppStream/$basearch/os/'      ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/AppStream/$basearch/os/'       ,europe: 'https://mirrors.xtom.de/rocky/$releasever/AppStream/$basearch/os/'  }}
  - { name: extras         ,description: 'EL 8+ Extras'      ,module: node  ,releases: [  8,9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/extras/$basearch/os/'         ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/extras/$basearch/os/'          ,europe: 'https://mirrors.xtom.de/rocky/$releasever/extras/$basearch/os/'     }}
  - { name: crb            ,description: 'EL 9 CRB'          ,module: node  ,releases: [    9] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/CRB/$basearch/os/'            ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/CRB/$basearch/os/'             ,europe: 'https://mirrors.xtom.de/rocky/$releasever/CRB/$basearch/os/'        }}
  - { name: powertools     ,description: 'EL 8 PowerTools'   ,module: node  ,releases: [  8  ] ,baseurl: { default: 'https://dl.rockylinux.org/pub/rocky/$releasever/PowerTools/$basearch/os/'     ,china: 'https://mirrors.aliyun.com/rockylinux/$releasever/PowerTools/$basearch/os/'      ,europe: 'https://mirrors.xtom.de/rocky/$releasever/PowerTools/$basearch/os/' }}
  - { name: epel           ,description: 'EL 8+ EPEL'        ,module: node  ,releases: [  8,9] ,baseurl: { default: 'http://download.fedoraproject.org/pub/epel/$releasever/Everything/$basearch/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/epel/$releasever/Everything/$basearch/'     ,europe: 'https://mirrors.xtom.de/epel/$releasever/Everything/$basearch/'     }}
  - { name: pgdg-common    ,description: 'PostgreSQL Common' ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-extras    ,description: 'PostgreSQL Extra'  ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rhel$releasever-extras/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg-el8fix    ,description: 'PostgreSQL EL8FIX' ,module: pgsql ,releases: [  8  ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-centos8-sysupdates/redhat/rhel-8-x86_64/' } }
  - { name: pgdg-el9fix    ,description: 'PostgreSQL EL9FIX' ,module: pgsql ,releases: [    9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/'  ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' , europe: 'https://mirrors.xtom.de/postgresql/repos/yum/common/pgdg-rocky9-sysupdates/redhat/rhel-9-x86_64/' }}
  - { name: pgdg15         ,description: 'PostgreSQL 15'     ,module: pgsql ,releases: [7    ] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/15/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/15/redhat/rhel-$releasever-$basearch' }}
  - { name: pgdg16         ,description: 'PostgreSQL 16'     ,module: pgsql ,releases: [  8,9] ,baseurl: { default: 'https://download.postgresql.org/pub/repos/yum/16/redhat/rhel-$releasever-$basearch' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' ,europe: 'https://mirrors.xtom.de/postgresql/repos/yum/16/redhat/rhel-$releasever-$basearch' }}
  - { name: timescaledb    ,description: 'TimescaleDB'       ,module: pgsql ,releases: [7,8,9] ,baseurl: { default: 'https://packagecloud.io/timescale/timescaledb/el/$releasever/$basearch'  }}

For Debian (11,12) / Ubuntu (20.04,22.04) the proper value needs to be explicitly specified in global/cluster/host vars:

repo_upstream:                    # where to download
  - { name: pigsty-local  ,description: 'Pigsty Local'     ,module: local ,releases: [11,12,20,22] ,baseurl: { default: 'http://${admin_ip}/pigsty ./' }}
  - { name: pigsty-pgsql  ,description: 'Pigsty PgSQL'     ,module: pgsql ,releases: [11,12,20,22] ,baseurl: { default: 'https://repo.pigsty.io/deb/pgsql/${distro_codename}.amd64/ ./', china: 'https://repo.pigsty.cc/deb/pgsql/${distro_codename}.amd64/ ./' }}
  - { name: pigsty-infra  ,description: 'Pigsty Infra'     ,module: infra ,releases: [11,12,20,22] ,baseurl: { default: 'https://repo.pigsty.io/deb/infra/amd64/ ./', china: 'https://repo.pigsty.cc/deb/infra/amd64/ ./' }}
  - { name: nginx         ,description: 'Nginx'            ,module: infra ,releases: [11,12,20,22] ,baseurl: { default: 'http://nginx.org/packages/mainline/${distro_name} ${distro_codename} nginx' }}
  - { name: base          ,description: 'Debian Basic'     ,module: node  ,releases: [11,12      ] ,baseurl: { default: 'http://deb.debian.org/debian/ ${distro_codename} main non-free-firmware'         ,china: 'https://mirrors.aliyun.com/debian/ ${distro_codename} main restricted universe multiverse' }}
  - { name: updates       ,description: 'Debian Updates'   ,module: node  ,releases: [11,12      ] ,baseurl: { default: 'http://deb.debian.org/debian/ ${distro_codename}-updates main non-free-firmware' ,china: 'https://mirrors.aliyun.com/debian/ ${distro_codename}-updates main restricted universe multiverse' }}
  - { name: security      ,description: 'Debian Security'  ,module: node  ,releases: [11,12      ] ,baseurl: { default: 'http://security.debian.org/debian-security ${distro_codename}-security main non-free-firmware' }}
  - { name: base          ,description: 'Ubuntu Basic'     ,module: node  ,releases: [      20,22] ,baseurl: { default: 'https://mirrors.edge.kernel.org/${distro_name}/ ${distro_codename}   main universe multiverse restricted' ,china: 'https://mirrors.aliyun.com/${distro_name}/ ${distro_codename}   main restricted universe multiverse' }}
  - { name: updates       ,description: 'Ubuntu Updates'   ,module: node  ,releases: [      20,22] ,baseurl: { default: 'https://mirrors.edge.kernel.org/ubuntu/ ${distro_codename}-backports main restricted universe multiverse' ,china: 'https://mirrors.aliyun.com/ubuntu/ ${distro_codename}-updates   main restricted universe multiverse' }}
  - { name: backports     ,description: 'Ubuntu Backports' ,module: node  ,releases: [      20,22] ,baseurl: { default: 'https://mirrors.edge.kernel.org/ubuntu/ ${distro_codename}-security  main restricted universe multiverse' ,china: 'https://mirrors.aliyun.com/ubuntu/ ${distro_codename}-backports main restricted universe multiverse' }}
  - { name: security      ,description: 'Ubuntu Security'  ,module: node  ,releases: [      20,22] ,baseurl: { default: 'https://mirrors.edge.kernel.org/ubuntu/ ${distro_codename}-updates   main restricted universe multiverse' ,china: 'https://mirrors.aliyun.com/ubuntu/ ${distro_codename}-security  main restricted universe multiverse' }}
  - { name: pgdg          ,description: 'PGDG'             ,module: pgsql ,releases: [11,12,20,22] ,baseurl: { default: 'http://apt.postgresql.org/pub/repos/apt/ ${distro_codename}-pgdg main' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/postgresql/repos/apt/ ${distro_codename}-pgdg main' }}
  - { name: citus         ,description: 'Citus'            ,module: pgsql ,releases: [11,12,20,22] ,baseurl: { default: 'https://packagecloud.io/citusdata/community/${distro_name}/ ${distro_codename} main'   }}
  - { name: timescaledb   ,description: 'Timescaledb'      ,module: pgsql ,releases: [11,12,20,22] ,baseurl: { default: 'https://packagecloud.io/timescale/timescaledb/${distro_name}/ ${distro_codename} main' }}
  - { name: redis         ,description: 'Redis'            ,module: redis ,releases: [11,12,20,22] ,baseurl: { default: 'https://packages.redis.io/deb ${distro_codename} main' }}
  - { name: docker-ce     ,description: 'Docker'           ,module: infra ,releases: [11,12,20,22] ,baseurl: { default: 'https://download.docker.com/linux/${distro_name} ${distro_codename} stable' ,china: 'https://mirrors.tuna.tsinghua.edu.cn/docker-ce/linux//${distro_name} ${distro_codename} stable' }}

Pigsty build.yml will have the default value for each OS.

repo_packages

name: repo_packages, type: string[], level: G

which packages to be included, default values:

repo_packages:
  - ansible python3 python3-pip python3-virtualenv python3-requests python3-jmespath python3.11-jmespath python3.11-pip dnf-utils modulemd-tools createrepo_c sshpass # Distro & Boot
  - nginx dnsmasq etcd haproxy vip-manager pg_exporter pgbackrest_exporter                                                                                            # Pigsty Addons
  - grafana loki logcli promtail prometheus2 alertmanager pushgateway node_exporter blackbox_exporter nginx_exporter keepalived_exporter                              # Infra Packages
  - lz4 unzip bzip2 zlib yum pv jq git ncdu make patch bash lsof wget uuid tuned nvme-cli numactl grubby sysstat iotop htop rsync tcpdump perf flamegraph             # Node Tools 1
  - netcat socat ftp lrzsz net-tools ipvsadm bind-utils telnet audit ca-certificates openssl openssh-clients readline vim-minimal keepalived chrony                   # Node Tools 2
  - patroni patroni-etcd pgbouncer pgbadger pgbackrest pgloader pg_activity pg_filedump timescaledb-tools scws libduckdb pgFormatter # pgxnclient missing in el9      # PGSQL Common Tools
  - postgresql16* pg_repack_16* wal2json_16* passwordcheck_cracklib_16* pglogical_16* pg_cron_16* postgis34_16* timescaledb-2-postgresql-16* pgvector_16* citus_16*   # PGDG 16 Packages
  - pgml_16* pg_graphql_16 pg_net_16* pgsql-http_16* pgsql-gzip_16* vault_16 pgjwt_16 pg_tle_16* pg_roaringbitmap_16* pointcloud_16* zhparser_16* hydra_16* apache-age_16* duckdb_fdw_16* pg_sparse_16* pg_bm25_16* pg_analytics_16*
  - orafce_16* mongo_fdw_16* tds_fdw_16* mysql_fdw_16 hdfs_fdw_16 sqlite_fdw_16 pgbouncer_fdw_16 powa_16* pg_stat_kcache_16* pg_stat_monitor_16* pg_qualstats_16 pg_track_settings_16 pg_wait_sampling_16 hll_16 pgaudit_16
  - plprofiler_16* plsh_16* pldebugger_16 plpgsql_check_16* pgtt_16 pgq_16* pgsql_tweaks_16 count_distinct_16 hypopg_16 timestamp9_16* semver_16* prefix_16* periods_16 ip4r_16 tdigest_16 pgmp_16 extra_window_functions_16 topn_16
  - pg_background_16 e-maj_16 pg_prioritize_16 pgcryptokey_16 logerrors_16 pg_top_16 pg_comparator_16 pg_ivm_16* pgsodium_16* pgfincore_16* ddlx_16 credcheck_16 safeupdate_16 pg_squeeze_16* pg_fkpart_16 pg_jobmon_16
  - pg_partman_16 pg_permissions_16 pgexportdoc_16 pgimportdoc_16 pg_statement_rollback_16* pg_hint_plan_16* pg_auth_mon_16 pg_checksums_16 pg_failover_slots_16 pg_readonly_16* pg_uuidv7_16* set_user_16* rum_16
  - redis_exporter mysqld_exporter mongodb_exporter docker-ce docker-compose-plugin redis minio mcli ferretdb duckdb sealos  # Miscellaneous Packages
  #- mysqlcompat_16 system_stats_16 multicorn2_16* plproxy_16 geoip_16 pgcopydb_16 pg_catcheck_16 pg_store_plans_16* postgresql-unit_16 # not available for PG 16 yet

Each line is a set of package names separated by spaces, where the specified software will be downloaded via repotrack.

EL7 packages are slightly different; here are some distro-specific packages:

  • EL7: python36-requests python36-idna yum-utils, and postgis33
  • EL8: python3.11-jmespath dnf-utils modulemd-tools, and postgis34
  • EL9: Same as EL8, but pgxnclient is still missing; add python3-jmespath

For debian/ubuntu, the proper value needs to be explicitly specified in global/cluster/host vars:

repo_packages:                    # which packages to be included
  - ansible python3 python3-pip python3-venv python3-jmespath dpkg-dev sshpass                                                                        # Distro & Boot
  - nginx dnsmasq etcd haproxy vip-manager pg-exporter pgbackrest-exporter                                                                            # Pigsty Addon
  - grafana loki logcli promtail prometheus2 alertmanager pushgateway node-exporter blackbox-exporter nginx-exporter keepalived-exporter              # Infra Packages
  - redis-exporter mysqld-exporter mongodb-exporter docker-ce docker-compose-plugin redis minio mcli ferretdb duckdb sealos                           # Miscellaneous
  - lz4 unzip bzip2 zlib1g pv jq git ncdu make patch bash lsof wget uuid tuned nvme-cli numactl sysstat iotop htop rsync tcpdump linux-tools-generic  # Node Tools 1
  - netcat socat ftp lrzsz net-tools ipvsadm dnsutils telnet ca-certificates openssl openssh-client libreadline-dev vim-tiny keepalived acl chrony    # Node Tools 2
  - patroni pgbouncer pgbackrest pgbadger pgloader pg-activity postgresql-filedump pgxnclient pgformatter                                              # PGSQL Packages
  - postgresql-client-16 postgresql-16 postgresql-server-dev-16 postgresql-plpython3-16 postgresql-plperl-16 postgresql-pltcl-16 postgresql-16-wal2json postgresql-16-repack
  - postgresql-16-postgis-3 postgresql-16-postgis-3-scripts postgresql-16-citus-12.1 postgresql-16-pgvector timescaledb-2-postgresql-16               # PGDG 16 Extensions
  - postgresql-16-age postgresql-16-asn1oid postgresql-16-auto-failover postgresql-16-bgw-replstatus postgresql-16-pg-catcheck postgresql-16-pg-checksums postgresql-16-credcheck postgresql-16-cron postgresql-16-debversion postgresql-16-decoderbufs postgresql-16-dirtyread postgresql-16-extra-window-functions postgresql-16-first-last-agg postgresql-16-hll postgresql-16-hypopg postgresql-16-icu-ext postgresql-16-ip4r postgresql-16-jsquery
  - postgresql-16-londiste-sql postgresql-16-mimeo postgresql-16-mysql-fdw postgresql-16-numeral postgresql-16-ogr-fdw postgresql-16-omnidb postgresql-16-oracle-fdw postgresql-16-orafce postgresql-16-partman postgresql-16-periods postgresql-16-pgaudit postgresql-16-pgauditlogtofile postgresql-16-pgextwlist postgresql-16-pg-fact-loader postgresql-16-pg-failover-slots postgresql-16-pgfincore postgresql-16-pgl-ddl-deploy postgresql-16-pglogical postgresql-16-pglogical-ticker
  - postgresql-16-pgmemcache postgresql-16-pgmp postgresql-16-pgpcre postgresql-16-pgq3 postgresql-16-pgq-node postgresql-16-pg-qualstats postgresql-16-pgsphere postgresql-16-pg-stat-kcache postgresql-16-pgtap postgresql-16-pg-track-settings postgresql-16-pg-wait-sampling postgresql-16-pldebugger postgresql-16-pllua postgresql-16-plpgsql-check postgresql-16-plprofiler postgresql-16-plproxy postgresql-16-plsh postgresql-16-pointcloud
  - postgresql-16-powa postgresql-16-prefix postgresql-16-preprepare postgresql-16-prioritize postgresql-16-q3c postgresql-16-rational postgresql-16-rum postgresql-16-semver postgresql-16-set-user postgresql-16-show-plans postgresql-16-similarity postgresql-16-snakeoil postgresql-16-squeeze postgresql-16-tablelog postgresql-16-tdigest postgresql-16-tds-fdw postgresql-16-toastinfo postgresql-16-topn postgresql-16-unit
  - pg-graphql pg-net pg-analytics pg-bm25 pg-sparse
  #- postgresql-16-pljava postgresql-16-plr postgresql-16-rdkit postgresql-16-pgrouting-doc postgresql-16-pgrouting postgresql-16-pgrouting-scripts postgresql-16-h3 postgresql-16-pgdg-pgroonga postgresql-16-pgpool2 postgresql-16-slony1-2 postgresql-16-repmgr

There are some differences between Ubuntu / Debian too:

  • Ubuntu 22.04: postgresql-pgml-15, postgresql-15-rdkit, linux-tools-generic(perf), netcat, ftp
  • Ubuntu 20.04: postgresql-15-rdkit not available. postgresql-15-postgis-3 must be installed online (without local repo)
  • Debian 12: netcat -> netcat-openbsd, ftp -> tnftp, linux-tools-generic(perf) -> linux-perf, the rest is the same as Ubuntu
  • Debian 11: Same as Debian 12, except for postgresql-15-rdkit not available

Each line is a set of package names separated by spaces, where the specified software and their dependencies will be downloaded via repotrack or apt download accordingly.

Pigsty build.yml will have the default value for each OS.

repo_url_packages

name: repo_url_packages, type: string[], level: G

extra packages from url, default values:

repo_url_packages:
  - https://repo.pigsty.cc/etc/pev.html     # postgres explain visualizer
  - https://repo.pigsty.cc/etc/chart.tgz    # grafana extra map geojson data
  - https://repo.pigsty.cc/etc/plugins.tgz  # grafana plugins

These are optional add-ons, which will be downloaded via URL from the Internet directly.

For example, if plugins.tgz is not downloaded here, Pigsty will download the Grafana plugins from the Internet during Grafana setup instead.


INFRA_PACKAGE

These packages are installed on infra nodes only, including common rpm/deb/pip packages.

infra_packages

name: infra_packages, type: string[], level: G

packages to be installed on infra nodes, default value:

infra_packages:                   # packages to be installed on infra nodes
  - grafana,loki,logcli,promtail,prometheus2,alertmanager,pushgateway
  - node_exporter,blackbox_exporter,nginx_exporter,pg_exporter
  - nginx,dnsmasq,ansible,etcd,python3-requests,redis,mcli

The default value for Debian/Ubuntu should be explicitly overridden:

infra_packages:                   # packages to be installed on infra nodes
  - grafana,loki,logcli,promtail,prometheus2,alertmanager,pushgateway,blackbox-exporter
  - node-exporter,blackbox-exporter,nginx-exporter,redis-exporter,pg-exporter
  - nginx,dnsmasq,ansible,etcd,python3-requests,redis,mcli

infra_packages_pip

name: infra_packages_pip, type: string, level: G

pip installed packages for infra nodes, default value is empty string
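
If you do need pip packages on infra nodes, a hypothetical example (the package name is illustrative only) would be:

infra_packages_pip: 'jupyterlab'  # illustrative: install jupyterlab via pip on infra nodes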


NGINX

Pigsty exposes all web services through Nginx: the home page, Grafana, Prometheus, AlertManager, etc., optional tools such as PGWeb, Jupyter Lab, pgAdmin, and Bytebase, and static resources & reports such as pev, schemaspy & pgbadger.

This nginx also serves as a local yum/apt repo.

nginx_enabled: true               # enable nginx on this infra node?
nginx_exporter_enabled: true      # enable nginx_exporter on this infra node?
nginx_sslmode: enable             # nginx ssl mode? disable,enable,enforce
nginx_home: /www                  # nginx content dir, `/www` by default
nginx_port: 80                    # nginx listen port, 80 by default
nginx_ssl_port: 443               # nginx ssl listen port, 443 by default
nginx_navbar:                     # nginx index page navigation links
  - { name: CA Cert ,url: '/ca.crt'   ,desc: 'pigsty self-signed ca.crt'   }
  - { name: Package ,url: '/pigsty'   ,desc: 'local yum repo packages'     }
  - { name: PG Logs ,url: '/logs'     ,desc: 'postgres raw csv logs'       }
  - { name: Reports ,url: '/report'   ,desc: 'pgbadger summary report'     }
  - { name: Explain ,url: '/pigsty/pev.html' ,desc: 'postgres explain visualizer' }

nginx_enabled

name: nginx_enabled, type: bool, level: G/I

enable nginx on this infra node? default value: true

nginx_exporter_enabled

name: nginx_exporter_enabled, type: bool, level: G/I

enable nginx_exporter on this infra node? default value: true.

Setting it to false will also disable the /nginx health check stub. If your nginx does not support the /nginx stub, set this value to false to disable it.

nginx_sslmode

name: nginx_sslmode, type: enum, level: G

nginx ssl mode? which could be: disable, enable, enforce, the default value: enable

  • disable: listen on nginx_port and serve plain HTTP only
  • enable: also listen on nginx_ssl_port and serve HTTPS
  • enforce: all links will be rendered as https:// by default

nginx_home

name: nginx_home, type: path, level: G

nginx web server static content dir, /www by default

Nginx root directory which contains static resources and repo resources. It’s wise to set this value the same as repo_home so that local repo content is served automatically.

nginx_port

name: nginx_port, type: port, level: G

nginx listen port which serves the HTTP requests, 80 by default.

If your default 80 port is occupied or unavailable, you can consider using another port, and change repo_endpoint and repo_upstream (the local entry) accordingly.

nginx_ssl_port

name: nginx_ssl_port, type: port, level: G

nginx ssl listen port, 443 by default

nginx_navbar

name: nginx_navbar, type: index[], level: G

nginx index page navigation links

default value:

nginx_navbar:                     # nginx index page navigation links
  - { name: CA Cert ,url: '/ca.crt'   ,desc: 'pigsty self-signed ca.crt'   }
  - { name: Package ,url: '/pigsty'   ,desc: 'local yum repo packages'     }
  - { name: PG Logs ,url: '/logs'     ,desc: 'postgres raw csv logs'       }
  - { name: Reports ,url: '/report'   ,desc: 'pgbadger summary report'     }
  - { name: Explain ,url: '/pigsty/pev.html' ,desc: 'postgres explain visualizer' }

Each record is rendered as a navigation link in the App drop-down menu on the Pigsty home page. The apps are all optional, and are mounted by default on the default Pigsty server under http://pigsty/.

The url parameter specifies the URL PATH for the app, with the exception that if the ${grafana} string is present in the URL, it will be automatically replaced with the Grafana domain name defined in infra_portal.
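
As a sketch, a hypothetical navbar entry using the ${grafana} placeholder (the name, path, and description below are illustrative only) could look like:

nginx_navbar:
  - { name: Example ,url: '${grafana}/d/example-dashboard' ,desc: 'hypothetical link to a grafana dashboard' }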


DNS

Pigsty launches a default dnsmasq server on infra nodes to answer DNS queries, such as h.pigsty, a.pigsty, p.pigsty, g.pigsty, and sss.pigsty for the optional MinIO service.

All records will be added to the infra node’s /etc/hosts.d/*.

You have to add nameserver {{ admin_ip }} to your /etc/resolv.conf to use this DNS server; node_dns_servers will do the trick.

dns_enabled: true                 # setup dnsmasq on this infra node?
dns_port: 53                      # dns server listen port, 53 by default
dns_records:                      # dynamic dns records resolved by dnsmasq
  - "${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"
  - "${admin_ip} api.pigsty adm.pigsty cli.pigsty ddl.pigsty lab.pigsty git.pigsty sss.pigsty wiki.pigsty"

dns_enabled

name: dns_enabled, type: bool, level: G/I

setup dnsmasq on this infra node? default value: true

If you don’t want to use the default DNS server, you can set this value to false to disable it and use node_default_etc_hosts and node_etc_hosts instead.
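
A minimal sketch of that alternative, with an illustrative admin IP of 10.10.10.10:

dns_enabled: false                # skip dnsmasq on this infra node
node_default_etc_hosts:           # rely on static /etc/hosts records instead
  - "10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty"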

dns_port

name: dns_port, type: port, level: G

dns server listen port, 53 by default

dns_records

name: dns_records, type: string[], level: G

dynamic DNS records resolved by dnsmasq; some auxiliary domain names will be written to /etc/hosts.d/default on infra nodes by default:

dns_records:                      # dynamic dns records resolved by dnsmasq
  - "${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"
  - "${admin_ip} api.pigsty adm.pigsty cli.pigsty ddl.pigsty lab.pigsty git.pigsty sss.pigsty wiki.pigsty"

PROMETHEUS

Prometheus is used as the time-series database for metrics scraping, storage & analysis.

prometheus_enabled: true          # enable prometheus on this infra node?
prometheus_clean: true            # clean prometheus data during init?
prometheus_data: /data/prometheus # prometheus data dir, `/data/prometheus` by default
prometheus_sd_dir: /etc/prometheus/targets # prometheus file service discovery directory
prometheus_sd_interval: 5s        # prometheus target refresh interval, 5s by default
prometheus_scrape_interval: 10s   # prometheus scrape & eval interval, 10s by default
prometheus_scrape_timeout: 8s     # prometheus global scrape timeout, 8s by default
prometheus_options: '--storage.tsdb.retention.time=15d' # prometheus extra server options
pushgateway_enabled: true         # setup pushgateway on this infra node?
pushgateway_options: '--persistence.interval=1m' # pushgateway extra server options
blackbox_enabled: true            # setup blackbox_exporter on this infra node?
blackbox_options: ''              # blackbox_exporter extra server options
alertmanager_enabled: true        # setup alertmanager on this infra node?
alertmanager_options: ''          # alertmanager extra server options
exporter_metrics_path: /metrics   # exporter metric path, `/metrics` by default
exporter_install: none            # how to install exporter? none,yum,binary
exporter_repo_url: ''             # exporter repo file url if install exporter via yum

prometheus_enabled

name: prometheus_enabled, type: bool, level: G/I

enable prometheus on this infra node?

default value: true

prometheus_clean

name: prometheus_clean, type: bool, level: G/A

clean prometheus data during init? default value: true

prometheus_data

name: prometheus_data, type: path, level: G

prometheus data dir, /data/prometheus by default

prometheus_sd_dir

name: prometheus_sd_dir, type: path, level: G, default value: /etc/prometheus/targets

prometheus static file service discovery target dir, prometheus will find dynamic monitoring targets from this directory.

prometheus_sd_interval

name: prometheus_sd_interval, type: interval, level: G, default value: 5s

Prometheus will check the prometheus_sd_dir directory every prometheus_sd_interval (5s by default) to discover new monitoring targets.

prometheus_scrape_interval

name: prometheus_scrape_interval, type: interval, level: G

prometheus scrape & eval interval, 10s by default

prometheus_scrape_timeout

name: prometheus_scrape_timeout, type: interval, level: G

prometheus global scrape timeout, 8s by default

DO NOT set this larger than prometheus_scrape_interval

prometheus_options

name: prometheus_options, type: arg, level: G

prometheus extra server options

default value: --storage.tsdb.retention.time=15d

Extra cli args for prometheus server, the default value will set up a 15-day data retention to limit disk usage.
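
For example (the values below are illustrative), you could extend the retention window and add a size-based limit:

prometheus_options: '--storage.tsdb.retention.time=30d --storage.tsdb.retention.size=100GB'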

pushgateway_enabled

name: pushgateway_enabled, type: bool, level: G/I

setup pushgateway on this infra node? default value: true

pushgateway_options

name: pushgateway_options, type: arg, level: G

pushgateway extra server options, default value: --persistence.interval=1m

blackbox_enabled

name: blackbox_enabled, type: bool, level: G/I

setup blackbox_exporter on this infra node? default value: true

blackbox_options

name: blackbox_options, type: arg, level: G

blackbox_exporter extra server options, default value is empty string

alertmanager_enabled

name: alertmanager_enabled, type: bool, level: G/I

setup alertmanager on this infra node? default value: true

alertmanager_options

name: alertmanager_options, type: arg, level: G

alertmanager extra server options, default value is empty string

exporter_metrics_path

name: exporter_metrics_path, type: path, level: G

exporter metric path, /metrics by default

exporter_install

name: exporter_install, type: enum, level: G

(OBSOLETE) how to install exporter? none,yum,binary

default value: none

Specify how to install Exporter:

  • none: No installation (by default, the exporter has already been installed by the node_pkg task)
  • yum: Install using yum (if yum installation is enabled, run yum to install node_exporter and pg_exporter before deploying the exporter)
  • binary: Install by copying the binary (copy the node_exporter and pg_exporter binaries directly from the meta node, not recommended)

When installing with yum, if exporter_repo_url is specified (not empty), the installation will first install the REPO file under that URL into /etc/yum.repos.d. This feature allows you to install Exporter directly without initializing the node infrastructure. It is not recommended for regular users to use binary installation. This mode is usually used for emergency troubleshooting and temporary problem fixes.

<meta>:<pigsty>/files/node_exporter ->  <target>:/usr/bin/node_exporter
<meta>:<pigsty>/files/pg_exporter   ->  <target>:/usr/bin/pg_exporter

exporter_repo_url

name: exporter_repo_url, type: url, level: G

(OBSOLETE) exporter repo file url if install exporter via yum

default value is empty string

Default is empty; when exporter_install is yum, the repo specified by this parameter will be added to the node source list.


GRAFANA

Grafana is the visualization platform for Pigsty’s monitoring system.

It can also be used as a low-code data visualization environment.

grafana_enabled: true             # enable grafana on this infra node?
grafana_clean: true               # clean grafana data during init?
grafana_admin_username: admin     # grafana admin username, `admin` by default
grafana_admin_password: pigsty    # grafana admin password, `pigsty` by default
grafana_plugin_cache: /www/pigsty/plugins.tgz # path to grafana plugins cache tarball
grafana_plugin_list:              # grafana plugins to be downloaded with grafana-cli
  - volkovlabs-echarts-panel
  - volkovlabs-image-panel
  - volkovlabs-form-panel
  - volkovlabs-variable-panel
  - volkovlabs-grapi-datasource
  - marcusolsson-static-datasource
  - marcusolsson-json-datasource
  - marcusolsson-dynamictext-panel
  - marcusolsson-treemap-panel
  - marcusolsson-calendar-panel
  - marcusolsson-hourly-heatmap-panel
  - knightss27-weathermap-panel
loki_enabled: true                # enable loki on this infra node?
loki_clean: false                 # whether remove existing loki data?
loki_data: /data/loki             # loki data dir, `/data/loki` by default
loki_retention: 15d               # loki log retention period, 15d by default

grafana_enabled

name: grafana_enabled, type: bool, level: G/I

enable grafana on this infra node? default value: true

grafana_clean

name: grafana_clean, type: bool, level: G/A

clean grafana data during init? default value: true

grafana_admin_username

name: grafana_admin_username, type: username, level: G

grafana admin username, admin by default

grafana_admin_password

name: grafana_admin_password, type: password, level: G

grafana admin password, pigsty by default

default value: pigsty

WARNING: Change this to a strong password before deploying to production environment

grafana_plugin_cache

name: grafana_plugin_cache, type: path, level: G

path to grafana plugins cache tarball

default value: /www/pigsty/plugins.tgz

If that cache tarball exists, Pigsty will use it instead of downloading plugins from the Internet.

grafana_plugin_list

name: grafana_plugin_list, type: string[], level: G

grafana plugins to be downloaded with grafana-cli

default value:

grafana_plugin_list:              # grafana plugins to be downloaded with grafana-cli
  - volkovlabs-echarts-panel
  - volkovlabs-image-panel
  - volkovlabs-form-panel
  - volkovlabs-variable-panel
  - volkovlabs-grapi-datasource
  - marcusolsson-static-datasource
  - marcusolsson-json-datasource
  - marcusolsson-dynamictext-panel
  - marcusolsson-treemap-panel
  - marcusolsson-calendar-panel
  - marcusolsson-hourly-heatmap-panel
  - knightss27-weathermap-panel

LOKI

loki_enabled

name: loki_enabled, type: bool, level: G/I

enable loki on this infra node? default value: true

loki_clean

name: loki_clean, type: bool, level: G/A

whether remove existing loki data? default value: false

loki_data

name: loki_data, type: path, level: G

loki data dir, default value: /data/loki

loki_retention

name: loki_retention, type: interval, level: G

loki log retention period, 15d by default


NODE

The NODE module tunes target nodes into the desired state and incorporates them into the Pigsty monitoring system.


NODE_ID

Each node has identity parameters that are configured through the parameters in <cluster>.hosts and <cluster>.vars. Check NODE Identity for details.

nodename

name: nodename, type: string, level: I

node instance identity, use hostname if missing, optional

no default value; null or an empty string means nodename will be set to the node’s current hostname.

If node_id_from_pg is true (by default) and nodename is not explicitly defined, nodename will try to use ${pg_cluster}-${pg_seq} first, if PGSQL is not defined on this node, it will fall back to default HOSTNAME.

If nodename_overwrite is true, the node name will also be used as the HOSTNAME.

node_cluster

name: node_cluster, type: string, level: C

node cluster identity, use ’nodes’ if missing, optional

default values: nodes

If node_id_from_pg is true (by default) and node_cluster is not explicitly defined, node_cluster will try to use ${pg_cluster} first; if PGSQL is not defined on this node, it will fall back to the default value nodes.
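
A minimal inventory sketch assigning both identities explicitly (the cluster name, hostnames, and IPs below are illustrative):

node-test:
  hosts:
    10.10.10.11: { nodename: node-test-1 }
    10.10.10.12: { nodename: node-test-2 }
  vars:
    node_cluster: node-test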

nodename_overwrite

name: nodename_overwrite, type: bool, level: C

overwrite node’s hostname with nodename?

default value is true: a non-empty nodename will override the hostname of the current node.

When the nodename parameter is undefined or an empty string but node_id_from_pg is true, the node name will try to use {{ pg_cluster }}-{{ pg_seq }}, borrowing the identity from the 1:1 PostgreSQL instance’s name.

No changes are made to the hostname if nodename is undefined or empty and node_id_from_pg is false.

nodename_exchange

name: nodename_exchange, type: bool, level: C

exchange nodename among play hosts?

default value is false

When this parameter is enabled, node names are exchanged between the group of nodes executing the node.yml playbook and written to /etc/hosts.

node_id_from_pg

name: node_id_from_pg, type: bool, level: C

use postgres identity as node identity if applicable?

default value is true

Borrow PostgreSQL cluster & instance identity if applicable.

It’s useful to use the same identity for postgres & node if there’s a 1:1 relationship.


NODE_DNS

Pigsty configures static DNS records and a dynamic DNS resolver for nodes.

If you already have a DNS server, set node_dns_method to none to disable dynamic DNS setup.

node_write_etc_hosts: true        # modify `/etc/hosts` on target node?
node_default_etc_hosts:           # static dns records in `/etc/hosts`
  - "${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"
node_etc_hosts: []                # extra static dns records in `/etc/hosts`
node_dns_method: add              # how to handle dns servers: add,none,overwrite
node_dns_servers: ['${admin_ip}'] # dynamic nameserver in `/etc/resolv.conf`
node_dns_options:                 # dns resolv options in `/etc/resolv.conf`
  - options single-request-reopen timeout:1

node_write_etc_hosts

name: node_write_etc_hosts, type: bool, level: G/C/I

modify /etc/hosts on target node?

For example, a Docker container/VM may not be able to modify /etc/hosts by default, so you can set this value to false to skip the modification.

node_default_etc_hosts

name: node_default_etc_hosts, type: string[], level: G

static dns records in /etc/hosts

default value:

["${admin_ip} h.pigsty a.pigsty p.pigsty g.pigsty"]

node_default_etc_hosts is an array. Each element is a DNS record with format <ip> <name>.

It is used for global static DNS records. You can use node_etc_hosts for ad hoc records for each cluster.

Make sure to write a DNS record like 10.10.10.10 h.pigsty a.pigsty p.pigsty g.pigsty to /etc/hosts to ensure that the local yum repo can be accessed using the domain name before the DNS Nameserver starts.

node_etc_hosts

name: node_etc_hosts, type: string[], level: C

extra static dns records in /etc/hosts

default values: []

Same as node_default_etc_hosts, but in addition to it.
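
For example (the IP and domain below are illustrative), a cluster could add its own record on top of the global defaults:

node_etc_hosts:                   # extra static dns records for this cluster
  - "10.10.10.10 example.pigsty"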

node_dns_method

name: node_dns_method, type: enum, level: C

how to handle dns servers: add,none,overwrite

default values: add

  • add: Append the records in node_dns_servers to /etc/resolv.conf and keep the existing DNS servers. (default)
  • overwrite: Overwrite /etc/resolv.conf with the record in node_dns_servers
  • none: If a DNS server is provided in the production env, the DNS server config can be skipped.

node_dns_servers

name: node_dns_servers, type: string[], level: C

dynamic nameserver in /etc/resolv.conf

default values: ["${admin_ip}"] , the default nameserver on admin node will be added to /etc/resolv.conf as the first nameserver.

node_dns_options

name: node_dns_options, type: string[], level: C

dns resolv options in /etc/resolv.conf, default value:

- options single-request-reopen timeout:1

NODE_PACKAGE

This section is about upstream yum repos & packages to be installed.

node_repo_modules: local          # upstream repo to be added on node, local by default.
node_repo_remove: true            # remove existing repo on node?
node_packages: [ ]                # packages to be installed current nodes
node_default_packages:            # default packages to be installed on all nodes
  - lz4,unzip,bzip2,zlib,yum,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,grubby,sysstat,iotop,htop,rsync,tcpdump,python3,python3-pip
  - netcat,socat,ftp,lrzsz,net-tools,ipvsadm,bind-utils,telnet,audit,ca-certificates,openssl,readline,vim-minimal,node_exporter,etcd,haproxy

For Ubuntu nodes, use this default value explicitly:

- lz4,unzip,bzip2,zlib1g,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,sysstat,iotop,htop,rsync,tcpdump,chrony,acl,python3,python3-pip
- netcat,ftp,socat,lrzsz,net-tools,ipvsadm,dnsutils,telnet,ca-certificates,openssl,openssh-client,libreadline-dev,vim-tiny,keepalived,node-exporter,etcd,haproxy

For debian nodes, use this default value explicitly:

- lz4,unzip,bzip2,zlib1g,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,sysstat,iotop,htop,rsync,tcpdump,chrony,acl,python3,python3-pip
- netcat-openbsd,tnftp,socat,lrzsz,net-tools,ipvsadm,dnsutils,telnet,ca-certificates,openssl,openssh-client,libreadline-dev,vim-tiny,keepalived,node-exporter,etcd,haproxy

node_repo_modules

name: node_repo_modules, type: string, level: C/A

upstream repo to be added on node, default value: local

This parameter specifies the upstream repo to be added to the node. It is used to filter the repo_upstream entries and only the entries with the same module value will be added to the node’s software source. Which is similar to the repo_modules parameter.

node_repo_remove

name: node_repo_remove, type: bool, level: C/A

remove existing repo on node?

default value is true, and thus Pigsty will move existing repo files in /etc/yum.repos.d to a backup dir /etc/yum.repos.d/backup before adding upstream repos. On Debian/Ubuntu, Pigsty will back up & move /etc/apt/sources.list(.d) to /etc/apt/backup.

node_packages

name: node_packages, type: string[], level: C

packages to be installed on current nodes, default values: []

Each element is a comma-separated list of package names, which will be installed on the current node in addition to node_default_packages.

Like node_default_packages, but in addition to it; designed for overriding at the cluster/instance level.
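
A sketch of installing an extra package on one cluster only (the cluster name and package below are illustrative):

pg-test:
  vars:
    node_packages: [ 'zstd' ]     # installed in addition to node_default_packages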

node_default_packages

name: node_default_packages, type: string[], level: G

default packages to be installed on all nodes, the default value is for EL 7/8/9:

node_default_packages:            # default packages to be installed on all nodes
  - lz4,unzip,bzip2,zlib,yum,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,grubby,sysstat,iotop,htop,rsync,tcpdump,chrony,python3
  - netcat,socat,ftp,lrzsz,net-tools,ipvsadm,bind-utils,telnet,audit,ca-certificates,openssl,readline,vim-minimal,node_exporter,etcd,haproxy,python3-pip

For Ubuntu, the appropriate default value would be:

- lz4,unzip,bzip2,zlib1g,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,linux-tools-generic,nvme-cli,numactl,sysstat,iotop,htop,rsync,tcpdump,chrony,acl,python3,python3-pip
- netcat,socat,ftp,lrzsz,net-tools,ipvsadm,dnsutils,telnet,ca-certificates,openssl,openssh-client,libreadline-dev,vim-tiny,keepalived,node-exporter,etcd,haproxy

For Debian, the appropriate default value would be:

- lz4,unzip,bzip2,zlib1g,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,linux-perf,nvme-cli,numactl,sysstat,iotop,htop,rsync,tcpdump,chrony,acl,python3,python3-pip
- netcat-openbsd,socat,tnftp,lrzsz,net-tools,ipvsadm,dnsutils,telnet,ca-certificates,openssl,openssh-client,libreadline-dev,vim-tiny,keepalived,node-exporter,etcd,haproxy

NODE_TUNE

Configure tuned templates, features, kernel modules, sysctl params on node.

node_disable_firewall: true       # disable node firewall? true by default
node_disable_selinux: true        # disable node selinux? true by default
node_disable_numa: false          # disable node numa, reboot required
node_disable_swap: false          # disable node swap, use with caution
node_static_network: true         # preserve dns resolver settings after reboot
node_disk_prefetch: false         # setup disk prefetch on HDD to increase performance
node_kernel_modules: [ softdog, br_netfilter, ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh ]
node_hugepage_count: 0            # number of 2MB hugepage, take precedence over ratio
node_hugepage_ratio: 0            # node mem hugepage ratio, 0 disable it by default
node_overcommit_ratio: 0          # node mem overcommit ratio, 0 disable it by default
node_tune: oltp                   # node tuned profile: none,oltp,olap,crit,tiny
node_sysctl_params: { }           # sysctl parameters in k:v format in addition to tuned

node_disable_firewall

name: node_disable_firewall, type: bool, level: C

disable node firewall? true by default

default value is true

node_disable_selinux

name: node_disable_selinux, type: bool, level: C

disable node selinux? true by default

default value is true

node_disable_numa

name: node_disable_numa, type: bool, level: C

disable node numa, reboot required

default value is false

Boolean flag, off by default, i.e. NUMA is not disabled. Note that turning off NUMA requires a reboot of the machine before it can take effect!

If you don’t know how to set the CPU affinity, it is recommended to turn off NUMA.

node_disable_swap

name: node_disable_swap, type: bool, level: C

disable node swap, use with caution

default value is false

Turning off SWAP is generally not recommended, but SWAP should be disabled when your node is used for a Kubernetes deployment.

If there is enough memory and the database is deployed exclusively, disabling swap may slightly improve performance.

node_static_network

name: node_static_network, type: bool, level: C

preserve dns resolver settings after reboot, default value is true

Enabling static networking means that machine reboots will not overwrite your DNS Resolv config with NIC changes. It is recommended to enable it in production environment.

node_disk_prefetch

name: node_disk_prefetch, type: bool, level: C

setup disk prefetch on HDD to increase performance

default value is false; consider enabling this when using HDDs.

node_kernel_modules

name: node_kernel_modules, type: string[], level: C

kernel modules to be enabled on this node

default value:

node_kernel_modules: [ softdog, br_netfilter, ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh ]

An array consisting of kernel module names declaring the kernel modules that need to be installed on the node.

node_hugepage_count

name: node_hugepage_count, type: int, level: C

number of 2MB hugepage, take precedence over ratio, 0 by default

Take precedence over node_hugepage_ratio. If a non-zero value is given, it will be written to /etc/sysctl.d/hugepage.conf

If node_hugepage_count and node_hugepage_ratio are both 0 (default), hugepage will be disabled at all.

Negative values will not work, and a value corresponding to more than 90% of node memory will be capped at 90% of node memory.

It should be slightly larger than pg_shared_buffer_ratio, if not zero.

node_hugepage_ratio

name: node_hugepage_ratio, type: float, level: C

node mem hugepage ratio, 0 disable it by default, valid range: 0 ~ 0.40

default values: 0, which will set vm.nr_hugepages=0 and not use HugePage at all.

Percent of this memory will be allocated as HugePage, and reserved for PostgreSQL.

It should be equal or slightly larger than pg_shared_buffer_ratio, if not zero.

For example, if you have the default 25% of memory for postgres shared buffers, you can set this value to 0.27 ~ 0.30. Wasted hugepages can be reclaimed later with /pg/bin/pg-tune-hugepage.
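
A minimal sketch of that pairing (values taken from the example above):

pg_shared_buffer_ratio: 0.25      # 25% of node memory for postgres shared buffers
node_hugepage_ratio: 0.27         # reserve slightly more than that as 2MB hugepages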

node_overcommit_ratio

name: node_overcommit_ratio, type: int, level: C

node mem overcommit ratio, 0 disables it by default; this is an integer from 0 to 100+.

default values: 0, which will set vm.overcommit_memory=0; otherwise vm.overcommit_memory=2 will be used, and this value will be used as vm.overcommit_ratio.

It is recommended to set vm.overcommit_ratio on dedicated pgsql nodes, e.g. 50 ~ 100.
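
For example (the ratio below is illustrative), on a dedicated pgsql node:

node_overcommit_ratio: 75         # renders vm.overcommit_memory=2 and vm.overcommit_ratio=75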

node_tune

name: node_tune, type: enum, level: C

node tuned profile: none,oltp,olap,crit,tiny

default values: oltp

  • tiny: Micro Virtual Machine (1 ~ 3 Core, 1 ~ 8 GB Mem)
  • oltp: Regular OLTP templates with optimized latency
  • olap : Regular OLAP templates to optimize throughput
  • crit: Core financial business templates, optimizing the number of dirty pages

Usually, the database tuning template pg_conf should be paired with the node tuning template: node_tune

node_sysctl_params

name: node_sysctl_params, type: dict, level: C

sysctl parameters in k:v format in addition to tuned

default values: {}

Dictionary K-V structure, Key is kernel sysctl parameter name, Value is the parameter value.

You can also define sysctl parameters with tuned profile
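
A sketch with illustrative kernel parameters:

node_sysctl_params:               # extra sysctl entries on top of the tuned profile
  vm.swappiness: 1
  net.ipv4.tcp_keepalive_time: 60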


NODE_ADMIN

This section is about admin users and their credentials.

node_data: /data                  # node main data directory, `/data` by default
node_admin_enabled: true          # create a admin user on target node?
node_admin_uid: 88                # uid and gid for node admin user
node_admin_username: dba          # name of node admin user, `dba` by default
node_admin_ssh_exchange: true     # exchange admin ssh key among node cluster
node_admin_pk_current: true       # add current user's ssh pk to admin authorized_keys
node_admin_pk_list: []            # ssh public keys to be added to admin user

node_data

name: node_data, type: path, level: C

node main data directory, /data by default

default values: /data

If specified, this path will be used as the main data disk mountpoint; a directory will be created there, and a warning will be thrown if the path does not exist.

The data dir is owned by root with mode 0777.

node_admin_enabled

name: node_admin_enabled, type: bool, level: C

create a admin user on target node?

default value is true

Create an admin user on each node (with password-free sudo and ssh). An admin user named dba (uid=88) will be created by default, which can access other nodes in the env and perform sudo from the meta node via SSH, password-free.

node_admin_uid

name: node_admin_uid, type: int, level: C

uid and gid for node admin user

default values: 88

node_admin_username

name: node_admin_username, type: username, level: C

name of node admin user, dba by default

default values: dba

node_admin_ssh_exchange

name: node_admin_ssh_exchange, type: bool, level: C

exchange admin ssh key among node cluster

default value is true

When enabled, Pigsty will exchange SSH public keys between members during playbook execution, allowing the admin user node_admin_username to access the other nodes in the cluster.

node_admin_pk_current

name: node_admin_pk_current, type: bool, level: C

add current user’s ssh pk to admin authorized_keys

default value is true

When enabled, the SSH public key (~/.ssh/id_rsa.pub) of the current user on the current node is copied to the authorized_keys of the target node’s admin user.

When deploying in a production env, be sure to pay attention to this parameter, which installs the default public key of the user currently executing the command to the admin user of all machines.

node_admin_pk_list

name: node_admin_pk_list, type: string[], level: C

ssh public keys to be added to admin user

default values: []

Each element of the array is a string containing the key written to the admin user ~/.ssh/authorized_keys, and the user with the corresponding private key can log in as an admin user.

When deploying in production envs, be sure to note this parameter and add only trusted keys to this list.
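
For example (the key below is a placeholder, not a real key):

node_admin_pk_list:
  - 'ssh-ed25519 AAAAC3PLACEHOLDER admin@example.com'   # replace with a trusted public key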


NODE_TIME

node_timezone: ''                 # setup node timezone, empty string to skip
node_ntp_enabled: true            # enable chronyd time sync service?
node_ntp_servers:                 # ntp servers in `/etc/chrony.conf`
  - pool pool.ntp.org iburst
node_crontab_overwrite: true      # overwrite or append to `/etc/crontab`?
node_crontab: [ ]                 # crontab entries in `/etc/crontab`

node_timezone

name: node_timezone, type: string, level: C

setup node timezone, empty string to skip

default value is empty string, which will not change the default timezone (usually UTC)

node_ntp_enabled

name: node_ntp_enabled, type: bool, level: C

enable chronyd time sync service?

default value is true, and thus Pigsty will override the node’s /etc/chrony.conf with the content of node_ntp_servers.

If you already have an NTP server configured, just set this to false to leave it be.

node_ntp_servers

name: node_ntp_servers, type: string[], level: C

ntp servers in /etc/chrony.conf, default value: ["pool pool.ntp.org iburst"]

It only takes effect if node_ntp_enabled is true.

You can use ${admin_ip} to sync time with ntp server on admin node rather than public ntp server.

node_ntp_servers: [ 'pool ${admin_ip} iburst' ]

node_crontab_overwrite

name: node_crontab_overwrite, type: bool, level: C

overwrite or append to /etc/crontab?

default value is true, and pigsty will render records in node_crontab in overwrite mode rather than appending to it.

node_crontab

name: node_crontab, type: string[], level: C

crontab entries in /etc/crontab

default values: []
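
A hypothetical entry (the path and schedule are illustrative) that runs a nightly job as the postgres user:

node_crontab:                     # rendered into /etc/crontab
  - '00 01 * * * postgres /pg/bin/pg-backup full'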


NODE_VIP

You can bind an optional L2 VIP among one node cluster, which is disabled by default.

L2 VIP can only be used in the same L2 LAN, which may incur extra restrictions on your network topology.

If enabled, You have to manually assign the vip_address and vip_vrid for each node cluster.

It is user’s responsibility to ensure that the address / vrid is unique among the same LAN.

vip_enabled: false                # enable vip on this node cluster?
# vip_address:         [IDENTITY] # node vip address in ipv4 format, required if vip is enabled
# vip_vrid:            [IDENTITY] # required, integer, 1-254, should be unique among same VLAN
vip_role: backup                  # optional, `master/backup`, backup by default, use as init role
vip_preempt: false                # optional, `true/false`, false by default, enable vip preemption
vip_interface: eth0               # node vip network interface to listen, `eth0` by default
vip_dns_suffix: ''                # node vip dns name suffix, empty string by default
vip_exporter_port: 9650           # keepalived exporter listen port, 9650 by default
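
A minimal sketch of enabling an L2 VIP on a node cluster (all addresses, the vrid, and the interface below are illustrative and must be adapted to your own LAN):

proxy:
  hosts:
    10.10.10.18: { vip_role: master }
    10.10.10.19: { vip_role: backup }
  vars:
    node_cluster: proxy
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.99
    vip_interface: eth1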

vip_enabled

name: vip_enabled, type: bool, level: C

enable vip on this node cluster? default value is false, which means no L2 VIP is created for this node cluster.

L2 VIP can only be used in the same L2 LAN, which may incur extra restrictions on your network topology.

vip_address

name: vip_address, type: ip, level: C

node vip address in IPv4 format, required if node vip_enabled.

no default value. This parameter must be explicitly assigned and unique in your LAN.

vip_vrid

name: vip_vrid, type: int, level: C

integer, 1-254, should be unique in same VLAN, required if node vip_enabled.

no default value. This parameter must be explicitly assigned and unique in your LAN.

vip_role

name: vip_role, type: enum, level: I

node vip role, could be master or backup, will be used as initial keepalived state.

vip_preempt

name: vip_preempt, type: bool, level: C/I

optional, true/false, false by default, enable vip preemption

default value is false, which means no preemption happens when a backup node has a higher priority than the living master.

vip_interface

name: vip_interface, type: string, level: C/I

node vip network interface to listen, eth0 by default.

It should be the primary intranet interface of your node, i.e. the interface holding the IP address you used in the inventory file.

If your node has a different interface name, you can override it in instance vars.

vip_dns_suffix

name: vip_dns_suffix, type: string, level: C/I

node vip dns name suffix, empty string by default. It will be used as the DNS name of the node VIP.

vip_exporter_port

name: vip_exporter_port, type: port, level: C/I

keepalived exporter listen port, 9650 by default.


HAPROXY

HAProxy is installed on every node by default, exposing services in a NodePort manner.

It is used by PGSQL Service.

haproxy_enabled: true             # enable haproxy on this node?
haproxy_clean: false              # cleanup all existing haproxy config?
haproxy_reload: true              # reload haproxy after config?
haproxy_auth_enabled: true        # enable authentication for haproxy admin page
haproxy_admin_username: admin     # haproxy admin username, `admin` by default
haproxy_admin_password: pigsty    # haproxy admin password, `pigsty` by default
haproxy_exporter_port: 9101       # haproxy admin/exporter port, 9101 by default
haproxy_client_timeout: 24h       # client side connection timeout, 24h by default
haproxy_server_timeout: 24h       # server side connection timeout, 24h by default
haproxy_services: []              # list of haproxy service to be exposed on node

haproxy_enabled

name: haproxy_enabled, type: bool, level: C

enable haproxy on this node?

default value is true

haproxy_clean

name: haproxy_clean, type: bool, level: G/C/A

cleanup all existing haproxy config?

default value is false

haproxy_reload

name: haproxy_reload, type: bool, level: A

reload haproxy after config?

default value is true, which will reload haproxy after a config change.

If you wish to review the rendered config before applying it, you can turn this off via CLI args and check it manually.

haproxy_auth_enabled

name: haproxy_auth_enabled, type: bool, level: G

enable authentication for haproxy admin page

default value is true, which will require HTTP basic auth for the admin page.

Disabling it is not recommended, since your traffic control page will be exposed.

haproxy_admin_username

name: haproxy_admin_username, type: username, level: G

haproxy admin username, admin by default

haproxy_admin_password

name: haproxy_admin_password, type: password, level: G

haproxy admin password, pigsty by default

PLEASE CHANGE IT IN YOUR PRODUCTION ENVIRONMENT!

haproxy_exporter_port

name: haproxy_exporter_port, type: port, level: C

haproxy admin/exporter port, 9101 by default

haproxy_client_timeout

name: haproxy_client_timeout, type: interval, level: C

client side connection timeout, 24h by default

haproxy_server_timeout

name: haproxy_server_timeout, type: interval, level: C

server side connection timeout, 24h by default

haproxy_services

name: haproxy_services, type: service[], level: C

list of haproxy services to be exposed on the node, default values: []

Each element is a service definition, here is an ad hoc haproxy service example:

haproxy_services:                   # list of haproxy service

  # expose pg-test read only replicas
  - name: pg-test-ro                # [REQUIRED] service name, unique
    port: 5440                      # [REQUIRED] service port, unique
    ip: "*"                         # [OPTIONAL] service listen addr, "*" by default
    protocol: tcp                   # [OPTIONAL] service protocol, 'tcp' by default
    balance: leastconn              # [OPTIONAL] load balance algorithm, roundrobin by default (or leastconn)
    maxconn: 20000                  # [OPTIONAL] max allowed front-end connection, 20000 by default
    default: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'
    options:
      - option httpchk
      - option http-keep-alive
      - http-check send meth OPTIONS uri /read-only
      - http-check expect status 200
    servers:
      - { name: pg-test-1 ,ip: 10.10.10.11 , port: 5432 , options: check port 8008 , backup: true }
      - { name: pg-test-2 ,ip: 10.10.10.12 , port: 5432 , options: check port 8008 }
      - { name: pg-test-3 ,ip: 10.10.10.13 , port: 5432 , options: check port 8008 }

It will be rendered to /etc/haproxy/<service.name>.cfg and take effect after reload.


NODE_EXPORTER

node_exporter_enabled: true       # setup node_exporter on this node?
node_exporter_port: 9100          # node exporter listen port, 9100 by default
node_exporter_options: '--no-collector.softnet --no-collector.nvme --collector.tcpstat --collector.processes'

node_exporter_enabled

name: node_exporter_enabled, type: bool, level: C

setup node_exporter on this node? default value is true

node_exporter_port

name: node_exporter_port, type: port, level: C

node exporter listen port, 9100 by default

node_exporter_options

name: node_exporter_options, type: arg, level: C

extra server options for node_exporter, default value: --no-collector.softnet --no-collector.nvme --collector.tcpstat --collector.processes

Pigsty enables the tcpstat and processes collectors, and disables the nvme and softnet metrics collectors by default.
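If you need additional collectors, you can override this parameter at the cluster or instance level; a hedged sketch (the extra systemd collector flag is an example, verify it against your node_exporter version):

node_exporter_options: '--no-collector.softnet --no-collector.nvme --collector.tcpstat --collector.processes --collector.systemd'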


PROMTAIL

Promtail will collect logs from other modules, and send them to LOKI

  • INFRA: Infra logs, collected only on infra nodes.

    • nginx-access: /var/log/nginx/access.log
    • nginx-error: /var/log/nginx/error.log
    • grafana: /var/log/grafana/grafana.log
  • NODES: Host node logs, collected on all nodes.

    • syslog: /var/log/messages
    • dmesg: /var/log/dmesg
    • cron: /var/log/cron
  • PGSQL: PostgreSQL logs, collected when a node is defined with pg_cluster.

    • postgres: /pg/log/postgres/*.csv
    • patroni: /pg/log/patroni.log
    • pgbouncer: /pg/log/pgbouncer/pgbouncer.log
    • pgbackrest: /pg/log/pgbackrest/*.log
  • REDIS: Redis logs, collected when a node is defined with redis_cluster.

    • redis: /var/log/redis/*.log

Log directories are customizable via pg_log_dir, patroni_log_dir, pgbouncer_log_dir, and pgbackrest_log_dir.

promtail_enabled: true            # enable promtail logging collector?
promtail_clean: false             # purge existing promtail status file during init?
promtail_port: 9080               # promtail listen port, 9080 by default
promtail_positions: /var/log/positions.yaml # promtail position status file path

promtail_enabled

name: promtail_enabled, type: bool, level: C

enable promtail logging collector?

default value is true

promtail_clean

name: promtail_clean, type: bool, level: G/A

purge existing promtail status file during init?

default value is false. If you choose to clean, Pigsty will remove the existing state file defined by promtail_positions, which means Promtail will recollect all logs on the current node and send them to Loki again.

promtail_port

name: promtail_port, type: port, level: C

promtail listen port, 9080 by default

default values: 9080

promtail_positions

name: promtail_positions, type: path, level: C

promtail position status file path

default values: /var/log/positions.yaml

Promtail records the consumption offsets of all logs, which are periodically written to the file specified by promtail_positions.


DOCKER

You can install Docker on nodes with the docker.yml playbook.

docker_enabled: false             # enable docker on this node?
docker_cgroups_driver: systemd    # docker cgroup fs driver: cgroupfs,systemd
docker_registry_mirrors: []       # docker registry mirror list
docker_image_cache: /tmp/docker   # docker image cache dir, `/tmp/docker` by default

docker_enabled

name: docker_enabled, type: bool, level: C

enable docker on this node? default value is false

docker_cgroups_driver

name: docker_cgroups_driver, type: enum, level: C

docker cgroup fs driver, could be cgroupfs or systemd, default values: systemd

docker_registry_mirrors

name: docker_registry_mirrors, type: string[], level: C

docker registry mirror list, default values: [], Example:

[ "https://mirror.ccs.tencentyun.com" ]         # tencent cloud mirror, intranet only
["https://registry.cn-hangzhou.aliyuncs.com"]   # aliyun cloud mirror, login required

docker_image_cache

name: docker_image_cache, type: path, level: C

docker image cache dir, /tmp/docker by default.

The local docker image cache with .tgz suffix under this directory will be loaded into docker one by one:

cat {{ docker_image_cache }}/*.tgz | gzip -d -c - | docker load

ETCD

ETCD is a distributed, reliable key-value store for the most critical data of a distributed system. Pigsty uses etcd as the DCS, which is critical to PostgreSQL high availability.

Pigsty uses the hard-coded group name etcd for the etcd cluster. It can be an existing external etcd cluster, or a new etcd cluster created by Pigsty with etcd.yml.

#etcd_seq: 1                      # etcd instance identifier, explicitly required
#etcd_cluster: etcd               # etcd cluster & group name, etcd by default
etcd_safeguard: false             # prevent purging running etcd instance?
etcd_clean: true                  # purging existing etcd during initialization?
etcd_data: /data/etcd             # etcd data directory, /data/etcd by default
etcd_port: 2379                   # etcd client port, 2379 by default
etcd_peer_port: 2380              # etcd peer port, 2380 by default
etcd_init: new                    # etcd initial cluster state, new or existing
etcd_election_timeout: 1000       # etcd election timeout, 1000ms by default
etcd_heartbeat_interval: 100      # etcd heartbeat interval, 100ms by default

etcd_seq

name: etcd_seq, type: int, level: I

etcd instance identifier, REQUIRED

no default value, you have to specify it explicitly. Here is a 3-node etcd cluster example:

etcd: # dcs service for postgres/patroni ha consensus
  hosts:  # 1 node for testing, 3 or 5 for production
    10.10.10.10: { etcd_seq: 1 }  # etcd_seq required
    10.10.10.11: { etcd_seq: 2 }  # assign from 1 ~ n
    10.10.10.12: { etcd_seq: 3 }  # odd number please
  vars: # cluster level parameter override roles/etcd
    etcd_cluster: etcd  # mark etcd cluster name etcd
    etcd_safeguard: false # safeguard against purging
    etcd_clean: true # purge etcd during init process

etcd_cluster

name: etcd_cluster, type: string, level: C

etcd cluster & group name, etcd by default

default values: etcd, which is a fixed group name. Changing it can be useful when you want to deploy some extra etcd clusters.

etcd_safeguard

name: etcd_safeguard, type: bool, level: G/C/A

prevent purging running etcd instance? default value is false

If enabled, running etcd instance will not be purged by etcd.yml playbook.

etcd_clean

name: etcd_clean, type: bool, level: G/C/A

purging existing etcd during initialization? default value is true

If enabled, running etcd instance will be purged by etcd.yml playbook, which makes the playbook fully idempotent.

But if etcd_safeguard is enabled, it will still abort on any running etcd instance.

etcd_data

name: etcd_data, type: path, level: C

etcd data directory, /data/etcd by default

etcd_port

name: etcd_port, type: port, level: C

etcd client port, 2379 by default

etcd_peer_port

name: etcd_peer_port, type: port, level: C

etcd peer port, 2380 by default

etcd_init

name: etcd_init, type: enum, level: C

etcd initial cluster state, new or existing

default values: new, which will create a standalone new etcd cluster.

The value existing is used when trying to add a new member to an existing etcd cluster.
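A hedged sketch of the inventory change when joining a new member (the new IP is an assumption; you would then run the etcd.yml playbook limited to the new node):

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }
    10.10.10.13: { etcd_seq: 4 }  # newly added member
  vars:
    etcd_cluster: etcd
    etcd_init: existing           # join the existing cluster instead of bootstrapping a new one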

etcd_election_timeout

name: etcd_election_timeout, type: int, level: C

etcd election timeout, 1000 (ms) by default

etcd_heartbeat_interval

name: etcd_heartbeat_interval, type: int, level: C

etcd heartbeat interval, 100 (ms) by default


MINIO

MinIO is an S3-compatible object storage service, which is used as an optional central backup storage repo for PostgreSQL.

You can also use it for other purposes, such as storing large files, documents, pictures & videos.

#minio_seq: 1                     # minio instance identifier, REQUIRED
minio_cluster: minio              # minio cluster name, minio by default
minio_clean: false                # cleanup minio during init?, false by default
minio_user: minio                 # minio os user, `minio` by default
minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node name pattern
minio_data: '/data/minio'         # minio data dir(s), use {x...y} to specify multi drivers
minio_domain: sss.pigsty          # minio external domain name, `sss.pigsty` by default
minio_port: 9000                  # minio service port, 9000 by default
minio_admin_port: 9001            # minio console port, 9001 by default
minio_access_key: minioadmin      # root access key, `minioadmin` by default
minio_secret_key: minioadmin      # root secret key, `minioadmin` by default
minio_extra_vars: ''              # extra environment variables
minio_alias: sss                  # alias name for local minio deployment
minio_buckets: [ { name: pgsql }, { name: infra },  { name: redis } ]
minio_users:
  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }

minio_seq

name: minio_seq, type: int, level: I

minio instance identifier, REQUIRED identity parameter. No default value; you have to assign it manually.
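For illustration, a minimal single-node MinIO cluster definition might look like this (the IP address is an assumption):

minio:
  hosts: { 10.10.10.10: { minio_seq: 1 } }
  vars: { minio_cluster: minio }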

minio_cluster

name: minio_cluster, type: string, level: C

minio cluster name, minio by default. This is useful when deploying multiple MinIO clusters

minio_clean

name: minio_clean, type: bool, level: G/C/A

cleanup minio during init?, false by default

minio_user

name: minio_user, type: username, level: C

minio os user name, minio by default

minio_node

name: minio_node, type: string, level: C

minio node name pattern, this is used for multi-node deployment

default values: ${minio_cluster}-${minio_seq}.pigsty

minio_data

name: minio_data, type: path, level: C

minio data dir(s)

default values: /data/minio, which is a common dir for single-node deployment.

For a multi-drive deployment, you can use the {x...y} notation to specify multiple drives.
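A hedged sketch of such a multi-drive layout (the mountpoint prefix and drive count are assumptions):

minio_data: '/data{1...4}'        # four drives mounted at /data1 ~ /data4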

minio_domain

name: minio_domain, type: string, level: G

minio service domain name, sss.pigsty by default.

The client can access minio S3 service via this domain name. This name will be registered to local DNSMASQ and included in SSL certs.

minio_port

name: minio_port, type: port, level: C

minio service port, 9000 by default

minio_admin_port

name: minio_admin_port, type: port, level: C

minio console port, 9001 by default

minio_access_key

name: minio_access_key, type: username, level: C

root access key, minioadmin by default

minio_secret_key

name: minio_secret_key, type: password, level: C

root secret key, minioadmin by default

default values: minioadmin

PLEASE CHANGE THIS IN YOUR DEPLOYMENT

minio_extra_vars

name: minio_extra_vars, type: string, level: C

extra environment variables for minio server. Check Minio Server for the complete list.

default value is empty string; you can use a multiline string to pass multiple environment variables.

minio_alias

name: minio_alias, type: string, level: G

MinIO alias name for the local MinIO cluster

default values: sss, which will be written to infra nodes’ / admin users’ client alias profile.

minio_buckets

name: minio_buckets, type: bucket[], level: C

list of minio bucket to be created by default:

minio_buckets: [ { name: pgsql }, { name: infra },  { name: redis } ]

Three default buckets are created for the PGSQL, INFRA, and REDIS modules.

minio_users

name: minio_users, type: user[], level: C

list of minio user to be created, default value:

minio_users:
  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }

Two default users are created for PostgreSQL DBA and pgBackREST.

PLEASE ADJUST THESE USERS & CREDENTIALS IN YOUR DEPLOYMENT!


REDIS

#redis_cluster:        <CLUSTER> # redis cluster name, required identity parameter
#redis_node: 1            <NODE> # redis node sequence number, node int id required
#redis_instances: {}      <NODE> # redis instances definition on this redis node
redis_fs_main: /data              # redis main data mountpoint, `/data` by default
redis_exporter_enabled: true      # install redis exporter on redis nodes?
redis_exporter_port: 9121         # redis exporter listen port, 9121 by default
redis_exporter_options: ''        # cli args and extra options for redis exporter
redis_safeguard: false            # prevent purging running redis instance?
redis_clean: true                 # purging existing redis during init?
redis_rmdata: true                # remove redis data when purging redis server?
redis_mode: standalone            # redis mode: standalone,cluster,sentinel
redis_conf: redis.conf            # redis config template path, except sentinel
redis_bind_address: '0.0.0.0'     # redis bind address, empty string will use host ip
redis_max_memory: 1GB             # max memory used by each redis instance
redis_mem_policy: allkeys-lru     # redis memory eviction policy
redis_password: ''                # redis password, empty string will disable password
redis_rdb_save: ['1200 1']        # redis rdb save directives, disable with empty list
redis_aof_enabled: false          # enable redis append only file?
redis_rename_commands: {}         # rename redis dangerous commands
redis_cluster_replicas: 1         # replica number for one master in redis cluster
redis_sentinel_monitor: []        # sentinel master list, works on sentinel cluster only

redis_cluster

name: redis_cluster, type: string, level: C

redis cluster name, required identity parameter.

no default value, you have to define it explicitly.

It should comply with the regexp [a-z][a-z0-9-]*; it is recommended to use the same name as the group name and start it with redis-.

redis_node

name: redis_node, type: int, level: I

redis node sequence number; a unique integer within the redis cluster is required.

You have to explicitly define the node id for each redis node, an integer starting from 0 or 1.

redis_instances

name: redis_instances, type: dict, level: I

redis instances definition on this redis node

no default value, you have to define redis instances on each redis node using this parameter explicitly.

Here is an example for a native redis cluster definition

redis-test: # redis native cluster: 3m x 3s
  hosts:
    10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
    10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
  vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }

The port number should be unique on the node, and the replica_of value should refer to an instance of the same redis cluster.

redis_instances:
    6379: {}
    6380: { replica_of: '10.10.10.13 6379' }
    6381: { replica_of: '10.10.10.13 6379' }

redis_fs_main

name: redis_fs_main, type: path, level: C

redis main data mountpoint, /data by default

default values: /data, and /data/redis will be used as the redis data directory.

redis_exporter_enabled

name: redis_exporter_enabled, type: bool, level: C

install redis exporter on redis nodes?

default value is true, which will launch a redis_exporter on this redis_node

redis_exporter_port

name: redis_exporter_port, type: port, level: C

redis exporter listen port, 9121 by default

default values: 9121

redis_exporter_options

name: redis_exporter_options, type: string, level: C/I

cli args and extra options for redis exporter, which will be added to /etc/default/redis_exporter.

default value is empty string

redis_safeguard

name: redis_safeguard, type: bool, level: G/C/A

prevent purging running redis instance?

default value is false. If set to true and a redis instance is running, the init / remove playbook will abort immediately.

redis_clean

name: redis_clean, type: bool, level: G/C/A

purging existing redis during init?

default value is true, which will remove redis server during redis init or remove.

redis_rmdata

name: redis_rmdata, type: bool, level: G/C/A

remove redis data when purging redis server?

default value is true, which will remove redis rdb / aof along with redis instance.

redis_mode

name: redis_mode, type: enum, level: C

redis mode: standalone,cluster,sentinel

default values: standalone

  • standalone: setup redis as standalone (master-slave) mode
  • cluster: setup this redis cluster as a redis native cluster
  • sentinel: setup redis as sentinel for standalone redis HA

redis_conf

name: redis_conf, type: string, level: C

redis config template path, except sentinel

default values: redis.conf, which is a template file in roles/redis/templates/redis.conf.

If you want to use your own redis config template, you can put it in templates/ directory and set this parameter to the template file name.

Note that redis sentinel uses a different template file, which is roles/redis/templates/redis-sentinel.conf

redis_bind_address

name: redis_bind_address, type: ip, level: C

redis bind address, empty string will use inventory hostname

default values: 0.0.0.0, which will bind to all available IPv4 address on this host

In production environments, PLEASE bind to the intranet IP only, i.e. set this value to '' so that the inventory IP is used.

redis_max_memory

name: redis_max_memory, type: size, level: C/I

max memory used by each redis instance, default values: 1GB

redis_mem_policy

name: redis_mem_policy, type: enum, level: C

redis memory eviction policy

default values: allkeys-lru, check redis eviction policy for more details

  • noeviction: New values aren’t saved when memory limit is reached. When a database uses replication, this applies to the primary database
  • allkeys-lru: Keeps most recently used keys; removes least recently used (LRU) keys
  • allkeys-lfu: Keeps frequently used keys; removes least frequently used (LFU) keys
  • volatile-lru: Removes least recently used keys with the expire field set to true.
  • volatile-lfu: Removes least frequently used keys with the expire field set to true.
  • allkeys-random: Randomly removes keys to make space for the new data added.
  • volatile-random: Randomly removes keys with expire field set to true.
  • volatile-ttl: Removes keys with expire field set to true and the shortest remaining time-to-live (TTL) value.

redis_password

name: redis_password, type: password, level: C/N

redis password, empty string will disable password, which is the default behavior

Note that due to the implementation limitation of redis_exporter, you can only set one redis_password per node. This is usually not a problem, because pigsty does not allow deploying two different redis cluster on the same node.

PLEASE use a strong password in production environment

redis_rdb_save

name: redis_rdb_save, type: string[], level: C

redis rdb save directives, disable with empty list, check redis persist for details.

the default value is ["1200 1"]: dump the dataset to disk every 20 minutes if at least 1 key changed:

redis_aof_enabled

name: redis_aof_enabled, type: bool, level: C

enable redis append only file? default value is false.

redis_rename_commands

name: redis_rename_commands, type: dict, level: C

rename redis dangerous commands, which is a dict of k:v old: new

default values: {}, you can hide dangerous commands like FLUSHDB and FLUSHALL by setting this value, here’s an example:

{
  "keys": "op_keys",
  "flushdb": "op_flushdb",
  "flushall": "op_flushall",
  "config": "op_config"  
}

redis_cluster_replicas

name: redis_cluster_replicas, type: int, level: C

replica number for one master/primary in redis cluster, default values: 1

redis_sentinel_monitor

name: redis_sentinel_monitor, type: master[], level: C

This can only be used when redis_mode is set to sentinel.

List of redis masters to be monitored by this sentinel cluster. Each master is defined as a dict with name, host, port, password, and quorum keys.

redis_sentinel_monitor:  # primary list for redis sentinel, use cls as name, primary ip:port
  - { name: redis-src, host: 10.10.10.45, port: 6379 ,password: redis.src, quorum: 1 }
  - { name: redis-dst, host: 10.10.10.48, port: 6379 ,password: redis.dst, quorum: 1 }

The name and host are mandatory; port, password, and quorum are optional. quorum sets the quorum for this master, usually larger than half of the number of sentinel instances.


PGSQL

The PGSQL module requires the NODE module to be installed, and you also need a viable ETCD cluster to store cluster metadata.

Installing the PGSQL module on a single node will create a primary instance, i.e. a standalone PGSQL server. Installing it on additional nodes will create replicas, which can serve read-only traffic or act as standby backups. You can also create offline instances for ETL/OLAP/interactive queries, use Sync Standby and Quorum Commit to increase data consistency, or even form a standby cluster and a delayed standby cluster for disaster recovery.

You can define multiple PGSQL clusters and form a horizontal sharding cluster, which is a group of PGSQL clusters running on different nodes. Pigsty has native citus cluster group support, which can extend your PGSQL cluster into a distributed, sharded database cluster.


PG_ID

Here are some common parameters used to identify PGSQL entities: instance, service, etc…

# pg_cluster:           #CLUSTER  # pgsql cluster name, required identity parameter
# pg_seq: 0             #INSTANCE # pgsql instance seq number, required identity parameter
# pg_role: replica      #INSTANCE # pgsql role, required, could be primary,replica,offline
# pg_instances: {}      #INSTANCE # define multiple pg instances on node in `{port:ins_vars}` format
# pg_upstream:          #INSTANCE # repl upstream ip addr for standby cluster or cascade replica
# pg_shard:             #CLUSTER  # pgsql shard name, optional identity for sharding clusters
# pg_group: 0           #CLUSTER  # pgsql shard index number, optional identity for sharding clusters
# gp_role: master       #CLUSTER  # greenplum role of this cluster, could be master or segment
pg_offline_query: false #INSTANCE # set to true to enable offline query on this instance

You have to assign these identity parameters explicitly, there’s no default value for them.

Name         Type     Level   Description
pg_cluster   string   C       PG database cluster name
pg_seq       number   I       PG database instance id
pg_role      enum     I       PG database instance role
pg_shard     string   C       PG database shard name of cluster
pg_group     number   C       PG database shard index of cluster

  • pg_cluster: It identifies the name of the cluster, which is configured at the cluster level.
  • pg_role: Configured at the instance level, identifies the role of the instance. Only the primary role is handled specially. If not filled in, it defaults to the replica role; there are also the special delayed and offline roles.
  • pg_seq: Used to identify the instance within the cluster, usually an integer incremented from 0 or 1, which is not changed once assigned.
  • {{ pg_cluster }}-{{ pg_seq }} is used to uniquely identify the ins, i.e. pg_instance.
  • {{ pg_cluster }}-{{ pg_role }} is used to identify the services within the cluster, i.e. pg_service.
  • pg_shard and pg_group are used for horizontally sharding clusters, for citus, greenplum, and matrixdb only.

pg_cluster, pg_role, pg_seq are core identity params, which are required for any Postgres cluster, and must be explicitly specified. Here’s an example:

pg-test:
  hosts:
    10.10.10.11: {pg_seq: 1, pg_role: replica}
    10.10.10.12: {pg_seq: 2, pg_role: primary}
    10.10.10.13: {pg_seq: 3, pg_role: replica}
  vars:
    pg_cluster: pg-test

All other params can be inherited from the global config or the default config, but the identity params must be explicitly specified and manually assigned.

pg_mode

name: pg_mode, type: enum, level: C

pgsql cluster mode, could be pgsql, citus, or gpsql; pgsql by default.

If pg_mode is set to citus or gpsql, pg_shard and pg_group will be required for horizontal sharding clusters.

pg_cluster

name: pg_cluster, type: string, level: C

pgsql cluster name, REQUIRED identity parameter

The cluster name will be used as the namespace for PGSQL related resources within that cluster.

The name must follow the pattern [a-z][a-z0-9-]* to be compatible with the various constraints placed on identity parameters.

pg_seq

name: pg_seq, type: int, level: I

pgsql instance seq number, REQUIRED identity parameter

A serial number of this instance, unique within its cluster, starting from 0 or 1.

pg_role

name: pg_role, type: enum, level: I

pgsql role, REQUIRED, could be primary,replica,offline

Roles for PGSQL instance, can be: primary, replica, standby or offline.

  • primary: Primary, there is one and only one primary in a cluster.
  • replica: Replica for carrying online read-only traffic; there may be a slight replication lag though (10ms~100ms, 100KB).
  • standby: Special replica that is always synced with primary, there’s no replication delay & data loss on this replica. (currently same as replica)
  • offline: Offline replica for taking on offline read-only traffic, such as statistical analysis/ETL/personal queries, etc.

pg_role is an identity parameter: required and defined at the instance level.

pg_instances

name: pg_instances, type: dict, level: I

define multiple pg instances on node in {port:ins_vars} format.

This parameter is reserved for multi-instance deployment on a single node which is not implemented in Pigsty yet.

pg_upstream

name: pg_upstream, type: ip, level: I

Upstream ip address for standby cluster or cascade replica

Setting pg_upstream on a primary instance indicates that this cluster is a Standby Cluster; it will receive changes from the upstream instance, so the primary is actually a standby leader.

Setting pg_upstream on a non-primary instance explicitly sets a replication upstream instance; if it differs from the primary’s IP addr, this instance becomes a cascade replica. It is the user’s responsibility to ensure that the upstream IP addr belongs to another instance in the same cluster.
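For illustration, a standby cluster replicating from an existing primary could be declared like this (cluster names and IPs are assumptions):

pg-test2:                         # standby cluster of pg-test
  hosts:
    10.10.10.12: { pg_seq: 1, pg_role: primary, pg_upstream: 10.10.10.11 }  # standby leader, pulls from the upstream primary
  vars: { pg_cluster: pg-test2 }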

pg_shard

name: pg_shard, type: string, level: C

pgsql shard name, required identity parameter for sharding clusters (e.g. citus cluster), optional for common pgsql clusters.

When multiple pgsql clusters serve the same business together in a horizontally sharding style, Pigsty will mark this group of clusters as a Sharding Group.

pg_shard is the name of the shard group name. It’s usually the prefix of pg_cluster.

For example, if we have a sharding group pg-citus with 4 clusters in it, their identity params will be:

cls pg_shard: pg-citus
cls pg_group = 0:   pg-citus0
cls pg_group = 1:   pg-citus1
cls pg_group = 2:   pg-citus2
cls pg_group = 3:   pg-citus3

pg_group

name: pg_group, type: int, level: C

pgsql shard index number, required identity for sharding clusters, optional for common pgsql clusters.

Sharding cluster index of sharding group, used in pair with pg_shard. You can use any non-negative integer as the index number.
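A minimal sketch of such a sharding group in inventory form (IPs and cluster sizes are assumptions, and only two shards are shown; a real citus deployment needs additional parameters beyond these identity params):

pg-citus0:                        # shard 0
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars: { pg_cluster: pg-citus0, pg_shard: pg-citus, pg_group: 0, pg_mode: citus }
pg-citus1:                        # shard 1
  hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
  vars: { pg_cluster: pg-citus1, pg_shard: pg-citus, pg_group: 1, pg_mode: citus }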

gp_role

name: gp_role, type: enum, level: C

greenplum/matrixdb role of this cluster, could be master or segment

  • master: mark the postgres cluster as greenplum master, which is the default value
  • segment: mark the postgres cluster as a greenplum segment

This parameter is only used for greenplum/matrixdb database, and is ignored for common pgsql cluster.

pg_exporters

name: pg_exporters, type: dict, level: C

additional pg_exporters to monitor remote postgres instances, default values: {}

If you wish to monitor remote postgres instances, define them in pg_exporters and load them with the pgsql-monitor.yml playbook.

pg_exporters: # list all remote instances here, alloc a unique unused local port as k
    20001: { pg_cluster: pg-foo, pg_seq: 1, pg_host: 10.10.10.10 }
    20004: { pg_cluster: pg-foo, pg_seq: 2, pg_host: 10.10.10.11 }
    20002: { pg_cluster: pg-bar, pg_seq: 1, pg_host: 10.10.10.12 }
    20003: { pg_cluster: pg-bar, pg_seq: 1, pg_host: 10.10.10.13 }

Check PGSQL Monitoring for details.

pg_offline_query

name: pg_offline_query, type: bool, level: I

set to true to enable offline query on this instance

default value is false

When set to true, the user group dbrole_offline can connect to this instance and perform offline queries, regardless of the instance’s role, just as on an offline instance.

If you have only one replica, or even just a primary, in your postgres cluster, setting this flag marks it as acceptable for ETL, slow queries, and interactive access.
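A minimal sketch (IPs and cluster name are assumptions):

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica, pg_offline_query: true }   # this replica also serves offline queries
  vars: { pg_cluster: pg-test }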


PG_BUSINESS

Database credentials, In-Database Objects that need to be taken care of by Users.

Default Database Users:

WARNING: YOU HAVE TO CHANGE THESE DEFAULT PASSWORDS in a production environment.

# postgres business object definition, overwrite in group vars
pg_users: []                      # postgres business users
pg_databases: []                  # postgres business databases
pg_services: []                   # postgres business services
pg_hba_rules: []                  # business hba rules for postgres
pgb_hba_rules: []                 # business hba rules for pgbouncer
# global credentials, overwrite in global vars
pg_dbsu_password: ''              # dbsu password, empty string means no dbsu password by default
pg_replication_username: replicator
pg_replication_password: DBUser.Replicator
pg_admin_username: dbuser_dba
pg_admin_password: DBUser.DBA
pg_monitor_username: dbuser_monitor
pg_monitor_password: DBUser.Monitor

pg_users

name: pg_users, type: user[], level: C

postgres business users, which have to be defined at the cluster level.

default values: [], each object in the array defines a User/Role. Examples:

- name: dbuser_meta               # REQUIRED, `name` is the only mandatory field of a user definition
  password: DBUser.Meta           # optional, password, can be a scram-sha-256 hash string or plain text
  login: true                     # optional, can log in, true by default  (new biz ROLE should be false)
  superuser: false                # optional, is superuser? false by default
  createdb: false                 # optional, can create database? false by default
  createrole: false               # optional, can create role? false by default
  inherit: true                   # optional, can this role use inherited privileges? true by default
  replication: false              # optional, can this role do replication? false by default
  bypassrls: false                # optional, can this role bypass row level security? false by default
  pgbouncer: true                 # optional, add this user to pgbouncer user-list? false by default (production user should be true explicitly)
  connlimit: -1                   # optional, user connection limit, default -1 disable limit
  expire_in: 3650                 # optional, now + n days when this role is expired (OVERWRITE expire_at)
  expire_at: '2030-12-31'         # optional, YYYY-MM-DD 'timestamp' when this role is expired  (OVERWRITTEN by expire_in)
  comment: pigsty admin user      # optional, comment string for this user/role
  roles: [dbrole_admin]           # optional, belonged roles. default roles are: dbrole_{admin,readonly,readwrite,offline}
  parameters: {}                  # optional, role level parameters with `ALTER ROLE SET`
  pool_mode: transaction          # optional, pgbouncer pool mode at user level, transaction by default
  pool_connlimit: -1              # optional, max database connections at user level, default -1 disable limit
  search_path: public             # key value config parameters according to postgresql documentation (e.g: use pigsty as default search_path)

The only mandatory field of a user definition is name, and the rest are optional.

pg_databases

name: pg_databases, type: database[], level: C

postgres business databases, which have to be defined at the cluster level.

default values: [], each object in the array defines a Database. Examples:

- name: meta                      # REQUIRED, `name` is the only mandatory field of a database definition
  baseline: cmdb.sql              # optional, database sql baseline path, (relative path among ansible search path, e.g files/)
  pgbouncer: true                 # optional, add this database to pgbouncer database list? true by default
  schemas: [pigsty]               # optional, additional schemas to be created, array of schema names
  extensions:                     # optional, additional extensions to be installed: array of `{name[,schema]}`
    - { name: postgis , schema: public }
    - { name: timescaledb }
  comment: pigsty meta database   # optional, comment string for this database
  owner: postgres                 # optional, database owner, postgres by default
  template: template1             # optional, which template to use, template1 by default
  encoding: UTF8                  # optional, database encoding, UTF8 by default. (MUST same as template database)
  locale: C                       # optional, database locale, C by default.  (MUST same as template database)
  lc_collate: C                   # optional, database collate, C by default. (MUST same as template database)
  lc_ctype: C                     # optional, database ctype, C by default.   (MUST same as template database)
  tablespace: pg_default          # optional, default tablespace, 'pg_default' by default.
  allowconn: true                 # optional, allow connection, true by default. false will disable connect at all
  revokeconn: false               # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
  register_datasource: true       # optional, register this database to grafana datasources? true by default
  connlimit: -1                   # optional, database connection limit, default -1 disable limit
  pool_auth_user: dbuser_meta     # optional, all connection to this pgbouncer database will be authenticated by this user
  pool_mode: transaction          # optional, pgbouncer pool mode at database level, default transaction
  pool_size: 64                   # optional, pgbouncer pool size at database level, default 64
  pool_size_reserve: 32           # optional, pgbouncer pool size reserve at database level, default 32
  pool_size_min: 0                # optional, pgbouncer pool size min at database level, default 0
  pool_max_db_conn: 100           # optional, max database connections at database level, default 100

In each database definition, the DB name is mandatory and the rest are optional.

pg_services

name: pg_services, type: service[], level: C

postgres business services exposed via haproxy, which have to be defined at the cluster level.

You can define ad hoc services with pg_services in addition to the default pg_default_services.

default values: [], each object in the array defines a Service. Examples:

- name: standby                   # required, service name, the actual svc name will be prefixed with `pg_cluster`, e.g: pg-meta-standby
  port: 5435                      # required, service exposed port (work as kubernetes service node port mode)
  ip: "*"                         # optional, service bind ip address, `*` for all ip by default
  selector: "[]"                  # required, service member selector, use JMESPath to filter inventory
  dest: default                   # optional, destination port, default|postgres|pgbouncer|<port_number>, 'default' by default
  check: /sync                    # optional, health check url path, / by default
  backup: "[? pg_role == `primary`]"  # backup server selector
  maxconn: 3000                   # optional, max allowed front-end connection
  balance: roundrobin             # optional, haproxy load balance algorithm (roundrobin by default, other: leastconn)
  options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'

pg_hba_rules

name: pg_hba_rules, type: hba[], level: C

business hba rules for postgres

default values: [], each object in array is an HBA Rule definition:

Which are array of hba object, each hba object may look like

# RAW HBA RULES
- title: allow intranet password access
  role: common
  rules:
    - host   all  all  10.0.0.0/8      md5
    - host   all  all  172.16.0.0/12   md5
    - host   all  all  192.168.0.0/16  md5
  • title: Rule title, rendered as a comment in the hba file
  • rules: Array of strings, each string is a raw hba rule record
  • role: Applied roles, i.e. where to install these hba rules
    • common: apply to all instances
    • primary, replica, standby, offline: apply to instances with the corresponding pg_role
    • special case: HBA rules with role == 'offline' will also be installed on instances with the pg_offline_query flag

or you can use another alias form

- addr: 'intra'    # world|intra|infra|admin|local|localhost|cluster|<cidr>
  auth: 'pwd'      # trust|pwd|ssl|cert|deny|<official auth method>
  user: 'all'      # all|${dbsu}|${repl}|${admin}|${monitor}|<user>|<group>
  db: 'all'        # all|replication|....
  rules: []        # raw hba string precedence over above all
  title: allow intranet password access

pg_default_hba_rules is similar to this, but is used for global HBA rule settings

pgb_hba_rules

name: pgb_hba_rules, type: hba[], level: C

business hba rules for pgbouncer, default values: []

Similar to pg_hba_rules, an array of hba rule objects, except these rules are for pgbouncer.

pg_replication_username

name: pg_replication_username, type: username, level: G

postgres replication username, replicator by default

This parameter is used globally; it is not wise to change it.

pg_replication_password

name: pg_replication_password, type: password, level: G

postgres replication password, DBUser.Replicator by default

WARNING: CHANGE THIS IN PRODUCTION ENVIRONMENT!!!!

pg_admin_username

name: pg_admin_username, type: username, level: G

postgres admin username, dbuser_dba by default, which is a global postgres superuser.

default values: dbuser_dba

pg_admin_password

name: pg_admin_password, type: password, level: G

postgres admin password in plain text, DBUser.DBA by default

WARNING: CHANGE THIS IN PRODUCTION ENVIRONMENT!!!!

pg_monitor_username

name: pg_monitor_username, type: username, level: G

postgres monitor username, dbuser_monitor by default, which is a global monitoring user.

pg_monitor_password

name: pg_monitor_password, type: password, level: G

postgres monitor password, DBUser.Monitor by default.

Try not to use the @:/ characters in the password, to avoid problems with PGURL strings.

WARNING: CHANGE THIS IN PRODUCTION ENVIRONMENT!!!!

pg_dbsu_password

name: pg_dbsu_password, type: password, level: G/C

PostgreSQL dbsu password for pg_dbsu, empty string means no dbsu password, which is the default behavior.

WARNING: It’s not recommended to set a dbsu password for common PGSQL clusters, except when pg_mode = citus.


PG_INSTALL

This section is responsible for installing PostgreSQL & Extensions.

If you wish to install a different major version, just make sure the repo packages exist and overwrite pg_version at the cluster level.

To install extra extensions, overwrite pg_extensions at the cluster level. Beware that not all extensions are available for other major versions.

pg_dbsu: postgres                 # os dbsu name, postgres by default, better not change it
pg_dbsu_uid: 26                   # os dbsu uid and gid, 26 for default postgres users and groups
pg_dbsu_sudo: limit               # dbsu sudo privilege, none,limit,all,nopass. limit by default
pg_dbsu_home: /var/lib/pgsql      # postgresql home directory, `/var/lib/pgsql` by default
pg_dbsu_ssh_exchange: true        # exchange postgres dbsu ssh key among same pgsql cluster
pg_version: 15                    # postgres major version to be installed, 15 by default
pg_bin_dir: /usr/pgsql/bin        # postgres binary dir, `/usr/pgsql/bin` by default
pg_log_dir: /pg/log/postgres      # postgres log dir, `/pg/log/postgres` by default
pg_packages:                      # pg packages to be installed, `${pg_version}` will be replaced
  - postgresql${pg_version}*
  - patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager patroni-etcd             # pgdg common tools
  - pg_repack_${pg_version}* wal2json_${pg_version}* passwordcheck_cracklib_${pg_version}* # important extensions
pg_extensions: # citus & hydra are exclusive
  - postgis34_${pg_version}* timescaledb-2-postgresql-${pg_version}* pgvector_${pg_version}*

pg_dbsu

name: pg_dbsu, type: username, level: C

os dbsu name, postgres by default, it’s not wise to change it.

When installing Greenplum / MatrixDB, set this parameter to the corresponding default value: gpadmin|mxadmin.

pg_dbsu_uid

name: pg_dbsu_uid, type: int, level: C

os dbsu uid and gid, 26 for default postgres users and groups, which is consistent with the official pgdg RPM.

For Ubuntu/Debian, there’s no default postgres UID/GID, consider using another ad hoc value, such as 543 instead.

pg_dbsu_sudo

name: pg_dbsu_sudo, type: enum, level: C

dbsu sudo privilege, could be none, limit, all, or nopass; limit by default.

  • none: No Sudo privilege
  • limit: Limited sudo privilege to execute systemctl commands for database-related components, default.
  • all: Full sudo privilege, password required.
  • nopass: Full sudo privileges without a password (not recommended).

default values: limit, which only allow sudo systemctl <start|stop|reload> <postgres|patroni|pgbouncer|...>

pg_dbsu_home

name: pg_dbsu_home, type: path, level: C

postgresql home directory, /var/lib/pgsql by default, which is consistent with the official pgdg RPM.

pg_dbsu_ssh_exchange

name: pg_dbsu_ssh_exchange, type: bool, level: C

exchange postgres dbsu ssh key among same pgsql cluster?

default value is true, means the dbsu can ssh to each other among the same cluster.

pg_version

name: pg_version, type: enum, level: C

postgres major version to be installed, 15 by default

Note that PostgreSQL physical stream replication cannot cross major versions, so do not configure this on instance level.

You can use the parameters in pg_packages and pg_extensions to install rpms for the specific pg major version.
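For illustration, a cluster pinned to a different major version could be declared like this (cluster name, IP, and version value are assumptions; make sure the corresponding repo packages exist):

pg-v14:
  hosts: { 10.10.10.14: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-v14
    pg_version: 14                # override the major version at cluster level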

pg_bin_dir

name: pg_bin_dir, type: path, level: C

postgres binary dir, /usr/pgsql/bin by default

The default value is a soft link created manually during the installation process, pointing to the specific Postgres version dir installed.

For example, /usr/pgsql -> /usr/pgsql-15. Check PGSQL File Structure for more details.

pg_log_dir

name: pg_log_dir, type: path, level: C

postgres log dir, /pg/log/postgres by default.

Caveat: if pg_log_dir is prefixed with pg_data, it will not be created explicitly (postgres itself will create it).

pg_packages

name: pg_packages, type: string[], level: C

pg packages to be installed, ${pg_version} will be replaced to the actual value of pg_version

PostgreSQL, pgbouncer, pg_exporter, pgbadger, vip-manager, patroni, and pgbackrest are installed by default.

pg_packages:                      # pg packages to be installed, `${pg_version}` will be replaced
  - postgresql${pg_version}*
  - patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager patroni-etcd             # pgdg common tools
  - pg_repack_${pg_version}* wal2json_${pg_version}* passwordcheck_cracklib_${pg_version}* # important extensions

For Ubuntu/Debian, the proper value has to be replaced explicitly:

pg_packages:                      # pg packages to be installed, `${pg_version}` will be replaced
  - postgresql-*-${pg_version}
  - patroni pgbouncer pgbackrest pg-exporter pgbadger vip-manager
  - postgresql-${pg_version}-repack postgresql-${pg_version}-wal2json

pg_extensions

name: pg_extensions, type: string[], level: C

pg extensions to be installed, ${pg_version} will be replaced with actual pg_version

Pigsty will install the following extensions for all database instances by default: postgis, timescaledb, pgvector, pg_repack, wal2json and passwordcheck_cracklib.

pg_extensions: # citus & hydra are exclusive
  - postgis34_${pg_version}* timescaledb-2-postgresql-${pg_version}* pgvector_${pg_version}*

For Ubuntu/Debian, the proper value has to be replaced explicitly:

pg_extensions:                    # pg extensions to be installed, `${pg_version}` will be replaced
  - postgresql-${pg_version}-postgis* timescaledb-2-postgresql-${pg_version} postgresql-${pg_version}-pgvector 

Beware that not all extensions are available with other PG major versions, but Pigsty guarantees that important extensions wal2json, pg_repack and passwordcheck_cracklib (EL only) are available on all PG major versions.


PG_BOOTSTRAP

Bootstrap a postgres cluster with patroni, and setup pgbouncer connection pool along with it.

It also initializes the cluster template databases with the default roles, schemas, extensions, and default privileges specified in PG_PROVISION.

pg_safeguard: false               # prevent purging running postgres instance? false by default
pg_clean: true                    # purging existing postgres during pgsql init? true by default
pg_data: /pg/data                 # postgres data directory, `/pg/data` by default
pg_fs_main: /data                 # mountpoint/path for postgres main data, `/data` by default
pg_fs_bkup: /data/backups         # mountpoint/path for pg backup data, `/data/backup` by default
pg_storage_type: SSD              # storage type for pg main data, SSD,HDD, SSD by default
pg_dummy_filesize: 64MiB          # size of `/pg/dummy`, hold 64MB disk space for emergency use
pg_listen: '0.0.0.0'              # postgres listen address, `0.0.0.0` (all ipv4 addr) by default
pg_port: 5432                     # postgres listen port, 5432 by default
pg_localhost: /var/run/postgresql # postgres unix socket dir for localhost connection
pg_namespace: /pg                 # top level key namespace in etcd, used by patroni & vip
patroni_enabled: true             # if disabled, no postgres cluster will be created during init
patroni_mode: default             # patroni working mode: default,pause,remove
patroni_port: 8008                # patroni listen port, 8008 by default
patroni_log_dir: /pg/log/patroni  # patroni log dir, `/pg/log/patroni` by default
patroni_ssl_enabled: false        # secure patroni RestAPI communications with SSL?
patroni_watchdog_mode: off        # patroni watchdog mode: automatic,required,off. off by default
patroni_username: postgres        # patroni restapi username, `postgres` by default
patroni_password: Patroni.API     # patroni restapi password, `Patroni.API` by default
patroni_citus_db: postgres        # citus database managed by patroni, postgres by default
pg_conf: oltp.yml                 # config template: oltp,olap,crit,tiny. `oltp.yml` by default
pg_max_conn: auto                 # postgres max connections, `auto` will use recommended value
pg_shared_buffer_ratio: 0.25      # postgres shared buffer ratio, 0.25 by default, 0.1~0.4
pg_rto: 30                        # recovery time objective in seconds,  `30s` by default
pg_rpo: 1048576                   # recovery point objective in bytes, `1MiB` at most by default
pg_libs: 'timescaledb, pg_stat_statements, auto_explain'  # extensions to be loaded
pg_delay: 0                       # replication apply delay for standby cluster leader
pg_checksum: false                # enable data checksum for postgres cluster?
pg_pwd_enc: scram-sha-256         # passwords encryption algorithm: md5,scram-sha-256
pg_encoding: UTF8                 # database cluster encoding, `UTF8` by default
pg_locale: C                      # database cluster local, `C` by default
pg_lc_collate: C                  # database cluster collate, `C` by default
pg_lc_ctype: en_US.UTF8           # database character type, `en_US.UTF8` by default
pgbouncer_enabled: true           # if disabled, pgbouncer will not be launched on pgsql host
pgbouncer_port: 6432              # pgbouncer listen port, 6432 by default
pgbouncer_log_dir: /pg/log/pgbouncer  # pgbouncer log dir, `/pg/log/pgbouncer` by default
pgbouncer_auth_query: false       # query postgres to retrieve unlisted business users?
pgbouncer_poolmode: transaction   # pooling mode: transaction,session,statement, transaction by default
pgbouncer_sslmode: disable        # pgbouncer client ssl mode, disable by default

pg_safeguard

name: pg_safeguard, type: bool, level: G/C/A

prevent purging running postgres instance? false by default

If enabled, pgsql.yml & pgsql-rm.yml will abort immediately if any postgres instance is running.

pg_clean

name: pg_clean, type: bool, level: G/C/A

purging existing postgres during pgsql init? true by default

default value is true: the pgsql.yml init will purge any existing postgres instance, which makes the playbook idempotent.

If set to false, pgsql.yml will abort if there’s already a running postgres instance, and pgsql-rm.yml will NOT remove postgres data (it only stops the server).

pg_data

name: pg_data, type: path, level: C

postgres data directory, /pg/data by default

default values: /pg/data, DO NOT CHANGE IT.

It’s a soft link that points to the underlying data directory.

Check PGSQL File Structure for details.

pg_fs_main

name: pg_fs_main, type: path, level: C

mountpoint/path for postgres main data, /data by default

default values: /data, which will be used as parent dir of postgres main data directory: /data/postgres.

It’s recommended to use NVME SSD for postgres main data storage, Pigsty is optimized for SSD storage by default. But HDD is also supported, you can change pg_storage_type to HDD to optimize for HDD storage.

pg_fs_bkup

name: pg_fs_bkup, type: path, level: C

mountpoint/path for pg backup data, /data/backups by default

If you are using the default pgbackrest_method = local, it is recommended to have a separate disk for backup storage.

The backup disk should be large enough to hold all your backups, at least enough for 3 basebackups + 2 days WAL archive. This is usually not a problem since you can use cheap & large HDD for that.

It’s recommended to use a separate disk for backup storage, otherwise pigsty will fall back to the main data disk.

pg_storage_type

name: pg_storage_type, type: enum, level: C

storage type for pg main data, SSD,HDD, SSD by default

default values: SSD, it will affect some tuning parameters, such as random_page_cost & effective_io_concurrency

pg_dummy_filesize

name: pg_dummy_filesize, type: size, level: C

size of /pg/dummy, default values: 64MiB, which hold 64MB disk space for emergency use

When the disk is full, removing this placeholder file can free up some space for emergency use; it is recommended to use at least 8GiB for production.

pg_listen

name: pg_listen, type: ip, level: C

postgres/pgbouncer listen address, 0.0.0.0 (all ipv4 addr) by default

You can use placeholder in this variable:

  • ${ip}: translate to inventory_hostname, which is primary private IP address in the inventory
  • ${vip}: if pg_vip_enabled, this will translate to host part of pg_vip_address
  • ${lo}: will translate to 127.0.0.1

For example: '${ip},${lo}' or '${ip},${vip},${lo}'.

pg_port

name: pg_port, type: port, level: C

postgres listen port, 5432 by default.

pg_localhost

name: pg_localhost, type: path, level: C

postgres unix socket dir for localhost connection, default values: /var/run/postgresql

The Unix socket dir for PostgreSQL and Pgbouncer local connection, which is used by pg_exporter and patroni.

pg_namespace

name: pg_namespace, type: path, level: C

top-level key namespace in etcd, used by patroni & vip. Default value: /pg, and it’s not recommended to change it.

patroni_enabled

name: patroni_enabled, type: bool, level: C

if disabled, no postgres cluster will be created during init

default value is true. If disabled, Pigsty will skip launching patroni (and thus postgres).

This option is useful when trying to add some components to an existing postgres instance.

patroni_mode

name: patroni_mode, type: enum, level: C

patroni working mode: default, pause, remove

default values: default

  • default: Bootstrap PostgreSQL cluster with Patroni
  • pause: Just like default, but entering maintenance mode after bootstrap
  • remove: Init the cluster with Patroni, then remove Patroni and use raw PostgreSQL instead.

patroni_port

name: patroni_port, type: port, level: C

patroni listen port, 8008 by default, changing it is not recommended.

The Patroni API server listens on this port for health checking & API requests.

patroni_log_dir

name: patroni_log_dir, type: path, level: C

patroni log dir, /pg/log/patroni by default, which will be collected by promtail.

patroni_ssl_enabled

name: patroni_ssl_enabled, type: bool, level: G

Secure patroni RestAPI communications with SSL? default value is false

This parameter is a global flag that can only be set before deployment.

If SSL is enabled for patroni, you’ll have to perform health checks, metrics scraping, and API calls over HTTPS instead of HTTP.

patroni_watchdog_mode

name: patroni_watchdog_mode, type: string, level: C

In case of primary failure, patroni can use a watchdog to fence the old primary node and avoid split-brain.

patroni watchdog mode: automatic, required, off:

  • off: do not use watchdog; no fencing at all. This is the default value.
  • automatic: Enable watchdog if the kernel has softdog module enabled and watchdog is owned by dbsu
  • required: Force watchdog, refuse to start if softdog is not available

default value is off. You should not enable watchdog on infra nodes, to avoid fencing them.

For critical systems where data consistency prevails over availability, it is recommended to enable the watchdog.

Note that if all your traffic goes through haproxy, there is no risk of split-brain at all.

patroni_username

name: patroni_username, type: username, level: C

patroni restapi username, postgres by default, used in pair with patroni_password

Patroni’s unsafe REST API is protected by username/password by default; check Config Cluster and Patroni RESTAPI for details.

patroni_password

name: patroni_password, type: password, level: C

patroni restapi password, Patroni.API by default

WARNING: CHANGE THIS IN PRODUCTION ENVIRONMENT!!!!

patroni_citus_db

name: patroni_citus_db, type: string, level: C

citus database managed by patroni, postgres by default.

Patroni 3.0’s native citus support requires specifying a managed database for citus, which is created by patroni itself.

pg_conf

name: pg_conf, type: enum, level: C

config template: {oltp,olap,crit,tiny}.yml, oltp.yml by default

  • tiny.yml: optimize for tiny nodes, virtual machines, small demo, (1~8Core, 1~16GB)
  • oltp.yml: optimize for OLTP workloads and latency sensitive applications, (4C8GB+), which is the default template
  • olap.yml: optimize for OLAP workloads and throughput (4C8G+)
  • crit.yml: optimize for data consistency and critical applications (4C8G+)

default value: oltp.yml, but the configure procedure will set this value to tiny.yml if the current node is a tiny node.

You can have your own template, just put it under templates/<mode>.yml and set this value to the template name.
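
For example, a sketch of selecting a different template per cluster (the cluster name pg-olap is illustrative):

pg-olap:
  vars:
    pg_cluster: pg-olap
    pg_conf: olap.yml    # use the OLAP template instead of the default oltp.yml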

pg_max_conn

name: pg_max_conn, type: int, level: C

postgres max connections. You can specify a value between 50 and 5000, or use auto to apply the recommended value.

default value is auto, which will set max connections according to the pg_conf and pg_default_service_dest.

  • tiny: 100
  • olap: 200
  • oltp: 200 (pgbouncer) / 1000 (postgres)
    • pg_default_service_dest = pgbouncer : 200
    • pg_default_service_dest = postgres : 1000
  • crit: 200 (pgbouncer) / 1000 (postgres)
    • pg_default_service_dest = pgbouncer : 200
    • pg_default_service_dest = postgres : 1000

It’s not recommended to set this value greater than 5000, otherwise you have to increase the haproxy service connection limit manually as well.

Pgbouncer’s transaction pooling can alleviate the problem of too many OLTP connections, but it’s not recommended to use it in OLAP scenarios.
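
For example, to pin an explicit limit instead of auto (the value is illustrative, within the 50-5000 range):

pg_max_conn: 300    # override the auto-computed max connections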

pg_shared_buffer_ratio

name: pg_shared_buffer_ratio, type: float, level: C

postgres shared buffer memory ratio, 0.25 by default, 0.1~0.4

default value: 0.25, meaning 25% of node memory will be used as PostgreSQL shared buffers.

Setting this value greater than 0.4 (40%) is usually not a good idea.

Note that shared buffer is only part of shared memory in PostgreSQL, to calculate the total shared memory, use show shared_memory_size_in_huge_pages;.

pg_rto

name: pg_rto, type: int, level: C

recovery time objective in seconds. This will be used as the Patroni TTL value, 30s by default.

If a primary instance is missing for such a long time, a new leader election will be triggered.

Decreasing the value reduces the unavailable time (unable to write) of the cluster during failover, but makes the cluster more sensitive to network jitter, thus increasing the chance of false-positive failover.

Configure this according to your network conditions and expectations to trade off between chance and impact. The default value is 30s, and it will be populated to the following patroni parameters:

# the TTL to acquire the leader lock (in seconds). Think of it as the length of time before initiation of the automatic failover process. Default value: 30
ttl: {{ pg_rto }}

# the number of seconds the loop will sleep. Default value: 10 , this is patroni check loop interval
loop_wait: {{ (pg_rto / 3)|round(0, 'ceil')|int }}

# timeout for DCS and PostgreSQL operation retries (in seconds). DCS or network issues shorter than this will not cause Patroni to demote the leader. Default value: 10
retry_timeout: {{ (pg_rto / 3)|round(0, 'ceil')|int }}

# the amount of time a primary is allowed to recover from failures before failover is triggered (in seconds), Max RTO: 2 loop wait + primary_start_timeout
primary_start_timeout: {{ (pg_rto / 3)|round(0, 'ceil')|int }}

pg_rpo

name: pg_rpo, type: int, level: C

recovery point objective in bytes, 1MiB at most by default

default values: 1048576, which will tolerate at most 1MiB data loss during failover.

When the primary is down and all replicas are lagged, you have to make a tough choice, trading off between Availability and Consistency:

  • Promote a replica to be the new primary and bring system back online ASAP, with the price of an acceptable data loss (e.g. less than 1MB).
  • Wait for the primary to come back (which may never be) or human intervention to avoid any data loss.

You can use crit.yml conf template to ensure no data loss during failover, but it will sacrifice some performance.
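
For example, a consistency-first sketch combining the crit template with a zero data-loss objective (assumption: setting pg_rpo to 0 expresses that no lost bytes are tolerated):

pg_conf: crit.yml    # consistency-first tuning template
pg_rpo: 0            # do not tolerate any data loss during failover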

pg_libs

name: pg_libs, type: string, level: C

shared preloaded libraries, pg_stat_statements,auto_explain by default.

They are two extensions that come with PostgreSQL, and it is strongly recommended to enable them.

For existing clusters, you can configure the shared_preload_libraries parameter of the cluster and apply it.

If you want to use TimescaleDB or Citus extensions, you need to add timescaledb or citus to this list. timescaledb and citus should be placed at the top of this list, for example:

citus,timescaledb,pg_stat_statements,auto_explain

Other extensions that need to be loaded can also be added to this list, such as pg_cron, pgml, etc.

Generally, citus and timescaledb have the highest priority and should be added to the top of the list.
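
For example, a cluster using Citus and TimescaleDB alongside the default libraries:

pg_libs: 'citus,timescaledb,pg_stat_statements,auto_explain'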

pg_delay

name: pg_delay, type: interval, level: I

replication apply delay for the standby cluster leader, default value: 0.

If set to a positive value, the standby cluster leader will wait for this amount of time before applying WAL changes.

Check delayed standby cluster for details.
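
For example, a sketch of a delayed standby cluster leader (the 1h interval is illustrative; see the delayed standby cluster docs for the full standby cluster definition):

pg_delay: 1h    # apply WAL changes one hour behind the upstream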

pg_checksum

name: pg_checksum, type: bool, level: C

enable data checksums for the postgres cluster? default value is false.

This parameter can only be set before PGSQL deployment. (but you can enable it manually later)

If pg_conf crit.yml template is used, data checksum is always enabled regardless of this parameter to ensure data integrity.

pg_pwd_enc

name: pg_pwd_enc, type: enum, level: C

passwords encryption algorithm: md5,scram-sha-256

default value: scram-sha-256. If you have compatibility issues with old clients, you can set it to md5 instead.

pg_encoding

name: pg_encoding, type: enum, level: C

database cluster encoding, UTF8 by default

pg_locale

name: pg_locale, type: enum, level: C

database cluster locale, C by default

pg_lc_collate

name: pg_lc_collate, type: enum, level: C

database cluster collation, C by default. It’s not recommended to change this value unless you know what you are doing.

pg_lc_ctype

name: pg_lc_ctype, type: enum, level: C

database character type, en_US.UTF8 by default

pgbouncer_enabled

name: pgbouncer_enabled, type: bool, level: C

default value is true; if disabled, pgbouncer will not be launched on the pgsql host

pgbouncer_port

name: pgbouncer_port, type: port, level: C

pgbouncer listen port, 6432 by default

pgbouncer_log_dir

name: pgbouncer_log_dir, type: path, level: C

pgbouncer log dir, /pg/log/pgbouncer by default, referenced by promtail, the logging agent.

pgbouncer_auth_query

name: pgbouncer_auth_query, type: bool, level: C

query postgres to retrieve unlisted business users? default value is false

If enabled, pgbouncer users will be authenticated against the postgres database with SELECT username, password FROM monitor.pgbouncer_auth($1); otherwise, only users with pgbouncer: true are allowed to connect to pgbouncer.
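
For example, if auth_query is disabled, a business user must carry the pgbouncer flag to be added to the pgbouncer userlist (a sketch; the user name and password are hypothetical):

pg_users:
  - { name: dbuser_app ,password: DBUser.App ,pgbouncer: true ,roles: [dbrole_readwrite] ,comment: app user routed through pgbouncer }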

pgbouncer_poolmode

name: pgbouncer_poolmode, type: enum, level: C

Pgbouncer pooling mode: transaction, session, statement, transaction by default

  • session: Session-level pooling with the best compatibility.
  • transaction: Transaction-level pooling with better performance (lots of small conns), could break some session level features such as notify/listen, etc…
  • statement: Statement-level pooling, which is used for simple read-only queries.

If your application has compatibility issues with pgbouncer, you can try changing this value to session instead.
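
For example, to fall back to the most compatible mode for applications that rely on session-level features:

pgbouncer_poolmode: session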

pgbouncer_sslmode

name: pgbouncer_sslmode, type: enum, level: C

pgbouncer client ssl mode, disable by default

default values: disable, beware that this may have a huge performance impact on your pgbouncer.

  • disable: Plain TCP. If client requests TLS, it’s ignored. Default.
  • allow: If client requests TLS, it is used. If not, plain TCP is used. If the client presents a client certificate, it is not validated.
  • prefer: Same as allow.
  • require: Client must use TLS. If not, the client connection is rejected. If the client presents a client certificate, it is not validated.
  • verify-ca: Client must use TLS with valid client certificate.
  • verify-full: Same as verify-ca.

PG_PROVISION

PG_BOOTSTRAP will bootstrap a new postgres cluster with patroni, while PG_PROVISION will create default objects in the cluster, including:

pg_provision: true                # provision postgres cluster after bootstrap
pg_init: pg-init                  # provision init script for cluster template, `pg-init` by default
pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }
pg_default_privileges:            # default privileges when created by admin user
  - GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
  - GRANT SELECT     ON TABLES    TO dbrole_readonly
  - GRANT SELECT     ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
  - GRANT USAGE      ON SCHEMAS   TO dbrole_offline
  - GRANT SELECT     ON TABLES    TO dbrole_offline
  - GRANT SELECT     ON SEQUENCES TO dbrole_offline
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
  - GRANT INSERT     ON TABLES    TO dbrole_readwrite
  - GRANT UPDATE     ON TABLES    TO dbrole_readwrite
  - GRANT DELETE     ON TABLES    TO dbrole_readwrite
  - GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
  - GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE   ON TABLES    TO dbrole_admin
  - GRANT REFERENCES ON TABLES    TO dbrole_admin
  - GRANT TRIGGER    ON TABLES    TO dbrole_admin
  - GRANT CREATE     ON SCHEMAS   TO dbrole_admin
pg_default_schemas: [ monitor ]   # default schemas to be created
pg_default_extensions:            # default extensions to be created
  - { name: adminpack          ,schema: pg_catalog }
  - { name: pg_stat_statements ,schema: monitor }
  - { name: pgstattuple        ,schema: monitor }
  - { name: pg_buffercache     ,schema: monitor }
  - { name: pageinspect        ,schema: monitor }
  - { name: pg_prewarm         ,schema: monitor }
  - { name: pg_visibility      ,schema: monitor }
  - { name: pg_freespacemap    ,schema: monitor }
  - { name: postgres_fdw       ,schema: public  }
  - { name: file_fdw           ,schema: public  }
  - { name: btree_gist         ,schema: public  }
  - { name: btree_gin          ,schema: public  }
  - { name: pg_trgm            ,schema: public  }
  - { name: intagg             ,schema: public  }
  - { name: intarray           ,schema: public  }
  - { name: pg_repack }
pg_reload: true                   # reload postgres after hba changes
pg_default_hba_rules:             # postgres default host-based authentication rules
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'    }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}
pgb_default_hba_rules:            # pgbouncer default host-based authentication rules
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: pwd   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: pwd   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: pwd   ,title: 'allow all user intra access with pwd' }

pg_provision

name: pg_provision, type: bool, level: C

provision postgres cluster after bootstrap, default value is true.

If disabled, postgres cluster will not be provisioned after bootstrap.

pg_init

name: pg_init, type: string, level: G/C

Provision init script for cluster template, pg-init by default, which is located in roles/pgsql/templates/pg-init

You can add your own logic in the init script, or provide a new one in templates/ and set pg_init to the new script name.

pg_default_roles

name: pg_default_roles, type: role[], level: G/C

default roles and users in postgres cluster.

Pigsty has a built-in role system, check PGSQL Access Control for details.

pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

pg_default_privileges

name: pg_default_privileges, type: string[], level: G/C

default privileges for each database:

pg_default_privileges:            # default privileges when created by admin user
  - GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
  - GRANT SELECT     ON TABLES    TO dbrole_readonly
  - GRANT SELECT     ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
  - GRANT USAGE      ON SCHEMAS   TO dbrole_offline
  - GRANT SELECT     ON TABLES    TO dbrole_offline
  - GRANT SELECT     ON SEQUENCES TO dbrole_offline
  - GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
  - GRANT INSERT     ON TABLES    TO dbrole_readwrite
  - GRANT UPDATE     ON TABLES    TO dbrole_readwrite
  - GRANT DELETE     ON TABLES    TO dbrole_readwrite
  - GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
  - GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE   ON TABLES    TO dbrole_admin
  - GRANT REFERENCES ON TABLES    TO dbrole_admin
  - GRANT TRIGGER    ON TABLES    TO dbrole_admin
  - GRANT CREATE     ON SCHEMAS   TO dbrole_admin

Pigsty has built-in privileges based on the default role system; check PGSQL Privileges for details.

pg_default_schemas

name: pg_default_schemas, type: string[], level: G/C

default schemas to be created, default value: [ monitor ], which will create a monitor schema on all databases.

pg_default_extensions

name: pg_default_extensions, type: extension[], level: G/C

default extensions to be created, default value:

pg_default_extensions: # default extensions to be created
  - { name: adminpack          ,schema: pg_catalog }
  - { name: pg_stat_statements ,schema: monitor }
  - { name: pgstattuple        ,schema: monitor }
  - { name: pg_buffercache     ,schema: monitor }
  - { name: pageinspect        ,schema: monitor }
  - { name: pg_prewarm         ,schema: monitor }
  - { name: pg_visibility      ,schema: monitor }
  - { name: pg_freespacemap    ,schema: monitor }
  - { name: postgres_fdw       ,schema: public  }
  - { name: file_fdw           ,schema: public  }
  - { name: btree_gist         ,schema: public  }
  - { name: btree_gin          ,schema: public  }
  - { name: pg_trgm            ,schema: public  }
  - { name: intagg             ,schema: public  }
  - { name: intarray           ,schema: public  }
  - { name: pg_repack }

The only 3rd party extension is pg_repack, which is important for database maintenance, all other extensions are built-in postgres contrib extensions.

Monitor related extensions are installed in monitor schema, which is created by pg_default_schemas.

pg_reload

name: pg_reload, type: bool, level: A

reload postgres after hba changes, default value is true

This is useful when you want to review HBA changes before applying them: set it to false to disable the automatic reload.

pg_default_hba_rules

name: pg_default_hba_rules, type: hba[], level: G/C

postgres default host-based authentication rules, array of hba rule objects.

default value provides a fair enough security level for common scenarios, check PGSQL Authentication for details.

pg_default_hba_rules:             # postgres default host-based authentication rules
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'    }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}

pgb_default_hba_rules

name: pgb_default_hba_rules, type: hba[], level: G/C

pgbouncer default host-based authentication rules, array of hba rule objects.

default value provides a fair enough security level for common scenarios, check PGSQL Authentication for details.

pgb_default_hba_rules:            # pgbouncer default host-based authentication rules
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: pwd   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: pwd   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: pwd   ,title: 'allow all user intra access with pwd' }

PG_BACKUP

This section defines variables for pgBackRest, which is used for PGSQL PITR (Point-In-Time-Recovery).

Check PGSQL Backup & PITR for details.

pgbackrest_enabled: true          # enable pgbackrest on pgsql host?
pgbackrest_clean: true            # remove pg backup data during init?
pgbackrest_log_dir: /pg/log/pgbackrest # pgbackrest log dir, `/pg/log/pgbackrest` by default
pgbackrest_method: local          # pgbackrest repo method: local,minio,[user-defined...]
pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repo with local posix fs
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retention full backups by count
    retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
  minio:                          # optional minio repo for pgbackrest
    type: s3                      # minio is s3-compatible, so s3 is used
    s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    s3_uri_style: path            # use path style uri for minio rather than host style
    path: /pgbackrest             # minio backup path, default is `/pgbackrest`
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    retention_full_type: time     # retention full backup by time on minio repo
    retention_full: 14            # keep full backup for last 14 days

pgbackrest_enabled

name: pgbackrest_enabled, type: bool, level: C

enable pgBackRest on pgsql host? default value is true

pgbackrest_clean

name: pgbackrest_clean, type: bool, level: C

remove pg backup data during init? default value is true

pgbackrest_log_dir

name: pgbackrest_log_dir, type: path, level: C

pgBackRest log dir, /pg/log/pgbackrest by default, which is referenced by promtail, the logging agent.

pgbackrest_method

name: pgbackrest_method, type: enum, level: C

pgBackRest repo method: local, minio, or other user-defined methods, local by default

This parameter is used to determine which repo to use for pgBackRest, all available repo methods are defined in pgbackrest_repo.

Pigsty will use local backup repo by default, which will create a backup repo on primary instance’s /pg/backup directory. The underlying storage is specified by pg_fs_bkup.

pgbackrest_repo

name: pgbackrest_repo, type: dict, level: G/C

pgBackRest repo document: https://pgbackrest.org/configuration.html#section-repository

default value includes two repo methods: local and minio, which are defined as follows:

pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repo with local posix fs
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retention full backups by count
    retention_full: 2             # keep 2, at most 3 full backup when using local fs repo
  minio:                          # optional minio repo for pgbackrest
    type: s3                      # minio is s3-compatible, so s3 is used
    s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    s3_uri_style: path            # use path style uri for minio rather than host style
    path: /pgbackrest             # minio backup path, default is `/pgbackrest`
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    retention_full_type: time     # retention full backup by time on minio repo
    retention_full: 14            # keep full backup for last 14 days

PG_SERVICE

This section is about exposing PostgreSQL services to the outside world, including:

  • Exposing different PostgreSQL services on different ports with haproxy
  • Binding an optional L2 VIP to the primary instance with vip-manager
  • Registering cluster/instance DNS records with dnsmasq on infra nodes

pg_weight: 100          #INSTANCE # relative load balance weight in service, 100 by default, 0-255
pg_default_service_dest: pgbouncer # default service destination if svc.dest='default'
pg_default_services:              # postgres default service definitions
  - { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }
  - { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
  - { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
  - { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}
pg_vip_enabled: false             # enable a l2 vip for pgsql primary? false by default
pg_vip_address: 127.0.0.1/24      # vip address in `<ipv4>/<mask>` format, require if vip is enabled
pg_vip_interface: eth0            # vip network interface to listen, eth0 by default
pg_dns_suffix: ''                 # pgsql dns suffix, '' by default
pg_dns_target: auto               # auto, primary, vip, none, or ad hoc ip

pg_weight

name: pg_weight, type: int, level: G

relative load balance weight in service, 100 by default, 0-255

default value: 100. You have to define it in instance vars and reload the service to take effect.
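
For example, a sketch of draining one replica from read traffic by setting its weight to 0 at the instance level (IPs and names follow the pg-test example style used elsewhere in this document):

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica , pg_weight: 100 }   # normal share of traffic
    10.10.10.13: { pg_seq: 3, pg_role: replica , pg_weight: 0   }   # drained from the replica service
  vars: { pg_cluster: pg-test }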

pg_service_provider

name: pg_service_provider, type: string, level: G/C

dedicated haproxy node group name, or empty string to use local nodes (the default).

If specified, PostgreSQL Services will be registered to the dedicated haproxy node group instead of this pgsql cluster nodes.

Do remember to allocate unique ports on the dedicated haproxy nodes for each service!

For example, if we define following parameters on 3-node pg-test cluster:

pg_service_provider: infra       # use load balancer on group `infra`
pg_default_services:             # alloc port 10001 and 10002 for pg-test primary/replica service  
  - { name: primary ,port: 10001 ,dest: postgres  ,check: /primary   ,selector: "[]" }
  - { name: replica ,port: 10002 ,dest: postgres  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }

pg_default_service_dest

name: pg_default_service_dest, type: enum, level: G/C

When defining a service, if svc.dest=‘default’, this parameter will be used as the default value.

default value: pgbouncer, meaning the 5433 primary service and 5434 replica service will route traffic to pgbouncer by default.

If you don’t want to use pgbouncer, set it to postgres instead, and traffic will be routed to postgres directly.
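
For example, to bypass pgbouncer entirely:

pg_default_service_dest: postgres    # primary (5433) & replica (5434) services route to postgres directly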

pg_default_services

name: pg_default_services, type: service[], level: G/C

postgres default service definitions

the default value defines four services, which are explained in PGSQL Service

pg_default_services:               # postgres default service definitions
  - { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }
  - { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
  - { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
  - { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}

pg_vip_enabled

name: pg_vip_enabled, type: bool, level: C

enable a l2 vip for pgsql primary?

default value is false, means no L2 VIP is created for this cluster.

L2 VIP can only be used within the same L2 network, which may incur extra restrictions on your network topology.

pg_vip_address

name: pg_vip_address, type: cidr4, level: C

vip address in <ipv4>/<mask> format, if vip is enabled, this parameter is required.

default value: 127.0.0.1/24. This value consists of two parts: ipv4 and mask, separated by /.

pg_vip_interface

name: pg_vip_interface, type: string, level: C/I

vip network interface to listen, eth0 by default.

It should be the primary intranet interface of your node, i.e. the one holding the IP address you used in the inventory file.

If your nodes have different interfaces, you can override it with instance vars:

pg-test:
    hosts:
        10.10.10.11: {pg_seq: 1, pg_role: replica ,pg_vip_interface: eth0 }
        10.10.10.12: {pg_seq: 2, pg_role: primary ,pg_vip_interface: eth1 }
        10.10.10.13: {pg_seq: 3, pg_role: replica ,pg_vip_interface: eth2 }
    vars:
        pg_vip_enabled: true          # enable L2 VIP for this cluster, bind to primary instance by default
        pg_vip_address: 10.10.10.3/24 # the L2 network CIDR: 10.10.10.0/24, the vip address: 10.10.10.3
        # pg_vip_interface: eth1      # if your node have non-uniform interface, you can define it here

pg_dns_suffix

name: pg_dns_suffix, type: string, level: C

pgsql dns suffix, empty string by default. The cluster DNS name is defined as {{ pg_cluster }}{{ pg_dns_suffix }}

For example, if you set pg_dns_suffix to .db.vip.company.tld for cluster pg-test, then the cluster DNS name will be pg-test.db.vip.company.tld
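
For example, reusing the suffix from the paragraph above:

pg_dns_suffix: '.db.vip.company.tld'    # pg-test resolves as pg-test.db.vip.company.tld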

pg_dns_target

name: pg_dns_target, type: enum, level: C

Could be: auto, primary, vip, none, or an ad hoc ip address, which will be the target IP address of cluster DNS record.

default value: auto, which will bind to pg_vip_address if pg_vip_enabled, or fall back to the cluster primary instance IP address.

  • vip: bind to pg_vip_address
  • primary: resolve to cluster primary instance ip address
  • auto: resolve to pg_vip_address if pg_vip_enabled, or fallback to cluster primary instance ip address.
  • none: do not bind to any ip address
  • <ipv4>: bind to the given IP address

PG_EXPORTER

pg_exporter_enabled: true              # enable pg_exporter on pgsql hosts?
pg_exporter_config: pg_exporter.yml    # pg_exporter configuration file name
pg_exporter_cache_ttls: '1,10,60,300'  # pg_exporter collector ttl stage in seconds, '1,10,60,300' by default
pg_exporter_port: 9630                 # pg_exporter listen port, 9630 by default
pg_exporter_params: 'sslmode=disable'  # extra url parameters for pg_exporter dsn
pg_exporter_url: ''                    # overwrite auto-generate pg dsn if specified
pg_exporter_auto_discovery: true       # enable auto database discovery? enabled by default
pg_exporter_exclude_database: 'template0,template1,postgres' # csv of database that WILL NOT be monitored during auto-discovery
pg_exporter_include_database: ''       # csv of database that WILL BE monitored during auto-discovery
pg_exporter_connect_timeout: 200       # pg_exporter connect timeout in ms, 200 by default
pg_exporter_options: ''                # overwrite extra options for pg_exporter
pgbouncer_exporter_enabled: true       # enable pgbouncer_exporter on pgsql hosts?
pgbouncer_exporter_port: 9631          # pgbouncer_exporter listen port, 9631 by default
pgbouncer_exporter_url: ''             # overwrite auto-generate pgbouncer dsn if specified
pgbouncer_exporter_options: ''         # overwrite extra options for pgbouncer_exporter

pg_exporter_enabled

name: pg_exporter_enabled, type: bool, level: C

enable pg_exporter on pgsql hosts?

default value is true, if you don’t want to install pg_exporter, set it to false.

pg_exporter_config

name: pg_exporter_config, type: string, level: C

pg_exporter configuration file name, used by pg_exporter & pgbouncer_exporter

default value: pg_exporter.yml. If you want to use a custom configuration file, specify its relative path here.

Your config file should be placed in files/<filename>.yml. For example, if you want to monitor a remote PolarDB instance, you can use the sample config: files/polar_exporter.yml.

pg_exporter_cache_ttls

name: pg_exporter_cache_ttls, type: string, level: C

pg_exporter collector ttl stage in seconds, ‘1,10,60,300’ by default

default values: 1,10,60,300, which will use 1s, 10s, 60s, 300s for different metric collectors.

ttl_fast: "{{ pg_exporter_cache_ttls.split(',')[0]|int }}"         # critical queries
ttl_norm: "{{ pg_exporter_cache_ttls.split(',')[1]|int }}"         # common queries
ttl_slow: "{{ pg_exporter_cache_ttls.split(',')[2]|int }}"         # slow queries (e.g table size)
ttl_slowest: "{{ pg_exporter_cache_ttls.split(',')[3]|int }}"      # ver slow queries (e.g bloat)

pg_exporter_port

name: pg_exporter_port, type: port, level: C

pg_exporter listen port, 9630 by default

pg_exporter_params

name: pg_exporter_params, type: string, level: C

extra url parameters for pg_exporter dsn

default values: sslmode=disable, which will disable SSL for monitoring connection (since it’s local unix socket by default)

pg_exporter_url

name: pg_exporter_url, type: pgurl, level: C

overwrite auto-generate pg dsn if specified

default value is empty string. If specified, it will be used as the pg_exporter DSN instead of being constructed from other parameters:

This could be useful if you want to monitor a remote pgsql instance, or you want to use a different user/password for monitoring.

'postgres://{{ pg_monitor_username }}:{{ pg_monitor_password }}@{{ pg_host }}:{{ pg_port }}/postgres{% if pg_exporter_params != '' %}?{{ pg_exporter_params }}{% endif %}'

pg_exporter_auto_discovery

name: pg_exporter_auto_discovery, type: bool, level: C

enable auto database discovery? enabled by default

default value is true, which will auto-discover all databases on the postgres server and spawn a new pg_exporter connection for each database.

pg_exporter_exclude_database

name: pg_exporter_exclude_database, type: string, level: C

csv of database that WILL NOT be monitored during auto-discovery

default values: template0,template1,postgres, which will be excluded for database auto discovery.

pg_exporter_include_database

name: pg_exporter_include_database, type: string, level: C

csv of database that WILL BE monitored during auto-discovery

default value is empty string. If this value is set, only the databases in this list will be monitored during auto discovery.
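
For example, a sketch limiting auto-discovery to an explicit allowlist (the database names are hypothetical):

pg_exporter_include_database: 'app1,app2'                       # only these databases will be monitored
pg_exporter_exclude_database: 'template0,template1,postgres'    # default exclusion list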

pg_exporter_connect_timeout

name: pg_exporter_connect_timeout, type: int, level: C

pg_exporter connect timeout in ms, 200 by default

default value: 200ms, which is enough for most cases.

If your remote pgsql server is in another continent, you may want to increase this value to avoid connection timeout.

pg_exporter_options

name: pg_exporter_options, type: arg, level: C

overwrite extra options for pg_exporter

default value is empty string, which falls back to the following default options:

--log.level=info

If you want to customize logging options or other pg_exporter options, you can set it here.

pgbouncer_exporter_enabled

name: pgbouncer_exporter_enabled, type: bool, level: C

enable pgbouncer_exporter on pgsql hosts?

default value is true, which will enable pgbouncer_exporter for the pgbouncer connection pooler.

pgbouncer_exporter_port

name: pgbouncer_exporter_port, type: port, level: C

pgbouncer_exporter listen port, 9631 by default

default values: 9631

pgbouncer_exporter_url

name: pgbouncer_exporter_url, type: pgurl, level: C

overwrite auto-generate pgbouncer dsn if specified

default value is empty string. If specified, it will be used as the pgbouncer_exporter DSN instead of being constructed from other parameters:

'postgres://{{ pg_monitor_username }}:{{ pg_monitor_password }}@:{{ pgbouncer_port }}/pgbouncer?host={{ pg_localhost }}&sslmode=disable'

This could be useful if you want to monitor a remote pgbouncer instance, or you want to use a different user/password for monitoring.

pgbouncer_exporter_options

name: pgbouncer_exporter_options, type: arg, level: C

overwrite extra options for pgbouncer_exporter

default value is empty string, which falls back to the following default options:

--log.level=info

If you want to customize logging options or other pgbouncer_exporter options, you can set it here.

4.3 - Extension List

List of PostgreSQL extensions supported by Pigsty, and their compatibility on different OS distros.

Pigsty has rich support for PostgreSQL extensions, including 230 RPM extensions and 189 DEB extensions.

There are 255 unique extensions in total (RPM + DEB + contrib), including 73 common contrib extensions and 91 extensions available as both RPM and DEB.

Pigsty also maintains 34 RPM Extensions and 10 DEB extensions in its own repo.


RPM Extension

Pigsty has 230 extensions available on EL-compatible distros, including 73 contrib extensions and 157 extra RPM extensions, 34 of which are maintained by Pigsty.

Based on EL8, there are 6 extensions not yet ready for PG 16 (marked with ❋ below), so the actual available count is 224.

name version category repo pkg description comment
pg_cron 1.6 ADMIN pgdg16 pg_cron_16 Job scheduler for PostgreSQL
pg_repack 1.5.0 ADMIN pgdg16 pg_repack_16 Reorganize tables in PostgreSQL databases with minimal locks
ddlx 0.27 ADMIN pgdg16 ddlx_16 DDL eXtractor functions
pg_dirtyread 2 ADMIN pigsty-pgsql pg_dirtyread_16 Read dead but unvacuumed rows from table
pg_readonly 1.0.0 ADMIN pgdg16 pg_readonly_16 cluster database read only
pg_squeeze 1.6 ADMIN pgdg16 pg_squeeze_16 A tool to remove unused space from a relation.
pgagent 4.2 ADMIN pgdg16 pgagent_16 A PostgreSQL job scheduler
pgautofailover 2.1 ADMIN pgdg16 pg_auto_failover_16 pg_auto_failover
pgdd 0.5.2 ADMIN pigsty-pgsql pgdd_16 An in-database data dictionary providing database introspection via standard SQL query syntax. Developed using pgx (https://github.com/zombodb/pgx).
pgfincore 1.3.1 ADMIN pgdg16 pgfincore_16 examine and manage the os buffer cache
pgl_ddl_deploy 2.2 ADMIN pgdg16 pgl_ddl_deploy_16 automated ddl deployment using pglogical
pgpool_adm 1.5 ADMIN pgdg16 pgpool-II-pg16-extensions Administrative functions for pgPool
pgpool_recovery 1.4 ADMIN pgdg16 pgpool-II-pg16-extensions recovery functions for pgpool-II for V4.3
pgpool_regclass 1.0 ADMIN pgdg16 pgpool-II-pg16-extensions replacement for regclass
prioritize 1.0 ADMIN pgdg16 prioritize_16 get and set the priority of PostgreSQL backends
safeupdate 1.4 ADMIN pgdg16 safeupdate_16 Require criteria for UPDATE and DELETE
pgml 2.8.1 AI pigsty-pgsql pgml_16 PostgresML: Run AL/ML workloads with SQL interface
vector 0.7.0 AI pgdg16 pgvector_16 vector data type and ivfflat and hnsw access methods
pg_tiktoken 0.0.1 AI pigsty-pgsql pg_tiktoken_16 pg_tictoken: tiktoken tokenizer for use with OpenAI models in postgres
svector 0.6.1 AI pigsty-pgsql pg_sparse_16 pg_sparse: Sparse vector data type and sparse HNSW access methods obsolete
vectorize 0.15.0 AI pigsty-pgsql pg_vectorize_16 The simplest way to do vector search on Postgres deps: pgmq, pg_cron
wal2json 2.5.3 ETL pgdg16 wal2json_16 Changing data capture in JSON format
decoderbufs 0.1.0 ETL pgdg16 postgres-decoderbufs_16 Logical decoding plugin that delivers WAL stream changes using a Protocol Buffer format
pg_bulkload 3.1.21 ETL pgdg16 pg_bulkload_16 pg_bulkload is a high speed data loading utility for PostgreSQL
pg_fact_loader 2.0 ETL pgdg16 pg_fact_loader_16 build fact tables with Postgres
wrappers 0.3.1 FDW pigsty-pgsql wrappers_16 Foreign data wrappers developed by Supabase
db2_fdw 6.0.1 FDW pgdg16-non-free db2_fdw_16 foreign data wrapper for DB2 access extra db2 deps
hdfs_fdw 2.0.5 FDW pgdg16 hdfs_fdw_16 foreign-data wrapper for remote hdfs servers
mongo_fdw 1.1 FDW pgdg16 mongo_fdw_16 foreign data wrapper for MongoDB access
mysql_fdw 1.2 FDW pgdg16 mysql_fdw_16 Foreign data wrapper for querying a MySQL server
ogr_fdw 1.1 FDW pgdg16 ogr_fdw_16 foreign-data wrapper for GIS data access
oracle_fdw 1.2 FDW pgdg16-non-free oracle_fdw_16 foreign data wrapper for Oracle access extra oracle deps
pgbouncer_fdw 1.1.0 FDW pgdg16 pgbouncer_fdw_16 Extension for querying PgBouncer stats from normal SQL views & running pgbouncer commands from normal SQL functions
sqlite_fdw 1.1 FDW pgdg16 sqlite_fdw_16 SQLite Foreign Data Wrapper
tds_fdw 2.0.3 FDW pgdg16 tds_fdw_16 Foreign data wrapper for querying a TDS database (Sybase or Microsoft SQL Server)
age 1.5.0 FEAT pigsty-pgsql age_16 AGE graph database extension
pg_graphql 1.5.4 FEAT pigsty-pgsql pg_graphql_16 pg_graphql: GraphQL support
pg_jsonschema 0.3.1 FEAT pigsty-pgsql pg_jsonschema_16 PostgreSQL extension providing JSON Schema validation
pg_strom 5.1 FEAT pgdg16-non-free pg_strom_16 PG-Strom - big-data processing acceleration using GPU and NVME extra cuda deps
pgmq 1.1.1 FEAT pigsty-pgsql pgmq_16 A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
pgq 3.5.1 FEAT pgdg16 pgq_16 Generic queue for PostgreSQL
emaj 4.4.0 FEAT pgdg16 e-maj_16 E-Maj extension enables fine-grained write logging and time travel on subsets of the database.
hll 2.18 FEAT pgdg16 hll_16 type for storing hyperloglog data
hypopg 1.4.1 FEAT pgdg16 hypopg_16 Hypothetical indexes for PostgreSQL
jsquery 1.1 FEAT pgdg16 jsquery_16 data type for jsonb inspection
periods 1.2 FEAT pgdg16 periods_16 Provide Standard SQL functionality for PERIODs and SYSTEM VERSIONING
pg_hint_plan 1.6.0 FEAT pgdg16 pg_hint_plan_16 Give PostgreSQL ability to manually force some decisions in execution plans.
pg_ivm 1.8 FEAT pgdg16 pg_ivm_16 incremental view maintenance on PostgreSQL
pgtt 3.1.0 FEAT pgdg16 pgtt_16 Extension to add Global Temporary Tables feature to PostgreSQL
rum 1.3 FEAT pgdg16 rum_16 RUM index access method
table_version 1.10.3 FEAT pgdg16 table_version_16 PostgreSQL table versioning extension
temporal_tables 1.2.2 FEAT pgdg16 temporal_tables_16 temporal tables
pg_net 0.9.1 FUNC pgdg16 pg_net_16 Async HTTP
count_distinct 3.0.1 FUNC pgdg16 count_distinct_16 An alternative to COUNT(DISTINCT …) aggregate, usable with HashAggregate
extra_window_functions 1.0 FUNC pgdg16 extra_window_functions_16 Extra Window Functions for PostgreSQL
gzip 1.0 FUNC pgdg16 pgsql_gzip_16 gzip and gunzip functions. new in pgdg
http 1.6 FUNC pgdg16 pgsql_http_16 HTTP client for PostgreSQL, allows web page retrieval inside the database. new in pgdg
pg_background 1.0 FUNC pgdg16 pg_background_16 Run SQL queries in the background
pg_idkit 0.2.3 FUNC pigsty-pgsql pg_idkit_16 multi-tool for generating new/niche universally unique identifiers (ex. UUIDv6, ULID, KSUID)
pg_later 0.1.0 FUNC pigsty-pgsql pg_later_16 pg_later: Run queries now and get results later dep: pgmq
pgjwt 0.2.0 FUNC pigsty-pgsql pgjwt_16 JSON Web Token API for Postgresql
pgsql_tweaks 0.10.2 FUNC pgdg16 pgsql_tweaks_16 Some functions and views for daily usage
tdigest 1.4.1 FUNC pgdg16 tdigest_16 Provides tdigest aggregate function.
topn 2.6.0 FUNC pgdg16 topn_16 type for top-n JSONB
postgis 3.4.2 GIS pgdg16 postgis34_16 PostGIS geometry and geography spatial types and functions
address_standardizer 3.4.2 GIS pgdg16 postgis34_16 Used to parse an address into constituent elements. Generally used to support geocoding address normalization step.
address_standardizer_data_us 3.4.2 GIS pgdg16 postgis34_16 Address Standardizer US dataset example
h3 4.1.3 GIS pgdg16 h3-pg_16 H3 bindings for PostgreSQL
h3_postgis 4.1.3 GIS pgdg16 h3-pg_16 H3 PostGIS integration
pgrouting 3.6.0 GIS pgdg16 pgrouting_16 pgRouting Extension
pointcloud 1.2.5 GIS pigsty-pgsql pointcloud_16 data type for lidar point clouds
pointcloud_postgis 1.2.5 GIS pgdg16 pointcloud_16 integration for pointcloud LIDAR data and PostGIS geometry data
postgis_raster 3.4.2 GIS pgdg16 postgis34_16 PostGIS raster types and functions
postgis_sfcgal 3.4.2 GIS pgdg16 postgis34_16 PostGIS SFCGAL functions
postgis_tiger_geocoder 3.4.2 GIS pgdg16 postgis34_16 PostGIS tiger geocoder and reverse geocoder
postgis_topology 3.4.2 GIS pgdg16 postgis34_16 PostGIS topology spatial types and functions
plv8 3.2.2 LANG pigsty-pgsql plv8_16 PL/JavaScript (v8) trusted procedural language
pg_tle 1.4.0 LANG pigsty-pgsql pg_tle_16 Trusted Language Extensions for PostgreSQL
pldbgapi 1.1 LANG pgdg16 pldebugger_16 server-side support for debugging PL/pgSQL functions
pllua 2.0 LANG pgdg16 pllua_16 Lua as a procedural language
plluau 2.0 LANG pgdg16 pllua_16 Lua as an untrusted procedural language
plpgsql_check 2.7 LANG pgdg16 plpgsql_check_16 extended check for plpgsql functions
plprql 0.1.0 LANG pigsty-pgsql plprql_16 Use PRQL in PostgreSQL - Pipelined Relational Query Language
plr 8.4.6 LANG pgdg16 plr_16 load R interpreter and execute R script from within a database
plsh 2 LANG pgdg16 plsh_16 PL/sh procedural language
columnar 11.1-11 OLAP pigsty-pgsql hydra_16 Hydra Columnar extension hydra 1.1.2
duckdb_fdw 1.1 OLAP pigsty-pgsql duckdb_fdw_16 DuckDB Foreign Data Wrapper libduckdb 0.10.2
parquet_s3_fdw 0.3 OLAP pigsty-pgsql parquet_s3_fdw_16 foreign-data wrapper for parquet on S3/MinIO deps: libarrow-s3
pg_analytics 0.6.1 OLAP pigsty-pgsql pg_analytics_16 Real-time analytics for PostgreSQL using columnar storage and vectorized execution
pg_lakehouse 0.7.0 OLAP pigsty-pgsql pg_lakehouse_16 pg_lakehouse: An analytical query engine for Postgres rust
timescaledb 2.15.0 OLAP timescaledb timescaledb-2-postgresql-16 Enables scalable inserts and complex queries for time-series data (Apache 2 Edition)
citus_columnar 11.3-1 OLAP pgdg16 citus_16 Citus columnar storage engine citus
pg_tier 0.0.3 OLAP pigsty-pgsql pg_tier_16 pg_tier: tiered storage developed by tembo.io deps: parquet_s3_fdw
pglogical 2.4.4 REPL pgdg16 pglogical_16 PostgreSQL Logical Replication
pglogical_origin 1.0.0 REPL pgdg16 pglogical_16 Dummy extension for compatibility when upgrading from Postgres 9.4
repmgr 5.4 REPL pgdg16 repmgr_16 Replication manager for PostgreSQL
pg_search 0.7.0 SEARCH pigsty-pgsql pg_search_16 pg_search: Full text search for PostgreSQL using BM25 old name: pg_bm25
zhparser 2.2 SEARCH pigsty-pgsql zhparser_16 a parser for full-text search of Chinese deps: scws
pg_bigm 1.2 SEARCH pgdg16 pg_bigm_16 create 2-gram (bigram) index for faster full text search.
pg_tde 1.0 SEC pigsty-pgsql pg_tde_16 pg_tde access method alpha
pgsmcrypto 0.1.0 SEC pigsty-pgsql pgsmcrypto_16 PostgreSQL SM Algorithm Extension
anon 1.3.2 SEC pgdg16 postgresql_anonymizer_16 Data anonymization tools
credcheck 2.7.0 SEC pgdg16 credcheck_16 credcheck - postgresql plain text credential checker
logerrors 2.1 SEC pgdg16 logerrors_16 Function for collecting statistics about messages in logfile
login_hook 1.5 SEC pgdg16 login_hook_16 login_hook - hook to execute login_hook.login() at login time
passwordcracklib 3.0.0 SEC pgdg16 passwordcracklib_16 Strengthen PostgreSQL user password checks with cracklib
pg_auth_mon 1.1 SEC pgdg16 pg_auth_mon_16 monitor connection attempts per user
pg_jobmon 1.4.1 SEC pgdg16 pg_jobmon_16 Extension for logging and monitoring functions in PostgreSQL
pgaudit 16.0 SEC pgdg16 pgaudit_16 provides auditing functionality
pgauditlogtofile 1.5 SEC pgdg16 pgauditlogtofile_16 pgAudit addon to redirect audit log to an independent file
pgcryptokey 1.0 SEC pgdg16 pgcryptokey_16 cryptographic key management
pgsodium 3.1.9 SEC pgdg16 pgsodium_16 Postgres extension for libsodium functions
set_user 4.0.1 SEC pgdg16 set_user_16 similar to SET ROLE but with added logging
supabase_vault 0.2.8 SEC pigsty-pgsql vault_16 Supabase Vault Extension
citus 12.1-1 SHARD pgdg16 citus_16 Distributed PostgreSQL as an extension
pg_fkpart 1.7 SHARD pgdg16 pg_fkpart_16 Table partitioning by foreign key utility
pg_partman 5.1.0 SHARD pgdg16 pg_partman_16 Extension to manage partitioned tables by time or ID
orafce 4.10 SIM pgdg16 orafce_16 Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
pg_dbms_job 1.5.0 SIM pgdg16 pg_dbms_job_16 Extension to add Oracle DBMS_JOB full compatibility to PostgreSQL
pg_dbms_lock 1.0.0 SIM pgdg16 pg_dbms_lock_16 Extension to add Oracle DBMS_LOCK full compatibility to PostgreSQL
pg_dbms_metadata 1.0.0 SIM pgdg16 pg_dbms_metadata_16 Extension to add Oracle DBMS_METADATA compatibility to PostgreSQL
pg_extra_time 1.1.2 SIM pgdg16 pg_extra_time_16 Some date time functions and operators that,
pgmemcache 2.3.0 SIM pgdg16 pgmemcache_16 memcached interface
pg_permissions 1.1 STAT pgdg16 pg_permissions_16 view object permissions and compare them with the desired state
pg_profile 4.6 STAT pgdg16 pg_profile_16 PostgreSQL load profile repository and report builder
pg_qualstats 2.1.0 STAT pgdg16 pg_qualstats_16 An extension collecting statistics about quals
pg_show_plans 2.1 STAT pgdg16 pg_show_plans_16 show query plans of all currently running SQL statements
pg_stat_kcache 2.2.3 STAT pgdg16 pg_stat_kcache_16 Kernel statistics gathering
pg_stat_monitor 2.0 STAT pgdg16 pg_stat_monitor_16 The pg_stat_monitor is a PostgreSQL Query Performance Monitoring tool, based on PostgreSQL contrib module pg_stat_statements. pg_stat_monitor provides aggregated statistics, client information, plan details including plan, and histogram information.
pg_statviz 0.6 STAT pgdg16 pg_statviz_extension_16 stats visualization and time series analysis
pg_store_plans 1.8 STAT pgdg16 pg_store_plans_16 track plan statistics of all SQL statements executed
pg_track_settings 2.1.2 STAT pgdg16 pg_track_settings_16 Track settings changes
pg_wait_sampling 1.1 STAT pgdg16 pg_wait_sampling_16 sampling based statistics of wait events
pgexporter_ext 0.2.3 STAT pgdg16 pgexporter_ext_16 pgexporter extension for extra metrics
pgmeminfo 1.0 STAT pgdg16 pgmeminfo_16 show memory usage
plprofiler 4.2 STAT pgdg16 plprofiler_16 server-side support for profiling PL/pgSQL functions
powa 4.2.2 STAT pgdg16 powa_16 PostgreSQL Workload Analyser-core
system_stats 2.0 STAT pgdg16 system_stats_16 EnterpriseDB system statistics for PostgreSQL
dbt2 0.45.0 TEST pgdg16 dbt2-pg16-extensions OSDL-DBT-2 test kit
faker 0.5.3 TEST pgdg16 postgresql_faker_16 Wrapper for the Faker Python library postgresql_faker
pgtap 1.3.3 TEST pgdg16 pgtap_16 Unit testing for PostgreSQL
ip4r 2.4 TYPE pgdg16 ip4r_16 IPv4/v6 and IPv4/v6 range index type for PostgreSQL
md5hash 1.0.1 TYPE pigsty-pgsql md5hash_16 type for storing 128-bit binary data inline
pg_uuidv7 1.5 TYPE pgdg16 pg_uuidv7_16 pg_uuidv7: create UUIDv7 values in postgres
pgmp 1.1 TYPE pgdg16 pgmp_16 Multiple Precision Arithmetic extension
prefix 1.2.0 TYPE pgdg16 prefix_16 Prefix Range module for PostgreSQL
roaringbitmap 0.5 TYPE pigsty-pgsql pg_roaringbitmap_16 support for Roaring Bitmaps
semver 0.32.1 TYPE pgdg16 semver_16 Semantic version data type
timestamp9 1.4.0 TYPE pgdg16 timestamp9_16 timestamp nanosecond resolution
uint 0 TYPE pgdg16 uint_16 unsigned integer types
unit 7 TYPE pgdg16 postgresql-unit_16 SI units extension
imgsmlr ❋ 1.0.0 AI pigsty-pgsql imgsmlr_16 Image similarity with haar
pg_similarity ❋ 1.0.0 AI pigsty-pgsql pg_similarity_16 support similarity queries
multicorn ❋ 2.4 FDW pgdg16 multicorn2_16 Fetch foreign data in Python in your PostgreSQL server.
geoip ❋ 0.2.4 GIS pgdg16 geoip_16 IP-based geolocation query
plproxy ❋ 2.10.0 SHARD pgdg16 plproxy_16 Database partitioning implemented as procedural language
mysqlcompat ❋ 0.0.7 SIM pgdg16 mysqlcompat_16 A reimplementation of as many MySQL functions as possible in PostgreSQL

DEB Extension

Pigsty has 189 available extensions on Debian systems, including 73 PostgreSQL contrib extensions and 116 extra deb extensions, 10 of which are maintained by Pigsty.

Based on Debian 12 & Ubuntu 22.04, which may have very slight differences in available extensions:

name version category repo pkg description comment
pg_cron 1.6 ADMIN pgdg16 postgresql-16-cron Job scheduler for PostgreSQL
pg_repack 1.5.0 ADMIN pgdg16 postgresql-16-repack Reorganize tables in PostgreSQL databases with minimal locks
pg_dirtyread 2 ADMIN pgdg16 postgresql-16-dirtyread Read dead but unvacuumed rows from table
pg_squeeze 1.6 ADMIN pgdg16 postgresql-16-pgsphere A tool to remove unused space from a relation.
pgagent 4.2 ADMIN pgdg16 postgresql-16-pgagent A PostgreSQL job scheduler
pgautofailover 2.1 ADMIN pgdg16 postgresql-16-auto-failover pg_auto_failover
pgdd 0.5.2 ADMIN pigsty-pgsql pgdd An in-database data dictionary providing database introspection via standard SQL query syntax ubuntu22 only
pgfincore 1.3.1 ADMIN pgdg16 postgresql-16-pgfincore examine and manage the os buffer cache
pgl_ddl_deploy 2.2 ADMIN pgdg16 postgresql-16-pgl-ddl-deploy automated ddl deployment using pglogical
pgpool_adm 1.4 ADMIN pgdg16 postgresql-16-pgpool2 Administrative functions for pgPool
pgpool_recovery 1.4 ADMIN pgdg16 postgresql-16-pgpool2 recovery functions for pgpool-II for V4.3
pgpool_regclass 1.0 ADMIN pgdg16 postgresql-16-pgpool2 replacement for regclass
prioritize 1.0 ADMIN pgdg16 postgresql-16-prioritize get and set the priority of PostgreSQL backends
toastinfo 1 ADMIN pgdg16 postgresql-16-toastinfo show details on toasted datums
pgml 2.8.1 AI pgml postgresql-16-pgml PostgresML: Run AL/ML workloads with SQL interface
vector 0.7.0 AI pgdg16 postgresql-16-pgvector vector data type and ivfflat and hnsw access methods
pg_similarity 1.0 AI pgdg16 postgresql-16-similarity support similarity queries
svector 0.6.1 AI pigsty-pgsql pg-sparse pg_sparse: Sparse vector data type and sparse HNSW access methods deprecated
wal2json 2.5.3 ETL pgdg16 postgresql-16-wal2json Changing data capture in JSON format
decoderbufs 0.1.0 ETL pgdg16 postgresql-16-decoderbufs Logical decoding plugin that delivers WAL stream changes using a Protocol Buffer format
pg_fact_loader 2.0 ETL pgdg16 postgresql-16-pg-fact-loader build fact tables with Postgres
wrappers 0.3.1 FDW pigsty-pgsql wrappers Postgres Foreign Data Wrappers by Supabase rust
mysql_fdw 1.2 FDW pgdg16 postgresql-16-mysql-fdw Foreign data wrapper for querying a MySQL server
ogr_fdw 1.1 FDW pgdg16 postgresql-16-ogr-fdw foreign-data wrapper for GIS data access
oracle_fdw 1.2 FDW pgdg16 postgresql-16-oracle-fdw foreign data wrapper for Oracle access
tds_fdw 2.0.3 FDW pgdg16 postgresql-16-tds-fdw Foreign data wrapper for querying a TDS database (Sybase or Microsoft SQL Server)
age 1.5.0 FEAT pgdg16 postgresql-16-age AGE graph database extension
pg_graphql 1.5.4 FEAT pigsty-pgsql pg-graphql pg_graphql: GraphQL support
pg_jsonschema 0.3.1 FEAT pigsty-pgsql pg-jsonschema PostgreSQL extension providing JSON Schema validation rust
rdkit 4.3.0 FEAT pgdg16 postgresql-16-rdkit Cheminformatics functionality for PostgreSQL.
hll 2.18 FEAT pgdg16 postgresql-16-hll type for storing hyperloglog data
hypopg 1.4.1 FEAT pgdg16 postgresql-16-hypopg Hypothetical indexes for PostgreSQL
jsquery 1.1 FEAT pgdg16 postgresql-16-jsquery data type for jsonb inspection
periods 1.2 FEAT pgdg16 postgresql-16-periods Provide Standard SQL functionality for PERIODs and SYSTEM VERSIONING
pg_hint_plan 1.6.0 FEAT pgdg16 postgresql-16-pg-hint-plan Give PostgreSQL ability to manually force some decisions in execution plans.
pgq 3.5 FEAT pgdg16 postgresql-16-pgq Generic queue for PostgreSQL
pgq_node 3.5 FEAT pgdg16 postgresql-16-pgq Generic queue for PostgreSQL, node extension
pre_prepare 0.4 FEAT pgdg16 postgresql-16-preprepare Prepare your prepare statement on server-side
rum 1.3 FEAT pgdg16 postgresql-16-rum RUM index access method
pg_net 0.9.1 FUNC pigsty-pgsql pg-net Enables asynchronous (non-blocking) HTTP/HTTPS requests with SQL rust
extra_window_functions 1.0 FUNC pgdg16 postgresql-16-extra-window-functions Extra Window Functions for PostgreSQL
first_last_agg 0.1.4 FUNC pgdg16 postgresql-16-first-last-agg first() and last() aggregate functions
http 1.6 FUNC pgdg16 postgresql-16-http HTTP client for PostgreSQL, allows web page retrieval inside the database.
icu_ext 1.8 FUNC pgdg16 postgresql-16-icu-ext PostgreSQL extension (in C) to expose functionality from the ICU library
pg_sphere 1.5.1 FUNC pgdg16 postgresql-16-pgsphere spherical objects with useful functions, operators and index support
pgpcre 1 FUNC pgdg16 postgresql-16-pgpcre Perl Compatible Regular Expression functions
q3c 2.0.1 FUNC pgdg16 postgresql-16-q3c q3c sky indexing plugin
tdigest 1.4.1 FUNC pgdg16 postgresql-16-tdigest Provides tdigest aggregate function.
topn 2.6.0 FUNC pgdg16 postgresql-16-topn type for top-n JSONB
postgis-3 3.4.2 GIS pgdg16 postgresql-16-postgis-3 PostGIS geometry and geography spatial types and functions
address_standardizer-3 3.4.2 GIS pgdg16 postgresql-16-postgis-3 Used to parse an address into constituent elements. Generally used to support geocoding address normalization step.
address_standardizer_data_us-3 3.4.2 GIS pgdg16 postgresql-16-postgis-3 Address Standardizer US dataset example
h3 4.1.3 GIS pgdg16 postgresql-16-h3 H3 bindings for PostgreSQL
h3_postgis 4.1.3 GIS pgdg16 postgresql-16-h3 H3 PostGIS integration
ip4r 2.4 GIS pgdg16 postgresql-16-ip4r IPv4/v6 and IPv4/v6 range index type for PostgreSQL
mobilitydb 1.1.1 GIS pgdg16 postgresql-16-mobilitydb MobilityDB geospatial trajectory data management & analysis platform
pgrouting 3.6.2 GIS pgdg16 postgresql-16-pgrouting pgRouting Extension
pointcloud 1.2.5 GIS pgdg16 postgresql-16-pointcloud data type for lidar point clouds
pointcloud_postgis 1.2.5 GIS pgdg16 postgresql-16-pointcloud integration for pointcloud LIDAR data and PostGIS geometry data
postgis_raster-3 3.4.2 GIS pgdg16 postgresql-16-postgis-3 PostGIS raster types and functions
postgis_sfcgal-3 3.4.2 GIS pgdg16 postgresql-16-postgis-3 PostGIS SFCGAL functions
postgis_tiger_geocoder-3 3.4.2 GIS pgdg16 postgresql-16-postgis-3 PostGIS tiger geocoder and reverse geocoder
postgis_topology-3 3.4.2 GIS pgdg16 postgresql-16-postgis-3 PostGIS topology spatial types and functions
hstore_pllua 1.0 LANG pgdg16 postgresql-16-pllua Hstore transform for Lua
hstore_plluau 1.0 LANG pgdg16 postgresql-16-pllua transform between hstore and plluau
omnidb_plpgsql_debugger 1.0.0 LANG pgdg16 postgresql-16-omnidb Enable PL/PgSQL Debugger on OmniDB
pldbgapi 1.1 LANG pgdg16 postgresql-16-pldbgapi server-side support for debugging PL/pgSQL functions
pljava 1.6.7 LANG pgdg16 postgresql-16-pljava PL/Java procedural language
pllua 2.0 LANG pgdg16 postgresql-16-pllua Lua as a procedural language
plluau 2.0 LANG pgdg16 postgresql-16-pllua Lua as an untrusted procedural language
plpgsql_check 2.7 LANG pgdg16 postgresql-16-plpgsql-check extended check for plpgsql functions
plprql 0.1.0 LANG pigsty-pgsql plprql Use PRQL in PostgreSQL - Pipelined Relational Query Language debian only
plr 8.4.6 LANG pgdg16 postgresql-16-plr load R interpreter and execute R script from within a database
plsh 2 LANG pgdg16 postgresql-16-plsh PL/sh procedural language
pg_analytics 0.6.1 OLAP pigsty-pgsql pg-analytics Real-time analytics for PostgreSQL using columnar storage and vectorized execution ubuntu22 only
pg_lakehouse 0.7.0 OLAP pigsty-pgsql pg-lakehouse An analytical query engine for Postgres ubuntu22 only
timescaledb 2.15.0 OLAP timescaledb timescaledb-2-postgresql-16 Enables scalable inserts and complex queries for time-series data (Apache 2 Edition)
citus_columnar 11.3-1 OLAP pgdg16 postgresql-16-citus-12.1 Citus columnar storage engine citus
pglogical 2.4.4 REPL pgdg16 postgresql-16-pglogical PostgreSQL Logical Replication
londiste 3.8 REPL pgdg16 postgresql-16-londiste-sql Londiste replication support code
mimeo 1.5.1 REPL pgdg16 postgresql-16-mimeo Extension for specialized, per-table replication between PostgreSQL instances
pglogical_origin 1.0.0 REPL pgdg16 postgresql-16-pglogical Dummy extension for compatibility when upgrading from Postgres 9.4
pglogical_ticker 1.4 REPL pgdg16 postgresql-16-pglogical Have an accurate view on pglogical replication delay
repmgr 5.4 REPL pgdg16 postgresql-16-repmgr Replication manager for PostgreSQL
pg_search 0.7.0 SEARCH pigsty-pgsql pg-search Full text search for PostgreSQL using BM25 ubuntu22 only
credcheck 2.7.0 SEC pgdg16 postgresql-16-credcheck credcheck - postgresql plain text credential checker
pg_snakeoil 1 SEC pgdg16 postgresql-16-snakeoil PostgreSQL Anti-Virus
pgaudit 16.0 SEC pgdg16 postgresql-16-pgaudit provides auditing functionality
pgauditlogtofile 1.5 SEC pgdg16 postgresql-16-pgauditlogtofile pgAudit addon to redirect audit log to an independent file
set_user 4.0.1 SEC pgdg16 postgresql-16-set-user similar to SET ROLE but with added logging
table_log 0.6.1 SEC pgdg16 postgresql-16-tablelog Module to log changes on tables
citus 12.1-1 SHARD pgdg16 postgresql-16-citus-12.1 Distributed PostgreSQL as an extension
pg_partman 5.1.0 SHARD pgdg16 postgresql-16-partman Extension to manage partitioned tables by time or ID
plproxy 2.11.0 SHARD pgdg16 postgresql-16-plproxy Database partitioning implemented as procedural language
orafce 4.10 SIM pgdg16 postgresql-16-orafce Functions and operators that emulate a subset of functions and packages from the Oracle RDBMS
pgmemcache 2.3.0 SIM pgdg16 postgresql-16-pgmemcache memcached interface
pg_qualstats 2.1.0 STAT pgdg16 postgresql-16-pg-qualstats An extension collecting statistics about quals
pg_show_plans 2.1 STAT pgdg16 postgresql-16-show-plans show query plans of all currently running SQL statements
pg_stat_kcache 2.2.3 STAT pgdg16 postgresql-16-pg-stat-kcache Kernel statistics gathering
pg_statviz 0.6 STAT pgdg16 postgresql-16-statviz stats visualization and time series analysis broken on debian12
pg_track_settings 2.1.2 STAT pgdg16 postgresql-16-pg-track-settings Track settings changes
pg_wait_sampling 1.1 STAT pgdg16 postgresql-16-pg-wait-sampling sampling based statistics of wait events
plprofiler 4.2 STAT pgdg16 postgresql-16-plprofiler server-side support for profiling PL/pgSQL functions
powa 4.2.2 STAT pgdg16 postgresql-16-powa PostgreSQL Workload Analyser-core
pgtap 1.3.3 TEST pgdg16 postgresql-16-pgtap Unit testing for PostgreSQL
asn1oid 1 TYPE pgdg16 postgresql-16-asn1oid asn1oid extension
debversion 1.1 TYPE pgdg16 postgresql-16-debversion Debian version number data type
numeral 1 TYPE pgdg16 postgresql-16-numeral numeral datatypes extension
pg_rational 0.0.1 TYPE pgdg16 postgresql-16-rational bigint fractions
pg_rrule 0.2.0 TYPE pgdg16 postgresql-16-pg-rrule RRULE field type for PostgreSQL
pgfaceting 0.2.0 TYPE pgdg16 postgresql-16-pgfaceting fast faceting queries using an inverted index depend pg_roaringbitmap
pgmp 1.1 TYPE pgdg16 postgresql-16-pgmp Multiple Precision Arithmetic extension
prefix 1.2.0 TYPE pgdg16 postgresql-16-prefix Prefix Range module for PostgreSQL
roaringbitmap 0.5 TYPE pgdg16 postgresql-16-roaringbitmap support for Roaring Bitmaps
semver 0.32.1 TYPE pgdg16 postgresql-16-semver Semantic version data type
unit 7 TYPE pgdg16 postgresql-16-unit SI units extension

Contrib Extension

PostgreSQL has 73 built-in contrib extensions available on all distros.

name version category description
adminpack 2.1 ADMIN administrative functions for PostgreSQL
autoinc 1.0 FUNC functions for autoincrementing fields
bool_plperl 1.0 LANG transform between bool and plperl
bool_plperlu 1.0 LANG transform between bool and plperlu
btree_gin 1.3 FUNC support for indexing common datatypes in GIN
btree_gist 1.7 FUNC support for indexing common datatypes in GiST
citext 1.6 TYPE data type for case-insensitive character strings
cube 1.5 TYPE data type for multidimensional cubes
dblink 1.2 FDW connect to other PostgreSQL databases from within a database
dict_int 1.0 FUNC text search dictionary template for integers
dict_xsyn 1.0 FUNC text search dictionary template for extended synonym processing
file_fdw 1.0 FDW foreign-data wrapper for flat file access
hstore 1.8 TYPE data type for storing sets of (key, value) pairs
hstore_plperl 1.0 LANG transform between hstore and plperl
hstore_plperlu 1.0 LANG transform between hstore and plperlu
hstore_plpython3u 1.0 LANG transform between hstore and plpython3u
insert_username 1.0 FUNC functions for tracking who changed a table
intagg 1.1 FUNC integer aggregator and enumerator (obsolete)
intarray 1.5 FUNC functions, operators, and index support for 1-D arrays of integers
isn 1.2 TYPE data types for international product numbering standards
jsonb_plperl 1.0 LANG transform between jsonb and plperl
jsonb_plperlu 1.0 LANG transform between jsonb and plperlu
jsonb_plpython3u 1.0 LANG transform between jsonb and plpython3u
lo 1.1 ADMIN Large Object maintenance
ltree 1.2 TYPE data type for hierarchical tree-like structures
ltree_plpython3u 1.0 LANG transform between ltree and plpython3u
moddatetime 1.0 FUNC functions for tracking last modification time
old_snapshot 1.0 ADMIN utilities in support of old_snapshot_threshold
pageinspect 1.12 STAT inspect the contents of database pages at a low level
pg_buffercache 1.4 STAT examine the shared buffer cache
pg_freespacemap 1.2 STAT examine the free space map (FSM)
pg_prewarm 1.2 ADMIN prewarm relation data
pg_stat_statements 1.10 STAT track planning and execution statistics of all SQL statements executed
pg_surgery 1.0 ADMIN extension to perform surgery on a damaged relation
pg_visibility 1.2 STAT examine the visibility map (VM) and page-level visibility info
pg_walinspect 1.1 STAT functions to inspect contents of PostgreSQL Write-Ahead Log
pgrowlocks 1.2 STAT show row-level locking information
pgstattuple 1.5 STAT show tuple-level statistics
plperl 1.0 LANG PL/Perl procedural language
plperlu 1.0 LANG PL/PerlU untrusted procedural language
plpgsql 1.0 LANG PL/pgSQL procedural language
plpython3u 1.0 LANG PL/Python3U untrusted procedural language
pltcl 1.0 LANG PL/Tcl procedural language
pltclu 1.0 LANG PL/TclU untrusted procedural language
postgres_fdw 1.1 FDW foreign-data wrapper for remote PostgreSQL servers
refint 1.0 FUNC functions for implementing referential integrity (obsolete)
seg 1.4 TYPE data type for representing line segments or floating-point intervals
sslinfo 1.2 STAT information about SSL certificates
tcn 1.0 FUNC Triggered change notifications
tsm_system_rows 1.0 FUNC TABLESAMPLE method which accepts number of rows as a limit
tsm_system_time 1.0 FUNC TABLESAMPLE method which accepts time in milliseconds as a limit
unaccent 1.1 FUNC text search dictionary that removes accents
uuid-ossp 1.1 FUNC generate universally unique identifiers (UUIDs)
xml2 1.1 TYPE XPath querying and XSLT
ltree_plpython LANG transform between ltree and plpython
hstore_plpython LANG transform between hstore and plpython
auto_explain STAT Provides a means for logging execution plans of slow statements automatically
vacuumlo ADMIN utility program that will remove any orphaned large objects from a PostgreSQL database
basic_archive ADMIN an example of an archive module
basebackup_to_shell ADMIN adds a custom basebackup target called shell
jsonb_plpython LANG transform between jsonb and plpython
passwordcheck SEC checks user passwords and rejects weak passwords
sepgsql SEC label-based mandatory access control (MAC) based on SELinux security policy.
earthdistance 1.1 GIS calculate great-circle distances on the surface of the Earth
fuzzystrmatch 1.2 SEARCH determine similarities and distance between strings
oid2name ADMIN utility program that helps administrators to examine the file structure used by PostgreSQL
bloom 1.0 FEAT bloom access method - signature file based index
auth_delay SEC pause briefly before reporting authentication failure
pg_trgm 1.6 SEARCH text similarity measurement and index searching based on trigrams
tablefunc 1.0 OLAP functions that manipulate whole tables, including crosstab
pgcrypto 1.3 SEC cryptographic functions
amcheck 1.3 ADMIN functions for verifying relation integrity
test_decoding REPL SQL-based test/example module for WAL logical decoding

Pigsty Extension

Pigsty has maintained and packaged 37 RPM extensions for PostgreSQL 16 on EL systems (el8, el9), check Pigsty RPMs for details.

name version comment
pgml 2.8.1 PostgresML: access most advanced machine learning algorithms and pretrained models with SQL
age 1.5.0 Apache AGE graph database extension
pointcloud 1.2.5 A PostgreSQL extension for storing point cloud (LIDAR) data.
pg_bigm 1.2.0 full text search capability using 2-gram (bigram) indexes (pg 16 not supported)
pg_tle 1.4.0 Trusted Language Extensions for PostgreSQL
roaringbitmap 0.5 Support for Roaring Bitmaps
zhparser 2.2 Parser for full-text search of Chinese
pgjwt 0.2.0 JSON Web Token API for PostgreSQL
pg_graphql 1.5.4 GraphQL support to your PostgreSQL database.
pg_jsonschema 0.3.1 PostgreSQL extension providing JSON Schema validation
vault 0.2.9 Extension for storing encrypted secrets in the Vault
hydra 1.1.2 Hydra is an open-source, column-oriented Postgres extension
wrappers 0.3.1 Postgres Foreign Data Wrappers Collections by Supabase
duckdb_fdw 1.1 DuckDB Foreign Data Wrapper, build against libduckdb 0.10.2
pg_search 0.7.0 Full text search over SQL tables using the BM25 algorithm
pg_lakehouse 0.7.0 An analytical query engine over object stores like S3 and table formats like Delta Lake
pg_analytics 0.6.1 Accelerates analytical query processing inside Postgres
pgmq 1.5.2 A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
pg_tier 0.0.3 Postgres Extension written in Rust, to enable data tiering to AWS S3
pg_vectorize 0.15.0 The simplest way to orchestrate vector search on Postgres
pg_later 0.1.0 Execute SQL now and get the results later.
pg_idkit 0.2.3 Generating many popular types of identifiers
plprql 0.1.0 Use PRQL in PostgreSQL
pgsmcrypto 0.1.0 PostgreSQL SM Algorithm Extension
pg_tiktoken 0.0.1 OpenAI tiktoken tokenizer for postgres
pgdd 0.5.2 Access Data Dictionary metadata with pure SQL
parquet_s3_fdw 1.1.0 ParquetS3 Foreign Data Wrapper for PostgreSQL
plv8 3.2.2 V8 Engine Javascript Procedural Language add-on for PostgreSQL
md5hash 1.0.1 Custom data type for storing MD5 hashes rather than text
pg_tde 1.0-alpha Experimental encrypted access method for PostgreSQL
pg_dirtyread 2.6 Read dead but unvacuumed tuples from a PostgreSQL relation
pg_sparse 0.6.1 pg_sparse: Sparse vector data type and sparse HNSW access methods (deprecated)
imgsmlr 1.0.0 ImgSmlr method is based on Haar wavelet transform (pg 16 not supported)
pg_similarity 1.0.0 set of functions and operators for executing similarity queries (covered by pgvector)
pgsql-http 1.6 HTTP client for PostgreSQL, allows web page retrieval inside the database.
pgsql-gzip 1.0 Gzip and unzip with SQL
pg_net 0.9.1 A PostgreSQL extension that enables asynchronous (non-blocking) HTTP/HTTPS requests with SQL

Caveat: Extensions marked with ❋ are no longer supported for various reasons.
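
Once an extension package is installed on the node (e.g. via the pg_extensions parameter of the PGSQL module), it still has to be enabled per database. A minimal, hedged example, assuming the pgvector and postgis packages from the tables above are already installed and you are connected to the target database as an admin user:

CREATE EXTENSION IF NOT EXISTS vector;      -- enable pgvector in the current database
CREATE EXTENSION IF NOT EXISTS postgis;     -- enable PostGIS in the current database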


4.4 - File Hierarchy

How files are organized in Pigsty, and the directory structure used by modules

Pigsty FHS

#------------------------------------------------------------------------------
# pigsty
#  ^-----@app                    # extra demo application resources
#  ^-----@bin                    # bin scripts
#  ^-----@docs                   # document (can be docsified)
#  ^-----@files                  # ansible file resources 
#            ^-----@pigsty       # pigsty config template files
#            ^-----@prometheus   # prometheus rules definition
#            ^-----@grafana      # grafana dashboards
#            ^-----@postgres     # /pg/bin/ scripts
#            ^-----@migration    # pgsql migration task definition
#            ^-----@pki          # self-signed CA & certs

#  ^-----@roles                  # ansible business logic
#  ^-----@templates              # ansible templates
#  ^-----@vagrant                # vagrant local VM template
#  ^-----@terraform              # terraform cloud VM template
#  ^-----configure               # configure wizard script
#  ^-----ansible.cfg             # default ansible config file
#  ^-----pigsty.yml              # default config file
#  ^-----*.yml                   # ansible playbooks

#------------------------------------------------------------------------------
# /etc/pigsty/
#  ^-----@targets                # file based service discovery targets definition
#  ^-----@dashboards             # static grafana dashboards
#  ^-----@datasources            # static grafana datasource
#  ^-----@playbooks              # extra ansible playbooks
#------------------------------------------------------------------------------

CA FHS

Pigsty’s self-signed CA is located in the files/pki/ directory under the Pigsty home directory.

YOU HAVE TO SECURE THE CA KEY PROPERLY: files/pki/ca/ca.key, which is generated by the ca role during install.yml or infra.yml.

# pigsty/files/pki
#  ^-----@ca                      # self-signed CA key & cert
#         ^[email protected]           # VERY IMPORTANT: keep it secret
#         ^[email protected]           # VERY IMPORTANT: trusted everywhere
#  ^-----@csr                     # signing request csr
#  ^-----@misc                    # misc certs, issued certs
#  ^-----@etcd                    # etcd server certs
#  ^-----@minio                   # minio server certs
#  ^-----@nginx                   # nginx SSL certs
#  ^-----@infra                   # infra client certs
#  ^-----@pgsql                   # pgsql server certs
#  ^-----@mongo                   # mongodb/ferretdb server certs
#  ^-----@mysql                   # mysql server certs
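
To sanity-check the generated CA certificate, you can inspect it with openssl. This is an illustrative example using standard openssl flags, assuming the default files/pki layout shown above:

openssl x509 -in files/pki/ca/ca.crt -noout -subject -issuer -dates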

The managed nodes will have the following files installed:

/etc/pki/ca.crt                             # all nodes
/etc/pki/ca-trust/source/anchors/ca.crt     # soft link and trusted anchor

All infra nodes will have the following certs:

/etc/pki/infra.crt                          # infra nodes cert
/etc/pki/infra.key                          # infra nodes key

To guard against admin node failure, keep files/pki and pigsty.yml safe. You can rsync them to another node to maintain a backup admin node.

# run on meta-1, rsync to meta-2
cd ~/pigsty;
rsync -avz ./ meta-2:~/pigsty  

NODE FHS

Node main data dir is specified by node_data parameter, which is /data by default.

The data dir is owned by root with mode 0777. All modules’ local data will be stored under this directory by default.

/data
#  ^-----@postgres                   # postgres main data dir
#  ^-----@backups                    # postgres backup data dir (if no dedicated backup disk)
#  ^-----@redis                      # redis data dir (shared by multiple redis instances)
#  ^-----@minio                      # minio data dir (default when in single node single disk mode)
#  ^-----@etcd                       # etcd main data dir
#  ^-----@prometheus                 # prometheus time series data dir
#  ^-----@loki                       # Loki data dir for logs
#  ^-----@docker                     # Docker data dir
#  ^-----@...                        # other modules

Prometheus FHS

The Prometheus utility scripts (bin) and rules are located in the files/prometheus/ directory under the Pigsty home directory.

The main config file is templated from roles/infra/templates/prometheus/prometheus.yml.j2 and rendered to /etc/prometheus/prometheus.yml on infra nodes.

# /etc/prometheus/
#  ^-----prometheus.yml              # prometheus main config file
#  ^-----@bin                        # util scripts: check,reload,status,new
#  ^-----@rules                      # record & alerting rules definition
#            ^-----agent.yml         # agent rules & alert
#            ^-----infra.yml         # infra rules & alert
#            ^-----node.yml          # node  rules & alert
#            ^-----pgsql.yml         # pgsql rules & alert
#            ^-----redis.yml         # redis rules & alert
#            ^-----minio.yml         # minio rules & alert
#            ^-----etcd.yml          # etcd  rules & alert
#            ^-----mongo.yml         # mongo rules & alert
#            ^-----mysql.yml         # mysql rules & alert (placeholder)
#  ^-----@targets                    # file based service discovery targets definition
#            ^-----@infra            # infra static targets definition
#            ^-----@node             # nodes static targets definition
#            ^-----@etcd             # etcd static targets definition
#            ^-----@minio            # minio static targets definition
#            ^-----@ping             # blackbox ping targets definition
#            ^-----@pgsql            # pgsql static targets definition
#            ^-----@pgrds            # pgsql remote rds static targets
#            ^-----@redis            # redis static targets definition
#            ^-----@mongo            # mongo static targets definition
#            ^-----@mysql            # mysql static targets definition
#            ^-----@ping             # ping  static target definition
#            ^-----@patroni          # patroni static target definition (when ssl enabled)
#            ^-----@.....            # other targets
# /etc/alertmanager.yml              # alertmanager main config file
# /etc/blackbox.yml                  # blackbox exporter main config file
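
After adding targets or editing rules, you may want to validate and reload the configuration. A hedged sketch using the standard promtool utility and systemd (the check / reload helper scripts under /etc/prometheus/bin listed above serve a similar purpose):

promtool check config /etc/prometheus/prometheus.yml     # validate main config and referenced rule files
promtool check rules /etc/prometheus/rules/pgsql.yml     # validate a single rules file
systemctl reload prometheus                              # reload prometheus (or send SIGHUP) to apply changes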

Postgres FHS

The following parameters are related to the PostgreSQL database dir:

  • pg_dbsu_home: Postgres default user’s home dir, default is /var/lib/pgsql.
  • pg_bin_dir: Postgres binary dir, defaults to /usr/pgsql/bin/.
  • pg_data: Postgres database dir, default is /pg/data.
  • pg_fs_main: Postgres main data disk mount point, default is /data.
  • pg_fs_bkup: Postgres backup disk mount point, default is /data/backups (used when using local backup repo).
#--------------------------------------------------------------#
# Create Directory
#--------------------------------------------------------------#
# assumption:
#   {{ pg_fs_main }} for main data   , default: `/data`              [fast ssd]
#   {{ pg_fs_bkup }} for backup data , default: `/data/backups`     [cheap hdd]
#--------------------------------------------------------------#
# default variable:
#     pg_fs_main = /data             fast ssd
#     pg_fs_bkup = /data/backups     cheap hdd (optional)
#
#     /pg      -> /data/postgres/pg-test-15    (soft link)
#     /pg/data -> /data/postgres/pg-test-15/data
#--------------------------------------------------------------#
- name: create postgresql directories
  tags: pg_dir
  become: yes
  block:

    - name: make main and backup data dir
      file: path={{ item }} state=directory owner=root mode=0777
      with_items:
        - "{{ pg_fs_main }}"
        - "{{ pg_fs_bkup }}"

    # pg_cluster_dir:    "{{ pg_fs_main }}/postgres/{{ pg_cluster }}-{{ pg_version }}"
    - name: create postgres directories
      file: path={{ item }} state=directory owner={{ pg_dbsu }} group=postgres mode=0700
      with_items:
        - "{{ pg_fs_main }}/postgres"
        - "{{ pg_cluster_dir }}"
        - "{{ pg_cluster_dir }}/bin"
        - "{{ pg_cluster_dir }}/log"
        - "{{ pg_cluster_dir }}/tmp"
        - "{{ pg_cluster_dir }}/cert"
        - "{{ pg_cluster_dir }}/conf"
        - "{{ pg_cluster_dir }}/data"
        - "{{ pg_cluster_dir }}/meta"
        - "{{ pg_cluster_dir }}/stat"
        - "{{ pg_cluster_dir }}/change"
        - "{{ pg_backup_dir }}/backup"

Data FHS

# real dirs
{{ pg_fs_main }}     /data                      # top level data directory, usually a SSD mountpoint
{{ pg_dir_main }}    /data/postgres             # contains postgres data
{{ pg_cluster_dir }} /data/postgres/pg-test-15  # contains cluster `pg-test` data (of version 15)
                     /data/postgres/pg-test-15/bin            # bin scripts
                     /data/postgres/pg-test-15/log            # logs: postgres/pgbouncer/patroni/pgbackrest
                     /data/postgres/pg-test-15/tmp            # tmp, sql files, rendered results
                     /data/postgres/pg-test-15/cert           # postgres server certificates
                     /data/postgres/pg-test-15/conf           # patroni config, links to related config
                     /data/postgres/pg-test-15/data           # main data directory
                     /data/postgres/pg-test-15/meta           # identity information
                     /data/postgres/pg-test-15/stat           # stats information, summary, log report
                     /data/postgres/pg-test-15/change         # changing records
                     /data/postgres/pg-test-15/backup         # soft link to backup dir

{{ pg_fs_bkup }}     /data/backups                            # could be a cheap & large HDD mountpoint
                     /data/backups/postgres/pg-test-15/backup # local backup repo path

# soft links
/pg             ->   /data/postgres/pg-test-15                # pg root link
/pg/data        ->   /data/postgres/pg-test-15/data           # real data dir
/pg/backup      ->   /var/backups/postgres/pg-test-15/backup  # base backup

Binary FHS

On EL releases, the default path for PostgreSQL binaries is:

/usr/pgsql-${pg_version}/

Pigsty will create a softlink /usr/pgsql to the currently installed version specified by pg_version.

/usr/pgsql -> /usr/pgsql-15

Therefore, the default pg_bin_dir will be /usr/pgsql/bin/, and this path is added to the PATH environment via /etc/profile.d/pgsql.sh.

export PATH="/usr/pgsql/bin:/pg/bin:$PATH"
export PGHOME=/usr/pgsql
export PGDATA=/pg/data

For Ubuntu / Debian, the default path for PostgreSQL binaries is:

/usr/lib/postgresql/${pg_version}/bin
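
A quick way to verify the binary location on Debian / Ubuntu, as an illustrative example assuming pg_version=16:

/usr/lib/postgresql/16/bin/postgres --version    # print the installed server version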

Pgbouncer FHS

Pgbouncer runs as the postgres dbsu, and its config files are located in /etc/pgbouncer, including:

  • pgbouncer.ini: pgbouncer main config
  • database.txt: pgbouncer database list
  • userlist.txt: pgbouncer user list
  • useropts.txt: pgbouncer user options (user-level parameter overrides)
  • pgb_hba.conf: lists the access privileges of the connection pool users
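
Business traffic reaches Postgres through this local pgbouncer on port 6432 by default. A hedged connection example, assuming the default pg-meta cluster with the dbuser_meta user and meta database described in the PGSQL module:

psql -h 10.10.10.10 -p 6432 -U dbuser_meta -d meta    # connect to the meta database via pgbouncer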

Redis FHS

Pigsty provides essential support for Redis deployment and monitoring.

Redis binaries are installed into /bin/ via RPM packages or copied binaries, including:

redis-server    
redis-cli       
redis-sentinel  
redis-check-rdb 
redis-check-aof 
redis-benchmark 
/usr/libexec/redis-shutdown

For a Redis instance named redis-test-1-6379, the resources associated with it are shown below:

/usr/lib/systemd/system/redis-test-1-6379.service               # Services ('/lib/systemd' in debian)
/etc/redis/redis-test-1-6379.conf                               # Config 
/data/redis/redis-test-1-6379                                   # Database Catalog
/data/redis/redis-test-1-6379/redis-test-1-6379.rdb             # RDB File
/data/redis/redis-test-1-6379/redis-test-1-6379.aof             # AOF file
/var/log/redis/redis-test-1-6379.log                            # Log
/var/run/redis/redis-test-1-6379.pid                            # PID

For Ubuntu / Debian, the default systemd service dir is /lib/systemd/system/ instead of /usr/lib/systemd/system/.

4.5 - Comparison

Comparing products such as RDS and projects that have feature overlap with Pigsty

Comparing to RDS

Pigsty is an AGPLv3-licensed, local-first RDS alternative that can be deployed on your own physical machines/virtual machines, or on cloud servers.

Therefore, we chose the global market leader, AWS RDS for PostgreSQL, and China’s market leader, Alibaba Cloud RDS for PostgreSQL, as benchmarks.

Both Alibaba Cloud RDS and AWS RDS are proprietary cloud database services, offered only on the public cloud through a leasing model. The following comparison is based on the latest PostgreSQL 16 main branch version, with the comparison cut-off date being February 2024.


Features

Item Pigsty Aliyun RDS AWS RDS
Major Version 12 - 16 12 - 16 12 - 16
Read on Standby Of course Not Readable Not Readable
Separate R & W By Port Paid Proxy Paid Proxy
Offline Instance Yes Not Available Not Available
Standby Cluster Yes Multi-AZ Multi-AZ
Delayed Instance Yes Not Available Not Available
Load Balancer HAProxy / LVS Paid ELB Paid ELB
Connection Pooling Pgbouncer Paid Proxy Paid RDS Proxy
High Availability Patroni / etcd HA Version Only HA Version Only
Point-in-Time Recovery pgBackRest / MinIO Yes Yes
Monitoring Metrics Prometheus / Exporter About 9 Metrics About 99 Metrics
Logging Collector Loki / Promtail Yes Yes
Dashboards Grafana / Echarts Basic Support Basic Support
Alerts AlertManager Basic Support Basic Support

Extensions

Here are some important extensions in the PostgreSQL ecosystem. The comparison is based on PostgreSQL 16 and was completed on 2024-02-29:

Category Pigsty Aliyun RDS PG AWS RDS PG
Add Extension Free to Install Not Allowed Not Allowed
Geo Spatial PostGIS 3.4.2 PostGIS 3.3.4 PostGIS 3.4.1
Time Series TimescaleDB 2.14.2
Distributive Citus 12.1
AI / ML PostgresML 2.8.1
Columnar Hydra 1.1.1
Vector PGVector 0.6 pase 0.0.1 PGVector 0.6
Sparse Vector PG Sparse 0.5.6
Full-Text Search pg_bm25 0.5.6
Graph Apache AGE 1.5.0
GraphQL PG GraphQL 1.5.0
Message Queue pgq 3.5.0
OLAP pg_analytics 0.5.6
DuckDB duckdb_fdw 1.1
CDC wal2json 2.5.3 wal2json 2.5
Bloat Control pg_repack 1.5.0 pg_repack 1.4.8 pg_repack 1.5.0
Point Cloud PG PointCloud 1.2.5 Ganos PointCloud 6.1
AWS RDS Extensions

Extensions available on AWS RDS for PostgreSQL 16 (excluding built-in PostgreSQL contrib extensions):

name pg16 pg15 pg14 pg13 pg12 pg11 pg10
amcheck 1.3 1.3 1.3 1.2 1.2 yes 1
auto_explain yes yes yes yes yes yes yes
autoinc 1 1 1 1 null null null
bloom 1 1 1 1 1 1 1
bool_plperl 1 1 1 1 null null null
btree_gin 1.3 1.3 1.3 1.3 1.3 1.3 1.2
btree_gist 1.7 1.7 1.6 1.5 1.5 1.5 1.5
citext 1.6 1.6 1.6 1.6 1.6 1.5 1.4
cube 1.5 1.5 1.5 1.4 1.4 1.4 1.2
dblink 1.2 1.2 1.2 1.2 1.2 1.2 1.2
dict_int 1 1 1 1 1 1 1
dict_xsyn 1 1 1 1 1 1 1
earthdistance 1.1 1.1 1.1 1.1 1.1 1.1 1.1
fuzzystrmatch 1.2 1.1 1.1 1.1 1.1 1.1 1.1
hstore 1.8 1.8 1.8 1.7 1.6 1.5 1.4
hstore_plperl 1 1 1 1 1 1 1
insert_username 1 1 1 1 null null null
intagg 1.1 1.1 1.1 1.1 1.1 1.1 1.1
intarray 1.5 1.5 1.5 1.3 1.2 1.2 1.2
isn 1.2 1.2 1.2 1.2 1.2 1.2 1.1
jsonb_plperl 1 1 1 1 1 null null
lo 1.1 1.1 1.1 1.1 1.1 1.1 1.1
ltree 1.2 1.2 1.2 1.2 1.1 1.1 1.1
moddatetime 1 1 1 1 null null null
old_snapshot 1 1 1 null null null null
pageinspect 1.12 1.11 1.9 1.8 1.7 1.7 1.6
pg_buffercache 1.4 1.3 1.3 1.3 1.3 1.3 1.3
pg_freespacemap 1.2 1.2 1.2 1.2 1.2 1.2 1.2
pg_prewarm 1.2 1.2 1.2 1.2 1.2 1.2 1.1
pg_stat_statements 1.10 1.10 1.9 1.8 1.7 1.6 1.6
pg_trgm 1.6 1.6 1.6 1.5 1.4 1.4 1.3
pg_visibility 1.2 1.2 1.2 1.2 1.2 1.2 1.2
pg_walinspect 1.1 1 null null null null null
pgcrypto 1.3 1.3 1.3 1.3 1.3 1.3 1.3
pgrowlocks 1.2 1.2 1.2 1.2 1.2 1.2 1.2
pgstattuple 1.5 1.5 1.5 1.5 1.5 1.5 1.5
plperl 1 1 1 1 1 1 1
plpgsql 1 1 1 1 1 1 1
pltcl 1 1 1 1 1 1 1
postgres_fdw 1.1 1.1 1.1 1 1 1 1
refint 1 1 1 1 null null null
seg 1.4 1.4 1.4 1.3 1.3 1.3 1.1
sslinfo 1.2 1.2 1.2 1.2 1.2 1.2 1.2
tablefunc 1 1 1 1 1 1 1
tcn 1 1 1 1 1 1 1
tsm_system_rows 1 1 1 1 1 1 1.1
tsm_system_time 1 1 1 1 1 1 1.1
unaccent 1.1 1.1 1.1 1.1 1.1 1.1 1.1
uuid-ossp 1.1 1.1 1.1 1.1 1.1 1.1 1.1
Aliyun Extensions

Extensions available on Alibaba Cloud RDS for PostgreSQL 16 (excluding built-in PostgreSQL contrib extensions):

name pg16 pg15 pg14 pg13 pg12 pg11 pg10
bloom 1 1 1 1 1 1 1
btree_gin 1.3 1.3 1.3 1.3 1.3 1.3 1.2
btree_gist 1.7 1.7 1.6 1.5 1.5 1.5 1.5
citext 1.6 1.6 1.6 1.6 1.6 1.5 1.4
cube 1.5 1.5 1.5 1.4 1.4 1.4 1.2
dblink 1.2 1.2 1.2 1.2 1.2 1.2 1.2
dict_int 1 1 1 1 1 1 1
earthdistance 1.1 1.1 1.1 1.1 1.1 1.1 1.1
fuzzystrmatch 1.2 1.1 1.1 1.1 1.1 1.1 1.1
hstore 1.8 1.8 1.8 1.7 1.6 1.5 1.4
intagg 1.1 1.1 1.1 1.1 1.1 1.1 1.1
intarray 1.5 1.5 1.5 1.3 1.2 1.2 1.2
isn 1.2 1.2 1.2 1.2 1.2 1.2 1.1
ltree 1.2 1.2 1.2 1.2 1.1 1.1 1.1
pg_buffercache 1.4 1.3 1.3 1.3 1.3 1.3 1.3
pg_freespacemap 1.2 1.2 1.2 1.2 1.2 1.2 1.2
pg_prewarm 1.2 1.2 1.2 1.2 1.2 1.2 1.1
pg_stat_statements 1.10 1.10 1.9 1.8 1.7 1.6 1.6
pg_trgm 1.6 1.6 1.6 1.5 1.4 1.4 1.3
pgcrypto 1.3 1.3 1.3 1.3 1.3 1.3 1.3
pgrowlocks 1.2 1.2 1.2 1.2 1.2 1.2 1.2
pgstattuple 1.5 1.5 1.5 1.5 1.5 1.5 1.5
plperl 1 1 1 1 1 1 1
plpgsql 1 1 1 1 1 1 1
pltcl 1 1 1 1 1 1 1
postgres_fdw 1.1 1.1 1.1 1 1 1 1
sslinfo 1.2 1.2 1.2 1.2 1.2 1.2 1.2
tablefunc 1 1 1 1 1 1 1
tsm_system_rows 1 1 1 1 1 1 1
tsm_system_time 1 1 1 1 1 1 1
unaccent 1.1 1.1 1.1 1.1 1.1 1.1 1.1
uuid-ossp 1.1 1.1 1.1 1.1 1.1 1.1 1.1
xml2 1.1 1.1 1.1 1.1 1.1 1.1 1.1

Performance

Metric Pigsty Aliyun RDS AWS RDS
Best Performance PGTPC on NVME SSD evaluation sysbench oltp_rw RDS PG Performance Whitepaper sysbench oltp scenario per-core QPS 4000 ~ 8000
Storage Specs: Maximum Capacity 32TB / NVME SSD 32 TB / ESSD PL3 64 TB / io2 EBS Block Express
Storage Specs: Maximum IOPS 4K random read: up to 3M, random write 2000~350K 4K random read: up to 1M 16K random IOPS: 256K
Storage Specs: Maximum Latency 4K random read: 75µs, random write 15µs 4K random read: 200µs 500µs / inferred for 16K random IO
Storage Specs: Maximum Reliability UBER < 1e-18, equivalent to 18 nines MTBF: 2 million hours 5DWPD, for three years Reliability 9 nines, equivalent to UBER 1e-9 Storage and Data Reliability Durability: 99.999%, five nines (0.001% annual failure rate) io2 details
Storage Specs: Maximum Cost 31.5 ¥/TB·month ( 5-year warranty amortized / 3.2T / enterprise-grade / MLC ) 3200¥/TB·month (List price 6400¥, monthly package 4000¥) 3-year prepay total 50% off for this price 1900 ¥/TB·month for using the maximum specs 65536GB / 256K IOPS maximum discount

Observability

Pigsty offers nearly 3000 monitoring metrics and over 50 monitoring dashboards, covering database, host, connection pool, and load balancer monitoring, giving users an unparalleled observability experience.

Pigsty offers 638 PostgreSQL-related monitoring metrics, while AWS RDS only has 99, and Aliyun RDS has merely single-digit metrics:

Additionally, some other projects offer PostgreSQL monitoring capabilities, but they are relatively basic and simplistic.


Maintainability

Metric Pigsty Aliyun RDS AWS RDS
System Usability Simple Simple Simple
Configuration Management Configuration file / CMDB based on Ansible Inventory Can use Terraform Can use Terraform
Change Method Idempotent playbooks based on Ansible Playbook Operations via console Operations via console
Parameter Tuning Automatically adapts based on node with four preset templates: OLTP, OLAP, TINY, CRIT
Infra as Code Native support Can use Terraform Can use Terraform
Customizable Parameters Pigsty Parameters 283 items
Service and Support Commercial subscription support available After-sales ticket support provided After-sales ticket support provided
No Internet Deployment Possible offline installation and deployment N/A N/A
Database Migration playbooks for zero-downtime migration from existing Postgres into Pigsty Provides cloud migration assistance Aliyun RDS Data Synchronization

Cost

Experience shows that the per-unit cost of hardware and software resources for RDS is 5 to 15 times that of self-built solutions, and one month of RDS rent is typically enough to buy the equivalent hardware outright. For more details, please refer to Cost Analysis.

Factor Metric Pigsty Aliyun RDS AWS RDS
Cost Software License/Service Fees Free, hardware about 20 - 40 ¥/core·month 200 ~ 400 ¥/core·month 400 ~ 1300 ¥/core·month
Service Support Fees Service about 100 ¥/ core·month Included in RDS costs

Other Vendors


Kubernetes Operators

Pigsty refuses to run databases inside Kubernetes, but if you wish to do so, there are other options:

  • PGO
  • StackGres
  • CloudNativePG
  • TemboOperator
  • PostgresOperator
  • PerconaOperator
  • Kubegres
  • KubeDB
  • KubeBlocks


4.6 - Cost Analysis

RDS / DBA Cost reference to help you evaluate the costs of self-hosting databases

Cost Reference

EC2 vCPU-Month RDS vCPU-Month
DHH’s self-hosted core-month price (192C 384G) 25.32 Junior open-source DBA reference salary 15K/person-month
IDC self-hosted data center (exclusive physical machine: 64C384G) 19.53 Intermediate open-source DBA reference salary 30K/person-month
IDC self-hosted data center (container, oversold 500%) 7 Senior open-source DBA reference salary 60K/person-month
UCloud Elastic Virtual Machine (8C16G, oversold) 25 ORACLE database license 10000
Alibaba Cloud Elastic Server 2x memory (exclusive without overselling) 107 Alibaba Cloud RDS PG 2x memory (exclusive) 260
Alibaba Cloud Elastic Server 4x memory (exclusive without overselling) 138 Alibaba Cloud RDS PG 4x memory (exclusive) 320
Alibaba Cloud Elastic Server 8x memory (exclusive without overselling) 180 Alibaba Cloud RDS PG 8x memory (exclusive) 410
AWS C5D.METAL 96C 200G (monthly without upfront) 100 AWS RDS PostgreSQL db.T2 (2x) 440

For instance, using RDS for PostgreSQL on AWS, the price for a 64C / 256GB db.m5.16xlarge RDS instance is $25,817 per month, roughly 180,000 yuan per month. One month's rent is enough to buy two servers with even better performance and run them yourself; the rent-to-buy break-even point is less than a month, as just over ten days of rent covers the cost of the whole server.

Payment Model Price Cost Per Year (¥10k)
Self-hosted IDC (Single Physical Server) ¥75k / 5 years 1.5
Self-hosted IDC (2-3 Server HA Cluster) ¥150k / 5 years 3.0 ~ 4.5
Alibaba Cloud RDS (On-demand) ¥87.36/hour 76.5
Alibaba Cloud RDS (Monthly) ¥42k / month 50
Alibaba Cloud RDS (Yearly, 15% off) ¥425,095 / year 42.5
Alibaba Cloud RDS (3-year, 50% off) ¥750,168 / 3 years 25
AWS (On-demand) $25,817 / month 217
AWS (1-year, no upfront) $22,827 / month 191.7
AWS (3-year, full upfront) $120k + $17.5k/month 175
AWS China/Ningxia (On-demand) ¥197,489 / month 237
AWS China/Ningxia (1-year, no upfront) ¥143,176 / month 171
AWS China/Ningxia (3-year, full upfront) ¥647k + ¥116k/month 160.6

Comparing the costs of self-hosting versus using a cloud database:

Method Cost Per Year (¥10k)
Self-hosted Servers 64C / 384G / 3.2TB NVME SSD 660K IOPS (2-3 servers) 3.0 ~ 4.5
Alibaba Cloud RDS PG High-Availability pg.x4m.8xlarge.2c, 64C / 256GB / 3.2TB ESSD PL3 25 ~ 50
AWS RDS PG High-Availability db.m5.16xlarge, 64C / 256GB / 3.2TB io1 x 80k IOPS 160 ~ 217

Cloud Exit Column

4.7 - Glossary

Technical terms used in the documentation, along with their definitions and explanations.

5 - Module: PGSQL

The most advanced open-source relational database in the world with HA, PITR, IaC and more!

The most advanced open-source relational database in the world!

With battery-included observability, reliability, and maintainability powered by Pigsty

Concept

Overview of PostgreSQL in Pigsty


Configuration

Describe the cluster you want

  • Identity: Parameters used for describing a PostgreSQL cluster
  • Primary: Define a single instance cluster
  • Replica: Define a basic HA cluster with one primary & one replica
  • Offline: Define a dedicated instance for OLAP/ETL/Interactive queries.
  • Sync Standby: Enable synchronous commit to ensure no data loss
  • Quorum Commit: Use quorum sync commit for an even higher consistency level
  • Standby Cluster: Clone an existing cluster and follow it
  • Delayed Cluster: Clone an existing cluster for emergency data recovery
  • Citus Cluster: Define a Citus distributed database cluster
  • Major Version: Define a PostgreSQL cluster with specific major version
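
For example, a basic HA cluster with one primary and one replica can be declared in the inventory as follows. This is an illustrative sketch assuming two nodes at 10.10.10.11 / 10.10.10.12 and the cluster name pg-test:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
  vars:
    pg_cluster: pg-test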

Administration

Admin your existing clusters


Playbook

Materialize the cluster with idempotent playbooks

  • pgsql.yml : Init HA PostgreSQL clusters or add new replicas.
  • pgsql-rm.yml : Remove PostgreSQL cluster, or remove replicas
  • pgsql-user.yml : Add new business user to existing PostgreSQL cluster
  • pgsql-db.yml : Add new business database to existing PostgreSQL cluster
  • pgsql-monitor.yml : Monitor remote PostgreSQL instance with local exporters
  • pgsql-migration.yml : Generate Migration manual & scripts for existing PostgreSQL
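
A hedged sketch of typical invocations, assuming the pg-test cluster declared in the inventory and standard ansible-playbook --limit semantics:

./pgsql.yml -l pg-test        # init the HA postgres cluster pg-test
./pgsql.yml -l 10.10.10.13    # add a new replica after appending its host entry to the inventory
./pgsql-rm.yml -l pg-test     # remove the whole pg-test cluster
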
Example: Install PGSQL module

asciicast

Example: Remove PGSQL module

asciicast


Dashboard

There are 26 default Grafana dashboards for PostgreSQL, categorized into 4 levels. Check Dashboards for details.

Overview Cluster Instance Database
PGSQL Overview PGSQL Cluster PGSQL Instance PGSQL Database
PGSQL Alert PGRDS Cluster PGRDS Instance PGCAT Database
PGSQL Shard PGSQL Activity PGCAT Instance PGSQL Tables
PGSQL Replication PGSQL Persist PGSQL Table
PGSQL Service PGSQL Proxy PGCAT Table
PGSQL Databases PGSQL Pgbouncer PGSQL Query
PGSQL Patroni PGSQL Session PGCAT Query
PGSQL Xacts PGCAT Locks
PGSQL Exporter PGCAT Schema

Parameter

API Reference for PGSQL module:

  • PG_ID : Calculate & Check Postgres Identity
  • PG_BUSINESS : Postgres Business Object Definition
  • PG_INSTALL : Install PGSQL Packages & Extensions
  • PG_BOOTSTRAP : Init a HA Postgres Cluster with Patroni
  • PG_PROVISION : Create users, databases, and in-database objects
  • PG_BACKUP : Setup backup repo with pgbackrest
  • PG_SERVICE : Exposing pg service, bind vip and register DNS
  • PG_EXPORTER : Add Monitor for PGSQL Instance
Parameters
Parameter Section Type Level Comment
pg_mode PG_ID enum C pgsql cluster mode: pgsql,citus,gpsql
pg_cluster PG_ID string C pgsql cluster name, REQUIRED identity parameter
pg_seq PG_ID int I pgsql instance seq number, REQUIRED identity parameter
pg_role PG_ID enum I pgsql role, REQUIRED, could be primary,replica,offline
pg_instances PG_ID dict I define multiple pg instances on node in {port:ins_vars} format
pg_upstream PG_ID ip I repl upstream ip addr for standby cluster or cascade replica
pg_shard PG_ID string C pgsql shard name, optional identity for sharding clusters
pg_group PG_ID int C pgsql shard index number, optional identity for sharding clusters
gp_role PG_ID enum C greenplum role of this cluster, could be master or segment
pg_exporters PG_ID dict C additional pg_exporters to monitor remote postgres instances
pg_offline_query PG_ID bool I set to true to enable offline query on this instance
pg_users PG_BUSINESS user[] C postgres business users
pg_databases PG_BUSINESS database[] C postgres business databases
pg_services PG_BUSINESS service[] C postgres business services
pg_hba_rules PG_BUSINESS hba[] C business hba rules for postgres
pgb_hba_rules PG_BUSINESS hba[] C business hba rules for pgbouncer
pg_replication_username PG_BUSINESS username G postgres replication username, replicator by default
pg_replication_password PG_BUSINESS password G postgres replication password, DBUser.Replicator by default
pg_admin_username PG_BUSINESS username G postgres admin username, dbuser_dba by default
pg_admin_password PG_BUSINESS password G postgres admin password in plain text, DBUser.DBA by default
pg_monitor_username PG_BUSINESS username G postgres monitor username, dbuser_monitor by default
pg_monitor_password PG_BUSINESS password G postgres monitor password, DBUser.Monitor by default
pg_dbsu_password PG_BUSINESS password G/C dbsu password, empty string means no dbsu password by default
pg_dbsu PG_INSTALL username C os dbsu name, postgres by default, better not change it
pg_dbsu_uid PG_INSTALL int C os dbsu uid and gid, 26 for default postgres users and groups
pg_dbsu_sudo PG_INSTALL enum C dbsu sudo privilege, none,limit,all,nopass. limit by default
pg_dbsu_home PG_INSTALL path C postgresql home directory, /var/lib/pgsql by default
pg_dbsu_ssh_exchange PG_INSTALL bool C exchange postgres dbsu ssh key among same pgsql cluster
pg_version PG_INSTALL enum C postgres major version to be installed, 16 by default
pg_bin_dir PG_INSTALL path C postgres binary dir, /usr/pgsql/bin by default
pg_log_dir PG_INSTALL path C postgres log dir, /pg/log/postgres by default
pg_packages PG_INSTALL string[] C pg packages to be installed, ${pg_version} will be replaced
pg_extensions PG_INSTALL string[] C pg extensions to be installed, ${pg_version} will be replaced
pg_safeguard PG_BOOTSTRAP bool G/C/A prevent purging running postgres instance? false by default
pg_clean PG_BOOTSTRAP bool G/C/A purging existing postgres during pgsql init? true by default
pg_data PG_BOOTSTRAP path C postgres data directory, /pg/data by default
pg_fs_main PG_BOOTSTRAP path C mountpoint/path for postgres main data, /data by default
pg_fs_bkup PG_BOOTSTRAP path C mountpoint/path for pg backup data, /data/backups by default
pg_storage_type PG_BOOTSTRAP enum C storage type for pg main data, SSD,HDD, SSD by default
pg_dummy_filesize PG_BOOTSTRAP size C size of /pg/dummy, hold 64MB disk space for emergency use
pg_listen PG_BOOTSTRAP ip(s) C/I postgres/pgbouncer listen addresses, comma separated list
pg_port PG_BOOTSTRAP port C postgres listen port, 5432 by default
pg_localhost PG_BOOTSTRAP path C postgres unix socket dir for localhost connection
pg_namespace PG_BOOTSTRAP path C top level key namespace in etcd, used by patroni & vip
patroni_enabled PG_BOOTSTRAP bool C if disabled, no postgres cluster will be created during init
patroni_mode PG_BOOTSTRAP enum C patroni working mode: default,pause,remove
patroni_port PG_BOOTSTRAP port C patroni listen port, 8008 by default
patroni_log_dir PG_BOOTSTRAP path C patroni log dir, /pg/log/patroni by default
patroni_ssl_enabled PG_BOOTSTRAP bool G secure patroni RestAPI communications with SSL?
patroni_watchdog_mode PG_BOOTSTRAP enum C patroni watchdog mode: automatic,required,off. off by default
patroni_username PG_BOOTSTRAP username C patroni restapi username, postgres by default
patroni_password PG_BOOTSTRAP password C patroni restapi password, Patroni.API by default
patroni_citus_db PG_BOOTSTRAP string C citus database managed by patroni, postgres by default
pg_conf PG_BOOTSTRAP enum C config template: oltp,olap,crit,tiny. oltp.yml by default
pg_max_conn PG_BOOTSTRAP int C postgres max connections, auto will use recommended value
pg_shared_buffer_ratio PG_BOOTSTRAP float C postgres shared buffer memory ratio, 0.25 by default, 0.1~0.4
pg_rto PG_BOOTSTRAP int C recovery time objective in seconds, 30s by default
pg_rpo PG_BOOTSTRAP int C recovery point objective in bytes, 1MiB at most by default
pg_libs PG_BOOTSTRAP string C preloaded libraries, timescaledb,pg_stat_statements,auto_explain by default
pg_delay PG_BOOTSTRAP interval I replication apply delay for standby cluster leader
pg_checksum PG_BOOTSTRAP bool C enable data checksum for postgres cluster?
pg_pwd_enc PG_BOOTSTRAP enum C passwords encryption algorithm: md5,scram-sha-256
pg_encoding PG_BOOTSTRAP enum C database cluster encoding, UTF8 by default
pg_locale PG_BOOTSTRAP enum C database cluster locale, C by default
pg_lc_collate PG_BOOTSTRAP enum C database cluster collate, C by default
pg_lc_ctype PG_BOOTSTRAP enum C database character type, en_US.UTF8 by default
pgbouncer_enabled PG_BOOTSTRAP bool C if disabled, pgbouncer will not be launched on pgsql host
pgbouncer_port PG_BOOTSTRAP port C pgbouncer listen port, 6432 by default
pgbouncer_log_dir PG_BOOTSTRAP path C pgbouncer log dir, /pg/log/pgbouncer by default
pgbouncer_auth_query PG_BOOTSTRAP bool C query postgres to retrieve unlisted business users?
pgbouncer_poolmode PG_BOOTSTRAP enum C pooling mode: transaction,session,statement, transaction by default
pgbouncer_sslmode PG_BOOTSTRAP enum C pgbouncer client ssl mode, disable by default
pg_provision PG_PROVISION bool C provision postgres cluster after bootstrap
pg_init PG_PROVISION string G/C provision init script for cluster template, pg-init by default
pg_default_roles PG_PROVISION role[] G/C default roles and users in postgres cluster
pg_default_privileges PG_PROVISION string[] G/C default privileges when created by admin user
pg_default_schemas PG_PROVISION string[] G/C default schemas to be created
pg_default_extensions PG_PROVISION extension[] G/C default extensions to be created
pg_reload PG_PROVISION bool A reload postgres after hba changes
pg_default_hba_rules PG_PROVISION hba[] G/C postgres default host-based authentication rules
pgb_default_hba_rules PG_PROVISION hba[] G/C pgbouncer default host-based authentication rules
pgbackrest_enabled PG_BACKUP bool C enable pgbackrest on pgsql host?
pgbackrest_clean PG_BACKUP bool C remove pg backup data during init?
pgbackrest_log_dir PG_BACKUP path C pgbackrest log dir, /pg/log/pgbackrest by default
pgbackrest_method PG_BACKUP enum C pgbackrest repo method: local,minio,etc…
pgbackrest_repo PG_BACKUP dict G/C pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
pg_weight PG_SERVICE int I relative load balance weight in service, 100 by default, 0-255
pg_service_provider PG_SERVICE enum G/C dedicate haproxy node group name, or empty string for local nodes by default
pg_default_service_dest PG_SERVICE enum G/C default service destination if svc.dest=‘default’
pg_default_services PG_SERVICE service[] G/C postgres default service definitions
pg_vip_enabled PG_SERVICE bool C enable a l2 vip for pgsql primary? false by default
pg_vip_address PG_SERVICE cidr4 C vip address in <ipv4>/<mask> format, require if vip is enabled
pg_vip_interface PG_SERVICE string C/I vip network interface to listen, eth0 by default
pg_dns_suffix PG_SERVICE string C pgsql dns suffix, ’’ by default
pg_dns_target PG_SERVICE enum C auto, primary, vip, none, or ad hoc ip
pg_exporter_enabled PG_EXPORTER bool C enable pg_exporter on pgsql hosts?
pg_exporter_config PG_EXPORTER string C pg_exporter configuration file name
pg_exporter_cache_ttls PG_EXPORTER string C pg_exporter collector ttl stage in seconds, ‘1,10,60,300’ by default
pg_exporter_port PG_EXPORTER port C pg_exporter listen port, 9630 by default
pg_exporter_params PG_EXPORTER string C extra url parameters for pg_exporter dsn
pg_exporter_url PG_EXPORTER pgurl C overwrite auto-generate pg dsn if specified
pg_exporter_auto_discovery PG_EXPORTER bool C enable auto database discovery? enabled by default
pg_exporter_exclude_database PG_EXPORTER string C csv of database that WILL NOT be monitored during auto-discovery
pg_exporter_include_database PG_EXPORTER string C csv of database that WILL BE monitored during auto-discovery
pg_exporter_connect_timeout PG_EXPORTER int C pg_exporter connect timeout in ms, 200 by default
pg_exporter_options PG_EXPORTER arg C overwrite extra options for pg_exporter
pgbouncer_exporter_enabled PG_EXPORTER bool C enable pgbouncer_exporter on pgsql hosts?
pgbouncer_exporter_port PG_EXPORTER port C pgbouncer_exporter listen port, 9631 by default
pgbouncer_exporter_url PG_EXPORTER pgurl C overwrite auto-generate pgbouncer dsn if specified
pgbouncer_exporter_options PG_EXPORTER arg C overwrite extra options for pgbouncer_exporter
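
These parameters are usually overridden at cluster level in the inventory. An illustrative sketch, assuming a single-node cluster tuned with the olap template (parameter names come from the table above; values are examples only):

pg-olap:
  hosts: { 10.10.10.21: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-olap
    pg_version: 16               # PG_INSTALL: postgres major version
    pg_conf: olap.yml            # PG_BOOTSTRAP: config template, oltp/olap/crit/tiny
    pg_shared_buffer_ratio: 0.3  # PG_BOOTSTRAP: shared buffer memory ratio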

Tutorials

  • Fork an existing PostgreSQL cluster.
  • Create a standby cluster of an existing PostgreSQL cluster.
  • Create a delayed cluster of another pgsql cluster?
  • Monitoring an existing postgres instance?
  • Migration from an external PostgreSQL with logical replication?
  • Use MinIO as a central pgBackRest repo.
  • Use dedicate etcd cluster for DCS?
  • Use dedicated haproxy for exposing PostgreSQL service.
  • Deploy a multi-node MinIO cluster?
  • Use CMDB instead of Config as inventory.
  • Use PostgreSQL as grafana backend storage ?
  • Use PostgreSQL as prometheus backend storage ?

5.1 - Architecture

PostgreSQL cluster architecture and implementation details.

Component Overview

Here is how the PostgreSQL module components interact, from top to bottom:

  • Cluster DNS is resolved by DNSMASQ on infra nodes
  • Cluster VIP is managed by vip-manager, which will bind to the cluster primary.
    • vip-manager acquires the cluster leader info written by patroni directly from the etcd cluster
  • Cluster services are exposed by Haproxy on nodes; services are distinguished by node ports (543x).
    • Haproxy port 9101: monitoring metrics & stats & admin page
    • Haproxy port 5433: default service that routes to primary pgbouncer: primary
    • Haproxy port 5434: default service that routes to replica pgbouncer: replica
    • Haproxy port 5436: default service that routes to primary postgres: default
    • Haproxy port 5438: default service that routes to offline postgres: offline
    • HAProxy will route traffic based on health check information provided by patroni.
  • Pgbouncer is a connection pool middleware that buffers connections, exposes extra metrics, and brings extra flexibility @ port 6432
    • Pgbouncer is stateless and deployed with the Postgres server in a 1:1 manner through a local unix socket.
    • Production traffic (Primary/Replica) will go through pgbouncer by default (can be skipped by pg_default_service_dest )
    • Default/Offline service will always bypass pgbouncer and connect to target Postgres directly.
  • Postgres provides relational database services @ port 5432
    • Installing the PGSQL module on multiple nodes will automatically form an HA cluster based on streaming replication
    • PostgreSQL is supervised by patroni by default.
  • Patroni will supervise PostgreSQL server @ port 8008 by default
    • Patroni spawns the postgres server as a child process
    • Patroni uses etcd as DCS: config storage, failure detection, and leader election.
    • Patroni provides Postgres status information through health checks, which are used by HAProxy
    • Patroni metrics will be scraped by prometheus on infra nodes
  • PG Exporter will expose postgres metrics @ port 9630
    • PostgreSQL’s metrics will be scraped by prometheus on infra nodes
  • Pgbouncer Exporter will expose pgbouncer metrics @ port 9631
    • Pgbouncer’s metrics will be scraped by prometheus on infra nodes
  • pgBackRest will work on the local repo by default (pgbackrest_method)
    • If local (default) is used as the backup repo, pgBackRest will create local repo under the primary’s pg_fs_bkup
    • If minio is used as the backup repo, pgBackRest will create the repo on the dedicated MinIO cluster in pgbackrest_repo.minio
  • Postgres-related logs (postgres,pgbouncer,patroni,pgbackrest) are exposed by promtail @ port 9080
    • Promtail will send logs to Loki on infra nodes
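
A hedged example of accessing these services from a client, assuming a cluster whose haproxy is reachable at 10.10.10.10 and the default dbuser_dba admin user:

psql -h 10.10.10.10 -p 5433 -U dbuser_dba -d postgres    # primary service: read-write, via primary pgbouncer
psql -h 10.10.10.10 -p 5434 -U dbuser_dba -d postgres    # replica service: read-only, via replica pgbouncer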

pigsty-arch.jpg

5.2 - Users

Define business users & roles in PostgreSQL, which are the objects created by SQL CREATE USER/ROLE

In this context, the User refers to objects created by SQL CREATE USER/ROLE.


Define User

There are two parameters related to users:

  • pg_users : Define business users & roles at cluster level
  • pg_default_roles : Define system-wide roles & global users at global level

They are both arrays of user/role definitions. You can define multiple users/roles in one cluster.

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_users:
      - {name: dbuser_meta     ,password: DBUser.Meta     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: pigsty admin user }
      - {name: dbuser_view     ,password: DBUser.Viewer   ,pgbouncer: true ,roles: [dbrole_readonly] ,comment: read-only viewer for meta database }
      - {name: dbuser_grafana  ,password: DBUser.Grafana  ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for grafana database    }
      - {name: dbuser_bytebase ,password: DBUser.Bytebase ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for bytebase database   }
      - {name: dbuser_kong     ,password: DBUser.Kong     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for kong api gateway    }
      - {name: dbuser_gitea    ,password: DBUser.Gitea    ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for gitea service       }
      - {name: dbuser_wiki     ,password: DBUser.Wiki     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for wiki.js service     }
      - {name: dbuser_noco     ,password: DBUser.Noco     ,pgbouncer: true ,roles: [dbrole_admin]    ,comment: admin user for nocodb service      }

And each user definition may look like:

- name: dbuser_meta               # REQUIRED, `name` is the only mandatory field of a user definition
  password: DBUser.Meta           # optional, password, can be a scram-sha-256 hash string or plain text
  login: true                     # optional, can log in, true by default  (new biz ROLE should be false)
  superuser: false                # optional, is superuser? false by default
  createdb: false                 # optional, can create database? false by default
  createrole: false               # optional, can create role? false by default
  inherit: true                   # optional, can this role use inherited privileges? true by default
  replication: false              # optional, can this role do replication? false by default
  bypassrls: false                # optional, can this role bypass row level security? false by default
  pgbouncer: true                 # optional, add this user to pgbouncer user-list? false by default (production user should be true explicitly)
  connlimit: -1                   # optional, user connection limit, default -1 disable limit
  expire_in: 3650                 # optional, now + n days when this role is expired (OVERWRITE expire_at)
  expire_at: '2030-12-31'         # optional, YYYY-MM-DD 'timestamp' when this role is expired  (OVERWRITTEN by expire_in)
  comment: pigsty admin user      # optional, comment string for this user/role
  roles: [dbrole_admin]           # optional, belonged roles. default roles are: dbrole_{admin,readonly,readwrite,offline}
  parameters: {}                  # optional, role level parameters with `ALTER ROLE SET`
  pool_mode: transaction          # optional, pgbouncer pool mode at user level, transaction by default
  pool_connlimit: -1              # optional, max database connections at user level, default -1 disable limit
  search_path: public             # key value config parameters according to postgresql documentation (e.g: use pigsty as default search_path)
  • The only required field is name, which should be a valid & unique username in PostgreSQL.
  • Roles don’t need a password, but one is usually required for a login-able user.
  • The password can be plain text or a scram-sha-256 / md5 hash string.
  • Users/Roles are created one by one in array order, so make sure a role/group is defined before its members.
  • login, superuser, createdb, createrole, inherit, replication, bypassrls are boolean flags
  • pgbouncer is disabled by default. To add a business user to the pgbouncer user-list, you should set it to true explicitly.

ACL System

Pigsty has a battery-included ACL system, which can be easily used by assigning roles to users:

  • dbrole_readonly : The role for global read-only access
  • dbrole_readwrite : The role for global read-write access
  • dbrole_admin : The role for object creation
  • dbrole_offline : The role for restricted read-only access (offline instance)

If you wish to re-design your ACL system, check the following parameters & templates.


Create User

Users & Roles defined in pg_default_roles and pg_users will be automatically created one by one during cluster bootstrap.

If you wish to create user on an existing cluster, the bin/pgsql-user util can be used.

Add new user definition to all.children.<cls>.pg_users, and create that user with:

bin/pgsql-user <cls> <username>    # pgsql-user.yml -l <cls> -e username=<username>

The playbook is idempotent, so it’s ok to run this multiple times on the existing cluster.
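For example, to add a hypothetical business user dbuser_app to the pg-meta cluster (the user name and role here are illustrative assumptions, not Pigsty defaults):

# 1. append the definition to all.children.pg-meta.vars.pg_users in the inventory:
#    - {name: dbuser_app ,password: DBUser.App ,pgbouncer: true ,roles: [dbrole_readwrite] ,comment: example app user}
# 2. create the user on the existing cluster:
bin/pgsql-user pg-meta dbuser_app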


Pgbouncer User

Pgbouncer is enabled by default and serves as a connection pool middleware, and its users are managed by Pigsty by default.

Pigsty will add all users in pg_users with pgbouncer: true flag to the pgbouncer userlist by default.

Users are listed in /etc/pgbouncer/userlist.txt:

"postgres" ""
"dbuser_wiki" "SCRAM-SHA-256$4096:+77dyhrPeFDT/TptHs7/7Q==$KeatuohpKIYzHPCt/tqBu85vI11o9mar/by0hHYM2W8=:X9gig4JtjoS8Y/o1vQsIX/gY1Fns8ynTXkbWOjUfbRQ="
"dbuser_view" "SCRAM-SHA-256$4096:DFoZHU/DXsHL8MJ8regdEw==$gx9sUGgpVpdSM4o6A2R9PKAUkAsRPLhLoBDLBUYtKS0=:MujSgKe6rxcIUMv4GnyXJmV0YNbf39uFRZv724+X1FE="
"dbuser_monitor" "SCRAM-SHA-256$4096:fwU97ZMO/KR0ScHO5+UuBg==$CrNsmGrx1DkIGrtrD1Wjexb/aygzqQdirTO1oBZROPY=:L8+dJ+fqlMQh7y4PmVR/gbAOvYWOr+KINjeMZ8LlFww="
"dbuser_meta" "SCRAM-SHA-256$4096:leB2RQPcw1OIiRnPnOMUEg==$eyC+NIMKeoTxshJu314+BmbMFpCcspzI3UFZ1RYfNyU=:fJgXcykVPvOfro2MWNkl5q38oz21nSl1dTtM65uYR1Q="
"dbuser_kong" "SCRAM-SHA-256$4096:bK8sLXIieMwFDz67/0dqXQ==$P/tCRgyKx9MC9LH3ErnKsnlOqgNd/nn2RyvThyiK6e4=:CDM8QZNHBdPf97ztusgnE7olaKDNHBN0WeAbP/nzu5A="
"dbuser_grafana" "SCRAM-SHA-256$4096:HjLdGaGmeIAGdWyn2gDt/Q==$jgoyOB8ugoce+Wqjr0EwFf8NaIEMtiTuQTg1iEJs9BM=:ed4HUFqLyB4YpRr+y25FBT7KnlFDnan6JPVT9imxzA4="
"dbuser_gitea" "SCRAM-SHA-256$4096:l1DBGCc4dtircZ8O8Fbzkw==$tpmGwgLuWPDog8IEKdsaDGtiPAxD16z09slvu+rHE74=:pYuFOSDuWSofpD9OZhG7oWvyAR0PQjJBffgHZLpLHds="
"dbuser_dba" "SCRAM-SHA-256$4096:zH8niABU7xmtblVUo2QFew==$Zj7/pq+ICZx7fDcXikiN7GLqkKFA+X5NsvAX6CMshF0=:pqevR2WpizjRecPIQjMZOm+Ap+x0kgPL2Iv5zHZs0+g="
"dbuser_bytebase" "SCRAM-SHA-256$4096:OMoTM9Zf8QcCCMD0svK5gg==$kMchqbf4iLK1U67pVOfGrERa/fY818AwqfBPhsTShNQ=:6HqWteN+AadrUnrgC0byr5A72noqnPugItQjOLFw0Wk="

And user level parameters are listed in /etc/pgbouncer/useropts.txt:

dbuser_dba                  = pool_mode=session max_user_connections=16
dbuser_monitor              = pool_mode=session max_user_connections=8

The userlist & useropts file will be updated automatically when you add a new user with pgsql-user util, or pgsql-user.yml playbook.

You can use pgbouncer_auth_query to simplify pgbouncer user management (with the cost of reliability & security).
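Once a user is on the pgbouncer userlist, clients can connect through the pooled port 6432 just like the postgres port 5432. A minimal sketch, assuming the sandbox defaults (dbuser_meta / DBUser.Meta and the meta database on 10.10.10.10):

psql postgres://dbuser_meta:[email protected]:6432/meta -c 'SELECT 1'   # pooled connection via pgbouncer
psql postgres://dbuser_meta:[email protected]:5432/meta -c 'SELECT 1'   # direct connection to postgres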

5.3 - Databases

Define business databases in PostgreSQL, which are the objects created by SQL CREATE DATABASE

In this context, Database refers to the object created by SQL CREATE DATABASE.

A PostgreSQL server can serve multiple databases simultaneously. And you can customize each database with Pigsty API.


Define Database

Business databases are defined by pg_databases, which is a cluster-level parameter.

For example, the default meta database is defined in the pg-meta cluster:

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_databases:
      - { name: meta ,baseline: cmdb.sql ,comment: pigsty meta database ,schemas: [pigsty] ,extensions: [{name: postgis, schema: public}, {name: timescaledb}]}
      - { name: grafana  ,owner: dbuser_grafana  ,revokeconn: true ,comment: grafana primary database }
      - { name: bytebase ,owner: dbuser_bytebase ,revokeconn: true ,comment: bytebase primary database }
      - { name: kong     ,owner: dbuser_kong     ,revokeconn: true ,comment: kong the api gateway database }
      - { name: gitea    ,owner: dbuser_gitea    ,revokeconn: true ,comment: gitea meta database }
      - { name: wiki     ,owner: dbuser_wiki     ,revokeconn: true ,comment: wiki meta database }
      - { name: noco     ,owner: dbuser_noco     ,revokeconn: true ,comment: nocodb database }

Each database definition is a dict with the following fields:

- name: meta                      # REQUIRED, `name` is the only mandatory field of a database definition
  baseline: cmdb.sql              # optional, database sql baseline path, (relative path among ansible search path, e.g files/)
  pgbouncer: true                 # optional, add this database to pgbouncer database list? true by default
  schemas: [pigsty]               # optional, additional schemas to be created, array of schema names
  extensions:                     # optional, additional extensions to be installed: array of `{name[,schema]}`
    - { name: postgis , schema: public }
    - { name: timescaledb }
  comment: pigsty meta database   # optional, comment string for this database
  owner: postgres                 # optional, database owner, postgres by default
  template: template1             # optional, which template to use, template1 by default
  encoding: UTF8                  # optional, database encoding, UTF8 by default. (MUST same as template database)
  locale: C                       # optional, database locale, C by default.  (MUST same as template database)
  lc_collate: C                   # optional, database collate, C by default. (MUST same as template database)
  lc_ctype: C                     # optional, database ctype, C by default.   (MUST same as template database)
  tablespace: pg_default          # optional, default tablespace, 'pg_default' by default.
  allowconn: true                 # optional, allow connection, true by default. false will disable connect at all
  revokeconn: false               # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
  register_datasource: true       # optional, register this database to grafana datasources? true by default
  connlimit: -1                   # optional, database connection limit, default -1 disable limit
  pool_auth_user: dbuser_meta     # optional, all connection to this pgbouncer database will be authenticated by this user
  pool_mode: transaction          # optional, pgbouncer pool mode at database level, default transaction
  pool_size: 64                   # optional, pgbouncer pool size at database level, default 64
  pool_size_reserve: 32           # optional, pgbouncer pool size reserve at database level, default 32
  pool_size_min: 0                # optional, pgbouncer pool size min at database level, default 0
  pool_max_db_conn: 100           # optional, max database connections at database level, default 100

The only required field is name, which should be a valid and unique database name in PostgreSQL.

Newly created databases are forked from the template1 database by default, which is customized by PG_PROVISION during cluster bootstrap.

Check ACL: Database Privilege for details about database-level privilege.


Create Database

Databases defined in pg_databases will be automatically created during cluster bootstrap.

If you wish to create database on an existing cluster, the bin/pgsql-db util can be used.

Add new database definition to all.children.<cls>.pg_databases, and create that database with:

bin/pgsql-db <cls> <dbname>    # pgsql-db.yml -l <cls> -e dbname=<dbname>

It’s usually not a good idea to execute this on the existing database again when a baseline script is used.

If you are using the default pgbouncer as the proxy middleware, YOU MUST create the new database with pgsql-db util or pgsql-db.yml playbook. Otherwise, the new database will not be added to the pgbouncer database list.

Remember, if your database definition has a non-default owner (it is the dbsu postgres by default), make sure the owner user exists first. That is to say, always create the user before the database.
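Putting it together, a hypothetical business database app owned by a hypothetical user dbuser_app could be added like this (both names are illustrative):

bin/pgsql-user pg-meta dbuser_app      # 1. create the owner first (user before database)
# 2. append the definition to all.children.pg-meta.vars.pg_databases in the inventory:
#    - {name: app ,owner: dbuser_app ,comment: example application database}
bin/pgsql-db pg-meta app               # 3. create the database; it is also added to the pgbouncer database list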


Pgbouncer Database

Pgbouncer is enabled by default and serves as a connection pool middleware.

Pigsty will add all databases in pg_databases to the pgbouncer database list by default. You can disable the pgbouncer proxy for a specific database by setting pgbouncer: false in the database definition.

Databases are listed in /etc/pgbouncer/database.txt, with extra database-level parameters, such as:

meta                        = host=/var/run/postgresql mode=session
grafana                     = host=/var/run/postgresql mode=transaction
bytebase                    = host=/var/run/postgresql auth_user=dbuser_meta
kong                        = host=/var/run/postgresql pool_size=32 reserve_pool=64
gitea                       = host=/var/run/postgresql min_pool_size=10
wiki                        = host=/var/run/postgresql
noco                        = host=/var/run/postgresql
mongo                       = host=/var/run/postgresql

The Pgbouncer database list will be updated when you create databases with the Pigsty util & playbook.

To access pgbouncer administration functionality, you can use the pgb alias as dbsu.

There’s a util function defined in /etc/profile.d/pg-alias.sh, allowing you to reroute pgbouncer database traffic to a new host quickly, which can be used during zero-downtime migration.

# route pgbouncer traffic to another cluster member
function pgb-route(){
  local ip=${1-'\/var\/run\/postgresql'}
  sed -ie "s/host=[^[:space:]]\+/host=${ip}/g" /etc/pgbouncer/pgbouncer.ini
  cat /etc/pgbouncer/pgbouncer.ini
}
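A hypothetical usage sketch during such a migration: run it as dbsu (with permission to edit and reload pgbouncer) on the node whose pgbouncer should point at the new target, then reload pgbouncer to apply the change. The target IP below is illustrative:

pgb-route 10.10.10.12          # rewrite host=... entries in /etc/pgbouncer/pgbouncer.ini to 10.10.10.12
systemctl reload pgbouncer     # reload pgbouncer so the new routing takes effect
pgb-route                      # route back to the default local unix socket when done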

5.4 - Services

Define and create new services, and expose them via haproxy

Service Implementation

In Pigsty, services are implemented using haproxy on nodes, differentiated by different ports on the host node.

Every node has Haproxy enabled to expose services. From the database perspective, nodes in the cluster may be primary or replicas, but from the service perspective, all nodes are the same. This means even if you access a replica node, as long as you use the correct service port, you can still use the primary’s read-write service. This design seals the complexity: as long as you can access any instance on the PostgreSQL cluster, you can fully access all services.

This design is akin to the NodePort service in Kubernetes. Similarly, in Pigsty, every service includes these two core elements:

  1. Access endpoints exposed via NodePort (port number, from where to access?)
  2. Target instances chosen through Selectors (list of instances, who will handle it?)

The boundary of Pigsty’s service delivery stops at the cluster’s HAProxy. Users can access these load balancers in various ways. Please refer to Access Service.

All services are declared through configuration files. For instance, the default PostgreSQL service is defined by the pg_default_services parameter:

- { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }
- { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
- { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
- { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}

You can also define new services in pg_services. Both pg_default_services and pg_services are arrays of Service Definitions.


Define Service

The default services are defined in pg_default_services.

You can define extra PostgreSQL services with pg_services at the global or cluster level.

These two parameters are both arrays of service objects. Each service definition will be rendered as a haproxy config in /etc/haproxy/<svcname>.cfg, check service.j2 for details.

Here is an example of an extra service definition: standby

- name: standby                   # required, service name, the actual svc name will be prefixed with `pg_cluster`, e.g: pg-meta-standby
  port: 5435                      # required, service exposed port (work as kubernetes service node port mode)
  ip: "*"                         # optional, service bind ip address, `*` for all ip by default
  selector: "[]"                  # required, service member selector, use JMESPath to filter inventory
  dest: default                   # optional, destination port, default|postgres|pgbouncer|<port_number>, 'default' by default
  check: /sync                    # optional, health check url path, / by default
  backup: "[? pg_role == `primary`]"  # backup server selector
  maxconn: 3000                   # optional, max allowed front-end connection
  balance: roundrobin             # optional, haproxy load balance algorithm (roundrobin by default, other: leastconn)
  options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'

And it will be translated to a haproxy config file /etc/haproxy/pg-test-standby.cfg:

#---------------------------------------------------------------------
# service: pg-test-standby @ 10.10.10.11:5435
#---------------------------------------------------------------------
# service instances 10.10.10.11, 10.10.10.13, 10.10.10.12
# service backups   10.10.10.11
listen pg-test-standby
    bind *:5435
    mode tcp
    maxconn 5000
    balance roundrobin
    option httpchk
    option http-keep-alive
    http-check send meth OPTIONS uri /sync  # <--- true for primary & sync standby
    http-check expect status 200
    default-server inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100
    # servers
    server pg-test-1 10.10.10.11:6432 check port 8008 weight 100 backup   # the primary is used as backup server
    server pg-test-3 10.10.10.13:6432 check port 8008 weight 100
    server pg-test-2 10.10.10.12:6432 check port 8008 weight 100

Reload Service

When cluster membership has changed, such as appending/removing replicas, switchover/failover, or adjusting relative weights, you have to reload the service to make the changes take effect.

bin/pgsql-svc <cls> [ip...]         # reload service for lb cluster or lb instance
# ./pgsql.yml -t pg_service         # the actual ansible task to reload service

Override Service

You can override the default service configuration in several ways:

Bypass Pgbouncer

When defining a service, if svc.dest='default', the parameter pg_default_service_dest will be used as the actual destination. It is pgbouncer by default; if you use postgres instead, the default primary & replica services will bypass pgbouncer and route traffic to postgres directly.

If you don’t need connection pooling at all, you can change pg_default_service_dest to postgres, and remove default and offline services.

If you don’t need read-only replicas for online traffic, you can remove replica from pg_default_services too.

pg_default_services:
  - { name: primary ,port: 5433 ,dest: default  ,check: /primary   ,selector: "[]" }
  - { name: replica ,port: 5434 ,dest: default  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
  - { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
  - { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}

Delegate Service

Pigsty exposes PostgreSQL services with haproxy on nodes. All haproxy instances within the cluster are configured with the same service definitions.

However, you can delegate pg services to a specific node group (e.g. a dedicated haproxy lb cluster) rather than the cluster members.

To do so, override the default service definitions with pg_default_services and set pg_service_provider to the proxy group name.

For example, this configuration will expose pg cluster primary service on haproxy node group proxy with port 10013.

pg_service_provider: proxy       # use load balancer on group `proxy` with port 10013
pg_default_services:  [{ name: primary ,port: 10013 ,dest: postgres  ,check: /primary   ,selector: "[]" }]

It’s the user’s responsibility to make sure each delegated service port is unique within the proxy cluster.
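Once the delegated service is applied, the cluster primary should be reachable through the proxy group on the delegated port. A hypothetical check (the proxy node address and credentials below are assumptions):

psql postgres://dbuser_meta:DBUser.Meta@<proxy_node_ip>:10013/meta -c 'SELECT pg_is_in_recovery();'   # should return f on the primary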

5.5 - Extensions

Define, Create, Install, Enable Extensions in Pigsty

Extensions are the soul of PostgreSQL, and Pigsty deeply integrates the core extension plugins of the PostgreSQL ecosystem, providing you with battery-included distributed, time-series, geospatial, full-text, graph, and vector database capabilities! Check the extension list for details.

Pigsty includes 220+ PostgreSQL extension plugins and has compiled, packaged, integrated, and maintained many extensions not included in the official PGDG source. It also ensures through thorough testing that all these plugins can work together seamlessly. These include some potent extensions:

  • PostGIS: Add geospatial data support to PostgreSQL
  • TimescaleDB: Add time-series/continuous-aggregation support to PostgreSQL
  • PGVector: AI vector/embedding data type support, and ivfflat / hnsw index access method
  • Citus: Turn a standalone primary-replica postgres cluster into a horizontally scalable distributed cluster
  • Apache AGE: Add OpenCypher graph query language support to PostgreSQL, works like Neo4J
  • PG GraphQL: Add GraphQL language support to PostgreSQL
  • zhparser : Add Chinese word segmentation support to PostgreSQL, works like ElasticSearch
  • Supabase: Open-Source Firebase alternative based on PostgreSQL
  • FerretDB: Open-Source MongoDB alternative based on PostgreSQL
  • PostgresML: Use machine learning algorithms and pretrained models with SQL
  • ParadeDB: Open-Source ElasticSearch Alternative (based on PostgreSQL)

Plugins are already included and placed in the yum repo of the infra nodes, which can be directly enabled through PGSQL Cluster Config. Pigsty also introduces a complete compilation environment and infrastructure, allowing you to compile extensions not included in Pigsty & PGDG.

pigsty-extension.jpg

Some “databases” are not actual PostgreSQL extensions, but are also supported by Pigsty, such as:

  • Supabase: Open-Source Firebase Alternative (based on PostgreSQL)
  • FerretDB: Open-Source MongoDB Alternative (based on PostgreSQL)
  • NocoDB: Open-Source Airtable Alternative (based on PostgreSQL)
  • DuckDB: Open-Source Analytical SQLite Alternative (PostgreSQL Compatible)

Install Extension

When you init a PostgreSQL cluster, the extensions listed in pg_packages & pg_extensions will be installed.

For default EL systems, the default values of pg_packages and pg_extensions are defined as follows:

pg_packages:     # these extensions are always installed by default : pg_repack, wal2json, passwordcheck_cracklib
  - pg_repack_${pg_version}* wal2json_${pg_version}* passwordcheck_cracklib_${pg_version}* # important extensions
pg_extensions:   # install postgis, timescaledb, pgvector by default
  - postgis34_${pg_version}* timescaledb-2-postgresql-${pg_version}* pgvector_${pg_version}*

For ubuntu / debian, package names are different, and passwordcheck_cracklib is not available.

pg_packages:    # these extensions are always installed by default : pg_repack, wal2json
  - postgresql-${pg_version}-repack postgresql-${pg_version}-wal2json
pg_extensions:  # these extensions are installed by default:
  - postgresql-${pg_version}-postgis* timescaledb-2-postgresql-${pg_version} postgresql-${pg_version}-pgvector postgresql-${pg_version}-citus-12.1

Here, ${pg_version} is a placeholder that will be replaced with the actual major version number pg_version of that PostgreSQL cluster. Therefore, the default configuration will install these extensions:

  • pg_repack: Extension for online table bloat processing.
  • wal2json: Extracts changes in JSON format through logical decoding.
  • passwordcheck_cracklib: Enforce password policy. (EL only)
  • postgis: Geospatial database extension (postgis34, EL7: postgis33)
  • timescaledb: Time-series database extension
  • pgvector: Vector datatype and ivfflat/hnsw index
  • citus: Distributed/columnar storage extension (citus conflicts with hydra, choose one of them on EL systems)

If you want to enable certain extensions in a target cluster that has not yet been created, you can directly declare them with the parameters:

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_databases:
      - name: test
        extensions:                 # <----- install these extensions for database `test`
          - { name: postgis, schema: public }
          - { name: timescaledb }
          - { name: pg_cron }
          - { name: vector }
          - { name: age }
    pg_libs: 'timescaledb, pg_cron, pg_stat_statements, auto_explain' # <- some extension require a share library to work
    pg_extensions:
      - postgis34_${pg_version}* timescaledb-2-postgresql-${pg_version}* pgvector_${pg_version}* hydra_${pg_version}*   # default extensions to be installed
      - pg_cron_${pg_version}*        # <---- new extension: pg_cron
      - apache-age_${pg_version}*     # <---- new extension: apache-age
      - zhparser_${pg_version}*       # <---- new extension: zhparser

You can run the pg_extension sub-task in pgsql.yml to add extensions to clusters that have already been created.

./pgsql.yml -l pg-meta -t pg_extension    # install specified extensions for cluster pg-meta

To install all available extensions in one pass, you can just specify pg_extensions: ['*${pg_version}*'], which is really a bold move.


Install Manually

After the PostgreSQL cluster is inited, you can manually install plugins via Ansible or Shell commands. For example, if you want to enable a specific extension on a cluster that has already been initialized:

cd ~/pigsty;    # enter pigsty home dir and install the apache age extension for the pg-test cluster
ansible pg-test -m yum -b -a 'name=apache-age_16*'     # The extension name usually has a suffix like `_<pgmajorversion>`

Most plugins are already included in the yum repository on the infrastructure node and can be installed directly using the yum command. If not included, you can consider downloading from the PGDG upstream source using the repotrack / apt download command or compiling source code into RPMs for distribution.

After the extension installation, you should be able to see them in the pg_available_extensions view of the target database cluster. Next, execute in the database where you want to install the extension:

CREATE EXTENSION age;          -- install the graph database extension
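To double-check availability from the server side, you can also query the catalog mentioned above. A minimal sketch, run as dbsu on a cluster node against the target database (the meta database and the age extension are used as examples here):

psql meta -c "SELECT name, default_version, installed_version FROM pg_available_extensions WHERE name = 'age';"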

5.6 - Authentication

Host-Based Authentication in Pigsty, how to manage HBA rules in Pigsty?

Host-Based Authentication in Pigsty

PostgreSQL has various authentication methods. You can use all of them, while Pigsty’s battery-included ACL system focuses on HBA, password, and SSL authentication.


Client Authentication

To connect to a PostgreSQL database, the user has to be authenticated (with a password by default).

You can provide the password in the connection string (not secure) or use the PGPASSWORD env or .pgpass file. Check psql docs and PostgreSQL connection string for more details.

psql 'host=<host> port=<port> dbname=<dbname> user=<username> password=<password>'
psql postgres://<username>:<password>@<host>:<port>/<dbname>
PGPASSWORD=<password> psql -U <username> -h <host> -p <port> -d <dbname>

The default connection string for the meta database:

psql 'host=10.10.10.10 port=5432 dbname=meta user=dbuser_dba password=DBUser.DBA'
psql postgres://dbuser_dba:[email protected]:5432/meta
PGPASSWORD=DBUser.DBA psql -U dbuser_dba -h 10.10.10.10 -p 5432 -d meta

To connect with the SSL certificate, you can use the PGSSLCERT and PGSSLKEY env or sslkey & sslcert parameters.

psql 'postgres://dbuser_dba:[email protected]:5432/meta?sslkey=/path/to/dbuser_dba.key&sslcert=/path/to/dbuser_dba.crt'

The client certificate (whose CN is the username) can be issued with the local CA & the cert.yml playbook.


Define HBA

There are four parameters for HBA rules in Pigsty: pg_hba_rules, pg_default_hba_rules, pgb_hba_rules, and pgb_default_hba_rules.

They are arrays of HBA rule objects, and each HBA rule takes one of the following forms:

1. Raw Form

- title: allow intranet password access
  role: common
  rules:
    - host   all  all  10.0.0.0/8      md5
    - host   all  all  172.16.0.0/12   md5
    - host   all  all  192.168.0.0/16  md5

In this form, the title will be rendered as a comment line, followed by the rules as raw HBA strings, one per line.

An HBA Rule is installed when the instance’s pg_role is the same as the role.

HBA Rule with role: common will be installed on all instances.

HBA Rule with role: offline will be installed on instances with pg_role = offline or pg_offline_query = true.

2. Alias Form

The alias form replaces the rules field with addr, auth, user, and db fields.

- addr: 'intra'    # world|intra|infra|admin|local|localhost|cluster|<cidr>
  auth: 'pwd'      # trust|pwd|ssl|cert|deny|<official auth method>
  user: 'all'      # all|${dbsu}|${repl}|${admin}|${monitor}|<user>|<group>
  db: 'all'        # all|replication|....
  rules: []        # raw hba strings, which take precedence over all the fields above
  title: allow intranet password access
  • addr: where
    • world: all IP addresses
    • intra: all intranet cidr: '10.0.0.0/8', '172.16.0.0/12', '192.168.0.0/16'
    • infra: IP addresses of infra nodes
    • admin: admin_ip address
    • local: local unix socket
    • localhost: local unix socket + tcp 127.0.0.1/32
    • cluster: all IP addresses of pg cluster members
    • <cidr>: any standard CIDR blocks or IP addresses
  • auth: how
    • deny: reject access
    • trust: trust authentication
    • pwd: use md5 or scram-sha-256 password auth according to pg_pwd_enc
    • sha/scram-sha-256: enforce scram-sha-256 password authentication
    • md5: md5 password authentication
    • ssl: enforce host ssl in addition to pwd auth
    • ssl-md5: enforce host ssl in addition to md5 password auth
    • ssl-sha: enforce host ssl in addition to scram-sha-256 password auth
    • os/ident: use ident os user authentication
    • peer: use peer authentication
    • cert: use certificate-based client authentication
  • user: who
  • db: which
    • all: all databases
    • replication: replication database
    • ad hoc database name

3. Where to Define

Typically, global HBA is defined in all.vars. If you want to modify the global default HBA rules, you can copy from the full.yml template to all.vars for modification.

Cluster-specific HBA rules are defined in the cluster-level configuration of the database:

Here are some examples of cluster HBA rule definitions.

pg-meta:
  hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-meta
    pg_hba_rules:
      - { user: dbuser_view ,db: all    ,addr: infra        ,auth: pwd  ,title: 'allow grafana dashboard access cmdb from infra nodes'}
      - { user: all         ,db: all    ,addr: 100.0.0.0/8  ,auth: pwd  ,title: 'all user access all db from kubernetes cluster' }
      - { user: '${admin}'  ,db: world  ,addr: 0.0.0.0/0    ,auth: cert ,title: 'all admin world access with client cert'        }

Reload HBA

To reload postgres/pgbouncer hba rules:

bin/pgsql-hba <cls>                 # reload hba rules of cluster `<cls>`
bin/pgsql-hba <cls> ip1 ip2...      # reload hba rules of specific instances

The underlying commands are:

./pgsql.yml -l <cls> -e pg_reload=true -t pg_hba
./pgsql.yml -l <cls> -e pg_reload=true -t pgbouncer_hba,pgbouncer_reload

Default HBA

Pigsty has a default set of HBA rules, which is pretty secure for most cases.

The rules are self-explained in alias form.

pg_default_hba_rules:             # postgres default host-based authentication rules
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'   }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}
pgb_default_hba_rules:            # pgbouncer default host-based authentication rules
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: pwd   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: pwd   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: pwd   ,title: 'allow all user intra access with pwd' }
Example: Rendered pg_hba.conf
#==============================================================#
# File      :   pg_hba.conf
# Desc      :   Postgres HBA Rules for pg-meta-1 [primary]
# Time      :   2023-01-11 15:19
# Host      :   pg-meta-1 @ 10.10.10.10:5432
# Path      :   /pg/data/pg_hba.conf
# Note      :   ANSIBLE MANAGED, DO NOT CHANGE!
# Author    :   Ruohang Feng ([email protected])
# License   :   AGPLv3
#==============================================================#

# addr alias
# local     : /var/run/postgresql
# admin     : 10.10.10.10
# infra     : 10.10.10.10
# intra     : 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16

# user alias
# dbsu    :  postgres
# repl    :  replicator
# monitor :  dbuser_monitor
# admin   :  dbuser_dba

# dbsu access via local os user ident [default]
local    all                postgres                              ident

# dbsu replication from local os ident [default]
local    replication        postgres                              ident

# replicator replication from localhost [default]
local    replication        replicator                            scram-sha-256
host     replication        replicator         127.0.0.1/32       scram-sha-256

# replicator replication from intranet [default]
host     replication        replicator         10.0.0.0/8         scram-sha-256
host     replication        replicator         172.16.0.0/12      scram-sha-256
host     replication        replicator         192.168.0.0/16     scram-sha-256

# replicator postgres db from intranet [default]
host     postgres           replicator         10.0.0.0/8         scram-sha-256
host     postgres           replicator         172.16.0.0/12      scram-sha-256
host     postgres           replicator         192.168.0.0/16     scram-sha-256

# monitor from localhost with password [default]
local    all                dbuser_monitor                        scram-sha-256
host     all                dbuser_monitor     127.0.0.1/32       scram-sha-256

# monitor from infra host with password [default]
host     all                dbuser_monitor     10.10.10.10/32     scram-sha-256

# admin @ infra nodes with pwd & ssl [default]
hostssl  all                dbuser_dba         10.10.10.10/32     scram-sha-256

# admin @ everywhere with ssl & pwd [default]
hostssl  all                dbuser_dba         0.0.0.0/0          scram-sha-256

# pgbouncer read/write via local socket [default]
local    all                +dbrole_readonly                      scram-sha-256
host     all                +dbrole_readonly   127.0.0.1/32       scram-sha-256

# read/write biz user via password [default]
host     all                +dbrole_readonly   10.0.0.0/8         scram-sha-256
host     all                +dbrole_readonly   172.16.0.0/12      scram-sha-256
host     all                +dbrole_readonly   192.168.0.0/16     scram-sha-256

# allow etl offline tasks from intranet [default]
host     all                +dbrole_offline    10.0.0.0/8         scram-sha-256
host     all                +dbrole_offline    172.16.0.0/12      scram-sha-256
host     all                +dbrole_offline    192.168.0.0/16     scram-sha-256

# allow application database intranet access [common] [DISABLED]
#host    kong            dbuser_kong         10.0.0.0/8          md5
#host    bytebase        dbuser_bytebase     10.0.0.0/8          md5
#host    grafana         dbuser_grafana      10.0.0.0/8          md5
Example: Rendered pgb_hba.conf
#==============================================================#
# File      :   pgb_hba.conf
# Desc      :   Pgbouncer HBA Rules for pg-meta-1 [primary]
# Time      :   2023-01-11 15:28
# Host      :   pg-meta-1 @ 10.10.10.10:5432
# Path      :   /etc/pgbouncer/pgb_hba.conf
# Note      :   ANSIBLE MANAGED, DO NOT CHANGE!
# Author    :   Ruohang Feng ([email protected])
# License   :   AGPLv3
#==============================================================#

# PGBOUNCER HBA RULES FOR pg-meta-1 @ 10.10.10.10:6432
# ansible managed: 2023-01-11 14:30:58

# addr alias
# local     : /var/run/postgresql
# admin     : 10.10.10.10
# infra     : 10.10.10.10
# intra     : 10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16

# user alias
# dbsu    :  postgres
# repl    :  replicator
# monitor :  dbuser_monitor
# admin   :  dbuser_dba

# dbsu local admin access with os ident [default]
local    pgbouncer          postgres                              peer

# allow all user local access with pwd [default]
local    all                all                                   scram-sha-256
host     all                all                127.0.0.1/32       scram-sha-256

# monitor access via intranet with pwd [default]
host     pgbouncer          dbuser_monitor     10.0.0.0/8         scram-sha-256
host     pgbouncer          dbuser_monitor     172.16.0.0/12      scram-sha-256
host     pgbouncer          dbuser_monitor     192.168.0.0/16     scram-sha-256

# reject all other monitor access addr [default]
host     all                dbuser_monitor     0.0.0.0/0          reject

# admin access via intranet with pwd [default]
host     all                dbuser_dba         10.0.0.0/8         scram-sha-256
host     all                dbuser_dba         172.16.0.0/12      scram-sha-256
host     all                dbuser_dba         192.168.0.0/16     scram-sha-256

# reject all other admin access addr [default]
host     all                dbuser_dba         0.0.0.0/0          reject

# allow all user intra access with pwd [default]
host     all                all                10.0.0.0/8         scram-sha-256
host     all                all                172.16.0.0/12      scram-sha-256
host     all                all                192.168.0.0/16     scram-sha-256

Security Enhancement

For those critical cases, we have a security.yml template with the following hba rule set as a reference:

pg_default_hba_rules:             # postgres host-based auth rules by default
  - {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
  - {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
  - {user: '${repl}'    ,db: replication ,addr: localhost ,auth: ssl   ,title: 'replicator replication from localhost'}
  - {user: '${repl}'    ,db: replication ,addr: intra     ,auth: ssl   ,title: 'replicator replication from intranet' }
  - {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: ssl   ,title: 'replicator postgres db from intranet' }
  - {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
  - {user: '${monitor}' ,db: all         ,addr: infra     ,auth: ssl   ,title: 'monitor from infra host with password'}
  - {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: cert  ,title: 'admin @ everywhere with ssl & cert'   }
  - {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: ssl   ,title: 'pgbouncer read/write via local socket'}
  - {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: ssl   ,title: 'read/write biz user via password'     }
  - {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: ssl   ,title: 'allow etl offline tasks from intranet'}
pgb_default_hba_rules:            # pgbouncer host-based authentication rules
  - {user: '${dbsu}'    ,db: pgbouncer   ,addr: local     ,auth: peer  ,title: 'dbsu local admin access with os ident'}
  - {user: 'all'        ,db: all         ,addr: localhost ,auth: pwd   ,title: 'allow all user local access with pwd' }
  - {user: '${monitor}' ,db: pgbouncer   ,addr: intra     ,auth: ssl   ,title: 'monitor access via intranet with pwd' }
  - {user: '${monitor}' ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other monitor access addr' }
  - {user: '${admin}'   ,db: all         ,addr: intra     ,auth: ssl   ,title: 'admin access via intranet with pwd'   }
  - {user: '${admin}'   ,db: all         ,addr: world     ,auth: deny  ,title: 'reject all other admin access addr'   }
  - {user: 'all'        ,db: all         ,addr: intra     ,auth: ssl   ,title: 'allow all user intra access with pwd' }

5.7 - Configuration

Configure your PostgreSQL cluster & instances according to your needs

You can define different types of instances & clusters.

  • Identity: Parameters used for describing a PostgreSQL cluster
  • Primary: Define a single instance cluster.
  • Replica: Define a basic HA cluster with one primary & one replica.
  • Offline: Define a dedicated instance for OLAP/ETL/Interactive queries
  • Sync Standby: Enable synchronous commit to ensure no data loss.
  • Quorum Commit: Use quorum sync commit for an even higher consistency level.
  • Standby Cluster: Clone an existing cluster and follow it
  • Delayed Cluster: Clone an existing cluster for emergency data recovery
  • Citus Cluster: Define a Citus distributed database cluster
  • Major Version: Create postgres cluster with different major version

Primary

Let’s start with the simplest case, a single-instance cluster:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars:
    pg_cluster: pg-test

Use the following command to create a primary database instance on the 10.10.10.11 node.

bin/pgsql-add pg-test

Replica

To add a physical replica, you can assign a new instance to pg-test with pg_role set to replica

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }  # <--- newly added
  vars:
    pg_cluster: pg-test

You can create an entire cluster or append a replica to the existing cluster:

bin/pgsql-add pg-test               # init entire cluster in one-pass
bin/pgsql-add pg-test 10.10.10.12   # add replica to existing cluster

Offline

The offline instance is a dedicated replica to serve slow queries, ETL, OLAP traffic and interactive queries, etc…

To add an offline instance, assign a new instance with pg_role set to offline.

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: offline } # <--- newly added
  vars:
    pg_cluster: pg-test

Offline instances work like common replica instances, but they are used as backup servers in the pg-test-replica service. That is to say, offline and primary instances serve only when all replica instances are down.

You can have ad hoc access control for offline traffic with pg_default_hba_rules and pg_hba_rules. They apply to the offline instance and to any instance with the pg_offline_query flag.


Sync Standby

Pigsty uses asynchronous stream replication by default, which may have a small replication lag (10KB / 10ms). A small window of data loss may occur when the primary fails (can be controlled with pg_rpo), but it is acceptable for most scenarios.

But in some critical scenarios (e.g. financial transactions), data loss is totally unacceptable or read-your-write consistency is required. In this case, you can enable synchronous commit to ensure that.

To enable sync standby mode, you can simply use the crit.yml template as pg_conf:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica }
  vars:
    pg_cluster: pg-test
    pg_conf: crit.yml   # <--- use crit template

To enable sync standby on existing clusters, config the cluster and enable synchronous_mode:

$ pg edit-config pg-test    # run on admin node with admin user
+++
-synchronous_mode: false    # <--- old value
+synchronous_mode: true     # <--- new value
 synchronous_mode_strict: false

Apply these changes? [y/N]: y

If synchronous_mode: true, the synchronous_standby_names parameter will be managed by patroni. It will choose a sync standby from all available replicas and write its name to the primary’s configuration file.
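To verify which replica was picked as the synchronous standby, you can check pg_stat_replication on the primary. A minimal sketch, run as dbsu on the primary node:

psql -c "SELECT application_name, sync_state FROM pg_stat_replication;"   # sync / async (or quorum) per replica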


Quorum Commit

When sync standby is enabled, PostgreSQL will pick one replica as the standby instance, and all other replicas as candidates. The primary will wait until the standby instance flushes to disk before confirming a commit, and the standby instance will always have the latest data without any lag.

However, you can achieve an even higher/lower consistency level with the quorum commit (trade-off with availability).

For example, to require both of the 2 replicas to confirm a commit:

synchronous_mode: true          # make sure synchronous mode is enabled
synchronous_node_count: 2       # at least 2 nodes to confirm a commit

If you have more replicas and wish to have more sync standbys, increase synchronous_node_count accordingly. Beware of adjusting synchronous_node_count accordingly when you append or remove replicas.

The postgres synchronous_standby_names parameter will be managed by patroni:

synchronous_standby_names = '2 ("pg-test-3","pg-test-2")'
Example: Multiple Sync Standby
$ pg edit-config pg-test
---
+++
@@ -82,10 +82,12 @@
     work_mem: 4MB
+    synchronous_standby_names: 'ANY 2 (pg-test-2, pg-test-3, pg-test-4)'
 
-synchronous_mode: false
+synchronous_mode: true
+synchronous_node_count: 2
 synchronous_mode_strict: false

Apply these changes? [y/N]: y

And we can see that the two replicas are selected as sync standby now.

+ Cluster: pg-test (7080814403632534854) +---------+----+-----------+-----------------+
| Member    | Host        | Role         | State   | TL | Lag in MB | Tags            |
+-----------+-------------+--------------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.10 | Leader       | running |  1 |           | clonefrom: true |
| pg-test-2 | 10.10.10.11 | Sync Standby | running |  1 |         0 | clonefrom: true |
| pg-test-3 | 10.10.10.12 | Sync Standby | running |  1 |         0 | clonefrom: true |
+-----------+-------------+--------------+---------+----+-----------+-----------------+
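The cluster status above can be printed with the patroni shortcut from the admin cheatsheet, run as the admin user on the admin node:

pg list pg-test    # print cluster info, including each member's role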

The classic quorum commit uses a majority of replicas to confirm a commit.

synchronous_mode: quorum        # use quorum commit
postgresql:
  parameters:                   # change the PostgreSQL parameter `synchronous_standby_names`, use the `ANY n ()` notion
    synchronous_standby_names: 'ANY 1 (*)'  # you can specify a list of standby names, or use `*` to match them all
Example: Enable Quorum Commit
$ pg edit-config pg-test

+    synchronous_standby_names: 'ANY 1 (*)' # You have to configure this manually
+ synchronous_mode: quorum        # use quorum commit mode, undocumented parameter
- synchronous_node_count: 2       # this parameter is no longer needed in quorum mode

Apply these changes? [y/N]: y

After applying the configuration, we can see that all replicas are no longer sync standby, but just normal replicas.

After that, when we check pg_stat_replication.sync_state, it becomes quorum instead of sync or async.


Standby Cluster

You can clone an existing cluster and create a standby cluster, which can be used for migration, horizontal split, multi-az deployment, or disaster recovery.

A standby cluster’s definition is just the same as any other normal cluster, except there’s a pg_upstream defined on the primary instance.

For example, you have a pg-test cluster, to create a standby cluster pg-test2, the inventory may look like this:

# pg-test is the original cluster
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars: { pg_cluster: pg-test }

# pg-test2 is a standby cluster of pg-test.
pg-test2:
  hosts:
    10.10.10.12: { pg_seq: 1, pg_role: primary , pg_upstream: 10.10.10.11 } # <--- pg_upstream is defined here
    10.10.10.13: { pg_seq: 2, pg_role: replica }
  vars: { pg_cluster: pg-test2 }

And pg-test2-1, the primary of pg-test2, will be a replica of pg-test and serve as a Standby Leader in pg-test2.

Just make sure that the pg_upstream parameter is configured on the primary of the standby cluster, so it can replicate from the original upstream automatically.

bin/pgsql-add pg-test     # create the original cluster
bin/pgsql-add pg-test2    # create the standby cluster
Example: Change Replication Upstream

You can change the replication upstream of the standby cluster when necessary (e.g. upstream failover).

To do so, just change the standby_cluster.host to the new upstream IP address and apply.

$ pg edit-config pg-test2

 standby_cluster:
   create_replica_methods:
   - basebackup
-  host: 10.10.10.13     # <--- The old upstream
+  host: 10.10.10.12     # <--- The new upstream
   port: 5432

 Apply these changes? [y/N]: y
Example: Promote Standby Cluster

You can promote the standby cluster to a standalone cluster at any time.

To do so, edit the cluster config, wipe the entire standby_cluster section, and then apply.

$ pg edit-config pg-test2
-standby_cluster:
-  create_replica_methods:
-  - basebackup
-  host: 10.10.10.11
-  port: 5432

Apply these changes? [y/N]: y
Example: Cascade Replica

If pg_upstream is specified for a replica rather than the primary, the replica will be configured as a cascade replica with the given upstream IP instead of the cluster primary.

pg-test:
  hosts: # pg-test-1 ---> pg-test-2 ---> pg-test-3
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica } # <--- bridge instance
    10.10.10.13: { pg_seq: 3, pg_role: replica, pg_upstream: 10.10.10.12 }
    # ^--- replicate from pg-test-2 (the bridge) instead of pg-test-1 (the primary) 
  vars: { pg_cluster: pg-test }

Delayed Cluster

A delayed cluster is a special type of standby cluster, which is used to recover from “drop-by-accident” mistakes ASAP.

For example, if you wish to have a cluster pg-testdelay which holds the same data as the pg-test cluster one day ago:

# pg-test is the original cluster
pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars: { pg_cluster: pg-test }

# pg-testdelay is a delayed cluster of pg-test.
pg-testdelay:
  hosts:
    10.10.10.12: { pg_seq: 1, pg_role: primary , pg_upstream: 10.10.10.11, pg_delay: 1d }
    10.10.10.13: { pg_seq: 2, pg_role: replica }
  vars: { pg_cluster: pg-testdelay }

You can also configure a replication delay on the existing standby cluster.

$ pg edit-config pg-testdelay
 standby_cluster:
   create_replica_methods:
   - basebackup
   host: 10.10.10.11
   port: 5432
+  recovery_min_apply_delay: 1h    # <--- add delay here

Apply these changes? [y/N]: y

When some tuples & tables are dropped by accident, you can advance this delayed cluster to a proper time point and select data from it.

It takes more resources, but can be much faster and have less impact than PITR.


Citus Cluster

Pigsty has native citus support. Check files/pigsty/citus.yml & prod.yml for example.

To define a citus cluster, you have to specify the following parameters: pg_mode, pg_shard, pg_group, patroni_citus_db, and pg_dbsu_password.

Besides, extra HBA rules that allow SSL access from local & other data nodes are required, which may look like this:

all:
  children:
    pg-citus0: # citus data node 0
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus0 , pg_group: 0 }
    pg-citus1: # citus data node 1
      hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus1 , pg_group: 1 }
    pg-citus2: # citus data node 2
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus2 , pg_group: 2 }
    pg-citus3: # citus data node 3, with an extra replica
      hosts:
        10.10.10.13: { pg_seq: 1, pg_role: primary }
        10.10.10.14: { pg_seq: 2, pg_role: replica }
      vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  vars:                               # global parameters for all citus clusters
    pg_mode: citus                    # pgsql cluster mode: citus
    pg_shard: pg-citus                # citus shard name: pg-citus
    patroni_citus_db: meta            # citus distributed database name
    pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
    pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra        ,auth: ssl ,title: 'all user ssl access from intranet'  }

You can create distributed tables & reference tables on the coordinator node. Any data node can be used as the coordinator node since Citus 11.2.

SELECT create_distributed_table('pgbench_accounts', 'aid'); SELECT truncate_local_data_after_distributing_table($$public.pgbench_accounts$$);
SELECT create_reference_table('pgbench_branches')         ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_branches$$);
SELECT create_reference_table('pgbench_history')          ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_history$$);
SELECT create_reference_table('pgbench_tellers')          ; SELECT truncate_local_data_after_distributing_table($$public.pgbench_tellers$$);
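The pgbench_* tables referenced above come from pgbench's initialization step. A minimal sketch, assuming the distributed database meta and the demo user dbuser_meta / DBUser.Meta from the config above, run against any data node:

pgbench -i postgres://dbuser_meta:[email protected]/meta      # create the pgbench tables to be distributed
psql postgres://dbuser_meta:[email protected]/meta -c "SELECT create_distributed_table('pgbench_accounts', 'aid');"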

Major Version

Pigsty works on PostgreSQL 10+, while the pre-packaged binaries only include versions 12 - 16 for now.

Version   Comment                                                            Packages
16        The latest version with important extensions                       Core, L1, L2
15        The stable major version, with full extension support (default)    Core, L1, L2, L3
14        The old stable major version, with L1 extension support only       Core, L1
13        Older major version, with L1 extension support only                Core, L1
12        Older major version, with L1 extension support only                Core, L1
  • Core: postgresql*, available on PG 12 - 16
  • L1 extensions: wal2json, pg_repack, passwordcheck_cracklib (PG 12 - 16)
  • L2 extensions: postgis, citus, timescaledb, pgvector (PG15, PG16)
  • L3 extensions: Other miscellaneous extensions (PG15 only)

Since some extensions are not available on PG 12,13,14,16, you may have to change pg_extensions and pg_libs to fit your needs.

Here are some example cluster definition with different major versions.

pg-v12:
  hosts: { 10.10.10.12: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v12
    pg_version: 12
    pg_libs: 'pg_stat_statements, auto_explain'
    pg_extensions: [ 'wal2json_12* pg_repack_12* passwordcheck_cracklib_12*' ]

pg-v13:
  hosts: { 10.10.10.13: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v13
    pg_version: 13
    pg_libs: 'pg_stat_statements, auto_explain'
    pg_extensions: [ 'wal2json_13* pg_repack_13* passwordcheck_cracklib_13*' ]

pg-v14:
  hosts: { 10.10.10.14: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v14
    pg_version: 14

pg-v15:
  hosts: { 10.10.10.15: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v15
    pg_version: 15

pg-v16:
  hosts: { 10.10.10.16: { pg_seq: 1 ,pg_role: primary } }
  vars:
    pg_cluster: pg-v16
    pg_version: 16

Beware that these extensions are simply not included in Pigsty’s default repo; you can still obtain them for older PG versions with proper configuration.

5.8 - Administration

Standard operating procedures (SOP) for managing PostgreSQL clusters in a production environment.

How to maintain an existing PostgreSQL cluster with Pigsty?

Here are some SOPs for common PGSQL admin tasks:


Cheatsheet

PGSQL playbooks and shortcuts:

bin/pgsql-add   <cls>                   # create pgsql cluster <cls>
bin/pgsql-user  <cls> <username>        # create pg user <username> on <cls>
bin/pgsql-db    <cls> <dbname>          # create pg database <dbname> on <cls>
bin/pgsql-svc   <cls> [...ip]           # reload pg service of cluster <cls>
bin/pgsql-hba   <cls> [...ip]           # reload postgres/pgbouncer HBA rules of cluster <cls>
bin/pgsql-add   <cls> [...ip]           # append replicas for cluster <cls>
bin/pgsql-rm    <cls> [...ip]           # remove replicas from cluster <cls>
bin/pgsql-rm    <cls>                   # remove pgsql cluster <cls>

Patroni admin command and shortcuts:

pg list        <cls>                    # print cluster info
pg edit-config <cls>                    # edit cluster config 
pg reload      <cls> [ins]              # reload cluster config
pg restart     <cls> [ins]              # restart pgsql cluster
pg reinit      <cls> [ins]              # reinit cluster members
pg pause       <cls>                    # entering maintenance mode (no auto failover)
pg resume      <cls>                    # exiting maintenance mode
pg switchover  <cls>                    # switchover on cluster <cls>
pg failover    <cls>                    # failover on cluster <cls>

pgBackRest backup & restore command and shortcuts:

pb info                                 # print pgbackrest repo info
pg-backup                               # make a backup, incr, or full backup if necessary
pg-backup full                          # make a full backup
pg-backup diff                          # make a differential backup
pg-backup incr                          # make an incremental backup
pg-pitr -i                              # restore to the time of latest backup complete (not often used)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (in case of drop db, drop table)
pg-pitr --name="my-restore-point"       # restore TO a named restore point created by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore right BEFORE an LSN
pg-pitr --xid="1234567" -X -P           # restore right BEFORE a specific transaction id, then promote
pg-pitr --backup=latest                 # restore to latest backup set
pg-pitr --backup=20221108-105325        # restore to a specific backup set, which can be checked with pgbackrest info

Systemd components quick reference

systemctl stop patroni                  # start stop restart reload
systemctl stop pgbouncer                # start stop restart reload
systemctl stop pg_exporter              # start stop restart reload
systemctl stop pgbouncer_exporter       # start stop restart reload
systemctl stop node_exporter            # start stop restart
systemctl stop haproxy                  # start stop restart reload
systemctl stop vip-manager              # start stop restart reload
systemctl stop postgres                 # only when patroni_mode == 'remove'

Create Cluster

To create a new Postgres cluster, define it in the inventory first, then init with:

bin/node-add <cls>                # init nodes for cluster <cls>           # ./node.yml  -l <cls> 
bin/pgsql-add <cls>               # init pgsql instances of cluster <cls>  # ./pgsql.yml -l <cls>

Beware: run bin/node-add first, then bin/pgsql-add, since the PGSQL module only works on managed nodes.

Example: Create Cluster

asciicast


Create User

To create a new business user on the existing Postgres cluster, add user definition to all.children.<cls>.pg_users, then create the user as follows:

bin/pgsql-user <cls> <username>   # ./pgsql-user.yml -l <cls> -e username=<username>
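For reference, a user definition in the inventory might look like the following snippet; the username, password, and roles here are hypothetical placeholders:

pg-test:
  vars:
    pg_users:                          # business users of cluster pg-test
      - name: dbuser_app               # hypothetical username, passed as <username> above
        password: DBUser.App           # placeholder password, change it in production
        pgbouncer: true                # add this user to the pgbouncer userlist
        roles: [ dbrole_readwrite ]    # grant the global read-write role
        comment: business user for app

With this definition in place, bin/pgsql-user pg-test dbuser_app would create (or update) that user on the cluster.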
Example: Create Business User

asciicast


Create Database

To create a new database on an existing Postgres cluster, add the database definition to all.children.<cls>.pg_databases, then create the database as follows:

bin/pgsql-db <cls> <dbname>       # ./pgsql-db.yml -l <cls> -e dbname=<dbname>

Note: If the database specifies an owner, that user should already exist; otherwise you’ll have to Create User first.
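For reference, a database definition in the inventory might look like the following snippet; the database name, owner, and extension are hypothetical placeholders:

pg-test:
  vars:
    pg_databases:                      # business databases of cluster pg-test
      - name: app                      # hypothetical dbname, passed as <dbname> above
        owner: dbuser_app              # optional owner, this user must already exist
        extensions: [ { name: postgis } ]   # optional extensions to create in this database
        comment: business database for app

With this definition in place, bin/pgsql-db pg-test app would create that database on the cluster.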

Example: Create Business Database

asciicast


Reload Service

Services are exposed access points served by HAProxy.

This task is used when cluster membership has changed, e.g., appending/removing replicas, switchover/failover, exposing a new service, or updating an existing service’s config (e.g., LB weight).

To create new services or reload existing services on entire proxy cluster or specific instances:

bin/pgsql-svc <cls>               # pgsql.yml -l <cls> -t pg_service -e pg_reload=true
bin/pgsql-svc <cls> [ip...]       # pgsql.yml -l ip... -t pg_service -e pg_reload=true
Example: Reload PG Service to Kick one Instance

asciicast


Reload HBARule

This task is used when your Postgres/Pgbouncer HBA rules have changed; you have to reload HBA to apply the changes.

If you have any role-specific HBA rules, you may have to reload HBA after a switchover/failover, too.
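For reference, cluster-specific HBA rules are declared under pg_hba_rules in the inventory; a hypothetical rule allowing a business user to connect from the intranet might look like this (the user and database names are placeholders):

pg-test:
  vars:
    pg_hba_rules:                      # rendered in addition to the default HBA rules
      - { user: dbuser_app , db: app , addr: intra , auth: ssl , title: 'allow app user ssl access from intranet' }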

To reload postgres & pgbouncer HBA rules on entire cluster or specific instances:

bin/pgsql-hba <cls>               # pgsql.yml -l <cls> -t pg_hba,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
bin/pgsql-hba <cls> [ip...]       # pgsql.yml -l ip... -t pg_hba,pgbouncer_hba,pgbouncer_reload -e pg_reload=true
Example: Reload Cluster HBA Rules

asciicast


Config Cluster

To change the config of an existing Postgres cluster, you have to initiate the control command on the admin node with the admin user:

pg edit-config <cls>              # interactive config a cluster with patronictl

Change patroni parameters & postgresql.parameters, save & apply changes with the wizard.

Example: Config Cluster in Non-Interactive Manner

You can skip interactive mode and use the -p option to override postgres parameters, for example:

pg edit-config -p log_min_duration_statement=1000 pg-test
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain'
Example: Change Cluster Config with Patroni REST API

You can also use Patroni REST API to change the config in a non-interactive mode, for example:

$ curl -s 10.10.10.11:8008/config | jq .  # get current config
$ curl -u 'postgres:Patroni.API' \
        -d '{"postgresql":{"parameters": {"log_min_duration_statement":200}}}' \
        -s -X PATCH http://10.10.10.11:8008/config | jq .

Note: Patroni’s unsafe REST API access is limited to infra/admin nodes and protected with HTTP basic auth username/password and an optional HTTPS mode.

Example: Config Cluster with patronictl

asciicast


Append Replica

To add a new replica to the existing Postgres cluster, you have to add its definition to the inventory: all.children.<cls>.hosts, then:

bin/node-add <ip>                 # init node <ip> for the new replica               
bin/pgsql-add <cls> <ip>          # init pgsql instances on <ip> for cluster <cls>  

It will add node <ip> to pigsty and init it as a replica of the cluster <cls>.

Cluster services will be reloaded to adopt the new member.

Example: Add replica to pg-test

asciicast

For example, if you want to add a pg-test-3 / 10.10.10.13 to the existing cluster pg-test, you’ll have to update the inventory first:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary } # existing member
    10.10.10.12: { pg_seq: 2, pg_role: replica } # existing member
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- new member
  vars: { pg_cluster: pg-test }

then apply the change as follows:

bin/node-add          10.10.10.13   # add node to pigsty
bin/pgsql-add pg-test 10.10.10.13   # init new replica on 10.10.10.13 for cluster pg-test

which is similar to the cluster init but works on a single instance only.

[ OK ] init instances  10.10.10.11 to pgsql cluster 'pg-test':
[WARN]   reminder: add nodes to pigsty, then install additional module 'pgsql'
[HINT]     $ bin/node-add  10.10.10.11  # run this ahead, except infra nodes
[WARN]   init instances from cluster:
[ OK ]     $ ./pgsql.yml -l '10.10.10.11,&pg-test'
[WARN]   reload pg_service on existing instances:
[ OK ]     $ ./pgsql.yml -l 'pg-test,!10.10.10.11' -t pg_service

Remove Replica

To remove a replica from the existing PostgreSQL cluster:

bin/pgsql-rm <cls> <ip...>        # ./pgsql-rm.yml -l <ip>

It will remove instance <ip> from cluster <cls>. Cluster services will be reloaded to kick the removed instance from the load balancer.

Example: Remove replica from pg-test

asciicast

For example, if you want to remove pg-test-3 / 10.10.10.13 from the existing cluster pg-test:

bin/pgsql-rm pg-test 10.10.10.13  # remove pgsql instance 10.10.10.13 from pg-test
bin/node-rm  10.10.10.13          # remove that node from pigsty (optional)
vi pigsty.yml                     # remove instance definition from inventory
bin/pgsql-svc pg-test             # refresh pg_service on existing instances to kick removed instance from load balancer
[ OK ] remove pgsql instances from  10.10.10.13 of 'pg-test':
[WARN]   remove instances from cluster:
[ OK ]     $ ./pgsql-rm.yml -l '10.10.10.13,&pg-test'

And remove instance definition from the inventory:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
    10.10.10.12: { pg_seq: 2, pg_role: replica }
    10.10.10.13: { pg_seq: 3, pg_role: replica } # <--- remove this after execution
  vars: { pg_cluster: pg-test }

Finally, you can update pg service and kick the removed instance from load balancer:

bin/pgsql-svc pg-test             # reload pg service on pg-test

Remove Cluster

To remove the entire Postgres cluster, just run:

bin/pgsql-rm <cls>                # ./pgsql-rm.yml -l <cls>
Example: Remove Cluster

asciicast

Example: Force removing a cluster

Note: if pg_safeguard is configured for this cluster (or globally configured to true), pgsql-rm.yml will abort to avoid removing a cluster by accident.

You can use playbook command-line args to explicitly override it and force the purge:

./pgsql-rm.yml -l pg-meta -e pg_safeguard=false    # force removing pg cluster pg-meta

Switchover

You can perform a PostgreSQL cluster switchover with the patroni command.

pg switchover <cls>   # interactive mode, you can skip that with following options
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test
Example: Switchover pg-test

asciicast

$ pg switchover pg-test
Master [pg-test-1]:
Candidate ['pg-test-2', 'pg-test-3'] []: pg-test-2
When should the switchover take place (e.g. 2022-12-26T07:39 )  [now]: now
Current cluster topology
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member    | Host        | Role    | State   | TL | Lag in MB | Tags            |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Leader  | running |  1 |           | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
Are you sure you want to switchover cluster pg-test, demoting current master pg-test-1? [y/N]: y
2022-12-26 06:39:58.02468 Successfully switched over to "pg-test-2"
+ Cluster: pg-test (7181325041648035869) -----+----+-----------+-----------------+
| Member    | Host        | Role    | State   | TL | Lag in MB | Tags            |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-1 | 10.10.10.11 | Replica | stopped |    |   unknown | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-2 | 10.10.10.12 | Leader  | running |  1 |           | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+
| pg-test-3 | 10.10.10.13 | Replica | running |  1 |         0 | clonefrom: true |
|           |             |         |         |    |           | conf: tiny.yml  |
|           |             |         |         |    |           | spec: 1C.2G.50G |
|           |             |         |         |    |           | version: '15'   |
+-----------+-------------+---------+---------+----+-----------+-----------------+

To do the same via the Patroni API (e.g., schedule a switchover from pg-test-2 to pg-test-1 at a specific time):

curl -u 'postgres:Patroni.API' \
  -d '{"leader":"pg-test-2", "candidate": "pg-test-1","scheduled_at":"2022-12-26T14:47+08"}' \
  -s -X POST http://10.10.10.11:8008/switchover

Backup Cluster

To create a backup with pgBackRest, run as local dbsu:

pg-backup                         # make a postgres base backup
pg-backup full                    # make a full backup
pg-backup diff                    # make a differential backup
pg-backup incr                    # make an incremental backup
pb info                           # check backup information

Check Backup & PITR for details.

Example: Make Backups

asciicast

Example: Create routine backup crontab

You can add crontab entries to node_crontab to specify your backup policy.

# make a full backup 1 am everyday
- '00 01 * * * postgres /pg/bin/pg-backup full'

# rotate backup: make a full backup on monday 1am, and an incremental backup during weekdays
- '00 01 * * 1 postgres /pg/bin/pg-backup full'
- '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'

Restore Cluster

To restore a cluster to a previous time point (PITR), run as local dbsu:

pg-pitr -i                              # restore to the time of latest backup complete (not often used)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (in case of drop db, drop table)
pg-pitr --name="my-restore-point"       # restore TO a named restore point created by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore right BEFORE an LSN
pg-pitr --xid="1234567" -X -P           # restore right BEFORE a specific transaction id, then promote
pg-pitr --backup=latest                 # restore to latest backup set
pg-pitr --backup=20221108-105325        # restore to a specific backup set, which can be checked with pgbackrest info

Then follow the generated instructions. Check Backup & PITR for details.

Example: PITR with raw pgBackRest Command
# restore to the latest available point (e.g. hardware failure)
pgbackrest --stanza=pg-meta restore

# PITR to specific time point (e.g. drop table by accident)
pgbackrest --stanza=pg-meta --type=time --target="2022-11-08 10:58:48" \
   --target-action=promote restore

# restore specific backup point and then promote (or pause|shutdown)
pgbackrest --stanza=pg-meta --type=immediate --target-action=promote \
  --set=20221108-105325F_20221108-105938I restore

Adding Packages

To add newer versions of RPM packages, you have to add them to repo_packages and repo_url_packages.
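As a sketch, assuming you want to add a hypothetical package foo_15 and a direct-download RPM URL, the inventory change might look like this (the package name and URL are placeholders):

repo_packages:                         # append new package names to the existing list
  - foo_15                             # hypothetical package available from the upstream repos
repo_url_packages:                     # or list direct download URLs here
  - https://example.com/foo-1.0-1.el9.x86_64.rpm   # hypothetical URL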

Then rebuild the repo on infra nodes with the ./infra.yml -t repo_build subtask. After that, you can install these packages with the ansible package module:

ansible pg-test -b -m package -a "name=pg_cron_15,topn_15,pg_stat_monitor_15*"  # install some packages
Update Packages Manually
# add repo upstream on admin node, then download them manually
cd ~/pigsty; ./infra.yml -t repo_upstream,repo_cache # add upstream repo (internet)
cd /www/pigsty;  repotrack "some_new_package_name"   # download the latest RPMs

# re-create local repo on admin node, then refresh yum/apt cache on all nodes
cd ~/pigsty; ./infra.yml -t repo_create              # recreate local repo on admin node
./node.yml -t node_repo                              # refresh yum/apt cache on all nodes

# alternatives: clean and remake cache on all nodes with ansible command
ansible all -b -a 'yum clean all'                         # clean node repo cache
ansible all -b -a 'yum makecache'                         # remake cache from the new repo
ansible all -b -a 'apt clean'                             # clean node repo cache (Ubuntu/Debian)
ansible all -b -a 'apt update'                            # remake cache from the new repo (Ubuntu/Debian)

For example, you can then install or upgrade packages with:

ansible pg-test -b -m package -a "name=postgresql15* state=latest"

Install Extension

If you want to install extensions on PG clusters, add them to pg_extensions and make sure they are installed with:

./pgsql.yml -t pg_extension     # install extensions

Some extensions need to be loaded in shared_preload_libraries. You can add them to pg_libs, or Config an existing cluster.

Finally, run CREATE EXTENSION <extname>; on the cluster primary instance to install it.

Example: Install pg_cron on pg-test cluster
ansible pg-test -b -m package -a "name=pg_cron_15"          # install pg_cron packages on all nodes
# add pg_cron to shared_preload_libraries
pg edit-config --force -p shared_preload_libraries='timescaledb, pg_cron, pg_stat_statements, auto_explain'
pg restart --force pg-test                                  # restart cluster
psql -h pg-test -d postgres -c 'CREATE EXTENSION pg_cron;'  # install pg_cron on primary

Check PGSQL Extensions: Install for details.


Minor Upgrade

To perform a minor server version upgrade/downgrade, you have to add the packages to the yum/apt repo first.

Then perform a rolling upgrade/downgrade on all replicas, and finally switch over the cluster to upgrade the old leader.

ansible <cls> -b -a "yum upgrade/downgrade -y <pkg>"    # upgrade/downgrade packages
pg restart --force <cls>                                # restart cluster
Example: Downgrade PostgreSQL 15.2 to 15.1

Add 15.1 packages to yum/apt repo and refresh node package manager cache:

cd ~/pigsty; ./infra.yml -t repo_upstream               # add upstream repo backup
cd /www/pigsty; repotrack postgresql15-*-15.1           # add 15.1 packages to yum repo
cd ~/pigsty; ./infra.yml -t repo_create                 # re-create repo
ansible pg-test -b -a 'yum clean all'                   # clean node repo cache (use apt in debian/ubuntu)
ansible pg-test -b -a 'yum makecache'                   # remake yum cache from the new repo

Perform a downgrade and restart the cluster:

ansible pg-test -b -a "yum downgrade -y postgresql15*"  # downgrade packages
pg restart --force pg-test                              # restart entire cluster to finish upgrade
Example: Upgrade PostgreSQL 15.1 back to 15.2

This time we upgrade in a rolling fashion:

ansible pg-test -b -a "yum upgrade -y postgresql15*"    # upgrade packages
ansible pg-test -b -a '/usr/pgsql/bin/pg_ctl --version' # check binary version is 15.2
pg restart --role replica --force pg-test               # restart replicas
pg switchover --leader pg-test-1 --candidate=pg-test-2 --scheduled=now --force pg-test    # switchover
pg restart --role primary --force pg-test               # restart primary

Major Upgrade

The simplest way to achieve a major version upgrade is to create a new cluster with the new version, then migrate with logical replication & blue/green deployment.

You can also perform an in-place major upgrade. This is not recommended, especially when certain extensions are installed, but it is possible.

Assuming you want to upgrade PostgreSQL 14 to 15, you have to add packages to the yum/apt repo and guarantee that the extensions have the exact same versions for both majors.

./pgsql.yml -t pg_pkg -e pg_version=15                         # install packages for pg 15
sudo su - postgres; mkdir -p /data/postgres/pg-meta-15/data/   # prepare directories for 15
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ -v -c # preflight
pg_upgrade -b /usr/pgsql-14/bin/ -B /usr/pgsql-15/bin/ -d /data/postgres/pg-meta-14/data/ -D /data/postgres/pg-meta-15/data/ --link -j8 -v -c
rm -rf /usr/pgsql; ln -s /usr/pgsql-15 /usr/pgsql;             # fix binary links 
mv /data/postgres/pg-meta-14 /data/postgres/pg-meta-15         # rename data directory
rm -rf /pg; ln -s /data/postgres/pg-meta-15 /pg                # fix data dir links

5.9 - Access Control

Built-in role system and battery-included access control model in Pigsty.

Pigsty has a battery-included access control model based on Role System and Privileges.


Role System

Pigsty has a default role system consisting of four default roles and four default users:

| Role name        | Attributes  | Member of                   | Description                          |
|------------------|-------------|-----------------------------|--------------------------------------|
| dbrole_readonly  | NOLOGIN     |                             | role for global read-only access     |
| dbrole_readwrite | NOLOGIN     | dbrole_readonly             | role for global read-write access    |
| dbrole_admin     | NOLOGIN     | pg_monitor,dbrole_readwrite | role for object creation             |
| dbrole_offline   | NOLOGIN     |                             | role for restricted read-only access |
| postgres         | SUPERUSER   |                             | system superuser                     |
| replicator       | REPLICATION | pg_monitor,dbrole_readonly  | system replicator                    |
| dbuser_dba       | SUPERUSER   | dbrole_admin                | pgsql admin user                     |
| dbuser_monitor   |             | pg_monitor                  | pgsql monitor user                   |
pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator }
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

Default Roles

There are four default roles in pigsty:

  • Read Only (dbrole_readonly): Role for global read-only access
  • Read Write (dbrole_readwrite): Role for global read-write access, inherits dbrole_readonly.
  • Admin (dbrole_admin): Role for DDL commands, inherits dbrole_readwrite.
  • Offline (dbrole_offline): Role for restricted read-only access (offline instance)

Default roles are defined in pg_default_roles; changing the default roles is not recommended.

- { name: dbrole_readonly  , login: false , comment: role for global read-only access  }                            # production read-only role
- { name: dbrole_offline ,   login: false , comment: role for restricted read-only access (offline instance) }      # restricted-read-only role
- { name: dbrole_readwrite , login: false , roles: [dbrole_readonly], comment: role for global read-write access }  # production read-write role
- { name: dbrole_admin , login: false , roles: [pg_monitor, dbrole_readwrite] , comment: role for object creation } # production DDL change role

Default Users

There are four default users in pigsty, too.

  • Superuser (postgres), the owner and creator of the cluster, same as the OS dbsu.
  • Replication user (replicator), the system user used for primary-replica replication.
  • Monitor user (dbuser_monitor), a user used to monitor database and connection pool metrics.
  • Admin user (dbuser_dba), the admin user who performs daily operations and database changes.

Default users’ usernames/passwords are defined with dedicated parameters (except for the dbsu password):

!> Remember to change these passwords in production deployments!

pg_dbsu: postgres                             # os user for the database
pg_replication_username: replicator           # system replication user
pg_replication_password: DBUser.Replicator    # system replication password
pg_monitor_username: dbuser_monitor           # system monitor user
pg_monitor_password: DBUser.Monitor           # system monitor password
pg_admin_username: dbuser_dba                 # system admin user
pg_admin_password: DBUser.DBA                 # system admin password

To define extra options, specify them in pg_default_roles:

- { name: postgres     ,superuser: true                                          ,comment: system superuser }
- { name: replicator ,replication: true  ,roles: [pg_monitor, dbrole_readonly]   ,comment: system replicator }
- { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 , comment: pgsql admin user }
- { name: dbuser_monitor   ,roles: [pg_monitor, dbrole_readonly] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

Privileges

Pigsty has a battery-included privilege model that works with default roles.

  • All users have access to all schemas.
  • Read-Only user can read from all tables (SELECT, EXECUTE).
  • Read-Write user can write to all tables and run DML (INSERT, UPDATE, DELETE).
  • Admin user can create objects and run DDL (CREATE, USAGE, TRUNCATE, REFERENCES, TRIGGER).
  • Offline user is a Read-Only user with limited access on offline instances (pg_role = 'offline' or pg_offline_query = true).
  • Objects created by admin users will have the correct privileges.
  • Default privileges are installed on all databases, including template databases.
  • Database connect privilege is covered by the database definition.
  • CREATE privileges on databases & the public schema are revoked from PUBLIC by default.

Object Privilege

Default object privileges are defined in pg_default_privileges.

- GRANT USAGE      ON SCHEMAS   TO dbrole_readonly
- GRANT SELECT     ON TABLES    TO dbrole_readonly
- GRANT SELECT     ON SEQUENCES TO dbrole_readonly
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_readonly
- GRANT USAGE      ON SCHEMAS   TO dbrole_offline
- GRANT SELECT     ON TABLES    TO dbrole_offline
- GRANT SELECT     ON SEQUENCES TO dbrole_offline
- GRANT EXECUTE    ON FUNCTIONS TO dbrole_offline
- GRANT INSERT     ON TABLES    TO dbrole_readwrite
- GRANT UPDATE     ON TABLES    TO dbrole_readwrite
- GRANT DELETE     ON TABLES    TO dbrole_readwrite
- GRANT USAGE      ON SEQUENCES TO dbrole_readwrite
- GRANT UPDATE     ON SEQUENCES TO dbrole_readwrite
- GRANT TRUNCATE   ON TABLES    TO dbrole_admin
- GRANT REFERENCES ON TABLES    TO dbrole_admin
- GRANT TRIGGER    ON TABLES    TO dbrole_admin
- GRANT CREATE     ON SCHEMAS   TO dbrole_admin

Newly created objects will have the corresponding privileges when they are created by admin users.

The \ddp+ output may look like this:

   Type   | Access privileges
----------+----------------------
 function | =X
          | dbrole_readonly=X
          | dbrole_offline=X
          | dbrole_admin=X
 schema   | dbrole_readonly=U
          | dbrole_offline=U
          | dbrole_admin=UC
 sequence | dbrole_readonly=r
          | dbrole_offline=r
          | dbrole_readwrite=wU
          | dbrole_admin=rwU
 table    | dbrole_readonly=r
          | dbrole_offline=r
          | dbrole_readwrite=awd
          | dbrole_admin=arwdDxt

Default Privilege

ALTER DEFAULT PRIVILEGES allows you to set the privileges that will be applied to objects created in the future. It does not affect privileges assigned to already-existing objects, nor objects created by non-admin users.

Pigsty will use the following default privileges:

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_dbsu }} {{ priv }};
{% endfor %}

{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE {{ pg_admin_username }} {{ priv }};
{% endfor %}

-- for additional business admin, they can SET ROLE to dbrole_admin
{% for priv in pg_default_privileges %}
ALTER DEFAULT PRIVILEGES FOR ROLE "dbrole_admin" {{ priv }};
{% endfor %}

These will be rendered in pg-init-template.sql along with the ALTER DEFAULT PRIVILEGES statements for admin users.

These SQL commands will be executed on postgres & template1 during cluster bootstrap, and newly created databases will inherit them from template1 by default.

That is to say, to maintain the correct object privileges, you have to run DDL with admin users, which could be:

  1. {{ pg_dbsu }}, postgres by default
  2. {{ pg_admin_username }}, dbuser_dba by default
  3. Business admin user granted with dbrole_admin

It’s wise to use postgres as the global object owner to perform DDL changes. If you wish to create objects with a business admin user, you MUST run SET ROLE dbrole_admin before that DDL to maintain the correct privileges.
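A minimal SQL sketch of this workflow, assuming a hypothetical business admin user that has been granted dbrole_admin:

-- connect as the business admin user, then switch role before running DDL
SET ROLE dbrole_admin;                 -- so default privileges of dbrole_admin apply to new objects
CREATE TABLE some_table (id bigint PRIMARY KEY, payload text);
RESET ROLE;                            -- switch back to the login role afterwards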

You can also run ALTER DEFAULT PRIVILEGES FOR ROLE <some_biz_admin> XXX to grant default privileges to a business admin user, too.


Database Privilege

Database privilege is covered by database definition.

There are 3 database level privileges: CONNECT, CREATE, TEMP, and a special ‘privilege’: OWNERSHIP.

- name: meta         # required, `name` is the only mandatory field of a database definition
  owner: postgres    # optional, specify a database owner, {{ pg_dbsu }} by default
  allowconn: true    # optional, allow connection, true by default. false will disable connect at all
  revokeconn: false  # optional, revoke public connection privilege. false by default. (leave connect with grant option to owner)
  • If owner exists, it will be used as database owner instead of default {{ pg_dbsu }}
  • If revokeconn is false, all users have the CONNECT privilege of the database, this is the default behavior.
  • If revokeconn is set to true explicitly:
    • CONNECT privilege of the database will be revoked from PUBLIC
    • CONNECT privilege will be granted to {{ pg_replication_username }}, {{ pg_monitor_username }} and {{ pg_admin_username }}
    • CONNECT privilege will be granted to database owner with GRANT OPTION

The revokeconn flag can be used for database access isolation: create a different business user as the owner of each database and set the revokeconn option for all of them.

Example: Database Isolation
pg-infra:
  hosts:
    10.10.10.40: { pg_seq: 1, pg_role: primary }
    10.10.10.41: { pg_seq: 2, pg_role: replica , pg_offline_query: true }
  vars:
    pg_cluster: pg-infra
    pg_users:
      - { name: dbuser_confluence, password: mc2iohos , pgbouncer: true, roles: [ dbrole_admin ] }
      - { name: dbuser_gitlab, password: sdf23g22sfdd , pgbouncer: true, roles: [ dbrole_readwrite ] }
      - { name: dbuser_jira, password: sdpijfsfdsfdfs , pgbouncer: true, roles: [ dbrole_admin ] }
    pg_databases:
      - { name: confluence , revokeconn: true, owner: dbuser_confluence , connlimit: 100 }
      - { name: gitlab , revokeconn: true, owner: dbuser_gitlab, connlimit: 100 }
      - { name: jira , revokeconn: true, owner: dbuser_jira , connlimit: 100 }

Create Privilege

Pigsty revokes the CREATE privilege on databases from PUBLIC by default for security considerations, which is also the default behavior since PostgreSQL 15.

The database owner has the full capability to adjust these privileges as they see fit.
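For example, a database owner (or superuser) could grant CREATE back with plain SQL; the database and role names below are hypothetical:

GRANT CREATE ON SCHEMA public TO dbrole_readwrite;   -- allow the read-write role to create objects in public
GRANT CREATE ON DATABASE app TO dbuser_app;          -- allow a business user to create schemas in database app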

5.10 - Backup & PITR

How to perform base backup & PITR with pgBackRest?

Pigsty uses pgBackRest for PITR backup & restore.

In the case of a hardware failure, a physical replica failover could be the best choice. Whereas for data corruption scenarios (whether machine or human errors), Point-in-Time Recovery (PITR) is often more appropriate.

Backup

Use the following command to perform the backup:

# stanza name = {{ pg_cluster }} by default
pgbackrest --stanza=${stanza} --type=full|diff|incr backup

# you can also use the following command in pigsty (/pg/bin/pg-backup)
pg-backup       # make a backup, incr, or full backup if necessary
pg-backup full  # make a full backup
pg-backup diff  # make a differential backup
pg-backup incr  # make an incremental backup

Use the following command to print backup info:

pb info  # print backup info

You can also acquire backup info from the monitoring system: PGCAT Instance - Backup

Backup Info Example
$ pb info
stanza: pg-meta
    status: ok
    cipher: none

    db (current)
        wal archive min/max (14): 000000010000000000000001/000000010000000000000023

        full backup: 20221108-105325F
            timestamp start/stop: 2022-11-08 10:53:25 / 2022-11-08 10:53:29
            wal start/stop: 000000010000000000000004 / 000000010000000000000004
            database size: 96.6MB, database backup size: 96.6MB
            repo1: backup set size: 18.9MB, backup size: 18.9MB

        incr backup: 20221108-105325F_20221108-105938I
            timestamp start/stop: 2022-11-08 10:59:38 / 2022-11-08 10:59:41
            wal start/stop: 00000001000000000000000F / 00000001000000000000000F
            database size: 246.7MB, database backup size: 167.3MB
            repo1: backup set size: 35.4MB, backup size: 20.4MB
            backup reference list: 20221108-105325F

Restore

Use the following command to perform restore

pg-pitr                                 # restore to wal archive stream end (e.g. used in case of entire DC failure)
pg-pitr -i                              # restore to the time of latest backup complete (not often used)
pg-pitr --time="2022-12-30 14:44:44+08" # restore to specific time point (in case of drop db, drop table)
pg-pitr --name="my-restore-point"       # restore TO a named restore point created by pg_create_restore_point
pg-pitr --lsn="0/7C82CB8" -X            # restore right BEFORE an LSN
pg-pitr --xid="1234567" -X -P           # restore right BEFORE a specific transaction id, then promote
pg-pitr --backup=latest                 # restore to latest backup set
pg-pitr --backup=20221108-105325        # restore to a specific backup set, which can be checked with pgbackrest info

pg-pitr                                 # pgbackrest --stanza=pg-meta restore
pg-pitr -i                              # pgbackrest --stanza=pg-meta --type=immediate restore
pg-pitr -t "2022-12-30 14:44:44+08"     # pgbackrest --stanza=pg-meta --type=time --target="2022-12-30 14:44:44+08" restore
pg-pitr -n "my-restore-point"           # pgbackrest --stanza=pg-meta --type=name --target=my-restore-point restore
pg-pitr -b 20221108-105325F             # pgbackrest --stanza=pg-meta --type=immediate --set=20221108-105325F restore
pg-pitr -l "0/7C82CB8" -X               # pgbackrest --stanza=pg-meta --type=lsn --target="0/7C82CB8" --target-exclusive restore
pg-pitr -x 1234567 -X -P                # pgbackrest --stanza=pg-meta --type=xid --target="1234567" --target-exclusive --target-action=promote restore

The pg-pitr script will generate instructions for you to perform PITR.

For example, if you wish to roll back the current cluster to "2023-02-07 12:38:00+08":

$ pg-pitr -t "2023-02-07 12:38:00+08"
pgbackrest --stanza=pg-meta --type=time --target='2023-02-07 12:38:00+08' restore
Perform time PITR on pg-meta
[1. Stop PostgreSQL] ===========================================
   1.1 Pause Patroni (if there are any replicas)
       $ pg pause <cls>  # pause patroni auto failover
   1.2 Shutdown Patroni
       $ pt-stop         # sudo systemctl stop patroni
   1.3 Shutdown Postgres
       $ pg-stop         # pg_ctl -D /pg/data stop -m fast

[2. Perform PITR] ===========================================
   2.1 Restore Backup
       $ pgbackrest --stanza=pg-meta --type=time --target='2023-02-07 12:38:00+08' restore
   2.2 Start PG to Replay WAL
       $ pg-start        # pg_ctl -D /pg/data start
   2.3 Validate and Promote
     - If database content is ok, promote it to finish recovery, otherwise goto 2.1
       $ pg-promote      # pg_ctl -D /pg/data promote

[3. Restart Patroni] ===========================================
   3.1 Start Patroni
       $ pt-start;        # sudo systemctl start patroni
   3.2 Enable Archive Again
       $ psql -c 'ALTER SYSTEM SET archive_mode = on; SELECT pg_reload_conf();'
   3.3 Restart Patroni
       $ pt-restart      # sudo systemctl start patroni

[4. Restore Cluster] ===========================================
   3.1 Re-Init All Replicas (if any replicas)
       $ pg reinit <cls> <ins>
   3.2 Resume Patroni
       $ pg resume <cls> # resume patroni auto failover
   3.2 Make Full Backup (optional)
       $ pg-backup full  # pgbackrest --stanza=pg-meta backup --type=full

Policy

You can customize your backup policy with node_crontab and pgbackrest_repo

local repo

For example, the default pg-meta will take a full backup every day at 1 am.

node_crontab:  # make a full backup 1 am everyday
  - '00 01 * * * postgres /pg/bin/pg-backup full'

With the default local repo retention policy, it will keep at most two full backups and temporarily allow three during backup.

pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  local:                          # default pgbackrest repo with local posix fs
    path: /pg/backup              # local backup directory, `/pg/backup` by default
    retention_full_type: count    # retention full backups by count
    retention_full: 2             # keep 2, at most 3 full backup when using local fs repo

Your backup disk storage should be at least 3x the database file size, plus 3 days of WAL archive. For example, a 100 GB database that generates 10 GB of WAL per day would need roughly 330 GB of backup storage.

MinIO repo

When using MinIO, storage capacity is usually not a problem. You can keep backups as long as you want.

For example, the default pg-test will take a full backup on Monday and incr backup on other weekdays.

node_crontab:  # full backup on Monday 1 am, incremental backup on other weekdays
  - '00 01 * * 1 postgres /pg/bin/pg-backup full'
  - '00 01 * * 2,3,4,5,6,7 postgres /pg/bin/pg-backup'

And with a 14-day time retention policy, backups from the last two weeks will be kept. But beware: this only guarantees one week of PITR coverage, since recovering to a point in time requires a full backup taken before that point, and full backups are only taken weekly.

pgbackrest_repo:                  # pgbackrest repo: https://pgbackrest.org/configuration.html#section-repository
  minio:                          # optional minio repo for pgbackrest
    type: s3                      # minio is s3-compatible, so s3 is used
    s3_endpoint: sss.pigsty       # minio endpoint domain name, `sss.pigsty` by default
    s3_region: us-east-1          # minio region, us-east-1 by default, useless for minio
    s3_bucket: pgsql              # minio bucket name, `pgsql` by default
    s3_key: pgbackrest            # minio user access key for pgbackrest
    s3_key_secret: S3User.Backup  # minio user secret key for pgbackrest
    s3_uri_style: path            # use path style uri for minio rather than host style
    path: /pgbackrest             # minio backup path, default is `/pgbackrest`
    storage_port: 9000            # minio port, 9000 by default
    storage_ca_file: /etc/pki/ca.crt  # minio ca file path, `/etc/pki/ca.crt` by default
    bundle: y                     # bundle small files into a single file
    cipher_type: aes-256-cbc      # enable AES encryption for remote backup repo
    cipher_pass: pgBackRest       # AES encryption password, default is 'pgBackRest'
    retention_full_type: time     # retention full backup by time on minio repo
    retention_full: 14            # keep full backup for last 14 days

5.11 - Migration

How to migrate an existing Postgres cluster into a Pigsty-managed cluster with minimal downtime? The blue-green online migration playbook.

Pigsty has a built-in playbook pgsql-migration.yml to perform online database migration based on logical replication.

With proper automation, the downtime could be minimized to several seconds. But beware that logical replication requires PostgreSQL 10+ to work. You can still use the facility here and use a pg_dump | psql instead of logical replication.


Define a Migration Task

You have to create a migration task definition file to use this playbook.

Check files/migration/pg-meta.yml for example.

It will try to migrate the pg-meta.meta to pg-test.test.

pg-meta-1	10.10.10.10  --> pg-test-1	10.10.10.11 (10.10.10.12,10.10.10.13)

You have to tell Pigsty where the source and destination clusters are, which database to migrate, and the primary IP addresses.

You should have superuser privileges on both sides to proceed.

You can override the superuser connection to the source cluster with src_pg, and the logical replication connection string with sub_conn. Otherwise, Pigsty’s default admin & replicator credentials will be used.

---
#-----------------------------------------------------------------
# PG_MIGRATION
#-----------------------------------------------------------------
context_dir: ~/migration           # migration manuals & scripts
#-----------------------------------------------------------------
# SRC Cluster (The OLD Cluster)
#-----------------------------------------------------------------
src_cls: pg-meta      # src cluster name         <REQUIRED>
src_db: meta          # src database name        <REQUIRED>
src_ip: 10.10.10.10   # src cluster primary ip   <REQUIRED>
#src_pg: ''            # if defined, use this as src dbsu pgurl instead of:
#                      # postgres://{{ pg_admin_username }}@{{ src_ip }}/{{ src_db }}
#                      # e.g. 'postgres://dbuser_dba:[email protected]:5432/meta'
#sub_conn: ''          # if defined, use this as subscription connstr instead of:
#                      # host={{ src_ip }} dbname={{ src_db }} user={{ pg_replication_username }}'
#                      # e.g. 'host=10.10.10.10 dbname=meta user=replicator password=DBUser.Replicator'
#-----------------------------------------------------------------
# DST Cluster (The New Cluster)
#-----------------------------------------------------------------
dst_cls: pg-test      # dst cluster name         <REQUIRED>
dst_db: test          # dst database name        <REQUIRED>
dst_ip: 10.10.10.11   # dst cluster primary ip   <REQUIRED>
#dst_pg: ''            # if defined, use this as dst dbsu pgurl instead of:
#                      # postgres://{{ pg_admin_username }}@{{ dst_ip }}/{{ dst_db }}
#                      # e.g. 'postgres://dbuser_dba:[email protected]:5432/test'
#-----------------------------------------------------------------
# PGSQL
#-----------------------------------------------------------------
pg_dbsu: postgres
pg_replication_username: replicator
pg_replication_password: DBUser.Replicator
pg_admin_username: dbuser_dba
pg_admin_password: DBUser.DBA
pg_monitor_username: dbuser_monitor
pg_monitor_password: DBUser.Monitor
#-----------------------------------------------------------------
...

Generate Migration Plan

The playbook does not migrate src to dst, but it will generate everything you need to do so.
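The exact invocation may vary with your environment, but running the playbook against the task definition file typically looks something like this (assuming the task file above is saved as files/migration/pg-meta.yml):

./pgsql-migration.yml -e @files/migration/pg-meta.yml   # generate migration scripts under the context dir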

After the execution, you will find the migration context dir under ~/migration/pg-meta.meta by default.

Follow the README.md and execute these scripts one by one to perform the migration:

# this script will setup migration context with env vars
. ~/migration/pg-meta.meta/activate

# these scripts are used for check src cluster status
# and help generating new cluster definition in pigsty
./check-user     # check src users
./check-db       # check src databases
./check-hba      # check src hba rules
./check-repl     # check src replica identities
./check-misc     # check src special objects

# these scripts are used for building logical replication
# between existing src cluster and pigsty managed dst cluster
# schema, data will be synced in realtime, except for sequences
./copy-schema    # copy schema to dest
./create-pub     # create publication on src
./create-sub     # create subscription on dst
./copy-progress  # print logical replication progress
./copy-diff      # quick src & dst diff by counting tables

# these scripts will run in an online migration, which will
# stop src cluster, copy sequence numbers (which is not synced with logical replication)
# you have to reroute you app traffic according to your access method (dns,vip,haproxy,pgbouncer,etc...)
# then perform cleanup to drop subscription and publication
./copy-seq [n]   # sync sequence numbers; if n is given, an additional shift will be applied
#./disable-src   # restrict src cluster access to admin node & new cluster (YOUR IMPLEMENTATION)
#./re-routing    # ROUTING APPLICATION TRAFFIC FROM SRC TO DST!            (YOUR IMPLEMENTATION)
./drop-sub       # drop subscription on dst after migration
./drop-pub       # drop publication on src after migration

Caveats

You can use ./copy-seq 1000 to advance all sequences by a number (e.g. 1000) after syncing them, which may prevent potential serial primary key conflicts in the new cluster.

You have to implement your own ./re-routing script to route your application traffic from src to dst, since we don’t know how your traffic is routed (e.g., DNS, VIP, haproxy, or pgbouncer). Of course, you can always do that by hand.

You have to implement your own ./disable-src script to restrict the src cluster. You can do that by changing HBA rules & reload (recommended), or just shutting down postgres, pgbouncer, or haproxy…

5.12 - Monitoring

How PostgreSQL monitoring works, and how to monitor remote (existing) PostgreSQL instances?

Overview

Pigsty uses the modern observability stack for PostgreSQL monitoring:

  • Grafana for metrics visualization and PostgreSQL datasource.
  • Prometheus for PostgreSQL / Pgbouncer / Patroni / HAProxy / Node metrics
  • Loki for PostgreSQL / Pgbouncer / Patroni / pgBackRest logs
  • Battery-included dashboards for PostgreSQL and everything else

Metrics

PostgreSQL’s metrics are defined by the collector file pg_exporter.yml. Prometheus recording rules and alert evaluation will further process them: files/prometheus/rules/pgsql.yml

There are three identity labels: cls, ins, ip, which will be attached to all metrics & logs. node & haproxy will try to reuse the same identity to provide consistent metrics & logs.

{ cls: pg-meta, ins: pg-meta-1, ip: 10.10.10.10 }
{ cls: pg-test, ins: pg-test-1, ip: 10.10.10.11 }
{ cls: pg-test, ins: pg-test-2, ip: 10.10.10.12 }
{ cls: pg-test, ins: pg-test-3, ip: 10.10.10.13 }

Logs

PostgreSQL-related logs are collected by promtail and sent to Loki on infra nodes by default.

Targets

Prometheus monitoring targets are defined in static files under /etc/prometheus/targets/pgsql/. Each instance will have a corresponding file. Take pg-meta-1 as an example:

# pg-meta-1 [primary] @ 10.10.10.10
- labels: { cls: pg-meta, ins: pg-meta-1, ip: 10.10.10.10 }
  targets:
    - 10.10.10.10:9630    # <--- pg_exporter for PostgreSQL metrics
    - 10.10.10.10:9631    # <--- pg_exporter for Pgbouncer metrics
    - 10.10.10.10:8008    # <--- patroni metrics

When the global flag patroni_ssl_enabled is set, the patroni target will be managed as /etc/prometheus/targets/patroni/<ins>.yml because it requires a different scrape endpoint (https).

Prometheus monitoring target will be removed when a cluster is removed by bin/pgsql-rm or pgsql-rm.yml. You can use playbook subtasks, or remove them manually:

bin/pgmon-rm <ins>      # remove prometheus targets from all infra nodes

Remote RDS targets are managed as /etc/prometheus/targets/pgrds/<cls>.yml. They are created by the pgsql-monitor.yml playbook or the bin/pgmon-add script.
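Such a remote RDS target file is expected to follow the same label scheme as the pgsql targets above; a minimal sketch (cluster name, port, and addresses are hypothetical) might look like:

# /etc/prometheus/targets/pgrds/pg-foo.yml
- labels: { cls: pg-foo, ins: pg-foo-1, ip: 10.10.10.10 }
  targets: [ 10.10.10.10:20001 ]    # <--- pg_exporter running on the infra node, scraping the remote RDS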


Monitor Mode

There are three ways to monitor PostgreSQL instances in Pigsty:

| Item \ Level               | L1                                 | L2                              | L3                              |
|----------------------------|------------------------------------|---------------------------------|---------------------------------|
| Name                       | Remote Database Service            | Existing Deployment             | Fully Managed Deployment        |
| Abbr                       | RDS                                | MANAGED                         | FULL                            |
| Scenes                     | connect string URL only            | ssh-sudo-able                   | Instances created by Pigsty     |
| PGCAT Functionality        | ✅ Full Availability               | ✅ Full Availability            | ✅ Full Availability            |
| PGSQL Functionality        | ✅ PG metrics only                 | ✅ PG and node metrics          | ✅ Full Support                 |
| Connection Pool Metrics    | ❌ Not available                   | ⚠️ Optional                     | ✅ Pre-Configured               |
| Load Balancer Metrics      | ❌ Not available                   | ⚠️ Optional                     | ✅ Pre-Configured               |
| PGLOG Functionality        | ❌ Not Available                   | ⚠️ Optional                     | ⚠️ Optional                     |
| PG Exporter                | ⚠️ On infra nodes                  | ✅ On DB nodes                  | ✅ On DB nodes                  |
| Node Exporter              | ❌ Not Deployed                    | ✅ On DB nodes                  | ✅ On DB nodes                  |
| Intrusion into DB nodes    | ✅ Non-Intrusive                   | ⚠️ Installing Exporter          | ⚠️ Fully Managed by Pigsty      |
| Instance Already Exists    | ✅ Yes                             | ✅ Yes                          | ⚠️ Created by Pigsty            |
| Monitoring users and views | ⚠️ Manually Setup                  | ⚠️ Manually Setup               | ✅ Auto configured              |
| Deployment Usage Playbook  | bin/pgmon-add <cls>                | subtasks of pgsql.yml/node.yml  | pgsql.yml                       |
| Required Privileges        | connectable PGURL from infra nodes | DB node ssh and sudo privileges | DB node ssh and sudo privileges |
| Function Overview          | PGCAT + PGRDS                      | Most Functionality              | Full Functionality              |

Monitor Existing Cluster

Suppose the target DB node can be managed by Pigsty (accessible via ssh and sudo is available). In that case, you can use the pg_exporter task in the pgsql.yml playbook to deploy the monitoring component PG Exporter on the target node in the same manner as a standard deployment.

You can also deploy the connection pool and its monitoring on existing instance nodes using the pgbouncer and pgbouncer_exporter tasks from the same playbook. Additionally, you can deploy host monitoring, load balancing, and log collection components using the node_exporter, haproxy, and promtail tasks from the node.yml playbook, achieving a similar user experience with the native Pigsty cluster.

The definition method for existing clusters is very similar to the normal clusters managed by Pigsty. Selectively run certain tasks from the pgsql.yml playbook instead of running the entire playbook.

./node.yml  -l <cls> -t node_repo,node_pkg           # Add the local repo from INFRA nodes on the target nodes and install packages.
./node.yml  -l <cls> -t node_exporter,node_register  # Configure host monitoring and add to Prometheus.
./node.yml  -l <cls> -t promtail                     # Configure host log collection and send to Loki.
./pgsql.yml -l <cls> -t pg_exporter,pg_register      # Configure PostgreSQL monitoring and register with Prometheus/Grafana.

Since the target database cluster already exists, you must manually set up monitoring users, schemas, and extensions on it.


Monitor RDS

If you can only access the target database via PGURL (database connection string), you can refer to the instructions here for configuration. In this mode, Pigsty deploys the corresponding PG Exporter on the INFRA node to fetch metrics from the remote database, as shown below:

------ infra ------
|                 |
|   prometheus    |            v---- pg-foo-1 ----v
|       ^         |  metrics   |         ^        |
|   pg_exporter <-|------------|----  postgres    |
|   (port: 20001) |            | 10.10.10.10:5432 |
|       ^         |            ^------------------^
|       ^         |                      ^
|       ^         |            v---- pg-foo-2 ----v
|       ^         |  metrics   |         ^        |
|   pg_exporter <-|------------|----  postgres    |
|   (port: 20002) |            | 10.10.10.11:5433 |
-------------------            ^------------------^

The monitoring system will no longer have host/pooler/load balancer metrics, but the PostgreSQL metrics & catalog info are still available. Pigsty has two dedicated dashboards for that: PGRDS Cluster and PGRDS Instance. Overview and Database level dashboards are reused. Since Pigsty cannot manage your RDS, you have to set up monitoring on the target database in advance.

Below, we use a sandbox environment as an example: now we assume that the pg-meta cluster is an RDS instance pg-foo-1 to be monitored, and the pg-test cluster is an RDS cluster pg-bar to be monitored:

  1. Create monitoring schemas, users, and permissions on the target. Refer to Monitoring Object Configuration for details.

  2. Declare the cluster in the configuration list. For example, suppose we want to monitor the “remote” pg-meta & pg-test clusters:

    infra:            # Infra cluster for proxies, monitoring, alerts, etc.
      hosts: { 10.10.10.10: { infra_seq: 1 } }
      vars:           # Install pg_exporter on 'infra' group for remote postgres RDS
        pg_exporters: # List all remote instances here, assign a unique unused local port for k
          20001: { pg_cluster: pg-foo, pg_seq: 1, pg_host: 10.10.10.10 , pg_databases: [{ name: meta }] } # Register meta database as Grafana data source
    
          20002: { pg_cluster: pg-bar, pg_seq: 1, pg_host: 10.10.10.11 , pg_port: 5432 } # Several different connection string concatenation methods
          20003: { pg_cluster: pg-bar, pg_seq: 2, pg_host: 10.10.10.12 , pg_exporter_url: 'postgres://dbuser_monitor:[email protected]:5432/postgres?sslmode=disable'}
          20004: { pg_cluster: pg-bar, pg_seq: 3, pg_host: 10.10.10.13 , pg_monitor_username: dbuser_monitor, pg_monitor_password: DBUser.Monitor }
    

    The databases listed in the pg_databases field will be registered in Grafana as a PostgreSQL data source, providing data support for the PGCAT monitoring panel. If you don’t want to use PGCAT and register the database in Grafana, set pg_databases to an empty array or leave it blank.

    pigsty-monitor.jpg

  3. Execute the command to add monitoring: bin/pgmon-add <clsname>

    bin/pgmon-add pg-foo  # Bring the pg-foo cluster into monitoring
    bin/pgmon-add pg-bar  # Bring the pg-bar cluster into monitoring
    
  4. To remove a remote cluster from monitoring, use bin/pgmon-rm <clsname>

    bin/pgmon-rm pg-foo  # Remove pg-foo from Pigsty monitoring
    bin/pgmon-rm pg-bar  # Remove pg-bar from Pigsty monitoring
    

You can use more parameters to override the default pg_exporter options. Here is an example for monitoring Aliyun RDS and PolarDB with Pigsty:

Example: Monitor Aliyun RDS PG & PolarDB

Check remote.yml config for details.

infra:            # infra cluster for proxy, monitor, alert, etc..
  hosts: { 10.10.10.10: { infra_seq: 1 } }
  vars:           # install pg_exporter for remote postgres RDS on a group 'infra'
    pg_exporters: # list all remote instances here, alloc a unique unused local port as k
      20001: { pg_cluster: pg-foo, pg_seq: 1, pg_host: 10.10.10.10 }
      20002: { pg_cluster: pg-bar, pg_seq: 1, pg_host: 10.10.10.11 , pg_port: 5432 }
      20003: { pg_cluster: pg-bar, pg_seq: 2, pg_host: 10.10.10.12 , pg_exporter_url: 'postgres://dbuser_monitor:[email protected]:5432/postgres?sslmode=disable'}
      20004: { pg_cluster: pg-bar, pg_seq: 3, pg_host: 10.10.10.13 , pg_monitor_username: dbuser_monitor, pg_monitor_password: DBUser.Monitor }

      20011:
        pg_cluster: pg-polar                        # RDS Cluster Name (Identity, Explicitly Assigned, used as 'cls')
        pg_seq: 1                                   # RDS Instance Seq (Identity, Explicitly Assigned, used as part of 'ins')
        pg_host: pxx.polardbpg.rds.aliyuncs.com     # RDS Host Address
        pg_port: 1921                               # RDS Port
        pg_exporter_include_database: 'test'        # Only monitoring database in this list
        pg_monitor_username: dbuser_monitor         # monitor username, overwrite default
        pg_monitor_password: DBUser_Monitor         # monitor password, overwrite default
        pg_databases: [{ name: test }]              # database to be added to grafana datasource

      20012:
        pg_cluster: pg-polar                        # RDS Cluster Name (Identity, Explicitly Assigned, used as 'cls')
        pg_seq: 2                                   # RDS Instance Seq (Identity, Explicitly Assigned, used as part of 'ins')
        pg_host: pe-xx.polarpgmxs.rds.aliyuncs.com  # RDS Host Address
        pg_port: 1521                               # RDS Port
        pg_databases: [{ name: test }]              # database to be added to grafana datasource

      20014:
        pg_cluster: pg-rds
        pg_seq: 1
        pg_host: pgm-xx.pg.rds.aliyuncs.com
        pg_port: 5432
        pg_exporter_auto_discovery: true
        pg_exporter_include_database: 'rds'
        pg_monitor_username: dbuser_monitor
        pg_monitor_password: DBUser_Monitor
        pg_databases: [ { name: rds } ]

      20015:
        pg_cluster: pg-rdsha
        pg_seq: 1
        pg_host: pgm-2xx8wu.pg.rds.aliyuncs.com
        pg_port: 5432
        pg_exporter_auto_discovery: true
        pg_exporter_include_database: 'rds'
        pg_databases: [{ name: test }, {name: rds}]

      20016:
        pg_cluster: pg-rdsha
        pg_seq: 2
        pg_host: pgr-xx.pg.rds.aliyuncs.com
        pg_exporter_auto_discovery: true
        pg_exporter_include_database: 'rds'
        pg_databases: [{ name: test }, {name: rds}]

Monitor Setup

When you want to monitor existing instances, whether they are RDS or self-built PostgreSQL instances, you need to perform some configuration on the target database so that Pigsty can access them.

To bring an external, existing PostgreSQL instance into monitoring, you need a connection string that can access that instance/cluster. Any accessible connection string (business user, superuser) will work, but we recommend using a dedicated monitoring user to avoid privilege leaks. You can verify the setup with the sanity-check queries after the following checklist.

  • Monitor User: The default username is dbuser_monitor. This user should belong to the pg_monitor role, or otherwise have the view privileges required for monitoring.
  • Monitor HBA: The default password is DBUser.Monitor. Make sure the HBA policy allows the monitoring user to access databases from the infra/admin nodes.
  • Monitor Schema: Optional but recommended: create a dedicated schema monitor for monitoring views and extensions.
  • Monitor Extension: It is strongly recommended to enable the built-in extension pg_stat_statements.
  • Monitor View: Monitoring views are optional but recommended, as they provide additional metrics.
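
A minimal sanity check, run as the monitoring user against the target instance, can confirm basic access (this is only a sketch; it assumes the monitor user described in the following subsections is already in place):

SELECT pg_is_in_recovery();               -- basic connectivity check, works for any role
SELECT count(*) FROM pg_stat_activity;    -- full row visibility requires pg_monitor membership

If these queries succeed from an infra/admin node, the pg_exporter running there should be able to scrape this instance.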

Monitor User

Create a monitor user on the target database cluster. For example, dbuser_monitor is used by default in Pigsty.

CREATE USER dbuser_monitor;                                       -- create the monitor user
COMMENT ON ROLE dbuser_monitor IS 'system monitor user';          -- comment the monitor user
GRANT pg_monitor TO dbuser_monitor;                               -- grant system role pg_monitor to monitor user

ALTER USER dbuser_monitor PASSWORD 'DBUser.Monitor';              -- set password for monitor user
ALTER USER dbuser_monitor SET log_min_duration_statement = 1000;  -- set this to avoid log flooding
ALTER USER dbuser_monitor SET search_path = monitor,public;       -- ensure objects in the monitor schema (e.g. pg_stat_statements) can be found

The monitor user's name and password here should be consistent with pg_monitor_username and pg_monitor_password in the Pigsty config inventory.


Monitor HBA

You also need to configure pg_hba.conf to allow monitoring user access from infra/admin nodes.

# allow local role monitor with password
local   all  dbuser_monitor                    md5
host    all  dbuser_monitor  127.0.0.1/32      md5
host    all  dbuser_monitor  <admin_ip>/32     md5
host    all  dbuser_monitor  <infra_ip>/32     md5

If your RDS does not support editing raw HBA rules, add the admin/infra node IPs to its access whitelist instead.


Monitor Schema

Monitor schema is optional, but we strongly recommend creating one.

CREATE SCHEMA IF NOT EXISTS monitor;               -- create a dedicated monitor schema
GRANT USAGE ON SCHEMA monitor TO dbuser_monitor;   -- allow monitor user to use this schema

Monitor Extension

Monitor extension is optional, but we strongly recommend enabling the pg_stat_statements extension.

Note that this extension must be listed in shared_preload_libraries to take effect, and changing this parameter requires a database restart.

CREATE EXTENSION IF NOT EXISTS "pg_stat_statements" WITH SCHEMA "monitor";

You should create this extension inside the admin database postgres. If your RDS does not grant CREATE on the postgres database, you can create the extension in the default public schema instead:

CREATE EXTENSION IF NOT EXISTS "pg_stat_statements";
ALTER USER dbuser_monitor SET search_path = monitor,public;

As long as your monitor user can access pg_stat_statements view without schema qualification, it should be fine.
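
To double-check, assuming you can connect as a superuser (or any role allowed to SET ROLE to the monitor user), the following queries verify that the module is preloaded and that the monitor user can reach the view without schema qualification:

SHOW shared_preload_libraries;            -- should include pg_stat_statements (changing it requires a restart)
SET ROLE dbuser_monitor;                  -- switch to the monitor user (requires membership or superuser)
SELECT count(*) FROM pg_stat_statements;  -- should succeed without schema qualification
RESET ROLE;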


Monitor View

It’s recommended to create the monitor views in all databases that need to be monitored.

Monitor Schema & View Definition
----------------------------------------------------------------------
-- Table bloat estimate : monitor.pg_table_bloat
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_table_bloat CASCADE;
CREATE OR REPLACE VIEW monitor.pg_table_bloat AS
SELECT CURRENT_CATALOG AS datname, nspname, relname , tblid , bs * tblpages AS size,
       CASE WHEN tblpages - est_tblpages_ff > 0 THEN (tblpages - est_tblpages_ff)/tblpages::FLOAT ELSE 0 END AS ratio
FROM (
         SELECT ceil( reltuples / ( (bs-page_hdr)*fillfactor/(tpl_size*100) ) ) + ceil( toasttuples / 4 ) AS est_tblpages_ff,
                tblpages, fillfactor, bs, tblid, nspname, relname, is_na
         FROM (
                  SELECT
                      ( 4 + tpl_hdr_size + tpl_data_size + (2 * ma)
                          - CASE WHEN tpl_hdr_size % ma = 0 THEN ma ELSE tpl_hdr_size % ma END
                          - CASE WHEN ceil(tpl_data_size)::INT % ma = 0 THEN ma ELSE ceil(tpl_data_size)::INT % ma END
                          ) AS tpl_size, (heappages + toastpages) AS tblpages, heappages,
                      toastpages, reltuples, toasttuples, bs, page_hdr, tblid, nspname, relname, fillfactor, is_na
                  FROM (
                           SELECT
                               tbl.oid AS tblid, ns.nspname , tbl.relname, tbl.reltuples,
                               tbl.relpages AS heappages, coalesce(toast.relpages, 0) AS toastpages,
                               coalesce(toast.reltuples, 0) AS toasttuples,
                               coalesce(substring(array_to_string(tbl.reloptions, ' ') FROM 'fillfactor=([0-9]+)')::smallint, 100) AS fillfactor,
                               current_setting('block_size')::numeric AS bs,
                               CASE WHEN version()~'mingw32' OR version()~'64-bit|x86_64|ppc64|ia64|amd64' THEN 8 ELSE 4 END AS ma,
                               24 AS page_hdr,
                               23 + CASE WHEN MAX(coalesce(s.null_frac,0)) > 0 THEN ( 7 + count(s.attname) ) / 8 ELSE 0::int END
                                   + CASE WHEN bool_or(att.attname = 'oid' and att.attnum < 0) THEN 4 ELSE 0 END AS tpl_hdr_size,
                               sum( (1-coalesce(s.null_frac, 0)) * coalesce(s.avg_width, 0) ) AS tpl_data_size,
                               bool_or(att.atttypid = 'pg_catalog.name'::regtype)
                                   OR sum(CASE WHEN att.attnum > 0 THEN 1 ELSE 0 END) <> count(s.attname) AS is_na
                           FROM pg_attribute AS att
                                    JOIN pg_class AS tbl ON att.attrelid = tbl.oid
                                    JOIN pg_namespace AS ns ON ns.oid = tbl.relnamespace
                                    LEFT JOIN pg_stats AS s ON s.schemaname=ns.nspname AND s.tablename = tbl.relname AND s.inherited=false AND s.attname=att.attname
                                    LEFT JOIN pg_class AS toast ON tbl.reltoastrelid = toast.oid
                           WHERE NOT att.attisdropped AND tbl.relkind = 'r' AND nspname NOT IN ('pg_catalog','information_schema')
                           GROUP BY 1,2,3,4,5,6,7,8,9,10
                       ) AS s
              ) AS s2
     ) AS s3
WHERE NOT is_na;
COMMENT ON VIEW monitor.pg_table_bloat IS 'postgres table bloat estimate';

GRANT SELECT ON monitor.pg_table_bloat TO pg_monitor;

----------------------------------------------------------------------
-- Index bloat estimate : monitor.pg_index_bloat
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_index_bloat CASCADE;
CREATE OR REPLACE VIEW monitor.pg_index_bloat AS
SELECT CURRENT_CATALOG AS datname, nspname, idxname AS relname, tblid, idxid, relpages::BIGINT * bs AS size,
       COALESCE((relpages - ( reltuples * (6 + ma - (CASE WHEN index_tuple_hdr % ma = 0 THEN ma ELSE index_tuple_hdr % ma END)
                                               + nulldatawidth + ma - (CASE WHEN nulldatawidth % ma = 0 THEN ma ELSE nulldatawidth % ma END))
                                  / (bs - pagehdr)::FLOAT  + 1 )), 0) / relpages::FLOAT AS ratio
FROM (
         SELECT nspname,idxname,indrelid AS tblid,indexrelid AS idxid,
                reltuples,relpages,
                current_setting('block_size')::INTEGER                                                               AS bs,
                (CASE WHEN version() ~ 'mingw32' OR version() ~ '64-bit|x86_64|ppc64|ia64|amd64' THEN 8 ELSE 4 END)  AS ma,
                24                                                                                                   AS pagehdr,
                (CASE WHEN max(COALESCE(pg_stats.null_frac, 0)) = 0 THEN 2 ELSE 6 END)                               AS index_tuple_hdr,
                sum((1.0 - COALESCE(pg_stats.null_frac, 0.0)) *
                    COALESCE(pg_stats.avg_width, 1024))::INTEGER                                                     AS nulldatawidth
         FROM pg_attribute
                  JOIN (
             SELECT pg_namespace.nspname,
                    ic.relname                                                   AS idxname,
                    ic.reltuples,
                    ic.relpages,
                    pg_index.indrelid,
                    pg_index.indexrelid,
                    tc.relname                                                   AS tablename,
                    regexp_split_to_table(pg_index.indkey::TEXT, ' ') :: INTEGER AS attnum,
                    pg_index.indexrelid                                          AS index_oid
             FROM pg_index
                      JOIN pg_class ic ON pg_index.indexrelid = ic.oid
                      JOIN pg_class tc ON pg_index.indrelid = tc.oid
                      JOIN pg_namespace ON pg_namespace.oid = ic.relnamespace
                      JOIN pg_am ON ic.relam = pg_am.oid
             WHERE pg_am.amname = 'btree' AND ic.relpages > 0 AND nspname NOT IN ('pg_catalog', 'information_schema')
         ) ind_atts ON pg_attribute.attrelid = ind_atts.indexrelid AND pg_attribute.attnum = ind_atts.attnum
                  JOIN pg_stats ON pg_stats.schemaname = ind_atts.nspname
             AND ((pg_stats.tablename = ind_atts.tablename AND pg_stats.attname = pg_get_indexdef(pg_attribute.attrelid, pg_attribute.attnum, TRUE))
                 OR (pg_stats.tablename = ind_atts.idxname AND pg_stats.attname = pg_attribute.attname))
         WHERE pg_attribute.attnum > 0
         GROUP BY 1, 2, 3, 4, 5, 6
     ) est;
COMMENT ON VIEW monitor.pg_index_bloat IS 'postgres index bloat estimate (btree-only)';

GRANT SELECT ON monitor.pg_index_bloat TO pg_monitor;

----------------------------------------------------------------------
-- Relation Bloat : monitor.pg_bloat
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_bloat CASCADE;
CREATE OR REPLACE VIEW monitor.pg_bloat AS
SELECT coalesce(ib.datname, tb.datname)                                                   AS datname,
       coalesce(ib.nspname, tb.nspname)                                                   AS nspname,
       coalesce(ib.tblid, tb.tblid)                                                       AS tblid,
       coalesce(tb.nspname || '.' || tb.relname, ib.nspname || '.' || ib.tblid::RegClass) AS tblname,
       tb.size                                                                            AS tbl_size,
       CASE WHEN tb.ratio < 0 THEN 0 ELSE round(tb.ratio::NUMERIC, 6) END                 AS tbl_ratio,
       (tb.size * (CASE WHEN tb.ratio < 0 THEN 0 ELSE tb.ratio::NUMERIC END)) ::BIGINT    AS tbl_wasted,
       ib.idxid,
       ib.nspname || '.' || ib.relname                                                    AS idxname,
       ib.size                                                                            AS idx_size,
       CASE WHEN ib.ratio < 0 THEN 0 ELSE round(ib.ratio::NUMERIC, 5) END                 AS idx_ratio,
       (ib.size * (CASE WHEN ib.ratio < 0 THEN 0 ELSE ib.ratio::NUMERIC END)) ::BIGINT    AS idx_wasted
FROM monitor.pg_index_bloat ib
         FULL OUTER JOIN monitor.pg_table_bloat tb ON ib.tblid = tb.tblid;

COMMENT ON VIEW monitor.pg_bloat IS 'postgres relation bloat detail';
GRANT SELECT ON monitor.pg_bloat TO pg_monitor;

----------------------------------------------------------------------
-- monitor.pg_index_bloat_human
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_index_bloat_human CASCADE;
CREATE OR REPLACE VIEW monitor.pg_index_bloat_human AS
SELECT idxname                            AS name,
       tblname,
       idx_wasted                         AS wasted,
       pg_size_pretty(idx_size)           AS idx_size,
       round(100 * idx_ratio::NUMERIC, 2) AS idx_ratio,
       pg_size_pretty(idx_wasted)         AS idx_wasted,
       pg_size_pretty(tbl_size)           AS tbl_size,
       round(100 * tbl_ratio::NUMERIC, 2) AS tbl_ratio,
       pg_size_pretty(tbl_wasted)         AS tbl_wasted
FROM monitor.pg_bloat
WHERE idxname IS NOT NULL;
COMMENT ON VIEW monitor.pg_index_bloat_human IS 'postgres index bloat info in human-readable format';
GRANT SELECT ON monitor.pg_index_bloat_human TO pg_monitor;


----------------------------------------------------------------------
-- monitor.pg_table_bloat_human
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_table_bloat_human CASCADE;
CREATE OR REPLACE VIEW monitor.pg_table_bloat_human AS
SELECT tblname                                          AS name,
       idx_wasted + tbl_wasted                          AS wasted,
       pg_size_pretty(idx_wasted + tbl_wasted)          AS all_wasted,
       pg_size_pretty(tbl_wasted)                       AS tbl_wasted,
       pg_size_pretty(tbl_size)                         AS tbl_size,
       tbl_ratio,
       pg_size_pretty(idx_wasted)                       AS idx_wasted,
       pg_size_pretty(idx_size)                         AS idx_size,
       round(idx_wasted::NUMERIC * 100.0 / idx_size, 2) AS idx_ratio
FROM (SELECT datname,
             nspname,
             tblname,
             coalesce(max(tbl_wasted), 0)                         AS tbl_wasted,
             coalesce(max(tbl_size), 1)                           AS tbl_size,
             round(100 * coalesce(max(tbl_ratio), 0)::NUMERIC, 2) AS tbl_ratio,
             coalesce(sum(idx_wasted), 0)                         AS idx_wasted,
             coalesce(sum(idx_size), 1)                           AS idx_size
      FROM monitor.pg_bloat
      WHERE tblname IS NOT NULL
      GROUP BY 1, 2, 3
     ) d;
COMMENT ON VIEW monitor.pg_table_bloat_human IS 'postgres table bloat info in human-readable format';
GRANT SELECT ON monitor.pg_table_bloat_human TO pg_monitor;


----------------------------------------------------------------------
-- Activity Overview: monitor.pg_session
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_session CASCADE;
CREATE OR REPLACE VIEW monitor.pg_session AS
SELECT coalesce(datname, 'all') AS datname, numbackends, active, idle, ixact, max_duration, max_tx_duration, max_conn_duration
FROM (
         SELECT datname,
                count(*)                                         AS numbackends,
                count(*) FILTER ( WHERE state = 'active' )       AS active,
                count(*) FILTER ( WHERE state = 'idle' )         AS idle,
                count(*) FILTER ( WHERE state = 'idle in transaction'
                    OR state = 'idle in transaction (aborted)' ) AS ixact,
                max(extract(epoch from now() - state_change))
                FILTER ( WHERE state = 'active' )                AS max_duration,
                max(extract(epoch from now() - xact_start))      AS max_tx_duration,
                max(extract(epoch from now() - backend_start))   AS max_conn_duration
         FROM pg_stat_activity
         WHERE backend_type = 'client backend'
           AND pid <> pg_backend_pid()
         GROUP BY ROLLUP (1)
         ORDER BY 1 NULLS FIRST
     ) t;
COMMENT ON VIEW monitor.pg_session IS 'postgres activity group by session';
GRANT SELECT ON monitor.pg_session TO pg_monitor;


----------------------------------------------------------------------
-- Sequential Scan: monitor.pg_seq_scan
----------------------------------------------------------------------
DROP VIEW IF EXISTS monitor.pg_seq_scan CASCADE;
CREATE OR REPLACE VIEW monitor.pg_seq_scan AS
SELECT schemaname                                                        AS nspname,
       relname,
       seq_scan,
       seq_tup_read,
       seq_tup_read / seq_scan                                           AS seq_tup_avg,
       idx_scan,
       n_live_tup + n_dead_tup                                           AS tuples,
       round(n_live_tup * 100.0::NUMERIC / (n_live_tup + n_dead_tup), 2) AS live_ratio
FROM pg_stat_user_tables
WHERE seq_scan > 0
  and (n_live_tup + n_dead_tup) > 0
ORDER BY seq_scan DESC;
COMMENT ON VIEW monitor.pg_seq_scan IS 'table that have seq scan';
GRANT SELECT ON monitor.pg_seq_scan TO pg_monitor;

Shmem allocation for PostgreSQL 13+

DROP FUNCTION IF EXISTS monitor.pg_shmem() CASCADE;
CREATE OR REPLACE FUNCTION monitor.pg_shmem() RETURNS SETOF
    pg_shmem_allocations AS $$ SELECT * FROM pg_shmem_allocations;$$ LANGUAGE SQL SECURITY DEFINER;
COMMENT ON FUNCTION monitor.pg_shmem() IS 'security wrapper for system view pg_shmem';
REVOKE ALL ON FUNCTION monitor.pg_shmem() FROM PUBLIC;
GRANT EXECUTE ON FUNCTION monitor.pg_shmem() TO pg_monitor;
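
With these objects in place, the PGCAT dashboards can query them directly, and you can also inspect them ad hoc. A few illustrative queries (view and column names taken from the definitions above):

SELECT * FROM monitor.pg_table_bloat_human ORDER BY wasted DESC LIMIT 10;  -- most bloated tables
SELECT * FROM monitor.pg_index_bloat_human ORDER BY wasted DESC LIMIT 10;  -- most bloated indexes
SELECT * FROM monitor.pg_session;                                          -- per-database session overview
SELECT * FROM monitor.pg_shmem() ORDER BY size DESC LIMIT 10;              -- shared memory allocations (PG 13+)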

5.13 - Dashboards

Grafana dashboards provided by Pigsty

Grafana Dashboards for PostgreSQL clusters: Demo & Gallery.

pigsty-dashboard.jpg

There are 26 default Grafana dashboards for PostgreSQL, organized into 4 levels (Overview / Cluster / Instance / Database) and categorized into PGSQL, PGCAT & PGLOG by data source.

Overview        | Cluster           | Instance        | Database
PGSQL Overview  | PGSQL Cluster     | PGSQL Instance  | PGSQL Database
PGSQL Alert     | PGRDS Cluster     | PGRDS Instance  | PGCAT Database
PGSQL Shard     | PGSQL Activity    | PGCAT Instance  | PGSQL Tables
                | PGSQL Replication | PGSQL Persist   | PGSQL Table
                | PGSQL Service     | PGSQL Proxy     | PGCAT Table
                | PGSQL Databases   | PGSQL Pgbouncer | PGSQL Query
                | PGSQL Patroni     | PGSQL Session   | PGCAT Query
                |                   | PGSQL Xacts     | PGCAT Locks
                |                   | PGSQL Exporter  | PGCAT Schema

Overview

  • pgsql-overview : The main dashboard for PGSQL module
  • pgsql-alert : Global PGSQL key metrics and alerting events
  • pgsql-shard : Overview of a horizontally sharded PGSQL cluster, e.g. a citus / gpsql cluster

Cluster

  • pgsql-cluster: The main dashboard for a PGSQL cluster
  • pgrds-cluster: The PGSQL Cluster dashboard for RDS, focusing on PostgreSQL metrics only.
  • pgsql-activity: Cares about the Session/Load/QPS/TPS/Locks of a PGSQL cluster
  • pgsql-replication: Cares about PGSQL cluster replication, slots, and pub/sub.
  • pgsql-service: Cares about PGSQL cluster services, proxies, routes, and load balancers.
  • pgsql-databases: Cares about database CRUD, slow queries, and table statistics across all instances.
  • pgsql-patroni: Cares about cluster HA agent: patroni status.

Instance

  • pgsql-instance: The main dashboard for a single PGSQL instance
  • pgrds-instance: The PGSQL Instance dashboard for RDS, focusing on PostgreSQL metrics only.
  • pgcat-instance: Instance information from database catalog directly
  • pgsql-persist: Metrics about persistence: WAL, XID, Checkpoint, Archive, IO
  • pgsql-proxy: Metrics about haproxy the service provider
  • pgsql-queries: Overview of all queries in a single instance
  • pgsql-session: Metrics about sessions and active/idle time in a single instance
  • pgsql-xacts: Metrics about transactions, locks, queries, etc…
  • pgsql-exporter: Postgres & Pgbouncer exporter self monitoring metrics

Database

  • pgsql-database: The main dashboard for a single PGSQL database
  • pgcat-database: Database information from database catalog directly
  • pgsql-tables : Table/Index access metrics inside a single database
  • pgsql-table: Detailed information (QPS/RT/Index/Seq…) about a single table
  • pgcat-table: Detailed information (Stats/Bloat/…) about a single table from database catalog directly
  • pgsql-query: Detailed information (QPS/RT) about a single query
  • pgcat-query: Detailed information (SQL/Stats) about a single query from database catalog directly

Overview

PGSQL Overview : The main dashboard for PGSQL module

PGSQL Overview

pgsql-overview.jpg

PGSQL Alert : Global PGSQL key metrics and alerting events

PGSQL Alert

pgsql-alert.jpg

PGSQL Shard : Overview of a horizontally sharded PGSQL cluster, e.g. a CITUS / GPSQL cluster

PGSQL Shard

pgsql-shard.jpg


Cluster

PGSQL Cluster: The main dashboard for a PGSQL cluster

PGSQL Cluster

pgsql-cluster.jpg

PGRDS Cluster: The PGSQL Cluster dashboard for RDS, focusing on PostgreSQL metrics only.

PGRDS Cluster

pgrds-cluster.jpg

PGSQL Service: Cares about PGSQL cluster services, proxies, routes, and load balancers.

PGSQL Service

pgsql-service.jpg

PGSQL Activity: Cares about the Session/Load/QPS/TPS/Locks of a PGSQL cluster

PGSQL Activity

pgsql-activity.jpg

PGSQL Replication: Cares about PGSQL cluster replication, slots, and pub/sub.

PGSQL Replication

pgsql-replication.jpg

PGSQL Databases: Cares about database CRUD, slow queries, and table statistics across all instances.

PGSQL Databases

pgsql-databases.jpg

PGSQL Patroni: Cares about cluster HA agent: patroni status.

PGSQL Patroni

pgsql-patroni.jpg


Instance

PGSQL Instance: The main dashboard for a single PGSQL instance

PGSQL Instance

pgsql-instance.jpg

PGRDS Instance: The PGSQL Instance dashboard for RDS, focusing on PostgreSQL metrics only.

PGRDS Instance

pgrds-instance.jpg

PGSQL Proxy: Metrics about haproxy the service provider

PGSQL Proxy

pgsql-proxy.jpg

PGSQL Pgbouncer: Metrics about one single pgbouncer connection pool instance

PGSQL Pgbouncer

pgsql-pgbouncer.jpg

PGSQL Persist: Metrics about persistence: WAL, XID, Checkpoint, Archive, IO

PGSQL Persist

pgsql-persist.jpg

PGSQL Xacts: Metrics about transactions, locks, queries, etc…

PGSQL Xacts

pgsql-xacts.jpg

PGSQL Session: Metrics about sessions and active/idle time in a single instance

PGSQL Session

pgsql-session.jpg

PGSQL Exporter: Postgres & Pgbouncer exporter self monitoring metrics

PGSQL Exporter

pgsql-exporter.jpg


Database

PGSQL Database: The main dashboard for a single PGSQL database

PGSQL Database

pgsql-database.jpg

PGSQL Tables : Table/Index access metrics inside a single database

PGSQL Tables

pgsql-tables.jpg

PGSQL Table: Detailed information (QPS/RT/Index/Seq…) about a single table

PGSQL Table

pgsql-table.jpg

PGSQL Query: Detailed information (QPS/RT) about a single query

PGSQL Query

pgsql-query.jpg


PGCAT

PGCAT Instance: Instance information from database catalog directly

PGCAT Instance

pgcat-instance.jpg

PGCAT Database: Database information from database catalog directly

PGCAT Database

pgcat-database.jpg

PGCAT Schema: Detailed information about one single schema from database catalog directly

PGCAT Schema

pgcat-schema.jpg

PGCAT Table: Detailed information about one single table from database catalog directly

PGCAT Table

pgcat-table.jpg

PGCAT Query: Detailed information about one single type of query from database catalog directly

PGCAT Query

pgcat-query.jpg

PGCAT Locks: Detailed information about live locks & activity from database catalog directly

PGCAT Locks

pgcat-locks.jpg


PGLOG

PGLOG Overview: Overview of the CSV log sample in the pigsty meta database

PGLOG Overview

pglog-overview.jpg

PGLOG Session: Detail of a single session from the CSV log sample in the pigsty meta database

PGLOG Session

pglog-session.jpg


5.14 - Metrics

Pigsty PGSQL module metric list

The PGSQL module has 638 available metrics.

Metric Name Type Labels Description
ALERTS Unknown category, job, level, ins, severity, ip, alertname, alertstate, instance, cls N/A
ALERTS_FOR_STATE Unknown category, job, level, ins, severity, ip, alertname, instance, cls N/A
cls:pressure1 Unknown job, cls N/A
cls:pressure15 Unknown job, cls N/A
cls:pressure5 Unknown job, cls N/A
go_gc_duration_seconds summary job, ins, ip, instance, quantile, cls A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown job, ins, ip, instance, cls N/A
go_gc_duration_seconds_sum Unknown job, ins, ip, instance, cls N/A
go_goroutines gauge job, ins, ip, instance, cls Number of goroutines that currently exist.
go_info gauge version, job, ins, ip, instance, cls Information about the Go environment.
go_memstats_alloc_bytes gauge job, ins, ip, instance, cls Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total counter job, ins, ip, instance, cls Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge job, ins, ip, instance, cls Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter job, ins, ip, instance, cls Total number of frees.
go_memstats_gc_sys_bytes gauge job, ins, ip, instance, cls Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge job, ins, ip, instance, cls Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge job, ins, ip, instance, cls Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge job, ins, ip, instance, cls Number of heap bytes that are in use.
go_memstats_heap_objects gauge job, ins, ip, instance, cls Number of allocated objects.
go_memstats_heap_released_bytes gauge job, ins, ip, instance, cls Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge job, ins, ip, instance, cls Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge job, ins, ip, instance, cls Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter job, ins, ip, instance, cls Total number of pointer lookups.
go_memstats_mallocs_total counter job, ins, ip, instance, cls Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge job, ins, ip, instance, cls Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge job, ins, ip, instance, cls Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge job, ins, ip, instance, cls Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge job, ins, ip, instance, cls Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge job, ins, ip, instance, cls Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge job, ins, ip, instance, cls Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge job, ins, ip, instance, cls Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge job, ins, ip, instance, cls Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge job, ins, ip, instance, cls Number of bytes obtained from system.
go_threads gauge job, ins, ip, instance, cls Number of OS threads created.
ins:pressure1 Unknown job, ins, ip, cls N/A
ins:pressure15 Unknown job, ins, ip, cls N/A
ins:pressure5 Unknown job, ins, ip, cls N/A
patroni_cluster_unlocked gauge job, ins, ip, instance, cls, scope Value is 1 if the cluster is unlocked, 0 if locked.
patroni_dcs_last_seen gauge job, ins, ip, instance, cls, scope Epoch timestamp when DCS was last contacted successfully by Patroni.
patroni_failsafe_mode_is_active gauge job, ins, ip, instance, cls, scope Value is 1 if failsafe mode is active, 0 if inactive.
patroni_is_paused gauge job, ins, ip, instance, cls, scope Value is 1 if auto failover is disabled, 0 otherwise.
patroni_master gauge job, ins, ip, instance, cls, scope Value is 1 if this node is the leader, 0 otherwise.
patroni_pending_restart gauge job, ins, ip, instance, cls, scope Value is 1 if the node needs a restart, 0 otherwise.
patroni_postgres_in_archive_recovery gauge job, ins, ip, instance, cls, scope Value is 1 if Postgres is replicating from archive, 0 otherwise.
patroni_postgres_running gauge job, ins, ip, instance, cls, scope Value is 1 if Postgres is running, 0 otherwise.
patroni_postgres_server_version gauge job, ins, ip, instance, cls, scope Version of Postgres (if running), 0 otherwise.
patroni_postgres_streaming gauge job, ins, ip, instance, cls, scope Value is 1 if Postgres is streaming, 0 otherwise.
patroni_postgres_timeline counter job, ins, ip, instance, cls, scope Postgres timeline of this node (if running), 0 otherwise.
patroni_postmaster_start_time gauge job, ins, ip, instance, cls, scope Epoch seconds since Postgres started.
patroni_primary gauge job, ins, ip, instance, cls, scope Value is 1 if this node is the leader, 0 otherwise.
patroni_replica gauge job, ins, ip, instance, cls, scope Value is 1 if this node is a replica, 0 otherwise.
patroni_standby_leader gauge job, ins, ip, instance, cls, scope Value is 1 if this node is the standby_leader, 0 otherwise.
patroni_sync_standby gauge job, ins, ip, instance, cls, scope Value is 1 if this node is a sync standby replica, 0 otherwise.
patroni_up Unknown job, ins, ip, instance, cls N/A
patroni_version gauge job, ins, ip, instance, cls, scope Patroni semver without periods.
patroni_xlog_location counter job, ins, ip, instance, cls, scope Current location of the Postgres transaction log, 0 if this node is not the leader.
patroni_xlog_paused gauge job, ins, ip, instance, cls, scope Value is 1 if the Postgres xlog is paused, 0 otherwise.
patroni_xlog_received_location counter job, ins, ip, instance, cls, scope Current location of the received Postgres transaction log, 0 if this node is not a replica.
patroni_xlog_replayed_location counter job, ins, ip, instance, cls, scope Current location of the replayed Postgres transaction log, 0 if this node is not a replica.
patroni_xlog_replayed_timestamp gauge job, ins, ip, instance, cls, scope Current timestamp of the replayed Postgres transaction log, 0 if null.
pg:cls:active_backends Unknown job, cls N/A
pg:cls:active_time_rate15m Unknown job, cls N/A
pg:cls:active_time_rate1m Unknown job, cls N/A
pg:cls:active_time_rate5m Unknown job, cls N/A
pg:cls:age Unknown job, cls N/A
pg:cls:buf_alloc_rate1m Unknown job, cls N/A
pg:cls:buf_clean_rate1m Unknown job, cls N/A
pg:cls:buf_flush_backend_rate1m Unknown job, cls N/A
pg:cls:buf_flush_checkpoint_rate1m Unknown job, cls N/A
pg:cls:cpu_count Unknown job, cls N/A
pg:cls:cpu_usage Unknown job, cls N/A
pg:cls:cpu_usage_15m Unknown job, cls N/A
pg:cls:cpu_usage_1m Unknown job, cls N/A
pg:cls:cpu_usage_5m Unknown job, cls N/A
pg:cls:db_size Unknown job, cls N/A
pg:cls:file_size Unknown job, cls N/A
pg:cls:ixact_backends Unknown job, cls N/A
pg:cls:ixact_time_rate1m Unknown job, cls N/A
pg:cls:lag_bytes Unknown job, cls N/A
pg:cls:lag_seconds Unknown job, cls N/A
pg:cls:leader Unknown job, ins, ip, instance, cls N/A
pg:cls:load1 Unknown job, cls N/A
pg:cls:load15 Unknown job, cls N/A
pg:cls:load5 Unknown job, cls N/A
pg:cls:lock_count Unknown job, cls N/A
pg:cls:locks Unknown job, cls, mode N/A
pg:cls:log_size Unknown job, cls N/A
pg:cls:lsn_rate1m Unknown job, cls N/A
pg:cls:members Unknown job, ins, ip, cls N/A
pg:cls:num_backends Unknown job, cls N/A
pg:cls:partition Unknown job, cls N/A
pg:cls:receiver Unknown state, slot_name, job, appname, ip, cls, sender_host, sender_port N/A
pg:cls:rlock_count Unknown job, cls N/A
pg:cls:saturation1 Unknown job, cls N/A
pg:cls:saturation15 Unknown job, cls N/A
pg:cls:saturation5 Unknown job, cls N/A
pg:cls:sender Unknown pid, usename, address, job, ins, appname, ip, cls N/A
pg:cls:session_time_rate1m Unknown job, cls N/A
pg:cls:size Unknown job, cls N/A
pg:cls:slot_count Unknown job, cls N/A
pg:cls:slot_retained_bytes Unknown job, cls N/A
pg:cls:standby_count Unknown job, cls N/A
pg:cls:sync_state Unknown job, cls N/A
pg:cls:timeline Unknown job, cls N/A
pg:cls:tup_deleted_rate1m Unknown job, cls N/A
pg:cls:tup_fetched_rate1m Unknown job, cls N/A
pg:cls:tup_inserted_rate1m Unknown job, cls N/A
pg:cls:tup_modified_rate1m Unknown job, cls N/A
pg:cls:tup_returned_rate1m Unknown job, cls N/A
pg:cls:wal_size Unknown job, cls N/A
pg:cls:xact_commit_rate15m Unknown job, cls N/A
pg:cls:xact_commit_rate1m Unknown job, cls N/A
pg:cls:xact_commit_rate5m Unknown job, cls N/A
pg:cls:xact_rollback_rate15m Unknown job, cls N/A
pg:cls:xact_rollback_rate1m Unknown job, cls N/A
pg:cls:xact_rollback_rate5m Unknown job, cls N/A
pg:cls:xact_total_rate15m Unknown job, cls N/A
pg:cls:xact_total_rate1m Unknown job, cls N/A
pg:cls:xact_total_sigma15m Unknown job, cls N/A
pg:cls:xlock_count Unknown job, cls N/A
pg:db:active_backends Unknown datname, job, ins, ip, instance, cls N/A
pg:db:active_time_rate15m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:active_time_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:active_time_rate5m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:age Unknown datname, job, ins, ip, instance, cls N/A
pg:db:age_deriv1h Unknown datname, job, ins, ip, instance, cls N/A
pg:db:age_exhaust Unknown datname, job, ins, ip, instance, cls N/A
pg:db:blk_io_time_seconds_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:blk_read_time_seconds_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:blk_write_time_seconds_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:blks_access_1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:blks_hit_1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:blks_hit_ratio1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:blks_read_1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:conn_limit Unknown datname, job, ins, ip, instance, cls N/A
pg:db:conn_usage Unknown datname, job, ins, ip, instance, cls N/A
pg:db:db_size Unknown datname, job, ins, ip, instance, cls N/A
pg:db:ixact_backends Unknown datname, job, ins, ip, instance, cls N/A
pg:db:ixact_time_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:lock_count Unknown datname, job, ins, ip, instance, cls N/A
pg:db:num_backends Unknown datname, job, ins, ip, instance, cls N/A
pg:db:rlock_count Unknown datname, job, ins, ip, instance, cls N/A
pg:db:session_time_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:temp_bytes_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:temp_files_1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:tup_deleted_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:tup_fetched_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:tup_inserted_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:tup_modified_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:tup_returned_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:wlock_count Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_commit_rate15m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_commit_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_commit_rate5m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_rollback_rate15m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_rollback_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_rollback_rate5m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_total_rate15m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_total_rate1m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_total_rate5m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xact_total_sigma15m Unknown datname, job, ins, ip, instance, cls N/A
pg:db:xlock_count Unknown datname, job, ins, ip, instance, cls N/A
pg:env:active_backends Unknown job N/A
pg:env:active_time_rate15m Unknown job N/A
pg:env:active_time_rate1m Unknown job N/A
pg:env:active_time_rate5m Unknown job N/A
pg:env:age Unknown job N/A
pg:env:cpu_count Unknown job N/A
pg:env:cpu_usage Unknown job N/A
pg:env:cpu_usage_15m Unknown job N/A
pg:env:cpu_usage_1m Unknown job N/A
pg:env:cpu_usage_5m Unknown job N/A
pg:env:ixact_backends Unknown job N/A
pg:env:ixact_time_rate1m Unknown job N/A
pg:env:lag_bytes Unknown job N/A
pg:env:lag_seconds Unknown job N/A
pg:env:lsn_rate1m Unknown job N/A
pg:env:session_time_rate1m Unknown job N/A
pg:env:tup_deleted_rate1m Unknown job N/A
pg:env:tup_fetched_rate1m Unknown job N/A
pg:env:tup_inserted_rate1m Unknown job N/A
pg:env:tup_modified_rate1m Unknown job N/A
pg:env:tup_returned_rate1m Unknown job N/A
pg:env:xact_commit_rate15m Unknown job N/A
pg:env:xact_commit_rate1m Unknown job N/A
pg:env:xact_commit_rate5m Unknown job N/A
pg:env:xact_rollback_rate15m Unknown job N/A
pg:env:xact_rollback_rate1m Unknown job N/A
pg:env:xact_rollback_rate5m Unknown job N/A
pg:env:xact_total_rate15m Unknown job N/A
pg:env:xact_total_rate1m Unknown job N/A
pg:env:xact_total_sigma15m Unknown job N/A
pg:ins:active_backends Unknown job, ins, ip, instance, cls N/A
pg:ins:active_time_rate15m Unknown job, ins, ip, instance, cls N/A
pg:ins:active_time_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:active_time_rate5m Unknown job, ins, ip, instance, cls N/A
pg:ins:age Unknown job, ins, ip, instance, cls N/A
pg:ins:blks_hit_ratio1m Unknown job, ins, ip, instance, cls N/A
pg:ins:buf_alloc_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:buf_clean_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:buf_flush_backend_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:buf_flush_checkpoint_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:ckpt_1h Unknown job, ins, ip, instance, cls N/A
pg:ins:ckpt_req_1m Unknown job, ins, ip, instance, cls N/A
pg:ins:ckpt_timed_1m Unknown job, ins, ip, instance, cls N/A
pg:ins:conn_limit Unknown job, ins, ip, instance, cls N/A
pg:ins:conn_usage Unknown job, ins, ip, instance, cls N/A
pg:ins:cpu_count Unknown job, ins, ip, instance, cls N/A
pg:ins:cpu_usage Unknown job, ins, ip, instance, cls N/A
pg:ins:cpu_usage_15m Unknown job, ins, ip, instance, cls N/A
pg:ins:cpu_usage_1m Unknown job, ins, ip, instance, cls N/A
pg:ins:cpu_usage_5m Unknown job, ins, ip, instance, cls N/A
pg:ins:db_size Unknown job, ins, ip, instance, cls N/A
pg:ins:file_size Unknown job, ins, ip, instance, cls N/A
pg:ins:fs_size Unknown job, ins, ip, instance, cls N/A
pg:ins:is_leader Unknown job, ins, ip, instance, cls N/A
pg:ins:ixact_backends Unknown job, ins, ip, instance, cls N/A
pg:ins:ixact_time_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:lag_bytes Unknown job, ins, ip, instance, cls N/A
pg:ins:lag_seconds Unknown job, ins, ip, instance, cls N/A
pg:ins:load1 Unknown job, ins, ip, instance, cls N/A
pg:ins:load15 Unknown job, ins, ip, instance, cls N/A
pg:ins:load5 Unknown job, ins, ip, instance, cls N/A
pg:ins:lock_count Unknown job, ins, ip, instance, cls N/A
pg:ins:locks Unknown job, ins, ip, mode, instance, cls N/A
pg:ins:log_size Unknown job, ins, ip, instance, cls N/A
pg:ins:lsn_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:mem_size Unknown job, ins, ip, instance, cls N/A
pg:ins:num_backends Unknown job, ins, ip, instance, cls N/A
pg:ins:rlock_count Unknown job, ins, ip, instance, cls N/A
pg:ins:saturation1 Unknown job, ins, ip, cls N/A
pg:ins:saturation15 Unknown job, ins, ip, cls N/A
pg:ins:saturation5 Unknown job, ins, ip, cls N/A
pg:ins:session_time_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:slot_retained_bytes Unknown job, ins, ip, instance, cls N/A
pg:ins:space_usage Unknown job, ins, ip, instance, cls N/A
pg:ins:status Unknown job, ins, ip, instance, cls N/A
pg:ins:sync_state Unknown job, ins, instance, cls N/A
pg:ins:target_count Unknown job, cls, ins N/A
pg:ins:timeline Unknown job, ins, ip, instance, cls N/A
pg:ins:tup_deleted_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:tup_fetched_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:tup_inserted_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:tup_modified_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:tup_returned_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:wal_size Unknown job, ins, ip, instance, cls N/A
pg:ins:wlock_count Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_commit_rate15m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_commit_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_commit_rate5m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_rollback_rate15m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_rollback_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_rollback_rate5m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_total_rate15m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_total_rate1m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_total_rate5m Unknown job, ins, ip, instance, cls N/A
pg:ins:xact_total_sigma15m Unknown job, ins, ip, instance, cls N/A
pg:ins:xlock_count Unknown job, ins, ip, instance, cls N/A
pg:query:call_rate1m Unknown datname, query, job, ins, ip, instance, cls N/A
pg:query:rt_1m Unknown datname, query, job, ins, ip, instance, cls N/A
pg:table:scan_rate1m Unknown datname, relname, job, ins, ip, instance, cls N/A
pg_activity_count gauge datname, state, job, ins, ip, instance, cls Count of connection among (datname,state)
pg_activity_max_conn_duration gauge datname, state, job, ins, ip, instance, cls Max backend session duration since state change among (datname, state)
pg_activity_max_duration gauge datname, state, job, ins, ip, instance, cls Max duration since last state change among (datname, state)
pg_activity_max_tx_duration gauge datname, state, job, ins, ip, instance, cls Max transaction duration since state change among (datname, state)
pg_archiver_failed_count counter job, ins, ip, instance, cls Number of failed attempts for archiving WAL files
pg_archiver_finish_count counter job, ins, ip, instance, cls Number of WAL files that have been successfully archived
pg_archiver_last_failed_time counter job, ins, ip, instance, cls Time of the last failed archival operation
pg_archiver_last_finish_time counter job, ins, ip, instance, cls Time of the last successful archive operation
pg_archiver_reset_time gauge job, ins, ip, instance, cls Time at which archive statistics were last reset
pg_backend_count gauge type, job, ins, ip, instance, cls Database backend process count by backend_type
pg_bgwriter_buffers_alloc counter job, ins, ip, instance, cls Number of buffers allocated
pg_bgwriter_buffers_backend counter job, ins, ip, instance, cls Number of buffers written directly by a backend
pg_bgwriter_buffers_backend_fsync counter job, ins, ip, instance, cls Number of times a backend had to execute its own fsync call
pg_bgwriter_buffers_checkpoint counter job, ins, ip, instance, cls Number of buffers written during checkpoints
pg_bgwriter_buffers_clean counter job, ins, ip, instance, cls Number of buffers written by the background writer
pg_bgwriter_checkpoint_sync_time counter job, ins, ip, instance, cls Total amount of time that has been spent in the portion of checkpoint processing where files are synchronized to disk, in seconds
pg_bgwriter_checkpoint_write_time counter job, ins, ip, instance, cls Total amount of time that has been spent in the portion of checkpoint processing where files are written to disk, in seconds
pg_bgwriter_checkpoints_req counter job, ins, ip, instance, cls Number of requested checkpoints that have been performed
pg_bgwriter_checkpoints_timed counter job, ins, ip, instance, cls Number of scheduled checkpoints that have been performed
pg_bgwriter_maxwritten_clean counter job, ins, ip, instance, cls Number of times the background writer stopped a cleaning scan because it had written too many buffers
pg_bgwriter_reset_time counter job, ins, ip, instance, cls Time at which bgwriter statistics were last reset
pg_boot_time gauge job, ins, ip, instance, cls unix timestamp when postmaster boot
pg_checkpoint_checkpoint_lsn counter job, ins, ip, instance, cls Latest checkpoint location
pg_checkpoint_elapse gauge job, ins, ip, instance, cls Seconds elapsed since latest checkpoint in seconds
pg_checkpoint_full_page_writes gauge job, ins, ip, instance, cls Latest checkpoint’s full_page_writes enabled
pg_checkpoint_newest_commit_ts_xid counter job, ins, ip, instance, cls Latest checkpoint’s newestCommitTsXid
pg_checkpoint_next_multi_offset counter job, ins, ip, instance, cls Latest checkpoint’s NextMultiOffset
pg_checkpoint_next_multixact_id counter job, ins, ip, instance, cls Latest checkpoint’s NextMultiXactId
pg_checkpoint_next_oid counter job, ins, ip, instance, cls Latest checkpoint’s NextOID
pg_checkpoint_next_xid counter job, ins, ip, instance, cls Latest checkpoint’s NextXID xid
pg_checkpoint_next_xid_epoch counter job, ins, ip, instance, cls Latest checkpoint’s NextXID epoch
pg_checkpoint_oldest_active_xid counter job, ins, ip, instance, cls Latest checkpoint’s oldestActiveXID
pg_checkpoint_oldest_commit_ts_xid counter job, ins, ip, instance, cls Latest checkpoint’s oldestCommitTsXid
pg_checkpoint_oldest_multi_dbid gauge job, ins, ip, instance, cls Latest checkpoint’s oldestMulti’s DB OID
pg_checkpoint_oldest_multi_xid counter job, ins, ip, instance, cls Latest checkpoint’s oldestMultiXid
pg_checkpoint_oldest_xid counter job, ins, ip, instance, cls Latest checkpoint’s oldestXID
pg_checkpoint_oldest_xid_dbid gauge job, ins, ip, instance, cls Latest checkpoint’s oldestXID’s DB OID
pg_checkpoint_prev_tli counter job, ins, ip, instance, cls Latest checkpoint’s PrevTimeLineID
pg_checkpoint_redo_lsn counter job, ins, ip, instance, cls Latest checkpoint’s REDO location
pg_checkpoint_time counter job, ins, ip, instance, cls Time of latest checkpoint
pg_checkpoint_tli counter job, ins, ip, instance, cls Latest checkpoint’s TimeLineID
pg_conf_reload_time gauge job, ins, ip, instance, cls seconds since last configuration reload
pg_db_active_time counter datname, job, ins, ip, instance, cls Time spent executing SQL statements in this database, in seconds
pg_db_age gauge datname, job, ins, ip, instance, cls Age of database calculated from datfrozenxid
pg_db_allow_conn gauge datname, job, ins, ip, instance, cls If false(0) then no one can connect to this database.
pg_db_blk_read_time counter datname, job, ins, ip, instance, cls Time spent reading data file blocks by backends in this database, in seconds
pg_db_blk_write_time counter datname, job, ins, ip, instance, cls Time spent writing data file blocks by backends in this database, in seconds
pg_db_blks_access counter datname, job, ins, ip, instance, cls Number of times disk blocks that accessed read+hit
pg_db_blks_hit counter datname, job, ins, ip, instance, cls Number of times disk blocks were found already in the buffer cache
pg_db_blks_read counter datname, job, ins, ip, instance, cls Number of disk blocks read in this database
pg_db_cks_fail_time gauge datname, job, ins, ip, instance, cls Time at which the last data page checksum failure was detected in this database
pg_db_cks_fails counter datname, job, ins, ip, instance, cls Number of data page checksum failures detected in this database, -1 for not enabled
pg_db_confl_confl_bufferpin counter datname, job, ins, ip, instance, cls Number of queries in this database that have been canceled due to pinned buffers
pg_db_confl_confl_deadlock counter datname, job, ins, ip, instance, cls Number of queries in this database that have been canceled due to deadlocks
pg_db_confl_confl_lock counter datname, job, ins, ip, instance, cls Number of queries in this database that have been canceled due to lock timeouts
pg_db_confl_confl_snapshot counter datname, job, ins, ip, instance, cls Number of queries in this database that have been canceled due to old snapshots
pg_db_confl_confl_tablespace counter datname, job, ins, ip, instance, cls Number of queries in this database that have been canceled due to dropped tablespaces
pg_db_conflicts counter datname, job, ins, ip, instance, cls Number of queries canceled due to conflicts with recovery in this database
pg_db_conn_limit gauge datname, job, ins, ip, instance, cls Sets maximum number of concurrent connections that can be made to this database. -1 means no limit.
pg_db_datid gauge datname, job, ins, ip, instance, cls OID of the database
pg_db_deadlocks counter datname, job, ins, ip, instance, cls Number of deadlocks detected in this database
pg_db_frozen_xid gauge datname, job, ins, ip, instance, cls All transaction IDs before this one have been frozened
pg_db_is_template gauge datname, job, ins, ip, instance, cls If true(1), then this database can be cloned by any user with CREATEDB privileges
pg_db_ixact_time counter datname, job, ins, ip, instance, cls Time spent idling while in a transaction in this database, in seconds
pg_db_numbackends gauge datname, job, ins, ip, instance, cls Number of backends currently connected to this database
pg_db_reset_time counter datname, job, ins, ip, instance, cls Time at which database statistics were last reset
pg_db_session_time counter datname, job, ins, ip, instance, cls Time spent by database sessions in this database, in seconds
pg_db_sessions counter datname, job, ins, ip, instance, cls Total number of sessions established to this database
pg_db_sessions_abandoned counter datname, job, ins, ip, instance, cls Number of database sessions to this database that were terminated because connection to the client was lost
pg_db_sessions_fatal counter datname, job, ins, ip, instance, cls Number of database sessions to this database that were terminated by fatal errors
pg_db_sessions_killed counter datname, job, ins, ip, instance, cls Number of database sessions to this database that were terminated by operator intervention
pg_db_temp_bytes counter datname, job, ins, ip, instance, cls Total amount of data written to temporary files by queries in this database.
pg_db_temp_files counter datname, job, ins, ip, instance, cls Number of temporary files created by queries in this database
pg_db_tup_deleted counter datname, job, ins, ip, instance, cls Number of rows deleted by queries in this database
pg_db_tup_fetched counter datname, job, ins, ip, instance, cls Number of rows fetched by queries in this database
pg_db_tup_inserted counter datname, job, ins, ip, instance, cls Number of rows inserted by queries in this database
pg_db_tup_modified counter datname, job, ins, ip, instance, cls Number of rows modified by queries in this database
pg_db_tup_returned counter datname, job, ins, ip, instance, cls Number of rows returned by queries in this database
pg_db_tup_updated counter datname, job, ins, ip, instance, cls Number of rows updated by queries in this database
pg_db_xact_commit counter datname, job, ins, ip, instance, cls Number of transactions in this database that have been committed
pg_db_xact_rollback counter datname, job, ins, ip, instance, cls Number of transactions in this database that have been rolled back
pg_db_xact_total counter datname, job, ins, ip, instance, cls Number of transactions in this database
pg_downstream_count gauge state, job, ins, ip, instance, cls Count of corresponding state
pg_exporter_agent_up Unknown job, ins, ip, instance, cls N/A
pg_exporter_last_scrape_time gauge job, ins, ip, instance, cls seconds exporter spending on scrapping
pg_exporter_query_cache_ttl gauge datname, query, job, ins, ip, instance, cls times to live of query cache
pg_exporter_query_scrape_duration gauge datname, query, job, ins, ip, instance, cls seconds query spending on scrapping
pg_exporter_query_scrape_error_count gauge datname, query, job, ins, ip, instance, cls times the query failed
pg_exporter_query_scrape_hit_count gauge datname, query, job, ins, ip, instance, cls numbers been scrapped from this query
pg_exporter_query_scrape_metric_count gauge datname, query, job, ins, ip, instance, cls numbers of metrics been scrapped from this query
pg_exporter_query_scrape_total_count gauge datname, query, job, ins, ip, instance, cls times exporter server was scraped for metrics
pg_exporter_scrape_duration gauge job, ins, ip, instance, cls seconds exporter spending on scrapping
pg_exporter_scrape_error_count counter job, ins, ip, instance, cls times exporter was scraped for metrics and failed
pg_exporter_scrape_total_count counter job, ins, ip, instance, cls times exporter was scraped for metrics
pg_exporter_server_scrape_duration gauge datname, job, ins, ip, instance, cls seconds exporter server spending on scrapping
pg_exporter_server_scrape_error_count Unknown datname, job, ins, ip, instance, cls N/A
pg_exporter_server_scrape_total_count gauge datname, job, ins, ip, instance, cls times exporter server was scraped for metrics
pg_exporter_server_scrape_total_seconds gauge datname, job, ins, ip, instance, cls seconds exporter server spending on scrapping
pg_exporter_up gauge job, ins, ip, instance, cls always be 1 if your could retrieve metrics
pg_exporter_uptime gauge job, ins, ip, instance, cls seconds since exporter primary server inited
pg_flush_lsn counter job, ins, ip, instance, cls primary only, location of current wal syncing
pg_func_calls counter datname, funcname, job, ins, ip, instance, cls Number of times this function has been called
pg_func_self_time counter datname, funcname, job, ins, ip, instance, cls Total time spent in this function itself, not including other functions called by it, in ms
pg_func_total_time counter datname, funcname, job, ins, ip, instance, cls Total time spent in this function and all other functions called by it, in ms
pg_in_recovery gauge job, ins, ip, instance, cls server is in recovery mode? 1 for yes 0 for no
pg_index_idx_blks_hit counter datname, relname, job, ins, relid, ip, instance, cls, idxname Number of buffer hits in this index
pg_index_idx_blks_read counter datname, relname, job, ins, relid, ip, instance, cls, idxname Number of disk blocks read from this index
pg_index_idx_scan counter datname, relname, job, ins, relid, ip, instance, cls, idxname Number of index scans initiated on this index
pg_index_idx_tup_fetch counter datname, relname, job, ins, relid, ip, instance, cls, idxname Number of live table rows fetched by simple index scans using this index
pg_index_idx_tup_read counter datname, relname, job, ins, relid, ip, instance, cls, idxname Number of index entries returned by scans on this index
pg_index_relpages gauge datname, relname, job, ins, relid, ip, instance, cls, idxname Size of the on-disk representation of this index in pages
pg_index_reltuples gauge datname, relname, job, ins, relid, ip, instance, cls, idxname Estimate relation tuples
pg_insert_lsn counter job, ins, ip, instance, cls primary only, location of current wal inserting
pg_io_evictions counter type, job, ins, object, ip, context, instance, cls Number of times a block has been written out from a shared or local buffer
pg_io_extend_time counter type, job, ins, object, ip, context, instance, cls Time spent in extend operations in seconds
pg_io_extends counter type, job, ins, object, ip, context, instance, cls Number of relation extend operations, each of the size specified in op_bytes.
pg_io_fsync_time counter type, job, ins, object, ip, context, instance, cls Time spent in fsync operations in seconds
pg_io_fsyncs counter type, job, ins, object, ip, context, instance, cls Number of fsync calls. These are only tracked in context normal
pg_io_hits counter type, job, ins, object, ip, context, instance, cls The number of times a desired block was found in a shared buffer.
pg_io_op_bytes gauge type, job, ins, object, ip, context, instance, cls The number of bytes per unit of I/O read, written, or extended. 8192 by default
pg_io_read_time counter type, job, ins, object, ip, context, instance, cls Time spent in read operations in seconds
pg_io_reads counter type, job, ins, object, ip, context, instance, cls Number of read operations, each of the size specified in op_bytes.
pg_io_reset_time gauge type, job, ins, object, ip, context, instance, cls Timestamp at which these statistics were last reset
pg_io_reuses counter type, job, ins, object, ip, context, instance, cls The number of times an existing buffer is reused
pg_io_write_time counter type, job, ins, object, ip, context, instance, cls Time spent in write operations in seconds
pg_io_writeback_time counter type, job, ins, object, ip, context, instance, cls Time spent in writeback operations in seconds
pg_io_writebacks counter type, job, ins, object, ip, context, instance, cls Number of units of size op_bytes which the process requested the kernel write out to permanent storage.
pg_io_writes counter type, job, ins, object, ip, context, instance, cls Number of write operations, each of the size specified in op_bytes.
pg_is_in_recovery gauge job, ins, ip, instance, cls 1 if in recovery mode
pg_is_wal_replay_paused gauge job, ins, ip, instance, cls 1 if WAL replay is paused
pg_lag gauge job, ins, ip, instance, cls replica only, replication lag in seconds
pg_last_replay_time gauge job, ins, ip, instance, cls time when the last transaction was replayed
pg_lock_count gauge datname, job, ins, ip, mode, instance, cls Number of locks of corresponding mode and database
pg_lsn counter job, ins, ip, instance, cls log sequence number, current write location
pg_meta_info gauge cls, extensions, version, job, ins, primary_conninfo, conf_path, hba_path, ip, cluster_id, instance, listen_port, wal_level, ver_num, cluster_name, data_dir constant 1
pg_query_calls counter datname, query, job, ins, ip, instance, cls Number of times the statement was executed
pg_query_exec_time counter datname, query, job, ins, ip, instance, cls Total time spent executing the statement, in seconds
pg_query_io_time counter datname, query, job, ins, ip, instance, cls Total time the statement spent reading and writing blocks, in seconds
pg_query_rows counter datname, query, job, ins, ip, instance, cls Total number of rows retrieved or affected by the statement
pg_query_sblk_dirtied counter datname, query, job, ins, ip, instance, cls Total number of shared blocks dirtied by the statement
pg_query_sblk_hit counter datname, query, job, ins, ip, instance, cls Total number of shared block cache hits by the statement
pg_query_sblk_read counter datname, query, job, ins, ip, instance, cls Total number of shared blocks read by the statement
pg_query_sblk_written counter datname, query, job, ins, ip, instance, cls Total number of shared blocks written by the statement
pg_query_wal_bytes counter datname, query, job, ins, ip, instance, cls Total amount of WAL bytes generated by the statement
pg_receive_lsn counter job, ins, ip, instance, cls replica only, location of wal synced to disk
pg_recovery_backup_end_lsn counter job, ins, ip, instance, cls Backup end location
pg_recovery_backup_start_lsn counter job, ins, ip, instance, cls Backup start location
pg_recovery_min_lsn counter job, ins, ip, instance, cls Minimum recovery ending location
pg_recovery_min_timeline counter job, ins, ip, instance, cls Min recovery ending loc’s timeline
pg_recovery_prefetch_block_distance gauge job, ins, ip, instance, cls How many blocks ahead the prefetcher is looking
pg_recovery_prefetch_hit counter job, ins, ip, instance, cls Number of blocks not prefetched because they were already in the buffer pool
pg_recovery_prefetch_io_depth gauge job, ins, ip, instance, cls How many prefetches have been initiated but are not yet known to have completed
pg_recovery_prefetch_prefetch counter job, ins, ip, instance, cls Number of blocks prefetched because they were not in the buffer pool
pg_recovery_prefetch_reset_time counter job, ins, ip, instance, cls Time at which these recovery prefetch statistics were last reset
pg_recovery_prefetch_skip_fpw gauge job, ins, ip, instance, cls Number of blocks not prefetched because a full page image was included in the WAL
pg_recovery_prefetch_skip_init counter job, ins, ip, instance, cls Number of blocks not prefetched because they would be zero-initialized
pg_recovery_prefetch_skip_new counter job, ins, ip, instance, cls Number of blocks not prefetched because they didn’t exist yet
pg_recovery_prefetch_skip_rep counter job, ins, ip, instance, cls Number of blocks not prefetched because they were already recently prefetched
pg_recovery_prefetch_wal_distance gauge job, ins, ip, instance, cls How many bytes ahead the prefetcher is looking
pg_recovery_require_record gauge job, ins, ip, instance, cls End-of-backup record required
pg_recv_flush_lsn counter state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Last write-ahead log location already received and flushed to disk
pg_recv_flush_tli counter state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Timeline number of last write-ahead log location received and flushed to disk
pg_recv_init_lsn counter state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port First write-ahead log location used when WAL receiver is started
pg_recv_init_tli counter state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port First timeline number used when WAL receiver is started
pg_recv_msg_recv_time gauge state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Receipt time of last message received from origin WAL sender
pg_recv_msg_send_time gauge state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Send time of last message received from origin WAL sender
pg_recv_pid gauge state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Process ID of the WAL receiver process
pg_recv_reported_lsn counter state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Last write-ahead log location reported to origin WAL sender
pg_recv_reported_time gauge state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Time of last write-ahead log location reported to origin WAL sender
pg_recv_time gauge state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Time of current snapshot
pg_recv_write_lsn counter state, slot_name, job, ins, ip, instance, cls, sender_host, sender_port Last write-ahead log location already received and written to disk, but not flushed.
pg_relkind_count gauge datname, job, ins, ip, instance, cls, relkind Number of relations of corresponding relkind
pg_repl_backend_xmin counter pid, usename, address, job, ins, appname, ip, instance, cls This standby’s xmin horizon reported by hot_standby_feedback.
pg_repl_client_port gauge pid, usename, address, job, ins, appname, ip, instance, cls TCP port number that the client is using for communication with this WAL sender, or -1 if a Unix socket is used
pg_repl_flush_diff gauge pid, usename, address, job, ins, appname, ip, instance, cls Last log position flushed to disk by this standby server diff with current lsn
pg_repl_flush_lag gauge pid, usename, address, job, ins, appname, ip, instance, cls Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written and flushed it
pg_repl_flush_lsn counter pid, usename, address, job, ins, appname, ip, instance, cls Last write-ahead log location flushed to disk by this standby server
pg_repl_launch_time counter pid, usename, address, job, ins, appname, ip, instance, cls Time when this process was started, i.e., when the client connected to this WAL sender
pg_repl_lsn counter pid, usename, address, job, ins, appname, ip, instance, cls Current log position on this server
pg_repl_replay_diff gauge pid, usename, address, job, ins, appname, ip, instance, cls Last log position replayed into the database on this standby server diff with current lsn
pg_repl_replay_lag gauge pid, usename, address, job, ins, appname, ip, instance, cls Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written, flushed and applied it
pg_repl_replay_lsn counter pid, usename, address, job, ins, appname, ip, instance, cls Last write-ahead log location replayed into the database on this standby server
pg_repl_reply_time gauge pid, usename, address, job, ins, appname, ip, instance, cls Send time of last reply message received from standby server
pg_repl_sent_diff gauge pid, usename, address, job, ins, appname, ip, instance, cls Last log position sent to this standby server diff with current lsn
pg_repl_sent_lsn counter pid, usename, address, job, ins, appname, ip, instance, cls Last write-ahead log location sent on this connection
pg_repl_state gauge pid, usename, address, job, ins, appname, ip, instance, cls Current WAL sender encoded state 0-4 for streaming startup catchup backup stopping
pg_repl_sync_priority gauge pid, usename, address, job, ins, appname, ip, instance, cls Priority of this standby server for being chosen as the synchronous standby
pg_repl_sync_state gauge pid, usename, address, job, ins, appname, ip, instance, cls Encoded synchronous state of this standby server, 0-3 for async potential sync quorum
pg_repl_time counter pid, usename, address, job, ins, appname, ip, instance, cls Current timestamp in unix epoch
pg_repl_write_diff gauge pid, usename, address, job, ins, appname, ip, instance, cls Last log position written to disk by this standby server diff with current lsn
pg_repl_write_lag gauge pid, usename, address, job, ins, appname, ip, instance, cls Time elapsed between flushing recent WAL locally and receiving notification that this standby server has written it
pg_repl_write_lsn counter pid, usename, address, job, ins, appname, ip, instance, cls Last write-ahead log location written to disk by this standby server
pg_replay_lsn counter job, ins, ip, instance, cls replica only, location of wal applied
pg_seq_blks_hit counter datname, job, ins, ip, instance, cls, seqname Number of buffer hits in this sequence
pg_seq_blks_read counter datname, job, ins, ip, instance, cls, seqname Number of disk blocks read from this sequence
pg_seq_last_value counter datname, job, ins, ip, instance, cls, seqname The last sequence value written to disk
pg_setting_block_size gauge job, ins, ip, instance, cls pg page block size, 8192 by default
pg_setting_data_checksums gauge job, ins, ip, instance, cls whether data checksum is enabled, 1 enabled 0 disabled
pg_setting_max_connections gauge job, ins, ip, instance, cls number of concurrent connections to the database server
pg_setting_max_locks_per_transaction gauge job, ins, ip, instance, cls no more than this many distinct objects can be locked at any one time
pg_setting_max_prepared_transactions gauge job, ins, ip, instance, cls maximum number of transactions that can be in the prepared state simultaneously
pg_setting_max_replication_slots gauge job, ins, ip, instance, cls maximum number of replication slots
pg_setting_max_wal_senders gauge job, ins, ip, instance, cls maximum number of concurrent connections from standby servers
pg_setting_max_worker_processes gauge job, ins, ip, instance, cls maximum number of background processes that the system can support
pg_setting_wal_log_hints gauge job, ins, ip, instance, cls whether wal_log_hints is enabled, 1 enabled 0 disabled
pg_size_bytes gauge datname, job, ins, ip, instance, cls File size in bytes
pg_slot_active gauge slot_name, job, ins, ip, instance, cls True(1) if this slot is currently actively being used
pg_slot_catalog_xmin counter slot_name, job, ins, ip, instance, cls The oldest transaction affecting the system catalogs that this slot needs the database to retain.
pg_slot_confirm_lsn counter slot_name, job, ins, ip, instance, cls The address (LSN) up to which the logical slot’s consumer has confirmed receiving data.
pg_slot_reset_time counter slot_name, job, ins, ip, instance, cls When statistics were last reset
pg_slot_restart_lsn counter slot_name, job, ins, ip, instance, cls The address (LSN) of oldest WAL which still might be required by the consumer of this slot
pg_slot_retained_bytes gauge slot_name, job, ins, ip, instance, cls Size of bytes that retained for this slot
pg_slot_safe_wal_size gauge slot_name, job, ins, ip, instance, cls bytes that can be written to WAL before this slot becomes lost
pg_slot_spill_bytes counter slot_name, job, ins, ip, instance, cls Bytes that spilled to disk due to logical decode mem exceeding
pg_slot_spill_count counter slot_name, job, ins, ip, instance, cls Xacts that spilled to disk due to logical decode mem exceeding (a xact can be spilled multiple times)
pg_slot_spill_txns counter slot_name, job, ins, ip, instance, cls Xacts that spilled to disk due to logical decode mem exceeding (subtrans included)
pg_slot_stream_bytes counter slot_name, job, ins, ip, instance, cls Bytes that streamed to decoding output plugin after mem exceed
pg_slot_stream_count counter slot_name, job, ins, ip, instance, cls Xacts that streamed to decoding output plugin after mem exceed (a xact can be streamed multiple times)
pg_slot_stream_txns counter slot_name, job, ins, ip, instance, cls Xacts that streamed to decoding output plugin after mem exceed
pg_slot_temporary gauge slot_name, job, ins, ip, instance, cls True(1) if this is a temporary replication slot.
pg_slot_total_bytes counter slot_name, job, ins, ip, instance, cls Number of decoded bytes sent to the decoding output plugin for this slot
pg_slot_total_txns counter slot_name, job, ins, ip, instance, cls Number of decoded xacts sent to the decoding output plugin for this slot
pg_slot_wal_status gauge slot_name, job, ins, ip, instance, cls WAL reserve status 0-3 means reserved,extended,unreserved,lost, -1 means other
pg_slot_xmin counter slot_name, job, ins, ip, instance, cls The oldest transaction that this slot needs the database to retain.
pg_slru_blks_exists counter job, ins, ip, instance, cls Number of blocks checked for existence for this SLRU
pg_slru_blks_hit counter job, ins, ip, instance, cls Number of times disk blocks were found already in the SLRU, so that a read was not necessary
pg_slru_blks_read counter job, ins, ip, instance, cls Number of disk blocks read for this SLRU
pg_slru_blks_written counter job, ins, ip, instance, cls Number of disk blocks written for this SLRU
pg_slru_blks_zeroed counter job, ins, ip, instance, cls Number of blocks zeroed during initializations
pg_slru_flushes counter job, ins, ip, instance, cls Number of flushes of dirty data for this SLRU
pg_slru_reset_time counter job, ins, ip, instance, cls Time at which these statistics were last reset
pg_slru_truncates counter job, ins, ip, instance, cls Number of truncates for this SLRU
pg_ssl_disabled gauge job, ins, ip, instance, cls Number of client connections that do not use SSL
pg_ssl_enabled gauge job, ins, ip, instance, cls Number of client connections that use SSL
pg_sync_standby_enabled gauge job, ins, ip, names, instance, cls Synchronous commit enabled, 1 if enabled, 0 if disabled
pg_table_age gauge datname, relname, job, ins, ip, instance, cls Age of this table in vacuum cycles
pg_table_analyze_count counter datname, relname, job, ins, ip, instance, cls Number of times this table has been manually analyzed
pg_table_autoanalyze_count counter datname, relname, job, ins, ip, instance, cls Number of times this table has been analyzed by the autovacuum daemon
pg_table_autovacuum_count counter datname, relname, job, ins, ip, instance, cls Number of times this table has been vacuumed by the autovacuum daemon
pg_table_frozenxid counter datname, relname, job, ins, ip, instance, cls All txid before this have been frozen on this table
pg_table_heap_blks_hit counter datname, relname, job, ins, ip, instance, cls Number of buffer hits in this table
pg_table_heap_blks_read counter datname, relname, job, ins, ip, instance, cls Number of disk blocks read from this table
pg_table_idx_blks_hit counter datname, relname, job, ins, ip, instance, cls Number of buffer hits in all indexes on this table
pg_table_idx_blks_read counter datname, relname, job, ins, ip, instance, cls Number of disk blocks read from all indexes on this table
pg_table_idx_scan counter datname, relname, job, ins, ip, instance, cls Number of index scans initiated on this table
pg_table_idx_tup_fetch counter datname, relname, job, ins, ip, instance, cls Number of live rows fetched by index scans
pg_table_kind gauge datname, relname, job, ins, ip, instance, cls Relation kind r/table/114
pg_table_n_dead_tup gauge datname, relname, job, ins, ip, instance, cls Estimated number of dead rows
pg_table_n_ins_since_vacuum gauge datname, relname, job, ins, ip, instance, cls Estimated number of rows inserted since this table was last vacuumed
pg_table_n_live_tup gauge datname, relname, job, ins, ip, instance, cls Estimated number of live rows
pg_table_n_mod_since_analyze gauge datname, relname, job, ins, ip, instance, cls Estimated number of rows modified since this table was last analyzed
pg_table_n_tup_del counter datname, relname, job, ins, ip, instance, cls Number of rows deleted
pg_table_n_tup_hot_upd counter datname, relname, job, ins, ip, instance, cls Number of rows HOT updated (i.e with no separate index update required)
pg_table_n_tup_ins counter datname, relname, job, ins, ip, instance, cls Number of rows inserted
pg_table_n_tup_mod counter datname, relname, job, ins, ip, instance, cls Number of rows modified (insert + update + delete)
pg_table_n_tup_newpage_upd counter datname, relname, job, ins, ip, instance, cls Number of rows updated where the successor version goes onto a new heap page
pg_table_n_tup_upd counter datname, relname, job, ins, ip, instance, cls Number of rows updated (includes HOT updated rows)
pg_table_ncols gauge datname, relname, job, ins, ip, instance, cls Number of columns in the table
pg_table_pages gauge datname, relname, job, ins, ip, instance, cls Size of the on-disk representation of this table in pages
pg_table_relid gauge datname, relname, job, ins, ip, instance, cls Relation oid of this table
pg_table_seq_scan counter datname, relname, job, ins, ip, instance, cls Number of sequential scans initiated on this table
pg_table_seq_tup_read counter datname, relname, job, ins, ip, instance, cls Number of live rows fetched by sequential scans
pg_table_size_bytes gauge datname, relname, job, ins, ip, instance, cls Total bytes of this table (including toast, index, toast index)
pg_table_size_indexsize gauge datname, relname, job, ins, ip, instance, cls Bytes of all related indexes of this table
pg_table_size_relsize gauge datname, relname, job, ins, ip, instance, cls Bytes of this table itself (main, vm, fsm)
pg_table_size_toastsize gauge datname, relname, job, ins, ip, instance, cls Bytes of toast tables of this table
pg_table_tbl_scan counter datname, relname, job, ins, ip, instance, cls Number of scans initiated on this table
pg_table_tup_read counter datname, relname, job, ins, ip, instance, cls Number of live rows fetched by scans
pg_table_tuples counter datname, relname, job, ins, ip, instance, cls Estimated number of tuples in this table
pg_table_vacuum_count counter datname, relname, job, ins, ip, instance, cls Number of times this table has been manually vacuumed (not counting VACUUM FULL)
pg_timestamp gauge job, ins, ip, instance, cls database current timestamp
pg_up gauge job, ins, ip, instance, cls last scrape was able to connect to the server: 1 for yes, 0 for no
pg_uptime gauge job, ins, ip, instance, cls seconds since postmaster start
pg_version gauge job, ins, ip, instance, cls server version number
pg_wait_count gauge datname, job, ins, event, ip, instance, cls Count of WaitEvent on target database
pg_wal_buffers_full counter job, ins, ip, instance, cls Number of times WAL data was written to disk because WAL buffers became full
pg_wal_bytes counter job, ins, ip, instance, cls Total amount of WAL generated in bytes
pg_wal_fpi counter job, ins, ip, instance, cls Total number of WAL full page images generated
pg_wal_records counter job, ins, ip, instance, cls Total number of WAL records generated
pg_wal_reset_time counter job, ins, ip, instance, cls When statistics were last reset
pg_wal_sync counter job, ins, ip, instance, cls Number of times WAL files were synced to disk via issue_xlog_fsync request
pg_wal_sync_time counter job, ins, ip, instance, cls Total amount of time spent syncing WAL files to disk via issue_xlog_fsync request, in seconds
pg_wal_write counter job, ins, ip, instance, cls Number of times WAL buffers were written out to disk via XLogWrite request.
pg_wal_write_time counter job, ins, ip, instance, cls Total amount of time spent writing WAL buffers to disk via XLogWrite request in seconds
pg_write_lsn counter job, ins, ip, instance, cls primary only, location of current wal writing
pg_xact_xmax counter job, ins, ip, instance, cls First as-yet-unassigned txid. txid >= this are invisible.
pg_xact_xmin counter job, ins, ip, instance, cls Earliest txid that is still active
pg_xact_xnum gauge job, ins, ip, instance, cls Current active transaction count
pgbouncer:cls:load1 Unknown job, cls N/A
pgbouncer:cls:load15 Unknown job, cls N/A
pgbouncer:cls:load5 Unknown job, cls N/A
pgbouncer:db:conn_usage Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:db:conn_usage_reserve Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:db:pool_current_conn Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:db:pool_disabled Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:db:pool_max_conn Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:db:pool_paused Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:db:pool_reserve_size Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:db:pool_size Unknown datname, job, ins, ip, instance, host, cls, real_datname, port N/A
pgbouncer:ins:free_clients Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:free_servers Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:load1 Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:load15 Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:load5 Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:login_clients Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:pool_databases Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:pool_users Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:pools Unknown job, ins, ip, instance, cls N/A
pgbouncer:ins:used_clients Unknown job, ins, ip, instance, cls N/A
pgbouncer_database_current_connections gauge datname, job, ins, ip, instance, host, cls, real_datname, port Current number of connections for this database
pgbouncer_database_disabled gauge datname, job, ins, ip, instance, host, cls, real_datname, port True(1) if this database is currently disabled, else 0
pgbouncer_database_max_connections gauge datname, job, ins, ip, instance, host, cls, real_datname, port Maximum number of allowed connections for this database
pgbouncer_database_min_pool_size gauge datname, job, ins, ip, instance, host, cls, real_datname, port Minimum number of server connections
pgbouncer_database_paused gauge datname, job, ins, ip, instance, host, cls, real_datname, port True(1) if this database is currently paused, else 0
pgbouncer_database_pool_size gauge datname, job, ins, ip, instance, host, cls, real_datname, port Maximum number of server connections
pgbouncer_database_reserve_pool gauge datname, job, ins, ip, instance, host, cls, real_datname, port Maximum number of additional connections for this database
pgbouncer_exporter_agent_up Unknown job, ins, ip, instance, cls N/A
pgbouncer_exporter_last_scrape_time gauge job, ins, ip, instance, cls seconds the exporter spent on scraping
pgbouncer_exporter_query_cache_ttl gauge datname, query, job, ins, ip, instance, cls time to live of query cache
pgbouncer_exporter_query_scrape_duration gauge datname, query, job, ins, ip, instance, cls seconds spent scraping this query
pgbouncer_exporter_query_scrape_error_count gauge datname, query, job, ins, ip, instance, cls times the query failed
pgbouncer_exporter_query_scrape_hit_count gauge datname, query, job, ins, ip, instance, cls number of results scraped from this query
pgbouncer_exporter_query_scrape_metric_count gauge datname, query, job, ins, ip, instance, cls number of metrics scraped from this query
pgbouncer_exporter_query_scrape_total_count gauge datname, query, job, ins, ip, instance, cls times exporter server was scraped for metrics
pgbouncer_exporter_scrape_duration gauge job, ins, ip, instance, cls seconds the exporter spent on scraping
pgbouncer_exporter_scrape_error_count counter job, ins, ip, instance, cls times exporter was scraped for metrics and failed
pgbouncer_exporter_scrape_total_count counter job, ins, ip, instance, cls times exporter was scraped for metrics
pgbouncer_exporter_server_scrape_duration gauge datname, job, ins, ip, instance, cls seconds the exporter server spent on scraping
pgbouncer_exporter_server_scrape_total_count gauge datname, job, ins, ip, instance, cls times exporter server was scraped for metrics
pgbouncer_exporter_server_scrape_total_seconds gauge datname, job, ins, ip, instance, cls seconds the exporter server spent on scraping
pgbouncer_exporter_up gauge job, ins, ip, instance, cls always 1 if metrics could be retrieved
pgbouncer_exporter_uptime gauge job, ins, ip, instance, cls seconds since the exporter primary server was initialized
pgbouncer_in_recovery gauge job, ins, ip, instance, cls server is in recovery mode? 1 for yes 0 for no
pgbouncer_list_items gauge job, ins, ip, instance, list, cls Number of corresponding pgbouncer object
pgbouncer_pool_active_cancel_clients gauge datname, job, ins, ip, instance, user, cls, pool_mode Client connections that have forwarded query cancellations to the server and are waiting for the server response.
pgbouncer_pool_active_cancel_servers gauge datname, job, ins, ip, instance, user, cls, pool_mode Server connections that are currently forwarding a cancel request
pgbouncer_pool_active_clients gauge datname, job, ins, ip, instance, user, cls, pool_mode Client connections that are linked to server connection and can process queries
pgbouncer_pool_active_servers gauge datname, job, ins, ip, instance, user, cls, pool_mode Server connections that are linked to a client
pgbouncer_pool_cancel_clients gauge datname, job, ins, ip, instance, user, cls, pool_mode Client connections that have not forwarded query cancellations to the server yet.
pgbouncer_pool_cancel_servers gauge datname, job, ins, ip, instance, user, cls, pool_mode cancel requests have completed that were sent to cancel a query on this server
pgbouncer_pool_idle_servers gauge datname, job, ins, ip, instance, user, cls, pool_mode Server connections that are unused and immediately usable for client queries
pgbouncer_pool_login_servers gauge datname, job, ins, ip, instance, user, cls, pool_mode Server connections currently in the process of logging in
pgbouncer_pool_maxwait gauge datname, job, ins, ip, instance, user, cls, pool_mode How long the first(oldest) client in the queue has waited, in seconds, key metric
pgbouncer_pool_maxwait_us gauge datname, job, ins, ip, instance, user, cls, pool_mode Microsecond part of the maximum waiting time.
pgbouncer_pool_tested_servers gauge datname, job, ins, ip, instance, user, cls, pool_mode Server connections that are currently running reset or check query
pgbouncer_pool_used_servers gauge datname, job, ins, ip, instance, user, cls, pool_mode Server connections that have been idle for more than server_check_delay (means have to run check query)
pgbouncer_pool_waiting_clients gauge datname, job, ins, ip, instance, user, cls, pool_mode Client connections that have sent queries but have not yet got a server connection
pgbouncer_stat_avg_query_count gauge datname, job, ins, ip, instance, cls Average queries per second in last stat period
pgbouncer_stat_avg_query_time gauge datname, job, ins, ip, instance, cls Average query duration, in seconds
pgbouncer_stat_avg_recv gauge datname, job, ins, ip, instance, cls Average received (from clients) bytes per second
pgbouncer_stat_avg_sent gauge datname, job, ins, ip, instance, cls Average sent (to clients) bytes per second
pgbouncer_stat_avg_wait_time gauge datname, job, ins, ip, instance, cls Time spent by clients waiting for a server, in seconds (average per second).
pgbouncer_stat_avg_xact_count gauge datname, job, ins, ip, instance, cls Average transactions per second in last stat period
pgbouncer_stat_avg_xact_time gauge datname, job, ins, ip, instance, cls Average transaction duration, in seconds
pgbouncer_stat_total_query_count gauge datname, job, ins, ip, instance, cls Total number of SQL queries pooled by pgbouncer
pgbouncer_stat_total_query_time counter datname, job, ins, ip, instance, cls Total number of seconds spent when executing queries
pgbouncer_stat_total_received counter datname, job, ins, ip, instance, cls Total volume in bytes of network traffic received by pgbouncer
pgbouncer_stat_total_sent counter datname, job, ins, ip, instance, cls Total volume in bytes of network traffic sent by pgbouncer
pgbouncer_stat_total_wait_time counter datname, job, ins, ip, instance, cls Time spent by clients waiting for a server, in seconds
pgbouncer_stat_total_xact_count gauge datname, job, ins, ip, instance, cls Total number of SQL transactions pooled by pgbouncer
pgbouncer_stat_total_xact_time counter datname, job, ins, ip, instance, cls Total number of seconds spent when in a transaction
pgbouncer_up gauge job, ins, ip, instance, cls last scrape was able to connect to the server: 1 for yes, 0 for no
pgbouncer_version gauge job, ins, ip, instance, cls server version number
process_cpu_seconds_total counter job, ins, ip, instance, cls Total user and system CPU time spent in seconds.
process_max_fds gauge job, ins, ip, instance, cls Maximum number of open file descriptors.
process_open_fds gauge job, ins, ip, instance, cls Number of open file descriptors.
process_resident_memory_bytes gauge job, ins, ip, instance, cls Resident memory size in bytes.
process_start_time_seconds gauge job, ins, ip, instance, cls Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge job, ins, ip, instance, cls Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge job, ins, ip, instance, cls Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight gauge job, ins, ip, instance, cls Current number of scrapes being served.
promhttp_metric_handler_requests_total counter code, job, ins, ip, instance, cls Total number of scrapes by HTTP status code.
scrape_duration_seconds Unknown job, ins, ip, instance, cls N/A
scrape_samples_post_metric_relabeling Unknown job, ins, ip, instance, cls N/A
scrape_samples_scraped Unknown job, ins, ip, instance, cls N/A
scrape_series_added Unknown job, ins, ip, instance, cls N/A
up Unknown job, ins, ip, instance, cls N/A

5.15 - FAQ

Pigsty PGSQL module frequently asked questions

ABORT due to postgres exists

Set pg_clean = true and pg_safeguard = false to force clean existing postgres data when running pgsql.yml.

This happens when you run pgsql.yml on a node with postgres running, and pg_clean is set to false.

If pg_clean is true (and the pg_safeguard is false, too), the pgsql.yml playbook will remove the existing pgsql data and re-init it as a new one, which makes this playbook fully idempotent.

You can still purge the existing PostgreSQL data by using a special task tag pg_purge

./pgsql.yml -t pg_clean      # honor pg_clean and pg_safeguard
./pgsql.yml -t pg_purge      # ignore pg_clean and pg_safeguard
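
For a one-off run, you can also override both parameters on the command line instead of editing the inventory (<cls> is a placeholder for the target cluster):

./pgsql.yml -l <cls> -e pg_clean=true -e pg_safeguard=false   # force re-init, overriding inventory values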

ABORT due to pg_safeguard enabled

Disable pg_safeguard to remove the Postgres instance.

If pg_safeguard is enabled, you cannot remove the running pgsql instance with bin/pgsql-rm or the pgsql-rm.yml playbook.

To disable pg_safeguard, you can set pg_safeguard to false in the inventory or pass -e pg_safeguard=false as cli arg to the playbook:

./pgsql-rm.yml -e pg_safeguard=false -l <cls_to_remove>    # force override pg_safeguard
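
Alternatively, a minimal inventory sketch (the cluster name pg-meta is just an example) that disables the safeguard at cluster level:

pg-meta:
  vars:
    pg_safeguard: false           # allow this cluster to be removed by pgsql-rm.yml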

Fail to wait for postgres/patroni primary

There are several possible reasons for this error, and you need to check the system logs to determine the actual cause.

This usually happens when the cluster is misconfigured, or the previous primary is improperly removed. (e.g., trash metadata in DCS with the same cluster name).

You must check /pg/log/* to find the reason.

To delete trash metadata from etcd, you can use etcdctl del --prefix /pg/<cls>; proceed with caution!
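
For example, a minimal troubleshooting sketch (the cluster name pg-test is hypothetical; paths assume Pigsty defaults):

ls -R /pg/log/                                 # patroni / postgres / pgbouncer logs live here
etcdctl get /pg/pg-test --prefix --keys-only   # list residual DCS metadata for cluster pg-test
etcdctl del /pg/pg-test --prefix               # remove it; double-check the cluster name first!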

  • 1: Misconfiguration. Identify the incorrect parameters, modify them, and apply the changes.
  • 2: Another cluster with the same cls name already exists in the deployment.
  • 3: The previous cluster on the node, or a previous cluster with the same name, was not correctly removed.
    • To remove obsolete cluster metadata, you can use etcdctl del --prefix /pg/<cls> to manually delete the residual data.
  • 4: The packages required by PostgreSQL or the node were not successfully installed.
  • 5: The watchdog kernel module is required but was not correctly enabled or loaded.
  • 6: The locale or ctype specified by pg_lc_collate and pg_lc_ctype does not exist in the OS.

Feel free to submit an issue or seek help from the community.


Fail to wait for postgres/patroni replica

Failed Immediately: Usually this happens because of misconfiguration, network issues, broken DCS metadata, etc.; you have to inspect /pg/log to find out the actual reason.

Failed After a While: This may be due to source instance data corruption. Check PGSQL FAQ: How to create replicas when data is corrupted?

Timeout: If the wait for postgres replica task takes 30 minutes or more and fails due to timeout, this is common for a huge cluster (e.g., 1TB+, which may take hours to create a replica). In this case, the underlying replica-creation procedure is still in progress. You can check the cluster status with pg list <cls> and wait until the replica catches up with the primary, then continue with the remaining tasks:

./pgsql.yml -t pg_hba,pg_backup,pgbouncer,pg_vip,pg_dns,pg_service,pg_exporter,pg_register -l <problematic_replica>

Install PostgreSQL 12 - 15

To install PostgreSQL 12 - 15, set pg_version to the major version you want (12, 13, 14, or 15) in the inventory, usually at the cluster level. Some extensions may be missing for non-default major versions, so you may also need to adjust pg_libs and pg_extensions, as in this PG 16 beta example:

pg_version: 16                    # install pg 16 in this template
pg_libs: 'pg_stat_statements, auto_explain' # remove timescaledb from pg 16 beta
pg_extensions: []                 # missing pg16 extensions for now
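
A minimal inventory sketch (the cluster name pg-v14 and the IP address are hypothetical) that pins one cluster to PostgreSQL 14:

pg-v14:
  hosts: { 10.10.10.14: { pg_seq: 1, pg_role: primary } }
  vars:
    pg_cluster: pg-v14
    pg_version: 14                # override the major version at cluster level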

How to enable hugepages for PostgreSQL?

use node_hugepage_count and node_hugepage_ratio or /pg/bin/pg-tune-hugepage

If you plan to enable huge pages, consider using node_hugepage_count and node_hugepage_ratio, and apply the change with ./node.yml -t node_tune.
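
For example, a minimal inventory sketch (the 25% ratio is just an illustrative value):

node_hugepage_ratio: 0.25         # reserve ~25% of node memory as huge pages
# apply with: ./node.yml -t node_tune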

It is best to allocate enough huge pages before PostgreSQL starts, and use pg-tune-hugepage to shrink them later.

If your postgres is already running, you can use /pg/bin/pg-tune-hugepage to enable huge pages on the fly. Note that this only works on PostgreSQL 15+.

sync; echo 3 > /proc/sys/vm/drop_caches   # drop system cache (be ready for a performance impact)
sudo /pg/bin/pg-tune-hugepage             # write nr_hugepages to /etc/sysctl.d/hugepage.conf
pg restart <cls>                          # restart postgres to use hugepage

How to guarantee zero data loss during failover?

Use the crit.yml template, set pg_rpo to 0, or configure the cluster with synchronous mode.

Consider using Sync Standby and Quorum Commit to guarantee zero data loss during failover.
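
A minimal cluster-level sketch of the relevant parameters (adjust to your own durability / availability trade-off):

pg_conf: crit.yml                 # use the crit template for critical clusters
pg_rpo: 0                         # recovery point objective: 0 means no data loss is tolerated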


How to survive from disk full?

rm -rf /pg/dummy will free some emergency space.

The pg_dummy_filesize is set to 64MB by default. Consider increasing it to 8GB or larger in the production environment.

The dummy file is placed at /pg/dummy, on the same disk as the PGSQL main data disk. You can remove it to free some emergency space; at least that lets you run some shell scripts on that node.
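
A minimal sketch of the relevant parameter (assuming the size-string format used elsewhere in the Pigsty config):

pg_dummy_filesize: 8GiB           # enlarge the emergency dummy file (default is 64MB)
# in an emergency: rm -rf /pg/dummy to free that space immediately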


How to create replicas when data is corrupted?

Disable clonefrom on bad instances and reload patroni config.

Pigsty sets the clonefrom: true tag in every instance's patroni config, which marks the instance as available for cloning replicas.

If an instance has corrupted data files, you can set clonefrom: false to avoid pulling data from it. To do so:

$ vi /pg/bin/patroni.yml

tags:
  nofailover: false
  clonefrom: true      # ----------> change to false
  noloadbalance: false
  nosync: false
  version:  '15'
  spec: '4C.8G.50G'
  conf: 'oltp.yml'
  
$ systemctl reload patroni


Performance impact of monitoring exporter

Not much: roughly 200ms of scrape overhead every 10 ~ 15 seconds, which will not noticeably affect database performance.

The default Prometheus scrape interval in Pigsty is 10s; make sure the exporter can finish a scrape within that period.
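
The corresponding INFRA parameter, if you ever need to relax it for a slow instance (see the Parameter list in the INFRA module reference):

prometheus_scrape_interval: 10s   # prometheus scrape & eval interval, 10s by default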


How to monitor an existing PostgreSQL instance?

Check PGSQL Monitor for details.


How to remove monitor targets from prometheus?

./pgsql-rm.yml -t prometheus -l <cls>     # remove prometheus targets of cluster 'cls'

Or

bin/pgmon-rm <ins>     # shortcut for removing prometheus targets of pgsql instance 'ins'

6 - Module: INFRA

Independent module which provides infrastructure services: NTP, DNS, and the modern observability stack (Grafana & Prometheus)

Pigsty has a battery-included, production-ready INFRA module, to provide ultimate observability.

Configuration | Administration | Playbook | Dashboard | Parameter


Overview

Each Pigsty deployment requires a set of infrastructure components to work properly, including:

Component Port Domain Description
Nginx 80 h.pigsty Web Service Portal (YUM/APT Repo)
AlertManager 9093 a.pigsty Alert Aggregation and delivery
Prometheus 9090 p.pigsty Monitoring Time Series Database
Grafana 3000 g.pigsty Visualization Platform
Loki 3100 - Logging Collection Server
PushGateway 9091 - Collect One-Time Job Metrics
BlackboxExporter 9115 - Blackbox Probing
Dnsmasq 53 - DNS Server
Chronyd 123 - NTP Time Server
PostgreSQL 5432 - Pigsty CMDB & default database
Ansible - - Run playbooks

Pigsty will set up these components for you on infra nodes. You can expose them to the outside world by configuring the infra_portal parameter.

infra_portal:  # domain names and upstream servers
  home         : { domain: h.pigsty }
  grafana      : { domain: g.pigsty ,endpoint: "${admin_ip}:3000" , websocket: true }
  prometheus   : { domain: p.pigsty ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }
  #minio        : { domain: sss.pigsty  ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
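
After changing infra_portal, you can re-render and reload the Nginx portal configuration (the same subtask is listed under Manage Infra Component below):

./infra.yml -t nginx_config,nginx_reload          # render Nginx upstream server config and reload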

pigsty-arch.jpg


Configuration

To define an infra cluster, use the hard-coded group name infra in your inventory file.

You can deploy the INFRA module on multiple nodes, but at least one is required. You have to assign a unique infra_seq to each node.

# Single infra node
infra: { hosts: { 10.10.10.10: { infra_seq: 1 } }}

# Two infra nodes
infra:
  hosts:
    10.10.10.10: { infra_seq: 1 }
    10.10.10.11: { infra_seq: 2 }

Then you can init INFRA module with infra.yml playbook.


Administration

Here are some administration tasks related to INFRA module:


Install/Remove Infra Module

./infra.yml     # install infra/node module on `infra` group
./infra-rm.yml  # remove infra module from `infra` group

Manage Local Software Repo

./infra.yml -t repo             # setup local yum/apt repo

./infra.yml -t repo_dir         # create repo directory
./infra.yml -t repo_check       # check repo exists
./infra.yml -t repo_prepare     # use existing repo if exists
./infra.yml -t repo_build       # build repo from upstream if not exists
./infra.yml   -t repo_upstream  # handle upstream repo files in /etc/yum.repos.d or /etc/apt/sources.list.d
./infra.yml   -t repo_url_pkg   # download packages from internet defined by repo_url_packages
./infra.yml   -t repo_cache     # make upstream yum/apt cache
./infra.yml   -t repo_boot_pkg  # install bootstrap pkg such as createrepo_c,yum-utils,... (or dpkg-dev in debian/ubuntu)
./infra.yml   -t repo_pkg       # download packages & dependencies from upstream repo
./infra.yml   -t repo_create    # create a local yum repo with createrepo_c & modifyrepo_c
./infra.yml   -t repo_use       # add newly built repo
./infra.yml -t repo_nginx       # launch a nginx for repo if no nginx is serving

Manage Infra Component

You can use the following playbook subtasks to manage individual infrastructure components on infra nodes:

./infra.yml -t infra_env      : env_dir, env_pg, env_var
./infra.yml -t infra_pkg      : infra_pkg, infra_pkg_pip
./infra.yml -t infra_user     : setup infra os user group
./infra.yml -t infra_cert     : issue cert for infra components
./infra.yml -t dns            : dns_config, dns_record, dns_launch
./infra.yml -t nginx          : nginx_config, nginx_cert, nginx_static, nginx_launch, nginx_exporter
./infra.yml -t prometheus     : prometheus_clean, prometheus_dir, prometheus_config, prometheus_launch, prometheus_reload
./infra.yml -t alertmanager   : alertmanager_config, alertmanager_launch
./infra.yml -t pushgateway    : pushgateway_config, pushgateway_launch
./infra.yml -t blackbox       : blackbox_config, blackbox_launch
./infra.yml -t grafana        : grafana_clean, grafana_config, grafana_plugin, grafana_launch, grafana_provision
./infra.yml -t loki           : loki_clean, loki_dir, loki_config, loki_launch
./infra.yml -t infra_register : register infra components to prometheus
./infra.yml -t nginx_index                        # render Nginx homepage
./infra.yml -t nginx_config,nginx_reload          # render Nginx upstream server config
./infra.yml -t prometheus_conf,prometheus_reload  # render Prometheus main config and reload
./infra.yml -t prometheus_rule,prometheus_reload  # copy Prometheus rules & alert definition and reload
./infra.yml -t grafana_plugin                     # download Grafana plugins from the Internet

Playbook

  • install.yml : Install Pigsty on all nodes in one-pass
  • infra.yml : Init pigsty infrastructure on infra nodes
  • infra-rm.yml : Remove infrastructure components from infra nodes

asciicast


infra.yml

The playbook infra.yml will init pigsty infrastructure on infra nodes.

It will also install the NODE module on infra nodes.

Here are available subtasks:

# ca            : create self-signed CA on localhost files/pki
#   - ca_dir        : create CA directory
#   - ca_private    : generate ca private key: files/pki/ca/ca.key
#   - ca_cert       : signing ca cert: files/pki/ca/ca.crt
#
# id            : generate node identity
#
# repo          : bootstrap a local yum repo from internet or offline packages
#   - repo_dir      : create repo directory
#   - repo_check    : check repo exists
#   - repo_prepare  : use existing repo if exists
#   - repo_build    : build repo from upstream if not exists
#     - repo_upstream    : handle upstream repo files in /etc/yum.repos.d
#       - repo_remove    : remove existing repo file if repo_remove == true
#       - repo_add       : add upstream repo files to /etc/yum.repos.d
#     - repo_url_pkg     : download packages from internet defined by repo_url_packages
#     - repo_cache       : make upstream yum cache with yum makecache
#     - repo_boot_pkg    : install bootstrap pkg such as createrepo_c,yum-utils,...
#     - repo_pkg         : download packages & dependencies from upstream repo
#     - repo_create      : create a local yum repo with createrepo_c & modifyrepo_c
#     - repo_use         : add newly built repo into /etc/yum.repos.d
#   - repo_nginx    : launch a nginx for repo if no nginx is serving
#
# node/haproxy/docker/monitor : setup infra node as a common node (check node.yml)
#   - node_name, node_hosts, node_resolv, node_firewall, node_ca, node_repo, node_pkg
#   - node_feature, node_kernel, node_tune, node_sysctl, node_profile, node_ulimit
#   - node_data, node_admin, node_timezone, node_ntp, node_crontab, node_vip
#   - haproxy_install, haproxy_config, haproxy_launch, haproxy_reload
#   - docker_install, docker_admin, docker_config, docker_launch, docker_image
#   - haproxy_register, node_exporter, node_register, promtail
#
# infra         : setup infra components
#   - infra_env      : env_dir, env_pg, env_var
#   - infra_pkg      : infra_pkg, infra_pkg_pip
#   - infra_user     : setup infra os user group
#   - infra_cert     : issue cert for infra components
#   - dns            : dns_config, dns_record, dns_launch
#   - nginx          : nginx_config, nginx_cert, nginx_static, nginx_launch, nginx_exporter
#   - prometheus     : prometheus_clean, prometheus_dir, prometheus_config, prometheus_launch, prometheus_reload
#   - alertmanager   : alertmanager_config, alertmanager_launch
#   - pushgateway    : pushgateway_config, pushgateway_launch
#   - blackbox       : blackbox_config, blackbox_launch
#   - grafana        : grafana_clean, grafana_config, grafana_plugin, grafana_launch, grafana_provision
#   - loki           : loki_clean, loki_dir, loki_config, loki_launch
#   - infra_register : register infra components to prometheus

asciicast


infra-rm.yml

The playbook infra-rm.yml will remove infrastructure components from infra nodes.

./infra-rm.yml               # remove INFRA module
./infra-rm.yml -t service    # stop INFRA services
./infra-rm.yml -t data       # remove INFRA data
./infra-rm.yml -t package    # uninstall INFRA packages

install.yml

The playbook install.yml will install Pigsty on all nodes in one pass.

Check Playbook: One-Pass Install for details.


Dashboard

Pigsty Home : Home dashboard for pigsty’s grafana

Pigsty Home Dashboard

pigsty.jpg

INFRA Overview : Overview of all infra components

INFRA Overview Dashboard

infra-overview.jpg

Nginx Overview : Nginx metrics & logs

Nginx Overview Dashboard

nginx-overview.jpg

Grafana Overview: Grafana metrics & logs

Grafana Overview Dashboard

grafana-overview.jpg

Prometheus Overview: Prometheus metrics & logs

Prometheus Overview Dashboard

prometheus-overview.jpg

Loki Overview: Loki metrics & logs

Loki Overview Dashboard

loki-overview.jpg

Logs Instance: Logs for a single instance

Logs Instance Dashboard

logs-instance.jpg

Logs Overview: Overview of all logs

Logs Overview Dashboard

logs-overview.jpg

CMDB Overview: CMDB visualization

CMDB Overview Dashboard

cmdb-overview.jpg

ETCD Overview: etcd metrics & logs

ETCD Overview Dashboard

etcd-overview.jpg


Parameter

API Reference for INFRA module:

  • META: infra meta data
  • CA: self-signed CA
  • INFRA_ID : Portals and identity
  • REPO: local yum/apt repo
  • INFRA_PACKAGE : packages to be installed
  • NGINX : nginx web server
  • DNS: dnsmasq nameserver
  • PROMETHEUS : prometheus, alertmanager, pushgateway & blackbox_exporter
  • GRAFANA : Grafana, the visualization platform
  • LOKI : Loki, the logging server
Parameters
Parameter Section Type Level Comment
version META string G pigsty version string
admin_ip META ip G admin node ip address
region META enum G upstream mirror region: default,china,europe
proxy_env META dict G global proxy env when downloading packages
ca_method CA enum G create,recreate,copy, create by default
ca_cn CA string G ca common name, fixed as pigsty-ca
cert_validity CA interval G cert validity, 20 years by default
infra_seq INFRA_ID int I infra node identity, REQUIRED
infra_portal INFRA_ID dict G infra services exposed via portal
repo_enabled REPO bool G/I create a yum/apt repo on this infra node?
repo_home REPO path G repo home dir, /www by default
repo_name REPO string G repo name, pigsty by default
repo_endpoint REPO url G access point to this repo by domain or ip:port
repo_remove REPO bool G/A remove existing upstream repo
repo_modules REPO string G/A which repo modules are installed in repo_upstream
repo_upstream REPO upstream[] G where to download upstream packages
repo_packages REPO string[] G which packages to be included
repo_url_packages REPO string[] G extra packages from url
infra_packages INFRA_PACKAGE string[] G packages to be installed on infra nodes
infra_packages_pip INFRA_PACKAGE string G pip installed packages for infra nodes
nginx_enabled NGINX bool G/I enable nginx on this infra node?
nginx_exporter_enabled NGINX bool G/I enable nginx_exporter on this infra node?
nginx_sslmode NGINX enum G nginx ssl mode? disable,enable,enforce
nginx_home NGINX path G nginx content dir, /www by default
nginx_port NGINX port G nginx listen port, 80 by default
nginx_ssl_port NGINX port G nginx ssl listen port, 443 by default
nginx_navbar NGINX index[] G nginx index page navigation links
dns_enabled DNS bool G/I setup dnsmasq on this infra node?
dns_port DNS port G dns server listen port, 53 by default
dns_records DNS string[] G dynamic dns records resolved by dnsmasq
prometheus_enabled PROMETHEUS bool G/I enable prometheus on this infra node?
prometheus_clean PROMETHEUS bool G/A clean prometheus data during init?
prometheus_data PROMETHEUS path G prometheus data dir, /data/prometheus by default
prometheus_sd_dir PROMETHEUS path G prometheus file service discovery directory
prometheus_sd_interval PROMETHEUS interval G prometheus target refresh interval, 5s by default
prometheus_scrape_interval PROMETHEUS interval G prometheus scrape & eval interval, 10s by default
prometheus_scrape_timeout PROMETHEUS interval G prometheus global scrape timeout, 8s by default
prometheus_options PROMETHEUS arg G prometheus extra server options
pushgateway_enabled PROMETHEUS bool G/I setup pushgateway on this infra node?
pushgateway_options PROMETHEUS arg G pushgateway extra server options
blackbox_enabled PROMETHEUS bool G/I setup blackbox_exporter on this infra node?
blackbox_options PROMETHEUS arg G blackbox_exporter extra server options
alertmanager_enabled PROMETHEUS bool G/I setup alertmanager on this infra node?
alertmanager_options PROMETHEUS arg G alertmanager extra server options
exporter_metrics_path PROMETHEUS path G exporter metric path, /metrics by default
exporter_install PROMETHEUS enum G how to install exporter? none,yum,binary
exporter_repo_url PROMETHEUS url G exporter repo file url if install exporter via yum
grafana_enabled GRAFANA bool G/I enable grafana on this infra node?
grafana_clean GRAFANA bool G/A clean grafana data during init?
grafana_admin_username GRAFANA username G grafana admin username, admin by default
grafana_admin_password GRAFANA password G grafana admin password, pigsty by default
grafana_plugin_cache GRAFANA path G path to grafana plugins cache tarball
grafana_plugin_list GRAFANA string[] G grafana plugins to be downloaded with grafana-cli
loki_enabled LOKI bool G/I enable loki on this infra node?
loki_clean LOKI bool G/A whether remove existing loki data?
loki_data LOKI path G loki data dir, /data/loki by default
loki_retention LOKI interval G loki log retention period, 15d by default

6.1 - Metrics

Pigsty INFRA module metric list

INFRA module has 964 available metrics

Metric Name Type Labels Description
alertmanager_alerts gauge ins, instance, ip, job, cls, state How many alerts by state.
alertmanager_alerts_invalid_total counter version, ins, instance, ip, job, cls The total number of received alerts that were invalid.
alertmanager_alerts_received_total counter version, ins, instance, ip, status, job, cls The total number of received alerts.
alertmanager_build_info gauge revision, version, ins, instance, ip, tags, goarch, goversion, job, cls, branch, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which alertmanager was built, and the goos and goarch for the build.
alertmanager_cluster_alive_messages_total counter ins, instance, ip, peer, job, cls Total number of received alive messages.
alertmanager_cluster_enabled gauge ins, instance, ip, job, cls Indicates whether the clustering is enabled or not.
alertmanager_cluster_failed_peers gauge ins, instance, ip, job, cls Number indicating the current number of failed peers in the cluster.
alertmanager_cluster_health_score gauge ins, instance, ip, job, cls Health score of the cluster. Lower values are better and zero means ’totally healthy’.
alertmanager_cluster_members gauge ins, instance, ip, job, cls Number indicating current number of members in cluster.
alertmanager_cluster_messages_pruned_total counter ins, instance, ip, job, cls Total number of cluster messages pruned.
alertmanager_cluster_messages_queued gauge ins, instance, ip, job, cls Number of cluster messages which are queued.
alertmanager_cluster_messages_received_size_total counter ins, instance, ip, msg_type, job, cls Total size of cluster messages received.
alertmanager_cluster_messages_received_total counter ins, instance, ip, msg_type, job, cls Total number of cluster messages received.
alertmanager_cluster_messages_sent_size_total counter ins, instance, ip, msg_type, job, cls Total size of cluster messages sent.
alertmanager_cluster_messages_sent_total counter ins, instance, ip, msg_type, job, cls Total number of cluster messages sent.
alertmanager_cluster_peer_info gauge ins, instance, ip, peer, job, cls A metric with a constant ‘1’ value labeled by peer name.
alertmanager_cluster_peers_joined_total counter ins, instance, ip, job, cls A counter of the number of peers that have joined.
alertmanager_cluster_peers_left_total counter ins, instance, ip, job, cls A counter of the number of peers that have left.
alertmanager_cluster_peers_update_total counter ins, instance, ip, job, cls A counter of the number of peers that have updated metadata.
alertmanager_cluster_reconnections_failed_total counter ins, instance, ip, job, cls A counter of the number of failed cluster peer reconnection attempts.
alertmanager_cluster_reconnections_total counter ins, instance, ip, job, cls A counter of the number of cluster peer reconnections.
alertmanager_cluster_refresh_join_failed_total counter ins, instance, ip, job, cls A counter of the number of failed cluster peer joined attempts via refresh.
alertmanager_cluster_refresh_join_total counter ins, instance, ip, job, cls A counter of the number of cluster peer joined via refresh.
alertmanager_config_hash gauge ins, instance, ip, job, cls Hash of the currently loaded alertmanager configuration.
alertmanager_config_last_reload_success_timestamp_seconds gauge ins, instance, ip, job, cls Timestamp of the last successful configuration reload.
alertmanager_config_last_reload_successful gauge ins, instance, ip, job, cls Whether the last configuration reload attempt was successful.
alertmanager_dispatcher_aggregation_groups gauge ins, instance, ip, job, cls Number of active aggregation groups
alertmanager_dispatcher_alert_processing_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
alertmanager_dispatcher_alert_processing_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
alertmanager_http_concurrency_limit_exceeded_total counter ins, instance, method, ip, job, cls Total number of times an HTTP request failed because the concurrency limit was reached.
alertmanager_http_request_duration_seconds_bucket Unknown ins, instance, method, ip, le, job, cls, handler N/A
alertmanager_http_request_duration_seconds_count Unknown ins, instance, method, ip, job, cls, handler N/A
alertmanager_http_request_duration_seconds_sum Unknown ins, instance, method, ip, job, cls, handler N/A
alertmanager_http_requests_in_flight gauge ins, instance, method, ip, job, cls Current number of HTTP requests being processed.
alertmanager_http_response_size_bytes_bucket Unknown ins, instance, method, ip, le, job, cls, handler N/A
alertmanager_http_response_size_bytes_count Unknown ins, instance, method, ip, job, cls, handler N/A
alertmanager_http_response_size_bytes_sum Unknown ins, instance, method, ip, job, cls, handler N/A
alertmanager_integrations gauge ins, instance, ip, job, cls Number of configured integrations.
alertmanager_marked_alerts gauge ins, instance, ip, job, cls, state How many alerts by state are currently marked in the Alertmanager regardless of their expiry.
alertmanager_nflog_gc_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
alertmanager_nflog_gc_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
alertmanager_nflog_gossip_messages_propagated_total counter ins, instance, ip, job, cls Number of received gossip messages that have been further gossiped.
alertmanager_nflog_maintenance_errors_total counter ins, instance, ip, job, cls How many maintenance runs for the notification log failed.
alertmanager_nflog_maintenance_total counter ins, instance, ip, job, cls How many maintenance runs were executed for the notification log.
alertmanager_nflog_queries_total counter ins, instance, ip, job, cls Number of notification log queries received.
alertmanager_nflog_query_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
alertmanager_nflog_query_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
alertmanager_nflog_query_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
alertmanager_nflog_query_errors_total counter ins, instance, ip, job, cls Number of received notification log queries that failed.
alertmanager_nflog_snapshot_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
alertmanager_nflog_snapshot_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
alertmanager_nflog_snapshot_size_bytes gauge ins, instance, ip, job, cls Size of the last notification log snapshot in bytes.
alertmanager_notification_latency_seconds_bucket Unknown integration, ins, instance, ip, le, job, cls N/A
alertmanager_notification_latency_seconds_count Unknown integration, ins, instance, ip, job, cls N/A
alertmanager_notification_latency_seconds_sum Unknown integration, ins, instance, ip, job, cls N/A
alertmanager_notification_requests_failed_total counter integration, ins, instance, ip, job, cls The total number of failed notification requests.
alertmanager_notification_requests_total counter integration, ins, instance, ip, job, cls The total number of attempted notification requests.
alertmanager_notifications_failed_total counter integration, ins, instance, ip, reason, job, cls The total number of failed notifications.
alertmanager_notifications_total counter integration, ins, instance, ip, job, cls The total number of attempted notifications.
alertmanager_oversize_gossip_message_duration_seconds_bucket Unknown ins, instance, ip, le, key, job, cls N/A
alertmanager_oversize_gossip_message_duration_seconds_count Unknown ins, instance, ip, key, job, cls N/A
alertmanager_oversize_gossip_message_duration_seconds_sum Unknown ins, instance, ip, key, job, cls N/A
alertmanager_oversized_gossip_message_dropped_total counter ins, instance, ip, key, job, cls Number of oversized gossip messages that were dropped due to a full message queue.
alertmanager_oversized_gossip_message_failure_total counter ins, instance, ip, key, job, cls Number of oversized gossip message sends that failed.
alertmanager_oversized_gossip_message_sent_total counter ins, instance, ip, key, job, cls Number of oversized gossip messages sent.
alertmanager_peer_position gauge ins, instance, ip, job, cls Position the Alertmanager instance believes it’s in. The position determines a peer’s behavior in the cluster.
alertmanager_receivers gauge ins, instance, ip, job, cls Number of configured receivers.
alertmanager_silences gauge ins, instance, ip, job, cls, state Number of silences by state.
alertmanager_silences_gc_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
alertmanager_silences_gc_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
alertmanager_silences_gossip_messages_propagated_total counter ins, instance, ip, job, cls Number of received gossip messages that have been further gossiped.
alertmanager_silences_maintenance_errors_total counter ins, instance, ip, job, cls How many silence maintenance runs failed.
alertmanager_silences_maintenance_total counter ins, instance, ip, job, cls How many silence maintenance runs were executed.
alertmanager_silences_queries_total counter ins, instance, ip, job, cls How many silence queries were received.
alertmanager_silences_query_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
alertmanager_silences_query_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
alertmanager_silences_query_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
alertmanager_silences_query_errors_total counter ins, instance, ip, job, cls How many received silence queries did not succeed.
alertmanager_silences_snapshot_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
alertmanager_silences_snapshot_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
alertmanager_silences_snapshot_size_bytes gauge ins, instance, ip, job, cls Size of the last silence snapshot in bytes.
blackbox_exporter_build_info gauge revision, version, ins, instance, ip, tags, goarch, goversion, job, cls, branch, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which blackbox_exporter was built, and the goos and goarch for the build.
blackbox_exporter_config_last_reload_success_timestamp_seconds gauge ins, instance, ip, job, cls Timestamp of the last successful configuration reload.
blackbox_exporter_config_last_reload_successful gauge ins, instance, ip, job, cls Blackbox exporter config loaded successfully.
blackbox_module_unknown_total counter ins, instance, ip, job, cls Count of unknown modules requested by probes
cortex_distributor_ingester_clients gauge ins, instance, ip, job, cls The current number of ingester clients.
cortex_dns_failures_total Unknown ins, instance, ip, job, cls N/A
cortex_dns_lookups_total Unknown ins, instance, ip, job, cls N/A
cortex_frontend_query_range_duration_seconds_bucket Unknown ins, instance, method, ip, le, job, cls, status_code N/A
cortex_frontend_query_range_duration_seconds_count Unknown ins, instance, method, ip, job, cls, status_code N/A
cortex_frontend_query_range_duration_seconds_sum Unknown ins, instance, method, ip, job, cls, status_code N/A
cortex_ingester_flush_queue_length gauge ins, instance, ip, job, cls The total number of series pending in the flush queue.
cortex_kv_request_duration_seconds_bucket Unknown ins, instance, role, ip, le, kv_name, type, operation, job, cls, status_code N/A
cortex_kv_request_duration_seconds_count Unknown ins, instance, role, ip, kv_name, type, operation, job, cls, status_code N/A
cortex_kv_request_duration_seconds_sum Unknown ins, instance, role, ip, kv_name, type, operation, job, cls, status_code N/A
cortex_member_consul_heartbeats_total Unknown ins, instance, ip, job, cls N/A
cortex_prometheus_notifications_alertmanagers_discovered gauge ins, instance, ip, user, job, cls The number of alertmanagers discovered and active.
cortex_prometheus_notifications_dropped_total Unknown ins, instance, ip, user, job, cls N/A
cortex_prometheus_notifications_queue_capacity gauge ins, instance, ip, user, job, cls The capacity of the alert notifications queue.
cortex_prometheus_notifications_queue_length gauge ins, instance, ip, user, job, cls The number of alert notifications in the queue.
cortex_prometheus_rule_evaluation_duration_seconds summary ins, instance, ip, user, job, cls, quantile The duration for a rule to execute.
cortex_prometheus_rule_evaluation_duration_seconds_count Unknown ins, instance, ip, user, job, cls N/A
cortex_prometheus_rule_evaluation_duration_seconds_sum Unknown ins, instance, ip, user, job, cls N/A
cortex_prometheus_rule_group_duration_seconds summary ins, instance, ip, user, job, cls, quantile The duration of rule group evaluations.
cortex_prometheus_rule_group_duration_seconds_count Unknown ins, instance, ip, user, job, cls N/A
cortex_prometheus_rule_group_duration_seconds_sum Unknown ins, instance, ip, user, job, cls N/A
cortex_query_frontend_connected_schedulers gauge ins, instance, ip, job, cls Number of schedulers this frontend is connected to.
cortex_query_frontend_queries_in_progress gauge ins, instance, ip, job, cls Number of queries in progress handled by this frontend.
cortex_query_frontend_retries_bucket Unknown ins, instance, ip, le, job, cls N/A
cortex_query_frontend_retries_count Unknown ins, instance, ip, job, cls N/A
cortex_query_frontend_retries_sum Unknown ins, instance, ip, job, cls N/A
cortex_query_scheduler_connected_frontend_clients gauge ins, instance, ip, job, cls Number of query-frontend worker clients currently connected to the query-scheduler.
cortex_query_scheduler_connected_querier_clients gauge ins, instance, ip, job, cls Number of querier worker clients currently connected to the query-scheduler.
cortex_query_scheduler_inflight_requests summary ins, instance, ip, job, cls, quantile Number of inflight requests (either queued or processing) sampled at a regular interval. Quantile buckets keep track of inflight requests over the last 60s.
cortex_query_scheduler_inflight_requests_count Unknown ins, instance, ip, job, cls N/A
cortex_query_scheduler_inflight_requests_sum Unknown ins, instance, ip, job, cls N/A
cortex_query_scheduler_queue_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
cortex_query_scheduler_queue_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
cortex_query_scheduler_queue_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
cortex_query_scheduler_queue_length Unknown ins, instance, ip, user, job, cls N/A
cortex_query_scheduler_running gauge ins, instance, ip, job, cls Value will be 1 if the scheduler is in the ReplicationSet and actively receiving/processing requests
cortex_ring_member_heartbeats_total Unknown ins, instance, ip, job, cls N/A
cortex_ring_member_tokens_owned gauge ins, instance, ip, job, cls The number of tokens owned in the ring.
cortex_ring_member_tokens_to_own gauge ins, instance, ip, job, cls The number of tokens to own in the ring.
cortex_ring_members gauge ins, instance, ip, job, cls, state Number of members in the ring
cortex_ring_oldest_member_timestamp gauge ins, instance, ip, job, cls, state Timestamp of the oldest member in the ring.
cortex_ring_tokens_total gauge ins, instance, ip, job, cls Number of tokens in the ring
cortex_ruler_clients gauge ins, instance, ip, job, cls The current number of ruler clients in the pool.
cortex_ruler_config_last_reload_successful gauge ins, instance, ip, user, job, cls Boolean set to 1 whenever the last configuration reload attempt was successful.
cortex_ruler_config_last_reload_successful_seconds gauge ins, instance, ip, user, job, cls Timestamp of the last successful configuration reload.
cortex_ruler_config_updates_total Unknown ins, instance, ip, user, job, cls N/A
cortex_ruler_managers_total gauge ins, instance, ip, job, cls Total number of managers registered and running in the ruler
cortex_ruler_ring_check_errors_total Unknown ins, instance, ip, job, cls N/A
cortex_ruler_sync_rules_total Unknown ins, instance, ip, reason, job, cls N/A
deprecated_flags_inuse_total Unknown ins, instance, ip, job, cls N/A
go_cgo_go_to_c_calls_calls_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_gc_mark_assist_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_gc_mark_dedicated_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_gc_mark_idle_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_gc_pause_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_gc_total_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_idle_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_scavenge_assist_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_scavenge_background_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_scavenge_total_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_total_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_cpu_classes_user_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
go_gc_cycles_automatic_gc_cycles_total Unknown ins, instance, ip, job, cls N/A
go_gc_cycles_forced_gc_cycles_total Unknown ins, instance, ip, job, cls N/A
go_gc_cycles_total_gc_cycles_total Unknown ins, instance, ip, job, cls N/A
go_gc_duration_seconds summary ins, instance, ip, job, cls, quantile A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
go_gc_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
go_gc_gogc_percent gauge ins, instance, ip, job, cls Heap size target percentage configured by the user, otherwise 100. This value is set by the GOGC environment variable, and the runtime/debug.SetGCPercent function.
go_gc_gomemlimit_bytes gauge ins, instance, ip, job, cls Go runtime memory limit configured by the user, otherwise math.MaxInt64. This value is set by the GOMEMLIMIT environment variable, and the runtime/debug.SetMemoryLimit function.
go_gc_heap_allocs_by_size_bytes_bucket Unknown ins, instance, ip, le, job, cls N/A
go_gc_heap_allocs_by_size_bytes_count Unknown ins, instance, ip, job, cls N/A
go_gc_heap_allocs_by_size_bytes_sum Unknown ins, instance, ip, job, cls N/A
go_gc_heap_allocs_bytes_total Unknown ins, instance, ip, job, cls N/A
go_gc_heap_allocs_objects_total Unknown ins, instance, ip, job, cls N/A
go_gc_heap_frees_by_size_bytes_bucket Unknown ins, instance, ip, le, job, cls N/A
go_gc_heap_frees_by_size_bytes_count Unknown ins, instance, ip, job, cls N/A
go_gc_heap_frees_by_size_bytes_sum Unknown ins, instance, ip, job, cls N/A
go_gc_heap_frees_bytes_total Unknown ins, instance, ip, job, cls N/A
go_gc_heap_frees_objects_total Unknown ins, instance, ip, job, cls N/A
go_gc_heap_goal_bytes gauge ins, instance, ip, job, cls Heap size target for the end of the GC cycle.
go_gc_heap_live_bytes gauge ins, instance, ip, job, cls Heap memory occupied by live objects that were marked by the previous GC.
go_gc_heap_objects_objects gauge ins, instance, ip, job, cls Number of objects, live or unswept, occupying heap memory.
go_gc_heap_tiny_allocs_objects_total Unknown ins, instance, ip, job, cls N/A
go_gc_limiter_last_enabled_gc_cycle gauge ins, instance, ip, job, cls GC cycle the last time the GC CPU limiter was enabled. This metric is useful for diagnosing the root cause of an out-of-memory error, because the limiter trades memory for CPU time when the GC’s CPU time gets too high. This is most likely to occur with use of SetMemoryLimit. The first GC cycle is cycle 1, so a value of 0 indicates that it was never enabled.
go_gc_pauses_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
go_gc_pauses_seconds_count Unknown ins, instance, ip, job, cls N/A
go_gc_pauses_seconds_sum Unknown ins, instance, ip, job, cls N/A
go_gc_scan_globals_bytes gauge ins, instance, ip, job, cls The total amount of global variable space that is scannable.
go_gc_scan_heap_bytes gauge ins, instance, ip, job, cls The total amount of heap space that is scannable.
go_gc_scan_stack_bytes gauge ins, instance, ip, job, cls The number of bytes of stack that were scanned last GC cycle.
go_gc_scan_total_bytes gauge ins, instance, ip, job, cls The total amount of space that is scannable. Sum of all metrics in /gc/scan.
go_gc_stack_starting_size_bytes gauge ins, instance, ip, job, cls The stack size of new goroutines.
go_godebug_non_default_behavior_execerrdot_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_gocachehash_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_gocachetest_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_gocacheverify_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_http2client_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_http2server_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_installgoroot_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_jstmpllitinterp_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_multipartmaxheaders_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_multipartmaxparts_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_multipathtcp_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_panicnil_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_randautoseed_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_tarinsecurepath_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_tlsmaxrsasize_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_x509sha1_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_x509usefallbackroots_events_total Unknown ins, instance, ip, job, cls N/A
go_godebug_non_default_behavior_zipinsecurepath_events_total Unknown ins, instance, ip, job, cls N/A
go_goroutines gauge ins, instance, ip, job, cls Number of goroutines that currently exist.
go_info gauge version, ins, instance, ip, job, cls Information about the Go environment.
go_memory_classes_heap_free_bytes gauge ins, instance, ip, job, cls Memory that is completely free and eligible to be returned to the underlying system, but has not been. This metric is the runtime’s estimate of free address space that is backed by physical memory.
go_memory_classes_heap_objects_bytes gauge ins, instance, ip, job, cls Memory occupied by live objects and dead objects that have not yet been marked free by the garbage collector.
go_memory_classes_heap_released_bytes gauge ins, instance, ip, job, cls Memory that is completely free and has been returned to the underlying system. This metric is the runtime’s estimate of free address space that is still mapped into the process, but is not backed by physical memory.
go_memory_classes_heap_stacks_bytes gauge ins, instance, ip, job, cls Memory allocated from the heap that is reserved for stack space, whether or not it is currently in-use. Currently, this represents all stack memory for goroutines. It also includes all OS thread stacks in non-cgo programs. Note that stacks may be allocated differently in the future, and this may change.
go_memory_classes_heap_unused_bytes gauge ins, instance, ip, job, cls Memory that is reserved for heap objects but is not currently used to hold heap objects.
go_memory_classes_metadata_mcache_free_bytes gauge ins, instance, ip, job, cls Memory that is reserved for runtime mcache structures, but not in-use.
go_memory_classes_metadata_mcache_inuse_bytes gauge ins, instance, ip, job, cls Memory that is occupied by runtime mcache structures that are currently being used.
go_memory_classes_metadata_mspan_free_bytes gauge ins, instance, ip, job, cls Memory that is reserved for runtime mspan structures, but not in-use.
go_memory_classes_metadata_mspan_inuse_bytes gauge ins, instance, ip, job, cls Memory that is occupied by runtime mspan structures that are currently being used.
go_memory_classes_metadata_other_bytes gauge ins, instance, ip, job, cls Memory that is reserved for or used to hold runtime metadata.
go_memory_classes_os_stacks_bytes gauge ins, instance, ip, job, cls Stack memory allocated by the underlying operating system. In non-cgo programs this metric is currently zero. This may change in the future. In cgo programs this metric includes OS thread stacks allocated directly from the OS. Currently, this only accounts for one stack in c-shared and c-archive build modes, and other sources of stacks from the OS are not measured. This too may change in the future.
go_memory_classes_other_bytes gauge ins, instance, ip, job, cls Memory used by execution trace buffers, structures for debugging the runtime, finalizer and profiler specials, and more.
go_memory_classes_profiling_buckets_bytes gauge ins, instance, ip, job, cls Memory that is used by the stack trace hash map used for profiling.
go_memory_classes_total_bytes gauge ins, instance, ip, job, cls All memory mapped by the Go runtime into the current process as read-write. Note that this does not include memory mapped by code called via cgo or via the syscall package. Sum of all metrics in /memory/classes.
go_memstats_alloc_bytes counter ins, instance, ip, job, cls Total number of bytes allocated, even if freed.
go_memstats_alloc_bytes_total counter ins, instance, ip, job, cls Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge ins, instance, ip, job, cls Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter ins, instance, ip, job, cls Total number of frees.
go_memstats_gc_sys_bytes gauge ins, instance, ip, job, cls Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge ins, instance, ip, job, cls Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge ins, instance, ip, job, cls Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge ins, instance, ip, job, cls Number of heap bytes that are in use.
go_memstats_heap_objects gauge ins, instance, ip, job, cls Number of allocated objects.
go_memstats_heap_released_bytes gauge ins, instance, ip, job, cls Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge ins, instance, ip, job, cls Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge ins, instance, ip, job, cls Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter ins, instance, ip, job, cls Total number of pointer lookups.
go_memstats_mallocs_total counter ins, instance, ip, job, cls Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge ins, instance, ip, job, cls Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge ins, instance, ip, job, cls Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge ins, instance, ip, job, cls Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge ins, instance, ip, job, cls Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge ins, instance, ip, job, cls Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge ins, instance, ip, job, cls Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge ins, instance, ip, job, cls Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge ins, instance, ip, job, cls Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge ins, instance, ip, job, cls Number of bytes obtained from system.
go_sched_gomaxprocs_threads gauge ins, instance, ip, job, cls The current runtime.GOMAXPROCS setting, or the number of operating system threads that can execute user-level Go code simultaneously.
go_sched_goroutines_goroutines gauge ins, instance, ip, job, cls Count of live goroutines.
go_sched_latencies_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
go_sched_latencies_seconds_count Unknown ins, instance, ip, job, cls N/A
go_sched_latencies_seconds_sum Unknown ins, instance, ip, job, cls N/A
go_sql_stats_connections_blocked_seconds unknown ins, instance, db_name, ip, job, cls The total time blocked waiting for a new connection.
go_sql_stats_connections_closed_max_idle unknown ins, instance, db_name, ip, job, cls The total number of connections closed due to SetMaxIdleConns.
go_sql_stats_connections_closed_max_idle_time unknown ins, instance, db_name, ip, job, cls The total number of connections closed due to SetConnMaxIdleTime.
go_sql_stats_connections_closed_max_lifetime unknown ins, instance, db_name, ip, job, cls The total number of connections closed due to SetConnMaxLifetime.
go_sql_stats_connections_idle gauge ins, instance, db_name, ip, job, cls The number of idle connections.
go_sql_stats_connections_in_use gauge ins, instance, db_name, ip, job, cls The number of connections currently in use.
go_sql_stats_connections_max_open gauge ins, instance, db_name, ip, job, cls Maximum number of open connections to the database.
go_sql_stats_connections_open gauge ins, instance, db_name, ip, job, cls The number of established connections both in use and idle.
go_sql_stats_connections_waited_for unknown ins, instance, db_name, ip, job, cls The total number of connections waited for.
go_sync_mutex_wait_total_seconds_total Unknown ins, instance, ip, job, cls N/A
go_threads gauge ins, instance, ip, job, cls Number of OS threads created.
grafana_access_evaluation_count unknown ins, instance, ip, job, cls number of evaluation calls
grafana_access_evaluation_duration_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_access_evaluation_duration_count Unknown ins, instance, ip, job, cls N/A
grafana_access_evaluation_duration_sum Unknown ins, instance, ip, job, cls N/A
grafana_access_permissions_duration_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_access_permissions_duration_count Unknown ins, instance, ip, job, cls N/A
grafana_access_permissions_duration_sum Unknown ins, instance, ip, job, cls N/A
grafana_aggregator_discovery_aggregation_count_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_active_alerts gauge ins, instance, ip, job, cls amount of active alerts
grafana_alerting_active_configurations gauge ins, instance, ip, job, cls The number of active Alertmanager configurations.
grafana_alerting_alertmanager_config_match gauge ins, instance, ip, job, cls The total number of match
grafana_alerting_alertmanager_config_match_re gauge ins, instance, ip, job, cls The total number of matchRE
grafana_alerting_alertmanager_config_matchers gauge ins, instance, ip, job, cls The total number of matchers
grafana_alerting_alertmanager_config_object_matchers gauge ins, instance, ip, job, cls The total number of object_matchers
grafana_alerting_discovered_configurations gauge ins, instance, ip, job, cls The number of organizations we’ve discovered that require an Alertmanager configuration.
grafana_alerting_dispatcher_aggregation_groups gauge ins, instance, ip, job, cls Number of active aggregation groups
grafana_alerting_dispatcher_alert_processing_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_dispatcher_alert_processing_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_execution_time_milliseconds summary ins, instance, ip, job, cls, quantile summary of alert execution duration
grafana_alerting_execution_time_milliseconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_execution_time_milliseconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_gc_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_gc_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_gossip_messages_propagated_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_queries_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_query_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_alerting_nflog_query_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_query_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_query_errors_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_snapshot_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_snapshot_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_nflog_snapshot_size_bytes gauge ins, instance, ip, job, cls Size of the last notification log snapshot in bytes.
grafana_alerting_notification_latency_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_alerting_notification_latency_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_notification_latency_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_schedule_alert_rules gauge ins, instance, ip, job, cls The number of alert rules that could be considered for evaluation at the next tick.
grafana_alerting_schedule_alert_rules_hash gauge ins, instance, ip, job, cls A hash of the alert rules that could be considered for evaluation at the next tick.
grafana_alerting_schedule_periodic_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_alerting_schedule_periodic_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_schedule_periodic_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_schedule_query_alert_rules_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_alerting_schedule_query_alert_rules_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_schedule_query_alert_rules_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_scheduler_behind_seconds gauge ins, instance, ip, job, cls The total number of seconds the scheduler is behind.
grafana_alerting_silences_gc_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_gc_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_gossip_messages_propagated_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_queries_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_query_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_alerting_silences_query_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_query_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_query_errors_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_snapshot_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_snapshot_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_silences_snapshot_size_bytes gauge ins, instance, ip, job, cls Size of the last silence snapshot in bytes.
grafana_alerting_state_calculation_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_alerting_state_calculation_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_alerting_state_calculation_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_alerting_state_history_writes_bytes_total Unknown ins, instance, ip, job, cls N/A
grafana_alerting_ticker_interval_seconds gauge ins, instance, ip, job, cls Interval at which the ticker is meant to tick.
grafana_alerting_ticker_last_consumed_tick_timestamp_seconds gauge ins, instance, ip, job, cls Timestamp of the last consumed tick in seconds.
grafana_alerting_ticker_next_tick_timestamp_seconds gauge ins, instance, ip, job, cls Timestamp of the next tick in seconds before it is consumed.
grafana_api_admin_user_created_total Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_get_milliseconds summary ins, instance, ip, job, cls, quantile summary for dashboard get duration
grafana_api_dashboard_get_milliseconds_count Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_get_milliseconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_save_milliseconds summary ins, instance, ip, job, cls, quantile summary for dashboard save duration
grafana_api_dashboard_save_milliseconds_count Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_save_milliseconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_search_milliseconds summary ins, instance, ip, job, cls, quantile summary for dashboard search duration
grafana_api_dashboard_search_milliseconds_count Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_search_milliseconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_snapshot_create_total Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_snapshot_external_total Unknown ins, instance, ip, job, cls N/A
grafana_api_dashboard_snapshot_get_total Unknown ins, instance, ip, job, cls N/A
grafana_api_dataproxy_request_all_milliseconds summary ins, instance, ip, job, cls, quantile summary for dataproxy request duration
grafana_api_dataproxy_request_all_milliseconds_count Unknown ins, instance, ip, job, cls N/A
grafana_api_dataproxy_request_all_milliseconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_api_login_oauth_total Unknown ins, instance, ip, job, cls N/A
grafana_api_login_post_total Unknown ins, instance, ip, job, cls N/A
grafana_api_login_saml_total Unknown ins, instance, ip, job, cls N/A
grafana_api_models_dashboard_insert_total Unknown ins, instance, ip, job, cls N/A
grafana_api_org_create_total Unknown ins, instance, ip, job, cls N/A
grafana_api_response_status_total Unknown ins, instance, ip, job, cls, code N/A
grafana_api_user_signup_completed_total Unknown ins, instance, ip, job, cls N/A
grafana_api_user_signup_invite_total Unknown ins, instance, ip, job, cls N/A
grafana_api_user_signup_started_total Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_audit_event_total Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_audit_requests_rejected_total Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_client_certificate_expiration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_apiserver_client_certificate_expiration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_client_certificate_expiration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_envelope_encryption_dek_cache_fill_percent gauge ins, instance, ip, job, cls [ALPHA] Percent of the cache slots currently occupied by cached DEKs.
grafana_apiserver_flowcontrol_seat_fair_frac gauge ins, instance, ip, job, cls [ALPHA] Fair fraction of server’s concurrency to allocate to each priority level that can use it
grafana_apiserver_storage_data_key_generation_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_apiserver_storage_data_key_generation_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_storage_data_key_generation_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_storage_data_key_generation_failures_total Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_storage_envelope_transformation_cache_misses_total Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_tls_handshake_errors_total Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_webhooks_x509_insecure_sha1_total Unknown ins, instance, ip, job, cls N/A
grafana_apiserver_webhooks_x509_missing_san_total Unknown ins, instance, ip, job, cls N/A
grafana_authn_authn_failed_authentication_total Unknown ins, instance, ip, job, cls N/A
grafana_authn_authn_successful_authentication_total Unknown ins, instance, ip, client, job, cls N/A
grafana_authn_authn_successful_login_total Unknown ins, instance, ip, client, job, cls N/A
grafana_aws_cloudwatch_get_metric_data_total Unknown ins, instance, ip, job, cls N/A
grafana_aws_cloudwatch_get_metric_statistics_total Unknown ins, instance, ip, job, cls N/A
grafana_aws_cloudwatch_list_metrics_total Unknown ins, instance, ip, job, cls N/A
grafana_build_info gauge revision, version, ins, instance, edition, ip, goversion, job, cls, branch A metric with a constant ‘1’ value labeled by version, revision, branch, and goversion from which Grafana was built
grafana_build_timestamp gauge revision, version, ins, instance, edition, ip, goversion, job, cls, branch A metric exposing when the binary was built, as a Unix epoch timestamp
grafana_cardinality_enforcement_unexpected_categorizations_total Unknown ins, instance, ip, job, cls N/A
grafana_database_conn_idle gauge ins, instance, ip, job, cls The number of idle connections
grafana_database_conn_in_use gauge ins, instance, ip, job, cls The number of connections currently in use
grafana_database_conn_max_idle_closed_seconds unknown ins, instance, ip, job, cls The total number of connections closed due to SetConnMaxIdleTime
grafana_database_conn_max_idle_closed_total Unknown ins, instance, ip, job, cls N/A
grafana_database_conn_max_lifetime_closed_total Unknown ins, instance, ip, job, cls N/A
grafana_database_conn_max_open gauge ins, instance, ip, job, cls Maximum number of open connections to the database
grafana_database_conn_open gauge ins, instance, ip, job, cls The number of established connections both in use and idle
grafana_database_conn_wait_count_total Unknown ins, instance, ip, job, cls N/A
grafana_database_conn_wait_duration_seconds unknown ins, instance, ip, job, cls The total time blocked waiting for a new connection
grafana_datasource_request_duration_seconds_bucket Unknown datasource, ins, instance, method, ip, le, datasource_type, job, cls, code N/A
grafana_datasource_request_duration_seconds_count Unknown datasource, ins, instance, method, ip, datasource_type, job, cls, code N/A
grafana_datasource_request_duration_seconds_sum Unknown datasource, ins, instance, method, ip, datasource_type, job, cls, code N/A
grafana_datasource_request_in_flight gauge datasource, ins, instance, ip, datasource_type, job, cls A gauge of outgoing data source requests currently being sent by Grafana
grafana_datasource_request_total Unknown datasource, ins, instance, method, ip, datasource_type, job, cls, code N/A
grafana_datasource_response_size_bytes_bucket Unknown datasource, ins, instance, ip, le, datasource_type, job, cls N/A
grafana_datasource_response_size_bytes_count Unknown datasource, ins, instance, ip, datasource_type, job, cls N/A
grafana_datasource_response_size_bytes_sum Unknown datasource, ins, instance, ip, datasource_type, job, cls N/A
grafana_db_datasource_query_by_id_total Unknown ins, instance, ip, job, cls N/A
grafana_disabled_metrics_total Unknown ins, instance, ip, job, cls N/A
grafana_emails_sent_failed unknown ins, instance, ip, job, cls Number of emails Grafana failed to send
grafana_emails_sent_total Unknown ins, instance, ip, job, cls N/A
grafana_encryption_cache_reads_total Unknown ins, instance, method, ip, hit, job, cls N/A
grafana_encryption_ops_total Unknown ins, instance, ip, success, operation, job, cls N/A
grafana_environment_info gauge version, ins, instance, ip, job, cls, commit A metric with a constant ‘1’ value labeled by environment information about the running instance.
grafana_feature_toggles_info gauge ins, instance, ip, job, cls info metric that exposes what feature toggles are enabled or not
grafana_frontend_boot_css_time_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_frontend_boot_css_time_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_css_time_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_first_contentful_paint_time_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_frontend_boot_first_contentful_paint_time_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_first_contentful_paint_time_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_first_paint_time_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_frontend_boot_first_paint_time_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_first_paint_time_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_js_done_time_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_frontend_boot_js_done_time_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_js_done_time_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_load_time_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_frontend_boot_load_time_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_frontend_boot_load_time_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_frontend_plugins_preload_ms_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_frontend_plugins_preload_ms_count Unknown ins, instance, ip, job, cls N/A
grafana_frontend_plugins_preload_ms_sum Unknown ins, instance, ip, job, cls N/A
grafana_hidden_metrics_total Unknown ins, instance, ip, job, cls N/A
grafana_http_request_duration_seconds_bucket Unknown ins, instance, method, ip, le, job, cls, status_code, handler N/A
grafana_http_request_duration_seconds_count Unknown ins, instance, method, ip, job, cls, status_code, handler N/A
grafana_http_request_duration_seconds_sum Unknown ins, instance, method, ip, job, cls, status_code, handler N/A
grafana_http_request_in_flight gauge ins, instance, ip, job, cls A gauge of requests currently being served by Grafana.
grafana_idforwarding_idforwarding_failed_token_signing_total Unknown ins, instance, ip, job, cls N/A
grafana_idforwarding_idforwarding_token_signing_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_idforwarding_idforwarding_token_signing_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_idforwarding_idforwarding_token_signing_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_idforwarding_idforwarding_token_signing_from_cache_total Unknown ins, instance, ip, job, cls N/A
grafana_idforwarding_idforwarding_token_signing_total Unknown ins, instance, ip, job, cls N/A
grafana_instance_start_total Unknown ins, instance, ip, job, cls N/A
grafana_ldap_users_sync_execution_time summary ins, instance, ip, job, cls, quantile summary for LDAP users sync execution duration
grafana_ldap_users_sync_execution_time_count Unknown ins, instance, ip, job, cls N/A
grafana_ldap_users_sync_execution_time_sum Unknown ins, instance, ip, job, cls N/A
grafana_live_client_command_duration_seconds summary ins, instance, method, ip, job, cls, quantile Client command duration summary.
grafana_live_client_command_duration_seconds_count Unknown ins, instance, method, ip, job, cls N/A
grafana_live_client_command_duration_seconds_sum Unknown ins, instance, method, ip, job, cls N/A
grafana_live_client_num_reply_errors unknown ins, instance, method, ip, job, cls, code Number of errors in replies sent to clients.
grafana_live_client_num_server_disconnects unknown ins, instance, ip, job, cls, code Number of server initiated disconnects.
grafana_live_client_recover unknown ins, instance, ip, recovered, job, cls Count of recover operations.
grafana_live_node_action_count unknown action, ins, instance, ip, job, cls Number of node actions called.
grafana_live_node_build gauge version, ins, instance, ip, job, cls Node build info.
grafana_live_node_messages_received_count unknown ins, instance, ip, type, job, cls Number of messages received.
grafana_live_node_messages_sent_count unknown ins, instance, ip, type, job, cls Number of messages sent.
grafana_live_node_num_channels gauge ins, instance, ip, job, cls Number of channels with one or more subscribers.
grafana_live_node_num_clients gauge ins, instance, ip, job, cls Number of clients connected.
grafana_live_node_num_nodes gauge ins, instance, ip, job, cls Number of nodes in cluster.
grafana_live_node_num_subscriptions gauge ins, instance, ip, job, cls Number of subscriptions.
grafana_live_node_num_users gauge ins, instance, ip, job, cls Number of unique users connected.
grafana_live_transport_connect_count unknown ins, instance, ip, transport, job, cls Number of connections to specific transport.
grafana_live_transport_messages_sent unknown ins, instance, ip, transport, job, cls Number of messages sent over specific transport.
grafana_loki_plugin_parse_response_duration_seconds_bucket Unknown endpoint, ins, instance, ip, le, status, job, cls N/A
grafana_loki_plugin_parse_response_duration_seconds_count Unknown endpoint, ins, instance, ip, status, job, cls N/A
grafana_loki_plugin_parse_response_duration_seconds_sum Unknown endpoint, ins, instance, ip, status, job, cls N/A
grafana_page_response_status_total Unknown ins, instance, ip, job, cls, code N/A
grafana_plugin_build_info gauge version, signature_status, ins, instance, plugin_type, ip, plugin_id, job, cls A metric with a constant ‘1’ value labeled by pluginId, pluginType and version from which Grafana plugin was built
grafana_plugin_request_duration_milliseconds_bucket Unknown endpoint, ins, instance, target, ip, le, plugin_id, job, cls N/A
grafana_plugin_request_duration_milliseconds_count Unknown endpoint, ins, instance, target, ip, plugin_id, job, cls N/A
grafana_plugin_request_duration_milliseconds_sum Unknown endpoint, ins, instance, target, ip, plugin_id, job, cls N/A
grafana_plugin_request_duration_seconds_bucket Unknown endpoint, ins, instance, target, ip, le, status, plugin_id, source, job, cls N/A
grafana_plugin_request_duration_seconds_count Unknown endpoint, ins, instance, target, ip, status, plugin_id, source, job, cls N/A
grafana_plugin_request_duration_seconds_sum Unknown endpoint, ins, instance, target, ip, status, plugin_id, source, job, cls N/A
grafana_plugin_request_size_bytes_bucket Unknown endpoint, ins, instance, target, ip, le, plugin_id, source, job, cls N/A
grafana_plugin_request_size_bytes_count Unknown endpoint, ins, instance, target, ip, plugin_id, source, job, cls N/A
grafana_plugin_request_size_bytes_sum Unknown endpoint, ins, instance, target, ip, plugin_id, source, job, cls N/A
grafana_plugin_request_total Unknown endpoint, ins, instance, target, ip, status, plugin_id, job, cls N/A
grafana_process_cpu_seconds_total Unknown ins, instance, ip, job, cls N/A
grafana_process_max_fds gauge ins, instance, ip, job, cls Maximum number of open file descriptors.
grafana_process_open_fds gauge ins, instance, ip, job, cls Number of open file descriptors.
grafana_process_resident_memory_bytes gauge ins, instance, ip, job, cls Resident memory size in bytes.
grafana_process_start_time_seconds gauge ins, instance, ip, job, cls Start time of the process since unix epoch in seconds.
grafana_process_virtual_memory_bytes gauge ins, instance, ip, job, cls Virtual memory size in bytes.
grafana_process_virtual_memory_max_bytes gauge ins, instance, ip, job, cls Maximum amount of virtual memory available in bytes.
grafana_prometheus_plugin_backend_request_count unknown endpoint, ins, instance, ip, status, errorSource, job, cls The total amount of prometheus backend plugin requests
grafana_proxy_response_status_total Unknown ins, instance, ip, job, cls, code N/A
grafana_public_dashboard_request_count unknown ins, instance, ip, job, cls counter for public dashboards requests
grafana_registered_metrics_total Unknown ins, instance, ip, stability_level, deprecated_version, job, cls N/A
grafana_rendering_queue_size gauge ins, instance, ip, job, cls size of rendering queue
grafana_search_dashboard_search_failures_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_search_dashboard_search_failures_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_search_dashboard_search_failures_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_search_dashboard_search_successes_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
grafana_search_dashboard_search_successes_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
grafana_search_dashboard_search_successes_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
grafana_stat_active_users gauge ins, instance, ip, job, cls number of active users
grafana_stat_total_orgs gauge ins, instance, ip, job, cls total amount of orgs
grafana_stat_total_playlists gauge ins, instance, ip, job, cls total amount of playlists
grafana_stat_total_service_account_tokens gauge ins, instance, ip, job, cls total amount of service account tokens
grafana_stat_total_service_accounts gauge ins, instance, ip, job, cls total amount of service accounts
grafana_stat_total_service_accounts_role_none gauge ins, instance, ip, job, cls total amount of service accounts with no role
grafana_stat_total_teams gauge ins, instance, ip, job, cls total amount of teams
grafana_stat_total_users gauge ins, instance, ip, job, cls total amount of users
grafana_stat_totals_active_admins gauge ins, instance, ip, job, cls total amount of active admins
grafana_stat_totals_active_editors gauge ins, instance, ip, job, cls total amount of active editors
grafana_stat_totals_active_viewers gauge ins, instance, ip, job, cls total amount of active viewers
grafana_stat_totals_admins gauge ins, instance, ip, job, cls total amount of admins
grafana_stat_totals_alert_rules gauge ins, instance, ip, job, cls total amount of alert rules in the database
grafana_stat_totals_annotations gauge ins, instance, ip, job, cls total amount of annotations in the database
grafana_stat_totals_correlations gauge ins, instance, ip, job, cls total amount of correlations
grafana_stat_totals_dashboard gauge ins, instance, ip, job, cls total amount of dashboards
grafana_stat_totals_dashboard_versions gauge ins, instance, ip, job, cls total amount of dashboard versions in the database
grafana_stat_totals_data_keys gauge ins, instance, ip, job, cls, active total amount of data keys in the database
grafana_stat_totals_datasource gauge ins, instance, ip, plugin_id, job, cls total number of defined datasources, labeled by pluginId
grafana_stat_totals_editors gauge ins, instance, ip, job, cls total amount of editors
grafana_stat_totals_folder gauge ins, instance, ip, job, cls total amount of folders
grafana_stat_totals_library_panels gauge ins, instance, ip, job, cls total amount of library panels in the database
grafana_stat_totals_library_variables gauge ins, instance, ip, job, cls total amount of library variables in the database
grafana_stat_totals_public_dashboard gauge ins, instance, ip, job, cls total amount of public dashboards
grafana_stat_totals_rule_groups gauge ins, instance, ip, job, cls total amount of alert rule groups in the database
grafana_stat_totals_viewers gauge ins, instance, ip, job, cls total amount of viewers
infra_up Unknown ins, instance, ip, job, cls N/A
jaeger_tracer_baggage_restrictions_updates_total Unknown result, ins, instance, ip, job, cls N/A
jaeger_tracer_baggage_truncations_total Unknown ins, instance, ip, job, cls N/A
jaeger_tracer_baggage_updates_total Unknown result, ins, instance, ip, job, cls N/A
jaeger_tracer_finished_spans_total Unknown ins, instance, ip, sampled, job, cls N/A
jaeger_tracer_reporter_queue_length gauge ins, instance, ip, job, cls Current number of spans in the reporter queue
jaeger_tracer_reporter_spans_total Unknown result, ins, instance, ip, job, cls N/A
jaeger_tracer_sampler_queries_total Unknown result, ins, instance, ip, job, cls N/A
jaeger_tracer_sampler_updates_total Unknown result, ins, instance, ip, job, cls N/A
jaeger_tracer_span_context_decoding_errors_total Unknown ins, instance, ip, job, cls N/A
jaeger_tracer_started_spans_total Unknown ins, instance, ip, sampled, job, cls N/A
jaeger_tracer_throttled_debug_spans_total Unknown ins, instance, ip, job, cls N/A
jaeger_tracer_throttler_updates_total Unknown result, ins, instance, ip, job, cls N/A
jaeger_tracer_traces_total Unknown ins, instance, ip, sampled, job, cls, state N/A
kv_request_duration_seconds_bucket Unknown ins, instance, role, ip, le, kv_name, type, operation, job, cls, status_code N/A
kv_request_duration_seconds_count Unknown ins, instance, role, ip, kv_name, type, operation, job, cls, status_code N/A
kv_request_duration_seconds_sum Unknown ins, instance, role, ip, kv_name, type, operation, job, cls, status_code N/A
legacy_grafana_alerting_ticker_interval_seconds gauge ins, instance, ip, job, cls Interval at which the ticker is meant to tick.
legacy_grafana_alerting_ticker_last_consumed_tick_timestamp_seconds gauge ins, instance, ip, job, cls Timestamp of the last consumed tick in seconds.
legacy_grafana_alerting_ticker_next_tick_timestamp_seconds gauge ins, instance, ip, job, cls Timestamp of the next tick in seconds before it is consumed.
logql_query_duration_seconds_bucket Unknown ins, instance, query_type, ip, le, job, cls N/A
logql_query_duration_seconds_count Unknown ins, instance, query_type, ip, job, cls N/A
logql_query_duration_seconds_sum Unknown ins, instance, query_type, ip, job, cls N/A
loki_azure_blob_egress_bytes_total Unknown ins, instance, ip, job, cls N/A
loki_boltdb_shipper_apply_retention_last_successful_run_timestamp_seconds gauge ins, instance, ip, job, cls Unix timestamp of the last successful retention run
loki_boltdb_shipper_compact_tables_operation_duration_seconds gauge ins, instance, ip, job, cls Time (in seconds) spent compacting all the tables
loki_boltdb_shipper_compact_tables_operation_last_successful_run_timestamp_seconds gauge ins, instance, ip, job, cls Unix timestamp of the last successful compaction run
loki_boltdb_shipper_compact_tables_operation_total Unknown ins, instance, ip, status, job, cls N/A
loki_boltdb_shipper_compactor_running gauge ins, instance, ip, job, cls Value will be 1 if compactor is currently running on this instance
loki_boltdb_shipper_open_existing_file_failures_total Unknown ins, instance, ip, component, job, cls N/A
loki_boltdb_shipper_query_time_table_download_duration_seconds unknown ins, instance, ip, component, job, cls, table Time (in seconds) spent downloading files per table at query time
loki_boltdb_shipper_request_duration_seconds_bucket Unknown ins, instance, ip, le, component, operation, job, cls, status_code N/A
loki_boltdb_shipper_request_duration_seconds_count Unknown ins, instance, ip, component, operation, job, cls, status_code N/A
loki_boltdb_shipper_request_duration_seconds_sum Unknown ins, instance, ip, component, operation, job, cls, status_code N/A
loki_boltdb_shipper_tables_download_operation_duration_seconds gauge ins, instance, ip, component, job, cls Time (in seconds) spent downloading updated files for all the tables
loki_boltdb_shipper_tables_sync_operation_total Unknown ins, instance, ip, status, component, job, cls N/A
loki_boltdb_shipper_tables_upload_operation_total Unknown ins, instance, ip, status, component, job, cls N/A
loki_build_info gauge revision, version, ins, instance, ip, tags, goarch, goversion, job, cls, branch, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which loki was built, and the goos and goarch for the build.
loki_bytes_per_line_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_bytes_per_line_count Unknown ins, instance, ip, job, cls N/A
loki_bytes_per_line_sum Unknown ins, instance, ip, job, cls N/A
loki_cache_corrupt_chunks_total Unknown ins, instance, ip, job, cls N/A
loki_cache_fetched_keys unknown ins, instance, ip, job, cls Total count of keys requested from cache.
loki_cache_hits unknown ins, instance, ip, job, cls Total count of keys found in cache.
loki_cache_request_duration_seconds_bucket Unknown ins, instance, method, ip, le, job, cls, status_code N/A
loki_cache_request_duration_seconds_count Unknown ins, instance, method, ip, job, cls, status_code N/A
loki_cache_request_duration_seconds_sum Unknown ins, instance, method, ip, job, cls, status_code N/A
loki_cache_value_size_bytes_bucket Unknown ins, instance, method, ip, le, job, cls N/A
loki_cache_value_size_bytes_count Unknown ins, instance, method, ip, job, cls N/A
loki_cache_value_size_bytes_sum Unknown ins, instance, method, ip, job, cls N/A
loki_chunk_fetcher_cache_dequeued_total Unknown ins, instance, ip, job, cls N/A
loki_chunk_fetcher_cache_enqueued_total Unknown ins, instance, ip, job, cls N/A
loki_chunk_fetcher_cache_skipped_buffer_full_total Unknown ins, instance, ip, job, cls N/A
loki_chunk_fetcher_fetched_size_bytes_bucket Unknown ins, instance, ip, le, source, job, cls N/A
loki_chunk_fetcher_fetched_size_bytes_count Unknown ins, instance, ip, source, job, cls N/A
loki_chunk_fetcher_fetched_size_bytes_sum Unknown ins, instance, ip, source, job, cls N/A
loki_chunk_store_chunks_per_query_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_chunk_store_chunks_per_query_count Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_chunks_per_query_sum Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_deduped_bytes_total Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_deduped_chunks_total Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_fetched_chunk_bytes_total Unknown ins, instance, ip, user, job, cls N/A
loki_chunk_store_fetched_chunks_total Unknown ins, instance, ip, user, job, cls N/A
loki_chunk_store_index_entries_per_chunk_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_chunk_store_index_entries_per_chunk_count Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_index_entries_per_chunk_sum Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_index_lookups_per_query_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_chunk_store_index_lookups_per_query_count Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_index_lookups_per_query_sum Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_series_post_intersection_per_query_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_chunk_store_series_post_intersection_per_query_count Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_series_post_intersection_per_query_sum Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_series_pre_intersection_per_query_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_chunk_store_series_pre_intersection_per_query_count Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_series_pre_intersection_per_query_sum Unknown ins, instance, ip, job, cls N/A
loki_chunk_store_stored_chunk_bytes_total Unknown ins, instance, ip, user, job, cls N/A
loki_chunk_store_stored_chunks_total Unknown ins, instance, ip, user, job, cls N/A
loki_consul_request_duration_seconds_bucket Unknown ins, instance, ip, le, kv_name, operation, job, cls, status_code N/A
loki_consul_request_duration_seconds_count Unknown ins, instance, ip, kv_name, operation, job, cls, status_code N/A
loki_consul_request_duration_seconds_sum Unknown ins, instance, ip, kv_name, operation, job, cls, status_code N/A
loki_delete_request_lookups_failed_total Unknown ins, instance, ip, job, cls N/A
loki_delete_request_lookups_total Unknown ins, instance, ip, job, cls N/A
loki_discarded_bytes_total Unknown ins, instance, ip, reason, job, cls, tenant N/A
loki_discarded_samples_total Unknown ins, instance, ip, reason, job, cls, tenant N/A
loki_distributor_bytes_received_total Unknown ins, instance, retention_hours, ip, job, cls, tenant N/A
loki_distributor_ingester_appends_total Unknown ins, instance, ip, ingester, job, cls N/A
loki_distributor_lines_received_total Unknown ins, instance, ip, job, cls, tenant N/A
loki_distributor_replication_factor gauge ins, instance, ip, job, cls The configured replication factor.
loki_distributor_structured_metadata_bytes_received_total Unknown ins, instance, retention_hours, ip, job, cls, tenant N/A
loki_experimental_features_in_use_total Unknown ins, instance, ip, job, cls N/A
loki_index_chunk_refs_total Unknown ins, instance, ip, status, job, cls N/A
loki_index_request_duration_seconds_bucket Unknown ins, instance, ip, le, component, operation, job, cls, status_code N/A
loki_index_request_duration_seconds_count Unknown ins, instance, ip, component, operation, job, cls, status_code N/A
loki_index_request_duration_seconds_sum Unknown ins, instance, ip, component, operation, job, cls, status_code N/A
loki_inflight_requests gauge ins, instance, method, ip, route, job, cls Current number of inflight requests.
loki_ingester_autoforget_unhealthy_ingesters_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_blocks_per_chunk_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_blocks_per_chunk_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_blocks_per_chunk_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_checkpoint_creations_failed_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_checkpoint_creations_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_checkpoint_deletions_failed_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_checkpoint_deletions_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_checkpoint_duration_seconds summary ins, instance, ip, job, cls, quantile Time taken to create a checkpoint.
loki_ingester_checkpoint_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_checkpoint_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_checkpoint_logged_bytes_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_age_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_chunk_age_seconds_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_age_seconds_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_bounds_hours_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_chunk_bounds_hours_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_bounds_hours_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_compression_ratio_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_chunk_compression_ratio_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_compression_ratio_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_encode_time_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_chunk_encode_time_seconds_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_encode_time_seconds_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_entries_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_chunk_entries_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_entries_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_size_bytes_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_chunk_size_bytes_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_size_bytes_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_stored_bytes_total Unknown ins, instance, ip, job, cls, tenant N/A
loki_ingester_chunk_utilization_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_chunk_utilization_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunk_utilization_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunks_created_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_chunks_flushed_total Unknown ins, instance, ip, reason, job, cls N/A
loki_ingester_chunks_stored_total Unknown ins, instance, ip, job, cls, tenant N/A
loki_ingester_client_request_duration_seconds_bucket Unknown ins, instance, ip, le, operation, job, cls, status_code N/A
loki_ingester_client_request_duration_seconds_count Unknown ins, instance, ip, operation, job, cls, status_code N/A
loki_ingester_client_request_duration_seconds_sum Unknown ins, instance, ip, operation, job, cls, status_code N/A
loki_ingester_limiter_enabled gauge ins, instance, ip, job, cls Whether the ingester’s limiter is enabled
loki_ingester_memory_chunks gauge ins, instance, ip, job, cls The total number of chunks in memory.
loki_ingester_memory_streams gauge ins, instance, ip, job, cls, tenant The total number of streams in memory per tenant.
loki_ingester_memory_streams_labels_bytes gauge ins, instance, ip, job, cls Total bytes of labels of the streams in memory.
loki_ingester_received_chunks unknown ins, instance, ip, job, cls The total number of chunks received by this ingester whilst joining.
loki_ingester_samples_per_chunk_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_ingester_samples_per_chunk_count Unknown ins, instance, ip, job, cls N/A
loki_ingester_samples_per_chunk_sum Unknown ins, instance, ip, job, cls N/A
loki_ingester_sent_chunks unknown ins, instance, ip, job, cls The total number of chunks sent by this ingester whilst leaving.
loki_ingester_shutdown_marker gauge ins, instance, ip, job, cls 1 if prepare shutdown has been called, 0 otherwise
loki_ingester_streams_created_total Unknown ins, instance, ip, job, cls, tenant N/A
loki_ingester_streams_removed_total Unknown ins, instance, ip, job, cls, tenant N/A
loki_ingester_wal_bytes_in_use gauge ins, instance, ip, job, cls Total number of bytes in use by the WAL recovery process.
loki_ingester_wal_disk_full_failures_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_duplicate_entries_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_logged_bytes_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_records_logged_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_recovered_bytes_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_recovered_chunks_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_recovered_entries_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_recovered_streams_total Unknown ins, instance, ip, job, cls N/A
loki_ingester_wal_replay_active gauge ins, instance, ip, job, cls Whether the WAL is replaying
loki_ingester_wal_replay_duration_seconds gauge ins, instance, ip, job, cls Time taken to replay the checkpoint and the WAL.
loki_ingester_wal_replay_flushing gauge ins, instance, ip, job, cls Whether the wal replay is in a flushing phase due to backpressure
loki_internal_log_messages_total Unknown ins, instance, ip, level, job, cls N/A
loki_kv_request_duration_seconds_bucket Unknown ins, instance, role, ip, le, kv_name, type, operation, job, cls, status_code N/A
loki_kv_request_duration_seconds_count Unknown ins, instance, role, ip, kv_name, type, operation, job, cls, status_code N/A
loki_kv_request_duration_seconds_sum Unknown ins, instance, role, ip, kv_name, type, operation, job, cls, status_code N/A
loki_log_flushes_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_log_flushes_count Unknown ins, instance, ip, job, cls N/A
loki_log_flushes_sum Unknown ins, instance, ip, job, cls N/A
loki_log_messages_total Unknown ins, instance, ip, level, job, cls N/A
loki_logql_querystats_bytes_processed_per_seconds_bucket Unknown ins, instance, range, ip, le, sharded, type, job, cls, status_code, latency_type N/A
loki_logql_querystats_bytes_processed_per_seconds_count Unknown ins, instance, range, ip, sharded, type, job, cls, status_code, latency_type N/A
loki_logql_querystats_bytes_processed_per_seconds_sum Unknown ins, instance, range, ip, sharded, type, job, cls, status_code, latency_type N/A
loki_logql_querystats_chunk_download_latency_seconds_bucket Unknown ins, instance, range, ip, le, type, job, cls, status_code N/A
loki_logql_querystats_chunk_download_latency_seconds_count Unknown ins, instance, range, ip, type, job, cls, status_code N/A
loki_logql_querystats_chunk_download_latency_seconds_sum Unknown ins, instance, range, ip, type, job, cls, status_code N/A
loki_logql_querystats_downloaded_chunk_total Unknown ins, instance, range, ip, type, job, cls, status_code N/A
loki_logql_querystats_duplicates_total Unknown ins, instance, ip, job, cls N/A
loki_logql_querystats_ingester_sent_lines_total Unknown ins, instance, ip, job, cls N/A
loki_logql_querystats_latency_seconds_bucket Unknown ins, instance, range, ip, le, type, job, cls, status_code N/A
loki_logql_querystats_latency_seconds_count Unknown ins, instance, range, ip, type, job, cls, status_code N/A
loki_logql_querystats_latency_seconds_sum Unknown ins, instance, range, ip, type, job, cls, status_code N/A
loki_panic_total Unknown ins, instance, ip, job, cls N/A
loki_querier_index_cache_corruptions_total Unknown ins, instance, ip, job, cls N/A
loki_querier_index_cache_encode_errors_total Unknown ins, instance, ip, job, cls N/A
loki_querier_index_cache_gets_total Unknown ins, instance, ip, job, cls N/A
loki_querier_index_cache_hits_total Unknown ins, instance, ip, job, cls N/A
loki_querier_index_cache_puts_total Unknown ins, instance, ip, job, cls N/A
loki_querier_query_frontend_clients gauge ins, instance, ip, job, cls The current number of clients connected to query-frontend.
loki_querier_query_frontend_request_duration_seconds_bucket Unknown ins, instance, ip, le, operation, job, cls, status_code N/A
loki_querier_query_frontend_request_duration_seconds_count Unknown ins, instance, ip, operation, job, cls, status_code N/A
loki_querier_query_frontend_request_duration_seconds_sum Unknown ins, instance, ip, operation, job, cls, status_code N/A
loki_querier_tail_active gauge ins, instance, ip, job, cls Number of active tailers
loki_querier_tail_active_streams gauge ins, instance, ip, job, cls Number of active streams being tailed
loki_querier_tail_bytes_total Unknown ins, instance, ip, job, cls N/A
loki_querier_worker_concurrency gauge ins, instance, ip, job, cls Number of concurrent querier workers
loki_querier_worker_inflight_queries gauge ins, instance, ip, job, cls Number of queries being processed by the querier workers
loki_query_frontend_log_result_cache_hit_total Unknown ins, instance, ip, job, cls N/A
loki_query_frontend_log_result_cache_miss_total Unknown ins, instance, ip, job, cls N/A
loki_query_frontend_partitions_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_query_frontend_partitions_count Unknown ins, instance, ip, job, cls N/A
loki_query_frontend_partitions_sum Unknown ins, instance, ip, job, cls N/A
loki_query_frontend_shard_factor_bucket Unknown ins, instance, ip, le, mapper, job, cls N/A
loki_query_frontend_shard_factor_count Unknown ins, instance, ip, mapper, job, cls N/A
loki_query_frontend_shard_factor_sum Unknown ins, instance, ip, mapper, job, cls N/A
loki_query_scheduler_enqueue_count Unknown ins, instance, ip, level, user, job, cls N/A
loki_rate_store_expired_streams_total Unknown ins, instance, ip, job, cls N/A
loki_rate_store_max_stream_rate_bytes gauge ins, instance, ip, job, cls The maximum stream rate for any stream reported by ingesters during a sync operation. Sharded Streams are combined.
loki_rate_store_max_stream_shards gauge ins, instance, ip, job, cls The number of shards for a single stream reported by ingesters during a sync operation.
loki_rate_store_max_unique_stream_rate_bytes gauge ins, instance, ip, job, cls The maximum stream rate for any stream reported by ingesters during a sync operation. Sharded Streams are considered separate.
loki_rate_store_stream_rate_bytes_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_rate_store_stream_rate_bytes_count Unknown ins, instance, ip, job, cls N/A
loki_rate_store_stream_rate_bytes_sum Unknown ins, instance, ip, job, cls N/A
loki_rate_store_stream_shards_bucket Unknown ins, instance, ip, le, job, cls N/A
loki_rate_store_stream_shards_count Unknown ins, instance, ip, job, cls N/A
loki_rate_store_stream_shards_sum Unknown ins, instance, ip, job, cls N/A
loki_rate_store_streams gauge ins, instance, ip, job, cls The number of unique streams reported by all ingesters. Sharded streams are combined
loki_request_duration_seconds_bucket Unknown ins, instance, method, ip, le, ws, route, job, cls, status_code N/A
loki_request_duration_seconds_count Unknown ins, instance, method, ip, ws, route, job, cls, status_code N/A
loki_request_duration_seconds_sum Unknown ins, instance, method, ip, ws, route, job, cls, status_code N/A
loki_request_message_bytes_bucket Unknown ins, instance, method, ip, le, route, job, cls N/A
loki_request_message_bytes_count Unknown ins, instance, method, ip, route, job, cls N/A
loki_request_message_bytes_sum Unknown ins, instance, method, ip, route, job, cls N/A
loki_response_message_bytes_bucket Unknown ins, instance, method, ip, le, route, job, cls N/A
loki_response_message_bytes_count Unknown ins, instance, method, ip, route, job, cls N/A
loki_response_message_bytes_sum Unknown ins, instance, method, ip, route, job, cls N/A
loki_results_cache_version_comparisons_total Unknown ins, instance, ip, job, cls N/A
loki_store_chunks_downloaded_total Unknown ins, instance, ip, status, job, cls N/A
loki_store_chunks_per_batch_bucket Unknown ins, instance, ip, le, status, job, cls N/A
loki_store_chunks_per_batch_count Unknown ins, instance, ip, status, job, cls N/A
loki_store_chunks_per_batch_sum Unknown ins, instance, ip, status, job, cls N/A
loki_store_series_total Unknown ins, instance, ip, status, job, cls N/A
loki_stream_sharding_count unknown ins, instance, ip, job, cls Total number of times the distributor has sharded streams
loki_tcp_connections gauge ins, instance, ip, protocol, job, cls Current number of accepted TCP connections.
loki_tcp_connections_limit gauge ins, instance, ip, protocol, job, cls The max number of TCP connections that can be accepted (0 means no limit).
net_conntrack_dialer_conn_attempted_total counter ins, instance, ip, dialer_name, job, cls Total number of connections attempted by the given dialer a given name.
net_conntrack_dialer_conn_closed_total counter ins, instance, ip, dialer_name, job, cls Total number of connections closed which originated from the dialer of a given name.
net_conntrack_dialer_conn_established_total counter ins, instance, ip, dialer_name, job, cls Total number of connections successfully established by the given dialer a given name.
net_conntrack_dialer_conn_failed_total counter ins, instance, ip, dialer_name, reason, job, cls Total number of connections failed to dial by the dialer a given name.
net_conntrack_listener_conn_accepted_total counter ins, instance, ip, listener_name, job, cls Total number of connections opened to the listener of a given name.
net_conntrack_listener_conn_closed_total counter ins, instance, ip, listener_name, job, cls Total number of connections closed that were made to the listener of a given name.
nginx_connections_accepted counter ins, instance, ip, job, cls Accepted client connections
nginx_connections_active gauge ins, instance, ip, job, cls Active client connections
nginx_connections_handled counter ins, instance, ip, job, cls Handled client connections
nginx_connections_reading gauge ins, instance, ip, job, cls Connections where NGINX is reading the request header
nginx_connections_waiting gauge ins, instance, ip, job, cls Idle client connections
nginx_connections_writing gauge ins, instance, ip, job, cls Connections where NGINX is writing the response back to the client
nginx_exporter_build_info gauge revision, version, ins, instance, ip, tags, goarch, goversion, job, cls, branch, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which nginx_exporter was built, and the goos and goarch for the build.
nginx_http_requests_total counter ins, instance, ip, job, cls Total http requests
nginx_up gauge ins, instance, ip, job, cls Status of the last metric scrape
plugins_active_instances gauge ins, instance, ip, job, cls The number of active plugin instances
plugins_datasource_instances_total Unknown ins, instance, ip, job, cls N/A
process_cpu_seconds_total counter ins, instance, ip, job, cls Total user and system CPU time spent in seconds.
process_max_fds gauge ins, instance, ip, job, cls Maximum number of open file descriptors.
process_open_fds gauge ins, instance, ip, job, cls Number of open file descriptors.
process_resident_memory_bytes gauge ins, instance, ip, job, cls Resident memory size in bytes.
process_start_time_seconds gauge ins, instance, ip, job, cls Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge ins, instance, ip, job, cls Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge ins, instance, ip, job, cls Maximum amount of virtual memory available in bytes.
prometheus_api_remote_read_queries gauge ins, instance, ip, job, cls The current number of remote read queries being executed or waiting.
prometheus_build_info gauge revision, version, ins, instance, ip, tags, goarch, goversion, job, cls, branch, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which prometheus was built, and the goos and goarch for the build.
prometheus_config_last_reload_success_timestamp_seconds gauge ins, instance, ip, job, cls Timestamp of the last successful configuration reload.
prometheus_config_last_reload_successful gauge ins, instance, ip, job, cls Whether the last configuration reload attempt was successful.
prometheus_engine_queries gauge ins, instance, ip, job, cls The current number of queries being executed or waiting.
prometheus_engine_queries_concurrent_max gauge ins, instance, ip, job, cls The max number of concurrent queries.
prometheus_engine_query_duration_seconds summary ins, instance, ip, job, cls, quantile, slice Query timings
prometheus_engine_query_duration_seconds_count Unknown ins, instance, ip, job, cls, slice N/A
prometheus_engine_query_duration_seconds_sum Unknown ins, instance, ip, job, cls, slice N/A
prometheus_engine_query_log_enabled gauge ins, instance, ip, job, cls State of the query log.
prometheus_engine_query_log_failures_total counter ins, instance, ip, job, cls The number of query log failures.
prometheus_engine_query_samples_total counter ins, instance, ip, job, cls The total number of samples loaded by all queries.
prometheus_http_request_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls, handler N/A
prometheus_http_request_duration_seconds_count Unknown ins, instance, ip, job, cls, handler N/A
prometheus_http_request_duration_seconds_sum Unknown ins, instance, ip, job, cls, handler N/A
prometheus_http_requests_total counter ins, instance, ip, job, cls, code, handler Counter of HTTP requests.
prometheus_http_response_size_bytes_bucket Unknown ins, instance, ip, le, job, cls, handler N/A
prometheus_http_response_size_bytes_count Unknown ins, instance, ip, job, cls, handler N/A
prometheus_http_response_size_bytes_sum Unknown ins, instance, ip, job, cls, handler N/A
prometheus_notifications_alertmanagers_discovered gauge ins, instance, ip, job, cls The number of alertmanagers discovered and active.
prometheus_notifications_dropped_total counter ins, instance, ip, job, cls Total number of alerts dropped due to errors when sending to Alertmanager.
prometheus_notifications_errors_total counter ins, instance, ip, alertmanager, job, cls Total number of errors sending alert notifications.
prometheus_notifications_latency_seconds summary ins, instance, ip, alertmanager, job, cls, quantile Latency quantiles for sending alert notifications.
prometheus_notifications_latency_seconds_count Unknown ins, instance, ip, alertmanager, job, cls N/A
prometheus_notifications_latency_seconds_sum Unknown ins, instance, ip, alertmanager, job, cls N/A
prometheus_notifications_queue_capacity gauge ins, instance, ip, job, cls The capacity of the alert notifications queue.
prometheus_notifications_queue_length gauge ins, instance, ip, job, cls The number of alert notifications in the queue.
prometheus_notifications_sent_total counter ins, instance, ip, alertmanager, job, cls Total number of alerts sent.
prometheus_ready gauge ins, instance, ip, job, cls Whether Prometheus startup was fully completed and the server is ready for normal operation.
prometheus_remote_storage_exemplars_in_total counter ins, instance, ip, job, cls Exemplars in to remote storage, compare to exemplars out for queue managers.
prometheus_remote_storage_highest_timestamp_in_seconds gauge ins, instance, ip, job, cls Highest timestamp that has come into the remote storage via the Appender interface, in seconds since epoch.
prometheus_remote_storage_histograms_in_total counter ins, instance, ip, job, cls HistogramSamples in to remote storage, compare to histograms out for queue managers.
prometheus_remote_storage_samples_in_total counter ins, instance, ip, job, cls Samples in to remote storage, compare to samples out for queue managers.
prometheus_remote_storage_string_interner_zero_reference_releases_total counter ins, instance, ip, job, cls The number of times release has been called for strings that are not interned.
prometheus_rule_evaluation_duration_seconds summary ins, instance, ip, job, cls, quantile The duration for a rule to execute.
prometheus_rule_evaluation_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_rule_evaluation_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_rule_evaluation_failures_total counter ins, instance, ip, job, cls, rule_group The total number of rule evaluation failures.
prometheus_rule_evaluations_total counter ins, instance, ip, job, cls, rule_group The total number of rule evaluations.
prometheus_rule_group_duration_seconds summary ins, instance, ip, job, cls, quantile The duration of rule group evaluations.
prometheus_rule_group_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_rule_group_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_rule_group_interval_seconds gauge ins, instance, ip, job, cls, rule_group The interval of a rule group.
prometheus_rule_group_iterations_missed_total counter ins, instance, ip, job, cls, rule_group The total number of rule group evaluations missed due to slow rule group evaluation.
prometheus_rule_group_iterations_total counter ins, instance, ip, job, cls, rule_group The total number of scheduled rule group evaluations, whether executed or missed.
prometheus_rule_group_last_duration_seconds gauge ins, instance, ip, job, cls, rule_group The duration of the last rule group evaluation.
prometheus_rule_group_last_evaluation_samples gauge ins, instance, ip, job, cls, rule_group The number of samples returned during the last rule group evaluation.
prometheus_rule_group_last_evaluation_timestamp_seconds gauge ins, instance, ip, job, cls, rule_group The timestamp of the last rule group evaluation in seconds.
prometheus_rule_group_rules gauge ins, instance, ip, job, cls, rule_group The number of rules.
prometheus_sd_azure_cache_hit_total counter ins, instance, ip, job, cls Number of cache hit during refresh.
prometheus_sd_azure_failures_total counter ins, instance, ip, job, cls Number of Azure service discovery refresh failures.
prometheus_sd_consul_rpc_duration_seconds summary endpoint, ins, instance, ip, job, cls, call, quantile The duration of a Consul RPC call in seconds.
prometheus_sd_consul_rpc_duration_seconds_count Unknown endpoint, ins, instance, ip, job, cls, call N/A
prometheus_sd_consul_rpc_duration_seconds_sum Unknown endpoint, ins, instance, ip, job, cls, call N/A
prometheus_sd_consul_rpc_failures_total counter ins, instance, ip, job, cls The number of Consul RPC call failures.
prometheus_sd_discovered_targets gauge ins, instance, ip, config, job, cls Current number of discovered targets.
prometheus_sd_dns_lookup_failures_total counter ins, instance, ip, job, cls The number of DNS-SD lookup failures.
prometheus_sd_dns_lookups_total counter ins, instance, ip, job, cls The number of DNS-SD lookups.
prometheus_sd_failed_configs gauge ins, instance, ip, job, cls Current number of service discovery configurations that failed to load.
prometheus_sd_file_mtime_seconds gauge ins, instance, ip, filename, job, cls Timestamp (mtime) of files read by FileSD. Timestamp is set at read time.
prometheus_sd_file_read_errors_total counter ins, instance, ip, job, cls The number of File-SD read errors.
prometheus_sd_file_scan_duration_seconds summary ins, instance, ip, job, cls, quantile The duration of the File-SD scan in seconds.
prometheus_sd_file_scan_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_sd_file_scan_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_sd_file_watcher_errors_total counter ins, instance, ip, job, cls The number of File-SD errors caused by filesystem watch failures.
prometheus_sd_http_failures_total counter ins, instance, ip, job, cls Number of HTTP service discovery refresh failures.
prometheus_sd_kubernetes_events_total counter event, ins, instance, role, ip, job, cls The number of Kubernetes events handled.
prometheus_sd_kuma_fetch_duration_seconds summary ins, instance, ip, job, cls, quantile The duration of a Kuma MADS fetch call.
prometheus_sd_kuma_fetch_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_sd_kuma_fetch_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_sd_kuma_fetch_failures_total counter ins, instance, ip, job, cls The number of Kuma MADS fetch call failures.
prometheus_sd_kuma_fetch_skipped_updates_total counter ins, instance, ip, job, cls The number of Kuma MADS fetch calls that result in no updates to the targets.
prometheus_sd_linode_failures_total counter ins, instance, ip, job, cls Number of Linode service discovery refresh failures.
prometheus_sd_nomad_failures_total counter ins, instance, ip, job, cls Number of nomad service discovery refresh failures.
prometheus_sd_received_updates_total counter ins, instance, ip, job, cls Total number of update events received from the SD providers.
prometheus_sd_updates_total counter ins, instance, ip, job, cls Total number of update events sent to the SD consumers.
prometheus_target_interval_length_seconds summary ins, instance, interval, ip, job, cls, quantile Actual intervals between scrapes.
prometheus_target_interval_length_seconds_count Unknown ins, instance, interval, ip, job, cls N/A
prometheus_target_interval_length_seconds_sum Unknown ins, instance, interval, ip, job, cls N/A
prometheus_target_metadata_cache_bytes gauge ins, instance, ip, scrape_job, job, cls The number of bytes that are currently used for storing metric metadata in the cache
prometheus_target_metadata_cache_entries gauge ins, instance, ip, scrape_job, job, cls Total number of metric metadata entries in the cache
prometheus_target_scrape_pool_exceeded_label_limits_total counter ins, instance, ip, job, cls Total number of times scrape pools hit the label limits, during sync or config reload.
prometheus_target_scrape_pool_exceeded_target_limit_total counter ins, instance, ip, job, cls Total number of times scrape pools hit the target limit, during sync or config reload.
prometheus_target_scrape_pool_reloads_failed_total counter ins, instance, ip, job, cls Total number of failed scrape pool reloads.
prometheus_target_scrape_pool_reloads_total counter ins, instance, ip, job, cls Total number of scrape pool reloads.
prometheus_target_scrape_pool_sync_total counter ins, instance, ip, scrape_job, job, cls Total number of syncs that were executed on a scrape pool.
prometheus_target_scrape_pool_target_limit gauge ins, instance, ip, scrape_job, job, cls Maximum number of targets allowed in this scrape pool.
prometheus_target_scrape_pool_targets gauge ins, instance, ip, scrape_job, job, cls Current number of targets in this scrape pool.
prometheus_target_scrape_pools_failed_total counter ins, instance, ip, job, cls Total number of scrape pool creations that failed.
prometheus_target_scrape_pools_total counter ins, instance, ip, job, cls Total number of scrape pool creation attempts.
prometheus_target_scrapes_cache_flush_forced_total counter ins, instance, ip, job, cls How many times a scrape cache was flushed due to getting big while scrapes are failing.
prometheus_target_scrapes_exceeded_body_size_limit_total counter ins, instance, ip, job, cls Total number of scrapes that hit the body size limit
prometheus_target_scrapes_exceeded_native_histogram_bucket_limit_total counter ins, instance, ip, job, cls Total number of scrapes that hit the native histogram bucket limit and were rejected.
prometheus_target_scrapes_exceeded_sample_limit_total counter ins, instance, ip, job, cls Total number of scrapes that hit the sample limit and were rejected.
prometheus_target_scrapes_exemplar_out_of_order_total counter ins, instance, ip, job, cls Total number of exemplar rejected due to not being out of the expected order.
prometheus_target_scrapes_sample_duplicate_timestamp_total counter ins, instance, ip, job, cls Total number of samples rejected due to duplicate timestamps but different values.
prometheus_target_scrapes_sample_out_of_bounds_total counter ins, instance, ip, job, cls Total number of samples rejected due to timestamp falling outside of the time bounds.
prometheus_target_scrapes_sample_out_of_order_total counter ins, instance, ip, job, cls Total number of samples rejected due to not being out of the expected order.
prometheus_target_sync_failed_total counter ins, instance, ip, scrape_job, job, cls Total number of target sync failures.
prometheus_target_sync_length_seconds summary ins, instance, ip, scrape_job, job, cls, quantile Actual interval to sync the scrape pool.
prometheus_target_sync_length_seconds_count Unknown ins, instance, ip, scrape_job, job, cls N/A
prometheus_target_sync_length_seconds_sum Unknown ins, instance, ip, scrape_job, job, cls N/A
prometheus_template_text_expansion_failures_total counter ins, instance, ip, job, cls The total number of template text expansion failures.
prometheus_template_text_expansions_total counter ins, instance, ip, job, cls The total number of template text expansions.
prometheus_treecache_watcher_goroutines gauge ins, instance, ip, job, cls The current number of watcher goroutines.
prometheus_treecache_zookeeper_failures_total counter ins, instance, ip, job, cls The total number of ZooKeeper failures.
prometheus_tsdb_blocks_loaded gauge ins, instance, ip, job, cls Number of currently loaded data blocks
prometheus_tsdb_checkpoint_creations_failed_total counter ins, instance, ip, job, cls Total number of checkpoint creations that failed.
prometheus_tsdb_checkpoint_creations_total counter ins, instance, ip, job, cls Total number of checkpoint creations attempted.
prometheus_tsdb_checkpoint_deletions_failed_total counter ins, instance, ip, job, cls Total number of checkpoint deletions that failed.
prometheus_tsdb_checkpoint_deletions_total counter ins, instance, ip, job, cls Total number of checkpoint deletions attempted.
prometheus_tsdb_clean_start gauge ins, instance, ip, job, cls -1: lockfile is disabled. 0: a lockfile from a previous execution was replaced. 1: lockfile creation was clean
prometheus_tsdb_compaction_chunk_range_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
prometheus_tsdb_compaction_chunk_range_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_chunk_range_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_chunk_samples_bucket Unknown ins, instance, ip, le, job, cls N/A
prometheus_tsdb_compaction_chunk_samples_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_chunk_samples_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_chunk_size_bytes_bucket Unknown ins, instance, ip, le, job, cls N/A
prometheus_tsdb_compaction_chunk_size_bytes_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_chunk_size_bytes_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_duration_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
prometheus_tsdb_compaction_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_compaction_populating_block gauge ins, instance, ip, job, cls Set to 1 when a block is currently being written to the disk.
prometheus_tsdb_compactions_failed_total counter ins, instance, ip, job, cls Total number of compactions that failed for the partition.
prometheus_tsdb_compactions_skipped_total counter ins, instance, ip, job, cls Total number of skipped compactions due to disabled auto compaction.
prometheus_tsdb_compactions_total counter ins, instance, ip, job, cls Total number of compactions that were executed for the partition.
prometheus_tsdb_compactions_triggered_total counter ins, instance, ip, job, cls Total number of triggered compactions for the partition.
prometheus_tsdb_data_replay_duration_seconds gauge ins, instance, ip, job, cls Time taken to replay the data on disk.
prometheus_tsdb_exemplar_exemplars_appended_total counter ins, instance, ip, job, cls Total number of appended exemplars.
prometheus_tsdb_exemplar_exemplars_in_storage gauge ins, instance, ip, job, cls Number of exemplars currently in circular storage.
prometheus_tsdb_exemplar_last_exemplars_timestamp_seconds gauge ins, instance, ip, job, cls The timestamp of the oldest exemplar stored in circular storage. Useful to check for what timerange the current exemplar buffer limit allows. This usually means the last timestampfor all exemplars for a typical setup. This is not true though if one of the series timestamp is in future compared to rest series.
prometheus_tsdb_exemplar_max_exemplars gauge ins, instance, ip, job, cls Total number of exemplars the exemplar storage can store, resizeable.
prometheus_tsdb_exemplar_out_of_order_exemplars_total counter ins, instance, ip, job, cls Total number of out of order exemplar ingestion failed attempts.
prometheus_tsdb_exemplar_series_with_exemplars_in_storage gauge ins, instance, ip, job, cls Number of series with exemplars currently in circular storage.
prometheus_tsdb_head_active_appenders gauge ins, instance, ip, job, cls Number of currently active appender transactions
prometheus_tsdb_head_chunks gauge ins, instance, ip, job, cls Total number of chunks in the head block.
prometheus_tsdb_head_chunks_created_total counter ins, instance, ip, job, cls Total number of chunks created in the head
prometheus_tsdb_head_chunks_removed_total counter ins, instance, ip, job, cls Total number of chunks removed in the head
prometheus_tsdb_head_chunks_storage_size_bytes gauge ins, instance, ip, job, cls Size of the chunks_head directory.
prometheus_tsdb_head_gc_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_head_gc_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_head_max_time gauge ins, instance, ip, job, cls Maximum timestamp of the head block. The unit is decided by the library consumer.
prometheus_tsdb_head_max_time_seconds gauge ins, instance, ip, job, cls Maximum timestamp of the head block.
prometheus_tsdb_head_min_time gauge ins, instance, ip, job, cls Minimum time bound of the head block. The unit is decided by the library consumer.
prometheus_tsdb_head_min_time_seconds gauge ins, instance, ip, job, cls Minimum time bound of the head block.
prometheus_tsdb_head_out_of_order_samples_appended_total counter ins, instance, ip, job, cls Total number of appended out of order samples.
prometheus_tsdb_head_samples_appended_total counter ins, instance, ip, type, job, cls Total number of appended samples.
prometheus_tsdb_head_series gauge ins, instance, ip, job, cls Total number of series in the head block.
prometheus_tsdb_head_series_created_total counter ins, instance, ip, job, cls Total number of series created in the head
prometheus_tsdb_head_series_not_found_total counter ins, instance, ip, job, cls Total number of requests for series that were not found.
prometheus_tsdb_head_series_removed_total counter ins, instance, ip, job, cls Total number of series removed in the head
prometheus_tsdb_head_truncations_failed_total counter ins, instance, ip, job, cls Total number of head truncations that failed.
prometheus_tsdb_head_truncations_total counter ins, instance, ip, job, cls Total number of head truncations attempted.
prometheus_tsdb_isolation_high_watermark gauge ins, instance, ip, job, cls The highest TSDB append ID that has been given out.
prometheus_tsdb_isolation_low_watermark gauge ins, instance, ip, job, cls The lowest TSDB append ID that is still referenced.
prometheus_tsdb_lowest_timestamp gauge ins, instance, ip, job, cls Lowest timestamp value stored in the database. The unit is decided by the library consumer.
prometheus_tsdb_lowest_timestamp_seconds gauge ins, instance, ip, job, cls Lowest timestamp value stored in the database.
prometheus_tsdb_mmap_chunk_corruptions_total counter ins, instance, ip, job, cls Total number of memory-mapped chunk corruptions.
prometheus_tsdb_mmap_chunks_total counter ins, instance, ip, job, cls Total number of chunks that were memory-mapped.
prometheus_tsdb_out_of_bound_samples_total counter ins, instance, ip, type, job, cls Total number of out of bound samples ingestion failed attempts with out of order support disabled.
prometheus_tsdb_out_of_order_samples_total counter ins, instance, ip, type, job, cls Total number of out of order samples ingestion failed attempts due to out of order being disabled.
prometheus_tsdb_reloads_failures_total counter ins, instance, ip, job, cls Number of times the database failed to reloadBlocks block data from disk.
prometheus_tsdb_reloads_total counter ins, instance, ip, job, cls Number of times the database reloaded block data from disk.
prometheus_tsdb_retention_limit_bytes gauge ins, instance, ip, job, cls Max number of bytes to be retained in the tsdb blocks, configured 0 means disabled
prometheus_tsdb_retention_limit_seconds gauge ins, instance, ip, job, cls How long to retain samples in storage.
prometheus_tsdb_size_retentions_total counter ins, instance, ip, job, cls The number of times that blocks were deleted because the maximum number of bytes was exceeded.
prometheus_tsdb_snapshot_replay_error_total counter ins, instance, ip, job, cls Total number snapshot replays that failed.
prometheus_tsdb_storage_blocks_bytes gauge ins, instance, ip, job, cls The number of bytes that are currently used for local storage by all blocks.
prometheus_tsdb_symbol_table_size_bytes gauge ins, instance, ip, job, cls Size of symbol table in memory for loaded blocks
prometheus_tsdb_time_retentions_total counter ins, instance, ip, job, cls The number of times that blocks were deleted because the maximum time limit was exceeded.
prometheus_tsdb_tombstone_cleanup_seconds_bucket Unknown ins, instance, ip, le, job, cls N/A
prometheus_tsdb_tombstone_cleanup_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_tombstone_cleanup_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_too_old_samples_total counter ins, instance, ip, type, job, cls Total number of out of order samples ingestion failed attempts with out of support enabled, but sample outside of time window.
prometheus_tsdb_vertical_compactions_total counter ins, instance, ip, job, cls Total number of compactions done on overlapping blocks.
prometheus_tsdb_wal_completed_pages_total counter ins, instance, ip, job, cls Total number of completed pages.
prometheus_tsdb_wal_corruptions_total counter ins, instance, ip, job, cls Total number of WAL corruptions.
prometheus_tsdb_wal_fsync_duration_seconds summary ins, instance, ip, job, cls, quantile Duration of write log fsync.
prometheus_tsdb_wal_fsync_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_wal_fsync_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_wal_page_flushes_total counter ins, instance, ip, job, cls Total number of page flushes.
prometheus_tsdb_wal_segment_current gauge ins, instance, ip, job, cls Write log segment index that TSDB is currently writing to.
prometheus_tsdb_wal_storage_size_bytes gauge ins, instance, ip, job, cls Size of the write log directory.
prometheus_tsdb_wal_truncate_duration_seconds_count Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_wal_truncate_duration_seconds_sum Unknown ins, instance, ip, job, cls N/A
prometheus_tsdb_wal_truncations_failed_total counter ins, instance, ip, job, cls Total number of write log truncations that failed.
prometheus_tsdb_wal_truncations_total counter ins, instance, ip, job, cls Total number of write log truncations attempted.
prometheus_tsdb_wal_writes_failed_total counter ins, instance, ip, job, cls Total number of write log writes that failed.
prometheus_web_federation_errors_total counter ins, instance, ip, job, cls Total number of errors that occurred while sending federation responses.
prometheus_web_federation_warnings_total counter ins, instance, ip, job, cls Total number of warnings that occurred while sending federation responses.
promhttp_metric_handler_requests_in_flight gauge ins, instance, ip, job, cls Current number of scrapes being served.
promhttp_metric_handler_requests_total counter ins, instance, ip, job, cls, code Total number of scrapes by HTTP status code.
pushgateway_build_info gauge revision, version, ins, instance, ip, tags, goarch, goversion, job, cls, branch, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which pushgateway was built, and the goos and goarch for the build.
pushgateway_http_requests_total counter ins, instance, method, ip, job, cls, code, handler Total HTTP requests processed by the Pushgateway, excluding scrapes.
querier_cache_added_new_total Unknown ins, instance, ip, job, cache, cls N/A
querier_cache_added_total Unknown ins, instance, ip, job, cache, cls N/A
querier_cache_entries gauge ins, instance, ip, job, cache, cls The total number of entries
querier_cache_evicted_total Unknown ins, instance, ip, job, reason, cache, cls N/A
querier_cache_gets_total Unknown ins, instance, ip, job, cache, cls N/A
querier_cache_memory_bytes gauge ins, instance, ip, job, cache, cls The current cache size in bytes
querier_cache_misses_total Unknown ins, instance, ip, job, cache, cls N/A
querier_cache_stale_gets_total Unknown ins, instance, ip, job, cache, cls N/A
ring_member_heartbeats_total Unknown ins, instance, ip, job, cls N/A
ring_member_tokens_owned gauge ins, instance, ip, job, cls The number of tokens owned in the ring.
ring_member_tokens_to_own gauge ins, instance, ip, job, cls The number of tokens to own in the ring.
scrape_duration_seconds Unknown ins, instance, ip, job, cls N/A
scrape_samples_post_metric_relabeling Unknown ins, instance, ip, job, cls N/A
scrape_samples_scraped Unknown ins, instance, ip, job, cls N/A
scrape_series_added Unknown ins, instance, ip, job, cls N/A
up Unknown ins, instance, ip, job, cls N/A
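
As a usage sketch (the 5-minute window and label grouping are illustrative, not prescriptive), histogram metrics such as loki_request_duration_seconds_bucket can be aggregated with PromQL to derive latency percentiles:

histogram_quantile(0.99, sum by (le, route) (rate(loki_request_duration_seconds_bucket[5m])))  # p99 request latency per route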

PING Metrics

The PING job provides 54 types of available monitoring metrics, collected by blackbox_exporter.

Metric Name Type Labels Description
agent_up Unknown ins, ip, job, instance, cls N/A
probe_dns_lookup_time_seconds gauge ins, ip, job, instance, cls Returns the time taken for probe dns lookup in seconds
probe_duration_seconds gauge ins, ip, job, instance, cls Returns how long the probe took to complete in seconds
probe_icmp_duration_seconds gauge ins, ip, job, phase, instance, cls Duration of icmp request by phase
probe_icmp_reply_hop_limit gauge ins, ip, job, instance, cls Replied packet hop limit (TTL for ipv4)
probe_ip_addr_hash gauge ins, ip, job, instance, cls Specifies the hash of IP address. It’s useful to detect if the IP address changes.
probe_ip_protocol gauge ins, ip, job, instance, cls Specifies whether probe ip protocol is IP4 or IP6
probe_success gauge ins, ip, job, instance, cls Displays whether or not the probe was a success
scrape_duration_seconds Unknown ins, ip, job, instance, cls N/A
scrape_samples_post_metric_relabeling Unknown ins, ip, job, instance, cls N/A
scrape_samples_scraped Unknown ins, ip, job, instance, cls N/A
scrape_series_added Unknown ins, ip, job, instance, cls N/A
up Unknown ins, ip, job, instance, cls N/A
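
For illustration only, the probe metrics above can be used directly in PromQL; a hypothetical reachability check and an averaged probe latency could look like:

probe_success == 0                          # target failed the last ICMP probe
avg_over_time(probe_duration_seconds[5m])   # average probe duration over 5 minutes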

PUSH Metrics

PushGateway provides 44 types of monitoring metrics.

Metric Name Type Labels Description
agent_up Unknown job, cls, instance, ins, ip N/A
go_gc_duration_seconds summary job, cls, instance, ins, quantile, ip A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown job, cls, instance, ins, ip N/A
go_gc_duration_seconds_sum Unknown job, cls, instance, ins, ip N/A
go_goroutines gauge job, cls, instance, ins, ip Number of goroutines that currently exist.
go_info gauge job, cls, instance, ins, ip, version Information about the Go environment.
go_memstats_alloc_bytes counter job, cls, instance, ins, ip Total number of bytes allocated, even if freed.
go_memstats_alloc_bytes_total counter job, cls, instance, ins, ip Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge job, cls, instance, ins, ip Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter job, cls, instance, ins, ip Total number of frees.
go_memstats_gc_sys_bytes gauge job, cls, instance, ins, ip Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge job, cls, instance, ins, ip Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge job, cls, instance, ins, ip Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge job, cls, instance, ins, ip Number of heap bytes that are in use.
go_memstats_heap_objects gauge job, cls, instance, ins, ip Number of allocated objects.
go_memstats_heap_released_bytes gauge job, cls, instance, ins, ip Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge job, cls, instance, ins, ip Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge job, cls, instance, ins, ip Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter job, cls, instance, ins, ip Total number of pointer lookups.
go_memstats_mallocs_total counter job, cls, instance, ins, ip Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge job, cls, instance, ins, ip Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge job, cls, instance, ins, ip Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge job, cls, instance, ins, ip Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge job, cls, instance, ins, ip Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge job, cls, instance, ins, ip Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge job, cls, instance, ins, ip Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge job, cls, instance, ins, ip Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge job, cls, instance, ins, ip Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge job, cls, instance, ins, ip Number of bytes obtained from system.
go_threads gauge job, cls, instance, ins, ip Number of OS threads created.
process_cpu_seconds_total counter job, cls, instance, ins, ip Total user and system CPU time spent in seconds.
process_max_fds gauge job, cls, instance, ins, ip Maximum number of open file descriptors.
process_open_fds gauge job, cls, instance, ins, ip Number of open file descriptors.
process_resident_memory_bytes gauge job, cls, instance, ins, ip Resident memory size in bytes.
process_start_time_seconds gauge job, cls, instance, ins, ip Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge job, cls, instance, ins, ip Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge job, cls, instance, ins, ip Maximum amount of virtual memory available in bytes.
pushgateway_build_info gauge job, goversion, cls, branch, instance, tags, revision, goarch, ins, ip, version, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which pushgateway was built, and the goos and goarch for the build.
pushgateway_http_requests_total counter job, cls, method, code, handler, instance, ins, ip Total HTTP requests processed by the Pushgateway, excluding scrapes.
scrape_duration_seconds Unknown job, cls, instance, ins, ip N/A
scrape_samples_post_metric_relabeling Unknown job, cls, instance, ins, ip N/A
scrape_samples_scraped Unknown job, cls, instance, ins, ip N/A
scrape_series_added Unknown job, cls, instance, ins, ip N/A
up Unknown job, cls, instance, ins, ip N/A
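
As a minimal sketch, assuming the PushGateway listens on port 9091 of the infra node (10.10.10.10 here) and using a hypothetical metric and job name, a one-off job can push its metrics like this:

echo "backup_job_duration_seconds 42" | curl --data-binary @- http://10.10.10.10:9091/metrics/job/backup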

6.2 - FAQ

Pigsty INFRA module frequently asked questions

Which components are included in INFRA

  • Ansible for automation, deployment, and administration;
  • Nginx for exposing any WebUI service and serving the yum/apt repo;
  • Self-Signed CA for SSL/TLS certificates;
  • Prometheus for collecting monitoring metrics;
  • Grafana for monitoring dashboards & visualization;
  • Loki for log collection;
  • AlertManager for alert aggregation;
  • Chronyd for NTP time sync on the admin node;
  • DNSMasq for DNS registration and resolution;
  • ETCD as DCS for PGSQL HA (dedicated module);
  • PostgreSQL on meta nodes as CMDB (optional);
  • Docker for stateless applications & tools (optional)

How to restore Prometheus targets

If you accidentally deleted the Prometheus targets dir, you can re-register monitoring targets to Prometheus with the following commands:

./infra.yml -t register_prometheus  # register all infra targets to prometheus on infra nodes
./node.yml  -t register_prometheus  # register all node  targets to prometheus on infra nodes
./etcd.yml  -t register_prometheus  # register all etcd targets to prometheus on infra nodes
./minio.yml -t register_prometheus  # register all minio targets to prometheus on infra nodes
./pgsql.yml -t register_prometheus  # register all pgsql targets to prometheus on infra nodes
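
To verify the result, you can inspect the file-based service discovery targets or ask Prometheus itself; the targets directory path below is the assumed default location:

ls /etc/prometheus/targets/                      # file-sd target definitions per module
curl -s http://10.10.10.10:9090/api/v1/targets   # query active targets via the Prometheus API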

How to restore Grafana datasource

PGSQL databases defined in pg_databases are registered as Grafana datasources by default.

If you accidentally deleted the registered postgres datasources in Grafana, you can register them again with:

./pgsql.yml -t register_grafana  # register all pgsql database (in pg_databases) as grafana datasource
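
A minimal sketch of the per-database switch: the register_datasource flag in the pg_databases definition toggles this behavior (database names below are illustrative):

pg_databases:
  - { name: meta    , register_datasource: true  }   # registered as a Grafana datasource (default behavior)
  - { name: scratch , register_datasource: false }   # excluded from Grafana datasource registration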

How to restore the HAProxy admin page proxy

The haproxy admin page is proxied by Nginx under the default server.

If you accidentally deleted the registered haproxy proxy settings in /etc/nginx/conf.d/haproxy, you can restore them with:

./node.yml -t register_nginx     # register all haproxy admin page proxy settings to nginx on infra nodes
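
To verify the restored configuration, you can check the regenerated files and reload Nginx (assuming the directory layout mentioned above):

ls /etc/nginx/conf.d/haproxy/      # regenerated haproxy admin page proxy definitions
nginx -t && nginx -s reload        # validate and reload the Nginx configuration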

How to restore the DNS registration

PGSQL cluster/instance domain names are registered to /etc/hosts.d/<name> on infra nodes by default.

You can restore them again with the following:

./pgsql.yml -t pg_dns   # register pg DNS names to dnsmasq on infra nodes
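
To verify, inspect the generated record file and resolve the name against dnsmasq on the infra node; the cluster name pg-meta and IP 10.10.10.10 below are illustrative:

cat /etc/hosts.d/pg-meta           # generated DNS records for the cluster
dig pg-meta @10.10.10.10 +short    # resolve via dnsmasq on the infra node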

How to expose new Nginx upstream service

If you wish to expose a new WebUI service via the Nginx portal, you can add the service definition to the infra_portal parameter.

Then re-run ./infra.yml -t nginx_config,nginx_launch to update and apply the Nginx configuration.

If you wish to access it via HTTPS, you must remove files/pki/csr/pigsty.csr and files/pki/nginx/pigsty.{key,crt} to force regeneration of the Nginx SSL/TLS certificate so that it includes the new upstream's domain name.


How to expose web service through Nginx?

While you can directly access services via IP:Port, we still recommend consolidating access points by using domain names and uniformly accessing various web-based services through the Nginx portal. This approach helps centralize access, reduce the number of exposed ports, and facilitate access control and auditing.

If you wish to expose a new WebUI service through the Nginx portal, you can add the service definition to the infra_portal parameter. For example, here is the config used by the public demo site, which exposes several additional web services:

infra_portal:
  home         : { domain: home.pigsty.cc }
  grafana      : { domain: demo.pigsty.cc ,endpoint: "${admin_ip}:3000" ,websocket: true }
  prometheus   : { domain: p.pigsty.cc ,endpoint: "${admin_ip}:9090" }
  alertmanager : { domain: a.pigsty.cc ,endpoint: "${admin_ip}:9093" }
  blackbox     : { endpoint: "${admin_ip}:9115" }
  loki         : { endpoint: "${admin_ip}:3100" }
  # newly added web portals
  minio        : { domain: sss.pigsty  ,endpoint: "${admin_ip}:9001" ,scheme: https ,websocket: true }
  postgrest    : { domain: api.pigsty.cc  ,endpoint: "127.0.0.1:8884"   }
  pgadmin      : { domain: adm.pigsty.cc  ,endpoint: "127.0.0.1:8885"   }
  pgweb        : { domain: cli.pigsty.cc  ,endpoint: "127.0.0.1:8886"   }
  bytebase     : { domain: ddl.pigsty.cc  ,endpoint: "127.0.0.1:8887"   }
  gitea        : { domain: git.pigsty.cc  ,endpoint: "127.0.0.1:8889"   }
  wiki         : { domain: wiki.pigsty.cc ,endpoint: "127.0.0.1:9002"   }
  noco         : { domain: noco.pigsty.cc ,endpoint: "127.0.0.1:9003"   }
  supa         : { domain: supa.pigsty.cc ,endpoint: "127.0.0.1:8000", websocket: true }

After completing the Nginx upstream service definition, use the following commands to register the new service with Nginx.

./infra.yml -t nginx_config           # regenerate Nginx config files
./infra.yml -t nginx_launch           # update and apply the Nginx configuration

# You can also reload the Nginx configuration manually with Ansible
ansible infra -b -a 'nginx -s reload'  # reload Nginx configuration

If you wish to access via HTTPS, you must delete files/pki/csr/pigsty.csr and files/pki/nginx/pigsty.{key,crt} to force regeneration of the Nginx SSL/TLS certificate so that it includes the new upstream domain names. If you prefer an SSL certificate issued by a trusted certificate authority instead of one signed by Pigsty’s self-signed CA, place it in the /etc/nginx/conf.d/cert/ directory and modify the corresponding configuration file: /etc/nginx/conf.d/<name>.conf.
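
For reference, a server block using such a purchased certificate might look like the following minimal sketch (domain, file names, and upstream endpoint are illustrative; the directives themselves are standard Nginx):

server {
    listen       443 ssl;
    server_name  api.pigsty.cc;                                      # domain defined in infra_portal
    ssl_certificate      /etc/nginx/conf.d/cert/api.pigsty.cc.crt;   # certificate from your CA
    ssl_certificate_key  /etc/nginx/conf.d/cert/api.pigsty.cc.key;   # matching private key
    location / {
        proxy_pass http://127.0.0.1:8884;                            # the upstream endpoint
    }
}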


How to manually add upstream repo files

Pigsty has a built-in wrapper script, bin/repo-add, which invokes the ansible playbook node.yml to add repo files to the corresponding nodes.

bin/repo-add <selector> [modules]
bin/repo-add 10.10.10.10           # add node repos for node 10.10.10.10
bin/repo-add infra   node,infra    # add node and infra repos for group infra
bin/repo-add infra   node,local    # add node repos and local pigsty repo
bin/repo-add pg-test node,pgsql    # add node & pgsql repos for group pg-test

7 - Module: NODE

Tune nodes into the desired state and monitor them: manage nodes, VIP, haproxy, and exporters.

Configuration | Administration | Playbook | Dashboard | Parameter


Concept

Node is an abstraction of hardware resources, which can be bare metal, virtual machines, or even k8s pods.

There are several different types of nodes in Pigsty, described below: common nodes, the admin node, infra nodes, and PGSQL nodes.

The admin node usually overlaps with an infra node. If there is more than one infra node, the first one is typically used as the default admin node, and the remaining infra nodes can serve as backup admin nodes.

Common Node

You can manage nodes with Pigsty and install modules on them. The node.yml playbook will adjust the node to the desired state.

Some services will be added to all nodes by default:

| Component | Port | Description | Status |
|-----------|------|-------------|--------|
| Node Exporter | 9100 | Node Monitoring Metrics Exporter | Enabled |
| HAProxy Admin | 9101 | HAProxy admin page | Enabled |
| Promtail | 9080 | Log collecting agent | Enabled |
| Docker Daemon | 9323 | Enable Container Service | Disabled |
| Keepalived | - | Manage Node Cluster L2 VIP | Disabled |
| Keepalived Exporter | 9650 | Monitoring Keepalived Status | Disabled |

Docker & Keepalived are optional components, enabled when required.
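
Once a node is initialized, you can sanity-check these default services from the admin node; a minimal example, assuming a managed node at 10.10.10.11:

curl -s http://10.10.10.11:9100/metrics | head -n 3    # node_exporter metrics endpoint
curl -s http://10.10.10.11:9080/metrics | head -n 3    # promtail self-metrics endpoint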

ADMIN Node

There is one and only one admin node in a pigsty deployment, which is specified by admin_ip. It is set to the local primary IP during configure.

The admin node will have ssh / sudo access to all other nodes, which is security-critical, so make sure it is fully secured.
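
A quick way to confirm the admin node has the required passwordless access (the node IP below is an assumption):

ssh 10.10.10.11 'sudo -n whoami'   # should print 'root' without prompting for any password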

INFRA Node

A pigsty deployment may have one or more infra nodes; large production environments usually have 2 or 3.

The infra group in the inventory specifies the infra nodes, which will have the INFRA module installed (DNS, Nginx, Prometheus, Grafana, etc.).

The admin node is also the first and default infra node, and the other infra nodes can serve as backup admin nodes.
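
A minimal inventory sketch for a two-node infra group (IPs are placeholders; infra_seq is the infra identity parameter):

infra:
  hosts:
    10.10.10.10: { infra_seq: 1 }   # default admin node
    10.10.10.11: { infra_seq: 2 }   # additional infra / backup admin node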

| Component | Port | Domain | Description |
|-----------|------|--------|-------------|
| Nginx | 80 | h.pigsty | Web Service Portal (YUM/APT Repo) |
| AlertManager | 9093 | a.pigsty | Alert Aggregation and delivery |
| Prometheus | 9090 | p.pigsty | Monitoring Time Series Database |
| Grafana | 3000 | g.pigsty | Visualization Platform |
| Loki | 3100 | - | Logging Collection Server |
| PushGateway | 9091 | - | Collect One-Time Job Metrics |
| BlackboxExporter | 9115 | - | Blackbox Probing |
| Dnsmasq | 53 | - | DNS Server |
| Chronyd | 123 | - | NTP Time Server |
| PostgreSQL | 5432 | - | Pigsty CMDB & default database |
| Ansible | - | - | Run playbooks |

PGSQL Node

A node with the PGSQL module installed is called a PGSQL node. Nodes and PG instances are deployed 1:1, and the node identity can be borrowed from the corresponding PG instance with node_id_from_pg.
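
For example, with node_id_from_pg enabled (the default), a PGSQL cluster defined like the sketch below labels its node as pg-test-1 / pg-test instead of using the raw hostname:

pg-test:
  hosts:
    10.10.10.11: { pg_seq: 1, pg_role: primary }
  vars:
    pg_cluster: pg-test    # nodename becomes pg-test-1, node_cluster becomes pg-test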

| Component | Port | Description | Status |
|-----------|------|-------------|--------|
| Postgres | 5432 | Pigsty CMDB | Enabled |
| Pgbouncer | 6432 | Pgbouncer Connection Pooling Service | Enabled |
| Patroni | 8008 | Patroni HA Component | Enabled |
| Haproxy Primary | 5433 | Primary connection pool: Read/Write Service | Enabled |
| Haproxy Replica | 5434 | Replica connection pool: Read-only Service | Enabled |
| Haproxy Default | 5436 | Primary Direct Connect Service | Enabled |
| Haproxy Offline | 5438 | Offline Direct Connect: Offline Read Service | Enabled |
| Haproxy service | 543x | Customized PostgreSQL Services | On Demand |
| Haproxy Admin | 9101 | Monitoring metrics and traffic management | Enabled |
| PG Exporter | 9630 | PG Monitoring Metrics Exporter | Enabled |
| PGBouncer Exporter | 9631 | PGBouncer Monitoring Metrics Exporter | Enabled |
| Node Exporter | 9100 | Node Monitoring Metrics Exporter | Enabled |
| Promtail | 9080 | Collect Postgres, Pgbouncer, Patroni logs | Enabled |
| Docker Daemon | 9323 | Docker Container Service (disabled by default) | Disabled |
| vip-manager | - | Bind VIP to the primary | Disabled |
| keepalived | - | Node Cluster L2 VIP manager (disabled by default) | Disabled |
| Keepalived Exporter | 9650 | Keepalived Metrics Exporter (disabled by default) | Disabled |

Configuration

Each node has identity parameters that are configured through the parameters in <cluster>.hosts and <cluster>.vars.

Pigsty uses the IP address as the unique identifier of a database node. This must be the IP address that the database instance listens on and serves externally; however, using a public IP address would be inappropriate.

This is very important. The IP is the inventory_hostname of the host in the inventory, which is reflected as the key in the <cluster>.hosts object.

You can use ansible_* parameters to override ssh behavior, e.g. connecting via a domain name or alias, but the primary IPv4 address remains the core identity of the node.
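
For instance, you can keep the IP as the identity key while connecting through an alias (the hostname below is an assumption):

10.10.10.11: { nodename: node-test-1, ansible_host: node-1.internal }   # ssh via the alias, identity remains 10.10.10.11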

nodename and node_cluster are not mandatory; nodename will use the node’s current hostname by default, while node_cluster will use the fixed default value: nodes.

If node_id_from_pg is enabled, the node will borrow the PGSQL identity and use it as the node’s identity, i.e. node_cluster is set to pg_cluster if applicable, and nodename is set to ${pg_cluster}-${pg_seq}. If nodename_overwrite is enabled, the node’s hostname will be overwritten by nodename.

Pigsty labels a node with identity parameters in the monitoring system. Which maps nodename to ins, and node_cluster into cls.

| Name | Type | Level | Necessity | Comment |
|------|------|-------|-----------|---------|
| inventory_hostname | ip | - | Required | Node IP |
| nodename | string | I | Optional | Node Name |
| node_cluster | string | C | Optional | Node cluster name |

The following cluster config declares a three-node node cluster:

node-test:
  hosts:
    10.10.10.11: { nodename: node-test-1 }
    10.10.10.12: { nodename: node-test-2 }
    10.10.10.13: { nodename: node-test-3 }
  vars:
    node_cluster: node-test
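
With the cluster above, metrics from these nodes carry the corresponding labels; for example, the following query (node_load1 is a standard node_exporter metric) selects a single instance:

node_load1{cls="node-test", ins="node-test-1"}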

Default values:

#nodename:           # [INSTANCE] # node instance identity, use hostname if missing, optional
node_cluster: nodes   # [CLUSTER] # node cluster identity, use 'nodes' if missing, optional
nodename_overwrite: true          # overwrite node's hostname with nodename?
nodename_exchange: false          # exchange nodename among play hosts?
node_id_from_pg: true             # use postgres identity as node identity if applicable?

Administration

Here are some common administration tasks for NODE module.


Add Node

To add a node into Pigsty, you need passwordless (nopass) ssh/sudo access to that node:

# ./node.yml -l <cls|ip|group>        # the underlying playbook
# bin/node-add <selector|ip...>       # add cluster/node to pigsty
bin/node-add node-test                # init node cluster 'node-test'
bin/node-add 10.10.10.10              # init node '10.10.10.10'

Remove Node

To remove a node from Pigsty, you can use the following:

# ./node-rm.yml -l <cls|ip|group>    # the underlying playbook
# bin/node-rm <selector|ip...>       # remove node from pigsty:
bin/node-rm node-test                # remove node cluster 'node-test'
bin/node-rm 10.10.10.10              # remove node '10.10.10.10'

Create Admin

If the current user does not have nopass ssh/sudo access to the node, you can use another admin user to bootstrap the node:

node.yml -t node_admin -k -K -e ansible_user=<another admin>   # input ssh/sudo password for another admin 

Bind VIP

You can bind an optional L2 VIP on a node cluster with vip_enabled.

proxy:
  hosts:
    10.10.10.29: { nodename: proxy-1 } 
    10.10.10.30: { nodename: proxy-2 } # , vip_role: master }
  vars:
    node_cluster: proxy
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.99
    vip_interface: eth1

./node.yml -l proxy -t node_vip     # enable for the first time
./node.yml -l proxy -t vip_refresh  # refresh vip config (e.g. designated master)
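
After enabling the VIP, you can verify it is bound on the current master node (addresses and interface as in the example above):

ip addr show eth1 | grep 10.10.10.99   # run on the master: the VIP should appear on the interface
ping -c 2 10.10.10.99                  # the VIP should be reachable within the same VLAN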

Other Tasks

# Play
./node.yml -t node                            # init node itself (haproxy & monitor not included)
./node.yml -t haproxy                         # setup haproxy on node to expose services
./node.yml -t monitor                         # setup node_exporter & promtail for metrics & logs
./node.yml -t node_vip                        # enable keepalived for node cluster L2 VIP
./node.yml -t vip_config,vip_reload           # refresh L2 VIP configuration
./node.yml -t haproxy_config,haproxy_reload   # refresh haproxy services definition on node cluster
./node.yml -t register_prometheus             # register node to Prometheus
./node.yml -t register_nginx                  # register haproxy admin page url to Nginx on infra nodes

# Task
./node.yml -t node-id        # generate node identity
./node.yml -t node_name      # setup hostname
./node.yml -t node_hosts     # setup /etc/hosts records
./node.yml -t node_resolv    # setup dns resolver
./node.yml -t node_firewall  # setup firewall & selinux
./node.yml -t node_ca        # add & trust ca certificate
./node.yml -t node_repo      # add upstream repo
./node.yml -t node_pkg       # install yum packages
./node.yml -t node_feature   # setup numa, grub, static network
./node.yml -t node_kernel    # enable kernel modules
./node.yml -t node_tune      # setup tuned profile
./node.yml -t node_sysctl    # setup additional sysctl parameters
./node.yml -t node_profile   # write /etc/profile.d/node.sh
./node.yml -t node_ulimit    # setup resource limits
./node.yml -t node_data      # setup main data dir
./node.yml -t node_admin     # setup admin user and ssh key
./node.yml -t node_timezone  # setup timezone
./node.yml -t node_ntp       # setup ntp server/clients
./node.yml -t node_crontab   # add/overwrite crontab tasks
./node.yml -t node_vip       # setup optional l2 vrrp vip for node cluster

Playbook

There are two node playbooks node.yml and node-rm.yml

node.yml

The playbook node.yml will init nodes for Pigsty.

Subtasks of this playbook:

# node-id       : generate node identity
# node_name     : setup hostname
# node_hosts    : setup /etc/hosts records
# node_resolv   : setup dns resolver
# node_firewall : setup firewall & selinux
# node_ca       : add & trust ca certificate
# node_repo     : add upstream repo
# node_pkg      : install yum packages
# node_feature  : setup numa, grub, static network
# node_kernel   : enable kernel modules
# node_tune     : setup tuned profile
# node_sysctl   : setup additional sysctl parameters
# node_profile  : write /etc/profile.d/node.sh
# node_ulimit   : setup resource limits
# node_data     : setup main data dir
# node_admin    : setup admin user and ssh key
# node_timezone : setup timezone
# node_ntp      : setup ntp server/clients
# node_crontab  : add/overwrite crontab tasks
# node_vip      : setup optional l2 vrrp vip for node cluster
#   - vip_install
#   - vip_config
#   - vip_launch
#   - vip_reload
# haproxy       : setup haproxy on node to expose services
#   - haproxy_install
#   - haproxy_config
#   - haproxy_launch
#   - haproxy_reload
# monitor       : setup node_exporter & promtail for metrics & logs
#   - haproxy_register
#   - vip_dns
#   - node_exporter
#     - node_exporter_config
#     - node_exporter_launch
#   - vip_exporter
#     - vip_exporter_config
#     - vip_exporter_launch
#   - node_register
#   - promtail
#     - promtail_clean
#     - promtail_config
#     - promtail_install
#     - promtail_launch


node-rm.yml

The playbook node-rm.yml will remove nodes from Pigsty.

Subtasks of this playbook:

# register       : remove register from prometheus & nginx
#   - prometheus : remove registered prometheus monitor target
#   - nginx      : remove nginx proxy record for haproxy admin
# vip            : remove node keepalived if enabled
# haproxy        : remove haproxy load balancer
# node_exporter  : remove monitoring exporter
# vip_exporter   : remove keepalived_exporter if enabled
# promtail       : remove loki log agent
# profile        : remove /etc/profile.d/node.sh
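
These subtasks can also be run selectively with -t if you only want to strip part of the stack; a hedged example using the tags listed above:

./node-rm.yml -l 10.10.10.10 -t register   # only remove the prometheus & nginx registrations for this node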

Dashboard

There are 6 dashboards for the NODE module:

NODE Overview: overview of all nodes

NODE Cluster: detailed information about one dedicated node cluster

NODE Instance: detailed information about a single node instance

NODE Alert: overview of key metrics of all node clusters/instances

NODE VIP: detailed information about an L2 VIP on a node cluster

NODE Haproxy: detailed information about haproxy on a node instance


Parameter

There are 11 sections and 66 parameters for the NODE module.

Parameters
Parameter Section Type Level Comment
nodename NODE_ID string I node instance identity, use hostname if missing, optional
node_cluster NODE_ID string C node cluster identity, use ’nodes’ if missing, optional
nodename_overwrite NODE_ID bool C overwrite node’s hostname with nodename?
nodename_exchange NODE_ID bool C exchange nodename among play hosts?
node_id_from_pg NODE_ID bool C use postgres identity as node identity if applicable?
node_write_etc_hosts NODE_DNS bool G/C/I modify /etc/hosts on target node?
node_default_etc_hosts NODE_DNS string[] G static dns records in /etc/hosts
node_etc_hosts NODE_DNS string[] C extra static dns records in /etc/hosts
node_dns_method NODE_DNS enum C how to handle dns servers: add,none,overwrite
node_dns_servers NODE_DNS string[] C dynamic nameserver in /etc/resolv.conf
node_dns_options NODE_DNS string[] C dns resolv options in /etc/resolv.conf
node_repo_modules NODE_PACKAGE enum C/A how to setup node repo: none,local,public,both
node_repo_remove NODE_PACKAGE bool C/A remove existing repo on node?
node_packages NODE_PACKAGE string[] C packages to be installed on current nodes
node_default_packages NODE_PACKAGE string[] G default packages to be installed on all nodes
node_disable_firewall NODE_TUNE bool C disable node firewall? true by default
node_disable_selinux NODE_TUNE bool C disable node selinux? true by default
node_disable_numa NODE_TUNE bool C disable node numa, reboot required
node_disable_swap NODE_TUNE bool C disable node swap, use with caution
node_static_network NODE_TUNE bool C preserve dns resolver settings after reboot
node_disk_prefetch NODE_TUNE bool C setup disk prefetch on HDD to increase performance
node_kernel_modules NODE_TUNE string[] C kernel modules to be enabled on this node
node_hugepage_count NODE_TUNE int C number of 2MB hugepage, take precedence over ratio
node_hugepage_ratio NODE_TUNE float C node mem hugepage ratio, 0 disable it by default
node_overcommit_ratio NODE_TUNE int C node mem overcommit ratio (50-100), 0 disable it by default
node_tune NODE_TUNE enum C node tuned profile: none,oltp,olap,crit,tiny
node_sysctl_params NODE_TUNE dict C sysctl parameters in k:v format in addition to tuned
node_data NODE_ADMIN path C node main data directory, /data by default
node_admin_enabled NODE_ADMIN bool C create an admin user on target node?
node_admin_uid NODE_ADMIN int C uid and gid for node admin user
node_admin_username NODE_ADMIN username C name of node admin user, dba by default
node_admin_ssh_exchange NODE_ADMIN bool C exchange admin ssh key among node cluster
node_admin_pk_current NODE_ADMIN bool C add current user’s ssh pk to admin authorized_keys
node_admin_pk_list NODE_ADMIN string[] C ssh public keys to be added to admin user
node_timezone NODE_TIME string C setup node timezone, empty string to skip
node_ntp_enabled NODE_TIME bool C enable chronyd time sync service?
node_ntp_servers NODE_TIME string[] C ntp servers in /etc/chrony.conf
node_crontab_overwrite NODE_TIME bool C overwrite or append to /etc/crontab?
node_crontab NODE_TIME string[] C crontab entries in /etc/crontab
vip_enabled NODE_VIP bool C enable vip on this node cluster?
vip_address NODE_VIP ip C node vip address in ipv4 format, required if vip is enabled
vip_vrid NODE_VIP int C required, integer, 1-254, should be unique among same VLAN
vip_role NODE_VIP enum I optional, master/backup, backup by default, use as init role
vip_preempt NODE_VIP bool C/I optional, true/false, false by default, enable vip preemption
vip_interface NODE_VIP string C/I node vip network interface to listen, eth0 by default
vip_dns_suffix NODE_VIP string C node vip dns name suffix, empty string by default
vip_exporter_port NODE_VIP port C keepalived exporter listen port, 9650 by default
haproxy_enabled HAPROXY bool C enable haproxy on this node?
haproxy_clean HAPROXY bool G/C/A cleanup all existing haproxy config?
haproxy_reload HAPROXY bool A reload haproxy after config?
haproxy_auth_enabled HAPROXY bool G enable authentication for haproxy admin page
haproxy_admin_username HAPROXY username G haproxy admin username, admin by default
haproxy_admin_password HAPROXY password G haproxy admin password, pigsty by default
haproxy_exporter_port HAPROXY port C haproxy admin/exporter port, 9101 by default
haproxy_client_timeout HAPROXY interval C client side connection timeout, 24h by default
haproxy_server_timeout HAPROXY interval C server side connection timeout, 24h by default
haproxy_services HAPROXY service[] C list of haproxy service to be exposed on node
node_exporter_enabled NODE_EXPORTER bool C setup node_exporter on this node?
node_exporter_port NODE_EXPORTER port C node exporter listen port, 9100 by default
node_exporter_options NODE_EXPORTER arg C extra server options for node_exporter
promtail_enabled PROMTAIL bool C enable promtail logging collector?
promtail_clean PROMTAIL bool G/A purge existing promtail status file during init?
promtail_port PROMTAIL port C promtail listen port, 9080 by default
promtail_positions PROMTAIL path C promtail position status file path

7.1 - Metrics

Pigsty NODE module metric list

The NODE module has 747 available metrics.

Metric Name Type Labels Description
ALERTS Unknown alertname, ip, level, severity, ins, job, alertstate, category, instance, cls N/A
ALERTS_FOR_STATE Unknown alertname, ip, level, severity, ins, job, category, instance, cls N/A
deprecated_flags_inuse_total Unknown instance, ins, job, ip, cls N/A
go_gc_duration_seconds summary quantile, instance, ins, job, ip, cls A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown instance, ins, job, ip, cls N/A
go_gc_duration_seconds_sum Unknown instance, ins, job, ip, cls N/A
go_goroutines gauge instance, ins, job, ip, cls Number of goroutines that currently exist.
go_info gauge version, instance, ins, job, ip, cls Information about the Go environment.
go_memstats_alloc_bytes gauge instance, ins, job, ip, cls Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total counter instance, ins, job, ip, cls Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge instance, ins, job, ip, cls Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter instance, ins, job, ip, cls Total number of frees.
go_memstats_gc_sys_bytes gauge instance, ins, job, ip, cls Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge instance, ins, job, ip, cls Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge instance, ins, job, ip, cls Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge instance, ins, job, ip, cls Number of heap bytes that are in use.
go_memstats_heap_objects gauge instance, ins, job, ip, cls Number of allocated objects.
go_memstats_heap_released_bytes gauge instance, ins, job, ip, cls Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge instance, ins, job, ip, cls Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge instance, ins, job, ip, cls Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter instance, ins, job, ip, cls Total number of pointer lookups.
go_memstats_mallocs_total counter instance, ins, job, ip, cls Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge instance, ins, job, ip, cls Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge instance, ins, job, ip, cls Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge instance, ins, job, ip, cls Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge instance, ins, job, ip, cls Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge instance, ins, job, ip, cls Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge instance, ins, job, ip, cls Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge instance, ins, job, ip, cls Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge instance, ins, job, ip, cls Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge instance, ins, job, ip, cls Number of bytes obtained from system.
go_threads gauge instance, ins, job, ip, cls Number of OS threads created.
haproxy:cls:usage Unknown job, cls N/A
haproxy:ins:uptime Unknown instance, ins, job, ip, cls N/A
haproxy:ins:usage Unknown instance, ins, job, ip, cls N/A
haproxy_backend_active_servers gauge proxy, instance, ins, job, ip, cls Total number of active UP servers with a non-zero weight
haproxy_backend_agg_check_status gauge state, proxy, instance, ins, job, ip, cls Backend’s aggregated gauge of servers’ state check status
haproxy_backend_agg_server_check_status gauge state, proxy, instance, ins, job, ip, cls [DEPRECATED] Backend’s aggregated gauge of servers’ status
haproxy_backend_agg_server_status gauge state, proxy, instance, ins, job, ip, cls Backend’s aggregated gauge of servers’ status
haproxy_backend_backup_servers gauge proxy, instance, ins, job, ip, cls Total number of backup UP servers with a non-zero weight
haproxy_backend_bytes_in_total counter proxy, instance, ins, job, ip, cls Total number of request bytes since process started
haproxy_backend_bytes_out_total counter proxy, instance, ins, job, ip, cls Total number of response bytes since process started
haproxy_backend_check_last_change_seconds gauge proxy, instance, ins, job, ip, cls How long ago the last server state changed, in seconds
haproxy_backend_check_up_down_total counter proxy, instance, ins, job, ip, cls Total number of failed checks causing UP to DOWN server transitions, per server/backend, since the worker process started
haproxy_backend_client_aborts_total counter proxy, instance, ins, job, ip, cls Total number of requests or connections aborted by the client since the worker process started
haproxy_backend_connect_time_average_seconds gauge proxy, instance, ins, job, ip, cls Avg. connect time for last 1024 successful connections.
haproxy_backend_connection_attempts_total counter proxy, instance, ins, job, ip, cls Total number of outgoing connection attempts on this backend/server since the worker process started
haproxy_backend_connection_errors_total counter proxy, instance, ins, job, ip, cls Total number of failed connections to server since the worker process started
haproxy_backend_connection_reuses_total counter proxy, instance, ins, job, ip, cls Total number of reused connection on this backend/server since the worker process started
haproxy_backend_current_queue gauge proxy, instance, ins, job, ip, cls Number of current queued connections
haproxy_backend_current_sessions gauge proxy, instance, ins, job, ip, cls Number of current sessions on the frontend, backend or server
haproxy_backend_downtime_seconds_total counter proxy, instance, ins, job, ip, cls Total time spent in DOWN state, for server or backend
haproxy_backend_failed_header_rewriting_total counter proxy, instance, ins, job, ip, cls Total number of failed HTTP header rewrites since the worker process started
haproxy_backend_http_cache_hits_total counter proxy, instance, ins, job, ip, cls Total number of HTTP requests not found in the cache on this frontend/backend since the worker process started
haproxy_backend_http_cache_lookups_total counter proxy, instance, ins, job, ip, cls Total number of HTTP requests looked up in the cache on this frontend/backend since the worker process started
haproxy_backend_http_comp_bytes_bypassed_total counter proxy, instance, ins, job, ip, cls Total number of bytes that bypassed HTTP compression for this object since the worker process started (CPU/memory/bandwidth limitation)
haproxy_backend_http_comp_bytes_in_total counter proxy, instance, ins, job, ip, cls Total number of bytes submitted to the HTTP compressor for this object since the worker process started
haproxy_backend_http_comp_bytes_out_total counter proxy, instance, ins, job, ip, cls Total number of bytes emitted by the HTTP compressor for this object since the worker process started
haproxy_backend_http_comp_responses_total counter proxy, instance, ins, job, ip, cls Total number of HTTP responses that were compressed for this object since the worker process started
haproxy_backend_http_requests_total counter proxy, instance, ins, job, ip, cls Total number of HTTP requests processed by this object since the worker process started
haproxy_backend_http_responses_total counter ip, proxy, ins, code, job, instance, cls Total number of HTTP responses with status 100-199 returned by this object since the worker process started
haproxy_backend_internal_errors_total counter proxy, instance, ins, job, ip, cls Total number of internal errors since process started
haproxy_backend_last_session_seconds gauge proxy, instance, ins, job, ip, cls How long ago some traffic was seen on this object on this worker process, in seconds
haproxy_backend_limit_sessions gauge proxy, instance, ins, job, ip, cls Frontend/listener/server’s maxconn, backend’s fullconn
haproxy_backend_loadbalanced_total counter proxy, instance, ins, job, ip, cls Total number of requests routed by load balancing since the worker process started (ignores queue pop and stickiness)
haproxy_backend_max_connect_time_seconds gauge proxy, instance, ins, job, ip, cls Maximum observed time spent waiting for a connection to complete
haproxy_backend_max_queue gauge proxy, instance, ins, job, ip, cls Highest value of queued connections encountered since process started
haproxy_backend_max_queue_time_seconds gauge proxy, instance, ins, job, ip, cls Maximum observed time spent in the queue
haproxy_backend_max_response_time_seconds gauge proxy, instance, ins, job, ip, cls Maximum observed time spent waiting for a server response
haproxy_backend_max_session_rate gauge proxy, instance, ins, job, ip, cls Highest value of sessions per second observed since the worker process started
haproxy_backend_max_sessions gauge proxy, instance, ins, job, ip, cls Highest value of current sessions encountered since process started
haproxy_backend_max_total_time_seconds gauge proxy, instance, ins, job, ip, cls Maximum observed total request+response time (request+queue+connect+response+processing)
haproxy_backend_queue_time_average_seconds gauge proxy, instance, ins, job, ip, cls Avg. queue time for last 1024 successful connections.
haproxy_backend_redispatch_warnings_total counter proxy, instance, ins, job, ip, cls Total number of server redispatches due to connection failures since the worker process started
haproxy_backend_requests_denied_total counter proxy, instance, ins, job, ip, cls Total number of denied requests since process started
haproxy_backend_response_errors_total counter proxy, instance, ins, job, ip, cls Total number of invalid responses since the worker process started
haproxy_backend_response_time_average_seconds gauge proxy, instance, ins, job, ip, cls Avg. response time for last 1024 successful connections.
haproxy_backend_responses_denied_total counter proxy, instance, ins, job, ip, cls Total number of denied responses since process started
haproxy_backend_retry_warnings_total counter proxy, instance, ins, job, ip, cls Total number of server connection retries since the worker process started
haproxy_backend_server_aborts_total counter proxy, instance, ins, job, ip, cls Total number of requests or connections aborted by the server since the worker process started
haproxy_backend_sessions_total counter proxy, instance, ins, job, ip, cls Total number of sessions since process started
haproxy_backend_status gauge state, proxy, instance, ins, job, ip, cls Current status of the service, per state label value.
haproxy_backend_total_time_average_seconds gauge proxy, instance, ins, job, ip, cls Avg. total time for last 1024 successful connections.
haproxy_backend_uweight gauge proxy, instance, ins, job, ip, cls Server’s user weight, or sum of active servers’ user weights for a backend
haproxy_backend_weight gauge proxy, instance, ins, job, ip, cls Server’s effective weight, or sum of active servers’ effective weights for a backend
haproxy_frontend_bytes_in_total counter proxy, instance, ins, job, ip, cls Total number of request bytes since process started
haproxy_frontend_bytes_out_total counter proxy, instance, ins, job, ip, cls Total number of response bytes since process started
haproxy_frontend_connections_rate_max gauge proxy, instance, ins, job, ip, cls Highest value of connections per second observed since the worker process started
haproxy_frontend_connections_total counter proxy, instance, ins, job, ip, cls Total number of new connections accepted on this frontend since the worker process started
haproxy_frontend_current_sessions gauge proxy, instance, ins, job, ip, cls Number of current sessions on the frontend, backend or server
haproxy_frontend_denied_connections_total counter proxy, instance, ins, job, ip, cls Total number of incoming connections blocked on a listener/frontend by a tcp-request connection rule since the worker process started
haproxy_frontend_denied_sessions_total counter proxy, instance, ins, job, ip, cls Total number of incoming sessions blocked on a listener/frontend by a tcp-request connection rule since the worker process started
haproxy_frontend_failed_header_rewriting_total counter proxy, instance, ins, job, ip, cls Total number of failed HTTP header rewrites since the worker process started
haproxy_frontend_http_cache_hits_total counter proxy, instance, ins, job, ip, cls Total number of HTTP requests not found in the cache on this frontend/backend since the worker process started
haproxy_frontend_http_cache_lookups_total counter proxy, instance, ins, job, ip, cls Total number of HTTP requests looked up in the cache on this frontend/backend since the worker process started
haproxy_frontend_http_comp_bytes_bypassed_total counter proxy, instance, ins, job, ip, cls Total number of bytes that bypassed HTTP compression for this object since the worker process started (CPU/memory/bandwidth limitation)
haproxy_frontend_http_comp_bytes_in_total counter proxy, instance, ins, job, ip, cls Total number of bytes submitted to the HTTP compressor for this object since the worker process started
haproxy_frontend_http_comp_bytes_out_total counter proxy, instance, ins, job, ip, cls Total number of bytes emitted by the HTTP compressor for this object since the worker process started
haproxy_frontend_http_comp_responses_total counter proxy, instance, ins, job, ip, cls Total number of HTTP responses that were compressed for this object since the worker process started
haproxy_frontend_http_requests_rate_max gauge proxy, instance, ins, job, ip, cls Highest value of http requests observed since the worker process started
haproxy_frontend_http_requests_total counter proxy, instance, ins, job, ip, cls Total number of HTTP requests processed by this object since the worker process started
haproxy_frontend_http_responses_total counter ip, proxy, ins, code, job, instance, cls Total number of HTTP responses with status 100-199 returned by this object since the worker process started
haproxy_frontend_intercepted_requests_total counter proxy, instance, ins, job, ip, cls Total number of HTTP requests intercepted on the frontend (redirects/stats/services) since the worker process started
haproxy_frontend_internal_errors_total counter proxy, instance, ins, job, ip, cls Total number of internal errors since process started
haproxy_frontend_limit_session_rate gauge proxy, instance, ins, job, ip, cls Limit on the number of sessions accepted in a second (frontend only, ‘rate-limit sessions’ setting)
haproxy_frontend_limit_sessions gauge proxy, instance, ins, job, ip, cls Frontend/listener/server’s maxconn, backend’s fullconn
haproxy_frontend_max_session_rate gauge proxy, instance, ins, job, ip, cls Highest value of sessions per second observed since the worker process started
haproxy_frontend_max_sessions gauge proxy, instance, ins, job, ip, cls Highest value of current sessions encountered since process started
haproxy_frontend_request_errors_total counter proxy, instance, ins, job, ip, cls Total number of invalid requests since process started
haproxy_frontend_requests_denied_total counter proxy, instance, ins, job, ip, cls Total number of denied requests since process started
haproxy_frontend_responses_denied_total counter proxy, instance, ins, job, ip, cls Total number of denied responses since process started
haproxy_frontend_sessions_total counter proxy, instance, ins, job, ip, cls Total number of sessions since process started
haproxy_frontend_status gauge state, proxy, instance, ins, job, ip, cls Current status of the service, per state label value.
haproxy_process_active_peers gauge instance, ins, job, ip, cls Current number of verified active peers connections on the current worker process
haproxy_process_build_info gauge version, instance, ins, job, ip, cls Build info
haproxy_process_busy_polling_enabled gauge instance, ins, job, ip, cls 1 if busy-polling is currently in use on the worker process, otherwise zero (config.busy-polling)
haproxy_process_bytes_out_rate gauge instance, ins, job, ip, cls Number of bytes emitted by current worker process over the last second
haproxy_process_bytes_out_total counter instance, ins, job, ip, cls Total number of bytes emitted by current worker process since started
haproxy_process_connected_peers gauge instance, ins, job, ip, cls Current number of peers having passed the connection step on the current worker process
haproxy_process_connections_total counter instance, ins, job, ip, cls Total number of connections on this worker process since started
haproxy_process_current_backend_ssl_key_rate gauge instance, ins, job, ip, cls Number of SSL keys created on backends in this worker process over the last second
haproxy_process_current_connection_rate gauge instance, ins, job, ip, cls Number of front connections created on this worker process over the last second
haproxy_process_current_connections gauge instance, ins, job, ip, cls Current number of connections on this worker process
haproxy_process_current_frontend_ssl_key_rate gauge instance, ins, job, ip, cls Number of SSL keys created on frontends in this worker process over the last second
haproxy_process_current_run_queue gauge instance, ins, job, ip, cls Total number of active tasks+tasklets in the current worker process
haproxy_process_current_session_rate gauge instance, ins, job, ip, cls Number of sessions created on this worker process over the last second
haproxy_process_current_ssl_connections gauge instance, ins, job, ip, cls Current number of SSL endpoints on this worker process (front+back)
haproxy_process_current_ssl_rate gauge instance, ins, job, ip, cls Number of SSL connections created on this worker process over the last second
haproxy_process_current_tasks gauge instance, ins, job, ip, cls Total number of tasks in the current worker process (active + sleeping)
haproxy_process_current_zlib_memory gauge instance, ins, job, ip, cls Amount of memory currently used by HTTP compression on the current worker process (in bytes)
haproxy_process_dropped_logs_total counter instance, ins, job, ip, cls Total number of dropped logs for current worker process since started
haproxy_process_failed_resolutions counter instance, ins, job, ip, cls Total number of failed DNS resolutions in current worker process since started
haproxy_process_frontend_ssl_reuse gauge instance, ins, job, ip, cls Percent of frontend SSL connections which did not require a new key
haproxy_process_hard_max_connections gauge instance, ins, job, ip, cls Hard limit on the number of per-process connections (imposed by Memmax_MB or Ulimit-n)
haproxy_process_http_comp_bytes_in_total counter instance, ins, job, ip, cls Number of bytes submitted to the HTTP compressor in this worker process over the last second
haproxy_process_http_comp_bytes_out_total counter instance, ins, job, ip, cls Number of bytes emitted by the HTTP compressor in this worker process over the last second
haproxy_process_idle_time_percent gauge instance, ins, job, ip, cls Percentage of last second spent waiting in the current worker thread
haproxy_process_jobs gauge instance, ins, job, ip, cls Current number of active jobs on the current worker process (frontend connections, master connections, listeners)
haproxy_process_limit_connection_rate gauge instance, ins, job, ip, cls Hard limit for ConnRate (global.maxconnrate)
haproxy_process_limit_http_comp gauge instance, ins, job, ip, cls Limit of CompressBpsOut beyond which HTTP compression is automatically disabled
haproxy_process_limit_session_rate gauge instance, ins, job, ip, cls Hard limit for SessRate (global.maxsessrate)
haproxy_process_limit_ssl_rate gauge instance, ins, job, ip, cls Hard limit for SslRate (global.maxsslrate)
haproxy_process_listeners gauge instance, ins, job, ip, cls Current number of active listeners on the current worker process
haproxy_process_max_backend_ssl_key_rate gauge instance, ins, job, ip, cls Highest SslBackendKeyRate reached on this worker process since started (in SSL keys per second)
haproxy_process_max_connection_rate gauge instance, ins, job, ip, cls Highest ConnRate reached on this worker process since started (in connections per second)
haproxy_process_max_connections gauge instance, ins, job, ip, cls Hard limit on the number of per-process connections (configured or imposed by Ulimit-n)
haproxy_process_max_fds gauge instance, ins, job, ip, cls Hard limit on the number of per-process file descriptors
haproxy_process_max_frontend_ssl_key_rate gauge instance, ins, job, ip, cls Highest SslFrontendKeyRate reached on this worker process since started (in SSL keys per second)
haproxy_process_max_memory_bytes gauge instance, ins, job, ip, cls Worker process’s hard limit on memory usage in byes (-m on command line)
haproxy_process_max_pipes gauge instance, ins, job, ip, cls Hard limit on the number of pipes for splicing, 0=unlimited
haproxy_process_max_session_rate gauge instance, ins, job, ip, cls Highest SessRate reached on this worker process since started (in sessions per second)
haproxy_process_max_sockets gauge instance, ins, job, ip, cls Hard limit on the number of per-process sockets
haproxy_process_max_ssl_connections gauge instance, ins, job, ip, cls Hard limit on the number of per-process SSL endpoints (front+back), 0=unlimited
haproxy_process_max_ssl_rate gauge instance, ins, job, ip, cls Highest SslRate reached on this worker process since started (in connections per second)
haproxy_process_max_zlib_memory gauge instance, ins, job, ip, cls Limit on the amount of memory used by HTTP compression above which it is automatically disabled (in bytes, see global.maxzlibmem)
haproxy_process_nbproc gauge instance, ins, job, ip, cls Number of started worker processes (historical, always 1)
haproxy_process_nbthread gauge instance, ins, job, ip, cls Number of started threads (global.nbthread)
haproxy_process_pipes_free_total counter instance, ins, job, ip, cls Current number of allocated and available pipes in this worker process
haproxy_process_pipes_used_total counter instance, ins, job, ip, cls Current number of pipes in use in this worker process
haproxy_process_pool_allocated_bytes gauge instance, ins, job, ip, cls Amount of memory allocated in pools (in bytes)
haproxy_process_pool_failures_total counter instance, ins, job, ip, cls Number of failed pool allocations since this worker was started
haproxy_process_pool_used_bytes gauge instance, ins, job, ip, cls Amount of pool memory currently used (in bytes)
haproxy_process_recv_logs_total counter instance, ins, job, ip, cls Total number of log messages received by log-forwarding listeners on this worker process since started
haproxy_process_relative_process_id gauge instance, ins, job, ip, cls Relative worker process number (1)
haproxy_process_requests_total counter instance, ins, job, ip, cls Total number of requests on this worker process since started
haproxy_process_spliced_bytes_out_total counter instance, ins, job, ip, cls Total number of bytes emitted by current worker process through a kernel pipe since started
haproxy_process_ssl_cache_lookups_total counter instance, ins, job, ip, cls Total number of SSL session ID lookups in the SSL session cache on this worker since started
haproxy_process_ssl_cache_misses_total counter instance, ins, job, ip, cls Total number of SSL session ID lookups that didn’t find a session in the SSL session cache on this worker since started
haproxy_process_ssl_connections_total counter instance, ins, job, ip, cls Total number of SSL endpoints on this worker process since started (front+back)
haproxy_process_start_time_seconds gauge instance, ins, job, ip, cls Start time in seconds
haproxy_process_stopping gauge instance, ins, job, ip, cls 1 if the worker process is currently stopping, otherwise zero
haproxy_process_unstoppable_jobs gauge instance, ins, job, ip, cls Current number of unstoppable jobs on the current worker process (master connections)
haproxy_process_uptime_seconds gauge instance, ins, job, ip, cls How long ago this worker process was started (seconds)
haproxy_server_bytes_in_total counter proxy, instance, ins, job, server, ip, cls Total number of request bytes since process started
haproxy_server_bytes_out_total counter proxy, instance, ins, job, server, ip, cls Total number of response bytes since process started
haproxy_server_check_code gauge proxy, instance, ins, job, server, ip, cls layer5-7 code, if available of the last health check.
haproxy_server_check_duration_seconds gauge proxy, instance, ins, job, server, ip, cls Total duration of the latest server health check, in seconds.
haproxy_server_check_failures_total counter proxy, instance, ins, job, server, ip, cls Total number of failed individual health checks per server/backend, since the worker process started
haproxy_server_check_last_change_seconds gauge proxy, instance, ins, job, server, ip, cls How long ago the last server state changed, in seconds
haproxy_server_check_status gauge state, proxy, instance, ins, job, server, ip, cls Status of last health check, per state label value.
haproxy_server_check_up_down_total counter proxy, instance, ins, job, server, ip, cls Total number of failed checks causing UP to DOWN server transitions, per server/backend, since the worker process started
haproxy_server_client_aborts_total counter proxy, instance, ins, job, server, ip, cls Total number of requests or connections aborted by the client since the worker process started
haproxy_server_connect_time_average_seconds gauge proxy, instance, ins, job, server, ip, cls Avg. connect time for last 1024 successful connections.
haproxy_server_connection_attempts_total counter proxy, instance, ins, job, server, ip, cls Total number of outgoing connection attempts on this backend/server since the worker process started
haproxy_server_connection_errors_total counter proxy, instance, ins, job, server, ip, cls Total number of failed connections to server since the worker process started
haproxy_server_connection_reuses_total counter proxy, instance, ins, job, server, ip, cls Total number of reused connection on this backend/server since the worker process started
haproxy_server_current_queue gauge proxy, instance, ins, job, server, ip, cls Number of current queued connections
haproxy_server_current_sessions gauge proxy, instance, ins, job, server, ip, cls Number of current sessions on the frontend, backend or server
haproxy_server_current_throttle gauge proxy, instance, ins, job, server, ip, cls Throttling ratio applied to a server’s maxconn and weight during the slowstart period (0 to 100%)
haproxy_server_downtime_seconds_total counter proxy, instance, ins, job, server, ip, cls Total time spent in DOWN state, for server or backend
haproxy_server_failed_header_rewriting_total counter proxy, instance, ins, job, server, ip, cls Total number of failed HTTP header rewrites since the worker process started
haproxy_server_idle_connections_current gauge proxy, instance, ins, job, server, ip, cls Current number of idle connections available for reuse on this server
haproxy_server_idle_connections_limit gauge proxy, instance, ins, job, server, ip, cls Limit on the number of available idle connections on this server (server ‘pool_max_conn’ directive)
haproxy_server_internal_errors_total counter proxy, instance, ins, job, server, ip, cls Total number of internal errors since process started
haproxy_server_last_session_seconds gauge proxy, instance, ins, job, server, ip, cls How long ago some traffic was seen on this object on this worker process, in seconds
haproxy_server_limit_sessions gauge proxy, instance, ins, job, server, ip, cls Frontend/listener/server’s maxconn, backend’s fullconn
haproxy_server_loadbalanced_total counter proxy, instance, ins, job, server, ip, cls Total number of requests routed by load balancing since the worker process started (ignores queue pop and stickiness)
haproxy_server_max_connect_time_seconds gauge proxy, instance, ins, job, server, ip, cls Maximum observed time spent waiting for a connection to complete
haproxy_server_max_queue gauge proxy, instance, ins, job, server, ip, cls Highest value of queued connections encountered since process started
haproxy_server_max_queue_time_seconds gauge proxy, instance, ins, job, server, ip, cls Maximum observed time spent in the queue
haproxy_server_max_response_time_seconds gauge proxy, instance, ins, job, server, ip, cls Maximum observed time spent waiting for a server response
haproxy_server_max_session_rate gauge proxy, instance, ins, job, server, ip, cls Highest value of sessions per second observed since the worker process started
haproxy_server_max_sessions gauge proxy, instance, ins, job, server, ip, cls Highest value of current sessions encountered since process started
haproxy_server_max_total_time_seconds gauge proxy, instance, ins, job, server, ip, cls Maximum observed total request+response time (request+queue+connect+response+processing)
haproxy_server_need_connections_current gauge proxy, instance, ins, job, server, ip, cls Estimated needed number of connections
haproxy_server_queue_limit gauge proxy, instance, ins, job, server, ip, cls Limit on the number of connections in queue, for servers only (maxqueue argument)
haproxy_server_queue_time_average_seconds gauge proxy, instance, ins, job, server, ip, cls Avg. queue time for last 1024 successful connections.
haproxy_server_redispatch_warnings_total counter proxy, instance, ins, job, server, ip, cls Total number of server redispatches due to connection failures since the worker process started
haproxy_server_response_errors_total counter proxy, instance, ins, job, server, ip, cls Total number of invalid responses since the worker process started
haproxy_server_response_time_average_seconds gauge proxy, instance, ins, job, server, ip, cls Avg. response time for last 1024 successful connections.
haproxy_server_responses_denied_total counter proxy, instance, ins, job, server, ip, cls Total number of denied responses since process started
haproxy_server_retry_warnings_total counter proxy, instance, ins, job, server, ip, cls Total number of server connection retries since the worker process started
haproxy_server_safe_idle_connections_current gauge proxy, instance, ins, job, server, ip, cls Current number of safe idle connections
haproxy_server_server_aborts_total counter proxy, instance, ins, job, server, ip, cls Total number of requests or connections aborted by the server since the worker process started
haproxy_server_sessions_total counter proxy, instance, ins, job, server, ip, cls Total number of sessions since process started
haproxy_server_status gauge state, proxy, instance, ins, job, server, ip, cls Current status of the service, per state label value.
haproxy_server_total_time_average_seconds gauge proxy, instance, ins, job, server, ip, cls Avg. total time for last 1024 successful connections.
haproxy_server_unsafe_idle_connections_current gauge proxy, instance, ins, job, server, ip, cls Current number of unsafe idle connections
haproxy_server_used_connections_current gauge proxy, instance, ins, job, server, ip, cls Current number of connections in use
haproxy_server_uweight gauge proxy, instance, ins, job, server, ip, cls Server’s user weight, or sum of active servers’ user weights for a backend
haproxy_server_weight gauge proxy, instance, ins, job, server, ip, cls Server’s effective weight, or sum of active servers’ effective weights for a backend
haproxy_up Unknown instance, ins, job, ip, cls N/A
inflight_requests gauge instance, ins, job, route, ip, cls, method Current number of inflight requests.
jaeger_tracer_baggage_restrictions_updates_total Unknown instance, ins, job, result, ip, cls N/A
jaeger_tracer_baggage_truncations_total Unknown instance, ins, job, ip, cls N/A
jaeger_tracer_baggage_updates_total Unknown instance, ins, job, result, ip, cls N/A
jaeger_tracer_finished_spans_total Unknown instance, ins, job, sampled, ip, cls N/A
jaeger_tracer_reporter_queue_length gauge instance, ins, job, ip, cls Current number of spans in the reporter queue
jaeger_tracer_reporter_spans_total Unknown instance, ins, job, result, ip, cls N/A
jaeger_tracer_sampler_queries_total Unknown instance, ins, job, result, ip, cls N/A
jaeger_tracer_sampler_updates_total Unknown instance, ins, job, result, ip, cls N/A
jaeger_tracer_span_context_decoding_errors_total Unknown instance, ins, job, ip, cls N/A
jaeger_tracer_started_spans_total Unknown instance, ins, job, sampled, ip, cls N/A
jaeger_tracer_throttled_debug_spans_total Unknown instance, ins, job, ip, cls N/A
jaeger_tracer_throttler_updates_total Unknown instance, ins, job, result, ip, cls N/A
jaeger_tracer_traces_total Unknown state, instance, ins, job, sampled, ip, cls N/A
loki_experimental_features_in_use_total Unknown instance, ins, job, ip, cls N/A
loki_internal_log_messages_total Unknown level, instance, ins, job, ip, cls N/A
loki_log_flushes_bucket Unknown instance, ins, job, le, ip, cls N/A
loki_log_flushes_count Unknown instance, ins, job, ip, cls N/A
loki_log_flushes_sum Unknown instance, ins, job, ip, cls N/A
loki_log_messages_total Unknown level, instance, ins, job, ip, cls N/A
loki_logql_querystats_duplicates_total Unknown instance, ins, job, ip, cls N/A
loki_logql_querystats_ingester_sent_lines_total Unknown instance, ins, job, ip, cls N/A
loki_querier_index_cache_corruptions_total Unknown instance, ins, job, ip, cls N/A
loki_querier_index_cache_encode_errors_total Unknown instance, ins, job, ip, cls N/A
loki_querier_index_cache_gets_total Unknown instance, ins, job, ip, cls N/A
loki_querier_index_cache_hits_total Unknown instance, ins, job, ip, cls N/A
loki_querier_index_cache_puts_total Unknown instance, ins, job, ip, cls N/A
net_conntrack_dialer_conn_attempted_total counter ip, ins, job, instance, cls, dialer_name Total number of connections attempted by the given dialer a given name.
net_conntrack_dialer_conn_closed_total counter ip, ins, job, instance, cls, dialer_name Total number of connections closed which originated from the dialer of a given name.
net_conntrack_dialer_conn_established_total counter ip, ins, job, instance, cls, dialer_name Total number of connections successfully established by the given dialer a given name.
net_conntrack_dialer_conn_failed_total counter ip, ins, job, reason, instance, cls, dialer_name Total number of connections failed to dial by the dialer a given name.
node:cls:avail_bytes Unknown job, cls N/A
node:cls:cpu_count Unknown job, cls N/A
node:cls:cpu_usage Unknown job, cls N/A
node:cls:cpu_usage_15m Unknown job, cls N/A
node:cls:cpu_usage_1m Unknown job, cls N/A
node:cls:cpu_usage_5m Unknown job, cls N/A
node:cls:disk_io_bytes_rate1m Unknown job, cls N/A
node:cls:disk_iops_1m Unknown job, cls N/A
node:cls:disk_mreads_rate1m Unknown job, cls N/A
node:cls:disk_mreads_ratio1m Unknown job, cls N/A
node:cls:disk_mwrites_rate1m Unknown job, cls N/A
node:cls:disk_mwrites_ratio1m Unknown job, cls N/A
node:cls:disk_read_bytes_rate1m Unknown job, cls N/A
node:cls:disk_reads_rate1m Unknown job, cls N/A
node:cls:disk_write_bytes_rate1m Unknown job, cls N/A
node:cls:disk_writes_rate1m Unknown job, cls N/A
node:cls:free_bytes Unknown job, cls N/A
node:cls:mem_usage Unknown job, cls N/A
node:cls:network_io_bytes_rate1m Unknown job, cls N/A
node:cls:network_rx_bytes_rate1m Unknown job, cls N/A
node:cls:network_rx_pps1m Unknown job, cls N/A
node:cls:network_tx_bytes_rate1m Unknown job, cls N/A
node:cls:network_tx_pps1m Unknown job, cls N/A
node:cls:size_bytes Unknown job, cls N/A
node:cls:space_usage Unknown job, cls N/A
node:cls:space_usage_max Unknown job, cls N/A
node:cls:stdload1 Unknown job, cls N/A
node:cls:stdload15 Unknown job, cls N/A
node:cls:stdload5 Unknown job, cls N/A
node:cls:time_drift_max Unknown job, cls N/A
node:cpu:idle_time_irate1m Unknown ip, ins, job, cpu, instance, cls N/A
node:cpu:sched_timeslices_rate1m Unknown ip, ins, job, cpu, instance, cls N/A
node:cpu:sched_wait_rate1m Unknown ip, ins, job, cpu, instance, cls N/A
node:cpu:time_irate1m Unknown ip, mode, ins, job, cpu, instance, cls N/A
node:cpu:total_time_irate1m Unknown ip, ins, job, cpu, instance, cls N/A
node:cpu:usage Unknown ip, ins, job, cpu, instance, cls N/A
node:cpu:usage_avg15m Unknown ip, ins, job, cpu, instance, cls N/A
node:cpu:usage_avg1m Unknown ip, ins, job, cpu, instance, cls N/A
node:cpu:usage_avg5m Unknown ip, ins, job, cpu, instance, cls N/A
node:dev:disk_avg_queue_size Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_io_batch_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_io_bytes_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_io_rt_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_io_time_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_iops_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_mreads_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_mreads_ratio1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_mwrites_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_mwrites_ratio1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_read_batch_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_read_bytes_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_read_rt_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_read_time_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_reads_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_util_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_write_batch_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_write_bytes_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_write_rt_1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_write_time_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:disk_writes_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:network_io_bytes_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:network_rx_bytes_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:network_rx_pps1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:network_tx_bytes_rate1m Unknown ip, device, ins, job, instance, cls N/A
node:dev:network_tx_pps1m Unknown ip, device, ins, job, instance, cls N/A
node:env:avail_bytes Unknown job N/A
node:env:cpu_count Unknown job N/A
node:env:cpu_usage Unknown job N/A
node:env:cpu_usage_15m Unknown job N/A
node:env:cpu_usage_1m Unknown job N/A
node:env:cpu_usage_5m Unknown job N/A
node:env:device_space_usage_max Unknown device, mountpoint, job, fstype N/A
node:env:free_bytes Unknown job N/A
node:env:mem_avail Unknown job N/A
node:env:mem_total Unknown job N/A
node:env:mem_usage Unknown job N/A
node:env:size_bytes Unknown job N/A
node:env:space_usage Unknown job N/A
node:env:stdload1 Unknown job N/A
node:env:stdload15 Unknown job N/A
node:env:stdload5 Unknown job N/A
node:fs:avail_bytes Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:free_bytes Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:inode_free Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:inode_total Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:inode_usage Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:inode_used Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:size_bytes Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:space_deriv1h Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:space_exhaust Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:space_predict_1d Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:fs:space_usage Unknown ip, device, mountpoint, ins, cls, job, instance, fstype N/A
node:ins Unknown id, ip, ins, job, nodename, instance, cls N/A
node:ins:avail_bytes Unknown instance, ins, job, ip, cls N/A
node:ins:cpu_count Unknown instance, ins, job, ip, cls N/A
node:ins:cpu_usage Unknown instance, ins, job, ip, cls N/A
node:ins:cpu_usage_15m Unknown instance, ins, job, ip, cls N/A
node:ins:cpu_usage_1m Unknown instance, ins, job, ip, cls N/A
node:ins:cpu_usage_5m Unknown instance, ins, job, ip, cls N/A
node:ins:ctx_switch_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_io_bytes_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_iops_1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_mreads_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_mreads_ratio1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_mwrites_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_mwrites_ratio1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_read_bytes_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_reads_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_write_bytes_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:disk_writes_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:fd_alloc_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:fd_usage Unknown instance, ins, job, ip, cls N/A
node:ins:forks_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:free_bytes Unknown instance, ins, job, ip, cls N/A
node:ins:inode_usage Unknown instance, ins, job, ip, cls N/A
node:ins:interrupt_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:mem_avail Unknown instance, ins, job, ip, cls N/A
node:ins:mem_commit_ratio Unknown instance, ins, job, ip, cls N/A
node:ins:mem_kernel Unknown instance, ins, job, ip, cls N/A
node:ins:mem_rss Unknown instance, ins, job, ip, cls N/A
node:ins:mem_usage Unknown instance, ins, job, ip, cls N/A
node:ins:network_io_bytes_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:network_rx_bytes_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:network_rx_pps1m Unknown instance, ins, job, ip, cls N/A
node:ins:network_tx_bytes_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:network_tx_pps1m Unknown instance, ins, job, ip, cls N/A
node:ins:pagefault_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:pagein_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:pageout_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:pgmajfault_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:sched_wait_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:size_bytes Unknown instance, ins, job, ip, cls N/A
node:ins:space_usage_max Unknown instance, ins, job, ip, cls N/A
node:ins:stdload1 Unknown instance, ins, job, ip, cls N/A
node:ins:stdload15 Unknown instance, ins, job, ip, cls N/A
node:ins:stdload5 Unknown instance, ins, job, ip, cls N/A
node:ins:swap_usage Unknown instance, ins, job, ip, cls N/A
node:ins:swapin_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:swapout_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_active_opens_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_dropped_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_error Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_error_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_insegs_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_outsegs_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_overflow_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_passive_opens_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_retrans_ratio1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_retranssegs_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:tcp_segs_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:time_drift Unknown instance, ins, job, ip, cls N/A
node:ins:udp_in_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:udp_out_rate1m Unknown instance, ins, job, ip, cls N/A
node:ins:uptime Unknown instance, ins, job, ip, cls N/A
node_arp_entries gauge ip, device, ins, job, instance, cls ARP entries by device
node_boot_time_seconds gauge instance, ins, job, ip, cls Node boot time, in unixtime.
node_context_switches_total counter instance, ins, job, ip, cls Total number of context switches.
node_cooling_device_cur_state gauge instance, ins, job, type, ip, cls Current throttle state of the cooling device
node_cooling_device_max_state gauge instance, ins, job, type, ip, cls Maximum throttle state of the cooling device
node_cpu_guest_seconds_total counter ip, mode, ins, job, cpu, instance, cls Seconds the CPUs spent in guests (VMs) for each mode.
node_cpu_seconds_total counter ip, mode, ins, job, cpu, instance, cls Seconds the CPUs spent in each mode.
node_disk_discard_time_seconds_total counter ip, device, ins, job, instance, cls This is the total number of seconds spent by all discards.
node_disk_discarded_sectors_total counter ip, device, ins, job, instance, cls The total number of sectors discarded successfully.
node_disk_discards_completed_total counter ip, device, ins, job, instance, cls The total number of discards completed successfully.
node_disk_discards_merged_total counter ip, device, ins, job, instance, cls The total number of discards merged.
node_disk_filesystem_info gauge ip, usage, version, device, uuid, ins, type, job, instance, cls Info about disk filesystem.
node_disk_info gauge minor, ip, major, revision, device, model, serial, path, ins, job, instance, cls Info of /sys/block/<block_device>.
node_disk_io_now gauge ip, device, ins, job, instance, cls The number of I/Os currently in progress.
node_disk_io_time_seconds_total counter ip, device, ins, job, instance, cls Total seconds spent doing I/Os.
node_disk_io_time_weighted_seconds_total counter ip, device, ins, job, instance, cls The weighted # of seconds spent doing I/Os.
node_disk_read_bytes_total counter ip, device, ins, job, instance, cls The total number of bytes read successfully.
node_disk_read_time_seconds_total counter ip, device, ins, job, instance, cls The total number of seconds spent by all reads.
node_disk_reads_completed_total counter ip, device, ins, job, instance, cls The total number of reads completed successfully.
node_disk_reads_merged_total counter ip, device, ins, job, instance, cls The total number of reads merged.
node_disk_write_time_seconds_total counter ip, device, ins, job, instance, cls This is the total number of seconds spent by all writes.
node_disk_writes_completed_total counter ip, device, ins, job, instance, cls The total number of writes completed successfully.
node_disk_writes_merged_total counter ip, device, ins, job, instance, cls The number of writes merged.
node_disk_written_bytes_total counter ip, device, ins, job, instance, cls The total number of bytes written successfully.
node_dmi_info gauge bios_vendor, ip, product_family, product_version, product_uuid, system_vendor, bios_version, ins, bios_date, cls, job, product_name, instance, chassis_version, chassis_vendor, product_serial A metric with a constant ‘1’ value labeled by bios_date, bios_release, bios_vendor, bios_version, board_asset_tag, board_name, board_serial, board_vendor, board_version, chassis_asset_tag, chassis_serial, chassis_vendor, chassis_version, product_family, product_name, product_serial, product_sku, product_uuid, product_version, system_vendor if provided by DMI.
node_entropy_available_bits gauge instance, ins, job, ip, cls Bits of available entropy.
node_entropy_pool_size_bits gauge instance, ins, job, ip, cls Bits of entropy pool.
node_exporter_build_info gauge ip, version, revision, goversion, branch, ins, goarch, job, tags, instance, cls, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which node_exporter was built, and the goos and goarch for the build.
node_filefd_allocated gauge instance, ins, job, ip, cls File descriptor statistics: allocated.
node_filefd_maximum gauge instance, ins, job, ip, cls File descriptor statistics: maximum.
node_filesystem_avail_bytes gauge ip, device, mountpoint, ins, cls, job, instance, fstype Filesystem space available to non-root users in bytes.
node_filesystem_device_error gauge ip, device, mountpoint, ins, cls, job, instance, fstype Whether an error occurred while getting statistics for the given device.
node_filesystem_files gauge ip, device, mountpoint, ins, cls, job, instance, fstype Filesystem total file nodes.
node_filesystem_files_free gauge ip, device, mountpoint, ins, cls, job, instance, fstype Filesystem total free file nodes.
node_filesystem_free_bytes gauge ip, device, mountpoint, ins, cls, job, instance, fstype Filesystem free space in bytes.
node_filesystem_readonly gauge ip, device, mountpoint, ins, cls, job, instance, fstype Filesystem read-only status.
node_filesystem_size_bytes gauge ip, device, mountpoint, ins, cls, job, instance, fstype Filesystem size in bytes.
node_forks_total counter instance, ins, job, ip, cls Total number of forks.
node_hwmon_chip_names gauge chip_name, ip, ins, chip, job, instance, cls Annotation metric for human-readable chip names
node_hwmon_energy_joule_total counter sensor, ip, ins, chip, job, instance, cls Hardware monitor for joules used so far (input)
node_hwmon_sensor_label gauge sensor, ip, ins, chip, job, label, instance, cls Label for given chip and sensor
node_intr_total counter instance, ins, job, ip, cls Total number of interrupts serviced.
node_ipvs_connections_total counter instance, ins, job, ip, cls The total number of connections made.
node_ipvs_incoming_bytes_total counter instance, ins, job, ip, cls The total amount of incoming data.
node_ipvs_incoming_packets_total counter instance, ins, job, ip, cls The total number of incoming packets.
node_ipvs_outgoing_bytes_total counter instance, ins, job, ip, cls The total amount of outgoing data.
node_ipvs_outgoing_packets_total counter instance, ins, job, ip, cls The total number of outgoing packets.
node_load1 gauge instance, ins, job, ip, cls 1m load average.
node_load15 gauge instance, ins, job, ip, cls 15m load average.
node_load5 gauge instance, ins, job, ip, cls 5m load average.
node_memory_Active_anon_bytes gauge instance, ins, job, ip, cls Memory information field Active_anon_bytes.
node_memory_Active_bytes gauge instance, ins, job, ip, cls Memory information field Active_bytes.
node_memory_Active_file_bytes gauge instance, ins, job, ip, cls Memory information field Active_file_bytes.
node_memory_AnonHugePages_bytes gauge instance, ins, job, ip, cls Memory information field AnonHugePages_bytes.
node_memory_AnonPages_bytes gauge instance, ins, job, ip, cls Memory information field AnonPages_bytes.
node_memory_Bounce_bytes gauge instance, ins, job, ip, cls Memory information field Bounce_bytes.
node_memory_Buffers_bytes gauge instance, ins, job, ip, cls Memory information field Buffers_bytes.
node_memory_Cached_bytes gauge instance, ins, job, ip, cls Memory information field Cached_bytes.
node_memory_CommitLimit_bytes gauge instance, ins, job, ip, cls Memory information field CommitLimit_bytes.
node_memory_Committed_AS_bytes gauge instance, ins, job, ip, cls Memory information field Committed_AS_bytes.
node_memory_DirectMap1G_bytes gauge instance, ins, job, ip, cls Memory information field DirectMap1G_bytes.
node_memory_DirectMap2M_bytes gauge instance, ins, job, ip, cls Memory information field DirectMap2M_bytes.
node_memory_DirectMap4k_bytes gauge instance, ins, job, ip, cls Memory information field DirectMap4k_bytes.
node_memory_Dirty_bytes gauge instance, ins, job, ip, cls Memory information field Dirty_bytes.
node_memory_FileHugePages_bytes gauge instance, ins, job, ip, cls Memory information field FileHugePages_bytes.
node_memory_FilePmdMapped_bytes gauge instance, ins, job, ip, cls Memory information field FilePmdMapped_bytes.
node_memory_HardwareCorrupted_bytes gauge instance, ins, job, ip, cls Memory information field HardwareCorrupted_bytes.
node_memory_HugePages_Free gauge instance, ins, job, ip, cls Memory information field HugePages_Free.
node_memory_HugePages_Rsvd gauge instance, ins, job, ip, cls Memory information field HugePages_Rsvd.
node_memory_HugePages_Surp gauge instance, ins, job, ip, cls Memory information field HugePages_Surp.
node_memory_HugePages_Total gauge instance, ins, job, ip, cls Memory information field HugePages_Total.
node_memory_Hugepagesize_bytes gauge instance, ins, job, ip, cls Memory information field Hugepagesize_bytes.
node_memory_Hugetlb_bytes gauge instance, ins, job, ip, cls Memory information field Hugetlb_bytes.
node_memory_Inactive_anon_bytes gauge instance, ins, job, ip, cls Memory information field Inactive_anon_bytes.
node_memory_Inactive_bytes gauge instance, ins, job, ip, cls Memory information field Inactive_bytes.
node_memory_Inactive_file_bytes gauge instance, ins, job, ip, cls Memory information field Inactive_file_bytes.
node_memory_KReclaimable_bytes gauge instance, ins, job, ip, cls Memory information field KReclaimable_bytes.
node_memory_KernelStack_bytes gauge instance, ins, job, ip, cls Memory information field KernelStack_bytes.
node_memory_Mapped_bytes gauge instance, ins, job, ip, cls Memory information field Mapped_bytes.
node_memory_MemAvailable_bytes gauge instance, ins, job, ip, cls Memory information field MemAvailable_bytes.
node_memory_MemFree_bytes gauge instance, ins, job, ip, cls Memory information field MemFree_bytes.
node_memory_MemTotal_bytes gauge instance, ins, job, ip, cls Memory information field MemTotal_bytes.
node_memory_Mlocked_bytes gauge instance, ins, job, ip, cls Memory information field Mlocked_bytes.
node_memory_NFS_Unstable_bytes gauge instance, ins, job, ip, cls Memory information field NFS_Unstable_bytes.
node_memory_PageTables_bytes gauge instance, ins, job, ip, cls Memory information field PageTables_bytes.
node_memory_Percpu_bytes gauge instance, ins, job, ip, cls Memory information field Percpu_bytes.
node_memory_SReclaimable_bytes gauge instance, ins, job, ip, cls Memory information field SReclaimable_bytes.
node_memory_SUnreclaim_bytes gauge instance, ins, job, ip, cls Memory information field SUnreclaim_bytes.
node_memory_ShmemHugePages_bytes gauge instance, ins, job, ip, cls Memory information field ShmemHugePages_bytes.
node_memory_ShmemPmdMapped_bytes gauge instance, ins, job, ip, cls Memory information field ShmemPmdMapped_bytes.
node_memory_Shmem_bytes gauge instance, ins, job, ip, cls Memory information field Shmem_bytes.
node_memory_Slab_bytes gauge instance, ins, job, ip, cls Memory information field Slab_bytes.
node_memory_SwapCached_bytes gauge instance, ins, job, ip, cls Memory information field SwapCached_bytes.
node_memory_SwapFree_bytes gauge instance, ins, job, ip, cls Memory information field SwapFree_bytes.
node_memory_SwapTotal_bytes gauge instance, ins, job, ip, cls Memory information field SwapTotal_bytes.
node_memory_Unevictable_bytes gauge instance, ins, job, ip, cls Memory information field Unevictable_bytes.
node_memory_VmallocChunk_bytes gauge instance, ins, job, ip, cls Memory information field VmallocChunk_bytes.
node_memory_VmallocTotal_bytes gauge instance, ins, job, ip, cls Memory information field VmallocTotal_bytes.
node_memory_VmallocUsed_bytes gauge instance, ins, job, ip, cls Memory information field VmallocUsed_bytes.
node_memory_WritebackTmp_bytes gauge instance, ins, job, ip, cls Memory information field WritebackTmp_bytes.
node_memory_Writeback_bytes gauge instance, ins, job, ip, cls Memory information field Writeback_bytes.
node_netstat_Icmp6_InErrors unknown instance, ins, job, ip, cls Statistic Icmp6InErrors.
node_netstat_Icmp6_InMsgs unknown instance, ins, job, ip, cls Statistic Icmp6InMsgs.
node_netstat_Icmp6_OutMsgs unknown instance, ins, job, ip, cls Statistic Icmp6OutMsgs.
node_netstat_Icmp_InErrors unknown instance, ins, job, ip, cls Statistic IcmpInErrors.
node_netstat_Icmp_InMsgs unknown instance, ins, job, ip, cls Statistic IcmpInMsgs.
node_netstat_Icmp_OutMsgs unknown instance, ins, job, ip, cls Statistic IcmpOutMsgs.
node_netstat_Ip6_InOctets unknown instance, ins, job, ip, cls Statistic Ip6InOctets.
node_netstat_Ip6_OutOctets unknown instance, ins, job, ip, cls Statistic Ip6OutOctets.
node_netstat_IpExt_InOctets unknown instance, ins, job, ip, cls Statistic IpExtInOctets.
node_netstat_IpExt_OutOctets unknown instance, ins, job, ip, cls Statistic IpExtOutOctets.
node_netstat_Ip_Forwarding unknown instance, ins, job, ip, cls Statistic IpForwarding.
node_netstat_TcpExt_ListenDrops unknown instance, ins, job, ip, cls Statistic TcpExtListenDrops.
node_netstat_TcpExt_ListenOverflows unknown instance, ins, job, ip, cls Statistic TcpExtListenOverflows.
node_netstat_TcpExt_SyncookiesFailed unknown instance, ins, job, ip, cls Statistic TcpExtSyncookiesFailed.
node_netstat_TcpExt_SyncookiesRecv unknown instance, ins, job, ip, cls Statistic TcpExtSyncookiesRecv.
node_netstat_TcpExt_SyncookiesSent unknown instance, ins, job, ip, cls Statistic TcpExtSyncookiesSent.
node_netstat_TcpExt_TCPSynRetrans unknown instance, ins, job, ip, cls Statistic TcpExtTCPSynRetrans.
node_netstat_TcpExt_TCPTimeouts unknown instance, ins, job, ip, cls Statistic TcpExtTCPTimeouts.
node_netstat_Tcp_ActiveOpens unknown instance, ins, job, ip, cls Statistic TcpActiveOpens.
node_netstat_Tcp_CurrEstab unknown instance, ins, job, ip, cls Statistic TcpCurrEstab.
node_netstat_Tcp_InErrs unknown instance, ins, job, ip, cls Statistic TcpInErrs.
node_netstat_Tcp_InSegs unknown instance, ins, job, ip, cls Statistic TcpInSegs.
node_netstat_Tcp_OutRsts unknown instance, ins, job, ip, cls Statistic TcpOutRsts.
node_netstat_Tcp_OutSegs unknown instance, ins, job, ip, cls Statistic TcpOutSegs.
node_netstat_Tcp_PassiveOpens unknown instance, ins, job, ip, cls Statistic TcpPassiveOpens.
node_netstat_Tcp_RetransSegs unknown instance, ins, job, ip, cls Statistic TcpRetransSegs.
node_netstat_Udp6_InDatagrams unknown instance, ins, job, ip, cls Statistic Udp6InDatagrams.
node_netstat_Udp6_InErrors unknown instance, ins, job, ip, cls Statistic Udp6InErrors.
node_netstat_Udp6_NoPorts unknown instance, ins, job, ip, cls Statistic Udp6NoPorts.
node_netstat_Udp6_OutDatagrams unknown instance, ins, job, ip, cls Statistic Udp6OutDatagrams.
node_netstat_Udp6_RcvbufErrors unknown instance, ins, job, ip, cls Statistic Udp6RcvbufErrors.
node_netstat_Udp6_SndbufErrors unknown instance, ins, job, ip, cls Statistic Udp6SndbufErrors.
node_netstat_UdpLite6_InErrors unknown instance, ins, job, ip, cls Statistic UdpLite6InErrors.
node_netstat_UdpLite_InErrors unknown instance, ins, job, ip, cls Statistic UdpLiteInErrors.
node_netstat_Udp_InDatagrams unknown instance, ins, job, ip, cls Statistic UdpInDatagrams.
node_netstat_Udp_InErrors unknown instance, ins, job, ip, cls Statistic UdpInErrors.
node_netstat_Udp_NoPorts unknown instance, ins, job, ip, cls Statistic UdpNoPorts.
node_netstat_Udp_OutDatagrams unknown instance, ins, job, ip, cls Statistic UdpOutDatagrams.
node_netstat_Udp_RcvbufErrors unknown instance, ins, job, ip, cls Statistic UdpRcvbufErrors.
node_netstat_Udp_SndbufErrors unknown instance, ins, job, ip, cls Statistic UdpSndbufErrors.
node_network_address_assign_type gauge ip, device, ins, job, instance, cls Network device property: address_assign_type
node_network_carrier gauge ip, device, ins, job, instance, cls Network device property: carrier
node_network_carrier_changes_total counter ip, device, ins, job, instance, cls Network device property: carrier_changes_total
node_network_carrier_down_changes_total counter ip, device, ins, job, instance, cls Network device property: carrier_down_changes_total
node_network_carrier_up_changes_total counter ip, device, ins, job, instance, cls Network device property: carrier_up_changes_total
node_network_device_id gauge ip, device, ins, job, instance, cls Network device property: device_id
node_network_dormant gauge ip, device, ins, job, instance, cls Network device property: dormant
node_network_flags gauge ip, device, ins, job, instance, cls Network device property: flags
node_network_iface_id gauge ip, device, ins, job, instance, cls Network device property: iface_id
node_network_iface_link gauge ip, device, ins, job, instance, cls Network device property: iface_link
node_network_iface_link_mode gauge ip, device, ins, job, instance, cls Network device property: iface_link_mode
node_network_info gauge broadcast, ip, device, operstate, ins, job, adminstate, duplex, address, instance, cls Non-numeric data from /sys/class/net/, value is always 1.
node_network_mtu_bytes gauge ip, device, ins, job, instance, cls Network device property: mtu_bytes
node_network_name_assign_type gauge ip, device, ins, job, instance, cls Network device property: name_assign_type
node_network_net_dev_group gauge ip, device, ins, job, instance, cls Network device property: net_dev_group
node_network_protocol_type gauge ip, device, ins, job, instance, cls Network device property: protocol_type
node_network_receive_bytes_total counter ip, device, ins, job, instance, cls Network device statistic receive_bytes.
node_network_receive_compressed_total counter ip, device, ins, job, instance, cls Network device statistic receive_compressed.
node_network_receive_drop_total counter ip, device, ins, job, instance, cls Network device statistic receive_drop.
node_network_receive_errs_total counter ip, device, ins, job, instance, cls Network device statistic receive_errs.
node_network_receive_fifo_total counter ip, device, ins, job, instance, cls Network device statistic receive_fifo.
node_network_receive_frame_total counter ip, device, ins, job, instance, cls Network device statistic receive_frame.
node_network_receive_multicast_total counter ip, device, ins, job, instance, cls Network device statistic receive_multicast.
node_network_receive_nohandler_total counter ip, device, ins, job, instance, cls Network device statistic receive_nohandler.
node_network_receive_packets_total counter ip, device, ins, job, instance, cls Network device statistic receive_packets.
node_network_speed_bytes gauge ip, device, ins, job, instance, cls Network device property: speed_bytes
node_network_transmit_bytes_total counter ip, device, ins, job, instance, cls Network device statistic transmit_bytes.
node_network_transmit_carrier_total counter ip, device, ins, job, instance, cls Network device statistic transmit_carrier.
node_network_transmit_colls_total counter ip, device, ins, job, instance, cls Network device statistic transmit_colls.
node_network_transmit_compressed_total counter ip, device, ins, job, instance, cls Network device statistic transmit_compressed.
node_network_transmit_drop_total counter ip, device, ins, job, instance, cls Network device statistic transmit_drop.
node_network_transmit_errs_total counter ip, device, ins, job, instance, cls Network device statistic transmit_errs.
node_network_transmit_fifo_total counter ip, device, ins, job, instance, cls Network device statistic transmit_fifo.
node_network_transmit_packets_total counter ip, device, ins, job, instance, cls Network device statistic transmit_packets.
node_network_transmit_queue_length gauge ip, device, ins, job, instance, cls Network device property: transmit_queue_length
node_network_up gauge ip, device, ins, job, instance, cls Value is 1 if operstate is ‘up’, 0 otherwise.
node_nf_conntrack_entries gauge instance, ins, job, ip, cls Number of currently allocated flow entries for connection tracking.
node_nf_conntrack_entries_limit gauge instance, ins, job, ip, cls Maximum size of connection tracking table.
node_nf_conntrack_stat_drop gauge instance, ins, job, ip, cls Number of packets dropped due to conntrack failure.
node_nf_conntrack_stat_early_drop gauge instance, ins, job, ip, cls Number of dropped conntrack entries to make room for new ones, if maximum table size was reached.
node_nf_conntrack_stat_found gauge instance, ins, job, ip, cls Number of searched entries which were successful.
node_nf_conntrack_stat_ignore gauge instance, ins, job, ip, cls Number of packets seen which are already connected to a conntrack entry.
node_nf_conntrack_stat_insert gauge instance, ins, job, ip, cls Number of entries inserted into the list.
node_nf_conntrack_stat_insert_failed gauge instance, ins, job, ip, cls Number of entries for which list insertion was attempted but failed.
node_nf_conntrack_stat_invalid gauge instance, ins, job, ip, cls Number of packets seen which can not be tracked.
node_nf_conntrack_stat_search_restart gauge instance, ins, job, ip, cls Number of conntrack table lookups which had to be restarted due to hashtable resizes.
node_os_info gauge id, ip, version, version_id, ins, instance, job, pretty_name, id_like, cls A metric with a constant ‘1’ value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
node_os_version gauge id, ip, ins, instance, job, id_like, cls Metric containing the major.minor part of the OS version.
node_processes_max_processes gauge instance, ins, job, ip, cls Number of max PIDs limit
node_processes_max_threads gauge instance, ins, job, ip, cls Limit of threads in the system
node_processes_pids gauge instance, ins, job, ip, cls Number of PIDs
node_processes_state gauge state, instance, ins, job, ip, cls Number of processes in each state.
node_processes_threads gauge instance, ins, job, ip, cls Allocated threads in system
node_processes_threads_state gauge instance, ins, job, thread_state, ip, cls Number of threads in each state.
node_procs_blocked gauge instance, ins, job, ip, cls Number of processes blocked waiting for I/O to complete.
node_procs_running gauge instance, ins, job, ip, cls Number of processes in runnable state.
node_schedstat_running_seconds_total counter ip, ins, job, cpu, instance, cls Number of seconds CPU spent running a process.
node_schedstat_timeslices_total counter ip, ins, job, cpu, instance, cls Number of timeslices executed by CPU.
node_schedstat_waiting_seconds_total counter ip, ins, job, cpu, instance, cls Number of seconds spent by processing waiting for this CPU.
node_scrape_collector_duration_seconds gauge ip, collector, ins, job, instance, cls node_exporter: Duration of a collector scrape.
node_scrape_collector_success gauge ip, collector, ins, job, instance, cls node_exporter: Whether a collector succeeded.
node_selinux_enabled gauge instance, ins, job, ip, cls SELinux is enabled, 1 is true, 0 is false
node_sockstat_FRAG6_inuse gauge instance, ins, job, ip, cls Number of FRAG6 sockets in state inuse.
node_sockstat_FRAG6_memory gauge instance, ins, job, ip, cls Number of FRAG6 sockets in state memory.
node_sockstat_FRAG_inuse gauge instance, ins, job, ip, cls Number of FRAG sockets in state inuse.
node_sockstat_FRAG_memory gauge instance, ins, job, ip, cls Number of FRAG sockets in state memory.
node_sockstat_RAW6_inuse gauge instance, ins, job, ip, cls Number of RAW6 sockets in state inuse.
node_sockstat_RAW_inuse gauge instance, ins, job, ip, cls Number of RAW sockets in state inuse.
node_sockstat_TCP6_inuse gauge instance, ins, job, ip, cls Number of TCP6 sockets in state inuse.
node_sockstat_TCP_alloc gauge instance, ins, job, ip, cls Number of TCP sockets in state alloc.
node_sockstat_TCP_inuse gauge instance, ins, job, ip, cls Number of TCP sockets in state inuse.
node_sockstat_TCP_mem gauge instance, ins, job, ip, cls Number of TCP sockets in state mem.
node_sockstat_TCP_mem_bytes gauge instance, ins, job, ip, cls Number of TCP sockets in state mem_bytes.
node_sockstat_TCP_orphan gauge instance, ins, job, ip, cls Number of TCP sockets in state orphan.
node_sockstat_TCP_tw gauge instance, ins, job, ip, cls Number of TCP sockets in state tw.
node_sockstat_UDP6_inuse gauge instance, ins, job, ip, cls Number of UDP6 sockets in state inuse.
node_sockstat_UDPLITE6_inuse gauge instance, ins, job, ip, cls Number of UDPLITE6 sockets in state inuse.
node_sockstat_UDPLITE_inuse gauge instance, ins, job, ip, cls Number of UDPLITE sockets in state inuse.
node_sockstat_UDP_inuse gauge instance, ins, job, ip, cls Number of UDP sockets in state inuse.
node_sockstat_UDP_mem gauge instance, ins, job, ip, cls Number of UDP sockets in state mem.
node_sockstat_UDP_mem_bytes gauge instance, ins, job, ip, cls Number of UDP sockets in state mem_bytes.
node_sockstat_sockets_used gauge instance, ins, job, ip, cls Number of IPv4 sockets in use.
node_tcp_connection_states gauge state, instance, ins, job, ip, cls Number of connection states.
node_textfile_scrape_error gauge instance, ins, job, ip, cls 1 if there was an error opening or reading a file, 0 otherwise
node_time_clocksource_available_info gauge ip, device, ins, clocksource, job, instance, cls Available clocksources read from ‘/sys/devices/system/clocksource’.
node_time_clocksource_current_info gauge ip, device, ins, clocksource, job, instance, cls Current clocksource read from ‘/sys/devices/system/clocksource’.
node_time_seconds gauge instance, ins, job, ip, cls System time in seconds since epoch (1970).
node_time_zone_offset_seconds gauge instance, ins, job, time_zone, ip, cls System time zone offset in seconds.
node_timex_estimated_error_seconds gauge instance, ins, job, ip, cls Estimated error in seconds.
node_timex_frequency_adjustment_ratio gauge instance, ins, job, ip, cls Local clock frequency adjustment.
node_timex_loop_time_constant gauge instance, ins, job, ip, cls Phase-locked loop time constant.
node_timex_maxerror_seconds gauge instance, ins, job, ip, cls Maximum error in seconds.
node_timex_offset_seconds gauge instance, ins, job, ip, cls Time offset in between local system and reference clock.
node_timex_pps_calibration_total counter instance, ins, job, ip, cls Pulse per second count of calibration intervals.
node_timex_pps_error_total counter instance, ins, job, ip, cls Pulse per second count of calibration errors.
node_timex_pps_frequency_hertz gauge instance, ins, job, ip, cls Pulse per second frequency.
node_timex_pps_jitter_seconds gauge instance, ins, job, ip, cls Pulse per second jitter.
node_timex_pps_jitter_total counter instance, ins, job, ip, cls Pulse per second count of jitter limit exceeded events.
node_timex_pps_shift_seconds gauge instance, ins, job, ip, cls Pulse per second interval duration.
node_timex_pps_stability_exceeded_total counter instance, ins, job, ip, cls Pulse per second count of stability limit exceeded events.
node_timex_pps_stability_hertz gauge instance, ins, job, ip, cls Pulse per second stability, average of recent frequency changes.
node_timex_status gauge instance, ins, job, ip, cls Value of the status array bits.
node_timex_sync_status gauge instance, ins, job, ip, cls Is clock synchronized to a reliable server (1 = yes, 0 = no).
node_timex_tai_offset_seconds gauge instance, ins, job, ip, cls International Atomic Time (TAI) offset.
node_timex_tick_seconds gauge instance, ins, job, ip, cls Seconds between clock ticks.
node_udp_queues gauge ip, queue, ins, job, exported_ip, instance, cls Number of allocated memory in the kernel for UDP datagrams in bytes.
node_uname_info gauge ip, sysname, version, domainname, release, ins, job, nodename, instance, cls, machine Labeled system information as provided by the uname system call.
node_up Unknown instance, ins, job, ip, cls N/A
node_vmstat_oom_kill unknown instance, ins, job, ip, cls /proc/vmstat information field oom_kill.
node_vmstat_pgfault unknown instance, ins, job, ip, cls /proc/vmstat information field pgfault.
node_vmstat_pgmajfault unknown instance, ins, job, ip, cls /proc/vmstat information field pgmajfault.
node_vmstat_pgpgin unknown instance, ins, job, ip, cls /proc/vmstat information field pgpgin.
node_vmstat_pgpgout unknown instance, ins, job, ip, cls /proc/vmstat information field pgpgout.
node_vmstat_pswpin unknown instance, ins, job, ip, cls /proc/vmstat information field pswpin.
node_vmstat_pswpout unknown instance, ins, job, ip, cls /proc/vmstat information field pswpout.
process_cpu_seconds_total counter instance, ins, job, ip, cls Total user and system CPU time spent in seconds.
process_max_fds gauge instance, ins, job, ip, cls Maximum number of open file descriptors.
process_open_fds gauge instance, ins, job, ip, cls Number of open file descriptors.
process_resident_memory_bytes gauge instance, ins, job, ip, cls Resident memory size in bytes.
process_start_time_seconds gauge instance, ins, job, ip, cls Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge instance, ins, job, ip, cls Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge instance, ins, job, ip, cls Maximum amount of virtual memory available in bytes.
prometheus_remote_storage_exemplars_in_total counter instance, ins, job, ip, cls Exemplars in to remote storage, compare to exemplars out for queue managers.
prometheus_remote_storage_histograms_in_total counter instance, ins, job, ip, cls HistogramSamples in to remote storage, compare to histograms out for queue managers.
prometheus_remote_storage_samples_in_total counter instance, ins, job, ip, cls Samples in to remote storage, compare to samples out for queue managers.
prometheus_remote_storage_string_interner_zero_reference_releases_total counter instance, ins, job, ip, cls The number of times release has been called for strings that are not interned.
prometheus_sd_azure_failures_total counter instance, ins, job, ip, cls Number of Azure service discovery refresh failures.
prometheus_sd_consul_rpc_duration_seconds summary ip, call, quantile, ins, job, instance, cls, endpoint The duration of a Consul RPC call in seconds.
prometheus_sd_consul_rpc_duration_seconds_count Unknown ip, call, ins, job, instance, cls, endpoint N/A
prometheus_sd_consul_rpc_duration_seconds_sum Unknown ip, call, ins, job, instance, cls, endpoint N/A
prometheus_sd_consul_rpc_failures_total counter instance, ins, job, ip, cls The number of Consul RPC call failures.
prometheus_sd_consulagent_rpc_duration_seconds summary ip, call, quantile, ins, job, instance, cls, endpoint The duration of a Consul Agent RPC call in seconds.
prometheus_sd_consulagent_rpc_duration_seconds_count Unknown ip, call, ins, job, instance, cls, endpoint N/A
prometheus_sd_consulagent_rpc_duration_seconds_sum Unknown ip, call, ins, job, instance, cls, endpoint N/A
prometheus_sd_consulagent_rpc_failures_total Unknown instance, ins, job, ip, cls N/A
prometheus_sd_dns_lookup_failures_total counter instance, ins, job, ip, cls The number of DNS-SD lookup failures.
prometheus_sd_dns_lookups_total counter instance, ins, job, ip, cls The number of DNS-SD lookups.
prometheus_sd_file_read_errors_total counter instance, ins, job, ip, cls The number of File-SD read errors.
prometheus_sd_file_scan_duration_seconds summary quantile, instance, ins, job, ip, cls The duration of the File-SD scan in seconds.
prometheus_sd_file_scan_duration_seconds_count Unknown instance, ins, job, ip, cls N/A
prometheus_sd_file_scan_duration_seconds_sum Unknown instance, ins, job, ip, cls N/A
prometheus_sd_file_watcher_errors_total counter instance, ins, job, ip, cls The number of File-SD errors caused by filesystem watch failures.
prometheus_sd_kubernetes_events_total counter ip, event, ins, job, role, instance, cls The number of Kubernetes events handled.
prometheus_target_scrape_pool_exceeded_label_limits_total counter instance, ins, job, ip, cls Total number of times scrape pools hit the label limits, during sync or config reload.
prometheus_target_scrape_pool_exceeded_target_limit_total counter instance, ins, job, ip, cls Total number of times scrape pools hit the target limit, during sync or config reload.
prometheus_target_scrape_pool_reloads_failed_total counter instance, ins, job, ip, cls Total number of failed scrape pool reloads.
prometheus_target_scrape_pool_reloads_total counter instance, ins, job, ip, cls Total number of scrape pool reloads.
prometheus_target_scrape_pools_failed_total counter instance, ins, job, ip, cls Total number of scrape pool creations that failed.
prometheus_target_scrape_pools_total counter instance, ins, job, ip, cls Total number of scrape pool creation attempts.
prometheus_target_scrapes_cache_flush_forced_total counter instance, ins, job, ip, cls How many times a scrape cache was flushed due to getting big while scrapes are failing.
prometheus_target_scrapes_exceeded_body_size_limit_total counter instance, ins, job, ip, cls Total number of scrapes that hit the body size limit
prometheus_target_scrapes_exceeded_sample_limit_total counter instance, ins, job, ip, cls Total number of scrapes that hit the sample limit and were rejected.
prometheus_target_scrapes_exemplar_out_of_order_total counter instance, ins, job, ip, cls Total number of exemplar rejected due to not being out of the expected order.
prometheus_target_scrapes_sample_duplicate_timestamp_total counter instance, ins, job, ip, cls Total number of samples rejected due to duplicate timestamps but different values.
prometheus_target_scrapes_sample_out_of_bounds_total counter instance, ins, job, ip, cls Total number of samples rejected due to timestamp falling outside of the time bounds.
prometheus_target_scrapes_sample_out_of_order_total counter instance, ins, job, ip, cls Total number of samples rejected due to not being out of the expected order.
prometheus_template_text_expansion_failures_total counter instance, ins, job, ip, cls The total number of template text expansion failures.
prometheus_template_text_expansions_total counter instance, ins, job, ip, cls The total number of template text expansions.
prometheus_treecache_watcher_goroutines gauge instance, ins, job, ip, cls The current number of watcher goroutines.
prometheus_treecache_zookeeper_failures_total counter instance, ins, job, ip, cls The total number of ZooKeeper failures.
promhttp_metric_handler_errors_total counter ip, cause, ins, job, instance, cls Total number of internal errors encountered by the promhttp metric handler.
promhttp_metric_handler_requests_in_flight gauge instance, ins, job, ip, cls Current number of scrapes being served.
promhttp_metric_handler_requests_total counter ip, ins, code, job, instance, cls Total number of scrapes by HTTP status code.
promtail_batch_retries_total Unknown host, ip, ins, job, instance, cls N/A
promtail_build_info gauge ip, version, revision, goversion, branch, ins, goarch, job, tags, instance, cls, goos A metric with a constant ‘1’ value labeled by version, revision, branch, goversion from which promtail was built, and the goos and goarch for the build.
promtail_config_reload_fail_total Unknown instance, ins, job, ip, cls N/A
promtail_config_reload_success_total Unknown instance, ins, job, ip, cls N/A
promtail_dropped_bytes_total Unknown host, ip, ins, job, reason, instance, cls N/A
promtail_dropped_entries_total Unknown host, ip, ins, job, reason, instance, cls N/A
promtail_encoded_bytes_total Unknown host, ip, ins, job, instance, cls N/A
promtail_file_bytes_total gauge path, instance, ins, job, ip, cls Number of bytes total.
promtail_files_active_total gauge instance, ins, job, ip, cls Number of active files.
promtail_mutated_bytes_total Unknown host, ip, ins, job, reason, instance, cls N/A
promtail_mutated_entries_total Unknown host, ip, ins, job, reason, instance, cls N/A
promtail_read_bytes_total gauge path, instance, ins, job, ip, cls Number of bytes read.
promtail_read_lines_total Unknown path, instance, ins, job, ip, cls N/A
promtail_request_duration_seconds_bucket Unknown host, ip, ins, job, status_code, le, instance, cls N/A
promtail_request_duration_seconds_count Unknown host, ip, ins, job, status_code, instance, cls N/A
promtail_request_duration_seconds_sum Unknown host, ip, ins, job, status_code, instance, cls N/A
promtail_sent_bytes_total Unknown host, ip, ins, job, instance, cls N/A
promtail_sent_entries_total Unknown host, ip, ins, job, instance, cls N/A
promtail_targets_active_total gauge instance, ins, job, ip, cls Number of active total.
promtail_up Unknown instance, ins, job, ip, cls N/A
request_duration_seconds_bucket Unknown instance, ins, job, status_code, route, ws, le, ip, cls, method N/A
request_duration_seconds_count Unknown instance, ins, job, status_code, route, ws, ip, cls, method N/A
request_duration_seconds_sum Unknown instance, ins, job, status_code, route, ws, ip, cls, method N/A
request_message_bytes_bucket Unknown instance, ins, job, route, le, ip, cls, method N/A
request_message_bytes_count Unknown instance, ins, job, route, ip, cls, method N/A
request_message_bytes_sum Unknown instance, ins, job, route, ip, cls, method N/A
response_message_bytes_bucket Unknown instance, ins, job, route, le, ip, cls, method N/A
response_message_bytes_count Unknown instance, ins, job, route, ip, cls, method N/A
response_message_bytes_sum Unknown instance, ins, job, route, ip, cls, method N/A
scrape_duration_seconds Unknown instance, ins, job, ip, cls N/A
scrape_samples_post_metric_relabeling Unknown instance, ins, job, ip, cls N/A
scrape_samples_scraped Unknown instance, ins, job, ip, cls N/A
scrape_series_added Unknown instance, ins, job, ip, cls N/A
tcp_connections gauge instance, ins, job, protocol, ip, cls Current number of accepted TCP connections.
tcp_connections_limit gauge instance, ins, job, protocol, ip, cls The max number of TCP connections that can be accepted (0 means no limit).
up Unknown instance, ins, job, ip, cls N/A

7.2 - FAQ

Pigsty NODE module frequently asked questions

How to configure NTP service?

If NTP is not configured, use a public NTP service or sync time with the admin node.

If your nodes already have NTP configured, you can keep the existing configuration by setting node_ntp_enabled to false.
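
For example, a minimal sketch for your inventory vars (node_ntp_enabled is the parameter mentioned above):

node_ntp_enabled: false           # keep the existing NTP/chrony setup untouched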

Otherwise, if you have Internet access, you can use public NTP services such as pool.ntp.org.

If you don’t have Internet access, you can at least sync time with the admin node using the following:

node_ntp_servers:                 # NTP servers in /etc/chrony.conf
  - pool cn.pool.ntp.org iburst
  - pool ${admin_ip} iburst       # assume non-admin nodes do not have internet access

How to force sync time on nodes?

Use chronyc to sync time. You have to configure the NTP service first.

ansible all -b -a 'chronyc -a makestep'     # sync time

You can replace all with any group or host IP address to limit execution scope.
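
For example, to limit the scope to one group or a single host (a sketch; the group name and IP are placeholders):

ansible pg-test -b -a 'chronyc -a makestep'       # sync time on one group
ansible 10.10.10.11 -b -a 'chronyc -a makestep'   # sync time on one host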


Remote nodes are not accessible via SSH commands.

Consider using Ansible connection parameters if the target machine is hidden behind an SSH jump host, or has been customized so that it cannot be accessed directly with ssh <ip>. An alternative SSH port can be specified with ansible_port, and an SSH alias with ansible_host.

pg-test:
  vars: { pg_cluster: pg-test }
  hosts:
    10.10.10.11: {pg_seq: 1, pg_role: primary, ansible_host: node-1 }
    10.10.10.12: {pg_seq: 2, pg_role: replica, ansible_port: 22223, ansible_user: admin }
    10.10.10.13: {pg_seq: 3, pg_role: offline, ansible_port: 22224 }

Password required for remote node SSH and SUDO

When performing deployments and changes, the admin user must have ssh and sudo privileges on all nodes. Passwordless access is not required.

You can pass in ssh and sudo passwords via the -k|-K parameters when executing the playbook, or even run the playbook as another user via -e ansible_user=<another_user>. However, Pigsty strongly recommends configuring passwordless SSH login and passwordless sudo for the admin user.
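
For example, a sketch of running the node playbook with password prompts (-k/-K, -l and -e are standard Ansible flags; the host limit and user are placeholders):

./node.yml -l <your_nodes> -k -K                                   # prompt for SSH & sudo passwords
./node.yml -l <your_nodes> -e ansible_user=<another_user> -k -K    # run the playbook as another user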


Create an admin user with the existing admin user.

This will create the admin user specified by node_admin_username, using the existing admin user on that node.

./node.yml -k -K -e ansible_user=<another_admin> -t node_admin

Exposing node services with HAProxy

You can expose services with haproxy_services in node.yml.

Here’s an example of exposing the MinIO service with it: Expose MinIO Service
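
Below is a sketch of such a definition, assuming a MinIO server listening on 10.10.10.10:9000; the service name, expose port, and health-check options should be adapted to your environment:

haproxy_services:                   # expose services on this node via haproxy
  - name: minio                     # service name, unique
    port: 9002                      # expose port, unique
    options:                        # haproxy service options
      - option httpchk
      - option http-keep-alive
      - http-check send meth OPTIONS uri /minio/health/live
      - http-check expect status 200
    servers:                        # backend servers
      - { name: minio-1, ip: 10.10.10.10, port: 9000, options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }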


Why are my nodes’ /etc/yum.repos.d/* files nuked?

Pigsty tries to include all dependencies in the local yum repo on infra nodes. Repo files are added to nodes according to node_repo_modules, and existing repo files are removed by default, per the default value of node_repo_remove. This prevents nodes from pulling packages from Internet repos and avoids a class of repo-related issues.

If you want to keep existing repo files during node init, just set node_repo_remove to false.

If you want to keep existing repo files during infra node local repo bootstrap, just set repo_remove to false.
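
For example, to keep existing repo files in both cases (a minimal sketch for your global vars):

node_repo_remove: false   # keep existing repo files during node init
repo_remove: false        # keep existing repo files during infra local repo bootstrap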


Why did my shell prompt change, and how do I restore it?

The Pigsty prompt is defined via the PS1 environment variable in /etc/profile.d/node.sh.

To restore your original prompt, just remove that file and log in again.
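
For example (the file path is the one mentioned above; log out and back in afterwards):

sudo rm -f /etc/profile.d/node.sh   # remove the Pigsty prompt definition
exit                                # re-login to pick up your original PS1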


Tencent OpenCloudOS Compatibility Issue

OpenCloudOS does not have the softdog kernel module; override node_kernel_modules in the global vars:

node_kernel_modules: [ br_netfilter, ip_vs, ip_vs_rr, ip_vs_wrr, ip_vs_sh ]

8 - Module: ETCD

Pigsty has built-in etcd support: a reliable distributed consensus store (DCS) that empowers PostgreSQL HA.

ETCD is a distributed, reliable key-value store for the most critical data of a distributed system

Configuration | Administration | Playbook | Dashboard | Parameter

Pigsty uses etcd as DCS: distributed configuration store (or distributed consensus service), which is critical to PostgreSQL high availability & auto-failover.

You have to install the ETCD module before any PGSQL modules, since patroni & vip-manager rely on etcd to work, unless you are using an external etcd cluster.

You don’t need the NODE module to install ETCD, but it requires a valid CA in files/pki/ca on your local machine. Check the ETCD Administration SOP for more details.


Configuration

You have to define an etcd cluster before deploying it. There are some parameters for etcd.

It is recommended to have at least 3 instances for a serious production environment.

Single Node

Define a group etcd in the inventory; it will create a singleton etcd instance.

# etcd cluster for ha postgres
etcd: { hosts: { 10.10.10.10: { etcd_seq: 1 } }, vars: { etcd_cluster: etcd } }

This is good enough for development, testing & demonstration, but not recommended for a serious production environment.

Three Nodes

You can define an etcd cluster with multiple nodes.

etcd: # dcs service for postgres/patroni ha consensus
  hosts:  # 1 node for testing, 3 or 5 for production
    10.10.10.10: { etcd_seq: 1 }  # etcd_seq required
    10.10.10.11: { etcd_seq: 2 }  # assign from 1 ~ n
    10.10.10.12: { etcd_seq: 3 }  # odd number please
  vars: # cluster level parameter override roles/etcd
    etcd_cluster: etcd  # mark etcd cluster name etcd
    etcd_safeguard: false # safeguard against purging
    etcd_clean: true # purge etcd during init process

You can use more nodes in a production environment, but 3 or 5 nodes are recommended. Remember to use an odd number for the cluster size.
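
For example, a five-node cluster would follow the same pattern (a sketch; the IP addresses are placeholders):

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }
    10.10.10.13: { etcd_seq: 4 }
    10.10.10.14: { etcd_seq: 5 }
  vars: { etcd_cluster: etcd }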


Administration

Here are some useful administration tasks for etcd:


Create Cluster

If etcd_safeguard is true, or etcd_clean is false, the playbook will abort when any running etcd instance exists, to prevent purging etcd by accident.

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }
  vars: { etcd_cluster: etcd }
./etcd.yml   # init etcd module on group 'etcd'

Destroy Cluster

To destroy an etcd cluster, just use the etcd_clean subtask of etcd.yml. Do think before you type.

./etcd.yml -t etcd_clean  # remove entire cluster, honor the etcd_safeguard
./etcd.yml -t etcd_purge  # purge with brute force, ignoring the etcd_safeguard

CLI Environment

Here’s an example of client environment config.

Pigsty uses the etcd v3 API by default.

alias e="etcdctl"
alias em="etcdctl member"
export ETCDCTL_API=3
export ETCDCTL_ENDPOINTS=https://10.10.10.10:2379
export ETCDCTL_CACERT=/etc/pki/ca.crt
export ETCDCTL_CERT=/etc/etcd/server.crt
export ETCDCTL_KEY=/etc/etcd/server.key

CRUD

You can do CRUD with the following commands.

e put a 10 ; e get a; e del a ; # V3 API
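
Keys can also be managed by prefix (a sketch using the aliases defined above; the /pigsty prefix is just an example):

e put /pigsty/foo bar        # write a key under a prefix
e get --prefix /pigsty       # list all keys under the prefix
e del --prefix /pigsty       # delete all keys under the prefix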

Reload Config

If the etcd cluster membership changes, we need to refresh the etcd endpoint references in:

  • config file of existing etcd members
  • etcdctl client environment variables
  • patroni dcs endpoint config
  • vip-manager dcs endpoint config

To refresh etcd config file /etc/etcd/etcd.conf on existing members:

./etcd.yml -t etcd_conf                           # refresh /etc/etcd/etcd.conf with latest status
ansible etcd -f 1 -b -a 'systemctl restart etcd'  # optional: restart etcd

To refresh the etcdctl client environment variables:

$ ./etcd.yml -t etcd_env                          # refresh /etc/profile.d/etcdctl.sh

To update the etcd endpoint reference for patroni:

./pgsql.yml -t pg_conf                            # regenerate patroni config
ansible all -f 1 -b -a 'systemctl reload patroni' # reload patroni config

To update the etcd endpoint reference for vip-manager (optional, only if you are using an L2 VIP):

./pgsql.yml -t pg_vip_config                           # regenerate vip-manager config
ansible all -f 1 -b -a 'systemctl restart vip-manager' # restart vip-manager to use new config

Add Member

ETCD Reference: Add a member

You can add new members to existing etcd cluster in 5 steps:

  1. issue the etcdctl member add command to tell the existing cluster that a new member is coming (use learner mode)
  2. update the inventory group etcd with the new instance
  3. init the new member with etcd_init=existing, to join the existing cluster rather than create a new one (VERY IMPORTANT)
  4. promote the new member from learner to follower
  5. update the etcd endpoint references with Reload Config

Short Version

etcdctl member add <etcd-?> --learner=true --peer-urls=https://<new_ins_ip>:2380
./etcd.yml -l <new_ins_ip> -e etcd_init=existing
etcdctl member promote <new_ins_server_id>
Detail: Add member to etcd cluster

Here are the details; let’s start from a single etcd instance.

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 } # <--- this is the existing instance
    10.10.10.11: { etcd_seq: 2 } # <--- add this new member definition to inventory
  vars: { etcd_cluster: etcd }

Add a learner instance etcd-2 to the cluster with etcdctl member add:

# tell the existing cluster that a new member etcd-2 is coming
$ etcdctl member add etcd-2 --learner=true --peer-urls=https://10.10.10.11:2380
Member 33631ba6ced84cf8 added to cluster 6646fbcf5debc68f

ETCD_NAME="etcd-2"
ETCD_INITIAL_CLUSTER="etcd-2=https://10.10.10.11:2380,etcd-1=https://10.10.10.10:2380"
ETCD_INITIAL_ADVERTISE_PEER_URLS="https://10.10.10.11:2380"
ETCD_INITIAL_CLUSTER_STATE="existing"

Check the member list with etcdctl member list (or em list); we can see an unstarted member:

33631ba6ced84cf8, unstarted, , https://10.10.10.11:2380, , true
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false

Init the new etcd instance etcd-2 with the etcd.yml playbook; we can see the new member is started:

$ ./etcd.yml -l 10.10.10.11 -e etcd_init=existing    # etcd_init=existing must be set
...
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, true
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false

Promote the new member from learner to follower:

$ etcdctl member promote 33631ba6ced84cf8   # promote the new learner
Member 33631ba6ced84cf8 promoted in cluster 6646fbcf5debc68f

$ em list                # check again, the new member is started
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false

The new member is added; don’t forget to reload config.

Repeat the steps above to add more members. Remember to use at least 3 members for production.
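
After adding a third member with the same procedure, the inventory would match the minimal production layout shown earlier (a sketch consistent with the three-node example above):

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }   # <--- added with the same procedure
  vars: { etcd_cluster: etcd }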


Remove Member

To remove a member from an existing etcd cluster, it usually takes 3 steps:

  1. comment it out of (or remove it from) the inventory and reload config
  2. kick it out of the cluster with the etcdctl member remove <server_id> command
  3. temporarily add it back to the inventory and purge that instance, then remove it from the inventory permanently
Detail: Remove member from etcd cluster

Here are the details; let’s start from a 3-instance etcd cluster:

etcd:
  hosts:
    10.10.10.10: { etcd_seq: 1 }
    10.10.10.11: { etcd_seq: 2 }
    10.10.10.12: { etcd_seq: 3 }   # <---- comment this line, then reload-config
  vars: { etcd_cluster: etcd }

Then, you’ll have to actually kick it out of the cluster with the etcdctl member remove command:

$ etcdctl member list
429ee12c7fbab5c1, started, etcd-1, https://10.10.10.10:2380, https://10.10.10.10:2379, false
33631ba6ced84cf8, started, etcd-2, https://10.10.10.11:2380, https://10.10.10.11:2379, false
93fcf23b220473fb, started, etcd-3, https://10.10.10.12:2380, https://10.10.10.12:2379, false  # <--- remove this

$ etcdctl member remove 93fcf23b220473fb  # kick it from cluster
Member 93fcf23b220473fb removed from cluster 6646fbcf5debc68f

Finally, you have to shut down the instance and purge it from the node: temporarily uncomment the member in the inventory, then purge it with the etcd.yml playbook:

./etcd.yml -t etcd_purge -l 10.10.10.12   # purge it (the member is in inventory again)

After that, remove the member from the inventory permanently. All clear!


Playbook

There’s a built-in playbook, etcd.yml, for installing the etcd cluster, but you have to define the cluster first.

./etcd.yml    # install etcd cluster on group 'etcd'

Here are the available subtasks:

  • etcd_assert : generate etcd identity
  • etcd_install : install etcd rpm packages
  • etcd_clean : cleanup existing etcd
    • etcd_check : check etcd instance is running
    • etcd_purge : remove running etcd instance & data
  • etcd_dir : create etcd data & conf dir
  • etcd_config : generate etcd config
    • etcd_conf : generate etcd main config
    • etcd_cert : generate etcd ssl cert
  • etcd_launch : launch etcd service
  • etcd_register : register etcd to prometheus

If etcd_safeguard is true, or etcd_clean is false, the playbook will abort when any running etcd instance exists, to prevent purging etcd by accident.



Dashboard

There is one dashboard for the ETCD module:

ETCD Overview: Overview of the ETCD cluster

ETCD Overview Dashboard

etcd-overview.jpg


Parameter

There are 10 parameters for the ETCD module; a usage example follows the table below.

Parameter Type Level Comment
etcd_seq int I etcd instance identifier, REQUIRED
etcd_cluster string C etcd cluster & group name, etcd by default
etcd_safeguard bool G/C/A prevent purging running etcd instance?
etcd_clean bool G/C/A purging existing etcd during initialization?
etcd_data path C etcd data directory, /data/etcd by default
etcd_port port C etcd client port, 2379 by default
etcd_peer_port port C etcd peer port, 2380 by default
etcd_init enum C etcd initial cluster state, new or existing
etcd_election_timeout int C etcd election timeout, 1000ms by default
etcd_heartbeat_interval int C etcd heartbeat interval, 100ms by default
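
For example, these parameters can be overridden at cluster level in the inventory (a sketch using the default values listed above):

etcd:
  hosts: { 10.10.10.10: { etcd_seq: 1 } }
  vars:
    etcd_cluster: etcd              # etcd cluster & group name
    etcd_data: /data/etcd           # etcd data directory
    etcd_port: 2379                 # etcd client port
    etcd_peer_port: 2380            # etcd peer port
    etcd_election_timeout: 1000     # election timeout in ms
    etcd_heartbeat_interval: 100    # heartbeat interval in ms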

8.1 - Metrics

Pigsty ETCD module metric list

The ETCD module has 177 available metrics.

Metric Name Type Labels Description
etcd:ins:backend_commit_rt_p99_5m Unknown cls, ins, instance, job, ip N/A
etcd:ins:disk_fsync_rt_p99_5m Unknown cls, ins, instance, job, ip N/A
etcd:ins:network_peer_rt_p99_1m Unknown cls, To, ins, instance, job, ip N/A
etcd_cluster_version gauge cls, cluster_version, ins, instance, job, ip Which version is running. 1 for ‘cluster_version’ label with current cluster version
etcd_debugging_auth_revision gauge cls, ins, instance, job, ip The current revision of auth store.
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_disk_backend_commit_rebalance_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_disk_backend_commit_spill_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_disk_backend_commit_spill_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_disk_backend_commit_spill_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_disk_backend_commit_write_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_disk_backend_commit_write_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_disk_backend_commit_write_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_lease_granted_total counter cls, ins, instance, job, ip The total number of granted leases.
etcd_debugging_lease_renewed_total counter cls, ins, instance, job, ip The number of renewed leases seen by the leader.
etcd_debugging_lease_revoked_total counter cls, ins, instance, job, ip The total number of revoked leases.
etcd_debugging_lease_ttl_total_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_lease_ttl_total_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_lease_ttl_total_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_mvcc_compact_revision gauge cls, ins, instance, job, ip The revision of the last compaction in store.
etcd_debugging_mvcc_current_revision gauge cls, ins, instance, job, ip The current revision of store.
etcd_debugging_mvcc_db_compaction_keys_total counter cls, ins, instance, job, ip Total number of db keys compacted.
etcd_debugging_mvcc_db_compaction_last gauge cls, ins, instance, job, ip The unix time of the last db compaction. Resets to 0 on start.
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_mvcc_db_compaction_total_duration_milliseconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_mvcc_events_total counter cls, ins, instance, job, ip Total number of events sent by this member.
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_mvcc_index_compaction_pause_duration_milliseconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_mvcc_keys_total gauge cls, ins, instance, job, ip Total number of keys.
etcd_debugging_mvcc_pending_events_total gauge cls, ins, instance, job, ip Total number of pending events to be sent.
etcd_debugging_mvcc_range_total counter cls, ins, instance, job, ip Total number of ranges seen by this member.
etcd_debugging_mvcc_slow_watcher_total gauge cls, ins, instance, job, ip Total number of unsynced slow watchers.
etcd_debugging_mvcc_total_put_size_in_bytes gauge cls, ins, instance, job, ip The total size of put kv pairs seen by this member.
etcd_debugging_mvcc_watch_stream_total gauge cls, ins, instance, job, ip Total number of watch streams.
etcd_debugging_mvcc_watcher_total gauge cls, ins, instance, job, ip Total number of watchers.
etcd_debugging_server_lease_expired_total counter cls, ins, instance, job, ip The total number of expired leases.
etcd_debugging_snap_save_marshalling_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_snap_save_marshalling_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_snap_save_marshalling_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_snap_save_total_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_debugging_snap_save_total_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_debugging_snap_save_total_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_debugging_store_expires_total counter cls, ins, instance, job, ip Total number of expired keys.
etcd_debugging_store_reads_total counter cls, action, ins, instance, job, ip Total number of reads action by (get/getRecursive), local to this member.
etcd_debugging_store_watch_requests_total counter cls, ins, instance, job, ip Total number of incoming watch requests (new or reestablished).
etcd_debugging_store_watchers gauge cls, ins, instance, job, ip Count of currently active watchers.
etcd_debugging_store_writes_total counter cls, action, ins, instance, job, ip Total number of writes (e.g. set/compareAndDelete) seen by this member.
etcd_disk_backend_commit_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_disk_backend_commit_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_disk_backend_commit_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_disk_backend_defrag_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_disk_backend_defrag_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_disk_backend_defrag_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_disk_backend_snapshot_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_disk_backend_snapshot_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_disk_backend_snapshot_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_disk_defrag_inflight gauge cls, ins, instance, job, ip Whether or not defrag is active on the member. 1 means active, 0 means not.
etcd_disk_wal_fsync_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_disk_wal_fsync_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_disk_wal_fsync_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_disk_wal_write_bytes_total gauge cls, ins, instance, job, ip Total number of bytes written in WAL.
etcd_grpc_proxy_cache_hits_total gauge cls, ins, instance, job, ip Total number of cache hits
etcd_grpc_proxy_cache_keys_total gauge cls, ins, instance, job, ip Total number of keys/ranges cached
etcd_grpc_proxy_cache_misses_total gauge cls, ins, instance, job, ip Total number of cache misses
etcd_grpc_proxy_events_coalescing_total counter cls, ins, instance, job, ip Total number of events coalescing
etcd_grpc_proxy_watchers_coalescing_total gauge cls, ins, instance, job, ip Total number of current watchers coalescing
etcd_mvcc_db_open_read_transactions gauge cls, ins, instance, job, ip The number of currently open read transactions
etcd_mvcc_db_total_size_in_bytes gauge cls, ins, instance, job, ip Total size of the underlying database physically allocated in bytes.
etcd_mvcc_db_total_size_in_use_in_bytes gauge cls, ins, instance, job, ip Total size of the underlying database logically in use in bytes.
etcd_mvcc_delete_total counter cls, ins, instance, job, ip Total number of deletes seen by this member.
etcd_mvcc_hash_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_mvcc_hash_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_mvcc_hash_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_mvcc_hash_rev_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_mvcc_hash_rev_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_mvcc_hash_rev_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_mvcc_put_total counter cls, ins, instance, job, ip Total number of puts seen by this member.
etcd_mvcc_range_total counter cls, ins, instance, job, ip Total number of ranges seen by this member.
etcd_mvcc_txn_total counter cls, ins, instance, job, ip Total number of txns seen by this member.
etcd_network_active_peers gauge cls, ins, Local, instance, job, ip, Remote The current number of active peer connections.
etcd_network_client_grpc_received_bytes_total counter cls, ins, instance, job, ip The total number of bytes received from grpc clients.
etcd_network_client_grpc_sent_bytes_total counter cls, ins, instance, job, ip The total number of bytes sent to grpc clients.
etcd_network_peer_received_bytes_total counter cls, ins, instance, job, ip, From The total number of bytes received from peers.
etcd_network_peer_round_trip_time_seconds_bucket Unknown cls, To, ins, instance, job, le, ip N/A
etcd_network_peer_round_trip_time_seconds_count Unknown cls, To, ins, instance, job, ip N/A
etcd_network_peer_round_trip_time_seconds_sum Unknown cls, To, ins, instance, job, ip N/A
etcd_network_peer_sent_bytes_total counter cls, To, ins, instance, job, ip The total number of bytes sent to peers.
etcd_server_apply_duration_seconds_bucket Unknown cls, version, ins, instance, job, le, success, ip, op N/A
etcd_server_apply_duration_seconds_count Unknown cls, version, ins, instance, job, success, ip, op N/A
etcd_server_apply_duration_seconds_sum Unknown cls, version, ins, instance, job, success, ip, op N/A
etcd_server_client_requests_total counter client_api_version, cls, ins, instance, type, job, ip The total number of client requests per client version.
etcd_server_go_version gauge cls, ins, instance, job, server_go_version, ip Which Go version server is running with. 1 for ‘server_go_version’ label with current version.
etcd_server_has_leader gauge cls, ins, instance, job, ip Whether or not a leader exists. 1 is existence, 0 is not.
etcd_server_health_failures counter cls, ins, instance, job, ip The total number of failed health checks
etcd_server_health_success counter cls, ins, instance, job, ip The total number of successful health checks
etcd_server_heartbeat_send_failures_total counter cls, ins, instance, job, ip The total number of leader heartbeat send failures (likely overloaded from slow disk).
etcd_server_id gauge cls, ins, instance, job, server_id, ip Server or member ID in hexadecimal format. 1 for ‘server_id’ label with current ID.
etcd_server_is_leader gauge cls, ins, instance, job, ip Whether or not this member is a leader. 1 if is, 0 otherwise.
etcd_server_is_learner gauge cls, ins, instance, job, ip Whether or not this member is a learner. 1 if is, 0 otherwise.
etcd_server_leader_changes_seen_total counter cls, ins, instance, job, ip The number of leader changes seen.
etcd_server_learner_promote_successes counter cls, ins, instance, job, ip The total number of successful learner promotions while this member is leader.
etcd_server_proposals_applied_total gauge cls, ins, instance, job, ip The total number of consensus proposals applied.
etcd_server_proposals_committed_total gauge cls, ins, instance, job, ip The total number of consensus proposals committed.
etcd_server_proposals_failed_total counter cls, ins, instance, job, ip The total number of failed proposals seen.
etcd_server_proposals_pending gauge cls, ins, instance, job, ip The current number of pending proposals to commit.
etcd_server_quota_backend_bytes gauge cls, ins, instance, job, ip Current backend storage quota size in bytes.
etcd_server_read_indexes_failed_total counter cls, ins, instance, job, ip The total number of failed read indexes seen.
etcd_server_slow_apply_total counter cls, ins, instance, job, ip The total number of slow apply requests (likely overloaded from slow disk).
etcd_server_slow_read_indexes_total counter cls, ins, instance, job, ip The total number of pending read indexes not in sync with leader’s or timed out read index requests.
etcd_server_snapshot_apply_in_progress_total gauge cls, ins, instance, job, ip 1 if the server is applying the incoming snapshot. 0 if none.
etcd_server_version gauge cls, server_version, ins, instance, job, ip Which version is running. 1 for ‘server_version’ label with current version.
etcd_snap_db_fsync_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_snap_db_fsync_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_snap_db_fsync_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_snap_db_save_total_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_snap_db_save_total_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_snap_db_save_total_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_snap_fsync_duration_seconds_bucket Unknown cls, ins, instance, job, le, ip N/A
etcd_snap_fsync_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
etcd_snap_fsync_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
etcd_up Unknown cls, ins, instance, job, ip N/A
go_gc_duration_seconds summary cls, ins, instance, quantile, job, ip A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown cls, ins, instance, job, ip N/A
go_gc_duration_seconds_sum Unknown cls, ins, instance, job, ip N/A
go_goroutines gauge cls, ins, instance, job, ip Number of goroutines that currently exist.
go_info gauge cls, version, ins, instance, job, ip Information about the Go environment.
go_memstats_alloc_bytes gauge cls, ins, instance, job, ip Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total counter cls, ins, instance, job, ip Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge cls, ins, instance, job, ip Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter cls, ins, instance, job, ip Total number of frees.
go_memstats_gc_cpu_fraction gauge cls, ins, instance, job, ip The fraction of this program’s available CPU time used by the GC since the program started.
go_memstats_gc_sys_bytes gauge cls, ins, instance, job, ip Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge cls, ins, instance, job, ip Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge cls, ins, instance, job, ip Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge cls, ins, instance, job, ip Number of heap bytes that are in use.
go_memstats_heap_objects gauge cls, ins, instance, job, ip Number of allocated objects.
go_memstats_heap_released_bytes gauge cls, ins, instance, job, ip Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge cls, ins, instance, job, ip Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge cls, ins, instance, job, ip Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter cls, ins, instance, job, ip Total number of pointer lookups.
go_memstats_mallocs_total counter cls, ins, instance, job, ip Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge cls, ins, instance, job, ip Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge cls, ins, instance, job, ip Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge cls, ins, instance, job, ip Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge cls, ins, instance, job, ip Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge cls, ins, instance, job, ip Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge cls, ins, instance, job, ip Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge cls, ins, instance, job, ip Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge cls, ins, instance, job, ip Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge cls, ins, instance, job, ip Number of bytes obtained from system.
go_threads gauge cls, ins, instance, job, ip Number of OS threads created.
grpc_server_handled_total counter cls, ins, instance, grpc_code, job, grpc_method, grpc_type, ip, grpc_service Total number of RPCs completed on the server, regardless of success or failure.
grpc_server_msg_received_total counter cls, ins, instance, job, grpc_type, grpc_method, ip, grpc_service Total number of RPC stream messages received on the server.
grpc_server_msg_sent_total counter cls, ins, instance, job, grpc_type, grpc_method, ip, grpc_service Total number of gRPC stream messages sent by the server.
grpc_server_started_total counter cls, ins, instance, job, grpc_type, grpc_method, ip, grpc_service Total number of RPCs started on the server.
os_fd_limit gauge cls, ins, instance, job, ip The file descriptor limit.
os_fd_used gauge cls, ins, instance, job, ip The number of used file descriptors.
process_cpu_seconds_total counter cls, ins, instance, job, ip Total user and system CPU time spent in seconds.
process_max_fds gauge cls, ins, instance, job, ip Maximum number of open file descriptors.
process_open_fds gauge cls, ins, instance, job, ip Number of open file descriptors.
process_resident_memory_bytes gauge cls, ins, instance, job, ip Resident memory size in bytes.
process_start_time_seconds gauge cls, ins, instance, job, ip Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge cls, ins, instance, job, ip Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge cls, ins, instance, job, ip Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight gauge cls, ins, instance, job, ip Current number of scrapes being served.
promhttp_metric_handler_requests_total counter cls, ins, instance, job, ip, code Total number of scrapes by HTTP status code.
scrape_duration_seconds Unknown cls, ins, instance, job, ip N/A
scrape_samples_post_metric_relabeling Unknown cls, ins, instance, job, ip N/A
scrape_samples_scraped Unknown cls, ins, instance, job, ip N/A
scrape_series_added Unknown cls, ins, instance, job, ip N/A
up Unknown cls, ins, instance, job, ip N/A

8.2 - FAQ

Pigsty ETCD dcs module frequently asked questions

What is the impact of ETCD failure?

ETCD availability is critical for the PGSQL cluster’s HA, which is guaranteed by using multiple nodes. With a 3-node ETCD cluster, one node can fail while the other two keep working; with a 5-node ETCD cluster, two node failures can be tolerated. If more than half of the ETCD nodes are down, the ETCD cluster and its service become unavailable. Before Patroni 3.0, this could lead to a global PGSQL outage: all primaries would be demoted and reject write requests.

Since Pigsty 2.0, the Patroni 3.0 DCS failsafe mode is enabled by default. It will LOCK the PGSQL cluster status if the ETCD cluster becomes unavailable while all PGSQL members are still known to the primary.

In that case the PGSQL cluster can still serve traffic normally, but you must recover the ETCD cluster ASAP (you can’t change the PGSQL cluster configuration through Patroni while etcd is down).
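
To check whether the etcd cluster has recovered, query the endpoints with etcdctl (assuming the etcdctl v3 environment variables, endpoints and certs, are already configured on the admin node as in the member management examples above):

etcdctl endpoint health              # check the health of the configured endpoints
etcdctl endpoint status -w table     # show leader, raft term, and db size per endpoint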


How to use existing external etcd cluster?

The hard-coded group etcd will be used as the DCS servers for PGSQL. You can initialize them with etcd.yml, or treat the group as an existing external etcd cluster.

To use an existing external etcd cluster, define it in the inventory as usual and make sure the existing etcd cluster’s certificates are signed by the same CA as the self-signed CA used for PGSQL.
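
A minimal sketch of such a definition (the IP addresses are placeholders for your existing etcd servers):

etcd:   # existing external etcd cluster
  hosts:
    10.10.10.21: { etcd_seq: 1 }
    10.10.10.22: { etcd_seq: 2 }
    10.10.10.23: { etcd_seq: 3 }
  vars: { etcd_cluster: etcd }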


How to add a new member to the existing etcd cluster?

Check Add a member to etcd cluster

etcdctl member add <etcd-?> --learner=true --peer-urls=https://<new_ins_ip>:2380 # on admin node
./etcd.yml -l <new_ins_ip> -e etcd_init=existing                                 # init new etcd member
etcdctl member promote <new_ins_server_id>                                       # on admin node

How to remove a member from an existing etcd cluster?

Check Remove member from etcd cluster

etcdctl member remove <etcd_server_id>   # kick member out of the cluster (on admin node)
./etcd.yml -l <ins_ip> -t etcd_purge     # purge etcd instance

9 - Module: MINIO

Pigsty has built-in MinIO support. MinIO is an S3-compatible open-source object storage service, which can be used as an optional PostgreSQL backup repo.

Min.IO: S3-Compatible Open-Source Multi-Cloud Object Storage

Configuration | Administration | Playbook | Dashboard | Parameter

MinIO is an S3-compatible object storage server. It’s designed to be scalable, secure, and easy to use. It has native multi-node multi-drive HA support and can store documents, pictures, videos, and backups.

Pigsty uses MinIO as an optional PostgreSQL backup storage repo, in addition to the default local posix FS repo. If the MinIO repo is used, the MINIO module should be installed before any PGSQL modules.

MinIO requires a trusted CA to work, so you have to install it in addition to the NODE module.
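
For example, a typical install order looks like this (a sketch assuming the MinIO nodes are in the minio group):

./node.yml  -l minio    # add the nodes to Pigsty first: trusted CA, DNS, node_exporter, ...
./minio.yml -l minio    # then install the MINIO module on them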


Configuration

You have to define a MinIO cluster in the inventory before deploying it, using the parameters described below.


Single-Node Single-Drive

Reference: deploy-minio-single-node-single-drive

To define a singleton MinIO instance, it’s straightforward:

# 1 Node 1 Driver (DEFAULT)
minio: { hosts: { 10.10.10.10: { minio_seq: 1 } }, vars: { minio_cluster: minio } }

The only required params are minio_seq and minio_cluster, which generate a unique identity for each MinIO instance.

Single-node single-drive mode is for development purposes, so you can use a plain directory as the data dir, which is /data/minio by default. Beware that in multi-drive or multi-node mode, MinIO will refuse to start if the data dir is a plain directory rather than a mount point.


Single-Node Multi-Drive

Reference: deploy-minio-single-node-multi-drive

To use multiple disks on a single node, you have to specify the minio_data in the format of {{ prefix }}{x...y}, which defines a series of disk mount points.

minio:
  hosts: { 10.10.10.10: { minio_seq: 1 } }
  vars:
    minio_cluster: minio         # minio cluster name, minio by default
    minio_data: '/data{1...4}'   # minio data dir(s), use {x...y} to specify multi drivers

This example defines a single-node MinIO cluster with 4 drives: /data1, /data2, /data3, /data4. You have to mount them properly before launching MinIO:

mkfs.xfs /dev/sdb; mkdir /data1; mount -t xfs /dev/sdb /data1;   # mount 1st driver, ...
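
For instance, to prepare all four mount points in this example (a sketch assuming the remaining disks are /dev/sdc, /dev/sdd, and /dev/sde; adjust device names and add /etc/fstab entries for your environment):

mkfs.xfs /dev/sdc; mkdir /data2; mount -t xfs /dev/sdc /data2;   # mount 2nd drive
mkfs.xfs /dev/sdd; mkdir /data3; mount -t xfs /dev/sdd /data3;   # mount 3rd drive
mkfs.xfs /dev/sde; mkdir /data4; mount -t xfs /dev/sde /data4;   # mount 4th drive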

Multi-Node Multi-Drive

Reference: deploy-minio-multi-node-multi-drive

The extra minio_node param will be used for a multi-node deployment:

minio:
  hosts:
    10.10.10.10: { minio_seq: 1 }
    10.10.10.11: { minio_seq: 2 }
    10.10.10.12: { minio_seq: 3 }
  vars:
    minio_cluster: minio
    minio_data: '/data{1...2}'                         # use two disks per node
    minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node name pattern

The ${minio_cluster} and ${minio_seq} placeholders will be replaced with the values of minio_cluster and minio_seq respectively, and the result is used as the MinIO nodename.
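
With the config above, the node name pattern expands to the following entries, which Pigsty writes as DNS records for you (the minio_dns task); shown here in /etc/hosts form for illustration:

10.10.10.10  minio-1.pigsty
10.10.10.11  minio-2.pigsty
10.10.10.12  minio-3.pigsty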


Expose Service

MinIO serves on port 9000 by default. If a multi-node MinIO cluster is deployed, you can access its service via any node, but it is better to expose the MinIO service through a load balancer, such as the default haproxy on NODE, or the L2 VIP.

To expose MinIO service with haproxy, you have to define an extra service with haproxy_services:

minio:
  hosts:
    10.10.10.10: { minio_seq: 1 , nodename: minio-1 }
    10.10.10.11: { minio_seq: 2 , nodename: minio-2 }
    10.10.10.12: { minio_seq: 3 , nodename: minio-3 }
  vars:
    minio_cluster: minio
    node_cluster: minio
    minio_data: '/data{1...2}'         # use two disks per node
    minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node name pattern
    haproxy_services:                  # EXPOSING MINIO SERVICE WITH HAPROXY
      - name: minio                    # [REQUIRED] service name, unique
        port: 9002                     # [REQUIRED] service port, unique
        options:                       # [OPTIONAL] minio health check
          - option httpchk
          - option http-keep-alive
          - http-check send meth OPTIONS uri /minio/health/live
          - http-check expect status 200
        servers:
          - { name: minio-1 ,ip: 10.10.10.10 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-2 ,ip: 10.10.10.11 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
          - { name: minio-3 ,ip: 10.10.10.12 ,port: 9000 ,options: 'check-ssl ca-file /etc/pki/ca.crt check port 9000' }
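
Once exposed, clients can talk to the load-balanced port instead of a single node. A sketch with mcli, assuming the haproxy service on port 9002 defined above and the default root credential (the sss-ha alias name is made up for this example):

mcli alias set sss-ha https://sss.pigsty:9002 minioadmin minioadmin   # point an alias at the load-balanced port
mcli ls sss-ha/                                                       # verify access through haproxy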

Access Service

To use the exposed service, you have to update/append the MinIO credential in the pgbackrest_repo section:

# This is the newly added HA MinIO Repo definition, USE THIS INSTEAD!
minio_ha:
  type: s3
  s3_endpoint: minio-1.pigsty   # s3_endpoint could be any load balancer: 10.10.10.1{0,1,2}, or a domain name pointing to any of the 3 nodes
  s3_region: us-east-1          # you could use an external domain name: sss.pigsty, which resolves to any member (`minio_domain`)
  s3_bucket: pgsql              # instance & nodename can be used: minio-1.pigsty, minio-2.pigsty, minio-3.pigsty, minio-1, minio-2, minio-3
  s3_key: pgbackrest            # better to use a new password for the MinIO pgbackrest user
  s3_key_secret: S3User.SomeNewPassWord
  s3_uri_style: path
  path: /pgbackrest
  storage_port: 9002            # Use the load balancer port 9002 instead of default 9000 (direct access)
  storage_ca_file: /etc/pki/ca.crt
  bundle: y
  cipher_type: aes-256-cbc      # better to use a new cipher password in your production environment
  cipher_pass: pgBackRest.With.Some.Extra.PassWord.And.Salt.${pg_cluster}
  retention_full_type: time
  retention_full: 14

Expose Admin

MinIO will serve an admin web portal on port 9001 by default.

It’s not wise to expose the admin portal to the public, but if you wish to do so, add MinIO to the infra_portal and refresh the nginx server:

infra_portal:   # domain names and upstream servers
  # ...         # MinIO admin page require HTTPS / Websocket to work
  minio1       : { domain: sss.pigsty  ,endpoint: 10.10.10.10:9001 ,scheme: https ,websocket: true }
  minio2       : { domain: sss2.pigsty ,endpoint: 10.10.10.11:9001 ,scheme: https ,websocket: true }
  minio3       : { domain: sss3.pigsty ,endpoint: 10.10.10.12:9001 ,scheme: https ,websocket: true }
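
After updating infra_portal, refresh the nginx configuration on the infra nodes. A sketch (the nginx tag name is an assumption; check the INFRA playbook’s task list for the exact tag):

./infra.yml -t nginx     # re-render and reload the nginx config (tag name is an assumption)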

Check the MinIO demo config and special Vagrantfile for more details.


Administration

Here are some common MinIO mcli commands for reference; check MinIO Client for more details.


Create Cluster

To create a defined minio cluster, run the minio.yml playbook on minio group:

./minio.yml -l minio   # install minio cluster on group 'minio'

Client Setup

To access MinIO servers, you have to configure client mcli alias first:

mcli alias ls  # list minio alias (there's a sss by default)
mcli alias set sss https://sss.pigsty:9000 minioadmin minioadmin              # root user
mcli alias set pgbackrest https://sss.pigsty:9000 pgbackrest S3User.Backup    # backup user

You can manage business users with mcli as well:

mcli admin user list sss     # list all users on sss
set +o history # hide password in history and create minio user
mcli admin user add sss dba S3User.DBA
mcli admin user add sss pgbackrest S3User.Backup
set -o history 

CRUD

You can CRUD MinIO buckets with mcli:

mcli ls sss/          # list buckets of alias 'sss'
mcli mb --ignore-existing sss/hello  # create a bucket named 'hello'
mcli rb --force sss/hello            # remove bucket 'hello' with force

Or perform object CRUD:

mcli cp -r /www/pigsty/*.rpm sss/infra/repo/         # upload files to bucket 'infra' with prefix 'repo'
mcli cp sss/infra/repo/pg_exporter-0.5.0.x86_64.rpm /tmp/  # download file from minio to local
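
You can also inspect or delete a single object; a sketch reusing the object uploaded above:

mcli stat sss/infra/repo/pg_exporter-0.5.0.x86_64.rpm   # show object metadata
mcli rm   sss/infra/repo/pg_exporter-0.5.0.x86_64.rpm   # delete the object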

Playbook

There’s a built-in playbook, minio.yml, for installing the MinIO cluster, but you have to define the cluster in the inventory first.

minio.yml

  • minio-id : generate minio identity
  • minio_os_user : create os user minio
  • minio_install : install minio/mcli rpm
  • minio_clean : remove minio data (not default)
  • minio_dir : create minio directories
  • minio_config : generate minio config
    • minio_conf : minio main config
    • minio_cert : minio ssl cert
    • minio_dns : write minio dns records
  • minio_launch : launch minio service
  • minio_register : register minio to prometheus
  • minio_provision : create minio aliases/buckets/users
    • minio_alias : create minio client alias
    • minio_bucket : create minio buckets
    • minio_user : create minio biz users

The trusted CA file /etc/pki/ca.crt should already exist on all nodes; it is generated in role: ca and loaded & trusted by default in role: node.

You should install the MINIO module on Pigsty-managed nodes (i.e., install the NODE module first).



Dashboard

There are two dashboards for the MINIO module.

MinIO Overview: Overview of one single MinIO cluster

MinIO Instance: Detail information about one single MinIO instance



Parameter

There are 15 parameters in the MINIO module.

Parameter Type Level Comment
minio_seq int I minio instance identifier, REQUIRED
minio_cluster string C minio cluster name, minio by default
minio_clean bool G/C/A cleanup minio during init?, false by default
minio_user username C minio os user, minio by default
minio_node string C minio node name pattern
minio_data path C minio data dir(s), use {x…y} to specify multi drivers
minio_domain string G minio external domain name, sss.pigsty by default
minio_port port C minio service port, 9000 by default
minio_admin_port port C minio console port, 9001 by default
minio_access_key username C root access key, minioadmin by default
minio_secret_key password C root secret key, minioadmin by default
minio_extra_vars string C extra environment variables for minio server
minio_alias string G alias name for local minio deployment
minio_buckets bucket[] C list of minio bucket to be created
minio_users user[] C list of minio user to be created
#minio_seq: 1                     # minio instance identifier, REQUIRED
minio_cluster: minio              # minio cluster name, minio by default
minio_clean: false                # cleanup minio during init?, false by default
minio_user: minio                 # minio os user, `minio` by default
minio_node: '${minio_cluster}-${minio_seq}.pigsty' # minio node name pattern
minio_data: '/data/minio'         # minio data dir(s), use {x...y} to specify multi drivers
minio_domain: sss.pigsty          # minio external domain name, `sss.pigsty` by default
minio_port: 9000                  # minio service port, 9000 by default
minio_admin_port: 9001            # minio console port, 9001 by default
minio_access_key: minioadmin      # root access key, `minioadmin` by default
minio_secret_key: minioadmin      # root secret key, `minioadmin` by default
minio_extra_vars: ''              # extra environment variables
minio_alias: sss                  # alias name for local minio deployment
minio_buckets: [ { name: pgsql }, { name: infra },  { name: redis } ]
minio_users:
  - { access_key: dba , secret_key: S3User.DBA, policy: consoleAdmin }
  - { access_key: pgbackrest , secret_key: S3User.Backup, policy: readwrite }

9.1 - Metrics

Pigsty MINIO module metric list

The MINIO module has 79 available metrics.

Metric Name Type Labels Description
minio_audit_failed_messages counter ip, job, target_id, cls, instance, server, ins Total number of messages that failed to send since start
minio_audit_target_queue_length gauge ip, job, target_id, cls, instance, server, ins Number of unsent messages in queue for target
minio_audit_total_messages counter ip, job, target_id, cls, instance, server, ins Total number of messages sent since start
minio_cluster_bucket_total gauge ip, job, cls, instance, server, ins Total number of buckets in the cluster
minio_cluster_capacity_raw_free_bytes gauge ip, job, cls, instance, server, ins Total free capacity online in the cluster
minio_cluster_capacity_raw_total_bytes gauge ip, job, cls, instance, server, ins Total capacity online in the cluster
minio_cluster_capacity_usable_free_bytes gauge ip, job, cls, instance, server, ins Total free usable capacity online in the cluster
minio_cluster_capacity_usable_total_bytes gauge ip, job, cls, instance, server, ins Total usable capacity online in the cluster
minio_cluster_drive_offline_total gauge ip, job, cls, instance, server, ins Total drives offline in this cluster
minio_cluster_drive_online_total gauge ip, job, cls, instance, server, ins Total drives online in this cluster
minio_cluster_drive_total gauge ip, job, cls, instance, server, ins Total drives in this cluster
minio_cluster_health_erasure_set_healing_drives gauge pool, ip, job, cls, set, instance, server, ins Get the count of healing drives of this erasure set
minio_cluster_health_erasure_set_online_drives gauge pool, ip, job, cls, set, instance, server, ins Get the count of the online drives in this erasure set
minio_cluster_health_erasure_set_read_quorum gauge pool, ip, job, cls, set, instance, server, ins Get the read quorum for this erasure set
minio_cluster_health_erasure_set_status gauge pool, ip, job, cls, set, instance, server, ins Get current health status for this erasure set
minio_cluster_health_erasure_set_write_quorum gauge pool, ip, job, cls, set, instance, server, ins Get the write quorum for this erasure set
minio_cluster_health_status gauge ip, job, cls, instance, server, ins Get current cluster health status
minio_cluster_nodes_offline_total gauge ip, job, cls, instance, server, ins Total number of MinIO nodes offline
minio_cluster_nodes_online_total gauge ip, job, cls, instance, server, ins Total number of MinIO nodes online
minio_cluster_objects_size_distribution gauge ip, range, job, cls, instance, server, ins Distribution of object sizes across a cluster
minio_cluster_objects_version_distribution gauge ip, range, job, cls, instance, server, ins Distribution of object versions across a cluster
minio_cluster_usage_deletemarker_total gauge ip, job, cls, instance, server, ins Total number of delete markers in a cluster
minio_cluster_usage_object_total gauge ip, job, cls, instance, server, ins Total number of objects in a cluster
minio_cluster_usage_total_bytes gauge ip, job, cls, instance, server, ins Total cluster usage in bytes
minio_cluster_usage_version_total gauge ip, job, cls, instance, server, ins Total number of versions (includes delete marker) in a cluster
minio_cluster_webhook_failed_messages counter ip, job, cls, instance, server, ins Number of messages that failed to send
minio_cluster_webhook_online gauge ip, job, cls, instance, server, ins Is the webhook online?
minio_cluster_webhook_queue_length counter ip, job, cls, instance, server, ins Webhook queue length
minio_cluster_webhook_total_messages counter ip, job, cls, instance, server, ins Total number of messages sent to this target
minio_cluster_write_quorum gauge ip, job, cls, instance, server, ins Maximum write quorum across all pools and sets
minio_node_file_descriptor_limit_total gauge ip, job, cls, instance, server, ins Limit on total number of open file descriptors for the MinIO Server process
minio_node_file_descriptor_open_total gauge ip, job, cls, instance, server, ins Total number of open file descriptors by the MinIO Server process
minio_node_go_routine_total gauge ip, job, cls, instance, server, ins Total number of go routines running
minio_node_ilm_expiry_pending_tasks gauge ip, job, cls, instance, server, ins Number of pending ILM expiry tasks in the queue
minio_node_ilm_transition_active_tasks gauge ip, job, cls, instance, server, ins Number of active ILM transition tasks
minio_node_ilm_transition_missed_immediate_tasks gauge ip, job, cls, instance, server, ins Number of missed immediate ILM transition tasks
minio_node_ilm_transition_pending_tasks gauge ip, job, cls, instance, server, ins Number of pending ILM transition tasks in the queue
minio_node_ilm_versions_scanned counter ip, job, cls, instance, server, ins Total number of object versions checked for ilm actions since server start
minio_node_io_rchar_bytes counter ip, job, cls, instance, server, ins Total bytes read by the process from the underlying storage system including cache, /proc/[pid]/io rchar
minio_node_io_read_bytes counter ip, job, cls, instance, server, ins Total bytes read by the process from the underlying storage system, /proc/[pid]/io read_bytes
minio_node_io_wchar_bytes counter ip, job, cls, instance, server, ins Total bytes written by the process to the underlying storage system including page cache, /proc/[pid]/io wchar
minio_node_io_write_bytes counter ip, job, cls, instance, server, ins Total bytes written by the process to the underlying storage system, /proc/[pid]/io write_bytes
minio_node_process_cpu_total_seconds counter ip, job, cls, instance, server, ins Total user and system CPU time spent in seconds
minio_node_process_resident_memory_bytes gauge ip, job, cls, instance, server, ins Resident memory size in bytes
minio_node_process_starttime_seconds gauge ip, job, cls, instance, server, ins Start time for MinIO process per node, time in seconds since Unix epoc
minio_node_process_uptime_seconds gauge ip, job, cls, instance, server, ins Uptime for MinIO process per node in seconds
minio_node_scanner_bucket_scans_finished counter ip, job, cls, instance, server, ins Total number of bucket scans finished since server start
minio_node_scanner_bucket_scans_started counter ip, job, cls, instance, server, ins Total number of bucket scans started since server start
minio_node_scanner_directories_scanned counter ip, job, cls, instance, server, ins Total number of directories scanned since server start
minio_node_scanner_objects_scanned counter ip, job, cls, instance, server, ins Total number of unique objects scanned since server start
minio_node_scanner_versions_scanned counter ip, job, cls, instance, server, ins Total number of object versions scanned since server start
minio_node_syscall_read_total counter ip, job, cls, instance, server, ins Total read SysCalls to the kernel. /proc/[pid]/io syscr
minio_node_syscall_write_total counter ip, job, cls, instance, server, ins Total write SysCalls to the kernel. /proc/[pid]/io syscw
minio_notify_current_send_in_progress gauge ip, job, cls, instance, server, ins Number of concurrent async Send calls active to all targets (deprecated, please use ‘minio_notify_target_current_send_in_progress’ instead)
minio_notify_events_errors_total counter ip, job, cls, instance, server, ins Events that were failed to be sent to the targets (deprecated, please use ‘minio_notify_target_failed_events’ instead)
minio_notify_events_sent_total counter ip, job, cls, instance, server, ins Total number of events sent to the targets (deprecated, please use ‘minio_notify_target_total_events’ instead)
minio_notify_events_skipped_total counter ip, job, cls, instance, server, ins Events that were skipped to be sent to the targets due to the in-memory queue being full
minio_s3_requests_4xx_errors_total counter ip, job, cls, instance, server, ins, api Total number of S3 requests with (4xx) errors
minio_s3_requests_errors_total counter ip, job, cls, instance, server, ins, api Total number of S3 requests with (4xx and 5xx) errors
minio_s3_requests_incoming_total gauge ip, job, cls, instance, server, ins Total number of incoming S3 requests
minio_s3_requests_inflight_total gauge ip, job, cls, instance, server, ins, api Total number of S3 requests currently in flight
minio_s3_requests_rejected_auth_total counter ip, job, cls, instance, server, ins Total number of S3 requests rejected for auth failure
minio_s3_requests_rejected_header_total counter ip, job, cls, instance, server, ins Total number of S3 requests rejected for invalid header
minio_s3_requests_rejected_invalid_total counter ip, job, cls, instance, server, ins Total number of invalid S3 requests
minio_s3_requests_rejected_timestamp_total counter ip, job, cls, instance, server, ins Total number of S3 requests rejected for invalid timestamp
minio_s3_requests_total counter ip, job, cls, instance, server, ins, api Total number of S3 requests
minio_s3_requests_ttfb_seconds_distribution gauge ip, job, cls, le, instance, server, ins, api Distribution of time to first byte across API calls
minio_s3_requests_waiting_total gauge ip, job, cls, instance, server, ins Total number of S3 requests in the waiting queue
minio_s3_traffic_received_bytes counter ip, job, cls, instance, server, ins Total number of s3 bytes received
minio_s3_traffic_sent_bytes counter ip, job, cls, instance, server, ins Total number of s3 bytes sent
minio_software_commit_info gauge ip, job, cls, instance, commit, server, ins Git commit hash for the MinIO release
minio_software_version_info gauge ip, job, cls, instance, version, server, ins MinIO Release tag for the server
minio_up Unknown ip, job, cls, instance, ins N/A
minio_usage_last_activity_nano_seconds gauge ip, job, cls, instance, server, ins Time elapsed (in nano seconds) since last scan activity.
scrape_duration_seconds Unknown ip, job, cls, instance, ins N/A
scrape_samples_post_metric_relabeling Unknown ip, job, cls, instance, ins N/A
scrape_samples_scraped Unknown ip, job, cls, instance, ins N/A
scrape_series_added Unknown ip, job, cls, instance, ins N/A
up Unknown ip, job, cls, instance, ins N/A

9.2 - FAQ

Pigsty MINIO module frequently asked questions

Failed to launch a multi-node / multi-drive MinIO cluster.

In multi-drive or multi-node mode, MinIO will refuse to start if the data dir is not a valid mount point.

Use mounted disks for the MinIO data dirs rather than plain directories. A plain directory is only allowed in single-node single-drive mode.
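
For example, you can verify that each data dir is a real mount point before launching MinIO (assuming the /data{1...4} layout above):

mountpoint /data1    # prints whether /data1 is a mountpoint; repeat for /data2 ... /data4
lsblk                # review block devices and where they are mounted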


How to deploy a multi-node multi-drive MinIO cluster?

Check Create Multi-Node Multi-Driver MinIO Cluster


How to add a member to the existing MinIO cluster?

You’d better plan the MinIO cluster size before deployment, since adding members requires a global restart.

Check this: Expand MinIO Deployment


How to use a HA MinIO deployment for PGSQL?

Access the HA MinIO cluster with an optional load balancer and different ports.

Here is an example: Access MinIO Service

10 - Module: REDIS

Pigsty has built-in support for Redis, a high-performance data-structure server. Deploy Redis in standalone, cluster, or sentinel mode.

Concept

The entity model of Redis is almost the same as that of PostgreSQL, which also includes the concepts of Cluster and Instance. The Cluster here does not refer to the native Redis Cluster mode.

The core difference between the REDIS module and the PGSQL module is that Redis uses a single-node multi-instance deployment rather than the 1:1 deployment: multiple Redis instances are typically deployed on a physical/virtual machine node to utilize multi-core CPUs fully. Therefore, the ways to configure and administer Redis instances are slightly different from PGSQL.

In Redis managed by Pigsty, nodes are entirely subordinate to the cluster: it is currently not allowed to deploy Redis instances from two different clusters on one node. However, this does not prevent deploying multiple independent Redis primary/replica instances on one node.


Configuration

Redis Identity

Redis identity parameters are required parameters when defining a Redis cluster.

Name Attribute Description Example
redis_cluster REQUIRED, cluster level cluster name redis-test
redis_node REQUIRED, node level Node Sequence Number 1,2
redis_instances REQUIRED, node level Instance Definition { 6001 : {} ,6002 : {}}
  • redis_cluster: Redis cluster name, top-level namespace for cluster sources.
  • redis_node: Redis node identity, integer, and node number in the cluster.
  • redis_instances: A Dict with the Key as redis port and the value as an instance level parameter.

Redis Mode

There are three redis_mode available in Pigsty:

  • standalone: setup Redis in standalone (master-slave) mode
  • cluster: setup this Redis cluster as a Redis native cluster
  • sentinel: setup Redis as a sentinel for standalone Redis HA

Redis Definition

Here are three examples:

  • A 1-node, one primary & one replica Redis standalone cluster: redis-ms
  • A 1-node, 3-instance Redis sentinel cluster: redis-meta
  • A 2-node, 6-instance Redis native cluster: redis-test

redis-ms: # redis classic primary & replica
  hosts: { 10.10.10.10: { redis_node: 1 , redis_instances: { 6379: { }, 6380: { replica_of: '10.10.10.10 6379' } } } }
  vars: { redis_cluster: redis-ms ,redis_password: 'redis.ms' ,redis_max_memory: 64MB }

redis-meta: # redis sentinel x 3
  hosts: { 10.10.10.11: { redis_node: 1 , redis_instances: { 26379: { } ,26380: { } ,26381: { } } } }
  vars:
    redis_cluster: redis-meta
    redis_password: 'redis.meta'
    redis_mode: sentinel
    redis_max_memory: 16MB
    redis_sentinel_monitor: # primary list for redis sentinel, use cls as name, primary ip:port
      - { name: redis-ms, host: 10.10.10.10, port: 6379 ,password: redis.ms, quorum: 2 }

redis-test: # redis native cluster: 3m x 3s
  hosts:
    10.10.10.12: { redis_node: 1 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
    10.10.10.13: { redis_node: 2 ,redis_instances: { 6379: { } ,6380: { } ,6381: { } } }
  vars: { redis_cluster: redis-test ,redis_password: 'redis.test' ,redis_mode: cluster, redis_max_memory: 32MB }

Limitation

  • A Redis node can only belong to one Redis cluster, which means you cannot assign a node to two different Redis clusters simultaneously.
  • On each Redis node, you need to assign a unique port number to the Redis instance to avoid port conflicts.
  • Typically, the same Redis cluster will use the same password, but multiple Redis instances on a Redis node cannot set different passwords (because redis_exporter only allows one password).
  • Redis Cluster has built-in HA, while standalone HA has to be configured manually in Sentinel, because we are unsure whether you have any sentinels available. Fortunately, configuring standalone Redis HA is straightforward: Configure HA with Sentinel.

Administration

Here are some common administration tasks for Redis. Check FAQ: Redis for more details.


Init Redis

Init Cluster/Node/Instance

# init all redis instances on group <cluster>
./redis.yml -l <cluster>      # init redis cluster

# init redis node
./redis.yml -l 10.10.10.10    # init redis node

# init one specific redis instance 10.10.10.11:6379
./redis.yml -l 10.10.10.11 -e redis_port=6379 -t redis

You can also use the wrapper script:

bin/redis-add redis-ms          # create redis cluster 'redis-ms'
bin/redis-add 10.10.10.10       # create redis node '10.10.10.10'
bin/redis-add 10.10.10.10 6379  # create redis instance '10.10.10.10:6379'

Remove Redis

Remove Cluster/Node/Instance

# Remove cluster `redis-test`
redis-rm.yml -l redis-test

# Remove cluster `redis-test`, and uninstall packages
redis-rm.yml -l redis-test -e redis_uninstall=true

# Remove all instance on redis node 10.10.10.13
redis-rm.yml -l 10.10.10.13

# Remove one specific instance 10.10.10.13:6379
redis-rm.yml -l 10.10.10.13 -e redis_port=6379

You can also use the wrapper script:

bin/redis-rm redis-ms          # remove redis cluster 'redis-ms'
bin/redis-rm 10.10.10.10       # remove redis node '10.10.10.10'
bin/redis-rm 10.10.10.10 6379  # remove redis instance '10.10.10.10:6379'

Reconfigure Redis

You can run a subset of the redis.yml tasks to reconfigure Redis.

./redis.yml -l <cluster> -t redis_config,redis_launch

Beware that the Redis config cannot be reloaded online; you have to restart Redis for the config changes to take effect.


Use Redis Client Tools

Access redis instance with redis-cli:

$ redis-cli -h 10.10.10.10 -p 6379 # <--- connect with host and port
10.10.10.10:6379> auth redis.ms    # <--- auth with password
OK
10.10.10.10:6379> set a 10         # <--- set a key
OK
10.10.10.10:6379> get a            # <--- get a key back
"10"

Redis also ships with redis-benchmark, which can be used to benchmark and generate load on a Redis server:

redis-benchmark -h 10.10.10.13 -p 6379
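
For a heavier, authenticated run against the redis-ms primary above, you can tune the workload; a sketch (adjust counts and tests to your needs):

redis-benchmark -h 10.10.10.10 -p 6379 -a redis.ms -n 100000 -c 50 -t set,get,incr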

Configure Redis Replica

https://redis.io/commands/replicaof/

# promote a redis instance to primary
> REPLICAOF NO ONE
"OK"

# make a redis instance replica of another instance
> REPLICAOF 127.0.0.1 6799
"OK"

Configure HA with Sentinel

You have to enable HA for a Redis standalone (primary-replica) cluster manually, using your Redis sentinel cluster.

Take the 4-node sandbox as an example: the Redis sentinel cluster redis-meta is used to manage the redis-ms standalone cluster.

# for each sentinel, add redis master to the sentinel with:
$ redis-cli -h 10.10.10.11 -p 26379 -a redis.meta
10.10.10.11:26379> SENTINEL MONITOR redis-ms 10.10.10.10 6379 1
10.10.10.11:26379> SENTINEL SET redis-ms auth-pass redis.ms      # if auth enabled, password has to be configured

If you wish to remove a redis master from sentinel, use SENTINEL REMOVE <name>.

You can configure multiple Redis masters on a sentinel cluster with redis_sentinel_monitor:

redis_sentinel_monitor: # primary list for redis sentinel, use cls as name, primary ip:port
  - { name: redis-src, host: 10.10.10.45, port: 6379 ,password: redis.src, quorum: 1 }
  - { name: redis-dst, host: 10.10.10.48, port: 6379 ,password: redis.dst, quorum: 1 }

And refresh the master list on the sentinel cluster with:

./redis.yml -l redis-meta -t redis-ha   # replace redis-meta if your sentinel cluster has different name

Playbook

There are two playbooks for redis:

redis.yml

The playbook redis.yml will init redis cluster/node/instance:

redis_node        : init redis node
  - redis_install : install redis & redis_exporter
  - redis_user    : create os user redis
  - redis_dir     : create redis fhs directories
redis_exporter    : config and launch redis_exporter
  - redis_exporter_config  : generate redis_exporter config
  - redis_exporter_launch  : launch redis_exporter
redis_instance    : config and launch redis cluster/node/instance
  - redis_check   : check redis instance existence
  - redis_clean   : purge existing redis instance
  - redis_config  : generate redis instance config
  - redis_launch  : launch redis instance
redis_register    : register redis to prometheus
redis_ha          : setup redis sentinel
redis_join        : join redis cluster
Example: Init redis cluster


redis-rm.yml

The playbook redis-rm.yml will remove redis cluster/node/instance:

- register       : remove monitor target from prometheus
- redis_exporter : stop and disable redis_exporter
- redis          : stop and disable redis cluster/node/instance
- redis_data     : remove redis data (rdb, aof)
- redis_pkg      : uninstall redis & redis_exporter packages

Dashboard

There are three dashboards for the REDIS module.

Redis Overview: Overview of all Redis Instances

Redis Overview Dashboard



Redis Cluster : Overview of one single redis cluster

Redis Cluster Dashboard



Redis Instance : Overview of one single Redis instance

Redis Instance Dashboard




Parameter

There are 20 parameters in the REDIS module.

Parameter Type Level Comment
redis_cluster string C redis cluster name, required identity parameter
redis_instances dict I redis instances definition on this redis node
redis_node int I redis node sequence number, node int id required
redis_fs_main path C redis main data mountpoint, /data by default
redis_exporter_enabled bool C install redis exporter on redis nodes?
redis_exporter_port port C redis exporter listen port, 9121 by default
redis_exporter_options string C/I cli args and extra options for redis exporter
redis_safeguard bool G/C/A prevent purging running redis instance?
redis_clean bool G/C/A purging existing redis during init?
redis_rmdata bool G/C/A remove redis data when purging redis server?
redis_mode enum C redis mode: standalone,cluster,sentinel
redis_conf string C redis config template path, except sentinel
redis_bind_address ip C redis bind address, empty string will use host ip
redis_max_memory size C/I max memory used by each redis instance
redis_mem_policy enum C redis memory eviction policy
redis_password password C redis password, empty string will disable password
redis_rdb_save string[] C redis rdb save directives, disable with empty list
redis_aof_enabled bool C enable redis append only file?
redis_rename_commands dict C rename redis dangerous commands
redis_cluster_replicas int C replica number for one master in redis cluster
redis_sentinel_monitor master[] C sentinel master list, sentinel cluster only
#redis_cluster:        <CLUSTER> # redis cluster name, required identity parameter
#redis_node: 1            <NODE> # redis node sequence number, node int id required
#redis_instances: {}      <NODE> # redis instances definition on this redis node
redis_fs_main: /data              # redis main data mountpoint, `/data` by default
redis_exporter_enabled: true      # install redis exporter on redis nodes?
redis_exporter_port: 9121         # redis exporter listen port, 9121 by default
redis_exporter_options: ''        # cli args and extra options for redis exporter
redis_safeguard: false            # prevent purging running redis instance?
redis_clean: true                 # purging existing redis during init?
redis_rmdata: true                # remove redis data when purging redis server?
redis_mode: standalone            # redis mode: standalone,cluster,sentinel
redis_conf: redis.conf            # redis config template path, except sentinel
redis_bind_address: '0.0.0.0'     # redis bind address, empty string will use host ip
redis_max_memory: 1GB             # max memory used by each redis instance
redis_mem_policy: allkeys-lru     # redis memory eviction policy
redis_password: ''                # redis password, empty string will disable password
redis_rdb_save: ['1200 1']        # redis rdb save directives, disable with empty list
redis_aof_enabled: false          # enable redis append only file?
redis_rename_commands: {}         # rename redis dangerous commands
redis_cluster_replicas: 1         # replica number for one master in redis cluster
redis_sentinel_monitor: []        # sentinel master list, works on sentinel cluster only

10.1 - Metrics

Pigsty REDIS module metric list

The REDIS module has 275 available metrics.

Metric Name Type Labels Description
ALERTS Unknown cls, ip, level, severity, instance, category, ins, alertname, job, alertstate N/A
ALERTS_FOR_STATE Unknown cls, ip, level, severity, instance, category, ins, alertname, job N/A
redis:cls:aof_rewrite_time Unknown cls, job N/A
redis:cls:blocked_clients Unknown cls, job N/A
redis:cls:clients Unknown cls, job N/A
redis:cls:cmd_qps Unknown cls, cmd, job N/A
redis:cls:cmd_rt Unknown cls, cmd, job N/A
redis:cls:cmd_time Unknown cls, cmd, job N/A
redis:cls:conn_rate Unknown cls, job N/A
redis:cls:conn_reject Unknown cls, job N/A
redis:cls:cpu_sys Unknown cls, job N/A
redis:cls:cpu_sys_child Unknown cls, job N/A
redis:cls:cpu_usage Unknown cls, job N/A
redis:cls:cpu_usage_child Unknown cls, job N/A
redis:cls:cpu_user Unknown cls, job N/A
redis:cls:cpu_user_child Unknown cls, job N/A
redis:cls:fork_time Unknown cls, job N/A
redis:cls:key_evict Unknown cls, job N/A
redis:cls:key_expire Unknown cls, job N/A
redis:cls:key_hit Unknown cls, job N/A
redis:cls:key_hit_rate Unknown cls, job N/A
redis:cls:key_miss Unknown cls, job N/A
redis:cls:mem_max Unknown cls, job N/A
redis:cls:mem_usage Unknown cls, job N/A
redis:cls:mem_usage_max Unknown cls, job N/A
redis:cls:mem_used Unknown cls, job N/A
redis:cls:net_traffic Unknown cls, job N/A
redis:cls:qps Unknown cls, job N/A
redis:cls:qps_mu Unknown cls, job N/A
redis:cls:qps_realtime Unknown cls, job N/A
redis:cls:qps_sigma Unknown cls, job N/A
redis:cls:rt Unknown cls, job N/A
redis:cls:rt_mu Unknown cls, job N/A
redis:cls:rt_sigma Unknown cls, job N/A
redis:cls:rx Unknown cls, job N/A
redis:cls:size Unknown cls, job N/A
redis:cls:tx Unknown cls, job N/A
redis:env:blocked_clients Unknown job N/A
redis:env:clients Unknown job N/A
redis:env:cmd_qps Unknown cmd, job N/A
redis:env:cmd_rt Unknown cmd, job N/A
redis:env:cmd_time Unknown cmd, job N/A
redis:env:conn_rate Unknown job N/A
redis:env:conn_reject Unknown job N/A
redis:env:cpu_usage Unknown job N/A
redis:env:cpu_usage_child Unknown job N/A
redis:env:key_evict Unknown job N/A
redis:env:key_expire Unknown job N/A
redis:env:key_hit Unknown job N/A
redis:env:key_hit_rate Unknown job N/A
redis:env:key_miss Unknown job N/A
redis:env:mem_usage Unknown job N/A
redis:env:net_traffic Unknown job N/A
redis:env:qps Unknown job N/A
redis:env:qps_mu Unknown job N/A
redis:env:qps_realtime Unknown job N/A
redis:env:qps_sigma Unknown job N/A
redis:env:rt Unknown job N/A
redis:env:rt_mu Unknown job N/A
redis:env:rt_sigma Unknown job N/A
redis:env:rx Unknown job N/A
redis:env:tx Unknown job N/A
redis:ins Unknown cls, id, instance, ins, job N/A
redis:ins:blocked_clients Unknown cls, ip, instance, ins, job N/A
redis:ins:clients Unknown cls, ip, instance, ins, job N/A
redis:ins:cmd_qps Unknown cls, cmd, ip, instance, ins, job N/A
redis:ins:cmd_rt Unknown cls, cmd, ip, instance, ins, job N/A
redis:ins:cmd_time Unknown cls, cmd, ip, instance, ins, job N/A
redis:ins:conn_rate Unknown cls, ip, instance, ins, job N/A
redis:ins:conn_reject Unknown cls, ip, instance, ins, job N/A
redis:ins:cpu_sys Unknown cls, ip, instance, ins, job N/A
redis:ins:cpu_sys_child Unknown cls, ip, instance, ins, job N/A
redis:ins:cpu_usage Unknown cls, ip, instance, ins, job N/A
redis:ins:cpu_usage_child Unknown cls, ip, instance, ins, job N/A
redis:ins:cpu_user Unknown cls, ip, instance, ins, job N/A
redis:ins:cpu_user_child Unknown cls, ip, instance, ins, job N/A
redis:ins:key_evict Unknown cls, ip, instance, ins, job N/A
redis:ins:key_expire Unknown cls, ip, instance, ins, job N/A
redis:ins:key_hit Unknown cls, ip, instance, ins, job N/A
redis:ins:key_hit_rate Unknown cls, ip, instance, ins, job N/A
redis:ins:key_miss Unknown cls, ip, instance, ins, job N/A
redis:ins:lsn_rate Unknown cls, ip, instance, ins, job N/A
redis:ins:mem_usage Unknown cls, ip, instance, ins, job N/A
redis:ins:net_traffic Unknown cls, ip, instance, ins, job N/A
redis:ins:qps Unknown cls, ip, instance, ins, job N/A
redis:ins:qps_mu Unknown cls, ip, instance, ins, job N/A
redis:ins:qps_realtime Unknown cls, ip, instance, ins, job N/A
redis:ins:qps_sigma Unknown cls, ip, instance, ins, job N/A
redis:ins:rt Unknown cls, ip, instance, ins, job N/A
redis:ins:rt_mu Unknown cls, ip, instance, ins, job N/A
redis:ins:rt_sigma Unknown cls, ip, instance, ins, job N/A
redis:ins:rx Unknown cls, ip, instance, ins, job N/A
redis:ins:tx Unknown cls, ip, instance, ins, job N/A
redis:node:ip Unknown cls, ip, instance, ins, job N/A
redis:node:mem_alloc Unknown cls, ip, job N/A
redis:node:mem_total Unknown cls, ip, job N/A
redis:node:mem_used Unknown cls, ip, job N/A
redis:node:qps Unknown cls, ip, job N/A
redis_active_defrag_running gauge cls, ip, instance, ins, job active_defrag_running metric
redis_allocator_active_bytes gauge cls, ip, instance, ins, job allocator_active_bytes metric
redis_allocator_allocated_bytes gauge cls, ip, instance, ins, job allocator_allocated_bytes metric
redis_allocator_frag_bytes gauge cls, ip, instance, ins, job allocator_frag_bytes metric
redis_allocator_frag_ratio gauge cls, ip, instance, ins, job allocator_frag_ratio metric
redis_allocator_resident_bytes gauge cls, ip, instance, ins, job allocator_resident_bytes metric
redis_allocator_rss_bytes gauge cls, ip, instance, ins, job allocator_rss_bytes metric
redis_allocator_rss_ratio gauge cls, ip, instance, ins, job allocator_rss_ratio metric
redis_aof_current_rewrite_duration_sec gauge cls, ip, instance, ins, job aof_current_rewrite_duration_sec metric
redis_aof_enabled gauge cls, ip, instance, ins, job aof_enabled metric
redis_aof_last_bgrewrite_status gauge cls, ip, instance, ins, job aof_last_bgrewrite_status metric
redis_aof_last_cow_size_bytes gauge cls, ip, instance, ins, job aof_last_cow_size_bytes metric
redis_aof_last_rewrite_duration_sec gauge cls, ip, instance, ins, job aof_last_rewrite_duration_sec metric
redis_aof_last_write_status gauge cls, ip, instance, ins, job aof_last_write_status metric
redis_aof_rewrite_in_progress gauge cls, ip, instance, ins, job aof_rewrite_in_progress metric
redis_aof_rewrite_scheduled gauge cls, ip, instance, ins, job aof_rewrite_scheduled metric
redis_blocked_clients gauge cls, ip, instance, ins, job blocked_clients metric
redis_client_recent_max_input_buffer_bytes gauge cls, ip, instance, ins, job client_recent_max_input_buffer_bytes metric
redis_client_recent_max_output_buffer_bytes gauge cls, ip, instance, ins, job client_recent_max_output_buffer_bytes metric
redis_clients_in_timeout_table gauge cls, ip, instance, ins, job clients_in_timeout_table metric
redis_cluster_connections gauge cls, ip, instance, ins, job cluster_connections metric
redis_cluster_current_epoch gauge cls, ip, instance, ins, job cluster_current_epoch metric
redis_cluster_enabled gauge cls, ip, instance, ins, job cluster_enabled metric
redis_cluster_known_nodes gauge cls, ip, instance, ins, job cluster_known_nodes metric
redis_cluster_messages_received_total gauge cls, ip, instance, ins, job cluster_messages_received_total metric
redis_cluster_messages_sent_total gauge cls, ip, instance, ins, job cluster_messages_sent_total metric
redis_cluster_my_epoch gauge cls, ip, instance, ins, job cluster_my_epoch metric
redis_cluster_size gauge cls, ip, instance, ins, job cluster_size metric
redis_cluster_slots_assigned gauge cls, ip, instance, ins, job cluster_slots_assigned metric
redis_cluster_slots_fail gauge cls, ip, instance, ins, job cluster_slots_fail metric
redis_cluster_slots_ok gauge cls, ip, instance, ins, job cluster_slots_ok metric
redis_cluster_slots_pfail gauge cls, ip, instance, ins, job cluster_slots_pfail metric
redis_cluster_state gauge cls, ip, instance, ins, job cluster_state metric
redis_cluster_stats_messages_meet_received gauge cls, ip, instance, ins, job cluster_stats_messages_meet_received metric
redis_cluster_stats_messages_meet_sent gauge cls, ip, instance, ins, job cluster_stats_messages_meet_sent metric
redis_cluster_stats_messages_ping_received gauge cls, ip, instance, ins, job cluster_stats_messages_ping_received metric
redis_cluster_stats_messages_ping_sent gauge cls, ip, instance, ins, job cluster_stats_messages_ping_sent metric
redis_cluster_stats_messages_pong_received gauge cls, ip, instance, ins, job cluster_stats_messages_pong_received metric
redis_cluster_stats_messages_pong_sent gauge cls, ip, instance, ins, job cluster_stats_messages_pong_sent metric
redis_commands_duration_seconds_total counter cls, cmd, ip, instance, ins, job Total amount of time in seconds spent per command
redis_commands_failed_calls_total counter cls, cmd, ip, instance, ins, job Total number of errors prior command execution per command
redis_commands_latencies_usec_bucket Unknown cls, cmd, ip, le, instance, ins, job N/A
redis_commands_latencies_usec_count Unknown cls, cmd, ip, instance, ins, job N/A
redis_commands_latencies_usec_sum Unknown cls, cmd, ip, instance, ins, job N/A
redis_commands_processed_total counter cls, ip, instance, ins, job commands_processed_total metric
redis_commands_rejected_calls_total counter cls, cmd, ip, instance, ins, job Total number of errors within command execution per command
redis_commands_total counter cls, cmd, ip, instance, ins, job Total number of calls per command
redis_config_io_threads gauge cls, ip, instance, ins, job config_io_threads metric
redis_config_maxclients gauge cls, ip, instance, ins, job config_maxclients metric
redis_config_maxmemory gauge cls, ip, instance, ins, job config_maxmemory metric
redis_connected_clients gauge cls, ip, instance, ins, job connected_clients metric
redis_connected_slave_lag_seconds gauge cls, ip, slave_ip, instance, slave_state, ins, slave_port, job Lag of connected slave
redis_connected_slave_offset_bytes gauge cls, ip, slave_ip, instance, slave_state, ins, slave_port, job Offset of connected slave
redis_connected_slaves gauge cls, ip, instance, ins, job connected_slaves metric
redis_connections_received_total counter cls, ip, instance, ins, job connections_received_total metric
redis_cpu_sys_children_seconds_total counter cls, ip, instance, ins, job cpu_sys_children_seconds_total metric
redis_cpu_sys_main_thread_seconds_total counter cls, ip, instance, ins, job cpu_sys_main_thread_seconds_total metric
redis_cpu_sys_seconds_total counter cls, ip, instance, ins, job cpu_sys_seconds_total metric
redis_cpu_user_children_seconds_total counter cls, ip, instance, ins, job cpu_user_children_seconds_total metric
redis_cpu_user_main_thread_seconds_total counter cls, ip, instance, ins, job cpu_user_main_thread_seconds_total metric
redis_cpu_user_seconds_total counter cls, ip, instance, ins, job cpu_user_seconds_total metric
redis_db_keys gauge cls, ip, instance, ins, db, job Total number of keys by DB
redis_db_keys_expiring gauge cls, ip, instance, ins, db, job Total number of expiring keys by DB
redis_defrag_hits gauge cls, ip, instance, ins, job defrag_hits metric
redis_defrag_key_hits gauge cls, ip, instance, ins, job defrag_key_hits metric
redis_defrag_key_misses gauge cls, ip, instance, ins, job defrag_key_misses metric
redis_defrag_misses gauge cls, ip, instance, ins, job defrag_misses metric
redis_dump_payload_sanitizations counter cls, ip, instance, ins, job dump_payload_sanitizations metric
redis_errors_total counter cls, ip, err, instance, ins, job Total number of errors per error type
redis_evicted_keys_total counter cls, ip, instance, ins, job evicted_keys_total metric
redis_expired_keys_total counter cls, ip, instance, ins, job expired_keys_total metric
redis_expired_stale_percentage gauge cls, ip, instance, ins, job expired_stale_percentage metric
redis_expired_time_cap_reached_total gauge cls, ip, instance, ins, job expired_time_cap_reached_total metric
redis_exporter_build_info gauge cls, golang_version, ip, commit_sha, instance, version, ins, job, build_date redis exporter build_info
redis_exporter_last_scrape_connect_time_seconds gauge cls, ip, instance, ins, job exporter_last_scrape_connect_time_seconds metric
redis_exporter_last_scrape_duration_seconds gauge cls, ip, instance, ins, job exporter_last_scrape_duration_seconds metric
redis_exporter_last_scrape_error gauge cls, ip, instance, ins, job The last scrape error status.
redis_exporter_scrape_duration_seconds_count Unknown cls, ip, instance, ins, job N/A
redis_exporter_scrape_duration_seconds_sum Unknown cls, ip, instance, ins, job N/A
redis_exporter_scrapes_total counter cls, ip, instance, ins, job Current total redis scrapes.
redis_instance_info gauge cls, ip, os, role, instance, run_id, redis_version, tcp_port, process_id, ins, redis_mode, maxmemory_policy, redis_build_id, job Information about the Redis instance
redis_io_threaded_reads_processed counter cls, ip, instance, ins, job io_threaded_reads_processed metric
redis_io_threaded_writes_processed counter cls, ip, instance, ins, job io_threaded_writes_processed metric
redis_io_threads_active gauge cls, ip, instance, ins, job io_threads_active metric
redis_keyspace_hits_total counter cls, ip, instance, ins, job keyspace_hits_total metric
redis_keyspace_misses_total counter cls, ip, instance, ins, job keyspace_misses_total metric
redis_last_key_groups_scrape_duration_milliseconds gauge cls, ip, instance, ins, job Duration of the last key group metrics scrape in milliseconds
redis_last_slow_execution_duration_seconds gauge cls, ip, instance, ins, job The amount of time needed for last slow execution, in seconds
redis_latency_percentiles_usec summary cls, cmd, ip, instance, quantile, ins, job A summary of latency percentile distribution per command
redis_latency_percentiles_usec_count Unknown cls, cmd, ip, instance, ins, job N/A
redis_latency_percentiles_usec_sum Unknown cls, cmd, ip, instance, ins, job N/A
redis_latest_fork_seconds gauge cls, ip, instance, ins, job latest_fork_seconds metric
redis_lazyfree_pending_objects gauge cls, ip, instance, ins, job lazyfree_pending_objects metric
redis_loading_dump_file gauge cls, ip, instance, ins, job loading_dump_file metric
redis_master_last_io_seconds_ago gauge cls, ip, master_host, instance, ins, job, master_port Master last io seconds ago
redis_master_link_up gauge cls, ip, master_host, instance, ins, job, master_port Master link status on Redis slave
redis_master_repl_offset gauge cls, ip, instance, ins, job master_repl_offset metric
redis_master_sync_in_progress gauge cls, ip, master_host, instance, ins, job, master_port Master sync in progress
redis_mem_clients_normal gauge cls, ip, instance, ins, job mem_clients_normal metric
redis_mem_clients_slaves gauge cls, ip, instance, ins, job mem_clients_slaves metric
redis_mem_fragmentation_bytes gauge cls, ip, instance, ins, job mem_fragmentation_bytes metric
redis_mem_fragmentation_ratio gauge cls, ip, instance, ins, job mem_fragmentation_ratio metric
redis_mem_not_counted_for_eviction_bytes gauge cls, ip, instance, ins, job mem_not_counted_for_eviction_bytes metric
redis_memory_max_bytes gauge cls, ip, instance, ins, job memory_max_bytes metric
redis_memory_used_bytes gauge cls, ip, instance, ins, job memory_used_bytes metric
redis_memory_used_dataset_bytes gauge cls, ip, instance, ins, job memory_used_dataset_bytes metric
redis_memory_used_lua_bytes gauge cls, ip, instance, ins, job memory_used_lua_bytes metric
redis_memory_used_overhead_bytes gauge cls, ip, instance, ins, job memory_used_overhead_bytes metric
redis_memory_used_peak_bytes gauge cls, ip, instance, ins, job memory_used_peak_bytes metric
redis_memory_used_rss_bytes gauge cls, ip, instance, ins, job memory_used_rss_bytes metric
redis_memory_used_scripts_bytes gauge cls, ip, instance, ins, job memory_used_scripts_bytes metric
redis_memory_used_startup_bytes gauge cls, ip, instance, ins, job memory_used_startup_bytes metric
redis_migrate_cached_sockets_total gauge cls, ip, instance, ins, job migrate_cached_sockets_total metric
redis_module_fork_in_progress gauge cls, ip, instance, ins, job module_fork_in_progress metric
redis_module_fork_last_cow_size gauge cls, ip, instance, ins, job module_fork_last_cow_size metric
redis_net_input_bytes_total counter cls, ip, instance, ins, job net_input_bytes_total metric
redis_net_output_bytes_total counter cls, ip, instance, ins, job net_output_bytes_total metric
redis_number_of_cached_scripts gauge cls, ip, instance, ins, job number_of_cached_scripts metric
redis_process_id gauge cls, ip, instance, ins, job process_id metric
redis_pubsub_channels gauge cls, ip, instance, ins, job pubsub_channels metric
redis_pubsub_patterns gauge cls, ip, instance, ins, job pubsub_patterns metric
redis_pubsubshard_channels gauge cls, ip, instance, ins, job pubsubshard_channels metric
redis_rdb_bgsave_in_progress gauge cls, ip, instance, ins, job rdb_bgsave_in_progress metric
redis_rdb_changes_since_last_save gauge cls, ip, instance, ins, job rdb_changes_since_last_save metric
redis_rdb_current_bgsave_duration_sec gauge cls, ip, instance, ins, job rdb_current_bgsave_duration_sec metric
redis_rdb_last_bgsave_duration_sec gauge cls, ip, instance, ins, job rdb_last_bgsave_duration_sec metric
redis_rdb_last_bgsave_status gauge cls, ip, instance, ins, job rdb_last_bgsave_status metric
redis_rdb_last_cow_size_bytes gauge cls, ip, instance, ins, job rdb_last_cow_size_bytes metric
redis_rdb_last_save_timestamp_seconds gauge cls, ip, instance, ins, job rdb_last_save_timestamp_seconds metric
redis_rejected_connections_total counter cls, ip, instance, ins, job rejected_connections_total metric
redis_repl_backlog_first_byte_offset gauge cls, ip, instance, ins, job repl_backlog_first_byte_offset metric
redis_repl_backlog_history_bytes gauge cls, ip, instance, ins, job repl_backlog_history_bytes metric
redis_repl_backlog_is_active gauge cls, ip, instance, ins, job repl_backlog_is_active metric
redis_replica_partial_resync_accepted gauge cls, ip, instance, ins, job replica_partial_resync_accepted metric
redis_replica_partial_resync_denied gauge cls, ip, instance, ins, job replica_partial_resync_denied metric
redis_replica_resyncs_full gauge cls, ip, instance, ins, job replica_resyncs_full metric
redis_replication_backlog_bytes gauge cls, ip, instance, ins, job replication_backlog_bytes metric
redis_second_repl_offset gauge cls, ip, instance, ins, job second_repl_offset metric
redis_sentinel_master_ckquorum_status gauge cls, ip, message, instance, ins, master_name, job Master ckquorum status
redis_sentinel_master_ok_sentinels gauge cls, ip, instance, ins, master_address, master_name, job The number of okay sentinels monitoring this master
redis_sentinel_master_ok_slaves gauge cls, ip, instance, ins, master_address, master_name, job The number of okay slaves of the master
redis_sentinel_master_sentinels gauge cls, ip, instance, ins, master_address, master_name, job The number of sentinels monitoring this master
redis_sentinel_master_setting_ckquorum gauge cls, ip, instance, ins, master_address, master_name, job Show the current ckquorum config for each master
redis_sentinel_master_setting_down_after_milliseconds gauge cls, ip, instance, ins, master_address, master_name, job Show the current down-after-milliseconds config for each master
redis_sentinel_master_setting_failover_timeout gauge cls, ip, instance, ins, master_address, master_name, job Show the current failover-timeout config for each master
redis_sentinel_master_setting_parallel_syncs gauge cls, ip, instance, ins, master_address, master_name, job Show the current parallel-syncs config for each master
redis_sentinel_master_slaves gauge cls, ip, instance, ins, master_address, master_name, job The number of slaves of the master
redis_sentinel_master_status gauge cls, ip, master_status, instance, ins, master_address, master_name, job Master status on Sentinel
redis_sentinel_masters gauge cls, ip, instance, ins, job The number of masters this sentinel is watching
redis_sentinel_running_scripts gauge cls, ip, instance, ins, job Number of scripts in execution right now
redis_sentinel_scripts_queue_length gauge cls, ip, instance, ins, job Queue of user scripts to execute
redis_sentinel_simulate_failure_flags gauge cls, ip, instance, ins, job Failures simulations
redis_sentinel_tilt gauge cls, ip, instance, ins, job Sentinel is in TILT mode
redis_slave_expires_tracked_keys gauge cls, ip, instance, ins, job slave_expires_tracked_keys metric
redis_slave_info gauge cls, ip, master_host, instance, read_only, ins, job, master_port Information about the Redis slave
redis_slave_priority gauge cls, ip, instance, ins, job slave_priority metric
redis_slave_repl_offset gauge cls, ip, master_host, instance, ins, job, master_port Slave replication offset
redis_slowlog_last_id gauge cls, ip, instance, ins, job Last id of slowlog
redis_slowlog_length gauge cls, ip, instance, ins, job Total slowlog
redis_start_time_seconds gauge cls, ip, instance, ins, job Start time of the Redis instance since unix epoch in seconds.
redis_target_scrape_request_errors_total counter cls, ip, instance, ins, job Errors in requests to the exporter
redis_total_error_replies counter cls, ip, instance, ins, job total_error_replies metric
redis_total_reads_processed counter cls, ip, instance, ins, job total_reads_processed metric
redis_total_system_memory_bytes gauge cls, ip, instance, ins, job total_system_memory_bytes metric
redis_total_writes_processed counter cls, ip, instance, ins, job total_writes_processed metric
redis_tracking_clients gauge cls, ip, instance, ins, job tracking_clients metric
redis_tracking_total_items gauge cls, ip, instance, ins, job tracking_total_items metric
redis_tracking_total_keys gauge cls, ip, instance, ins, job tracking_total_keys metric
redis_tracking_total_prefixes gauge cls, ip, instance, ins, job tracking_total_prefixes metric
redis_unexpected_error_replies counter cls, ip, instance, ins, job unexpected_error_replies metric
redis_up gauge cls, ip, instance, ins, job Information about the Redis instance
redis_uptime_in_seconds gauge cls, ip, instance, ins, job uptime_in_seconds metric
scrape_duration_seconds Unknown cls, ip, instance, ins, job N/A
scrape_samples_post_metric_relabeling Unknown cls, ip, instance, ins, job N/A
scrape_samples_scraped Unknown cls, ip, instance, ins, job N/A
scrape_series_added Unknown cls, ip, instance, ins, job N/A
up Unknown cls, ip, instance, ins, job N/A

10.2 - FAQ

Pigsty REDIS module frequently asked questions

ABORT due to existing redis instance

use redis_clean = true and redis_safeguard = false to force clean redis data

This happens when you run redis.yml to init a redis instance that is already running, and redis_clean is set to false.

If redis_clean is set to true (and the redis_safeguard is set to false, too), the redis.yml playbook will remove the existing redis instance and re-init it as a new one, which makes the redis.yml playbook fully idempotent.
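
For example, you can override both parameters at runtime with ansible extra vars instead of editing the inventory. This is a hedged sketch; the redis-test cluster selector is illustrative:

./redis.yml -l redis-test -e redis_clean=true -e redis_safeguard=false   # force re-init of the existing redis instances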


ABORT due to redis_safeguard enabled

This happens when removing a redis instance with redis_safeguard set to true.

You can disable redis_safeguard to remove the Redis instance; that is exactly what redis_safeguard is for.


How to add a single new redis instance on this node?

Use bin/redis-add <ip> <port> to deploy a new redis instance on the node.


How to remove a single redis instance from the node?

Use bin/redis-rm <ip> <port> to remove a single redis instance from the node.
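
For example, with illustrative IP and port values:

bin/redis-add 10.10.10.10 6379    # deploy a new redis instance on 10.10.10.10:6379
bin/redis-rm  10.10.10.10 6379    # remove the redis instance listening on 10.10.10.10:6379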

11 - Module: MONGO

Pigsty has built-in FerretDB support: a MongoDB-compatible middleware built on top of PostgreSQL.

Configuration | Administration | Playbook | Dashboard | Parameter


Overview

MongoDB was once a stunning technology, allowing developers to cast aside the “schema constraints” of relational databases and quickly build applications. However, over time, MongoDB abandoned its open-source nature, changing its license to SSPL, which made it unusable for many open-source projects and early commercial projects. Most MongoDB users actually do not need the advanced features provided by MongoDB, but they do need an easy-to-use open-source document database solution. To fill this gap, FerretDB was born.

PostgreSQL's JSON functionality is already well-rounded: binary JSONB storage, GIN indexing on arbitrary fields, a rich set of JSON processing functions, JSON PATH, and JSON Schema support; it has long been able to serve as a fully-featured, high-performance document database. However, providing equivalent functionality and direct emulation are not the same thing. FerretDB provides a smooth transition path to PostgreSQL for applications driven by MongoDB drivers.

Pigsty provided Docker-Compose support for FerretDB in 1.x, and has offered native deployment support since v2.3. As an optional feature, it greatly enriches the PostgreSQL ecosystem. The Pigsty community has become a partner of the FerretDB community, and we look forward to more opportunities to work together in the future.


Configuration

You have to define a Mongo (FerretDB) cluster in the config inventory before deploying it, using a handful of parameters.

Here’s an example to utilize the default single-node pg-meta cluster as MongoDB:

ferret:
  hosts: { 10.10.10.10: { mongo_seq: 1 } }
  vars:
    mongo_cluster: ferret
    mongo_pgurl: 'postgres://dbuser_meta:[email protected]:5432/meta'

The mongo_cluster and mongo_seq are required identity parameters, you also need mongo_pgurl to specify the underlying PostgreSQL URL for FerretDB.

You can also set up multiple FerretDB replicas and bind an L2 VIP to them, while utilizing the underlying HA Postgres cluster through Services:

mongo-test:
  hosts:
    10.10.10.11: { mongo_seq: 1 }
    10.10.10.12: { mongo_seq: 2 }
    10.10.10.13: { mongo_seq: 3 }
  vars:
    mongo_cluster: mongo-test
    mongo_pgurl: 'postgres://test:[email protected]:5436/test'
    vip_enabled: true
    vip_vrid: 128
    vip_address: 10.10.10.99
    vip_interface: eth1
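
Once the VIP is bound, clients can reach FerretDB through it. Here is a hedged example using the VIP address from the config above; replace <password> with the actual password of the test user:

mongosh 'mongodb://test:<password>@10.10.10.99:27017/test?authMechanism=PLAIN'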

Administration

Create Cluster

To create a defined mongo/ferretdb cluster, run the mongo.yml playbook:

./mongo.yml -l ferret    # install mongo/ferretdb on group 'ferret'

Since FerretDB stores all data in the underlying PostgreSQL, it is safe to run the playbook multiple times.

Remove Cluster

To remove a mongo/ferretdb cluster, run the mongo.yml playbook with the mongo_purge subtask and the mongo_purge flag:

./mongo.yml -e mongo_purge=true -t mongo_purge

FerretDB Connect

You can connect to FerretDB with any MongoDB driver using a MongoDB connection string; here we use the mongosh command-line tool as an example (see the FAQ below for installation):

mongosh 'mongodb://dbuser_meta:[email protected]:27017?authMechanism=PLAIN'
mongosh 'mongodb://test:[email protected]:27017/test?authMechanism=PLAIN'

Since Pigsty uses the scram-sha-256 as the default auth method, you must use the PLAIN auth mechanism to connect to FerretDB. Check FerretDB: authentication for details.

You can also use other PostgreSQL users to connect to FerretDB, just specify them in the connection string:

mongosh 'mongodb://dbuser_dba:[email protected]:27017?authMechanism=PLAIN'

Quick Start

You can connect to FerretDB, and pretend it is a MongoDB cluster.

$ mongosh 'mongodb://dbuser_meta:[email protected]:27017?authMechanism=PLAIN'

The MongoDB commands will be translated into SQL commands and run on the underlying PostgreSQL:

use test                            # CREATE SCHEMA test;
db.dropDatabase()                   # DROP SCHEMA test;
db.createCollection('posts')        # CREATE TABLE posts(_data JSONB,...)
db.posts.insert({                   # INSERT INTO posts VALUES(...);
    title: 'Post One',body: 'Body of post one',category: 'News',tags: ['news', 'events'],
    user: {name: 'John Doe',status: 'author'},date: Date()}
)
db.posts.find().limit(2).pretty()   # SELECT * FROM posts LIMIT 2;
db.posts.createIndex({ title: 1 })  # CREATE INDEX ON posts(_data->>'title');
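
You can also peek at what FerretDB created on the PostgreSQL side. This is a hedged check against the meta database from the configuration example above; replace <password> accordingly, and note the test schema only exists if you ran the commands above:

psql 'postgres://dbuser_meta:<password>@10.10.10.10:5432/meta' -c '\dt test.*'   # list tables created by ferretdb in schema "test"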

If you are not familiar with MongoDB, here is a quick start: Perform CRUD Operations with MongoDB Shell

To generate some load, you can run a simple benchmark with mongosh:

cat > benchmark.js <<'EOF'
const coll = "testColl";
const numDocs = 10000;

for (let i = 0; i < numDocs; i++) {  // insert
  db.getCollection(coll).insert({ num: i, name: "MongoDB Benchmark Test" });
}

for (let i = 0; i < numDocs; i++) {  // select
  db.getCollection(coll).find({ num: i });
}

for (let i = 0; i < numDocs; i++) {  // update
  db.getCollection(coll).update({ num: i }, { $set: { name: "Updated" } });
}

for (let i = 0; i < numDocs; i++) {  // delete
  db.getCollection(coll).deleteOne({ num: i });
}
EOF

mongosh 'mongodb://dbuser_meta:[email protected]:27017?authMechanism=PLAIN' benchmark.js

You can check the supported Mongo commands at ferretdb: supported commands. There may be some differences between MongoDB and FerretDB; check ferretdb: differences for details. They are usually not a big deal for typical usage.


Playbook

There is a built-in playbook, mongo.yml, for installing the FerretDB cluster, but you have to define the cluster in the inventory first.

mongo.yml

mongo.yml: Install MongoDB/FerretDB on the target host.

This playbook consists of the following sub-tasks:

  • mongo_check : check mongo identity
  • mongo_dbsu : create os user mongod
  • mongo_install : install mongo/ferretdb rpm
  • mongo_purge : purge mongo/ferretdb
  • mongo_config : config mongo/ferretdb
    • mongo_cert : issue mongo/ferretdb ssl certs
  • mongo_launch : launch mongo/ferretdb service
  • mongo_register : register mongo/ferretdb to prometheus
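
You can run a subset of these subtasks with ansible tags. Here is a hedged sketch that re-renders the config and restarts FerretDB on the ferret group, assuming the subtask names above can be used as tags:

./mongo.yml -l ferret -t mongo_config,mongo_launch   # re-config & restart ferretdb on group 'ferret'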

Dashboard

There is currently one dashboard for the MONGO module.

Mongo Overview

Mongo Overview: Overview of a Mongo/FerretDB cluster

mongo-overview.jpg


Parameter

There are 9 parameters in the MONGO module.

Parameter Type Level Comment
mongo_seq int I mongo instance identifier, REQUIRED
mongo_cluster string C mongo cluster name, MONGO by default
mongo_pgurl pgurl C/I underlying postgres URL for ferretdb
mongo_ssl_enabled bool C mongo/ferretdb ssl enabled, false by default
mongo_listen ip C mongo listen address, empty for all addr
mongo_port port C mongo service port, 27017 by default
mongo_ssl_port port C mongo tls listen port, 27018 by default
mongo_exporter_port port C mongo exporter port, 9216 by default
mongo_extra_vars string C extra environment variables for MONGO server
# mongo_cluster:        #CLUSTER  # mongo cluster name, required identity parameter
# mongo_seq: 0          #INSTANCE # mongo instance seq number, required identity parameter
# mongo_pgurl: 'postgres:///'     # mongo/ferretdb underlying postgresql url, required
mongo_ssl_enabled: false          # mongo/ferretdb ssl enabled, false by default
mongo_listen: ''                  # mongo/ferretdb listen address, '' for all addr
mongo_port: 27017                 # mongo/ferretdb listen port, 27017 by default
mongo_ssl_port: 27018             # mongo/ferretdb tls listen port, 27018 by default
mongo_exporter_port: 9216         # mongo/ferretdb exporter port, 9216 by default
mongo_extra_vars: ''              # extra environment variables for mongo/ferretdb

11.1 - Metrics

Pigsty MONGO module metric list

MONGO module has 54 available metrics

Metric Name Type Labels Description
ferretdb_client_accepts_total Unknown error, cls, ip, ins, instance, job N/A
ferretdb_client_duration_seconds_bucket Unknown error, le, cls, ip, ins, instance, job N/A
ferretdb_client_duration_seconds_count Unknown error, cls, ip, ins, instance, job N/A
ferretdb_client_duration_seconds_sum Unknown error, cls, ip, ins, instance, job N/A
ferretdb_client_requests_total Unknown cls, ip, ins, opcode, instance, command, job N/A
ferretdb_client_responses_total Unknown result, argument, cls, ip, ins, opcode, instance, command, job N/A
ferretdb_postgresql_metadata_databases gauge cls, ip, ins, instance, job The current number of database in the registry.
ferretdb_postgresql_pool_size gauge cls, ip, ins, instance, job The current number of pools.
ferretdb_up gauge cls, version, commit, ip, ins, dirty, telemetry, package, update_available, uuid, instance, job, branch, debug FerretDB instance state.
go_gc_duration_seconds summary cls, ip, ins, instance, quantile, job A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown cls, ip, ins, instance, job N/A
go_gc_duration_seconds_sum Unknown cls, ip, ins, instance, job N/A
go_goroutines gauge cls, ip, ins, instance, job Number of goroutines that currently exist.
go_info gauge cls, version, ip, ins, instance, job Information about the Go environment.
go_memstats_alloc_bytes gauge cls, ip, ins, instance, job Number of bytes allocated and still in use.
go_memstats_alloc_bytes_total counter cls, ip, ins, instance, job Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge cls, ip, ins, instance, job Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter cls, ip, ins, instance, job Total number of frees.
go_memstats_gc_sys_bytes gauge cls, ip, ins, instance, job Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge cls, ip, ins, instance, job Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge cls, ip, ins, instance, job Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge cls, ip, ins, instance, job Number of heap bytes that are in use.
go_memstats_heap_objects gauge cls, ip, ins, instance, job Number of allocated objects.
go_memstats_heap_released_bytes gauge cls, ip, ins, instance, job Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge cls, ip, ins, instance, job Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge cls, ip, ins, instance, job Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter cls, ip, ins, instance, job Total number of pointer lookups.
go_memstats_mallocs_total counter cls, ip, ins, instance, job Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge cls, ip, ins, instance, job Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge cls, ip, ins, instance, job Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge cls, ip, ins, instance, job Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge cls, ip, ins, instance, job Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge cls, ip, ins, instance, job Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge cls, ip, ins, instance, job Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge cls, ip, ins, instance, job Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge cls, ip, ins, instance, job Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge cls, ip, ins, instance, job Number of bytes obtained from system.
go_threads gauge cls, ip, ins, instance, job Number of OS threads created.
mongo_up Unknown cls, ip, ins, instance, job N/A
process_cpu_seconds_total counter cls, ip, ins, instance, job Total user and system CPU time spent in seconds.
process_max_fds gauge cls, ip, ins, instance, job Maximum number of open file descriptors.
process_open_fds gauge cls, ip, ins, instance, job Number of open file descriptors.
process_resident_memory_bytes gauge cls, ip, ins, instance, job Resident memory size in bytes.
process_start_time_seconds gauge cls, ip, ins, instance, job Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge cls, ip, ins, instance, job Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge cls, ip, ins, instance, job Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_errors_total counter job, cls, ip, ins, instance, cause Total number of internal errors encountered by the promhttp metric handler.
promhttp_metric_handler_requests_in_flight gauge cls, ip, ins, instance, job Current number of scrapes being served.
promhttp_metric_handler_requests_total counter job, cls, ip, ins, instance, code Total number of scrapes by HTTP status code.
scrape_duration_seconds Unknown cls, ip, ins, instance, job N/A
scrape_samples_post_metric_relabeling Unknown cls, ip, ins, instance, job N/A
scrape_samples_scraped Unknown cls, ip, ins, instance, job N/A
scrape_series_added Unknown cls, ip, ins, instance, job N/A
up Unknown cls, ip, ins, instance, job N/A

11.2 - FAQ

Pigsty FerretDB / MONGO module frequently asked questions

Install mongosh

cat > /etc/yum.repos.d/mongo.repo <<EOF
[mongodb-org-7.0]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/7.0/x86_64/
gpgcheck=0
enabled=1
EOF

yum install -y mongodb-mongosh

# or just install via rpm & links
rpm -ivh https://mirrors.tuna.tsinghua.edu.cn/mongodb/yum/el7/RPMS/mongodb-mongosh-1.9.1.x86_64.rpm

12 - Module: DOCKER

Docker daemon service, which allows you to pull up stateless software in addition to PostgreSQL.

Deploy docker on Pigsty managed nodes: Configuration | Administration | Playbook | Dashboard | Parameter


Concept

Docker is a popular container runtime that provides a standardized software delivery solution.


Configuration

The DOCKER module is different from other modules: it does not require pre-configuration to install & enable.

Just run the docker.yml playbook on any Pigsty managed node.

But if you wish to add the docker daemon as a prometheus monitoring target, you have to set the docker_enabled parameter on those nodes.


Administration

Using Mirrors

Consider using a docker registry mirror; you can log in to a registry with:

docker login quay.io    # entering your username & password
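
To use a registry mirror, you can set the docker_registry_mirrors parameter and re-run the playbook. This is a hedged sketch; the mirror URL and the infra selector are illustrative:

./docker.yml -l infra -e '{"docker_registry_mirrors": ["https://registry.example.com"]}'   # re-render daemon config with a mirror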

Monitoring

Docker monitoring is part of the NODE module's responsibility, which registers the docker target to prometheus.

You have to set docker_enabled on those nodes, then re-register them with:

./node.yml -l <selector> -t register_prometheus

Compose Template

Pigsty has a series of built-in docker compose templates.


Playbook

There is one playbook to install docker on designated nodes:

docker.yml

The playbook docker.yml will install docker on the given nodes.

Subtasks of this playbook:

  • docker_install : install docker on nodes
  • docker_admin : add user to docker admin group
  • docker_config : generate docker daemon config
  • docker_launch : launch docker daemon systemd service
  • docker_image : load docker images from /tmp/docker/*.tgz if exists
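
These subtasks can also be run selectively with ansible tags. Here is a hedged sketch that only refreshes the daemon config and restarts docker, assuming the subtask names above can be used as tags:

./docker.yml -l infra -t docker_config,docker_launch   # re-config & restart the docker daemon on group 'infra'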

Parameter

There are 4 parameters about DOCKER module.

Parameter Section Type Level Comment
docker_enabled DOCKER bool C enable docker on this node?
docker_cgroups_driver DOCKER enum C docker cgroup fs driver: cgroupfs,systemd
docker_registry_mirrors DOCKER string[] C docker registry mirror list
docker_image_cache DOCKER path C docker image cache dir, /tmp/docker by default

12.1 - Metrics

Pigsty Docker module metric list

DOCKER module has 123 available metrics

Metric Name Type Labels Description
builder_builds_failed_total counter ip, cls, reason, ins, job, instance Number of failed image builds
builder_builds_triggered_total counter ip, cls, ins, job, instance Number of triggered image builds
docker_up Unknown ip, cls, ins, job, instance N/A
engine_daemon_container_actions_seconds_bucket Unknown ip, cls, ins, job, instance, le, action N/A
engine_daemon_container_actions_seconds_count Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_container_actions_seconds_sum Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_container_states_containers gauge ip, cls, ins, job, instance, state The count of containers in various states
engine_daemon_engine_cpus_cpus gauge ip, cls, ins, job, instance The number of cpus that the host system of the engine has
engine_daemon_engine_info gauge ip, cls, architecture, ins, job, instance, os_version, kernel, version, graphdriver, os, daemon_id, commit, os_type The information related to the engine and the OS it is running on
engine_daemon_engine_memory_bytes gauge ip, cls, ins, job, instance The number of bytes of memory that the host system of the engine has
engine_daemon_events_subscribers_total gauge ip, cls, ins, job, instance The number of current subscribers to events
engine_daemon_events_total counter ip, cls, ins, job, instance The number of events logged
engine_daemon_health_checks_failed_total counter ip, cls, ins, job, instance The total number of failed health checks
engine_daemon_health_check_start_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
engine_daemon_health_check_start_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
engine_daemon_health_check_start_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
engine_daemon_health_checks_total counter ip, cls, ins, job, instance The total number of health checks
engine_daemon_host_info_functions_seconds_bucket Unknown ip, cls, ins, job, instance, le, function N/A
engine_daemon_host_info_functions_seconds_count Unknown ip, cls, ins, job, instance, function N/A
engine_daemon_host_info_functions_seconds_sum Unknown ip, cls, ins, job, instance, function N/A
engine_daemon_image_actions_seconds_bucket Unknown ip, cls, ins, job, instance, le, action N/A
engine_daemon_image_actions_seconds_count Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_image_actions_seconds_sum Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_network_actions_seconds_bucket Unknown ip, cls, ins, job, instance, le, action N/A
engine_daemon_network_actions_seconds_count Unknown ip, cls, ins, job, instance, action N/A
engine_daemon_network_actions_seconds_sum Unknown ip, cls, ins, job, instance, action N/A
etcd_debugging_snap_save_marshalling_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_debugging_snap_save_marshalling_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_debugging_snap_save_marshalling_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_debugging_snap_save_total_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_debugging_snap_save_total_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_debugging_snap_save_total_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_disk_wal_fsync_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_disk_wal_fsync_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_disk_wal_fsync_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_disk_wal_write_bytes_total gauge ip, cls, ins, job, instance Total number of bytes written in WAL.
etcd_snap_db_fsync_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_snap_db_fsync_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_snap_db_fsync_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_snap_db_save_total_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_snap_db_save_total_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_snap_db_save_total_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
etcd_snap_fsync_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
etcd_snap_fsync_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
etcd_snap_fsync_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
go_gc_duration_seconds summary ip, cls, ins, job, instance, quantile A summary of the pause duration of garbage collection cycles.
go_gc_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
go_gc_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
go_goroutines gauge ip, cls, ins, job, instance Number of goroutines that currently exist.
go_info gauge ip, cls, ins, job, version, instance Information about the Go environment.
go_memstats_alloc_bytes counter ip, cls, ins, job, instance Total number of bytes allocated, even if freed.
go_memstats_alloc_bytes_total counter ip, cls, ins, job, instance Total number of bytes allocated, even if freed.
go_memstats_buck_hash_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used by the profiling bucket hash table.
go_memstats_frees_total counter ip, cls, ins, job, instance Total number of frees.
go_memstats_gc_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for garbage collection system metadata.
go_memstats_heap_alloc_bytes gauge ip, cls, ins, job, instance Number of heap bytes allocated and still in use.
go_memstats_heap_idle_bytes gauge ip, cls, ins, job, instance Number of heap bytes waiting to be used.
go_memstats_heap_inuse_bytes gauge ip, cls, ins, job, instance Number of heap bytes that are in use.
go_memstats_heap_objects gauge ip, cls, ins, job, instance Number of allocated objects.
go_memstats_heap_released_bytes gauge ip, cls, ins, job, instance Number of heap bytes released to OS.
go_memstats_heap_sys_bytes gauge ip, cls, ins, job, instance Number of heap bytes obtained from system.
go_memstats_last_gc_time_seconds gauge ip, cls, ins, job, instance Number of seconds since 1970 of last garbage collection.
go_memstats_lookups_total counter ip, cls, ins, job, instance Total number of pointer lookups.
go_memstats_mallocs_total counter ip, cls, ins, job, instance Total number of mallocs.
go_memstats_mcache_inuse_bytes gauge ip, cls, ins, job, instance Number of bytes in use by mcache structures.
go_memstats_mcache_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for mcache structures obtained from system.
go_memstats_mspan_inuse_bytes gauge ip, cls, ins, job, instance Number of bytes in use by mspan structures.
go_memstats_mspan_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for mspan structures obtained from system.
go_memstats_next_gc_bytes gauge ip, cls, ins, job, instance Number of heap bytes when next garbage collection will take place.
go_memstats_other_sys_bytes gauge ip, cls, ins, job, instance Number of bytes used for other system allocations.
go_memstats_stack_inuse_bytes gauge ip, cls, ins, job, instance Number of bytes in use by the stack allocator.
go_memstats_stack_sys_bytes gauge ip, cls, ins, job, instance Number of bytes obtained from system for stack allocator.
go_memstats_sys_bytes gauge ip, cls, ins, job, instance Number of bytes obtained from system.
go_threads gauge ip, cls, ins, job, instance Number of OS threads created.
logger_log_entries_size_greater_than_buffer_total counter ip, cls, ins, job, instance Number of log entries which are larger than the log buffer
logger_log_read_operations_failed_total counter ip, cls, ins, job, instance Number of log reads from container stdio that failed
logger_log_write_operations_failed_total counter ip, cls, ins, job, instance Number of log write operations that failed
process_cpu_seconds_total counter ip, cls, ins, job, instance Total user and system CPU time spent in seconds.
process_max_fds gauge ip, cls, ins, job, instance Maximum number of open file descriptors.
process_open_fds gauge ip, cls, ins, job, instance Number of open file descriptors.
process_resident_memory_bytes gauge ip, cls, ins, job, instance Resident memory size in bytes.
process_start_time_seconds gauge ip, cls, ins, job, instance Start time of the process since unix epoch in seconds.
process_virtual_memory_bytes gauge ip, cls, ins, job, instance Virtual memory size in bytes.
process_virtual_memory_max_bytes gauge ip, cls, ins, job, instance Maximum amount of virtual memory available in bytes.
promhttp_metric_handler_requests_in_flight gauge ip, cls, ins, job, instance Current number of scrapes being served.
promhttp_metric_handler_requests_total counter ip, cls, ins, job, instance, code Total number of scrapes by HTTP status code.
scrape_duration_seconds Unknown ip, cls, ins, job, instance N/A
scrape_samples_post_metric_relabeling Unknown ip, cls, ins, job, instance N/A
scrape_samples_scraped Unknown ip, cls, ins, job, instance N/A
scrape_series_added Unknown ip, cls, ins, job, instance N/A
swarm_dispatcher_scheduling_delay_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_dispatcher_scheduling_delay_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_dispatcher_scheduling_delay_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_manager_configs_total gauge ip, cls, ins, job, instance The number of configs in the cluster object store
swarm_manager_leader gauge ip, cls, ins, job, instance Indicates if this manager node is a leader
swarm_manager_networks_total gauge ip, cls, ins, job, instance The number of networks in the cluster object store
swarm_manager_nodes gauge ip, cls, ins, job, instance, state The number of nodes
swarm_manager_secrets_total gauge ip, cls, ins, job, instance The number of secrets in the cluster object store
swarm_manager_services_total gauge ip, cls, ins, job, instance The number of services in the cluster object store
swarm_manager_tasks_total gauge ip, cls, ins, job, instance, state The number of tasks in the cluster object store
swarm_node_manager gauge ip, cls, ins, job, instance Whether this node is a manager or not
swarm_raft_snapshot_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_raft_snapshot_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_raft_snapshot_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_raft_transaction_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_raft_transaction_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_raft_transaction_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_batch_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_batch_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_batch_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_lookup_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_lookup_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_lookup_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_memory_store_lock_duration_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_memory_store_lock_duration_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_memory_store_lock_duration_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_read_tx_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_read_tx_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_read_tx_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
swarm_store_write_tx_latency_seconds_bucket Unknown ip, cls, ins, job, instance, le N/A
swarm_store_write_tx_latency_seconds_count Unknown ip, cls, ins, job, instance N/A
swarm_store_write_tx_latency_seconds_sum Unknown ip, cls, ins, job, instance N/A
up Unknown ip, cls, ins, job, instance N/A

12.2 - FAQ

Pigsty Docker module frequently asked questions

How to install Docker ?

Install it with the docker.yml playbook, targeting any node managed by Pigsty.

./docker.yml -l <selector>    

13 - Tasks

Look up common tasks and how to perform them using a short sequence of steps

14 - Application

Docker compose templates for software using PostgreSQL, and data visualization apps

Lots of software uses PostgreSQL. Pigsty provides docker-compose templates for some popular ones.

You can easily launch stateless software with docker and use an external HA PostgreSQL for higher availability & durability.

Docker is not installed by default. You can install docker with the docker.yml playbook, e.g.: ./docker.yml -l infra

Available software and their docker-compose templates can be found in pigsty/app.
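
Here is a hedged sketch of launching one of the bundled templates; the pgweb directory name is illustrative, so check pigsty/app for what is actually available:

cd ~/pigsty/app/pgweb        # enter an app template directory containing a docker-compose.yml (name is an example)
docker compose up -d         # launch the stateless app, pointing it at your external PostgreSQL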

pigsty-app.jpg


PostgreSQL Administration

Use more advanced tools to manage PostgreSQL instances / clusters.

  • PgAdmin4: A GUI tool for managing PostgreSQL instances.
  • ByteBase: A GUI IaC tool for PostgreSQL schema migration.
  • PGWeb: A web-based PostgreSQL client for browsing data and running queries.
  • SchemaSPY: Generates detailed visual reports of database schemas.
  • Pgbadger: Generate PostgreSQL summary report from log samples.

Application Development

Scaffold your application with PostgreSQL and its ecosystem.

  • Supabase: Supabase is an open source Firebase alternative based on PostgreSQL
  • FerretDB: FerretDB, a truly open source MongoDB alternative in PostgreSQL.
  • EdgeDB: EdgeDB, open source graph-like database based on PostgreSQL
  • PostgREST: PostgREST, serve a RESTful API from any Postgres database automatically.
  • Kong: Kong, a scalable, open source API Gateway with Redis/PostgreSQL/OpenResty
  • DuckDB: DuckDB, an in-process SQL OLAP DBMS that works well with PostgreSQL

Business Software

Launch business software backed by PostgreSQL with ease.

  • Wiki.js: Wiki.js, the most powerful and extensible open source wiki software
  • Gitea: Gitea, a painless self-hosted git service
  • NocoDB: NocoDB, Open source AirTable alternative.
  • Gitlab: open-source code hosting platform.
  • Harbor: open-source container image registry
  • Jira: project management platform.
  • Confluence: knowledge hosting platform.
  • Odoo: open-source ERP
  • Mastodon: PG-based social network
  • Discourse: open-source forum based on PG and Redis
  • Jupyter Lab: A battery-included Python lab environment for data analysis and processing.
  • Grafana: use postgres as backend storage
  • Promscale: use postgres/timescaledb as prometheus metrics storage

Visualization

Perform data visualization with PostgreSQL, Grafana & Echarts.

  • isd: noaa weather data visualization: github.com/Vonng/isd, Demo
  • pglog: PostgreSQL CSVLOG sample analyzer. Demo
  • covid: Covid-19 data visualization
  • dbeng: Database popularity visualization
  • price: RDS, ECS price comparison

14.1 - Data Visualization Applet

Perform data analysis and visualization with the Pigsty Grafana & Echarts toolbox

Applet Structure

An Applet is a self-contained, small data application that runs on Pigsty's infrastructure.

A Pigsty application typically includes at least one (or all) of the following:

  • Graphical interface (Grafana dashboard definitions), placed in the ui directory
  • Data definitions (PostgreSQL DDL files), placed in the sql directory
  • Data files (various resources and files to be downloaded), placed in the data directory
  • Logic scripts (scripts performing various tasks), placed in the bin directory

Pigsty ships with several sample applications by default:

  • pglog, analyze PostgreSQL CSV log samples.
  • covid, visualize WHO COVID-19 data and browse per-country pandemic statistics.
  • isd, NOAA ISD, query meteorological observation records since 1901 from 30,000 surface weather stations worldwide.

Application Structure

A Pigsty application provides an installation script in its root directory: install, or related shortcuts. You need to run the installation as the admin user on the admin node. The install script detects the current environment (it obtains METADB_URL, PIGSTY_HOME, GRAFANA_ENDPOINT, etc.) to perform the installation.

Typically, dashboards tagged with APP are listed in the App drop-down menu of the Pigsty Grafana home page navigation, and dashboards tagged with APP and Overview are also listed in the home page dashboard navigation.

You can download applications bundled with base data from https://github.com/Vonng/pigsty/releases/download/v1.5.1/app.tgz and install them.
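
Here is a hedged sketch of downloading the bundle and installing one of the sample applets; the extraction path and the covid directory are assumptions, and the install script must be run as the admin user on the admin node:

curl -L https://github.com/Vonng/pigsty/releases/download/v1.5.1/app.tgz -o /tmp/app.tgz   # download the bundled apps
tar -xf /tmp/app.tgz -C ~/pigsty/                                                          # assumed layout: extracts the app/ directory
cd ~/pigsty/app/covid && ./install                                                         # run the applet's install script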

14.1.1 - Analyse CSVLOG Sample with the built-in PGLOG

Analyse CSVLOG Sample with the built-in PGLOG

PGLOG is a sample application bundled with Pigsty. It uses the pglog.sample table in MetaDB as its fixed data source: you just need to load logs into this table, then visit the related dashboards.

Pigsty provides some handy commands for fetching CSV logs and loading them into the sample table. The following shortcut commands are available on the meta node by default:

catlog  [node=localhost]  [date=today]   # print CSV log to stdout
pglog                                    # load CSVLOG from stdin
pglog12                                  # load CSVLOG in PG12 format
pglog13                                  # load CSVLOG in PG13 format
pglog14                                  # load CSVLOG in PG14 format (=pglog)

catlog | pglog                       # analyze today's log of the current node
catlog node-1 '2021-07-15' | pglog   # analyze the csvlog of node-1 on 2021-07-15

Next, you can visit the following dashboards to view the sample log analysis interface.

pglog-overview.jpg

pglog-session.jpg

The catlog command fetches the CSV database log of a specific date from a specific node and writes it to stdout.

By default, catlog fetches today's log from the current node; you can specify the node and date with arguments.

By combining catlog with pglog, you can quickly fetch database CSV logs for analysis.

catlog | pglog                       # analyze today's log of the current node
catlog node-1 '2021-07-15' | pglog   # analyze the csvlog of node-1 on 2021-07-15

14.1.2 - NOAA ISD Station

Fetch, Parse, Analyze, and Visualize Integrated Surface Weather Station Dataset

Including 30,000 meteorological stations, with daily and sub-hourly observation records from 1900 to 2023. https://github.com/Vonng/isd

ISD Overview

It is recommended to use it with Pigsty, the battery-included PostgreSQL distribution with Grafana & Echarts for visualization, which will set up everything for you with make all.

Otherwise, you will have to provide your own PostgreSQL instance and set up the grafana dashboards manually.


Quick Start

Clone this repo

git clone https://github.com/Vonng/isd.git; cd isd;

Prepare a PostgreSQL Instance

Export PGURL in your environment to specify the target postgres database:

# the default PGURL for pigsty's meta database, change that accordingly
export PGURL=postgres://dbuser_dba:[email protected]:5432/meta?sslmode=disable
psql "${PGURL}" -c 'SELECT 1'  # check if connection is ok

then init database schema with:

make sql              # setup postgres schema on target database

Get isd station metadata

The basic station metadata can be downloaded and loaded with:

make reload-station   # equivalent to get-station + load-station

Fetch and load isd.daily

The isd.daily dataset is organized as yearly tarball files. You can download the raw data from NOAA and parse it with the isd parser:

make get-parser       # download parser binary from github, you can just build with: make build
make reload-daily     # download and reload latest daily data and re-calculates monthly/yearly data

Load Parsed Stable CSV Data

Or just load the pre-parsed stable part from GitHub, which is well-formatted CSV that does not require the isd parser.

make get-stable       # download stable isd.daily dataset from Github
make load-stable      # load downloaded stable isd.daily dataset into database

More Data

Two parts of the isd dataset need to be regularly updated: the station metadata & isd.daily of the latest year. You can reload them with:

make reload           # reload-station + reload-daily

You can download and load isd.daily in a specific year with:

bin/get-daily  2022                   # get daily observation summary of a specific year (1900-2023)
bin/load-daily "${PGURL}" 2022    # load daily data of a specific year 

You can also download and load isd.hourly in a specific year with:

bin/get-hourly  2022                  # get hourly observation record of a specific year (1900-2023)
bin/load-hourly "${PGURL}" 2022   # load hourly data of a specific year 

Data

Dataset

There are four official datasets:

Dataset Sample Document Comments
ISD Hourly isd-hourly-sample.csv isd-hourly-document.pdf (Sub)Hour observation records
ISD Daily isd-daily-sample.csv isd-daily-format.txt Daily summary
ISD Monthly N/A isd-gsom-document.pdf Not used, generated from isd.daily
ISD Yearly N/A isd-gsoy-document.pdf Not used, generated from isd.daily

Daily Dataset

  • Tarball size 2.8GB (until 2023-06-24)
  • Table size 24GB, Index size 6GB, Total size in PostgreSQL = 30GB
  • If timescaledb compression is used, it will be compressed to around 4.5GB

Hourly dataset

  • Tarball size 117GB
  • Table size 1TB+ , Index size 600GB+

Schema

CREATE TABLE isd.station
(
    station    VARCHAR(12) PRIMARY KEY,
    usaf       VARCHAR(6) GENERATED ALWAYS AS (substring(station, 1, 6)) STORED,
    wban       VARCHAR(5) GENERATED ALWAYS AS (substring(station, 7, 5)) STORED,
    name       VARCHAR(32),
    country    VARCHAR(2),
    province   VARCHAR(2),
    icao       VARCHAR(4),
    location   GEOMETRY(POINT),
    longitude  NUMERIC GENERATED ALWAYS AS (Round(ST_X(location)::NUMERIC, 6)) STORED,
    latitude   NUMERIC GENERATED ALWAYS AS (Round(ST_Y(location)::NUMERIC, 6)) STORED,
    elevation  NUMERIC,
    period     daterange,
    begin_date DATE GENERATED ALWAYS AS (lower(period)) STORED,
    end_date   DATE GENERATED ALWAYS AS (upper(period)) STORED
);
CREATE TABLE IF NOT EXISTS isd.daily
(
    station     VARCHAR(12) NOT NULL, -- station number 6USAF+5WBAN
    ts          DATE        NOT NULL, -- observation date
    -- temperature & dew point
    temp_mean   NUMERIC(3, 1),        -- mean temperature ℃
    temp_min    NUMERIC(3, 1),        -- min temperature ℃
    temp_max    NUMERIC(3, 1),        -- max temperature ℃
    dewp_mean   NUMERIC(3, 1),        -- mean dew point ℃
    -- pressure
    slp_mean    NUMERIC(5, 1),        -- sea level pressure (hPa)
    stp_mean    NUMERIC(5, 1),        -- station pressure (hPa)
    -- visible distance
    vis_mean    NUMERIC(6),           -- visible distance (m)
    -- wind speed
    wdsp_mean   NUMERIC(4, 1),        -- average wind speed (m/s)
    wdsp_max    NUMERIC(4, 1),        -- max wind speed (m/s)
    gust        NUMERIC(4, 1),        -- max wind gust (m/s) 
    -- precipitation / snow depth
    prcp_mean   NUMERIC(5, 1),        -- precipitation (mm)
    prcp        NUMERIC(5, 1),        -- rectified precipitation (mm)
    sndp        NUMERIC(5, 1),        -- snow depth (mm)
    -- FRSHTT (Fog/Rain/Snow/Hail/Thunder/Tornado)
    is_foggy    BOOLEAN,              -- (F)og
    is_rainy    BOOLEAN,              -- (R)ain or Drizzle
    is_snowy    BOOLEAN,              -- (S)now or pellets
    is_hail     BOOLEAN,              -- (H)ail
    is_thunder  BOOLEAN,              -- (T)hunder
    is_tornado  BOOLEAN,              -- (T)ornado or Funnel Cloud
    -- record count
    temp_count  SMALLINT,             -- record count for temp
    dewp_count  SMALLINT,             -- record count for dew point
    slp_count   SMALLINT,             -- record count for sea level pressure
    stp_count   SMALLINT,             -- record count for station pressure
    wdsp_count  SMALLINT,             -- record count for wind speed
    visib_count SMALLINT,             -- record count for visible distance
    -- temp marks
    temp_min_f  BOOLEAN,              -- aggregate min temperature
    temp_max_f  BOOLEAN,              -- aggregate max temperature
    prcp_flag   CHAR,                 -- precipitation flag: ABCDEFGHI
    PRIMARY KEY (station, ts)
); -- PARTITION BY RANGE (ts);

ISD Hourly

CREATE TABLE IF NOT EXISTS isd.hourly
(
    station    VARCHAR(12) NOT NULL, -- station id
    ts         TIMESTAMP   NOT NULL, -- timestamp
    -- air
    temp       NUMERIC(3, 1),        -- [-93.2,+61.8]
    dewp       NUMERIC(3, 1),        -- [-98.2,+36.8]
    slp        NUMERIC(5, 1),        -- [8600,10900]
    stp        NUMERIC(5, 1),        -- [4500,10900]
    vis        NUMERIC(6),           -- [0,160000]
    -- wind
    wd_angle   NUMERIC(3),           -- [1,360]
    wd_speed   NUMERIC(4, 1),        -- [0,90]
    wd_gust    NUMERIC(4, 1),        -- [0,110]
    wd_code    VARCHAR(1),           -- code that denotes the character of the WIND-OBSERVATION.
    -- cloud
    cld_height NUMERIC(5),           -- [0,22000]
    cld_code   VARCHAR(2),           -- cloud code
    -- water
    sndp       NUMERIC(5, 1),        -- mm snow
    prcp       NUMERIC(5, 1),        -- mm precipitation
    prcp_hour  NUMERIC(2),           -- precipitation duration in hour
    prcp_code  VARCHAR(1),           -- precipitation type code
    -- sky
    mw_code    VARCHAR(2),           -- manual weather observation code
    aw_code    VARCHAR(2),           -- auto weather observation code
    pw_code    VARCHAR(1),           -- weather code of past period of time
    pw_hour    NUMERIC(2),           -- duration of pw_code period
    -- misc
    -- remark     TEXT,
    -- eqd        TEXT,
    data       JSONB                 -- extra data
) PARTITION BY RANGE (ts);

Parser

There are two parsers: isdd and isdh. They take the original NOAA yearly tarballs as input and generate CSV as output, which can be consumed directly by the PostgreSQL COPY command.

NAME
        isd -- Integrated Surface Dataset Parser

SYNOPSIS
        isd daily   [-i <input|stdin>] [-o <output|stdout>] [-v]
        isd hourly  [-i <input|stdin>] [-o <output|stdout>] [-v] [-d raw|ts-first|hour-first]

DESCRIPTION
        The isd program takes NOAA ISD daily/hourly raw tarball data as input
        and generates parsed data in CSV format as output. It also works in pipe mode:

        cat data/daily/2023.tar.gz | bin/isd daily -v | psql ${PGURL} -AXtwqc "COPY isd.daily FROM STDIN CSV;" 

        isd daily  -v -i data/daily/2023.tar.gz  | psql ${PGURL} -AXtwqc "COPY isd.daily FROM STDIN CSV;"
        isd hourly -v -i data/hourly/2023.tar.gz | psql ${PGURL} -AXtwqc "COPY isd.hourly FROM STDIN CSV;"

OPTIONS
        -i  <input>     input file, stdin by default
        -o  <output>    output file, stdout by default
        -p  <profpath>  pprof file path, enable if specified
        -d              de-duplicate rows for hourly dataset (raw, ts-first, hour-first)
        -v              verbose mode
        -h              print help
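
For example, a hypothetical one-liner based on the synopsis above, parsing an hourly tarball with timestamp-first de-duplication before loading:

bin/isd hourly -v -d ts-first -i data/hourly/2023.tar.gz | psql "${PGURL}" -AXtwqc "COPY isd.hourly FROM STDIN CSV;"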

UI

ISD Overview

Show all stations on a world map.

isd-overview.jpg

ISD Country

Show all stations within a country.

isd-country.jpg

ISD Station

Visualize station metadata and daily/monthly/yearly summaries.

ISD Station Dashboard

isd-station.jpg

ISD Detail

Visualize hourly observation raw metrics.

ISD Detail Dashboard

isd-detail.jpg

License

MIT License

14.1.3 - WHO COVID-19 Data Analysis

A Pigsty built-in application that visualizes WHO COVID-19 data

Online demo: https://demo.pigsty.cc/d/covid-overview

Installation

Enter the application directory on the admin node and run make to complete the installation.

make            # if local data is available
make all        # full installation, download data from the WHO website
make reload     # re-download and load the latest data

Some other subtasks:

make reload     # download latest data and pour it again
make ui         # install grafana dashboards
make sql        # install database schemas
make download   # download latest data
make load       # load downloaded data into database
make reload     # download latest data and pour it into database

Dashboards

14.2 - Docker Compose Template

Software and tools that use PostgreSQL can be managed by the docker daemon

You can use Docker to quickly deploy and launch software applications. Inside the containers, you can access the PostgreSQL/Redis databases deployed on the host directly via connection strings.

  • PgAdmin4: a GUI tool for managing PostgreSQL database instances
  • PGWeb: a browser-based tool for exploring PostgreSQL data
  • PostgREST: a tool that automatically generates backend API services from a PG database schema
  • ByteBase: a GUI tool for making PostgreSQL schema changes
  • Jupyter Lab: an out-of-the-box Python environment for data analysis and processing

You can also use Docker to launch some out-of-the-box open-source services:

  • Gitea: lightweight code hosting service
  • Minio: S3-compatible simple object storage service
  • Wiki.js: full-featured personal wiki site
  • Casdoor: single sign-on (SSO) solution

You can also use Docker to run some throwaway command-line tools, for example:

  • SchemaSPY: generate detailed visual reports of database schemas
  • Pgbadger: generate database log reports

You can also use Docker to launch some larger open-source software services:

  • Gitlab: open-source code hosting platform
  • Harbor: open-source container image registry
  • Jira: project management platform
  • Confluence: knowledge management platform
  • Odoo: open-source ERP
  • Mastodon: PG-based social network
  • Discourse: open-source forum based on PG and Redis

PGADMIN

PgAdmin4 is a handy PostgreSQL administration tool. Run the following command to launch the pgadmin service on the admin node:

cd ~/pigsty/app/pgadmin ; docker-compose up -d

Port 8885 is allocated by default. Access it via the domain http://adm.pigsty, Demo: http://adm.pigsty.cc

Default username: [email protected], password: pigsty

PGWeb Client Tool

PGWeb is a browser-based PostgreSQL client tool. Use the following command to launch the PGWEB service on the meta node, on host port 8886 by default. It can be accessed via the domain http://cli.pigsty, public demo: http://cli.pigsty.cc

# docker stop pgweb; docker rm pgweb
docker run --init --name pgweb --restart always --detach --publish 8886:8081 sosedoff/pgweb

Users need to fill in the database connection string themselves, e.g. the default CMDB connection string:

postgres://dbuser_dba:[email protected]:5432/meta?sslmode=disable

ByteBase

ByteBase is a tool for making database schema changes. The following commands start a ByteBase instance on meta node port 8887.

mkdir -p /data/bytebase/data;
docker run --init --name bytebase --restart always --detach --publish 8887:8887 --volume /data/bytebase/data:/var/opt/bytebase \
    bytebase/bytebase:1.0.4 --data /var/opt/bytebase --host http://ddl.pigsty --port 8887

Visit http://10.10.10.10:8887/ or http://ddl.pigsty to use ByteBase. You need to create a project, environment, instance, and database in turn before making schema changes. Public demo: http://ddl.pigsty.cc

PostgREST

PostgREST is a binary component that automatically generates a REST API from a PostgreSQL database schema.

For example, the following command uses docker to launch postgrest (local port 8884, using the default admin user, exposing the Pigsty CMDB schema):

docker run --init --name postgrest --restart always --detach --publish 8884:8081 postgrest/postgrest
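
Note that the bare postgrest/postgrest image needs connection information. A hypothetical sketch using PostgREST's standard PGRST_* environment variables (credentials taken from the CMDB examples elsewhere in this document; the app template's docker-compose normally supplies these for you):

docker run --init --name postgrest --restart always --detach --publish 8884:8081 \
    -e PGRST_DB_URI='postgres://dbuser_dba:[email protected]:5432/meta' \
    -e PGRST_DB_SCHEMAS='pigsty' \
    -e PGRST_DB_ANON_ROLE='dbuser_dba' \
    -e PGRST_SERVER_PORT='8081' \
    postgrest/postgrest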

Visiting http://10.10.10.10:8884 will show all auto-generated API definitions, with API documentation automatically exposed via Swagger Editor.

If you want to perform inserts, updates, and deletes with finer-grained access control, refer to Tutorial 1 - The Golden Key to generate a signed JWT.

Data Analysis Environment: Jupyter

Jupyter Lab is a one-stop data analysis environment. The following commands start a Jupyter Server on port 8888.

docker run -it --restart always --detach --name jupyter -p 8888:8888 -v "${PWD}":/tmp/notebook jupyter/scipy-notebook
docker logs jupyter # print the logs to get the login token

Visit http://10.10.10.10:8888/ to use JupyterLab (you need to enter the auto-generated token).

You can also use infra-jupyter.yml to launch a Jupyter Notebook on the admin node host.

Example: Database Schema Report with SchemaSPY

Use the following docker command to generate a database schema report, taking the CMDB as an example:

docker run -v /www/schema/pg-meta/meta/pigsty:/output andrewjones/schemaspy-postgres:latest -host 10.10.10.10 -port 5432 -u dbuser_dba -p DBUser.DBA -db meta -s pigsty

Then visit http://pigsty/schema/pg-meta/meta/pigsty to access the schema report.

Example: Open-Source Code Repository: Gitlab

Please refer to the Gitlab Docker deployment docs to complete the Docker deployment.

export GITLAB_HOME=/data/gitlab

sudo docker run --detach \
  --hostname gitlab.example.com \
  --publish 443:443 --publish 80:80 --publish 23:22 \
  --name gitlab \
  --restart always \
  --volume $GITLAB_HOME/config:/etc/gitlab \
  --volume $GITLAB_HOME/logs:/var/log/gitlab \
  --volume $GITLAB_HOME/data:/var/opt/gitlab \
  --shm-size 256m \
  gitlab/gitlab-ee:latest
  
sudo docker exec -it gitlab grep 'Password:' /etc/gitlab/initial_root_password

Example: Open-Source Forum: Discourse

To set up the open-source forum Discourse, you need to adjust the configuration in app.yml; the SMTP section is the key part.

Discourse config example
templates:
  - "templates/web.china.template.yml"
  - "templates/postgres.template.yml"
  - "templates/redis.template.yml"
  - "templates/web.template.yml"
  - "templates/web.ratelimited.template.yml"
## Uncomment these two lines if you wish to add Lets Encrypt (https)
# - "templates/web.ssl.template.yml"
# - "templates/web.letsencrypt.ssl.template.yml"
expose:
  - "80:80"   # http
  - "443:443" # https
params:
  db_default_text_search_config: "pg_catalog.english"
  db_shared_buffers: "768MB"
env:
  LC_ALL: en_US.UTF-8
  LANG: en_US.UTF-8
  LANGUAGE: en_US.UTF-8
  EMBER_CLI_PROD_ASSETS: 1
  UNICORN_WORKERS: 4
  DISCOURSE_HOSTNAME: forum.pigsty
  DISCOURSE_DEVELOPER_EMAILS: '[email protected],[email protected]'
  DISCOURSE_SMTP_ENABLE_START_TLS: false
  DISCOURSE_SMTP_AUTHENTICATION: login
  DISCOURSE_SMTP_OPENSSL_VERIFY_MODE: none
  DISCOURSE_SMTP_ADDRESS: smtpdm.server.address
  DISCOURSE_SMTP_PORT: 80
  DISCOURSE_SMTP_USER_NAME: [email protected]
  DISCOURSE_SMTP_PASSWORD: "<password>"
  DISCOURSE_SMTP_DOMAIN: mail.pigsty.cc
volumes:
  - volume:
      host: /var/discourse/shared/standalone
      guest: /shared
  - volume:
      host: /var/discourse/shared/standalone/log/var-log
      guest: /var/log

hooks:
  after_code:
    - exec:
        cd: $home/plugins
        cmd:
          - git clone https://github.com/discourse/docker_manager.git
run:
  - exec: echo "Beginning of custom commands"
  # - exec: rails r "SiteSetting.notification_email='[email protected]'"
  - exec: echo "End of custom commands"

Then run the following command to pull up Discourse:

./launcher rebuild app

14.2.1 - Database Management with PGAdmin4

Launch PgAdmin4 with Docker and load the Pigsty server list

Public demo: http://adm.pigsty.cc

Default username and password: [email protected] / pigsty

TL;DR

cd ~/pigsty/app/pgadmin   # enter the app directory
make up                   # launch the pgadmin container
make conf view            # load the Pigsty server list file into the pgadmin container and apply it

Pigsty's pgadmin app template uses port 8885 by default. You can access it at:

http://adm.pigsty or http://10.10.10.10:8885

Default username and password: [email protected] / pigsty

make up         # pull up pgadmin with docker-compose
make run        # launch pgadmin with docker
make view       # print pgadmin access point
make log        # tail -f pgadmin logs
make info       # introspect pgadmin with jq
make stop       # stop pgadmin container
make clean      # remove pgadmin container
make conf       # provision pgadmin with pigsty pg servers list 
make dump       # dump servers.json from pgadmin container
make pull       # pull latest pgadmin image
make rmi        # remove pgadmin image
make save       # save pgadmin image to /tmp/pgadmin.tgz
make load       # load pgadmin image from /tmp

14.2.2 - Host Your Own Code Repository with Gitea

Launch Gitea with Docker and use Pigsty-managed PG as its external metadata database

Public demo: http://git.pigsty.cc

TL;DR

cd ~/pigsty/app/gitea; make up

In this example, Gitea uses port 8889 by default. You can access it at:

http://git.pigsty or http://10.10.10.10:8889

make up      # pull up gitea with docker-compose in minimal mode
make run     # launch gitea with docker , local data dir and external PostgreSQL
make view    # print gitea access point
make log     # tail -f gitea logs
make info    # introspect gitea with jq
make stop    # stop gitea container
make clean   # remove gitea container
make pull    # pull latest gitea image
make rmi     # remove gitea image
make save    # save gitea image to /tmp/gitea.tgz
make load    # load gitea image from /tmp

Using External PostgreSQL

The Gitea app template uses SQLite inside the container as metadata storage by default. You can have Gitea use an external PostgreSQL via connection-string environment variables:

# postgres://dbuser_gitea:[email protected]:5432/gitea
db:   { name: gitea, owner: dbuser_gitea, comment: gitea primary database }
user: { name: dbuser_gitea , password: DBUser.gitea, roles: [ dbrole_admin ] }
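
A hypothetical end-to-end sketch: provision the database with the helper scripts shown in the Wiki.js section below, then point the Gitea container at it via its standard GITEA__database__* environment variables (the compose template presumably wires this up for you; adjust credentials to your own inventory):

bin/createuser pg-meta dbuser_gitea       # create the business user on the pg-meta cluster
bin/createdb   pg-meta gitea              # create the gitea database
docker run -d --name gitea -p 8889:3000 \
  -e GITEA__database__DB_TYPE=postgres \
  -e GITEA__database__HOST=10.10.10.10:5432 \
  -e GITEA__database__NAME=gitea \
  -e GITEA__database__USER=dbuser_gitea \
  -e GITEA__database__PASSWD=DBUser.gitea \
  gitea/gitea:latest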

14.2.3 - Build a Wiki Site with Wiki.js

Launch Wiki.js with Docker and use Pigsty-managed PG as its persistent data store

Public demo: http://wiki.pigsty.cc

TL;DR

cd app/wiki ; docker-compose up -d

Prepare the Database

# postgres://dbuser_wiki:[email protected]:5432/wiki
- { name: wiki, owner: dbuser_wiki, revokeconn: true , comment: wiki database }
- { name: dbuser_wiki, password: DBUser.Wiki , pgbouncer: true , roles: [ dbrole_admin ] }
bin/createuser pg-meta dbuser_wiki
bin/createdb   pg-meta wiki

Container Configuration

version: "3"
services:
  wiki:
    container_name: wiki
    image: requarks/wiki:2
    environment:
      DB_TYPE: postgres
      DB_HOST: 10.10.10.10
      DB_PORT: 5432
      DB_USER: dbuser_wiki
      DB_PASS: DBUser.Wiki
      DB_NAME: wiki
    restart: unless-stopped
    ports:
      - "9002:3000"

Access

  • Default Port for wiki: 9002
# add to nginx_upstream
- { name: wiki  , domain: wiki.pigsty.cc , endpoint: "127.0.0.1:9002"   }
./infra.yml -t nginx_config
ansible all -b -a 'nginx -s reload'

14.2.4 - Store Local Objects and Backups with Minio

Launch Minio with Docker and get your own object storage service right away.

Public demo: http://sss.pigsty.cc

Default username / password: admin / pigsty.minio

TL;DR

Launch the minio (S3) service on ports 9000 & 9001:

cd ~/pigsty/app/minio ; docker-compose up -d
docker run -p 9000:9000 -p 9001:9001 \
  -e "MINIO_ROOT_USER=admin" \
  -e "MINIO_ROOT_PASSWORD=pigsty.minio" \
  minio/minio server /data --console-address ":9001"

visit http://10.10.10.10:9000 with user admin and password pigsty.minio
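
To work with the service from the command line, a minimal sketch using the MinIO client (shipped as mc, or mcli in Pigsty packages; it assumes the default credentials above, and the alias name sss is arbitrary):

mc alias set sss http://10.10.10.10:9000 admin pigsty.minio   # register the server under alias 'sss'
mc mb sss/infra                                               # create a bucket named 'infra'
mc ls sss                                                     # list buckets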

make up         # pull up minio with docker-compose
make run        # launch minio with docker
make view       # print minio access point
make log        # tail -f minio logs
make info       # introspect minio with jq
make stop       # stop minio container
make clean      # remove minio container
make pull       # pull latest minio image
make rmi        # remove minio image
make save       # save minio image to /tmp/minio.tgz
make load       # load minio image from /tmp

14.2.5 - Version Control for PG Schemas with ByteBase

Launch Bytebase with Docker to manage PostgreSQL schema changes with version control

Public demo: http://ddl.pigsty.cc

Default username and password: admin / pigsty

Bytebase Overview

Schema Migrator for PostgreSQL

cd app/bytebase; make up

Visit http://ddl.pigsty or http://10.10.10.10:8887

make up         # pull up bytebase with docker-compose in minimal mode
make run        # launch bytebase with docker , local data dir and external PostgreSQL
make view       # print bytebase access point
make log        # tail -f bytebase logs
make info       # introspect bytebase with jq
make stop       # stop bytebase container
make clean      # remove bytebase container
make pull       # pull latest bytebase image
make rmi        # remove bytebase image
make save       # save bytebase image to /tmp/bytebase.tgz
make load       # load bytebase image from /tmp

Using External PostgreSQL

Bytebase uses its internal PostgreSQL database by default. You can use an external PostgreSQL for higher durability.

# postgres://dbuser_bytebase:[email protected]:5432/bytebase
db:   { name: bytebase, owner: dbuser_bytebase, comment: bytebase primary database }
user: { name: dbuser_bytebase , password: DBUser.Bytebase, roles: [ dbrole_admin ] }
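
To provision that database on the pg-meta cluster, a minimal sketch using the same helper scripts shown in the Wiki.js example above (run from the pigsty home directory on the admin node):

bin/createuser pg-meta dbuser_bytebase    # create the business user
bin/createdb   pg-meta bytebase           # create the bytebase database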

If you wish to use an external PostgreSQL, drop the monitor extensions, views & pg_repack first:

DROP SCHEMA monitor CASCADE;
DROP EXTENSION pg_repack;

After Bytebase has initialized, you can create them again with /pg/tmp/pg-init-template.sql:

psql bytebase < /pg/tmp/pg-init-template.sql

14.2.6 - Browse PostgreSQL Data from the Web with PGWeb

Launch PGWEB with Docker for small-scale online data queries from the browser

Public demo: http://cli.pigsty.cc

Launch the PGWEB container with Docker Compose:

cd ~/pigsty/app/pgweb ; docker-compose up -d

Then visit port 8886 on the host to see the PGWEB UI: http://10.10.10.10:8886

You can try the following connection string URLs to connect to database instances through PGWEB and explore them.

postgres://dbuser_meta:[email protected]:5432/meta?sslmode=disable
postgres://test:[email protected]:5432/test?sslmode=disable

Shortcuts

make up         # pull up pgweb with docker-compose
make run        # launch pgweb with docker
make view       # print pgweb access point
make log        # tail -f pgweb logs
make info       # introspect pgweb with jq
make stop       # stop pgweb container
make clean      # remove pgweb container
make pull       # pull latest pgweb image
make rmi        # remove pgweb image
make save       # save pgweb image to /tmp/pgweb.tgz
make load       # load pgweb image from /tmp

14.2.7 - Auto-Generate RESTful APIs with PostgREST

Launch PostgREST with Docker to automatically generate a backend REST API from the PostgreSQL schema

This is an example of creating a Pigsty CMDB API with PostgREST.

cd ~/pigsty/app/postgrest ; docker-compose up -d

http://10.10.10.10:8884 is the default endpoint for PostgREST

http://10.10.10.10:8883 is the default api docs for PostgREST

make up         # pull up postgrest with docker-compose
make run        # launch postgrest with docker
make ui         # run swagger ui container
make view       # print postgrest access point
make log        # tail -f postgrest logs
make info       # introspect postgrest with jq
make stop       # stop postgrest container
make clean      # remove postgrest container
make rmui       # remove swagger ui container
make pull       # pull latest postgrest image
make rmi        # remove postgrest image
make save       # save postgrest image to /tmp/postgrest.tgz
make load       # load postgrest image from /tmp

Swagger UI

Launch a swagger OpenAPI UI and visualize PostgREST API on 8883 with:

docker run -d --init --name swagger -p 8883:8080 -e API_URL=http://10.10.10.10:8884 swaggerapi/swagger-ui
# docker run -d -e API_URL=http://10.10.10.10:8884 -p 8883:8080 swaggerapi/swagger-editor # swagger editor

Check http://10.10.10.10:8883/

14.2.8 - KONG API Gateway

Launch kong with docker and use postgres as metadb

TL;DR

cd app/kong ; docker-compose up -d
make up         # pull up kong with docker-compose
make ui         # run swagger ui container
make log        # tail -f kong logs
make info       # introspect kong with jq
make stop       # stop kong container
make clean      # remove kong container
make rmui       # remove swagger ui container
make pull       # pull latest kong image
make rmi        # remove kong image
make save       # save kong image to /tmp/kong.tgz
make load       # load kong image from /tmp

Scripts

  • Default Port: 8000
  • Default SSL Port: 8443
  • Default Admin Port: 8001
  • Default Postgres Database: postgres://dbuser_kong:[email protected]:5432/kong
# postgres://dbuser_kong:[email protected]:5432/kong
- { name: kong, owner: dbuser_kong, revokeconn: true , comment: kong the api gateway database }
- { name: dbuser_kong, password: DBUser.Kong , pgbouncer: true , roles: [ dbrole_admin ] }
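
For reference, a hypothetical sketch of running Kong directly against that database with Kong's standard KONG_PG_* environment variables (the app template's docker-compose presumably handles this already; credentials taken from the definition above):

bin/createuser pg-meta dbuser_kong        # create the business user
bin/createdb   pg-meta kong               # create the kong database
# run 'kong migrations bootstrap' once against the database before the first start
docker run -d --name kong -p 8000:8000 -p 8443:8443 -p 8001:8001 \
  -e KONG_DATABASE=postgres \
  -e KONG_PG_HOST=10.10.10.10 \
  -e KONG_PG_USER=dbuser_kong \
  -e KONG_PG_PASSWORD=DBUser.Kong \
  -e KONG_PG_DATABASE=kong \
  kong:latest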

15 - Release Notes

Pigsty release note history
Version Time Description Release
v2.7.0 2024-05-16 Extension Overwhelming, new docker apps v2.7.0
v2.6.0 2024-02-29 PG 16 as default version, ParadeDB & DuckDB v2.6.0
v2.5.1 2023-12-01 Routine update, pg16 major extensions v2.5.1
v2.5.0 2023-10-24 Ubuntu/Debian Support: bullseye, bookworm, jammy, focal v2.5.0
v2.4.1 2023-09-24 Supabase/PostgresML support, graphql, jwt, pg_net, vault v2.4.1
v2.4.0 2023-09-14 PG16, RDS Monitor, New Extensions v2.4.0
v2.3.1 2023-09-01 PGVector with HNSW, PG16 RC1, Chinese Docs, Bug Fix v2.3.1
v2.3.0 2023-08-20 PGSQL/REDIS Update, NODE VIP, Mongo/FerretDB, MYSQL Stub v2.3.0
v2.2.0 2023-08-04 Dashboard & Provision overhaul, UOS compatibility v2.2.0
v2.1.0 2023-06-10 PostgreSQL 12 ~ 16beta support v2.1.0
v2.0.2 2023-03-31 Add pgvector support and fix MinIO CVE v2.0.2
v2.0.1 2023-03-21 v2 Bug Fix, security enhance and bump grafana version v2.0.1
v2.0.0 2023-02-28 Compatibility Security Maintainability Enhancement v2.0.0
v1.5.1 2022-06-18 Grafana Security Hotfix v1.5.1
v1.5.0 2022-05-31 Docker Applications v1.5.0
v1.4.1 2022-04-20 Bug fix & Full translation of English documents. v1.4.1
v1.4.0 2022-03-31 MatrixDB Support, Separated INFRA, NODES, PGSQL, REDIS v1.4.0
v1.3.0 2021-11-30 PGCAT Overhaul & PGSQL Enhancement & Redis Support Beta v1.3.0
v1.2.0 2021-11-03 Upgrade default Postgres to 14, monitoring existing pg v1.2.0
v1.1.0 2021-10-12 HomePage, JupyterLab, PGWEB, Pev2 & Pgbadger v1.1.0
v1.0.0 2021-07-26 v1 GA, Monitoring System Overhaul v1.0.0
v0.9.0 2021-04-04 Pigsty GUI, CLI, Logging Integration v0.9.0
v0.8.0 2021-03-28 Service Provision v0.8.0
v0.7.0 2021-03-01 Monitor only deployment v0.7.0
v0.6.0 2021-02-19 Architecture Enhancement v0.6.0
v0.5.0 2021-01-07 Database Customize Template v0.5.0
v0.4.0 2020-12-14 PostgreSQL 13 Support, Official Documentation v0.4.0
v0.3.0 2020-10-22 Provisioning Solution GA v0.3.0
v0.2.0 2020-07-10 PGSQL Monitoring v6 GA v0.2.0
v0.1.0 2020-06-20 Validation on Testing Environment v0.1.0
v0.0.5 2020-08-19 Offline Installation Mode v0.0.5
v0.0.4 2020-07-27 Refactor playbooks into ansible roles v0.0.4
v0.0.3 2020-06-22 Interface enhancement v0.0.3
v0.0.2 2020-04-30 First Commit v0.0.2
v0.0.1 2019-05-15 POC v0.0.1

v2.7.0

Highlight

Extension Overwhelming, adding numerous new extensions written in rust & pgrx:

  • pg_search v0.7.0 : Full text search over SQL tables using the BM25 algorithm
  • pg_lakehouse v0.7.0 : Query engine over object stores like S3 and table formats like Delta Lake
  • pg_analytics v0.6.1 : Accelerates analytical query processing inside Postgres
  • pg_graphql v1.5.4 : GraphQL support to your PostgreSQL database.
  • pg_jsonschema v0.3.1 : PostgreSQL extension providing JSON Schema validation
  • wrappers v0.3.1 : Postgres Foreign Data Wrappers Collections by Supabase
  • pgmq v1.5.2 : A lightweight message queue. Like AWS SQS and RSMQ but on Postgres.
  • pg_tier v0.0.3 : Postgres Extension written in Rust, to enable data tiering to AWS S3
  • pg_vectorize v0.15.0 : The simplest way to orchestrate vector search on Postgres
  • pg_later v0.1.0 : Execute SQL now and get the results later.
  • pg_idkit v0.2.3 : Generating many popular types of identifiers
  • plprql v0.1.0 : Use PRQL in PostgreSQL
  • pgsmcrypto v0.1.0 : PostgreSQL SM Algorithm Extension
  • pg_tiktoken v0.0.1 : OpenAI tiktoken tokenizer for postgres
  • pgdd v0.5.2 : Access Data Dictionary metadata with pure SQL

And some new extensions in plain C & C++

  • parquet_s3_fdw 1.1.0 : ParquetS3 Foreign Data Wrapper for PostgreSQL
  • plv8 3.2.2 : V8 Engine Javascript Procedural Language add-on for PostgreSQL
  • md5hash 1.0.1 : Custom data type for storing MD5 hashes rather than text
  • pg_tde 1.0 alpha: Experimental encrypted access method for PostgreSQL
  • pg_dirtyread 2.6 : Read dead but unvacuumed tuples from a PostgreSQL relation
  • New deb PGDG extensions: pg_roaringbitmap, pgfaceting, mobilitydb, pgsql-http, pg_hint_plan, pg_statviz, pg_rrule
  • New rpm PGDG extensions: pg_profile, pg_show_plans, use PGDG’s pgsql_http, pgsql_gzip, pg_net, pg_bigm instead of Pigsty RPM.

New Features

  • Support running on certain docker containers.
  • Prepare arm64 packages for infra & pgsql packages on EL & Debian distros.
  • New installation script that downloads from Cloudflare, with more hints.
  • New monitoring dashboard for PGSQL PITR to assist the PITR procedure.
  • Preparation for running Pigsty inside docker VM containers.
  • Add a fool-proof design for running pgsql.yml on nodes that are not managed by Pigsty.
  • Add a config template for each major OS version: el7, el8, el9, debian11, debian12, ubuntu20, ubuntu22.

Software Upgrade

  • PostgreSQL 16.3
  • Patroni 3.3.0
  • pgBackRest 2.51
  • vip-manager v2.5.0
  • Haproxy 2.9.7
  • Grafana 10.4.2
  • Prometheus 2.51
  • Loki & Promtail: 3.0.0 (breaking changes!)
  • Alertmanager 0.27.0
  • BlackBox Exporter 0.25.0
  • Node Exporter 1.8.0
  • pgBackrest Exporter 0.17.0
  • duckdb 0.10.2
  • etcd 3.5.13
  • minio-20240510014138 / mcli-20240509170424
  • pev2 v1.8.0 -> v1.11.0
  • pgvector 0.6.1 -> 0.7.0
  • pg_tle: v1.3.4 -> v1.4.0
  • hydra: v1.1.1 -> v1.1.2
  • duckdb_fdw: v1.1.0 recompile with libduckdb 0.10.2
  • pg_bm25 0.5.6 -> pg_search 0.7.0
  • pg_analytics: 0.5.6 -> 0.6.1
  • pg_graphql: 1.5.0 -> 1.5.4
  • pg_net 0.8.0 -> 0.9.1
  • pg_sparse (deprecated)

Docker Application

  • Odoo: launch open source ERP and plugins
  • Jupyter: run jupyter notebook container
  • PolarDB: run the demo PG RAC playground.
  • supabase: bump to the latest GA version.
  • bytebase: use the latest tag instead of ad hoc version.
  • pg_exporter: update docker image example

Bug Fix

  • Fix role pg_exporters white space in variable templates
  • Fix minio_cluster not commented in global variables
  • Fix the non-exist postgis34 in el7 config template
  • Fix EL8 python3.11-cryptography deps to python3-cryptography according to upstream
  • Fix /pg/bin/pg-role can not get OS user name from environ in non-interact mode
  • Fix /pg/bin/pg-pitr can not hint -X -P flag properly

API Change

  • New parameter node_write_etc_hosts to control whether to write /etc/hosts file on target nodes.
  • Relocatable prometheus target directory with new parameter prometheus_sd_dir.
  • Add -x|--proxy flag to enable and use value of global proxy env by @waitingsong in https://github.com/Vonng/pigsty/pull/405
  • No longer parse infra nginx log details since it brings too much labels to the log.
  • Use alertmanager API Version v2 instead of v1 in prometheus config.
  • Use /pg/cert/ca.crt instead of /etc/pki/ca.crt in pgsql roles.

New Contributors

Package Checksums

ec271a1d34b2b1360f78bfa635986c3a  pigsty-pkg-v2.7.0.el8.x86_64.tgz
f3304bfd896b7e3234d81d8ff4b83577  pigsty-pkg-v2.7.0.debian12.x86_64.tgz
5b071c2a651e8d1e68fc02e7e922f2b3  pigsty-pkg-v2.7.0.ubuntu22.x86_64.tgz

v2.6.0

Highlight

Configuration

  • Disable Grafana Unified Alert to work around the “Database Locked” error.
  • add node_repo_modules to add upstream repos (including local one) to node
  • remove node_local_repo_urls, replaced by node_repo_modules & repo_upstream.
  • remove node_repo_method, replaced by node_repo_modules.
  • add the new local repo into repo_upstream instead of node_local_repo_urls
  • add chrony into node_default_packages
  • remove redis,minio,postgresql client from infra packages
  • replace repo_upstream.baseurl $releasever for pgdg el8/el9 with major.minor instead of major version

Software Upgrade

  • Grafana 10.3.3
  • Prometheus 2.47
  • node_exporter 1.7.0
  • HAProxy 2.9.5
  • Loki / Promtail 2.9.4
  • minio-20240216110548 / mcli-20240217011557
  • etcd 3.5.11
  • Redis 7.2.4
  • Bytebase 2.13.2
  • HAProxy 2.9.5
  • DuckDB 0.10.0
  • FerretDB 1.19
  • Metabase: new docker compose app template added

PostgreSQL x Pigsty Extensions

  • PostgreSQL Minor Version Upgrade 16.2, 15.6, 14.11, 13.14, 12.18
  • PostgreSQL 16 is now used as the default major version
  • pg_exporter 0.6.1, security fix
  • Patroni 3.2.2
  • pgBadger 12.4
  • pgBouncer 1.22
  • pgBackRest 2.50
  • vip-manager 2.3.0
  • PostGIS 3.4.1
  • PGVector 0.6.0
  • TimescaleDB 2.14.1
  • New Extension duckdb_fdw v1.1
  • New Extension pgsql-gzip v1.0.0
  • New Extension pg_sparse from ParadeDB: v0.5.6
  • New Extension pg_bm25 from ParadeDB: v0.5.6
  • New Extension pg_analytics from ParadeDB: v0.5.6
  • Bump AI/ML Extension pgml to v2.8.1 with pg16 support
  • Bump Columnar Extension hydra to v1.1.1 with pg16 support
  • Bump Graph Extension age to v1.5.0 with pg16 support
  • Bump Packaging Extension pg_tle to v1.3.4 with pg16 support
  • Bump GraphQL Extension pg_graphql to v1.5.0 to support supabase
330e9bc16a2f65d57264965bf98174ff  pigsty-v2.6.0.tgz
81abcd0ced798e1198740ab13317c29a  pigsty-pkg-v2.6.0.debian11.x86_64.tgz
7304f4458c9abd3a14245eaf72f4eeb4  pigsty-pkg-v2.6.0.debian12.x86_64.tgz
f914fbb12f90dffc4e29f183753736bb  pigsty-pkg-v2.6.0.el7.x86_64.tgz
fc23d122d0743d1c1cb871ca686449c0  pigsty-pkg-v2.6.0.el8.x86_64.tgz
9d258dbcecefd232f3a18bcce512b75e  pigsty-pkg-v2.6.0.el9.x86_64.tgz
901ee668621682f99799de8932fb716c  pigsty-pkg-v2.6.0.ubuntu20.x86_64.tgz
39872cf774c1fe22697c428be2fc2c22  pigsty-pkg-v2.6.0.ubuntu22.x86_64.tgz

v2.5.1

Routine update with v16.1, v15.5, 14.10, 13.13, 12.17, 11.22

Now PostgreSQL 16 has all the core extensions available (pg_repack & timescaledb added)

  • Software Version Upgrade:
    • PostgreSQL to v16.1, v15.5, 14.10, 13.13, 12.17, 11.22
    • Patroni v3.2.0
    • PgBackrest v2.49
    • Citus 12.1
    • TimescaleDB 2.13.0 (with PG 16 support)
    • Grafana v10.2.2
    • FerretDB 1.15
    • SealOS 4.3.7
    • Bytebase 2.11.1
  • Remove monitor schema prefix from PGCAT dashboard queries
  • New template wool.yml for Aliyun free ECS singleton
  • Add python3-jmespath in addition to python3.11-jmespath for el9
31ee48df1007151009c060e0edbd74de  pigsty-pkg-v2.5.1.el7.x86_64.tgz
a40f1b864ae8a19d9431bcd8e74fa116  pigsty-pkg-v2.5.1.el8.x86_64.tgz
c976cd4431fc70367124fda4e2eac0a7  pigsty-pkg-v2.5.1.el9.x86_64.tgz
7fc1b5bdd3afa267a5fc1d7cb1f3c9a7  pigsty-pkg-v2.5.1.debian11.x86_64.tgz
add0731dc7ed37f134d3cb5b6646624e  pigsty-pkg-v2.5.1.debian12.x86_64.tgz
99048d09fa75ccb8db8e22e2a3b41f28  pigsty-pkg-v2.5.1.ubuntu20.x86_64.tgz
431668425f8ce19388d38e5bfa3a948c  pigsty-pkg-v2.5.1.ubuntu22.x86_64.tgz

v2.5.0

curl -L https://get.pigsty.cc/latest | bash

Highlights

  • Ubuntu / Debian Support: bullseye, bookworm, jammy, focal

  • Dedicate yum/apt repo on repo.pigsty.cc and mirror on packagecloud.io

  • Anolis OS Support (EL 8.8 Compatible)

  • PG Major Candidate: Use PostgreSQL 16 instead of PostgreSQL 14.

  • New Dashboard PGSQL Exporter, PGSQL Patroni, rework on PGSQL Query

  • Extensions Update:

    • Bump PostGIS version to v3.4 on el8, el9, ubuntu22, keep postgis 33 on EL7
    • Remove extension pg_embedding because it is no longer maintained, use pgvector instead.
    • New extension on EL: pointcloud with LIDAR data type support.
    • New extensions on EL: imgsmlr, pg_similarity, pg_bigm.
    • Include columnar extension hydra and remove citus from default installed extension list.
    • Recompile pg_filedump as PG major version independent package.
  • Software Version Upgrade:

    • Grafana to v10.1.5
    • Prometheus to v2.47
    • Promtail/Loki to v2.9.1
    • Node Exporter to v1.6.1
    • Bytebase to v2.10.0
    • patroni to v3.1.2
    • pgbouncer to v1.21.0
    • pg_exporter to v0.6.0
    • pgbackrest to v2.48.0
    • pgbadger to v12.2
    • pg_graphql to v1.4.0
    • pg_net to v0.7.3
    • ferretdb to v0.12.1
    • sealos to 4.3.5
    • Supabase support to 20231013070755

Ubuntu Support

Pigsty has two ubuntu LTS support: 22.04 (jammy) and 20.04 (focal), and ship corresponding offline packages for them.

Some parameters need to be specified explicitly when deploying on Ubuntu, please refer to ubuntu.yml

  • repo_upstream: Adjust according to ubuntu / debian repo.
  • repo_packages: Adjust according to ubuntu / debian naming convention
  • node_repo_local_urls: use the default value: ['deb [trusted=yes] http://${admin_ip}/pigsty ./']
  • node_default_packages
    • zlib -> zlib1g, readline -> libreadline-dev
    • vim-minimal -> vim-tiny, bind-utils -> dnsutils, perf -> linux-tools-generic,
    • new packages acl to ensure ansible tmp file privileges are set correctly
  • infra_packages: replace all _ with - in names, and replace postgresql16 with postgresql-client-16
  • pg_packages: replace all _ with - in names, patroni-etcd not needed on ubuntu
  • pg_extensions: different naming convention, no passwordcheck_cracklib on ubuntu.
  • pg_dbsu_uid: You have to manually specify pg_dbsu_uid on ubuntu, because PGDG deb package does not specify pg dbsu uid.

API Changes

default values of following parameters have changed:

  • repo_modules: infra,node,pgsql,redis,minio

  • repo_upstream: Now add Pigsty Infra/MinIO/Redis/PGSQL modular upstream repo.

  • repo_packages: remove unused karma,mtail,dellhw_exporter and pg 14 extra extensions, adding pg 16 extra extensions.

  • node_default_packages now add python3-pip as default packages.

  • pg_libs: timescaledb is removed from shared_preload_libraries by default.

  • pg_extensions: citus is no longer installed by default, and passwordcheck_cracklib is installed by default

    - pg_repack_${pg_version}* wal2json_${pg_version}* passwordcheck_cracklib_${pg_version}*
    - postgis34_${pg_version}* timescaledb-2-postgresql-${pg_version}* pgvector_${pg_version}*
    
87e0be2edc35b18709d7722976e305b0  pigsty-pkg-v2.5.0.el7.x86_64.tgz
e71304d6f53ea6c0f8e2231f238e8204  pigsty-pkg-v2.5.0.el8.x86_64.tgz
39728496c134e4352436d69b02226ee8  pigsty-pkg-v2.5.0.el9.x86_64.tgz
e3f548a6c7961af6107ffeee3eabc9a7  pigsty-pkg-v2.5.0.debian11.x86_64.tgz
1e469cc86a19702e48d7c1a37e2f14f9  pigsty-pkg-v2.5.0.debian12.x86_64.tgz
cc3af3b7c12f98969d3c6962f7c4bd8f  pigsty-pkg-v2.5.0.ubuntu20.x86_64.tgz
c5b2b1a4867eee624e57aed58ac65a80  pigsty-pkg-v2.5.0.ubuntu22.x86_64.tgz

v2.4.1

Highlights

  • Supabase support: run open-source Firebase alternative with external postgres managed by Pigsty: example config
  • PostgresML support: Run LLMs, vector operations, classical Machine Learning in Postgres.
  • GraphQL support: pg_graphql reflects a GraphQL schema from the existing SQL schema.
  • Async HTTP Client support pg_net enables asynchronous (non-blocking) HTTP/HTTPS requests with SQL
  • JWT support: pgjwt is the PostgreSQL implementation of JWT (JSON Web Tokens)
  • Vault support: vault can store encrypted secrets in the Vault
  • New component pg_filedump for pg 14 & 15, low-level data recovery tool for PostgreSQL
  • New extension hydra the columnar available for PG 13 - 15.
  • Reduce el9 offline package size by 400MB by removing proj-data*
  • Bump FerretDB version to v1.10
efabe7632d8994f3fb58f9838b8f9d7d  pigsty-pkg-v2.4.1.el7.x86_64.tgz # 1.1G
ea78957e8c8434b120d1c8c43d769b56  pigsty-pkg-v2.4.1.el8.x86_64.tgz # 1.4G
4ef280a7d28872814e34521978b851bb  pigsty-pkg-v2.4.1.el9.x86_64.tgz # 1.3G

v2.4.0

Get started with bash -c "$(curl -fsSL https://get.pigsty.cc/latest)".

Highlights

  • PostgreSQL 16 support
  • The first LTS version with business support and consulting service
  • Monitoring existing PostgreSQL, RDS for PostgreSQL / PolarDB with PGRDS Dashboards
  • New extension: Apache AGE, openCypher graph query engine on PostgreSQL
  • New extension: zhparser, full text search for Chinese language
  • New extension: pg_roaringbitmap, roaring bitmap for PostgreSQL
  • New extension: pg_embedding, hnsw alternative to pgvector
  • New extension: pg_tle, admin / manage stored procedure extensions
  • New extension: pgsql-http, issue http request with SQL interface
  • Add extensions: pg_auth_mon pg_checksums pg_failover_slots pg_readonly postgresql-unit pg_store_plans pg_uuidv7 set_user
  • Redis enhancement: add monitoring panels for redis sentinel, and auto HA configuration for redis ms cluster.

API Change

  • New Parameter: REDIS.redis_sentinel_monitor: specify masters monitor by redis sentinel cluster

Bug Fix

  • Fix Grafana 10.1 registered datasource will use random uid rather than ins.datname
MD5 (pigsty-pkg-v2.4.0.el7.x86_64.tgz) = 257443e3c171439914cbfad8e9f72b17
MD5 (pigsty-pkg-v2.4.0.el8.x86_64.tgz) = 41ad8007ffbfe7d5e8ba5c4b51ff2adc
MD5 (pigsty-pkg-v2.4.0.el9.x86_64.tgz) = 9a950aed77a6df90b0265a6fa6029250

v2.3.1

Get started with bash -c "$(curl -fsSL https://get.pigsty.cc/latest)".

Highlights

  • PGVector 0.5 with HNSW index support
  • PostgreSQL 16 RC1 for el8/el9; add SealOS for Kubernetes support

Bug Fix

  • Fix infra.repo.repo_pkg task when downloading rpm with * in their names in repo_packages.
    • if /www/pigsty already have package name match that pattern, some rpm will be skipped.
  • Change default value of vip_dns_suffix to '' empty string rather than .vip
  • Grant sudo privilege for postgres dbsu when pg_dbsu_sudo = limit and patroni_watchdog_mode = required
    • /usr/bin/sudo /sbin/modprobe softdog: enable watchdog module before launching patroni
    • /usr/bin/sudo /bin/chown {{ pg_dbsu }} /dev/watchdog: chown watchdog before launching patroni

Documentation Update

  • Add details to English documentation
  • Add Chinese/zh-cn documentation

Software Upgrade

  • PostgreSQL 16 RC1 on el8/el9
  • PGVector 0.5.0 with hnsw index
  • TimescaleDB 2.11.2
  • grafana 10.1.0
  • loki & promtail 2.8.4
  • mcli-20230829225506 / minio-20230829230735
  • ferretdb 1.9
  • sealos 4.3.3
  • pgbadger 1.12.2
ce69791eb622fa87c543096cdf11f970  pigsty-pkg-v2.3.1.el7.x86_64.tgz
495aba9d6d18ce1ebed6271e6c96b63a  pigsty-pkg-v2.3.1.el8.x86_64.tgz
38b45582cbc337ff363144980d0d7b64  pigsty-pkg-v2.3.1.el9.x86_64.tgz

v2.3.0

Get started with bash -c "$(curl -fsSL https://get.pigsty.cc/latest)"

Highlight

  • INFRA: NODE/PGSQL VIP monitoring support
  • NODE: Allow bind node_vip to node cluster with keepalived
  • REPO: Dedicate yum repo, enable https for get.pigsty.cc and demo.pigsty.cc
  • PGSQL: Fix CVE-2023-39417 with PostgreSQL 15.4, 14.9, 13.12, 12.16, bump patroni version to v3.1.0
  • APP: Bump app/bytebase to v2.6.0, app/ferretdb version to v1.8, new application nocodb
  • REDIS: bump to v7.2 and rework on dashboards
  • MONGO: basic deploy & monitor support with FerretDB 1.8
  • MYSQL: add prometheus/grafana/ca stub for future implementation.

API Change

Add 1 new section NODE.NODE_VIP with 8 new parameters

  • NODE.VIP.vip_enabled : enable vip on this node cluster?
  • NODE.VIP.vip_address : node vip address in ipv4 format, required if vip is enabled
  • NODE.VIP.vip_vrid : required, integer, 1-255 should be unique among same VLAN
  • NODE.VIP.vip_role : master/backup, backup by default, use as init role
  • NODE.VIP.vip_preempt : optional, true/false, false by default, enable vip preemption
  • NODE.VIP.vip_interface : node vip network interface to listen, eth0 by default
  • NODE.VIP.vip_dns_suffix : node vip dns name suffix, .vip by default
  • NODE.VIP.vip_exporter_port : keepalived exporter listen port, 9650 by default
MD5 (pigsty-pkg-v2.3.0.el7.x86_64.tgz) = 81db95f1c591008725175d280ad23615
MD5 (pigsty-pkg-v2.3.0.el8.x86_64.tgz) = 6f4d169b36f6ec4aa33bfd5901c9abbe
MD5 (pigsty-pkg-v2.3.0.el9.x86_64.tgz) = 4bc9ae920e7de6dd8988ca7ee681459d

v2.2.0

Get started with bash -c "$(curl -fsSL http://get.pigsty.cc/latest)"

Release Note: https://doc.pigsty.cc/#/RELEASENOTE

Highlight

  • Monitoring Dashboards Overhaul: https://demo.pigsty.cc
  • Vagrant Sandbox Overhaul: libvirt support and new templates
  • Pigsty EL Yum Repo: Building simplified
  • OS Compatibility: UOS-v20-1050e support
  • New config template: prod simulation with 42 nodes
  • Use official pgdg citus distribution for el7

Software Upgrade

  • PostgreSQL 16 beta2
  • Citus 12 / PostGIS 3.3.3 / TimescaleDB 2.11.1 / PGVector 0.44
  • patroni 3.0.4 / pgbackrest 2.47 / pgbouncer 1.20
  • grafana 10.0.3 / loki/promtail/logcli 2.8.3
  • etcd 3.5.9 / haproxy v2.8.1 / redis v7.0.12
  • minio 20230711212934 / mcli 20230711233044

Bug Fix

  • Fix docker group ownership issue 29434bd
  • Append infra os group rather than set it as primary group
  • Fix redis sentinel systemd enable status 5c96feb
  • Loosen bootstrap & configure checks if /etc/redhat-release does not exist
  • Fix grafana 9.x CVE-2023-1410 with 10.0.2
  • Add PG 14 - 16 new command tags and error codes for pglog schema

API Change

Add 1 new parameter

  • INFRA.NGINX.nginx_exporter_enabled : now you can disable nginx_exporter with this parameter

Default value changes:

  • repo_modules: node,pgsql,infra : redis is removed from it
  • repo_upstream:
    • add pigsty-el: distribution independent rpms: such as grafana, minio, pg_exporter, etc…
    • add pigsty-misc: misc rpms: such as redis, minio, pg_exporter, etc…
    • remove citus repo since pgdg now have full official citus support (on el7)
    • remove remi , since redis is now included in pigsty-misc
    • remove grafana in build config for acceleration
  • repo_packages:
    • ansible python3 python3-pip python3-requests python3.11-jmespath dnf-utils modulemd-tools # el7: python36-requests python36-idna yum-utils
    • grafana loki logcli promtail prometheus2 alertmanager karma pushgateway node_exporter blackbox_exporter nginx_exporter redis_exporter
    • redis etcd minio mcli haproxy vip-manager pg_exporter nginx createrepo_c sshpass chrony dnsmasq docker-ce docker-compose-plugin flamegraph
    • lz4 unzip bzip2 zlib yum pv jq git ncdu make patch bash lsof wget uuid tuned perf nvme-cli numactl grubby sysstat iotop htop rsync tcpdump
    • netcat socat ftp lrzsz net-tools ipvsadm bind-utils telnet audit ca-certificates openssl openssh-clients readline vim-minimal
    • postgresql13* wal2json_13* pg_repack_13* passwordcheck_cracklib_13* postgresql12* wal2json_12* pg_repack_12* passwordcheck_cracklib_12* timescaledb-tools
    • postgresql15 postgresql15* citus_15* pglogical_15* wal2json_15* pg_repack_15* pgvector_15* timescaledb-2-postgresql-15* postgis33_15* passwordcheck_cracklib_15* pg_cron_15*
    • postgresql14 postgresql14* citus_14* pglogical_14* wal2json_14* pg_repack_14* pgvector_14* timescaledb-2-postgresql-14* postgis33_14* passwordcheck_cracklib_14* pg_cron_14*
    • postgresql16* wal2json_16* pgvector_16* pg_squeeze_16* postgis34_16* passwordcheck_cracklib_16* pg_cron_16*
    • patroni patroni-etcd pgbouncer pgbadger pgbackrest pgloader pg_activity pg_partman_15 pg_permissions_15 pgaudit17_15 pgexportdoc_15 pgimportdoc_15 pg_statement_rollback_15*
    • orafce_15* mysqlcompat_15 mongo_fdw_15* tds_fdw_15* mysql_fdw_15 hdfs_fdw_15 sqlite_fdw_15 pgbouncer_fdw_15 multicorn2_15* powa_15* pg_stat_kcache_15* pg_stat_monitor_15* pg_qualstats_15 pg_track_settings_15 pg_wait_sampling_15 system_stats_15
    • plprofiler_15* plproxy_15 plsh_15* pldebugger_15 plpgsql_check_15* pgtt_15 pgq_15* pgsql_tweaks_15 count_distinct_15 hypopg_15 timestamp9_15* semver_15* prefix_15* rum_15 geoip_15 periods_15 ip4r_15 tdigest_15 hll_15 pgmp_15 extra_window_functions_15 topn_15
    • pg_background_15 e-maj_15 pg_catcheck_15 pg_prioritize_15 pgcopydb_15 pg_filedump_15 pgcryptokey_15 logerrors_15 pg_top_15 pg_comparator_15 pg_ivm_15* pgsodium_15* pgfincore_15* ddlx_15 credcheck_15 safeupdate_15 pg_squeeze_15* pg_fkpart_15 pg_jobmon_15
  • repo_url_packages:
  • node_default_packages:
    • lz4,unzip,bzip2,zlib,yum,pv,jq,git,ncdu,make,patch,bash,lsof,wget,uuid,tuned,nvme-cli,numactl,grubby,sysstat,iotop,htop,rsync,tcpdump
    • netcat,socat,ftp,lrzsz,net-tools,ipvsadm,bind-utils,telnet,audit,ca-certificates,openssl,readline,vim-minimal,node_exporter,etcd,haproxy,python3,python3-pip
  • infra_packages
    • grafana,loki,logcli,promtail,prometheus2,alertmanager,karma,pushgateway
    • node_exporter,blackbox_exporter,nginx_exporter,redis_exporter,pg_exporter
    • nginx,dnsmasq,ansible,postgresql15,redis,mcli,python3-requests
  • PGSERVICE in .pigsty is removed, replaced with PGDATABASE=postgres.

FHS Changes:

  • bin/dns and bin/ssh now moved to vagrant/
MD5 (pigsty-pkg-v2.2.0.el7.x86_64.tgz) = 5fb6a449a234e36c0d895a35c76add3c
MD5 (pigsty-pkg-v2.2.0.el8.x86_64.tgz) = c7211730998d3b32671234e91f529fd0
MD5 (pigsty-pkg-v2.2.0.el9.x86_64.tgz) = 385432fe86ee0f8cbccbbc9454472fdd

v2.1.0

Highlight

  • PostgreSQL 16 beta support, and 12 ~ 15 support.
  • Add PGVector for AI Embedding for 12 - 15
  • Add 6 extra panel & datasource plugins for grafana
  • Add bin/profile to profile remote process and generate flamegraph
  • Add bin/validate to validate pigsty.yml configuration file
  • Add bin/repo-add to add upstream repo files to /etc/yum.repos.d
  • PostgreSQL 16 observability: pg_stat_io and corresponding dashboards

Software Upgrade

  • PostgreSQL 15.3 , 14.8, 13.11, 12.15, 11.20, and 16 beta1
  • pgBackRest 2.46
  • pgbouncer 1.19
  • Redis 7.0.11
  • Grafana v9.5.3
  • Loki / Promtail / Logcli 2.8.2
  • Prometheus 2.44
  • TimescaleDB 2.11.0
  • minio-20230518000536 / mcli-20230518165900
  • Bytebase v2.2.0

Enhancement

  • Now use all id*.pub when installing local user’s public key

v2.0.2

Highlight

Store OpenAI embedding and search similar vectors with pgvector

Changes

  • New extension pgvector for storing OpenAI embedding and searching similar vectors.
  • MinIO CVE-2023-28432 fix, and upgrade to 20230324 with new policy API.
  • Add reload functionality to DNSMASQ systemd services
  • Bump pev to v1.8
  • Bump grafana to v9.4.7
  • Bump MinIO and MCLI version to 20230324
  • Bump bytebase version to v1.15.0
  • Upgrade monitoring dashboards and fix dead links
  • Upgrade aliyun terraform template image to rockylinux 9
  • Adopt grafana provisioning API change since v9.4
  • Add asciinema videos for various administration tasks
  • Fix broken EL8 pgsql deps: remove anonymizer_15 faker_15 and pgloader
MD5 (pigsty-pkg-v2.0.2.el7.x86_64.tgz) = d46440a115d741386d29d6de646acfe2
MD5 (pigsty-pkg-v2.0.2.el8.x86_64.tgz) = 5fa268b5545ac96b40c444210157e1e1
MD5 (pigsty-pkg-v2.0.2.el9.x86_64.tgz) = c8b113d57c769ee86a22579fc98e8345

v2.0.1

Bug fix for v2.0.0 and security improvement.

Enhancement

  • Replace the pig shape logo for compliance with the PostgreSQL trademark policy.
  • Bump grafana version to v9.4 with better UI and bugfix.
  • Bump patroni version to v3.0.1 with some bugfix.
  • Change: rollback grafana systemd service file to rpm default.
  • Use slow copy instead of rsync to copy grafana dashboards.
  • Enhancement: add back default repo files after bootstrap
  • Add asciinema video for various administration tasks.
  • Security Enhance Mode: restrict monitor user privilege.
  • New config template: dual.yml for two-node deployment.
  • Enable log_connections and log_disconnections in crit.yml template.
  • Enable $lib/passwordcheck in pg_libs in crit.yml template.
  • Explicitly grant monitor view permission to pg_monitor role.
  • Remove default dbrole_readonly from dbuser_monitor to limit monitor user privilege
  • Now patroni listen on {{ inventory_hostname }} instead of 0.0.0.0
  • Now you can control postgres/pgbouncer listen to address with pg_listen
  • Now you can use placeholder ${ip}, ${lo}, ${vip} in pg_listen
  • Bump Aliyun terraform image to rocky Linux 9 instead of centos 7.9
  • Bump bytebase to v1.14.0

Bug Fixes

  • Add missing advertise address for alertmanager
  • Fix missing pg_mode error when adding postgres user with bin/pgsql-user
  • Add -a password to redis-join task @ redis.yml
  • Fix missing default value in the infra-rm.yml remove infra data task
  • Fix prometheus targets file ownership to prometheus
  • Use admin user rather than root to delete metadata in DCS
  • Fix Meta datasource missing database name due to grafana 9.4 bug.

Caveats

Official EL8 pgdg upstream is broken now, DO use it with caution!

Affected packages: postgis33_15, pgloader, postgresql_anonymizer_15*, postgresql_faker_15

How to Upgrade

cd ~/pigsty; tar -zcf /tmp/files.tgz files; rm -rf ~/pigsty    # backup files dir and remove
cd ~; bash -c "$(curl -fsSL http://get.pigsty.cc/latest)"    # get latest pigsty source
cd ~/pigsty; rm -rf files; tar -xf /tmp/files.tgz -C ~/pigsty  # restore files dir

Checksums

MD5 (pigsty-pkg-v2.0.1.el7.x86_64.tgz) = 5cfbe98fd9706b9e0f15c1065971b3f6
MD5 (pigsty-pkg-v2.0.1.el8.x86_64.tgz) = c34aa460925ae7548866bf51b8b8759c
MD5 (pigsty-pkg-v2.0.1.el9.x86_64.tgz) = 055057cebd93c473a67fb63bcde22d33

Special thanks to @cocoonkid for his feedback.


v2.0.0

“PIGSTY” is now the abbr of “PostgreSQL in Great STYle”

or “PostgreSQL & Infrastructure & Governance System allTogether for You”.

Get pigsty v2.0.0 via the following command:

curl -fsSL http://get.pigsty.cc/latest | bash
Download directly from GitHub Release
# get from GitHub
bash -c "$(curl -fsSL https://raw.githubusercontent.com/Vonng/pigsty/master/bin/get)"

# or download tarball directly with curl
curl -L https://github.com/Vonng/pigsty/releases/download/v2.0.0/pigsty-v2.0.0.tgz -o ~/pigsty.tgz                 # SRC
curl -L https://github.com/Vonng/pigsty/releases/download/v2.0.0/pigsty-pkg-v2.0.0.el9.x86_64.tgz -o /tmp/pkg.tgz  # EL9
curl -L https://github.com/Vonng/pigsty/releases/download/v2.0.0/pigsty-pkg-v2.0.0.el8.x86_64.tgz -o /tmp/pkg.tgz  # EL8
curl -L https://github.com/Vonng/pigsty/releases/download/v2.0.0/pigsty-pkg-v2.0.0.el7.x86_64.tgz -o /tmp/pkg.tgz  # EL7

Highlights

  • PostgreSQL 15.2, PostGIS 3.3, Citus 11.2, TimescaleDB 2.10 now works together and unite as one.
  • Now works on EL 7,8,9 for RHEL, CentOS, Rocky, AlmaLinux, and other EL compatible distributions
  • Security enhancement with self-signed CA, full SSL support, scram-sha-256 pwd encryption, and more.
  • Patroni 3.0 with native HA citus cluster support and dcs failsafe mode to prevent global DCS failures.
  • Auto-Configured, Battery-Included PITR for PostgreSQL powered by pgbackrest, local or S3/minio.
  • Dedicate module ETCD which can be easily deployed and scaled in/out. Used as DCS instead of Consul.
  • Dedicate module MINIO, local S3 alternative for the optional central backup repo for PGSQL PITR.
  • Better config templates with adaptive tuning for Node & PG according to your hardware spec.
  • Use AGPL v3.0 license instead of Apache 2.0 license due to Grafana & MinIO reference.

Compatibility

  • Pigsty now works on EL7, EL8, EL9, and offers corresponding pre-packed offline packages.
  • Pigsty now works on EL compatible distributions: RHEL, CentOS, Rocky, AlmaLinux, OracleLinux,…
  • Pigsty now use RockyLinux 9 as default developing & testing environment instead of CentOS 7
  • EL version, CPU arch, and pigsty version string are part of source & offline package names.
  • PGSQL: PostgreSQL 15.2 / PostGIS 3.3 / TimescaleDB 2.10 / Citus 11.2 now works together.
  • PGSQL: Patroni 3.0 is used as default HA solution for PGSQL, and etcd is used as default DCS.
    • Patroni 3.0 with DCS failsafe mode to prevent global DCS failures (demoting all primary)
    • Patroni 3.0 with native HA citus cluster support, with entirely open sourced v11 citus.
    • vip-manager 2.x with the ETCDv3 API; the ETCDv2 API is deprecated, as it is in Patroni.
  • PGSQL: pgBackRest v2.44 is introduced to provide battery-include PITR for PGSQL.
    • it will use local backup FS on primary by default for a two-day retention policy
    • it will use S3/minio as an alternative central backup repo for a two-week retention policy
  • ETCD is used as default DCS instead of Consul, And V3 API is used instead of V2 API.
  • NODE module now consist of node itself, haproxy, docker, node_exporter, and promtail
    • chronyd is used as default NTP client instead of ntpd
    • HAPROXY now attach to NODE instead of PGSQL, which can be used for exposing services
    • You can register PG Service to dedicate haproxy clusters rather than local cluster nodes.
    • You can expose ad hoc service in a NodePort manner with haproxy, not limited to pg services.
  • INFRA now consist of dnsmasq, nginx, prometheus, grafana, loki
    • DNSMASQ is enabled on all infra nodes, and added to all nodes as the default resolver.
    • Add blackbox_exporter for ICMP probe, add pushgateway for batch job metrics.
    • Switch to official loki & promtail rpm packages. Use official Grafana Echarts Panel.
    • Add infra dashboards for self-monitoring, add patroni & pg15 metrics to monitoring system
  • Software Upgrade
    • PostgreSQL 15.2 / PostGIS 3.3 / TimescaleDB 2.10 / Citus 11.2
    • Patroni 3.0 / Pgbouncer 1.18 / pgBackRest 2.44 / vip-manager 2.1
    • HAProxy 2.7 / Etcd 3.5 / MinIO 20230222182345 / mcli 20230216192011
    • Prometheus 2.42 / Grafana 9.3 / Loki & Promtail 2.7 / Node Exporter 1.5

Security

  • A full-featured self-signed CA enabled by default
  • Redact password in postgres logs.
  • SSL for Nginx (you have to trust the self-signed CA or use thisisunsafe to dismiss warning)
  • SSL for etcd peer/client traffics by @alemacci
  • SSL for postgres/pgbouncer/patroni by @alemacci
  • scram-sha-256 auth for postgres password encryption by @alemacci
  • Pgbouncer Auth Query by @alemacci
  • Use AES-256-CBC for pgbackrest encryption by @alemacci
  • Adding a security enhancement config template which enforce global SSL
  • Now all hba rules are defined in config inventory, no default rules.

Maintainability

  • Adaptive tuning template for PostgreSQL & Patroni by @Vonng, @alemacci
  • configurable log dir for Patroni & Postgres & Pgbouncer & Pgbackrest by @alemacci
  • Replace fixed ip placeholder 10.10.10.10 with ${admin_ip} that can be referenced
  • Adaptive upstream repo definition that can be switched according EL ver, region & arch.
  • Terraform Templates for AWS CN & Aliyun, which can be used for sandbox IaaS provisioning
  • Vagrant Templates: meta, full, el7 el8, el9, build, minio, citus, etc…
  • New playbook pgsql-monitor.yml for monitoring existing pg instance or RDS PG.
  • New playbook pgsql-migration.yml for migrating existing pg instances to pigsty-managed pg.
  • New shell utils under bin/ to simplify daily administration tasks.
  • Optimize ansible role implementation, which can be used without default parameter values.
  • Now you can define pgbouncer parameters on database & user level

API Changes

69 parameters added, 16 parameters removed, rename 14 parameters

  • INFRA.META.admin_ip : primary meta node ip address
  • INFRA.META.region : upstream mirror region: default|china|europe
  • INFRA.META.os_version : enterprise linux release version: 7,8,9
  • INFRA.CA.ca_cn : ca common name, pigsty-ca by default
  • INFRA.CA.cert_validity : cert validity, 20 years by default
  • INFRA.REPO.repo_enabled : build a local yum repo on infra node?
  • INFRA.REPO.repo_upstream : list of upstream yum repo definition
  • INFRA.REPO.repo_home : home dir of local yum repo, usually same as nginx_home ‘/www’
  • INFRA.NGINX.nginx_ssl_port : https listen port
  • INFRA.NGINX.nginx_ssl_enabled : nginx https enabled?
  • INFRA.PROMETHEUS.alertmanager_endpoint : alertmanager endpoint in (ip|domain):port format
  • NODE.NODE_TUNE.node_hugepage_count : number of 2MB hugepage, take precedence over node_hugepage_ratio
  • NODE.NODE_TUNE.node_hugepage_ratio : mem hugepage ratio, 0 disable it by default
  • NODE.NODE_TUNE.node_overcommit_ratio : node mem overcommit ratio, 0 disable it by default
  • NODE.HAPROXY.haproxy_service : list of haproxy service to be exposed
  • PGSQL.PG_ID.pg_mode : pgsql cluster mode: pgsql,citus,gpsql
  • PGSQL.PG_BUSINESS.pg_dbsu_password : dbsu password, empty string means no dbsu password by default
  • PGSQL.PG_INSTALL.pg_log_dir : postgres log dir, /pg/data/log by default
  • PGSQL.PG_BOOTSTRAP.pg_storage_type : SSD|HDD, SSD by default
  • PGSQL.PG_BOOTSTRAP.patroni_log_dir : patroni log dir, /pg/log by default
  • PGSQL.PG_BOOTSTRAP.patroni_ssl_enabled : secure patroni RestAPI communications with SSL?
  • PGSQL.PG_BOOTSTRAP.patroni_username : patroni rest api username
  • PGSQL.PG_BOOTSTRAP.patroni_password : patroni rest api password (IMPORTANT: CHANGE THIS)
  • PGSQL.PG_BOOTSTRAP.patroni_citus_db : citus database managed by patroni, postgres by default
  • PGSQL.PG_BOOTSTRAP.pg_max_conn : postgres max connections, auto will use recommended value
  • PGSQL.PG_BOOTSTRAP.pg_shared_buffer_ratio : postgres shared buffer memory ratio, 0.25 by default, 0.1~0.4
  • PGSQL.PG_BOOTSTRAP.pg_rto : recovery time objective, ttl to failover, 30s by default
  • PGSQL.PG_BOOTSTRAP.pg_rpo : recovery point objective, 1MB data loss at most by default
  • PGSQL.PG_BOOTSTRAP.pg_pwd_enc : algorithm for encrypting passwords: md5|scram-sha-256
  • PGSQL.PG_BOOTSTRAP.pgbouncer_log_dir : pgbouncer log dir, /var/log/pgbouncer by default
  • PGSQL.PG_BOOTSTRAP.pgbouncer_auth_query : if enabled, query pg_authid table to retrieve biz users instead of populating userlist
  • PGSQL.PG_BOOTSTRAP.pgbouncer_sslmode : SSL for pgbouncer client: disable|allow|prefer|require|verify-ca|verify-full
  • PGSQL.PG_BACKUP.pgbackrest_enabled : pgbackrest enabled?
  • PGSQL.PG_BACKUP.pgbackrest_clean : remove pgbackrest data during init ?
  • PGSQL.PG_BACKUP.pgbackrest_log_dir : pgbackrest log dir, /pg/log by default
  • PGSQL.PG_BACKUP.pgbackrest_method : pgbackrest backup repo method, local or minio
  • PGSQL.PG_BACKUP.pgbackrest_repo : pgbackrest backup repo config
  • PGSQL.PG_SERVICE.pg_service_provider : dedicate haproxy node group name, or empty string for local nodes by default
  • PGSQL.PG_SERVICE.pg_default_service_dest : default service destination if svc.dest=‘default’
  • PGSQL.PG_SERVICE.pg_vip_enabled : enable a l2 vip for pgsql primary? false by default
  • PGSQL.PG_SERVICE.pg_vip_address : vip address in <ipv4>/<mask> format, require if vip is enabled
  • PGSQL.PG_SERVICE.pg_vip_interface : vip network interface to listen, eth0 by default
  • PGSQL.PG_SERVICE.pg_dns_suffix : pgsql cluster dns name suffix, ’’ by default
  • PGSQL.PG_SERVICE.pg_dns_target : auto, primary, vip, none, or ad hoc ip
  • ETCD.etcd_seq : etcd instance identifier, REQUIRED
  • ETCD.etcd_cluster : etcd cluster & group name, etcd by default
  • ETCD.etcd_safeguard : prevent purging running etcd instance?
  • ETCD.etcd_clean : purging existing etcd during initialization?
  • ETCD.etcd_data : etcd data directory, /data/etcd by default
  • ETCD.etcd_port : etcd client port, 2379 by default
  • ETCD.etcd_peer_port : etcd peer port, 2380 by default
  • ETCD.etcd_init : etcd initial cluster state, new or existing
  • ETCD.etcd_election_timeout : etcd election timeout, 1000ms by default
  • ETCD.etcd_heartbeat_interval : etcd heartbeat interval, 100ms by default
  • MINIO.minio_seq : minio instance identifier, REQUIRED
  • MINIO.minio_cluster : minio cluster name, minio by default
  • MINIO.minio_clean : cleanup minio during init? false by default
  • MINIO.minio_user : minio os user, minio by default
  • MINIO.minio_node : minio node name pattern
  • MINIO.minio_data : minio data dir(s), use {x…y} to specify multiple drives
  • MINIO.minio_domain : minio external domain name, sss.pigsty by default
  • MINIO.minio_port : minio service port, 9000 by default
  • MINIO.minio_admin_port : minio console port, 9001 by default
  • MINIO.minio_access_key : root access key, minioadmin by default
  • MINIO.minio_secret_key : root secret key, minioadmin by default
  • MINIO.minio_extra_vars : extra environment variables for minio server
  • MINIO.minio_alias : alias name for local minio deployment
  • MINIO.minio_buckets : list of minio buckets to be created
  • MINIO.minio_users : list of minio users to be created

Removed Parameters

  • INFRA.CA.ca_homedir: ca home dir, now fixed as /etc/pki/
  • INFRA.CA.ca_cert: ca cert filename, now fixed as ca.crt
  • INFRA.CA.ca_key: ca key filename, now fixed as ca.key
  • INFRA.REPO.repo_upstreams: replaced by repo_upstream
  • PGSQL.PG_INSTALL.pgdg_repo: now taken care by node playbooks
  • PGSQL.PG_INSTALL.pg_add_repo: now taken care by node playbooks
  • PGSQL.PG_IDENTITY.pg_backup: not used and conflict with section name
  • PGSQL.PG_IDENTITY.pg_preflight_skip: not used anymore, replaced by pg_id
  • DCS.dcs_name : removed due to using etcd
  • DCS.dcs_servers : replaced by using ad hoc group etcd
  • DCS.dcs_registry : removed due to using etcd
  • DCS.dcs_safeguard : replaced by etcd_safeguard
  • DCS.dcs_clean : replaced by etcd_clean
  • PGSQL.PG_VIP.vip_mode : replaced by pg_vip_enabled
  • PGSQL.PG_VIP.vip_address : replaced by pg_vip_address
  • PGSQL.PG_VIP.vip_interface : replaced by pg_vip_interface

Renamed Parameters

  • nginx_upstream -> infra_portal
  • repo_address -> repo_endpoint
  • pg_hostname -> node_id_from_pg
  • pg_sindex -> pg_group
  • pg_services -> pg_default_services
  • pg_services_extra -> pg_services
  • pg_hba_rules_extra -> pg_hba_rules
  • pg_hba_rules -> pg_default_hba_rules
  • pgbouncer_hba_rules_extra -> pgb_hba_rules
  • pgbouncer_hba_rules -> pgb_default_hba_rules
  • node_packages_default -> node_default_packages
  • node_packages_meta -> infra_packages
  • node_packages_meta_pip -> infra_packages_pip
  • node_data_dir -> node_data

Checksums

MD5 (pigsty-pkg-v2.0.0.el7.x86_64.tgz) = 9ff3c973fa5915f65622b91419817c9b
MD5 (pigsty-pkg-v2.0.0.el8.x86_64.tgz) = bd108a6c8f026cb79ee62c3b68b72176
MD5 (pigsty-pkg-v2.0.0.el9.x86_64.tgz) = e24288770f240af0511b0c38fa2f4774

Special thanks to @alemacci for his great contribution!


v1.5.1

WARNING: CREATE INDEX | REINDEX CONCURRENTLY in PostgreSQL 14.0 - 14.3 may lead to index data corruption!

Please upgrade postgres to 14.4 ASAP.

Software Upgrade

  • upgrade postgres to 14.4 (important bug fix)
  • upgrade citus to 11.0-2 (with enterprise features)
  • upgrade timescaledb to 2.7 (more continuous aggregates)
  • Upgrade patroni to 2.1.4 (new sync health-check)
  • Upgrade haproxy to 2.6.0 (cli, reload, ssl,…)
  • Upgrade grafana to 9.0.0 (new ui)
  • Upgrade prometheus to 2.36.0

Bug fix

  • Fix typo in pgsql-migration.yml
  • remove pid file in haproxy config
  • remove i686 packages when using repotrack under el7
  • Fix redis service systemctl enabled issue
  • Fix patroni systemctl service enabled=no by default issue
  • stop vip-manager when purging existing postgres

API Changes

  • Mark grafana_database and grafana_pgurl as obsolete
  • Add some new etcd & pgsql alias (optional)

New Apps

  • wiki.js : Local wiki with Postgres
  • FerretDB : MongoDB API over Postgres

v1.5.0

Highlights

  • Complete Docker Support, enabled on meta nodes by default with lots of software templates.
    • bytebase pgadmin4 pgweb postgrest kong minio,…
  • Infra Self Monitoring: Nginx, ETCD, Consul, Grafana, Prometheus, Loki, etc…
  • New CMDB design compatible with redis & greenplum, visualize with CMDB Overview
  • Service Discovery : Consul SD now works again for prometheus targets management
  • Redis playbook now works on single instance with redis_port option.
  • Better cold backup support: crontab for backup, delayed standby with pg_delay
  • Use ETCD as DCS, alternative to Consul
  • Nginx Log Enhancement

Monitoring

Dashboards

  • CMDB Overview: Visualize CMDB Inventory
  • DCS Overview: Show consul & etcd metrics
  • Nginx Overview: Visualize nginx metrics & access/error logs
  • Grafana Overview: Grafana self-monitoring
  • Prometheus Overview: Prometheus self-monitoring
  • INFRA Dashboard & Home Dashboard Reforge

Architecture

  • Infra monitoring targets now have a separated target dir targets/infra
  • Consul SD is available for prometheus
  • etcd, consul, patroni, docker metrics
  • Now infra targets are managed by role infra_register
  • Upgrade pg_exporter to v0.5.0 with scale and default support
    • pg_bgwriter, pg_wal, pg_query, pg_db, pgbouncer_stat now use seconds instead of ms and µs
    • pg_table counters now have default value 0 instead of NaN
    • pg_class is replaced by pg_table and pg_index
    • pg_table_size is now enabled with 300s ttl

Provisioning

  • New optional package docker.tgz contains: Pgadmin, Pgweb, Postgrest, ByteBase, Kong, Minio, etc.
  • New Role etcd to deploy & monitor etcd dcs service
  • Specify which type of DCS to use with pg_dcs_type (etcd now available)
  • Add pg_checksum option to enable data checksum
  • Add pg_delay option to setup delayed standby leaders
  • Add node_crontab and node_crontab_overwrite to create routine jobs such as cold backup (see the sketch after this list)
  • Add a series of *_enable options to control components
  • Loki and Promtail are now installed using the RPM package made by frpm.
  • Allow customizing the monitoring logo
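
As a rough sketch, several of the new options above might be combined in a cluster config like the following. Values here are illustrative only rather than shipped defaults, and the backup script path in the crontab entry is hypothetical:

pg_dcs_type: etcd                # use etcd instead of consul as DCS
pg_checksum: true                # enable data checksum for the postgres cluster
pg_delay: 1h                     # apply delay for a delayed standby leader
node_crontab_overwrite: false    # append to /etc/crontab instead of overwriting it
node_crontab:                    # routine jobs in /etc/crontab format
  - '00 01 * * * postgres /pg/bin/pg-backup'   # hypothetical nightly cold backup entry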

Software Updates

  • Upgrade PostgreSQL to 14.3
  • Upgrade Redis to 6.2.7
  • Upgrade PG Exporter to 0.5.0
  • Upgrade Consul to 1.12.0
  • Upgrade vip-manager to v1.0.2
  • Upgrade Grafana to v8.5.2
  • Upgrade HAproxy to 2.5.7 without rsyslog dependency
  • Upgrade Loki & Promtail to v2.5.0 with RPM packages
  • New packages: pg_probackup

New software / application based on docker:

  • bytebase : DDL Schema Migrator
  • pgadmin4 : Web Admin UI for PostgreSQL
  • pgweb : Web Console for PostgreSQL
  • postgrest : Auto generated REST API for PostgreSQL
  • kong : API Gateway which uses PostgreSQL as backend storage
  • swagger openapi : API Specification Generator
  • Minio : S3-compatible object storage
  • Gitea : Private local git service

Bug Fix

  • Fix loki & promtail /etc/default config file name issue
  • Now node_data_dir (/data) is created before consul init if it does not exist
  • Fix haproxy silencing /var/log/messages due to an inappropriate rsyslog dependency

API Change

New Variable

  • node_data_dir : major data mount path, will be created if it does not exist.
  • node_crontab_overwrite : overwrite /etc/crontab instead of append
  • node_crontab: node crontab to be appended or overwritten
  • nameserver_enabled: enable nameserver on this meta node?
  • prometheus_enabled: enable prometheus on this meta node?
  • grafana_enabled: enable grafana on this meta node?
  • loki_enabled: enable loki on this meta node?
  • docker_enable: enable docker on this node?
  • consul_enable: enable consul server/agent?
  • etcd_enable: enable etcd server/clients?
  • pg_checksum: enable pg cluster data-checksum?
  • pg_delay: recovery min apply delay for standby leader
  • grafana_customize_logo: customize grafana icon

Reforge

Now *_clean are boolean flags to clean up existing instance during init.

And *_safeguard are boolean flags to avoid purging a running instance when executing any playbook (see the sketch after the list below).

  • pg_exists_action -> pg_clean
  • pg_disable_purge -> pg_safeguard
  • dcs_exists_action -> dcs_clean
  • dcs_disable_purge -> dcs_safeguard
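
As a minimal sketch (values here are illustrative, not necessarily the shipped defaults), the renamed flags read like this:

pg_clean: true          # boolean flag: clean up existing postgres during init (was pg_exists_action)
pg_safeguard: false     # boolean flag: when true, never purge a running postgres (was pg_disable_purge)
dcs_clean: true         # boolean flag: clean up existing dcs during init (was dcs_exists_action)
dcs_safeguard: false    # boolean flag: when true, never purge a running dcs (was dcs_disable_purge)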

Rename

  • node_ntp_config -> node_ntp_enabled
  • node_admin_setup -> node_admin_enabled
  • node_admin_pks -> node_admin_pk_list
  • node_dns_hosts -> node_etc_hosts_default
  • node_dns_hosts_extra -> node_etc_hosts
  • node_dns_server -> node_dns_method
  • node_local_repo_url -> node_repo_local_urls
  • node_packages -> node_packages_default
  • node_extra_packages -> node_packages
  • node_meta_pip_install -> node_packages_meta_pip
  • node_sysctl_params -> node_tune_params
  • app_list -> nginx_indexes
  • grafana_plugin -> grafana_plugin_method
  • grafana_cache -> grafana_plugin_cache
  • grafana_plugins -> grafana_plugin_list
  • grafana_git_plugin_git -> grafana_plugin_git
  • haproxy_admin_auth_enabled -> haproxy_auth_enabled
  • pg_shared_libraries -> pg_libs
  • dcs_type -> pg_dcs_type

v1.4.1

Routine bug fix / Docker Support / English Docs

Now docker is enabled on the meta node by default. You can launch tons of SaaS applications with it.

English documentation is now available.

Bug Fix


v1.4.0

Architecture

  • Decouple the system into 4 major categories: INFRA, NODES, PGSQL, REDIS, which makes pigsty far clearer and more extensible.
  • Single Node Deployment = INFRA + NODES + PGSQL
  • Deploy pgsql clusters = NODES + PGSQL
  • Deploy redis clusters = NODES + REDIS
  • Deploy other databases = NODES + xxx (e.g. MONGO, KAFKA, … TBD)

Accessibility

  • CDN for mainland China.
  • Get the latest source with bash -c "$(curl -fsSL http://get.pigsty.cc/latest)"
  • Download & Extract packages with new download script.

Monitor Enhancement

  • Split monitoring system into 5 major categories: INFRA, NODES, REDIS, PGSQL, APP
  • Logging enabled by default
    • now loki and promtail are enabled by default, with the prebuilt loki-rpm
  • Models & Labels
    • A hidden ds prometheus datasource variable is added to all dashboards, so you can easily switch between datasources simply by selecting a new one rather than modifying Grafana Datasources & Dashboards
    • An ip label is added to all metrics, and will be used as the join key between database metrics & node metrics
  • INFRA Monitoring
    • Home dashboard for infra: INFRA Overview
    • Add logging Dashboards : Logs Instance
    • PGLOG Analysis & PGLOG Session now treated as an example Pigsty APP.
  • NODES Monitoring Application
    • If you don’t care about databases at all, Pigsty can now be used as standalone host-monitoring software!
    • Consist of 4 core dashboards: Nodes Overview & Nodes Cluster & Nodes Instance & Nodes Alert
    • Introduce new identity variables for nodes: node_cluster and nodename
    • Variable pg_hostname now means setting the hostname to be the same as the postgres instance name, to keep backward compatibility
    • Variable nodename_overwrite control whether overwrite node’s hostname with nodename
    • Variable nodename_exchange will write nodename to each other’s /etc/hosts
    • All node metrics references are overhauled, joined by ip
    • Nodes monitoring targets are managed alone under /etc/prometheus/targets/nodes
  • PGSQL Monitoring Enhancement
    • A completely new PGSQL Cluster dashboard, which is simplified and focuses on the important stuff within the cluster.
    • New dashboard PGSQL Databases for cluster-level object monitoring, such as tables & queries across the entire cluster rather than a single instance.
    • The PGSQL Alert dashboard now only focuses on pgsql alerts.
    • PGSQL Shard is added to PGSQL
  • Redis Monitoring Enhancement
    • Add nodes monitoring for all redis dashboards.

MatrixDB Support

  • MatrixDB (Greenplum 7) can be deployed via pigsty-matrix.yml playbook
  • MatrixDB Monitor Dashboards : PGSQL MatrixDB
  • Example configuration added: pigsty-mxdb.yml

Provisioning Enhancement

Now the pigsty workflow works like this:

 infra.yml ---> install pigsty on single meta node
      |          then add more nodes under pigsty's management
      |
 nodes.yml ---> prepare nodes for pigsty (node setup, dcs, node_exporter, promtail)
      |          then choose one playbook to deploy database clusters on those nodes
      |
      ^--> pgsql.yml   install postgres on prepared nodes
      ^--> redis.yml   install redis on prepared nodes

infra-demo.yml = 
           infra.yml -l meta     +
           nodes.yml -l pg-test  +
           pgsql.yml -l pg-test +
           infra-loki.yml + infra-jupyter.yml + infra-pgweb.yml
 
  • nodes.yml to setup & prepare nodes for pigsty
    • setup node, node_exporter, consul agent on nodes
    • node-remove.yml is used for node de-registration
  • pgsql.yml now only works on prepared nodes
    • pgsql-remove is now only responsible for postgres itself (dcs and node monitoring are taken care of by node.yml)
    • Add a series of new options to reuse postgres role in greenplum/matrixdb
  • redis.yml now works on prepared nodes
    • and redis-remove.yml now removes redis from nodes.
  • pgsql-matrix.yml now installs matrixdb (Greenplum 7) on prepared nodes.

Software Upgrade

  • PostgreSQL 14.2
  • PostGIS 3.2
  • TimescaleDB 2.6
  • Patroni 2.1.3 (Prometheus Metrics + Failover Slots)
  • HAProxy 2.5.5 (Fix stats error, more metrics)
  • PG Exporter 0.4.1 (Timeout Parameters, and)
  • Grafana 8.4.4
  • Prometheus 2.33.4
  • Greenplum 6.19.4 / MatrixDB 4.4.0
  • Loki is now shipped as RPM packages instead of zip archives

Bug Fix

  • Remove consul dependency for patroni, which makes it much easier to migrate to a new consul cluster
  • Fix prometheus bin/new script’s default data dir path: /export/prometheus changed to /data/prometheus
  • Fix typos and tasks
  • Add restart seconds to vip-manager systemd service

API Changes

New Variable

  • node_cluster: Identity variable for node cluster
  • nodename_overwrite: If set, nodename will be set to node’s hostname
  • nodename_exchange : exchange node hostname (in /etc/hosts) among play hosts
  • node_dns_hosts_extra : extra static dns records which can be easily overwritten by single instance/cluster
  • patroni_enabled: if disabled, postgres & patroni bootstrap will not be performed during role postgres
  • pgbouncer_enabled : if disabled, pgbouncer will not be launched during role postgres
  • pg_exporter_params: extra url parameters for pg_exporter when generating monitor target url.
  • pg_provision: bool var indicating whether to perform the provisioning part of role postgres (template, db, user)
  • no_cmdb: cli args for infra.yml and infra-demo.yml playbook which will not create cmdb on meta node.

Checksums

MD5 (app.tgz) = f887313767982b31a2b094e5589a75ea
MD5 (matrix.tgz) = 3d063437c482d94bd7e35df1a08bbc84
MD5 (pigsty.tgz) = e143b88ebea1474f9ebaffddc6072c49
MD5 (pkg.tgz) = 73e8f5ce995b1f1760cb63c1904fb91b

v1.3.1

[Monitor]

  • PGSQL & PGCAT Dashboard polish
  • optimize layout for pgcat instance & pgcat database
  • add key metrics panels to pgsql instance dashboard, keep consistent with pgsql cluster
  • add table/index bloat panels to pgcat database, remove pgcat bloat dashboard.
  • add index information in pgcat database dashboard
  • fix broken panels in grafana 8.3
  • add redis index in nginx homepage

[Deploy]

  • New infra-demo.yml playbook for one-pass bootstrap
  • Use infra-jupyter.yml playbook to deploy optional jupyter lab server
  • Use infra-pgweb.yml playbook to deploy optional pgweb server
  • New pg alias on meta node, can initiate postgres cluster from admin user (in addition to postgres)
  • Adjust max_locks_per_transaction in all patroni conf templates according to timescaledb-tune’s advice
  • Add citus.node_conninfo: 'sslmode=prefer' to conf templates in order to use citus without SSL
  • Add all extensions (except for pgrouting) in pgdg14 in package list
  • Upgrade node_exporter to v1.3.1
  • Add PostgREST v9.0.0 to package list. Generate API from postgres schema.

[BugFix]

  • Fix Grafana security vulnerability (by upgrading to v8.3.1)
  • fix pg_instance & pg_service in register role when starting from the middle of a playbook
  • Fix nginx homepage render issue when host without pg_cluster variable exists
  • Fix style issue when upgrading to grafana 8.3.1

v1.3.0

  • [ENHANCEMENT] Redis Deployment (cluster,sentinel,standalone)

  • [ENHANCEMENT] Redis Monitor

    • Redis Overview Dashboard
    • Redis Cluster Dashboard
    • Redis Instance Dashboard
  • [ENHANCEMENT] monitor: PGCAT Overhaul

    • New Dashboard: PGCAT Instance
    • New Dashboard: PGCAT Database Dashboard
    • Remake Dashboard: PGCAT Table
  • [ENHANCEMENT] monitor: PGSQL Enhancement

    • New Panels: PGSQL Cluster, add 10 key metrics panel (toggled by default)
    • New Panels: PGSQL Instance, add 10 key metrics panel (toggled by default)
    • Simplify & Redesign: PGSQL Service
    • Add cross-references between PGCAT & PGSQL dashboards
  • [ENHANCEMENT] monitor deploy

    • Now the grafana datasource is automatically registered during monitor-only (monly) deployment
  • [ENHANCEMENT] software upgrade

    • add PostgreSQL 13 to default package list
    • upgrade to PostgreSQL 14.1 by default
    • add greenplum rpm and dependencies
    • add redis rpm & source packages
    • add perf as default packages

v1.2.0

  • [ENHANCEMENT] Use PostgreSQL 14 as default version
  • [ENHANCEMENT] Use TimescaleDB 2.5 as default extension
    • now timescaledb & postgis are enabled in cmdb by default
  • [ENHANCEMENT] new monitor-only mode:
    • you can use pigsty to monitor existing pg instances with a connectable url only
    • pg_exporter will be deployed on meta node locally
    • new dashboard PGSQL Cluster Monly for remote clusters
  • [ENHANCEMENT] Software upgrade
    • grafana to 8.2.2
    • pev2 to v0.11.9
    • promscale to 0.6.2
    • pgweb to 0.11.9
    • Add new extensions: pglogical pg_stat_monitor orafce
  • [ENHANCEMENT] Automatically detect machine spec and use proper node_tune and pg_conf templates
  • [ENHANCEMENT] Rework bloat-related views, more information is now exposed
  • [ENHANCEMENT] Remove timescale & citus internal monitoring
  • [ENHANCEMENT] New playbook pgsql-audit.yml to create audit report.
  • [BUG FIX] now pgbouncer_exporter resource owner is {{ pg_dbsu }} instead of postgres
  • [BUG FIX] fix pg_exporter duplicate metrics on pg_table pg_index while executing REINDEX TABLE CONCURRENTLY
  • [CHANGE] now all config templates are minimized into two: auto & demo. (removed: pub4, pg14, demo4, tiny, oltp )
    • pigsty-demo is configured if vagrant is the default user, otherwise pigsty-auto is used.

How to upgrade from v1.1.1

There’s no API change in 1.2.0. You can still use old pigsty.yml configuration files (PG13).

For the infrastructure part, re-executing repo will do most of the work.

As for the database, you can still use the existing PG13 instances. In-place upgrade is quite tricky, especially when extensions such as PostGIS & Timescale are involved. I would highly recommend performing a database migration with logical replication.

The new playbook pgsql-migration.yml will make this a lot easier. It will create a series of scripts which will help you migrate your cluster with near-zero downtime.


v1.1.1

  • [ENHANCEMENT] replace timescaledb apache version with timescale version
  • [ENHANCEMENT] upgrade prometheus to 2.30
  • [BUG FIX] now pg_exporter config dir’s owner is {{ pg_dbsu }} instead of prometheus

How to upgrade from v1.1.0: the major change in this release is timescaledb, which replaces the old Apache-licensed version with the Timescale-licensed version.

# 1. stop / pause the postgres instance running timescaledb
# 2. remove the old apache-licensed package
yum remove -y timescaledb_13

# 3. add the timescale yum repo (e.g. as /etc/yum.repos.d/timescale_timescaledb.repo)
[timescale_timescaledb]
name=timescale_timescaledb
baseurl=https://packagecloud.io/timescale/timescaledb/el/7/$basearch
repo_gpgcheck=0
gpgcheck=0
enabled=1

# 4. install the timescale-licensed package
yum install timescaledb-2-postgresql13

v1.1.0

  • [ENHANCEMENT] add pg_dummy_filesize to create fs space placeholder
  • [ENHANCEMENT] home page overhaul
  • [ENHANCEMENT] add jupyter lab integration
  • [ENHANCEMENT] add pgweb console integration
  • [ENHANCEMENT] add pgbadger support
  • [ENHANCEMENT] add pev2 support, explain visualizer
  • [ENHANCEMENT] add pglog utils
  • [ENHANCEMENT] update default pkg.tgz software version:
    • upgrade postgres to v13.4 (with official pg14 support)
    • upgrade pgbouncer to v1.16 (metrics definition updates)
    • upgrade grafana to v8.1.4
    • upgrade prometheus to v2.29
    • upgrade node_exporter to v1.2.2
    • upgrade haproxy to v2.1.1
    • upgrade consul to v1.10.2
    • upgrade vip-manager to v1.0.1

API Changes

  • nginx_upstream now holds different structures. (incompatible)

  • new config entries: app_list, render into home page’s nav entries

  • new config entries: docs_enabled, setup local docs on default server.

  • new config entries: pev2_enabled, setup local pev2 utils.

  • new config entries: pgbadger_enabled, create log summary/report dir

  • new config entries: jupyter_enabled, enable jupyter lab server on meta node

  • new config entries: jupyter_username, specify which user to run jupyter lab

  • new config entries: jupyter_password, specify jupyter lab default password

  • new config entries: pgweb_enabled, enable pgweb server on meta node

  • new config entries: pgweb_username, specify which user to run pgweb

  • rename internal flag repo_exist into repo_exists

  • now default value for repo_address is pigsty instead of yum.pigsty

  • now haproxy access point is http://pigsty instead of http://h.pigsty


v1.0.1

  • Documentation Update
    • Chinese documentation is now available
    • Machine-translated English documentation is now available
  • Bug Fix: pgsql-remove does not remove primary instance.
  • Bug Fix: replace pg_instance with pg_cluster + pg_seq
    • Start-At-Task may fail due to pg_instance undefined
  • Bug Fix: remove citus from default shared preload library
    • citus will force max_prepared_transaction to non-zero value
  • Bug Fix: ssh sudo checking in configure:
    • now ssh -t sudo -n ls is used for privilege checking
  • Typo Fix: pg-backup script typo
  • Alert Adjust: Remove ntp sanity check alert (dupe with ClockSkew)
  • Exporter Adjust: remove collector.systemd to reduce overhead

v1.0.0

v1 GA, Monitoring System Overhaul

Highlights

  • Monitoring System Overhaul

    • New Dashboards on Grafana 8.0
    • New metrics definition, with extra PG14 support
    • Simplified labeling system: static label set: (job, cls, ins)
    • New Alerting Rules & Derived Metrics
    • Monitoring multiple databases at one time
    • Realtime log search & csvlog analysis
    • Link-Rich Dashboards, click graphic elements to drill-down|roll-up
  • Architecture Changes

    • Add citus & timescaledb as part of default installation
    • Add PostgreSQL 14beta2 support
    • Simplify haproxy admin page index
    • Decouple infra & pgsql by adding a new role register
    • Add new role loki and promtail for logging
    • Add new role environ for setting up environment for admin user on admin node
    • Using static service-discovery for prometheus by default (instead of consul)
    • Add new role remove to gracefully remove cluster & instance
    • Upgrade prometheus & grafana provisioning logic.
    • Upgrade to vip-manager 1.0 , node_exporter 1.2 , pg_exporter 0.4, grafana 8.0
    • Now every database on every instance can be auto-registered as grafana datasource
    • Move consul register tasks to role register, change consul service tags
    • Add cmdb.sql as pg-meta baseline definition (CMDB & PGLOG)
  • Application Framework

    • Extensible framework for new functionalities
    • core app: PostgreSQL Monitor System: pgsql
    • core app: PostgreSQL Catalog explorer: pgcat
    • core app: PostgreSQL Csvlog Analyzer: pglog
    • add example app covid for visualizing covid-19 data.
    • add example app isd for visualizing isd data.
  • Misc

    • Add jupyterlab which brings entire python environment for data science
    • Add vonng-echarts-panel to bring Echarts support back.
    • Add wrapper scripts createpg, createdb, createuser
    • Add cmdb dynamic inventory scripts: load_conf.py, inventory_cmdb, inventory_conf
    • Remove obsolete playbooks: pgsql-monitor, pgsql-service, node-remove, etc….

API Change

Bug Fix

  • Fix default timezone Asia/Shanghai (CST) issue
  • Fix nofile limit for pgbouncer & patroni
  • Pgbouncer userlist & database list will be generated when executing tag pgbouncer

v0.9.0

Pigsty GUI, CLI, Logging Integration

Features

  • One-Line Installation

    Run this on the meta node: /bin/bash -c "$(curl -fsSL https://pigsty.cc/install)"

  • MetaDB provisioning

    Now you can use the pgsql database on the meta node as inventory instead of a static yaml file after bootstrap.

  • Add Loki & Promtail as optional logging collectors

    Now you can view, query, search postgres|pgbouncer|patroni logs with Grafana UI (PG Instance Log)

  • Pigsty CLI/GUI (beta)

    Manage your pigsty deployment with a much more human-friendly command line interface.

Bug Fix

  • Log related issues
    • fix connection reset by peer entries in postgres log caused by Haproxy health check.
    • fix Connect Reset Exception in patroni logs caused by haproxy health check
    • fix patroni log time format (remove milliseconds, add timezone)
    • set log_min_duration_statement=1s for dbuser_monitor to get rid of monitor logs.
  • Fix pgbouncer-create-user does not handle md5 password properly
  • Fix obsolete Makefile entries
  • Fix node dns nameserver being lost when aborting during resolv.conf rewrite
  • Fix db/user template and entry not null check

API Change

  • Set default value of node_disable_swap to false
  • Remove example entries of node_sysctl_params.
  • grafana_plugin default install will now download from CDN if plugins not exists
  • repo_url_packages now downloads rpms via the pigsty CDN to accelerate.
  • proxy_env.no_proxy now adds the pigsty CDN to no_proxy sites.
  • grafana_customize is set to false by default; enabling it means installing the pigsty pro UI.
  • node_admin_pk_current adds the current user’s ~/.ssh/id_rsa.pub to admin pks
  • loki_clean whether to cleanup existing loki data during init
  • loki_data_dir set default data dir for loki logging service
  • promtail_enabled enabling promtail logging agent service?
  • promtail_clean remove existing promtail status during init?
  • promtail_port default port used by promtail, 9080 by default
  • promtail_status_file location of promtail status file
  • promtail_send_url endpoint of loki service which receives log data

v0.8.0

Service Provisioning support is added in this release

New Features

  • Service provision.
  • Full locale support.

API Changes

Role vip and haproxy are merged into service.

#------------------------------------------------------------------------------
# SERVICE PROVISION
#------------------------------------------------------------------------------
pg_weight: 100              # default load balance weight (instance level)

# - service - #
pg_services:                                  # how to expose postgres service in cluster?
  # primary service will route {ip|name}:5433 to primary pgbouncer (5433->6432 rw)
  - name: primary           # service name {{ pg_cluster }}_primary
    src_ip: "*"
    src_port: 5433
    dst_port: pgbouncer     # 5433 route to pgbouncer
    check_url: /primary     # primary health check, success when instance is primary
    selector: "[]"          # select all instance as primary service candidate

  # replica service will route {ip|name}:5434 to replica pgbouncer (5434->6432 ro)
  - name: replica           # service name {{ pg_cluster }}_replica
    src_ip: "*"
    src_port: 5434
    dst_port: pgbouncer
    check_url: /read-only   # read-only health check. (including primary)
    selector: "[]"          # select all instance as replica service candidate
    selector_backup: "[? pg_role == `primary`]"   # primary are used as backup server in replica service

  # default service will route {ip|name}:5436 to primary postgres (5436->5432 primary)
  - name: default           # service's actual name is {{ pg_cluster }}-{{ service.name }}
    src_ip: "*"             # service bind ip address, * for all, vip for cluster virtual ip address
    src_port: 5436          # bind port, mandatory
    dst_port: postgres      # target port: postgres|pgbouncer|port_number , pgbouncer(6432) by default
    check_method: http      # health check method: only http is available for now
    check_port: patroni     # health check port:  patroni|pg_exporter|port_number , patroni by default
    check_url: /primary     # health check url path, / as default
    check_code: 200         # health check http code, 200 as default
    selector: "[]"          # instance selector
    haproxy:                # haproxy specific fields
      maxconn: 3000         # default front-end connection
      balance: roundrobin   # load balance algorithm (roundrobin by default)
      default_server_options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'

  # offline service will route {ip|name}:5438 to offline postgres (5438->5432 offline)
  - name: offline           # service name {{ pg_cluster }}_offline
    src_ip: "*"
    src_port: 5438
    dst_port: postgres
    check_url: /replica     # offline MUST be a replica
    selector: "[? pg_role == `offline` || pg_offline_query ]"         # instances with pg_role == 'offline' or instance marked with 'pg_offline_query == true'
    selector_backup: "[? pg_role == `replica` && !pg_offline_query]"  # replica are used as backup server in offline service

pg_services_extra: []        # extra services to be added

# - haproxy - #
haproxy_enabled: true                         # enable haproxy among every cluster members
haproxy_reload: true                          # reload haproxy after config
haproxy_policy: roundrobin                    # roundrobin, leastconn
haproxy_admin_auth_enabled: false             # enable authentication for haproxy admin?
haproxy_admin_username: admin                 # default haproxy admin username
haproxy_admin_password: admin                 # default haproxy admin password
haproxy_exporter_port: 9101                   # default admin/exporter port
haproxy_client_timeout: 3h                    # client side connection timeout
haproxy_server_timeout: 3h                    # server side connection timeout

# - vip - #
vip_mode: none                                # none | l2 | l4
vip_reload: true                              # whether reload service after config
# vip_address: 127.0.0.1                      # virtual ip address ip (l2 or l4)
# vip_cidrmask: 24                            # virtual ip address cidr mask (l2 only)
# vip_interface: eth0                         # virtual ip network interface (l2 only)

New Options

# - localization - #
pg_encoding: UTF8                             # default to UTF8
pg_locale: C                                  # default to C
pg_lc_collate: C                              # default to C
pg_lc_ctype: en_US.UTF8                       # default to en_US.UTF8

pg_reload: true                               # reload postgres after hba changes
vip_mode: none                                # none | l2 | l4
vip_reload: true                              # whether reload service after config

Remove Options

haproxy_check_port                            # covered by service options
haproxy_primary_port
haproxy_replica_port
haproxy_backend_port
haproxy_weight
haproxy_weight_fallback
vip_enabled                                   # replace by vip_mode

Service

pg_services and pg_services_extra define the services in a cluster:

A service has some mandatory fields:

  • name: the service’s name
  • src_port: which port to listen on and expose the service
  • selector: which instances belong to this service
  # default service will route {ip|name}:5436 to primary postgres (5436->5432 primary)
  - name: default           # service's actual name is {{ pg_cluster }}-{{ service.name }}
    src_ip: "*"             # service bind ip address, * for all, vip for cluster virtual ip address
    src_port: 5436          # bind port, mandatory
    dst_port: postgres      # target port: postgres|pgbouncer|port_number , pgbouncer(6432) by default
    check_method: http      # health check method: only http is available for now
    check_port: patroni     # health check port:  patroni|pg_exporter|port_number , patroni by default
    check_url: /primary     # health check url path, / as default
    check_code: 200         # health check http code, 200 as default
    selector: "[]"          # instance selector
    haproxy:                # haproxy specific fields
      maxconn: 3000         # default front-end connection
      balance: roundrobin   # load balance algorithm (roundrobin by default)
      default_server_options: 'inter 3s fastinter 1s downinter 5s rise 3 fall 3 on-marked-down shutdown-sessions slowstart 30s maxconn 3000 maxqueue 128 weight 100'

Database

Add additional locale support: lc_ctype and lc_collate.

It’s mainly because of pg_trgm’s weird behavior on i18n characters.

pg_databases:
  - name: meta                      # name is the only required field for a database
    # owner: postgres                 # optional, database owner
    # template: template1             # optional, template1 by default
    # encoding: UTF8                # optional, UTF8 by default , must same as template database, leave blank to set to db default
    # locale: C                     # optional, C by default , must same as template database, leave blank to set to db default
    # lc_collate: C                 # optional, C by default , must same as template database, leave blank to set to db default
    # lc_ctype: C                   # optional, C by default , must same as template database, leave blank to set to db default
    allowconn: true                 # optional, true by default, false disable connect at all
    revokeconn: false               # optional, false by default, true revoke connect from public # (only default user and owner have connect privilege on database)
    # tablespace: pg_default          # optional, 'pg_default' is the default tablespace
    connlimit: -1                   # optional, connection limit, -1 or none disable limit (default)
    extensions:                     # optional, extension name and where to create
      - {name: postgis, schema: public}
    parameters:                     # optional, extra parameters with ALTER DATABASE
      enable_partitionwise_join: true
    pgbouncer: true                 # optional, add this database to pgbouncer list? true by default
    comment: pigsty meta database   # optional, comment string for database

v0.7.0

Monitor only deployment support

Overview

  • Monitor Only Deployment

    • Now you can monitor existing postgres clusters without the Pigsty provisioning solution.
    • Integration with other provisioning solutions is available and under further testing.
  • Database/User Management

    • Update the user/database definition schema to cover more use cases.
    • Add pgsql-createdb.yml and pgsql-user.yml to manage users/databases on running clusters.

Features

Bug Fix

API Changes

New Options

prometheus_sd_target: batch                   # batch|single
exporter_install: none                        # none|yum|binary
exporter_repo_url: ''                         # add to yum repo if set
node_exporter_options: '--no-collector.softnet --collector.systemd --collector.ntp --collector.tcpstat --collector.processes'                          # default opts for node_exporter
pg_exporter_url: ''                           # optional, overwrite default pg_exporter target
pgbouncer_exporter_url: ''                    # optional, overwrite default pgbouncer_exporter target

Remove Options

exporter_binary_install: false                 # covered by exporter_install

Structure Changes

pg_default_roles                               # refer to pg_users
pg_users                                       # refer to pg_users
pg_databases                                   # refer to pg_databases

Rename Options

pg_default_privilegs -> pg_default_privileges  # fix typo

Database Definition

Database provisioning interface enhancement #33

Old Schema

pg_databases:                       # create a business database 'meta'
  - name: meta
    schemas: [meta]                 # create extra schema named 'meta'
    extensions: [{name: postgis}]   # create extra extension postgis
    parameters:                     # overwrite database meta's default search_path
      search_path: public, monitor

New Schema

pg_databases:
  - name: meta                      # name is the only required field for a database
    owner: postgres                 # optional, database owner
    template: template1             # optional, template1 by default
    encoding: UTF8                  # optional, UTF8 by default
    locale: C                       # optional, C by default
    allowconn: true                 # optional, true by default, false disable connect at all
    revokeconn: false               # optional, false by default, true revoke connect from public # (only default user and owner have connect privilege on database)
    tablespace: pg_default          # optional, 'pg_default' is the default tablespace
    connlimit: -1                   # optional, connection limit, -1 or none disable limit (default)
    extensions:                     # optional, extension name and where to create
      - {name: postgis, schema: public}
    parameters:                     # optional, extra parameters with ALTER DATABASE
      enable_partitionwise_join: true
    pgbouncer: true                 # optional, add this database to pgbouncer list? true by default
    comment: pigsty meta database   # optional, comment string for database

Changes

  • Add new options: template , encoding, locale, allowconn, tablespace, connlimit
  • Add new option revokeconn, which revoke connect privileges from public for this database
  • Add comment field for database

Apply Changes

You can create a new database on a running postgres cluster with the pgsql-createdb.yml playbook, as illustrated below.

  1. Define your new database in config files
  2. Pass the new database.name to the playbook with the pg_database option.
./pgsql-createdb.yml -e pg_database=<your_new_database_name>
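
For illustration, a minimal sketch under the new schema might look like the following; the database name app_db and its comment are hypothetical:

pg_databases:
  - name: app_db                      # hypothetical database name
    comment: example business database

and then apply it with:

./pgsql-createdb.yml -e pg_database=app_db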

User Definition

User provisioning interface enhancement #34

Old Schema

pg_users:
  - username: test                  # example production user have read-write access
    password: test                  # example user's password
    options: LOGIN                  # extra options
    groups: [ dbrole_readwrite ]    # dborole_admin|dbrole_readwrite|dbrole_readonly
    comment: default test user for production usage
    pgbouncer: true                 # add to pgbouncer

New Schema

pg_users:
  # complete example of user/role definition for production user
  - name: dbuser_meta               # example production user have read-write access
    password: DBUser.Meta           # example user's password, can be encrypted
    login: true                     # can login, true by default (should be false for role)
    superuser: false                # is superuser? false by default
    createdb: false                 # can create database? false by default
    createrole: false               # can create role? false by default
    inherit: true                   # can this role use inherited privileges?
    replication: false              # can this role do replication? false by default
    bypassrls: false                # can this role bypass row level security? false by default
    connlimit: -1                   # connection limit, -1 disable limit
    expire_at: '2030-12-31'         # 'timestamp' when this role is expired
    expire_in: 365                  # now + n days when this role is expired (OVERWRITE expire_at)
    roles: [dbrole_readwrite]       # dborole_admin|dbrole_readwrite|dbrole_readonly
    pgbouncer: true                 # add this user to pgbouncer? false by default (true for production user)
    parameters:                     # user's default search path
      search_path: public
    comment: test user

Changes

  • username field renamed to name
  • groups field renamed to roles
  • options are now split into separate configuration entries: login, superuser, createdb, createrole, inherit, replication, bypassrls, connlimit
  • expire_at and expire_in options
  • pgbouncer option for user is now false by default

Apply Changes

You can create new users on a running postgres cluster with the pgsql-createuser.yml playbook, as illustrated below.

  1. Define your new users in config files (pg_users)
  2. Pass the new user.name to the playbook with the pg_user option.
./pgsql-createuser.yml -e pg_user=<your_new_user_name>
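
Similarly, a minimal sketch for a user; the name dbuser_app, its password, and its roles are hypothetical values for illustration:

pg_users:
  - name: dbuser_app                  # hypothetical user name
    password: DBUser.App              # hypothetical password
    roles: [dbrole_readwrite]         # grant the read-write role
    pgbouncer: true                   # add this user to pgbouncer

and then apply it with:

./pgsql-createuser.yml -e pg_user=dbuser_app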

v0.6.0

Architecture Enhancement

Bug Fix

Monitoring Provisioning Enhancement

Haproxy Enhancement

Security Enhancement

Software Update

  • Upgrade to PG 13.2 #6

  • Prometheus 2.25 / Grafana 7.4 / Consul 1.9.3 / Node Exporter 1.1 / PG Exporter 0.3.2

API Change

New Config Entries

service_registry: consul                      # none | consul | etcd | both
prometheus_options: '--storage.tsdb.retention=30d'  # prometheus cli opts
prometheus_sd_method: consul                  # Prometheus service discovery method:static|consul
prometheus_sd_interval: 2s                    # Prometheus service discovery refresh interval
pg_offline_query: false                       # set to true to allow offline queries on this instance
node_exporter_enabled: true                   # enabling Node Exporter
pg_exporter_enabled: true                     # enabling PG Exporter
pgbouncer_exporter_enabled: true              # enabling Pgbouncer Exporter
export_binary_install: false                  # install Node/PG Exporter via copy binary
dcs_disable_purge: false                      # force dcs_exists_action = abort to avoid dcs purge
pg_disable_purge: false                       # force pg_exists_action = abort to avoid pg purge
haproxy_weight: 100                           # relative lb weight for backend instance
haproxy_weight_fallback: 1                    # primary server weight in replica service group

Obsolete Config Entries

prometheus_metrics_path                       # duplicate with exporter_metrics_path 
prometheus_retention                          # covered by `prometheus_options`

v0.5.0

Pigsty now has an Official Site 🎉!

New Features

  • Add Database Provision Template
  • Add Init Template
  • Add Business Init Template
  • Refactor HBA Rules variables
  • Fix dashboards bugs.
  • Move pg-cluster-replication to default dashboards
  • Use ZJU PostgreSQL mirror as default to accelerate repo build phase.
  • Move documentation to official site: https://pigsty.cc
  • Download newly created offline installation packages: pkg.tgz (v0.5)

Database Provision Template

Now you can customize your database content with pigsty!

pg_users:
  - username: test
    password: test
    comment: default test user
    groups: [ dbrole_readwrite ]    # dborole_admin|dbrole_readwrite|dbrole_readonly
pg_databases:                       # create a business database 'test'
  - name: test
    extensions: [{name: postgis}]   # create extra extension postgis
    parameters:                     # overwrite database meta's default search_path
      search_path: public,monitor

pg-init-template.sql will be used as the default template1 database init script; pg-init-business.sql will be used as the default business database init script.

You can now customize the default role system, schemas, extensions, and privileges with variables:

Template Configuration
# - system roles - #
pg_replication_username: replicator           # system replication user
pg_replication_password: DBUser.Replicator    # system replication password
pg_monitor_username: dbuser_monitor           # system monitor user
pg_monitor_password: DBUser.Monitor           # system monitor password
pg_admin_username: dbuser_admin               # system admin user
pg_admin_password: DBUser.Admin               # system admin password

# - default roles - #
pg_default_roles:
  - username: dbrole_readonly                 # sample user:
    options: NOLOGIN                          # role can not login
    comment: role for readonly access         # comment string

  - username: dbrole_readwrite                # sample user: one object for each user
    options: NOLOGIN
    comment: role for read-write access
    groups: [ dbrole_readonly ]               # read-write includes read-only access

  - username: dbrole_admin                    # sample user: one object for each user
    options: NOLOGIN BYPASSRLS                # admin can bypass row level security
    comment: role for object creation
    groups: [dbrole_readwrite,pg_monitor,pg_signal_backend]

  # NOTE: replicator, monitor, admin password are overwritten by separated config entry
  - username: postgres                        # reset dbsu password to NULL (if dbsu is not postgres)
    options: SUPERUSER LOGIN
    comment: system superuser

  - username: replicator
    options: REPLICATION LOGIN
    groups: [pg_monitor, dbrole_readonly]
    comment: system replicator

  - username: dbuser_monitor
    options: LOGIN CONNECTION LIMIT 10
    comment: system monitor user
    groups: [pg_monitor, dbrole_readonly]

  - username: dbuser_admin
    options: LOGIN BYPASSRLS
    comment: system admin user
    groups: [dbrole_admin]

  - username: dbuser_stats
    password: DBUser.Stats
    options: LOGIN
    comment: business read-only user for statistics
    groups: [dbrole_readonly]


# object created by dbsu and admin will have their privileges properly set
pg_default_privilegs:
  - GRANT USAGE                         ON SCHEMAS   TO dbrole_readonly
  - GRANT SELECT                        ON TABLES    TO dbrole_readonly
  - GRANT SELECT                        ON SEQUENCES TO dbrole_readonly
  - GRANT EXECUTE                       ON FUNCTIONS TO dbrole_readonly
  - GRANT INSERT, UPDATE, DELETE        ON TABLES    TO dbrole_readwrite
  - GRANT USAGE,  UPDATE                ON SEQUENCES TO dbrole_readwrite
  - GRANT TRUNCATE, REFERENCES, TRIGGER ON TABLES    TO dbrole_admin
  - GRANT CREATE                        ON SCHEMAS   TO dbrole_admin
  - GRANT USAGE                         ON TYPES     TO dbrole_admin

# schemas
pg_default_schemas: [monitor]

# extension
pg_default_extensions:
  - { name: 'pg_stat_statements',  schema: 'monitor' }
  - { name: 'pgstattuple',         schema: 'monitor' }
  - { name: 'pg_qualstats',        schema: 'monitor' }
  - { name: 'pg_buffercache',      schema: 'monitor' }
  - { name: 'pageinspect',         schema: 'monitor' }
  - { name: 'pg_prewarm',          schema: 'monitor' }
  - { name: 'pg_visibility',       schema: 'monitor' }
  - { name: 'pg_freespacemap',     schema: 'monitor' }
  - { name: 'pg_repack',           schema: 'monitor' }
  - name: postgres_fdw
  - name: file_fdw
  - name: btree_gist
  - name: btree_gin
  - name: pg_trgm
  - name: intagg
  - name: intarray

# postgres host-based authentication rules
pg_hba_rules:
  - title: allow meta node password access
    role: common
    rules:
      - host    all     all                         10.10.10.10/32      md5

  - title: allow intranet admin password access
    role: common
    rules:
      - host    all     +dbrole_admin               10.0.0.0/8          md5
      - host    all     +dbrole_admin               172.16.0.0/12       md5
      - host    all     +dbrole_admin               192.168.0.0/16      md5

  - title: allow intranet password access
    role: common
    rules:
      - host    all             all                 10.0.0.0/8          md5
      - host    all             all                 172.16.0.0/12       md5
      - host    all             all                 192.168.0.0/16      md5

  - title: allow local read-write access (local production user via pgbouncer)
    role: common
    rules:
      - local   all     +dbrole_readwrite                               md5
      - host    all     +dbrole_readwrite           127.0.0.1/32        md5

  - title: allow read-only user (stats, personal) password directly access
    role: replica
    rules:
      - local   all     +dbrole_readonly                               md5
      - host    all     +dbrole_readonly           127.0.0.1/32        md5
pg_hba_rules_extra: []

# pgbouncer host-based authentication rules
pgbouncer_hba_rules:
  - title: local password access
    role: common
    rules:
      - local  all          all                                     md5
      - host   all          all                     127.0.0.1/32    md5

  - title: intranet password access
    role: common
    rules:
      - host   all          all                     10.0.0.0/8      md5
      - host   all          all                     172.16.0.0/12   md5
      - host   all          all                     192.168.0.0/16  md5
pgbouncer_hba_rules_extra: []

v0.4.0

The second public beta (v0.4.0) of pigsty is available now! 🎉

Monitoring System

The skim version of the monitoring system consists of 10 essential dashboards:

  • PG Overview
  • PG Cluster
  • PG Service
  • PG Instance
  • PG Database
  • PG Query
  • PG Table
  • PG Table Catalog
  • PG Table Detail
  • Node

Software upgrade

  • Upgrade to PostgreSQL 13.1, Patroni 2.0.1-4, add citus to repo.
  • Upgrade to pg_exporter 0.3.1
  • Upgrade to Grafana 7.3, with tons of compatibility work
  • Upgrade to prometheus 2.23, with new UI as default
  • Upgrade to consul 1.9

Misc

  • Update prometheus alert rules
  • Fix alertmanager info links
  • Fix bugs and typos.
  • Add a simple backup script

Offline Installation

  • pkg.tgz is the latest offline install package (1GB rpm packages, made under CentOS 7.8)

v0.3.0

The first public beta (v0.3.0) of pigsty is available now! 🎉

Monitoring System

The skim version of the monitoring system consists of 8 essential dashboards:

  • PG Overview
  • PG Cluster
  • PG Service
  • PG Instance
  • PG Database
  • PG Table Overview
  • PG Table Catalog
  • Node

Database Cluster Provision

  • All config files are merged into one file: conf/all.yml by default
  • Use infra.yml to provision meta node(s) and infrastructure
  • Use initdb.yml to provision database clusters
  • Use ins-add.yml to add new instance to database cluster
  • Use ins-del.yml to remove instance from database cluster

Offline Installation

  • pkg.tgz is the latest offline install package (1GB rpm packages, made under CentOS 7.8)