This is the multi-page printable view of this section. Click here to print.

Return to the regular view of this page.

Kernel: PGSQL

How to use another PostgreSQL “kernel” in Pigsty, such as Citus, Babelfish, IvorySQL, PolarDB, Neon, and Greenplum

You can use different “flavors” of PostgreSQL branches, forks and derivatives to replace the “native PG kernel” in Pigsty.

1 - Citus (Distributive)

Deploy native HA citus cluster with Pigsty, horizontal scaling PostgreSQL with better throughput and performance.

Beware that citus for the latest major version PostgreSQL 17 support is still WIP

Pigsty has native citus support:


Install

Citus is a standard PostgreSQL extension, which can be installed and enabled on a native PostgreSQL cluster by following the standard plugin installation process.

To install it manually, you can run the following command:

./pgsql.yml -t pg_extension -e '{"pg_extensions":["citus"]}'

Configuration

To define a citus cluster, you have to specify the following parameters:

Besides, extra hba rules that allow ssl access from local & other data nodes are required. Which may looks like this

You can define each citus cluster separately within a group, like conf/dbms/citus.yml :

all:
  children:
    pg-citus0: # citus data node 0
      hosts: { 10.10.10.10: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus0 , pg_group: 0 }
    pg-citus1: # citus data node 1
      hosts: { 10.10.10.11: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus1 , pg_group: 1 }
    pg-citus2: # citus data node 2
      hosts: { 10.10.10.12: { pg_seq: 1, pg_role: primary } }
      vars: { pg_cluster: pg-citus2 , pg_group: 2 }
    pg-citus3: # citus data node 3, with an extra replica
      hosts:
        10.10.10.13: { pg_seq: 1, pg_role: primary }
        10.10.10.14: { pg_seq: 2, pg_role: replica }
      vars: { pg_cluster: pg-citus3 , pg_group: 3 }
  vars:                               # global parameters for all citus clusters
    pg_mode: citus                    # pgsql cluster mode: citus
    pg_shard: pg-citus                # citus shard name: pg-citus
    patroni_citus_db: meta            # citus distributed database name
    pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
    pg_users: [ { name: dbuser_meta ,password: DBUser.Meta ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: meta ,extensions: [ { name: citus }, { name: postgis }, { name: timescaledb } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32 ,auth: ssl ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra        ,auth: ssl ,title: 'all user ssl access from intranet'  }

You can also specify all citus cluster members within a group, take prod.yml for example.

#==========================================================#
# pg-citus: 10 node citus cluster (5 x primary-replica pair)
#==========================================================#
pg-citus: # citus group
  hosts:
    10.10.10.50: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.51: { pg_group: 0, pg_cluster: pg-citus0 ,pg_vip_address: 10.10.10.60/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.52: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.53: { pg_group: 1, pg_cluster: pg-citus1 ,pg_vip_address: 10.10.10.61/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.54: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.55: { pg_group: 2, pg_cluster: pg-citus2 ,pg_vip_address: 10.10.10.62/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.56: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.57: { pg_group: 3, pg_cluster: pg-citus3 ,pg_vip_address: 10.10.10.63/24 ,pg_seq: 1, pg_role: replica }
    10.10.10.58: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 0, pg_role: primary }
    10.10.10.59: { pg_group: 4, pg_cluster: pg-citus4 ,pg_vip_address: 10.10.10.64/24 ,pg_seq: 1, pg_role: replica }
  vars:
    pg_mode: citus                    # pgsql cluster mode: citus
    pg_shard: pg-citus                # citus shard name: pg-citus
    pg_primary_db: test               # primary database used by citus
    pg_dbsu_password: DBUser.Postgres # all dbsu password access for citus cluster
    pg_vip_enabled: true
    pg_vip_interface: eth1
    pg_extensions: [ 'citus postgis timescaledb pgvector' ]
    pg_libs: 'citus, timescaledb, pg_stat_statements, auto_explain' # citus will be added by patroni automatically
    pg_users: [ { name: test ,password: test ,pgbouncer: true ,roles: [ dbrole_admin ] } ]
    pg_databases: [ { name: test ,owner: test ,extensions: [ { name: citus }, { name: postgis } ] } ]
    pg_hba_rules:
      - { user: 'all' ,db: all  ,addr: 10.10.10.0/24 ,auth: trust ,title: 'trust citus cluster members'        }
      - { user: 'all' ,db: all  ,addr: 127.0.0.1/32  ,auth: ssl   ,title: 'all user ssl access from localhost' }
      - { user: 'all' ,db: all  ,addr: intra         ,auth: ssl   ,title: 'all user ssl access from intranet'  }

And you can create distributed table & reference table on the coordinator node. Any data node can be used as the coordinator node since citus 11.2.


Usage

You can access any (primary) node in the cluster as you would with a regular cluster:

pgbench -i postgres://test:test@pg-citus0/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus0/test

By default, any changes you make to a shard only occur on that cluster, not on other shards.

If you want to distribute a table, you can use the following command:

psql -h pg-citus0 -d test -c "SELECT create_distributed_table('pgbench_accounts', 'aid'); SELECT truncate_local_data_after_distributing_table('public.pgbench_accounts');"
psql -h pg-citus0 -d test -c "SELECT create_reference_table('pgbench_branches')         ; SELECT truncate_local_data_after_distributing_table('public.pgbench_branches');"
psql -h pg-citus0 -d test -c "SELECT create_reference_table('pgbench_history')          ; SELECT truncate_local_data_after_distributing_table('public.pgbench_history');"
psql -h pg-citus0 -d test -c "SELECT create_reference_table('pgbench_tellers')          ; SELECT truncate_local_data_after_distributing_table('public.pgbench_tellers');"

There are two types of tables you can create:

  • distributed tables (automatic partitioning, need to specify partition key)
  • reference tables (full replication: no need to specify partition key)

Starting from Citus 11.2, any Citus database node can act as a coordinator, meaning any primary node can write.

For example, your changes will be visible on other nodes:

psql -h pg-citus1 -d test -c '\dt+'

And your scan will be distributed:

vagrant@meta-1:~$ psql -h pg-citus3 -d test -c 'explain select * from pgbench_accounts'
                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------
 Custom Scan (Citus Adaptive)  (cost=0.00..0.00 rows=100000 width=352)
   Task Count: 32
   Tasks Shown: One of 32
   ->  Task
         Node: host=10.10.10.52 port=5432 dbname=test
         ->  Seq Scan on pgbench_accounts_102008 pgbench_accounts  (cost=0.00..81.66 rows=3066 width=97)
(6 rows)

You can issue writes from different primary nodes:

pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus1/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus2/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus3/test
pgbench -nv -P1 -T1000 -c 2 postgres://test:test@pg-citus4/test

And in case of primary node failure, the replica will take over with native patroni support:

test=# select * from  pg_dist_node;
 nodeid | groupid |  nodename   | nodeport | noderack | hasmetadata | isactive | noderole | nodecluster | metadatasynced | shouldhaveshards
--------+---------+-------------+----------+----------+-------------+----------+----------+-------------+----------------+------------------
      1 |       0 | 10.10.10.51 |     5432 | default  | t           | t        | primary  | default     | t              | f
      2 |       2 | 10.10.10.54 |     5432 | default  | t           | t        | primary  | default     | t              | t
      5 |       1 | 10.10.10.52 |     5432 | default  | t           | t        | primary  | default     | t              | t
      3 |       4 | 10.10.10.58 |     5432 | default  | t           | t        | primary  | default     | t              | t
      4 |       3 | 10.10.10.56 |     5432 | default  | t           | t        | primary  | default     | t              | t

2 - WiltonDB (MSSQL)

Create SQL Server Compatible PostgreSQL cluster with WiltonDB and Babelfish (Wire Protocol Level)

Pigsty allows users to create a Microsoft SQL Server compatible PostgreSQL cluster using Babelfish and WiltonDB!

  • Babelfish: An open-source MSSQL (Microsoft SQL Server) compatibility extension Open Sourced by AWS
  • WiltonDB: A PostgreSQL kernel distribution focusing on integrating Babelfish

Babelfish is a PostgreSQL extension, but it works on a slightly modified PostgreSQL kernel Fork, WiltonDB provides compiled kernel binaries and extension binary packages on EL/Ubuntu systems.

Pigsty can replace the native PostgreSQL kernel with WiltonDB, providing an out-of-the-box MSSQL compatible cluster along with all the supported by common PostgreSQL clusters, such as HA, PITR, IaC, monitoring, etc.

WiltonDB is very similar to PostgreSQL 15, but it can not use vanilla PostgreSQL extensions directly. WiltonDB has several re-compiled extensions such as system_stats, pg_hint_plan and tds_fdw.

The cluster will listen on the default PostgreSQL port and the default MSSQL 1433 port, providing MSSQL services via the TDS WireProtocol on this port. You can connect to the MSSQL service provided by Pigsty using any MSSQL client, such as SQL Server Management Studio, or using the sqlcmd command-line tool.


Notes

When installing and deploying the MSSQL module, please pay special attention to the following points:

  • WiltonDB is available on EL (7/8/9) and Ubuntu (20.04/22.04) but not available on Debian systems.
  • WiltonDB is currently compiled based on PostgreSQL 15, so you need to specify pg_version: 15.
  • On EL systems, the wiltondb binary is installed by default in the /usr/bin/ directory, while on Ubuntu systems, it is installed in the /usr/lib/postgresql/15/bin/ directory, which is different from the official PostgreSQL binary location.
  • In WiltonDB compatibility mode, the HBA password authentication rule needs to use md5 instead of scram-sha-256. Therefore, you need to override Pigsty’s default HBA rule set and insert the md5 authentication rule required by SQL Server before the dbrole_readonly wildcard authentication rule.
  • WiltonDB can only be enabled for a primary database, and you should designate a user as the Babelfish superuser, allowing Babelfish to create databases and users. The default is mssql and dbuser_myssql. If you change this, you should also modify the user in files/mssql.sql.
  • The WiltonDB TDS cable protocol compatibility plugin babelfishpg_tds needs to be enabled in shared_preload_libraries.
  • After enabling the WiltonDB extension, it listens on the default MSSQL port 1433. You can override Pigsty’s default service definitions to redirect the primary and replica services to port 1433 instead of the 5432 / 6432ports.

The following parameters need to be configured for the MSSQL database cluster:

#----------------------------------#
# PGSQL & MSSQL (Babelfish & Wilton)
#----------------------------------#
# PG Installation
node_repo_modules: local,node,mssql # add mssql and os upstream repos
pg_mode: mssql                      # Microsoft SQL Server Compatible Mode
pg_libs: 'babelfishpg_tds, pg_stat_statements, auto_explain' # add timescaledb to shared_preload_libraries
pg_version: 15                      # The current WiltonDB major version is 15
pg_packages:
  - wiltondb                        # install forked version of postgresql with babelfishpg support
  - patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager
pg_extensions: []                   # do not install any vanilla postgresql extensions

# PG Provision
pg_default_hba_rules:               # overwrite default HBA rules for babelfish cluster
- {user: '${dbsu}'    ,db: all         ,addr: local     ,auth: ident ,title: 'dbsu access via local os user ident'  }
- {user: '${dbsu}'    ,db: replication ,addr: local     ,auth: ident ,title: 'dbsu replication from local os ident' }
- {user: '${repl}'    ,db: replication ,addr: localhost ,auth: pwd   ,title: 'replicator replication from localhost'}
- {user: '${repl}'    ,db: replication ,addr: intra     ,auth: pwd   ,title: 'replicator replication from intranet' }
- {user: '${repl}'    ,db: postgres    ,addr: intra     ,auth: pwd   ,title: 'replicator postgres db from intranet' }
- {user: '${monitor}' ,db: all         ,addr: localhost ,auth: pwd   ,title: 'monitor from localhost with password' }
- {user: '${monitor}' ,db: all         ,addr: infra     ,auth: pwd   ,title: 'monitor from infra host with password'}
- {user: '${admin}'   ,db: all         ,addr: infra     ,auth: ssl   ,title: 'admin @ infra nodes with pwd & ssl'   }
- {user: '${admin}'   ,db: all         ,addr: world     ,auth: ssl   ,title: 'admin @ everywhere with ssl & pwd'    }
- {user: dbuser_mssql ,db: mssql       ,addr: intra     ,auth: md5   ,title: 'allow mssql dbsu intranet access'     } # <--- use md5 auth method for mssql user
- {user: '+dbrole_readonly',db: all    ,addr: localhost ,auth: pwd   ,title: 'pgbouncer read/write via local socket'}
- {user: '+dbrole_readonly',db: all    ,addr: intra     ,auth: pwd   ,title: 'read/write biz user via password'     }
- {user: '+dbrole_offline' ,db: all    ,addr: intra     ,auth: pwd   ,title: 'allow etl offline tasks from intranet'}
pg_default_services:                # route primary & replica service to mssql port 1433
- { name: primary ,port: 5433 ,dest: 1433  ,check: /primary   ,selector: "[]" }
- { name: replica ,port: 5434 ,dest: 1433  ,check: /read-only ,selector: "[]" , backup: "[? pg_role == `primary` || pg_role == `offline` ]" }
- { name: default ,port: 5436 ,dest: postgres ,check: /primary   ,selector: "[]" }
- { name: offline ,port: 5438 ,dest: postgres ,check: /replica   ,selector: "[? pg_role == `offline` || pg_offline_query ]" , backup: "[? pg_role == `replica` && !pg_offline_query]"}

You can define business database & users in the pg_databases and pg_users section:

#----------------------------------#
# pgsql (singleton on current node)
#----------------------------------#
# this is an example single-node postgres cluster with postgis & timescaledb installed, with one biz database & two biz users
pg-meta:
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary } # <---- primary instance with read-write capability
  vars:
    pg_cluster: pg-test
    pg_users:                           # create MSSQL superuser
      - {name: dbuser_mssql ,password: DBUser.MSSQL ,superuser: true, pgbouncer: true ,roles: [dbrole_admin], comment: superuser & owner for babelfish  }
    pg_primary_db: mssql                # use `mssql` as the primary sql server database
    pg_databases:
      - name: mssql
        baseline: mssql.sql             # init babelfish database & user
        extensions:
          - { name: uuid-ossp          }
          - { name: babelfishpg_common }
          - { name: babelfishpg_tsql   }
          - { name: babelfishpg_tds    }
          - { name: babelfishpg_money  }
          - { name: pg_hint_plan       }
          - { name: system_stats       }
          - { name: tds_fdw            }
        owner: dbuser_mssql
        parameters: { 'babelfishpg_tsql.migration_mode' : 'multi-db' }
        comment: babelfish cluster, a MSSQL compatible pg cluster

Client Access

You can use any SQL Server compatible client tool to access this database cluster.

Microsoft provides sqlcmd as the official command-line tool.

Besides, they have a go version cli tool: go-sqlcmd

Install go-sqlcmd:

curl -LO https://github.com/microsoft/go-sqlcmd/releases/download/v1.4.0/sqlcmd-v1.4.0-linux-amd64.tar.bz2
tar xjvf sqlcmd-v1.4.0-linux-amd64.tar.bz2
sudo mv sqlcmd* /usr/bin/

Get started with go-sqlcmd

$ sqlcmd -S 10.10.10.10,1433 -U dbuser_mssql -P DBUser.MSSQL
1> select @@version
2> go
version                                                                                                                                                                                                                                                         
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Babelfish for PostgreSQL with SQL Server Compatibility - 12.0.2000.8
Oct 22 2023 17:48:32
Copyright (c) Amazon Web Services
PostgreSQL 15.4 (EL 1:15.4.wiltondb3.3_2-2.el8) on x86_64-redhat-linux-gnu (Babelfish 3.3.0)                                        

(1 row affected)

You can route service traffic to MSSQL 1433 port instead of 5433/5434:

# route 5433 on all members to 1433 on primary  
sqlcmd -S 10.10.10.11,5433 -U dbuser_mssql -P DBUser.MSSQL

# route 5434 on all members to 1433 on replicas
sqlcmd -S 10.10.10.11,5434 -U dbuser_mssql -P DBUser.MSSQL

Install

If you have the Internet access, you can add the WiltonDB repository to the node and install it as a node package directly:

node_repo_modules: local,node,pgsql,mssql
node_packages: [ wiltondb ]

Install wiltondb with the following command:

./node.yml -t node_repo,node_pkg

It’s OK to install vanilla PostgreSQL and WiltonDB on the same node, but you can only run one of them at a time, and this is not recommended for production environments.


Extensions

Most of the PGSQL module’s extensions (non-SQL class) cannot be used directly on the WiltonDB core of the MSSQL module and need to be recompiled.

WiltonDB currently comes with the following extension plugins:

Name Version Comment
dblink 1.2 connect to other PostgreSQL databases from within a database
adminpack 2.1 administrative functions for PostgreSQL
dict_int 1.0 text search dictionary template for integers
intagg 1.1 integer aggregator and enumerator (obsolete)
dict_xsyn 1.0 text search dictionary template for extended synonym processing
amcheck 1.3 functions for verifying relation integrity
autoinc 1.0 functions for autoincrementing fields
bloom 1.0 bloom access method - signature file based index
fuzzystrmatch 1.1 determine similarities and distance between strings
intarray 1.5 functions, operators, and index support for 1-D arrays of integers
btree_gin 1.3 support for indexing common datatypes in GIN
btree_gist 1.7 support for indexing common datatypes in GiST
hstore 1.8 data type for storing sets of (key, value) pairs
hstore_plperl 1.0 transform between hstore and plperl
isn 1.2 data types for international product numbering standards
hstore_plperlu 1.0 transform between hstore and plperlu
jsonb_plperl 1.0 transform between jsonb and plperl
citext 1.6 data type for case-insensitive character strings
jsonb_plperlu 1.0 transform between jsonb and plperlu
jsonb_plpython3u 1.0 transform between jsonb and plpython3u
cube 1.5 data type for multidimensional cubes
hstore_plpython3u 1.0 transform between hstore and plpython3u
earthdistance 1.1 calculate great-circle distances on the surface of the Earth
lo 1.1 Large Object maintenance
file_fdw 1.0 foreign-data wrapper for flat file access
insert_username 1.0 functions for tracking who changed a table
ltree 1.2 data type for hierarchical tree-like structures
ltree_plpython3u 1.0 transform between ltree and plpython3u
pg_walinspect 1.0 functions to inspect contents of PostgreSQL Write-Ahead Log
moddatetime 1.0 functions for tracking last modification time
old_snapshot 1.0 utilities in support of old_snapshot_threshold
pgcrypto 1.3 cryptographic functions
pgrowlocks 1.2 show row-level locking information
pageinspect 1.11 inspect the contents of database pages at a low level
pg_surgery 1.0 extension to perform surgery on a damaged relation
seg 1.4 data type for representing line segments or floating-point intervals
pgstattuple 1.5 show tuple-level statistics
pg_buffercache 1.3 examine the shared buffer cache
pg_freespacemap 1.2 examine the free space map (FSM)
postgres_fdw 1.1 foreign-data wrapper for remote PostgreSQL servers
pg_prewarm 1.2 prewarm relation data
tcn 1.0 Triggered change notifications
pg_trgm 1.6 text similarity measurement and index searching based on trigrams
xml2 1.1 XPath querying and XSLT
refint 1.0 functions for implementing referential integrity (obsolete)
pg_visibility 1.2 examine the visibility map (VM) and page-level visibility info
pg_stat_statements 1.10 track planning and execution statistics of all SQL statements executed
sslinfo 1.2 information about SSL certificates
tablefunc 1.0 functions that manipulate whole tables, including crosstab
tsm_system_rows 1.0 TABLESAMPLE method which accepts number of rows as a limit
tsm_system_time 1.0 TABLESAMPLE method which accepts time in milliseconds as a limit
unaccent 1.1 text search dictionary that removes accents
uuid-ossp 1.1 generate universally unique identifiers (UUIDs)
plpgsql 1.0 PL/pgSQL procedural language
babelfishpg_money 1.1.0 babelfishpg_money
system_stats 2.0 EnterpriseDB system statistics for PostgreSQL
tds_fdw 2.0.3 Foreign data wrapper for querying a TDS database (Sybase or Microsoft SQL Server)
babelfishpg_common 3.3.3 Transact SQL Datatype Support
babelfishpg_tds 1.0.0 TDS protocol extension
pg_hint_plan 1.5.1
babelfishpg_tsql 3.3.1 Transact SQL compatibility
  • Pigsty Pro offers the offline installation ability for MSSQL compatible extensions
  • Pigsty Pro offers MSSQL compatible extension porting services, which can port available extensions in the PGSQL module to the MSSQL cluster.

3 - IvorySQL (Oracle)

Run “Oracle-Compatible” PostgreSQL cluster with the IvorySQL Kernel open sourced by HighGo

Pigsty allows you to create PostgreSQL clusters with the IvorySQL kernel, which is a PostgreSQL fork that is compatible with Oracle SQL dialects.

Beware that ivorySQL packages are conflicting with the vanilla PostgreSQL packages, they are mutually exclusive. Pigsty Professional Edition provides offline installation solutions that include the IvorySQL kernel in another local repo.

The latest version of IvroySQL is 3.4, which is compatible with PostgreSQL 16.4. IvorySQL now only supports EL8/EL9.

The last version of IvorySQL that supports EL7 is 3.3, which corresponds to PostgreSQL 16.3.


Get Started

The following parameters need to be configured for the IvorySQL database cluster:

#----------------------------------#
# Ivory SQL Configuration
#----------------------------------#
node_repo_modules: local,node,pgsql,ivory  # add ivorysql upstream repo
pg_mode: ivory                    # IvorySQL Oracle Compatible Mode
pg_packages: [ 'ivorysql patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager' ]
pg_libs: 'liboracle_parser, pg_stat_statements, auto_explain'
pg_extensions: [ ]                # do not install any vanilla postgresql extensions

You have to dynamically load the liboracle_parser library to enable Oracle SQL compatibility.


Client Access

IvorySQL 3 is equivalent to PostgreSQL 16, you can connect to the IvorySQL cluster using any PostgreSQL compatible client tools.


Installation

If you have the Internet access, you can install the IvorySQL software package online by setting the following parameters:

node_repo_modules: local,node,pgsql,ivory
node_packages: [ ivorysql ]

And install the IvorySQL kernel and related software packages by running the following command:

./node.yml -t node_repo,node_pkg

Extensions

Most of the PGSQL modules’ extension (non-SQL classes) cannot be used directly on the IvorySQL kernel. If you need to use them, you need to recompile and install from source code for the new kernel.

Currently, the IvorySQL kernel comes with the following 101 extension plugins:

name version comment
hstore_plperl 1.0 transform between hstore and plperl
plisql 1.0 PL/iSQL procedural language
hstore_plperlu 1.0 transform between hstore and plperlu
adminpack 2.1 administrative functions for PostgreSQL
insert_username 1.0 functions for tracking who changed a table
dblink 1.2 connect to other PostgreSQL databases from within a database
dict_int 1.0 text search dictionary template for integers
amcheck 1.3 functions for verifying relation integrity
intagg 1.1 integer aggregator and enumerator (obsolete)
autoinc 1.0 functions for autoincrementing fields
bloom 1.0 bloom access method - signature file based index
dict_xsyn 1.0 text search dictionary template for extended synonym processing
btree_gin 1.3 support for indexing common datatypes in GIN
earthdistance 1.1 calculate great-circle distances on the surface of the Earth
file_fdw 1.0 foreign-data wrapper for flat file access
fuzzystrmatch 1.2 determine similarities and distance between strings
btree_gist 1.7 support for indexing common datatypes in GiST
intarray 1.5 functions, operators, and index support for 1-D arrays of integers
citext 1.6 data type for case-insensitive character strings
isn 1.2 data types for international product numbering standards
ivorysql_ora 1.0 Oracle Compatible extenison on Postgres Database
jsonb_plperl 1.0 transform between jsonb and plperl
cube 1.5 data type for multidimensional cubes
dummy_index_am 1.0 dummy_index_am - index access method template
dummy_seclabel 1.0 Test code for SECURITY LABEL feature
hstore 1.8 data type for storing sets of (key, value) pairs
jsonb_plperlu 1.0 transform between jsonb and plperlu
lo 1.1 Large Object maintenance
ltree 1.2 data type for hierarchical tree-like structures
moddatetime 1.0 functions for tracking last modification time
old_snapshot 1.0 utilities in support of old_snapshot_threshold
ora_btree_gin 1.0 support for indexing oracle datatypes in GIN
pg_trgm 1.6 text similarity measurement and index searching based on trigrams
ora_btree_gist 1.0 support for oracle indexing common datatypes in GiST
pg_visibility 1.2 examine the visibility map (VM) and page-level visibility info
pg_walinspect 1.1 functions to inspect contents of PostgreSQL Write-Ahead Log
pgcrypto 1.3 cryptographic functions
pgstattuple 1.5 show tuple-level statistics
pageinspect 1.12 inspect the contents of database pages at a low level
pgrowlocks 1.2 show row-level locking information
pg_buffercache 1.4 examine the shared buffer cache
pg_stat_statements 1.10 track planning and execution statistics of all SQL statements executed
pg_freespacemap 1.2 examine the free space map (FSM)
plsample 1.0 PL/Sample
pg_prewarm 1.2 prewarm relation data
pg_surgery 1.0 extension to perform surgery on a damaged relation
seg 1.4 data type for representing line segments or floating-point intervals
postgres_fdw 1.1 foreign-data wrapper for remote PostgreSQL servers
refint 1.0 functions for implementing referential integrity (obsolete)
test_ext_req_schema1 1.0 Required extension to be referenced
spgist_name_ops 1.0 Test opclass for SP-GiST
test_ext_req_schema2 1.0 Test schema referencing of required extensions
test_shm_mq 1.0 Test code for shared memory message queues
sslinfo 1.2 information about SSL certificates
test_slru 1.0 Test code for SLRU
tablefunc 1.0 functions that manipulate whole tables, including crosstab
bool_plperl 1.0 transform between bool and plperl
tcn 1.0 Triggered change notifications
test_ext_req_schema3 1.0 Test schema referencing of 2 required extensions
test_bloomfilter 1.0 Test code for Bloom filter library
test_copy_callbacks 1.0 Test code for COPY callbacks
test_ginpostinglist 1.0 Test code for ginpostinglist.c
test_custom_rmgrs 1.0 Test code for custom WAL resource managers
test_integerset 1.0 Test code for integerset
test_ddl_deparse 1.0 Test code for DDL deparse feature
tsm_system_rows 1.0 TABLESAMPLE method which accepts number of rows as a limit
test_ext1 1.0 Test extension 1
tsm_system_time 1.0 TABLESAMPLE method which accepts time in milliseconds as a limit
test_ext2 1.0 Test extension 2
unaccent 1.1 text search dictionary that removes accents
test_ext3 1.0 Test extension 3
test_ext4 1.0 Test extension 4
uuid-ossp 1.1 generate universally unique identifiers (UUIDs)
test_ext5 1.0 Test extension 5
worker_spi 1.0 Sample background worker
test_ext6 1.0 test_ext6
test_lfind 1.0 Test code for optimized linear search functions
xml2 1.1 XPath querying and XSLT
test_ext7 1.0 Test extension 7
plpgsql 1.0 PL/pgSQL procedural language
test_ext8 1.0 Test extension 8
test_parser 1.0 example of a custom parser for full-text search
test_pg_dump 1.0 Test pg_dump with an extension
test_ext_cine 1.0 Test extension using CREATE IF NOT EXISTS
test_predtest 1.0 Test code for optimizer/util/predtest.c
test_ext_cor 1.0 Test extension using CREATE OR REPLACE
test_rbtree 1.0 Test code for red-black tree library
test_ext_cyclic1 1.0 Test extension cyclic 1
test_ext_cyclic2 1.0 Test extension cyclic 2
test_ext_extschema 1.0 test @extschema@
test_regex 1.0 Test code for backend/regex/
test_ext_evttrig 1.0 Test extension - event trigger
bool_plperlu 1.0 transform between bool and plperlu
plperl 1.0 PL/Perl procedural language
plperlu 1.0 PL/PerlU untrusted procedural language
hstore_plpython3u 1.0 transform between hstore and plpython3u
jsonb_plpython3u 1.0 transform between jsonb and plpython3u
ltree_plpython3u 1.0 transform between ltree and plpython3u
plpython3u 1.0 PL/Python3U untrusted procedural language
pltcl 1.0 PL/Tcl procedural language
pltclu 1.0 PL/TclU untrusted procedural language

4 - PolarDB PG (RAC)

Replace vanilla PostgreSQL with PolarDB PG, which is an OSS Aurora similar to Oracle RAC

You can deploy an Aurora flavor of PostgreSQL, PolarDB, in Pigsty.

PolarDB is a distributed, shared-nothing, and high-availability database system that is compatible with PostgreSQL 15, open sourced by Aliyun.


Notes

The following parameters need to be tuned to deploy a PolarDB cluster:

#----------------------------------#
# PGSQL & PolarDB
#----------------------------------#
pg_version: 15
pg_packages: [ 'polardb patroni pgbouncer pgbackrest pg_exporter pgbadger vip-manager' ]
pg_extensions: [ ]                # do not install any vanilla postgresql extensions
pg_mode: polar                    # PolarDB Compatible Mode
pg_default_roles:                 # default roles and users in postgres cluster
  - { name: dbrole_readonly  ,login: false ,comment: role for global read-only access     }
  - { name: dbrole_offline   ,login: false ,comment: role for restricted read-only access }
  - { name: dbrole_readwrite ,login: false ,roles: [dbrole_readonly] ,comment: role for global read-write access }
  - { name: dbrole_admin     ,login: false ,roles: [pg_monitor, dbrole_readwrite] ,comment: role for object creation }
  - { name: postgres     ,superuser: true  ,comment: system superuser }
  - { name: replicator   ,superuser: true  ,replication: true ,roles: [pg_monitor, dbrole_readonly] ,comment: system replicator } # <- superuser is required for replication
  - { name: dbuser_dba   ,superuser: true  ,roles: [dbrole_admin]  ,pgbouncer: true ,pool_mode: session, pool_connlimit: 16 ,comment: pgsql admin user }
  - { name: dbuser_monitor ,roles: [pg_monitor] ,pgbouncer: true ,parameters: {log_min_duration_statement: 1000 } ,pool_mode: session ,pool_connlimit: 8 ,comment: pgsql monitor user }

Client Access

PolarDB for PostgreSQL is essentially equivalent to PostgreSQL 11, and any client tools compatible with the PostgreSQL wire protocol can access the PolarDB cluster.


Installation

If your environment has internet access, you can directly add the PolarDB repository to the node and install it as a node package:

node_repo_modules: local,node,pgsql
node_packages: [ polardb ]

ånd then install the PolarDB kernel pacakge with the following command:

./node.yml -t node_repo,node_pkg

Extensions

Most of the PGSQL module’s extension (non pure-SQL) cannot be used directly on the PolarDB kernel. If you need to use them, you need to recompile and install from source code for the new kernel.

Currently, the PolarDB kernel comes with the following 61 extension plugins. In addition to Contrib extensions, the additional extensions provided include:

  • polar_csn 1.0 : polar_csn
  • polar_monitor 1.2 : examine the polardb information
  • polar_monitor_preload 1.1 : examine the polardb information
  • polar_parameter_check 1.0 : kernel extension for parameter validation
  • polar_px 1.0 : Parallel Execution extension
  • polar_stat_env 1.0 : env stat functions for PolarDB
  • polar_stat_sql 1.3 : Kernel statistics gathering, and sql plan nodes information gathering
  • polar_tde_utils 1.0 : Internal extension for TDE
  • polar_vfs 1.0 : polar_vfs
  • polar_worker 1.0 : polar_worker
  • timetravel 1.0 : functions for implementing time travel
  • vector 0.5.1 : vector data type and ivfflat and hnsw access methods
  • smlar 1.0 : compute similary of any one-dimensional arrays

Here is the list of extensions provided by the PolarDB kernel:

name version comment
hstore_plpython2u 1.0 transform between hstore and plpython2u
dict_int 1.0 text search dictionary template for integers
adminpack 2.0 administrative functions for PostgreSQL
hstore_plpython3u 1.0 transform between hstore and plpython3u
amcheck 1.1 functions for verifying relation integrity
hstore_plpythonu 1.0 transform between hstore and plpythonu
autoinc 1.0 functions for autoincrementing fields
insert_username 1.0 functions for tracking who changed a table
bloom 1.0 bloom access method - signature file based index
file_fdw 1.0 foreign-data wrapper for flat file access
dblink 1.2 connect to other PostgreSQL databases from within a database
btree_gin 1.3 support for indexing common datatypes in GIN
fuzzystrmatch 1.1 determine similarities and distance between strings
lo 1.1 Large Object maintenance
intagg 1.1 integer aggregator and enumerator (obsolete)
btree_gist 1.5 support for indexing common datatypes in GiST
hstore 1.5 data type for storing sets of (key, value) pairs
intarray 1.2 functions, operators, and index support for 1-D arrays of integers
citext 1.5 data type for case-insensitive character strings
cube 1.4 data type for multidimensional cubes
hstore_plperl 1.0 transform between hstore and plperl
isn 1.2 data types for international product numbering standards
jsonb_plperl 1.0 transform between jsonb and plperl
dict_xsyn 1.0 text search dictionary template for extended synonym processing
hstore_plperlu 1.0 transform between hstore and plperlu
earthdistance 1.1 calculate great-circle distances on the surface of the Earth
pg_prewarm 1.2 prewarm relation data
jsonb_plperlu 1.0 transform between jsonb and plperlu
pg_stat_statements 1.6 track execution statistics of all SQL statements executed
jsonb_plpython2u 1.0 transform between jsonb and plpython2u
jsonb_plpython3u 1.0 transform between jsonb and plpython3u
jsonb_plpythonu 1.0 transform between jsonb and plpythonu
pg_trgm 1.4 text similarity measurement and index searching based on trigrams
pgstattuple 1.5 show tuple-level statistics
ltree 1.1 data type for hierarchical tree-like structures
ltree_plpython2u 1.0 transform between ltree and plpython2u
pg_visibility 1.2 examine the visibility map (VM) and page-level visibility info
ltree_plpython3u 1.0 transform between ltree and plpython3u
ltree_plpythonu 1.0 transform between ltree and plpythonu
seg 1.3 data type for representing line segments or floating-point intervals
moddatetime 1.0 functions for tracking last modification time
pgcrypto 1.3 cryptographic functions
pgrowlocks 1.2 show row-level locking information
pageinspect 1.7 inspect the contents of database pages at a low level
pg_buffercache 1.3 examine the shared buffer cache
pg_freespacemap 1.2 examine the free space map (FSM)
tcn 1.0 Triggered change notifications
plperl 1.0 PL/Perl procedural language
uuid-ossp 1.1 generate universally unique identifiers (UUIDs)
plperlu 1.0 PL/PerlU untrusted procedural language
refint 1.0 functions for implementing referential integrity (obsolete)
xml2 1.1 XPath querying and XSLT
plpgsql 1.0 PL/pgSQL procedural language
plpython3u 1.0 PL/Python3U untrusted procedural language
pltcl 1.0 PL/Tcl procedural language
pltclu 1.0 PL/TclU untrusted procedural language
polar_csn 1.0 polar_csn
sslinfo 1.2 information about SSL certificates
polar_monitor 1.2 examine the polardb information
polar_monitor_preload 1.1 examine the polardb information
polar_parameter_check 1.0 kernel extension for parameter validation
polar_px 1.0 Parallel Execution extension
tablefunc 1.0 functions that manipulate whole tables, including crosstab
polar_stat_env 1.0 env stat functions for PolarDB
smlar 1.0 compute similary of any one-dimensional arrays
timetravel 1.0 functions for implementing time travel
tsm_system_rows 1.0 TABLESAMPLE method which accepts number of rows as a limit
polar_stat_sql 1.3 Kernel statistics gathering, and sql plan nodes information gathering
tsm_system_time 1.0 TABLESAMPLE method which accepts time in milliseconds as a limit
polar_tde_utils 1.0 Internal extension for TDE
polar_vfs 1.0 polar_vfs
polar_worker 1.0 polar_worker
unaccent 1.1 text search dictionary that removes accents
postgres_fdw 1.0 foreign-data wrapper for remote PostgreSQL servers
  • Pigsty Pro has offline installation support for PolarDB and its extensions
  • Pigsty has partnership with Aliyun and can provide PolarDB kernel enterprise support for enterprise users

5 - PolarDB O(racle)

The commercial version of PolarDB for Oracle, only available in Pigsty Enterprise Edition.

Oracle Compatible version, Fork of PolarDB PG.

This is not available in OSS version.

6 - Supabase (Firebase)

How to self-host Supabase with existing managed HA PostgreSQL cluster, and launch the stateless part with docker-compose?

Supabase is the open-source Firebase alternative based on PostgreSQL.

Pigsty allow you to self-host supabase with your own HA PostgreSQL clusters.

Notice: Supabase is GA since 2024.04.15

  • Quick Start
  • Database Provision
  • Docker Provision
  • Supabase Provision


Quick Start

To run supabase with existing postgres instance, prepare the postgres database with supa.yml template:

./configure -c supa  # use 1-node supabase template
./install.yml        # install pigsty
./docker.yml         # install docker

then launch the supabase - stateless part with the docker-compose file:

cd app/supabase; make up    # https://supabase.com/docs/guides/self-hosting/docker

If your IP address is not placeholder 10.10.10.10, change the .env accordingly before launching

Then you can access the supabase studio dashboard via http://<admin_ip>:8000 by default, the default dashboard username is supabase and password is pigsty.

You can also configure the infra_portal to expose the WebUI to the public through Nginx and SSL.


Postgres

Supabase require certain PostgreSQL extensions, schemas, and roles to work, which can be pre-configured by Pigsty: supa.yml.

It will create a single-node postgres cluster named pg-meta, with the default postgres database properly configured for supabase. and install some popular & necessary extensions.

The postgres will be migrated with the baseline supabase.sql which is gathered from supabase/postgres/migrations/db/migrations in chronological order and slightly modified to fit Pigsty.


Supabase

Supabase stateless part is managed by docker-compose, the docker-compose file we use here is a simplified version of github.com/supabase/docker/docker-compose.yml.

Everything you need to care about is in the .env file, which contains important settings for supabase. It is already configured to use the pg-meta.supa database by default, You have to change that according to your actual deployment.

############
# Secrets - YOU MUST CHANGE THESE BEFORE GOING INTO PRODUCTION
############
# you have to change the JWT_SECRET to a random string with at least 32 characters long
# and issue new ANON_KEY/SERVICE_ROLE_KEY JWT with that new secret, check the tutorial:
# https://supabase.com/docs/guides/self-hosting/docker#securing-your-services
JWT_SECRET=your-super-secret-jwt-token-with-at-least-32-characters-long
ANON_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJhbm9uIiwKICAgICJpc3MiOiAic3VwYWJhc2UtZGVtbyIsCiAgICAiaWF0IjogMTY0MTc2OTIwMCwKICAgICJleHAiOiAxNzk5NTM1NjAwCn0.dc_X5iR_VP_qT0zsiyj_I_OZ2T9FtRU2BBNWN8Bu4GE
SERVICE_ROLE_KEY=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyAgCiAgICAicm9sZSI6ICJzZXJ2aWNlX3JvbGUiLAogICAgImlzcyI6ICJzdXBhYmFzZS1kZW1vIiwKICAgICJpYXQiOiAxNjQxNzY5MjAwLAogICAgImV4cCI6IDE3OTk1MzU2MDAKfQ.DaYlNEoUrrEn2Ig7tqibS-PHK5vgusbcbo7X36XVt4Q

############
# Dashboard - Credentials for the Supabase Studio WebUI
############
DASHBOARD_USERNAME=supabase         # change to your own username
DASHBOARD_PASSWORD=pigsty           # change to your own password

############
# Database - You can change these to any PostgreSQL database that has logical replication enabled.
############
POSTGRES_HOST=10.10.10.10           # change to Pigsty managed PostgreSQL cluster/instance VIP/IP/Hostname
POSTGRES_PORT=5432                  # you can use other service port such as 5433, 5436, 6432, etc...
POSTGRES_DB=supa                    # change to supabase database name, `supa` by default in pigsty
POSTGRES_PASSWORD=DBUser.Supa       # supabase dbsu password (shared by multiple supabase biz users)

Usually you’ll have to change these parameters accordingly. Here we’ll use fixed username, password and IP:Port database connstr for simplicity.

The postgres username is fixed as supabase_admin and the password is DBUser.Supa, change that according to your supa.yml And the supabase studio WebUI credential is managed by DASHBOARD_USERNAME and DASHBOARD_PASSWORD, which is supabase and pigsty by default.

The official tutorial: Self-Hosting with Docker just have all the details you need.

Once configured, you can launch the stateless part with docker-compose or make up shortcut:

cd ~/pigsty/app/supabase; make up    #  = docker compose up

Expose Service

The supabase studio dashboard is exposed on port 8000 by default, you can add this service to the infra_portal to expose it to the public through Nginx and SSL.

    infra_portal:                     # domain names and upstream servers
      # ...
      supa : { domain: supa.pigsty ,endpoint: "10.10.10.10:8000", websocket: true }

To expose the service, you can run the infra.yml playbook with the nginx tag:

./infra.yml -t nginx

Make suare supa.pigsty or your own domain is resolvable to the infra_portal server, and you can access the supabase studio dashboard via https://supa.pigsty.

7 - Greenplum (MPP)

Deploy and monitoring Greenplum/YMatrix MPP clusters with Pigsty

Pigsty has native support for Greenplum and its derivative distribution YMatrixDB.

It can deploy Greenplum clusters and monitor them with Pigsty.

To define a Greenplum cluster, you need to specify the following parameters:

Set pg_mode = gpsql and the extra identity parameters pg_shard and gp_role.

#================================================================#
#                        GPSQL Clusters                          #
#================================================================#

#----------------------------------#
# cluster: mx-mdw (gp master)
#----------------------------------#
mx-mdw:
  hosts:
    10.10.10.10: { pg_seq: 1, pg_role: primary , nodename: mx-mdw-1 }
  vars:
    gp_role: master          # this cluster is used as greenplum master
    pg_shard: mx             # pgsql sharding name & gpsql deployment name
    pg_cluster: mx-mdw       # this master cluster name is mx-mdw
    pg_databases:
      - { name: matrixmgr , extensions: [ { name: matrixdbts } ] }
      - { name: meta }
    pg_users:
      - { name: meta , password: DBUser.Meta , pgbouncer: true }
      - { name: dbuser_monitor , password: DBUser.Monitor , roles: [ dbrole_readonly ], superuser: true }

    pgbouncer_enabled: true                # enable pgbouncer for greenplum master
    pgbouncer_exporter_enabled: false      # enable pgbouncer_exporter for greenplum master
    pg_exporter_params: 'host=127.0.0.1&sslmode=disable'  # use 127.0.0.1 as local monitor host

#----------------------------------#
# cluster: mx-sdw (gp master)
#----------------------------------#
mx-sdw:
  hosts:
    10.10.10.11:
      nodename: mx-sdw-1        # greenplum segment node
      pg_instances:             # greenplum segment instances
        6000: { pg_cluster: mx-seg1, pg_seq: 1, pg_role: primary , pg_exporter_port: 9633 }
        6001: { pg_cluster: mx-seg2, pg_seq: 2, pg_role: replica , pg_exporter_port: 9634 }
    10.10.10.12:
      nodename: mx-sdw-2
      pg_instances:
        6000: { pg_cluster: mx-seg2, pg_seq: 1, pg_role: primary , pg_exporter_port: 9633  }
        6001: { pg_cluster: mx-seg3, pg_seq: 2, pg_role: replica , pg_exporter_port: 9634  }
    10.10.10.13:
      nodename: mx-sdw-3
      pg_instances:
        6000: { pg_cluster: mx-seg3, pg_seq: 1, pg_role: primary , pg_exporter_port: 9633 }
        6001: { pg_cluster: mx-seg1, pg_seq: 2, pg_role: replica , pg_exporter_port: 9634 }
  vars:
    gp_role: segment               # these are nodes for gp segments
    pg_shard: mx                   # pgsql sharding name & gpsql deployment name
    pg_cluster: mx-sdw             # these segment clusters name is mx-sdw
    pg_preflight_skip: true        # skip preflight check (since pg_seq & pg_role & pg_cluster not exists)
    pg_exporter_config: pg_exporter_basic.yml                             # use basic config to avoid segment server crash
    pg_exporter_params: 'options=-c%20gp_role%3Dutility&sslmode=disable'  # use gp_role = utility to connect to segments

Besides, you’ll need extra parameters to connect to Greenplum Segment instances for monitoring.

Since Greenplum is no longer Open-Sourced, this feature is only available in the Professional/Enterprise version and is not open-sourced at this time.

8 - Cloudberry (MPP)

Deploy Cloudberry MPP cluster, which is forked from Greenplum.

Install

Pigsty has cloudberry packages for EL 7/8/9:

./node.yml -t node_install  -e '{"node_repo_modules":"pgsql","node_packages":["cloudberrydb"]}'

9 - Neon (Serverless)

Self-Hosting serverless version of PostgreSQL from Neon, which is a powerful, truly scalable, and elastic service.

Neon adopts a storage and compute separation architecture, offering seamless features such as auto-scaling, Scale to Zero, and unique capabilities like database version branching.

Neon official website: https://neon.tech/

Due to the substantial size of Neon’s compiled binaries, they are not currently available to open-source users. If you need them, please contact Pigsty sales.