pg_bulkload

pg_bulkload is a high-speed data loading utility for PostgreSQL.

Overview

Package | Version | Category | License | Language
pg_bulkload | 3.1.23 | ETL | BSD 3-Clause | C

ID | Extension | Bin | Lib | Load | Create | Trust | Reloc | Schema
9830 | pg_bulkload | Yes | Yes | No | Yes | No | No | -

Related: file_fdw aws_s3 db_migrator pg_fact_loader mysql_fdw oracle_fdw postgres_fdw pglogical

pg18 fixed by vonng

Version

Type | Repo | Version | PG Ver | Package | Deps
EXT | PIGSTY | 3.1.23 | 18 17 16 15 14 | pg_bulkload | -
RPM | PGDG | 3.1.23 | 18 17 16 15 14 | pg_bulkload_$v | -
DEB | PIGSTY | 3.1.23 | 18 17 16 15 14 | postgresql-$v-pg-bulkload | -

OS / PG | PG18 | PG17 | PG16 | PG15 | PG14
el8.x86_64
el8.aarch64
el9.x86_64
el9.aarch64
el10.x86_64
el10.aarch64
d12.x86_64
d12.aarch64
d13.x86_64
d13.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23
u22.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23
u22.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23
u24.x86_64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23
u24.aarch64 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23 | PIGSTY 3.1.23

Build

You can build the RPM / DEB packages for pg_bulkload using pig build:

pig build pkg pg_bulkload         # build RPM / DEB packages

Install

You can install pg_bulkload directly. First, make sure the PGDG and PIGSTY repositories are added and enabled:

pig repo add pgsql -u          # Add repo and update cache

Install the extension using pig or apt/yum/dnf:

pig install pg_bulkload           # Install for current active PG version
pig ext install -y pg_bulkload -v 18  # PG 18
pig ext install -y pg_bulkload -v 17  # PG 17
pig ext install -y pg_bulkload -v 16  # PG 16
pig ext install -y pg_bulkload -v 15  # PG 15
pig ext install -y pg_bulkload -v 14  # PG 14
dnf install -y pg_bulkload_18       # PG 18
dnf install -y pg_bulkload_17       # PG 17
dnf install -y pg_bulkload_16       # PG 16
dnf install -y pg_bulkload_15       # PG 15
dnf install -y pg_bulkload_14       # PG 14
apt install -y postgresql-18-pg-bulkload   # PG 18
apt install -y postgresql-17-pg-bulkload   # PG 17
apt install -y postgresql-16-pg-bulkload   # PG 16
apt install -y postgresql-15-pg-bulkload   # PG 15
apt install -y postgresql-14-pg-bulkload   # PG 14

Create Extension:

CREATE EXTENSION pg_bulkload;

Usage

A high-speed data loading tool for PostgreSQL that bypasses shared buffers for massive data loads, with built-in ETL features for input validation and data transformation.

Basic Usage

Load data using a control file:

pg_bulkload sample_csv.ctl

Output:

NOTICE: BULK LOAD START
NOTICE: BULK LOAD END
    0 Rows skipped.
    8 Rows successfully loaded.
    0 Rows not loaded due to parse errors.
    0 Rows not loaded due to duplicate errors.
    0 Rows replaced with new rows.
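
When pg_bulkload runs from a script, the summary can be checked mechanically instead of by eye. The following is a minimal sketch, assuming the summary lines keep the wording shown in the sample output above; `summary_count` is a hypothetical helper, not part of pg_bulkload.

```shell
#!/bin/sh
# Hypothetical helper: read pg_bulkload output on stdin and print the
# leading number of the first line that contains the given label.
summary_count() {
    # $1 = label text, e.g. "Rows successfully loaded"
    awk -v label="$1" 'index($0, label) { print $1; exit }'
}

# Sample output copied from above. In a real script it could be captured
# with something like: output=$(pg_bulkload sample_csv.ctl 2>&1)
output='NOTICE: BULK LOAD START
NOTICE: BULK LOAD END
    0 Rows skipped.
    8 Rows successfully loaded.
    0 Rows not loaded due to parse errors.
    0 Rows not loaded due to duplicate errors.
    0 Rows replaced with new rows.'

loaded=$(printf '%s\n' "$output" | summary_count "Rows successfully loaded")
parse_errors=$(printf '%s\n' "$output" | summary_count "parse errors")

echo "loaded=$loaded parse_errors=$parse_errors"
```

A wrapper could compare `loaded` against the expected row count, or fail the job when `parse_errors` is not zero.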

Control File Example

# sample_csv.ctl
OUTPUT = my_table
INPUT = /path/to/data.csv
TYPE = CSV
DELIMITER = ","
QUOTE = "\""
ESCAPE = "\""
NULL = ""
SKIP = 1              # skip header row
PARSE_ERRORS = 100    # allow up to 100 parse errors
DUPLICATE_ERRORS = 0  # reject on duplicate key errors
ON_DUPLICATE_KEEP = NEW  # or OLD
TRUNCATE = NO
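
To try a control file end to end, the sketch below writes a small CSV and a matching control file into a temporary directory. The table `my_table`, its two columns, and the file names are assumptions for illustration. The pg_bulkload call is skipped when the binary is not on PATH.

```shell
#!/bin/sh
set -eu

workdir=$(mktemp -d)
csv="$workdir/data.csv"
ctl="$workdir/sample_csv.ctl"

# Hypothetical input: one header row plus two data rows.
cat > "$csv" <<'EOF'
id,name
1,alice
2,bob
EOF

# Control file built from keys shown in the example above.
# An absolute INPUT path keeps the file location unambiguous.
cat > "$ctl" <<EOF
OUTPUT = my_table
INPUT = $csv
TYPE = CSV
DELIMITER = ","
SKIP = 1
PARSE_ERRORS = 0
EOF

if command -v pg_bulkload >/dev/null 2>&1; then
    # Assumes my_table(id int, name text) already exists and that the
    # connection settings come from the usual PG* environment variables.
    pg_bulkload "$ctl" || echo "load failed; see the pg_bulkload output"
else
    echo "pg_bulkload not found; control file written to $ctl"
fi
```

Generating the control file from a template like this keeps the input path and the SKIP setting in one place when the same load is repeated for many files.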

Loading Modes

  • DIRECT: Bypasses shared buffers and writes directly to data files (fastest); selected with WRITER = DIRECT
  • PARALLEL: Uses multiple processes for loading; selected with WRITER = PARALLEL
  • CSV/BINARY/FIXED: Input formats rather than writers; selected with TYPE
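
Upstream pg_bulkload selects the writer through the WRITER key of the control file. The sketch below assumes the values DIRECT, BUFFERED, and PARALLEL are accepted, as described in the upstream manual; `make_ctl` is a hypothetical helper, and the table name and path are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a CSV control file for the given writer.
# Writer names outside the assumed list are rejected.
make_ctl() {
    writer=$1
    case $writer in
        DIRECT|BUFFERED|PARALLEL) ;;
        *) echo "unsupported WRITER: $writer" >&2; return 1 ;;
    esac
    cat <<EOF
OUTPUT = my_table
INPUT = /path/to/data.csv
TYPE = CSV
WRITER = $writer
EOF
}

# Print a control file that requests the parallel writer.
make_ctl PARALLEL
```

Redirecting the output to a file (for example `make_ctl PARALLEL > parallel.ctl`) gives a control file that can be passed to pg_bulkload as in Basic Usage.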

SQL Interface

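-- Load data from within SQL
-- Upstream creates the function in the pgbulkload schema; options are passed as a text[] of key=value entries
SELECT * FROM pgbulkload.pg_bulkload(
    ARRAY['OUTPUT=my_table', 'INPUT=/path/to/data.csv', 'TYPE=CSV']
);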

Key Features

  • Bypasses PostgreSQL shared buffers for maximum throughput
  • Input data validation with configurable error thresholds
  • Duplicate key handling (keep new, keep old, or reject)
  • CSV, fixed-length, and binary input formats
  • Row skipping and filter functions for data transformation
  • Parallel loading support

Documentation

Full documentation: http://ossc-db.github.io/pg_bulkload/index.html


Last Modified 2026-03-12: add pg extension catalog (95749bf)