# pg_bulkload

## Overview
| Package | Version | Category | License | Language |
|---|---|---|---|---|
| pg_bulkload | 3.1.23 | ETL | BSD 3-Clause | C |
| ID | Extension | Bin | Lib | Load | Create | Trust | Reloc | Schema |
|---|---|---|---|---|---|---|---|---|
| 9830 | pg_bulkload | Yes | Yes | No | Yes | No | No | - |
| Related |
|---|
| file_fdw, aws_s3, db_migrator, pg_fact_loader, mysql_fdw, oracle_fdw, postgres_fdw, pglogical |

> PG 18 support fixed by vonng
## Version
| Type | Repo | Version | PG Ver | Package | Deps |
|---|---|---|---|---|---|
| EXT | PIGSTY | 3.1.23 | 18,17,16,15,14 | pg_bulkload | - |
| RPM | PGDG | 3.1.23 | 18,17,16,15,14 | pg_bulkload_$v | - |
| DEB | PIGSTY | 3.1.23 | 18,17,16,15,14 | postgresql-$v-pg-bulkload | - |
## Build

You can build the RPM / DEB packages for pg_bulkload with pig build:

```bash
pig build pkg pg_bulkload   # build RPM / DEB packages
```
## Install

You can install pg_bulkload directly. First, make sure the PGDG and PIGSTY repositories are added and enabled:

```bash
pig repo add pgsql -u   # add repos and update cache
```
Install the extension with pig, or with apt/yum/dnf:

```bash
pig install pg_bulkload;             # install for the current active PG version
pig ext install -y pg_bulkload -v 18 # PG 18
pig ext install -y pg_bulkload -v 17 # PG 17
pig ext install -y pg_bulkload -v 16 # PG 16
pig ext install -y pg_bulkload -v 15 # PG 15
pig ext install -y pg_bulkload -v 14 # PG 14
```

```bash
dnf install -y pg_bulkload_18 # PG 18
dnf install -y pg_bulkload_17 # PG 17
dnf install -y pg_bulkload_16 # PG 16
dnf install -y pg_bulkload_15 # PG 15
dnf install -y pg_bulkload_14 # PG 14
```

```bash
apt install -y postgresql-18-pg-bulkload # PG 18
apt install -y postgresql-17-pg-bulkload # PG 17
apt install -y postgresql-16-pg-bulkload # PG 16
apt install -y postgresql-15-pg-bulkload # PG 15
apt install -y postgresql-14-pg-bulkload # PG 14
```
Create the extension:

```sql
CREATE EXTENSION pg_bulkload;
```
## Usage

pg_bulkload is a high-speed data loading utility for PostgreSQL. It bypasses shared buffers to load massive data sets quickly, and includes built-in ETL features for input validation and data transformation.
### Basic Usage

Load data using a control file:

```bash
pg_bulkload sample_csv.ctl
```
Output:

```
NOTICE: BULK LOAD START
NOTICE: BULK LOAD END
	0 Rows skipped.
	8 Rows successfully loaded.
	0 Rows not loaded due to parse errors.
	0 Rows not loaded due to duplicate errors.
	0 Rows replaced with new rows.
```
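When scripting loads, the summary lines above can be parsed to fail a pipeline on errors. A minimal shell sketch (the awk patterns and the embedded sample output are my own, mirroring the output shown above):

```shell
#!/bin/sh
# Extract row counts from the pg_bulkload summary and fail on parse errors.
# In a real pipeline you would capture this from `pg_bulkload sample_csv.ctl`.
summary='0 Rows skipped.
8 Rows successfully loaded.
0 Rows not loaded due to parse errors.
0 Rows not loaded due to duplicate errors.
0 Rows replaced with new rows.'

# First field of the matching line is the count.
loaded=$(printf '%s\n' "$summary" | awk '/successfully loaded/ {print $1}')
errors=$(printf '%s\n' "$summary" | awk '/parse errors/ {print $1}')

echo "loaded=$loaded errors=$errors"
[ "$errors" -eq 0 ] || exit 1   # abort the pipeline on any parse error
```

This keeps a silent partial load (rows dropped by PARSE_ERRORS) from going unnoticed in automation.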
### Control File Example

```ini
# sample_csv.ctl
OUTPUT = my_table
INPUT = /path/to/data.csv
TYPE = CSV
DELIMITER = ,
QUOTE = "\""
ESCAPE = "\""
NULL = ""
SKIP = 1              # skip header row
PARSE_ERRORS = 100    # allow up to 100 parse errors
DUPLICATE_ERRORS = 0  # reject on duplicate key errors
ON_DUPLICATE_KEEP = NEW  # or OLD
TRUNCATE = NO
```
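For reference, here is what an input file matching those settings looks like: comma-delimited, double-quote quoting and escaping, an empty field for NULL, and a header row that SKIP = 1 tells the loader to ignore. The column names and rows are illustrative only, not part of pg_bulkload:

```shell
#!/bin/sh
# Generate a small CSV compatible with the control file above.
cat > /tmp/data.csv <<'EOF'
id,name,created_at
1,alice,2024-01-01
2,"bob ""the builder""",
EOF

# Header + 2 data rows; the embedded quote is doubled per ESCAPE = "\"".
wc -l < /tmp/data.csv
```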
### Loading Modes

- DIRECT: bypasses shared buffers and writes directly to the data files (fastest; the default)
- PARALLEL: uses multiple processes to overlap reading and writing
- Input format is selected separately via the TYPE option: CSV, BINARY, or FIXED (fixed-length fields)
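The mode is chosen with the WRITER option in the control file, independently of the input TYPE. A fragment sketching this, per the pg_bulkload documentation (option values are documented; the comments are mine):

```ini
# DIRECT (the default): write straight to the data files
WRITER = DIRECT

# PARALLEL: use a separate writer process to overlap read and write
# WRITER = PARALLEL

# Input format is chosen independently of the writer
TYPE = CSV
```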
### SQL Interface

The extension also installs a loader function (in the pgbulkload schema) that takes control-file options as a text array rather than a single string:

```sql
-- Load data from within SQL
SELECT * FROM pgbulkload.pg_bulkload(ARRAY[
  'TYPE=CSV',
  'INPUT=/path/to/data.csv',
  'OUTPUT=my_table'
]);
```
Key Features
- Bypasses PostgreSQL shared buffers for maximum throughput
- Input data validation with configurable error thresholds
- Duplicate key handling (keep new, keep old, or reject)
- CSV, fixed-length, and binary input formats
- Row skipping and FILTER functions for data transformation
- Parallel loading support
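One of the ETL hooks is the FILTER control-file option, which routes each parsed input record through a SQL function before insertion. A sketch, assuming a hypothetical target table my_table(id int, name text); the function name and logic are illustrative:

```sql
-- Filter function: receives the parsed input columns and returns a
-- record shaped like the target table. Here it uppercases the name.
CREATE FUNCTION clean_row(id int, name text)
RETURNS my_table AS $$
  SELECT ROW(id, upper(name))::my_table;
$$ LANGUAGE sql;

-- Then reference it in the control file:
--   FILTER = clean_row
```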
## Documentation

Full documentation: http://ossc-db.github.io/pg_bulkload/index.html