vchord

Vector database plugin for Postgres, written in Rust

Module:

PGEXT

Overview

PIGSTY 3rd Party Extension: vchord : Vector database plugin for Postgres, written in Rust

Information

Extension ID: 1810
Extension Name: vchord
Package Name: vchord
Category: RAG
License: AGPLv3
Website: https://github.com/tensorchord/VectorChord
Language: Rust
Extra Tags: pgrx
Comment:

Metadata

Latest Version: 0.3.0
Postgres Support: 17,16,15,14
Need Load: Explicit Loading Required
Need DDL: Need CREATE EXTENSION DDL
Relocatable: Can not install to arbitrary schema
Trusted: Untrusted, Require Superuser to Create
Schemas: N/A
Requires: vector

RPM / DEB

RPM Repo: PIGSTY
RPM Name: vchord_$v
RPM Ver : 0.3.0
RPM Deps: pgvector_$v
DEB Repo: PIGSTY
DEB Name: postgresql-$v-vchord
DEB Ver : 0.3.0
DEB Deps: postgresql-$v-pgvector

Availability

OS	Arch	PG17	PG16	PG15	PG14
`el8`	`x86_64`	`vchord_17` PIGSTY 0.3.0	`vchord_16` PIGSTY 0.3.0	`vchord_15` PIGSTY 0.3.0	`vchord_14` PIGSTY 0.3.0
`el8`	`aarch64`	`vchord_17` PIGSTY 0.3.0	`vchord_16` PIGSTY 0.3.0	`vchord_15` PIGSTY 0.3.0	`vchord_14` PIGSTY 0.3.0
`el9`	`x86_64`	`vchord_17` PIGSTY 0.3.0	`vchord_16` PIGSTY 0.3.0	`vchord_15` PIGSTY 0.3.0	`vchord_14` PIGSTY 0.3.0
`el9`	`aarch64`	`vchord_17` PIGSTY 0.3.0	`vchord_16` PIGSTY 0.3.0	`vchord_15` PIGSTY 0.3.0	`vchord_14` PIGSTY 0.3.0
`d12`	`x86_64`	`postgresql-17-vchord` PIGSTY 0.3.0	`postgresql-16-vchord` PIGSTY 0.3.0	`postgresql-15-vchord` PIGSTY 0.3.0	`postgresql-14-vchord` PIGSTY 0.3.0
`d12`	`aarch64`	`postgresql-17-vchord` PIGSTY 0.3.0	`postgresql-16-vchord` PIGSTY 0.3.0	`postgresql-15-vchord` PIGSTY 0.3.0	`postgresql-14-vchord` PIGSTY 0.3.0
`u22`	`x86_64`	`postgresql-17-vchord` PIGSTY 0.3.0	`postgresql-16-vchord` PIGSTY 0.3.0	`postgresql-15-vchord` PIGSTY 0.3.0	`postgresql-14-vchord` PIGSTY 0.3.0
`u22`	`aarch64`	`postgresql-17-vchord` PIGSTY 0.3.0	`postgresql-16-vchord` PIGSTY 0.3.0	`postgresql-15-vchord` PIGSTY 0.3.0	`postgresql-14-vchord` PIGSTY 0.3.0
`u24`	`x86_64`	`postgresql-17-vchord` PIGSTY 0.3.0	`postgresql-16-vchord` PIGSTY 0.3.0	`postgresql-15-vchord` PIGSTY 0.3.0	`postgresql-14-vchord` PIGSTY 0.3.0
`u24`	`aarch64`	`postgresql-17-vchord` PIGSTY 0.3.0	`postgresql-16-vchord` PIGSTY 0.3.0	`postgresql-15-vchord` PIGSTY 0.3.0	`postgresql-14-vchord` PIGSTY 0.3.0

Installation

Install vchord via the pig CLI tool:

pig ext install vchord

Install vchord via Pigsty playbook:

./pgsql.yml -t pg_extension -e '{"pg_extensions": ["vchord"]}' # -l <cls>

Install vchord RPM from YUM repo directly:

dnf install vchord_17;
dnf install vchord_16;
dnf install vchord_15;
dnf install vchord_14;

Install vchord DEB from APT repo directly:

apt install postgresql-17-vchord;
apt install postgresql-16-vchord;
apt install postgresql-15-vchord;
apt install postgresql-14-vchord;

Extension vchord has to be loaded via shared_preload_libraries

shared_preload_libraries = 'vchord'; # add to pg cluster config

Create vchord extension on PostgreSQL cluster:

CREATE EXTENSION vchord CASCADE;

Usage

Add this extension to shared_preload_libraries in postgresql.conf

CREATE EXTENSION vchord CASCADE;

Create Index on embedding:

CREATE INDEX ON gist_train USING vchordrq (embedding vector_l2_ops) WITH (options = $$
residual_quantization = true
[build.internal]
lists = [4096]
spherical_centroids = false
$$);

Docs

Query

The query statement is exactly the same as pgvector. VectorChord supports any filter operation and WHERE/JOIN clauses like pgvecto.rs with VBASE.

SELECT * FROM items ORDER BY embedding <-> '[3,1,2]' LIMIT 5;

Supported distance functions are:

<-> - L2 distance
<#> - (negative) inner product
<=> - cosine distance

Query Performance Tuning

You can fine-tune the search performance by adjusting the probes and epsilon parameters:

-- Set probes to control the number of lists scanned. 
-- Recommended range: 3%–10% of the total `lists` value.
SET vchordrq.probes = 100;

-- Set epsilon to control the reranking precision.
-- Larger value means more rerank for higher recall rate.
-- Don't change it unless you only have limited memory.
-- Recommended range: 1.0–1.9. Default value is 1.9.
SET vchordrq.epsilon = 1.9;

-- vchordrq relies on a projection matrix to optimize performance.
-- Add your vector dimensions to the `prewarm_dim` list to reduce latency.
-- If this is not configured, the first query will have higher latency as the matrix is generated on demand.
-- Default value: '64,128,256,384,512,768,1024,1536'
-- Note: This setting requires a database restart to take effect.
ALTER SYSTEM SET vchordrq.prewarm_dim = '64,128,256,384,512,768,1024,1536';

And for postgres’s setting

-- If using SSDs, set `effective_io_concurrency` to 200 for faster disk I/O.
SET effective_io_concurrency = 200;

-- Disable JIT (Just-In-Time Compilation) as it offers minimal benefit (1–2%) 
-- and adds overhead for single-query workloads.
SET jit = off;

-- Allocate at least 25% of total memory to `shared_buffers`. 
-- For disk-heavy workloads, you can increase this to up to 90% of total memory. You may also want to disable swap with network storage to avoid io hang.
-- Note: A restart is required for this setting to take effect.
ALTER SYSTEM SET shared_buffers = '8GB';

Indexing prewarm

To prewarm the index, you can use the following SQL. It will significantly improve performance when using limited memory.

-- vchordrq_prewarm(index_name::regclass) to prewarm the index into the shared buffer
SELECT vchordrq_prewarm('gist_train_embedding_idx'::regclass)"

Index Build Time

Index building can parallelized, and with external centroid precomputation, the total time is primarily limited by disk speed. Optimize parallelism using the following settings:

-- Set this to the number of CPU cores available for parallel operations.
SET max_parallel_maintenance_workers = 8;
SET max_parallel_workers = 8;

-- Adjust the total number of worker processes. 
-- Note: A restart is required for this setting to take effect.
ALTER SYSTEM SET max_worker_processes = 8;

Indexing Progress

You can check the indexing progress by querying the pg_stat_progress_create_index view.

SELECT phase, round(100.0 * blocks_done / nullif(blocks_total, 0), 1) AS "%" FROM pg_stat_progress_create_index;

External Index Precomputation

Unlike pure SQL, an external index precomputation will first do clustering outside and insert centroids to a PostgreSQL table. Although it might be more complicated, external build is definitely much faster on larger dataset (>5M).

To get started, you need to do a clustering of vectors using faiss, scikit-learn or any other clustering library.

The centroids should be preset in a table of any name with 3 columns:

id(integer): id of each centroid, should be unique
parent(integer, nullable): parent id of each centroid, should be NULL for normal clustering
vector(vector): representation of each centroid, pgvector vector type

And example could be like this:

-- Create table of centroids
CREATE TABLE public.centroids (id integer NOT NULL UNIQUE, parent integer, vector vector(768));
-- Insert centroids into it
INSERT INTO public.centroids (id, parent, vector) VALUES (1, NULL, '{0.1, 0.2, 0.3, ..., 0.768}');
INSERT INTO public.centroids (id, parent, vector) VALUES (2, NULL, '{0.4, 0.5, 0.6, ..., 0.768}');
INSERT INTO public.centroids (id, parent, vector) VALUES (3, NULL, '{0.7, 0.8, 0.9, ..., 0.768}');
-- ...

-- Create index using the centroid table
CREATE INDEX ON gist_train USING vchordrq (embedding vector_l2_ops) WITH (options = $$
[build.external]
table = 'public.centroids'
$$);

To simplify the workflow, we provide end-to-end scripts for external index pre-computation, see scripts.

Limitations

Data Type Support: Currently, only the f32 data type is supported for vectors.
Architecture Compatibility: The fast-scan kernel is optimized for x86_64 architectures. While it runs on aarch64, performance may be lower.
KMeans Clustering: The built-in KMeans clustering is not yet fully optimized and may require substantial memory. We strongly recommend using external centroid precomputation for efficient index construction.

Feedback

Was this page helpful?

Glad to hear it! Please tell us how we can improve.

Sorry to hear that. Please tell us how we can improve.

Last modified 2025-05-08: update extension catalog (65955b6)