count_distinct
An alternative to COUNT(DISTINCT …) aggregate, usable with HashAggregate
Repository
tvondra/count_distinct
https://github.com/tvondra/count_distinct
Source
count_distinct-3.0.2.tar.gz
Overview
| Package | Version | Category | License | Language |
|---|---|---|---|---|
| count_distinct | 3.0.2 | FUNC | BSD 2-Clause | C |
| ID | Extension | Bin | Lib | Load | Create | Trust | Reloc | Schema |
|---|---|---|---|---|---|---|---|---|
| 4630 | count_distinct | No | Yes | No | Yes | No | Yes | - |
| Related | topn hll omnisketch ddsketch quantile lower_quantile first_last_agg aggs_for_arrays |
|---|---|
Note: there is no PG 14 package for EL8/EL9 in the PGDG repository.
Version
| Type | Repo | Version | PG Ver | Package | Deps |
|---|---|---|---|---|---|
| EXT | MIXED | 3.0.2 | 18, 17, 16, 15, 14 | count_distinct | - |
| RPM | PIGSTY | 3.0.2 | 18, 17, 16, 15, 14 | count_distinct_$v | - |
| DEB | PIGSTY | 3.0.2 | 18, 17, 16, 15, 14 | postgresql-$v-count-distinct | - |
Build
You can build the RPM / DEB packages for count_distinct using pig build:
pig build pkg count_distinct # build RPM / DEB packages
Install
You can install count_distinct directly. First, make sure the PGDG and PIGSTY repositories are added and enabled:
pig repo add pgsql -u # Add repo and update cache
Install the extension using pig or apt/yum/dnf:
pig install count_distinct; # Install for current active PG version
pig ext install -y count_distinct -v 18 # PG 18
pig ext install -y count_distinct -v 17 # PG 17
pig ext install -y count_distinct -v 16 # PG 16
pig ext install -y count_distinct -v 15 # PG 15
pig ext install -y count_distinct -v 14 # PG 14
dnf install -y count_distinct_18 # PG 18
dnf install -y count_distinct_17 # PG 17
dnf install -y count_distinct_16 # PG 16
dnf install -y count_distinct_15 # PG 15
dnf install -y count_distinct_14 # PG 14
apt install -y postgresql-18-count-distinct # PG 18
apt install -y postgresql-17-count-distinct # PG 17
apt install -y postgresql-16-count-distinct # PG 16
apt install -y postgresql-15-count-distinct # PG 15
apt install -y postgresql-14-count-distinct # PG 14
Create Extension:
CREATE EXTENSION count_distinct;
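To confirm that the extension was created in the current database, one option is to query the standard `pg_extension` catalog. This is a minimal sketch; the version it reports depends on the package you installed.

```sql
-- List the installed extension and its version from the system catalog
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'count_distinct';
```

An empty result means the extension has not been created in this database, even if the package is installed on the host.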
Usage
count_distinct: alternative to COUNT(DISTINCT …) with better performance
Provides an alternative to COUNT(DISTINCT ...) that avoids sorting and supports parallel aggregation.
CREATE EXTENSION count_distinct;
Functions
| Function | Description |
|---|---|
| count_distinct(value anyelement) | Count distinct values (alternative to COUNT(DISTINCT ...)) |
| array_agg_distinct(value anyelement) | Aggregate distinct values into an array |
| count_distinct_elements(value anyarray) | Count distinct elements within input arrays |
| array_agg_distinct_elements(value anyarray) | Aggregate distinct elements from input arrays |
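The two `*_elements` aggregates take array inputs and work on the individual elements rather than on whole arrays. The sketch below illustrates this with a hypothetical `tagged_events` table (the table and column names are assumptions made for illustration, not part of the extension).

```sql
-- Hypothetical table: each row carries an array of integer tags
CREATE TEMP TABLE tagged_events (user_id int, tags int[]);

INSERT INTO tagged_events VALUES
    (1, ARRAY[1, 2, 2]),
    (1, ARRAY[2, 3]),
    (2, ARRAY[5, 5]);

-- Per user: how many distinct tags appear across all of that user's arrays,
-- and which tags they are
SELECT user_id,
       count_distinct_elements(tags)     AS distinct_tags,
       array_agg_distinct_elements(tags) AS tag_set
FROM tagged_events
GROUP BY user_id;
```

Duplicates are collapsed both within a single array and across the arrays in the same group.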
Examples
CREATE TABLE test_table (id INT, val INT);
INSERT INTO test_table
SELECT mod(i, 1000), (1000 * random())::int
FROM generate_series(1, 10000000) s(i);
-- Instead of: SELECT id, COUNT(DISTINCT val) FROM test_table GROUP BY 1;
-- Use:
SELECT id, count_distinct(val) FROM test_table GROUP BY 1;
-- Aggregate distinct values into an array
SELECT id, array_agg_distinct(val) FROM test_table GROUP BY 1;
-- Count distinct elements across arrays
SELECT count_distinct_elements(ARRAY[1, 2, 2, 3]);
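To see whether the planner actually picks a different strategy for the two forms, you can compare their plans with `EXPLAIN`. This is only a sketch against the `test_table` created above; the plan nodes you get depend on the PostgreSQL version, table statistics, and settings such as `work_mem`, so no particular output is shown here.

```sql
-- Plan for the built-in aggregate
EXPLAIN (COSTS OFF)
SELECT id, COUNT(DISTINCT val) FROM test_table GROUP BY 1;

-- Plan for the extension's aggregate; look for a HashAggregate node
EXPLAIN (COSTS OFF)
SELECT id, count_distinct(val) FROM test_table GROUP BY 1;
```

Running `ANALYZE test_table;` before comparing gives the planner up-to-date statistics to work with.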