datasketches
Approximate analytics sketches and aggregates for PostgreSQL
Repository
apache/datasketches-postgresql
https://github.com/apache/datasketches-postgresql
Source
apache-datasketches-postgresql-1.7.0-src.tar.gz
Overview
| Package | Version | Category | License | Language |
|---|---|---|---|---|
| datasketches | 1.7.0 | FUNC | Apache-2.0 | C++ |
| ID | Extension | Bin | Lib | Load | Create | Trust | Reloc | Schema |
|---|---|---|---|---|---|---|---|---|
| 4690 | datasketches | No | Yes | No | Yes | No | Yes | - |
Built against Apache DataSketches C++ core 5.0.0.
Version
| Type | Repo | Version | PG Ver | Package | Deps |
|---|---|---|---|---|---|
| EXT | PIGSTY | 1.7.0 | 18, 17, 16, 15, 14 | datasketches | - |
| RPM | PIGSTY | 1.7.0 | 18, 17, 16, 15, 14 | datasketches_$v | - |
| DEB | PIGSTY | 1.7.0 | 18, 17, 16, 15, 14 | postgresql-$v-datasketches | - |
Build
You can build the RPM / DEB packages for datasketches using pig build:
pig build pkg datasketches # build RPM / DEB packages
Install
You can install datasketches directly. First, make sure the PGDG and PIGSTY repositories are added and enabled:
pig repo add pgsql -u # Add repo and update cache
Install the extension using pig or apt/yum/dnf:
pig install datasketches; # Install for current active PG version
pig ext install -y datasketches -v 18 # PG 18
pig ext install -y datasketches -v 17 # PG 17
pig ext install -y datasketches -v 16 # PG 16
pig ext install -y datasketches -v 15 # PG 15
pig ext install -y datasketches -v 14 # PG 14
dnf install -y datasketches_18 # PG 18
dnf install -y datasketches_17 # PG 17
dnf install -y datasketches_16 # PG 16
dnf install -y datasketches_15 # PG 15
dnf install -y datasketches_14 # PG 14
apt install -y postgresql-18-datasketches # PG 18
apt install -y postgresql-17-datasketches # PG 17
apt install -y postgresql-16-datasketches # PG 16
apt install -y postgresql-15-datasketches # PG 15
apt install -y postgresql-14-datasketches # PG 14
Create Extension:
CREATE EXTENSION datasketches;
Usage
Sources: README, Apache DataSketches site
This PostgreSQL extension provides approximate analytics sketches and aggregates.
CREATE EXTENSION datasketches;
The extension supports CPC, HLL, Theta, Array Of Doubles, KLL, Quantiles, and Frequent Strings sketches.
Sketch Families
- CPC for compact distinct counting.
- HLL for HyperLogLog-style distinct counting.
- Theta for distinct counting with set operations such as union, intersection, and A-not-B.
- Array Of Doubles for tuple sketches with arrays of double values per key.
- KLL for quantiles, ranks, PMF, and CDF estimation.
- Quantiles sketch for long-term support of distribution estimates.
- Frequent strings for tracking the heaviest items by count or weight.
Examples
SELECT cpc_sketch_to_string(cpc_sketch_build(1));
SELECT cpc_sketch_distinct(id) FROM random_ints_100m;
SELECT cpc_sketch_get_estimate(cpc_sketch_union(sketch)) FROM cpc_sketch_test;
SELECT theta_sketch_get_estimate(theta_sketch_union(sketch)) FROM theta_sketch_test;
SELECT theta_sketch_get_estimate(theta_sketch_intersection(sketch1, sketch2)) FROM theta_set_op_test;
SELECT hll_sketch_get_estimate(hll_sketch_union(sketch)) FROM hll_sketch_test;
SELECT hll_sketch_get_estimate(hll_sketch_union(hll_sketch_build(1), hll_sketch_build(2)));
SELECT kll_float_sketch_get_quantile(kll_float_sketch_merge(sketch), 0.5) FROM kll_float_sketch_test;
SELECT frequent_strings_sketch_result_no_false_negatives(frequent_strings_sketch_build(9, value), 1000000) FROM zipf_1p1_8k_100m;
Core Operations
- Build sketches with *_sketch_build(...).
- Merge or aggregate them with *_sketch_union(...), *_sketch_merge(...), and sketch-specific set-operation helpers.
- Read estimates with *_sketch_get_estimate(...) and distribution helpers such as kll_float_sketch_get_quantile(...).
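The build, merge, and read steps can be combined into one small workflow. The following is a minimal sketch modeled on the upstream README's CPC example. It assumes the extension is already created in the current database; the table name cpc_sketch_test matches the examples above, and the inserted values are purely illustrative.

```sql
-- Assumes CREATE EXTENSION datasketches; has already been run in this database.
-- A table holding one serialized CPC sketch per row.
CREATE TABLE cpc_sketch_test (sketch cpc_sketch);

-- Build: each INSERT stores a sketch built from a single value.
INSERT INTO cpc_sketch_test SELECT cpc_sketch_build(1);
INSERT INTO cpc_sketch_test SELECT cpc_sketch_build(2);
INSERT INTO cpc_sketch_test SELECT cpc_sketch_build(3);

-- Merge and read: union all stored sketches, then estimate the distinct count.
SELECT cpc_sketch_get_estimate(cpc_sketch_union(sketch)) AS distinct_estimate
FROM cpc_sketch_test;
```

With only three distinct inputs the estimate should be very close to 3. On large inputs the result is approximate by design.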
Notes
- The README says the extension targets PostgreSQL 9.6 and higher and depends on Boost 1.75 and DataSketches C++ core 5.0.0 or later.
- The upstream examples emphasize additive analytics in data cubes, not exact replacement for normal aggregates.
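To illustrate what additive analytics in a data cube can look like, here is a hedged sketch of a daily rollup using Theta sketches. The tables page_views and daily_uniques, their columns, and the generated data are hypothetical and exist only for this example; the functions used (theta_sketch_build, theta_sketch_union, theta_sketch_get_estimate) are the ones shown in the upstream README.

```sql
-- Hypothetical raw fact table: one row per page view.
CREATE TABLE page_views (view_day date, user_id integer);

-- Illustrative data: 1000 users on day 1, an overlapping 1000 users on day 2.
INSERT INTO page_views
SELECT DATE '2024-01-01', g FROM generate_series(1, 1000) AS g;
INSERT INTO page_views
SELECT DATE '2024-01-02', g FROM generate_series(501, 1500) AS g;

-- Hypothetical rollup table: one Theta sketch per day instead of the raw rows.
CREATE TABLE daily_uniques AS
SELECT view_day, theta_sketch_build(user_id) AS sketch
FROM page_views
GROUP BY view_day;

-- Per-day estimates are read straight from the stored sketches.
SELECT view_day, theta_sketch_get_estimate(sketch) AS approx_users
FROM daily_uniques
ORDER BY view_day;

-- Distinct users across both days: union the sketches rather than summing counts.
SELECT theta_sketch_get_estimate(theta_sketch_union(sketch)) AS approx_users_total
FROM daily_uniques;
```

Summing the two per-day counts would report about 2000 users because the 500 overlapping users are counted twice, whereas the union of the sketches should come out close to 1500. This is the sense in which sketches are additive across cube cells while plain distinct counts are not.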