# hdfs_fdw

## Overview
| Package | Version | Category | License | Language |
|---|---|---|---|---|
| hdfs_fdw | 2.3.3 | FDW | BSD 3-Clause | C |
| ID | Extension | Bin | Lib | Load | Create | Trust | Reloc | Schema |
|---|---|---|---|---|---|---|---|---|
| 8740 | hdfs_fdw | No | Yes | No | Yes | No | No | - |
| Related |
|---|
| pg_parquet, mongo_fdw, kafka_fdw, wrappers, multicorn, jdbc_fdw, aws_s3, duckdb_fdw |
## Version
| Type | Repo | Version | PG Ver | Package | Deps |
|---|---|---|---|---|---|
| EXT | PGDG | 2.3.3 | 18, 17, 16, 15, 14 | hdfs_fdw | - |
| RPM | PGDG | 2.3.3 | 18, 17, 16, 15, 14 | hdfs_fdw_$v | - |
## Install
You can install hdfs_fdw directly. First, make sure the PGDG repository is added and enabled:
```bash
pig repo add pgdg -u   # Add PGDG repo and update cache
```
Install the extension using `pig`, or with your system package manager (`apt`/`yum`/`dnf`):
```bash
pig install hdfs_fdw;              # Install for current active PG version
pig ext install -y hdfs_fdw -v 18  # PG 18
pig ext install -y hdfs_fdw -v 17  # PG 17
pig ext install -y hdfs_fdw -v 16  # PG 16
pig ext install -y hdfs_fdw -v 15  # PG 15
pig ext install -y hdfs_fdw -v 14  # PG 14
```

```bash
dnf install -y hdfs_fdw_18  # PG 18
dnf install -y hdfs_fdw_17  # PG 17
dnf install -y hdfs_fdw_16  # PG 16
dnf install -y hdfs_fdw_15  # PG 15
dnf install -y hdfs_fdw_14  # PG 14
```
Create the extension:

```sql
CREATE EXTENSION hdfs_fdw;
```
## Usage

### Create Server
```sql
CREATE EXTENSION hdfs_fdw;

CREATE SERVER hdfs_server FOREIGN DATA WRAPPER hdfs_fdw
    OPTIONS (host '127.0.0.1', port '10000', client_type 'hiveserver2');
```
Server Options:

- `host`: remote host (default `localhost`)
- `port`: remote port (default `10000`)
- `client_type`: `hiveserver2` or `spark` (default `hiveserver2`)
- `auth_type`: `NOSASL` or `LDAP`
- `connect_timeout`: default `300`
- `fetch_size`: default `10000`
- `log_remote_sql`: default `false`
- `use_remote_estimate`: default `false`
- `enable_join_pushdown`: default `true`
- `enable_aggregate_pushdown`: default `true`
- `enable_order_by_pushdown`: default `true`
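As an illustration, a server definition that overrides a few of these defaults might look like the following sketch. The host address and option values here are placeholders, not recommendations:

```sql
-- Hypothetical server tuned for a remote HiveServer2 instance;
-- host, port, and the chosen option values are illustrative only.
CREATE SERVER hdfs_tuned FOREIGN DATA WRAPPER hdfs_fdw
    OPTIONS (
        host '10.0.0.5',
        port '10000',
        client_type 'hiveserver2',
        fetch_size '50000',       -- fetch larger batches per round trip
        connect_timeout '60',     -- fail faster than the 300 default
        log_remote_sql 'true'     -- log the SQL sent to the remote server
    );
```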
### Create User Mapping
```sql
CREATE USER MAPPING FOR postgres SERVER hdfs_server
    OPTIONS (username 'hive_user', password 'hive_password');
```
For NOSASL authentication, omit the OPTIONS clause entirely.
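For example, assuming the server was created with `auth_type 'NOSASL'`, the mapping carries no credentials at all:

```sql
-- No username/password needed when the server uses NOSASL authentication
CREATE USER MAPPING FOR postgres SERVER hdfs_server;
```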
### Create Foreign Table
```sql
CREATE FOREIGN TABLE weblogs (
    client_ip        text,
    http_status_code text,
    uri              text,
    request_count    bigint
)
SERVER hdfs_server
OPTIONS (dbname 'default', table_name 'weblogs');
```
Table Options:

- `dbname`: remote database (default `default`)
- `table_name`: remote table (defaults to the foreign table name)
- `enable_join_pushdown`, `enable_aggregate_pushdown`, `enable_order_by_pushdown`: per-table pushdown toggles
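The per-table pushdown options override the server-level settings for a single table. A sketch disabling aggregate pushdown for one table (the table and column names are illustrative):

```sql
-- Hypothetical table where aggregates should run locally in PostgreSQL
CREATE FOREIGN TABLE raw_events (
    event_time text,
    payload    text
)
SERVER hdfs_server
OPTIONS (
    dbname 'default',
    table_name 'raw_events',
    enable_aggregate_pushdown 'false'   -- aggregate locally for this table only
);
```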
### Query

```sql
SELECT client_ip, count(*) FROM weblogs
GROUP BY client_ip ORDER BY count(*) DESC LIMIT 10;
```
### Spark Example
```sql
CREATE SERVER spark_server FOREIGN DATA WRAPPER hdfs_fdw
    OPTIONS (host '127.0.0.1', port '10000', client_type 'spark');

CREATE USER MAPPING FOR postgres SERVER spark_server
    OPTIONS (username 'spark_user', password 'spark_pass');

CREATE FOREIGN TABLE spark_table (
    id    int,
    name  text,
    value double precision
)
SERVER spark_server
OPTIONS (dbname 'default', table_name 'my_table');
```
## Pushdown Features
hdfs_fdw pushes down WHERE clauses, JOINs, aggregate functions, ORDER BY, and LIMIT/OFFSET to the remote Hive/Spark server. Control pushdown at the session level:
```sql
SET hdfs_fdw.enable_join_pushdown = on;
SET hdfs_fdw.enable_aggregate_pushdown = on;
SET hdfs_fdw.enable_order_by_pushdown = on;
SET hdfs_fdw.enable_limit_pushdown = on;
```
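To verify whether a particular query is actually pushed down, `EXPLAIN VERBOSE` shows the plan and (the exact output depends on your hdfs_fdw version) the SQL sent to the remote Hive/Spark server:

```sql
EXPLAIN VERBOSE
SELECT client_ip, count(*) FROM weblogs GROUP BY client_ip;
```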