Accessing External Data with PXF

Accessing External Data with PXF

Data managed by your organization may already reside in external sources. The Greenplum Platform Extension Framework (PXF) provides access to this external data via built-in connectors that map an external data source to a Greenplum Database table definition.

PXF is installed with HDFS, Hive, HBase, and JDBC connectors. These connectors enable you to read external HDFS file system and Hive and HBase table data stored in text, Avro, JSON, RCFile, Parquet, SequenceFile, and ORC formats. You can use the JDBC connector to access an external SQL database.

Note: PXF supports filter pushdown in the Hive, HBase, and JDBC connectors.

The Greenplum Platform Extension Framework includes a protocol C library and a Java service. After you configure and initialize PXF, you start a single PXF JVM process on each Greenplum Database segment host. This long-running process concurrently serves multiple query requests.

For detailed information about the architecture of and using PXF, refer to the Greenplum Platform Extension Framework (PXF) documentation.