Using PXF with External Data
The Greenplum Platform Extension Framework (PXF) provides parallel, high-throughput data access and federated queries across heterogeneous data sources via built-in connectors that map a Greenplum Database external table definition to an external data source. This Greenplum Database extension is based on PXF from Apache HAWQ (incubating).
-
This topic describes the architecture of PXF and its integration with Greenplum Database.
Installing and Configuring PXF
This topic details the PXF installation, configuration, and startup procedures.
-
This topic describes important PXF procedures and concepts, including enabling PXF for use in a database and PXF protocol and external table definitions.
-
This topic describes how to use the PXF HDFS connector and related profiles to read Text and Avro format HDFS files.
-
This topic describes how to use the PXF Hive connector and related profiles to read Hive tables stored in Text, RCFile, Parquet, and ORC storage formats.
-
This topic details the service- and database-level logging configuration procedures for PXF. It also identifies some common PXF errors.
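
As a rough illustration of how an external table definition maps to an external data source, the sketch below assumes a comma-delimited text file in HDFS read through the HdfsTextSimple profile. The table name, column list, and file path are hypothetical placeholders, and the exact form of the LOCATION URI may differ between PXF versions, so treat this as an assumption-laden example rather than a reference definition. It also assumes PXF is already installed, running, and enabled in the database.

```sql
-- Hypothetical example: the table name, columns, and HDFS path are placeholders.
-- Assumes the pxf protocol is enabled in this database and that the
-- HdfsTextSimple profile is available in the installed PXF version.
CREATE EXTERNAL TABLE pxf_sales_example (
    location     text,
    month        text,
    num_orders   int,
    total_sales  float8
)
LOCATION ('pxf://data/pxf_examples/sales.txt?PROFILE=HdfsTextSimple')
FORMAT 'TEXT' (delimiter=E',');

-- Once defined, the external data is queried like any other table.
SELECT location, total_sales
FROM pxf_sales_example
WHERE num_orders > 100;
```

The PROFILE option selects the connector behavior, so reading a different format or data store would mean changing the profile name and, where needed, the FORMAT clause.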