Using PXF with External Data
The Greenplum Platform Extension Framework (PXF) provides parallel, high-throughput data access and federated queries across heterogeneous data sources via built-in connectors that map a Greenplum Database external table definition to an external data source. This Greenplum Database extension is based on PXF from Apache HAWQ (incubating).
This topic describes the architecture of PXF and its integration with Greenplum Database.
This topic details the installation, configuration, and startup procedures for PXF and supporting clients.
This topic describes the procedure that you must perform to upgrade PXF when you install a new version of Greenplum Database.
This topic describes important PXF procedures and concepts, including enabling PXF for use in a database and PXF protocol and external table definitions.
This topic describes how to use the PXF HDFS connector and related profiles to read Text and Avro format HDFS files.
This topic describes how to use the PXF HDFS connector and related profiles to write Text and SequenceFile format binary data to HDFS files.
This topic describes how to use the PXF Hive connector and related profiles to read Hive tables stored in TextFile, RCFile, Parquet, and ORC storage formats.
This topic describes how to use the PXF HBase connector to read HBase table data.
This topic details the service-level and database-level logging configuration procedures for PXF. It also identifies some common PXF errors and describes how to address PXF memory issues.