Installing and Configuring PXF

A newer version of this documentation is available. Use the version menu above to view the most up-to-date release of the Greenplum 5.x documentation.

PXF accesses Hadoop services on behalf of Greenplum Database end users. By default, PXF tries to access data source services (HDFS, Hive, HBase) using the identity of the Greenplum Database user account that logs into Greenplum Database. In order to support this functionality, you must configure proxy settings for Hadoop, as well as for Hive and HDFS if you intend to use those PXF connectors. Follow the procedures in:

The Greenplum Platform Extension Framework (PXF) provides connectors to Hadoop, Hive, and HBase data stores. To use these PXF connectors, you must install Hadoop, Hive, and HBase clients on each Greenplum Database segment host as described in this one-time installation and configuration procedure:

You must also configure and initialize PXF itself, and start the PXF service on each segment host: