Initializing and Managing PXF

A newer version of this documentation is available. Use the version menu above to view the most up-to-date release of the Greenplum 5.x documentation.

You must initialize and start PXF before you can use the framework. You must also explicitly enable PXF in each database in which you plan to use it.

Initializing PXF

You must explicitly initialize the PXF service instance. This one-time initialization creates the PXF service web application. It also updates PXF configuration files to include information specific to your configuration and the connectors that you will use.

Prerequisites

Before initializing PXF in your Greenplum Database cluster:

Procedure

Perform the following procedure to initialize PXF on each segment host in your Greenplum Database cluster. You will use the gpssh utility to run a command on multiple hosts.

  1. Log in to the Greenplum Database master node and set up your environment:

    $ ssh gpadmin@<gpmaster>
    gpadmin@gpmaster$ . /usr/local/greenplum-db/greenplum_path.sh
    
  2. Create a text file that lists your Greenplum Database segment hosts, one host name per line. For example, a file named seghostfile may include:

    seghost1
    seghost2
    seghost3
    
  3. If not already present, install the unzip package on each Greenplum Database segment host. You must have operating system super user privileges to install packages:

    gpadmin@gpmaster$ gpssh -e -v -f seghostfile "sudo yum -y install unzip"
    
  4. Run the pxf init command to initialize the PXF service on each segment host. For example:

    gpadmin@gpmaster$ gpssh -e -v -f seghostfile "/usr/local/greenplum-db/pxf/bin/pxf init"
    

    The init command creates and initializes the PXF web application. It also creates the pxf-private.classpath file, which specifies the required PXF JAR files.

Starting PXF

After initializing PXF, you must explicitly start PXF on each segment host in your Greenplum Database cluster. The PXF service, once started, runs as the gpadmin user on default port 5888. Only the gpadmin user can start and stop the PXF service.

Perform the following procedure to start PXF on each segment host in your Greenplum Database cluster. You will use the gpssh command and a seghostfile to run the command on multiple hosts.

  1. Log in to the Greenplum Database master node and set up your environment:

    $ ssh gpadmin@<gpmaster>
    gpadmin@gpmaster$ . /usr/local/greenplum-db/greenplum_path.sh
    
  2. Run the pxf start command to start PXF on each segment host. For example:

    gpadmin@gpmaster$ gpssh -e -v -f seghostfile "/usr/local/greenplum-db/pxf/bin/pxf start"
    

Stopping PXF

If you must stop PXF, for example if you are upgrading PXF, you must explicitly stop PXF on each segment host in your Greenplum Database cluster. Only the gpadmin user can stop the PXF service.

Perform the following procedure to stop PXF on each segment host in your Greenplum Database cluster. You will use the gpssh command and a seghostfile to run the command on multiple hosts.

  1. Log in to the Greenplum Database master node and set up your environment:

    $ ssh gpadmin@<gpmaster>
    gpadmin@gpmaster$ . /usr/local/greenplum-db/greenplum_path.sh
    
  2. Run the pxf stop command to stop PXF on each segment host. For example:

    gpadmin@gpmaster$ gpssh -e -v -f seghostfile "/usr/local/greenplum-db/pxf/bin/pxf stop"
    

PXF Service Management

The pxf command supports init, start, stop, restart, and status operations. These operations run locally. That is, if you want to start or stop the PXF agent on a specific segment host, you can log in to the host and run the command. If you want to start or stop the PXF agent on multiple segment hosts, use the gpssh utility as shown above, or individually log in to each segment host and run the command.

Note: If you have configured PXF Hadoop connectors and you update your Hadoop (or Hive or HBase) configuration while the PXF service is running, you must copy any updated configuration files to each Greenplum Database segment host and restart PXF on each host.