Upgrading PXF

If you are using PXF in your current Greenplum Database installation, you must upgrade the PXF service when you upgrade to a new version of Greenplum Database.

The PXF upgrade procedure describes how to upgrade PXF in your Greenplum Database installation. This procedure uses PXF.from to refer to your currently-installed PXF version and PXF.to to refer to the PXF version installed when you upgrade to the new version of Greenplum Database.

Most PXF installations do not require modifications to PXF configuration files and should experience a seamless upgrade.

Note: Starting in Greenplum Database version 5.12.0, PXF no longer requires a Hadoop client installation. PXF now bundles all of the JAR files on which it depends, and loads these JARs at runtime. PXF still requires that the Hadoop configuration files reside in /etc/<hadoop-service>/conf.

The PXF upgrade procedure has two parts. You perform one procedure before, and one procedure after, you upgrade to a new version of Greenplum Database:

Step 1: PXF Pre-Upgrade Actions

Perform this procedure before you upgrade to a new version of Greenplum Database:

  1. Log in to the Greenplum Database master node and set up the environment. For example:

    $ ssh gpadmin@<gpmaster>
    gpadmin@gpmaster$ . /usr/local/greenplum-db/greenplum_path.sh
    
  2. Create a text file that lists your Greenplum Database segment hosts, one host name per line. For example, a file named seghostfile may include:

    seghost1
    seghost2
    seghost3
    
  3. Run the pxf stop command to stop PXF on each segment host. For example:

    $ gpadmin@gpmaster$ gpssh -e -v -f seghostfile "/usr/local/greenplum-db/pxf/bin/pxf stop"
    
  4. Back up the PXF.from configuration files found in the $GPHOME/pxf/conf/ directory. These files should be the same on all segment hosts, so you need only copy from one of the hosts. For example:

    gpadmin@gpmaster$ mkdir -p /save/pxf-from-conf
    gpadmin@gpmaster$ scp gpadmin@seghost1:/usr/local/greenplum-db/pxf/conf/* /save/pxf-from-conf/
    
  5. Note the locations of any custom JAR files that you may have added to your PXF.from installation. Save a copy of these JAR files in case they are removed or altered when you upgrade to the new version of Greenplum Database.

  6. Upgrade to the new version of Greenplum Database and then continue your PXF upgrade with Step 2: Upgrading PXF.

Step 2: Upgrading PXF

After you upgrade to the new version of Greenplum Database, perform the following procedure to upgrade and configure the PXF.to software:

  1. Log in to the Greenplum Database master node and set up the environment. For example:

    $ ssh gpadmin@<gpmaster>
    gpadmin@gpmaster$ . /usr/local/greenplum-db/greenplum_path.sh
    
  2. PXF user impersonation is on by default in Greenplum Database version 5.5.0 and later. If you are upgrading from an older PXF.from version, you must configure user impersonation for the underlying Hadoop services. Refer to Configuring User Impersonation and Proxying for instructions, including the configuration procedure to turn off PXF user impersonation.

  3. If you updated the pxf-env.sh configuration file in your PXF.from installation, re-apply those changes to the file, and copy the updated pxf-env.sh to all segment hosts. For example, if seghostfile contains a list, one-host-per-line, of the segment hosts in your Greenplum Database cluster:

    gpadmin@gpmaster$ vi /usr/local/greenplum-db/pxf/conf/pxf-env.sh
    gpadmin@gpmaster$ gpscp -v -f seghostfile /usr/local/greenplum-db/pxf/conf/pxf-env.sh =:/usr/local/greenplum-db/pxf/conf/pxf-env.sh
    
  4. Initialize PXF on each segment host as described in Initializing PXF.

  5. If you updated any of the pxf-profiles.xml, pxf-log4j.properties, or pxf-public.classpath configuration files in your PXF.from installation, re-apply those changes and copy the updated file(s) to all segment hosts. Refer to Step 3 above for a similar gpscp command.

    Note: Starting in Greenplum Database version 5.12.0, the package name for PXF classes was changed to use the prefix org.greenplum.*. If you are upgrading from an older PXF.from version and you customized the pxf-profiles.xml file, you must change any org.apache.hawq.pxf.* references to org.greenplum.pxf.* when you re-apply your changes.

  6. If you added additional JAR files to your PXF.from installation, copy them to the corresponding directory in your PXF.to installation on each segment host.

  7. Start PXF on each segment host as described in Starting PXF.