Migrating PXF from Greenplum 5

If you are using PXF in your Greenplum Database 5 installation, Pivotal recommends that you upgrade Greenplum Database to version 5.21.2 or later before you migrate PXF to Greenplum Database 6. If you migrate from an earlier version of Greenplum 5, you will be required to perform additional migration steps in your Greenplum 6 installation.

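The PXF migration from Greenplum Database 5 to 6 has two parts. You first perform a PXF pre-migration procedure in your Greenplum Database 5 installation. You then install and configure Greenplum Database 6, migrate your table definitions and data to it, and complete the PXF migration in the Greenplum 6 installation.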

Prerequisites

Before migrating PXF from Greenplum 5 to Greenplum 6, ensure that you can:

  • Identify the version number of your Greenplum 5 installation.
  • Determine if you have gphdfs external tables defined in your Greenplum 5 installation.
  • Identify the file system location of the PXF $PXF_CONF directory in your Greenplum 5 installation. In Greenplum 5.15 and later, $PXF_CONF identifies the user configuration directory that was provided to the pxf cluster init command. This directory contains PXF server configurations, security keytab files, and log files.
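The first and third checks above can be sketched in shell. This is a minimal illustration, not an official tool: the helper names gp_version_from_banner and version_at_least are hypothetical, the banner is the sample shown later in this topic, and the comparison assumes a sort command that supports version sorting (-V), such as GNU sort.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the prerequisite checks; not part of Greenplum or PXF.

# Print the Greenplum version (for example 5.21.2) found in a SELECT version() banner.
gp_version_from_banner() {
    printf '%s\n' "$1" | sed -n 's/.*Greenplum Database \([0-9][0-9.]*\).*/\1/p'
}

# Succeed when version $1 is the same as or newer than version $2.
# Assumes a sort that supports -V (version sort), such as GNU sort.
version_at_least() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Sample banner from a 5.21.2 system. On a live cluster you might instead use:
#   banner="$(psql -d postgres -At -c 'SELECT version();')"
banner='PostgreSQL 8.3.23 (Greenplum Database 5.21.2 build commit:610b6d777436fe4a281a371cae85ac40f01f4f5e)'
gp_version="$(gp_version_from_banner "$banner")"

if version_at_least "$gp_version" 5.21.2; then
    echo "Greenplum $gp_version: no extra PXF upgrade steps expected"
else
    echo "Greenplum $gp_version: plan for the additional PXF upgrade steps"
fi

# Report whether the PXF user configuration directory is set and present.
if [ -n "${PXF_CONF:-}" ] && [ -d "$PXF_CONF" ]; then
    echo "PXF_CONF directory found: $PXF_CONF"
else
    echo "PXF_CONF is unset or is not a directory"
fi
```

The sketch does not look for gphdfs external tables; use the query given in Migrating gphdfs External Tables to PXF for that check.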

Step 1: Complete PXF Greenplum Database 5 Pre-Migration Actions

Perform this procedure in your Greenplum Database 5 installation:

  1. Log in to the Greenplum Database master node. For example:

    $ ssh gpadmin@<gp5master>
    
  2. Identify the version number of your Greenplum Database 5 installation. For example:

    gpadmin@gp5master$ psql -d postgres
    
    SELECT version();
    
               version 
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PostgreSQL 8.3.23 (Greenplum Database 5.21.2 build commit:610b6d777436fe4a281a371cae85ac40f01f4f5e) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Aug  7 2019 20:38:47
    (1 row)
    

    If you are running a Greenplum Database version prior to 5.21.2, consider upgrading to version 5.21.2 or later as described in Upgrading PXF in the Greenplum 5 documentation.

  3. Greenplum 6 removes the gphdfs external table protocol. If you have gphdfs external tables defined in your Greenplum 5 installation, you must either delete them or migrate them to the pxf protocol as described in Migrating gphdfs External Tables to PXF.

  4. Stop PXF on each segment host as described in Stopping PXF.

  5. If you plan to install Greenplum Database 6 on a new set of hosts, be sure to save a copy of the $PXF_CONF directory in your Greenplum 5 installation.

  6. Install and configure Greenplum Database 6, migrate Greenplum 5 table definitions and data to your Greenplum 6 installation, and then continue your PXF migration with Step 2: Migrating PXF to Greenplum 6.
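Steps 4 and 5 above can be sketched as follows. The backup_pxf_conf function and the archive name are hypothetical and only illustrate one way to save a copy of $PXF_CONF. The commented pxf cluster stop line assumes that you stop PXF from the master host as described in Stopping PXF.

```shell
#!/usr/bin/env bash
# Hypothetical helper: archive a PXF configuration directory before migration.
# Usage: backup_pxf_conf <pxf_conf_dir> <backup_dir>
backup_pxf_conf() {
    local src="$1" dest_dir="$2"
    if [ ! -d "$src" ]; then
        echo "not a directory: $src" >&2
        return 1
    fi
    mkdir -p "$dest_dir" || return 1
    local archive="$dest_dir/$(basename "$src")-gp5-backup.tar.gz"
    # -C keeps the paths inside the archive relative to the parent directory.
    tar -czf "$archive" -C "$(dirname "$src")" "$(basename "$src")" || return 1
    # Print the archive path so that the caller can record or copy it.
    echo "$archive"
}

# Typical order on the Greenplum 5 master (left commented so nothing runs on paste):
#   $GPHOME/pxf/bin/pxf cluster stop
#   backup_pxf_conf "$PXF_CONF" "$HOME/pxf-migration"
```

Because $PXF_CONF can contain security keytab files, keep the archive in a location with restricted permissions.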

Step 2: Migrating PXF to Greenplum 6

After you install and configure Greenplum Database 6 and migrate table definitions and data from the Greenplum 5 installation, perform the following procedure to configure and start the new PXF software:

  1. Log in to the Greenplum Database 6 master node. For example:

    $ ssh gpadmin@<gp6master>
    
  2. If you installed Greenplum Database 6 on a new set of hosts, copy the $PXF_CONF directory from your Greenplum 5 installation to the Greenplum 6 master node. Consider copying the directory to the same file system location at which it resided in the Greenplum 5 cluster. For example, if PXF_CONF=/usr/local/greenplum-pxf:

    gpadmin@gp6master$ scp -r gpadmin@<gp5master>:/usr/local/greenplum-pxf /usr/local/
    
  3. Initialize PXF on each segment host as described in Initializing PXF, specifying the $PXF_CONF directory from your Greenplum 5 installation (the copy that you made in the previous step, if you installed Greenplum 6 on new hosts).

  4. If you are migrating from Greenplum Database version 5.21.1 or earlier, perform the version-applicable steps identified in the Greenplum Database 5.21 Upgrading PXF documentation in your Greenplum Database 6 installation. Start with step 4 in that procedure. (Although that procedure describes upgrading PXF between Greenplum Database 5.x releases, the same steps are required here to configure a PXF that is compatible with Greenplum 6.)

  5. Synchronize the PXF configuration from the Greenplum Database 6 master host to the standby master and each segment host in the cluster. For example:

    gpadmin@gp6master$ $GPHOME/pxf/bin/pxf cluster sync
    
  6. Start PXF on each Greenplum Database 6 segment host as described in Starting PXF.

  7. Verify the migration by testing that each PXF external table can access the referenced data store.
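The command order in steps 3 through 6 can be sketched as a small dry-run script, run as gpadmin on the Greenplum 6 master. The DRY_RUN switch and the run_step and migrate_pxf functions are hypothetical, and the default GPHOME path is an assumption. The sketch also assumes that PXF is initialized and started from the master with pxf cluster init and pxf cluster start, as described in Initializing PXF and Starting PXF. By default the script only prints the commands.

```shell
#!/usr/bin/env bash
# Sketch of the Step 2 command order. DRY_RUN, run_step, and migrate_pxf are
# hypothetical names for illustration; they are not part of PXF.
DRY_RUN="${DRY_RUN:-1}"                              # 1 = print commands only
PXF_CONF="${PXF_CONF:-/usr/local/greenplum-pxf}"     # example path from the text
PXF_BIN="${PXF_BIN:-${GPHOME:-/usr/local/greenplum-db}/pxf/bin/pxf}"  # assumed default GPHOME

# Print the command in dry-run mode; otherwise execute it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Run the steps in order and stop at the first failure.
migrate_pxf() {
    # Step 3: initialize PXF with the migrated configuration directory.
    run_step env PXF_CONF="$PXF_CONF" "$PXF_BIN" cluster init || return 1
    # Step 4 applies only when migrating from 5.21.1 or earlier and is manual.
    # Step 5: synchronize the configuration to the standby master and segment hosts.
    run_step "$PXF_BIN" cluster sync || return 1
    # Step 6: start PXF on the segment hosts.
    run_step "$PXF_BIN" cluster start || return 1
}

migrate_pxf
```

Set DRY_RUN=0 only after reviewing the printed commands. Step 7 still requires querying each PXF external table against its data store.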