Migrating PXF from Greenplum 5

A newer version of this documentation is available. Use the version menu above to view the most up-to-date release of the Greenplum 6.x documentation.

If you are using PXF in your Greenplum Database 5 installation, upgrade Greenplum Database to version 5.21.2 or newer before you migrate PXF to Greenplum Database 6.

The PXF Greenplum Database 5 to 6 migration procedure has two parts. You perform one PXF procedure in your Greenplum Database 5 installation, then install, configure, and migrate data to Greenplum 6:

Prerequisites

Before migrating PXF from Greenplum 5 to Greenplum 6, ensure that you can:

  • Identify the version number of your Greenplum 5 installation. If it is older than 5.21.2, upgrade to a newer 5.x version.
  • Identify the version of PXF running in the Greenplum 5 cluster.
  • Determine if you have gphdfs external tables defined in your Greenplum 5 installation.
  • Identify the file system location of the PXF $PXF_CONF directory in your Greenplum 5 installation. In Greenplum 5.15 and later, $PXF_CONF identifies the user configuration directory that was provided to the pxf cluster init command. This directory contains PXF server configurations, security keytab files, and log files.

Step 1: Complete PXF Greenplum Database 5 Pre-Migration Actions

Perform this procedure in your Greenplum Database 5 installation:

  1. Log in to the Greenplum Database master node. For example:

    $ ssh gpadmin@<gp5master>
    
  2. Identify and note the Greenplum Database version number of your 5 installation. For example:

    gpadmin@gp5master$ psql -d postgres
    
    SELECT version();
    
               version 
    ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    PostgreSQL 8.3.23 (Greenplum Database 5.21.2 build commit:610b6d777436fe4a281a371cae85ac40f01f4f5e) on x86_64-pc-linux-gnu, compiled by GCC gcc (GCC) 6.2.0, 64-bit compiled on Aug  7 2019 20:38:47
    (1 row)
    
  3. Identify and note the PXF version number. For example:

    gpadmin@gp5master$ pxf version
    
  4. Greenplum 6 removes the gphdfs external table protocol. If you have gphdfs external tables defined in your Greenplum 5 installation, you must delete or migrate them to pxf as described in Migrating gphdfs External Tables to PXF.

  5. Stop PXF on each segment host as described in Stopping PXF.

  6. If you plan to install Greenplum Database 6 on a new set of hosts, be sure to save a copy of the $PXF_CONF directory in your Greenplum 5 installation.

  7. Install and configure Greenplum Database 6, migrate Greenplum 5 table definitions and data to your Greenplum 6 installation, and then continue your PXF migration with Step 2: Migrating PXF to Greenplum 6.

Step 2: Migrating PXF to Greenplum 6

After you install and configure Greenplum Database 6 and migrate table definitions and data from the Greenplum 5 installation, perform the following procedure to configure and start the new PXF software:

  1. Log in to the Greenplum Database 6 master node. For example:

    $ ssh gpadmin@<gp6master>
    
  2. Identify and note the version number of your Greenplum Database 6 installation.

  3. Identify the version of PXF installed on your Greenplum 6 hosts:

    gpadmin@gp6master$ pxf version
    
  4. If the version of PXF installed in your Greenplum 5 installation is newer than the version of PXF installed in your Greenplum 6 cluster, install the PXF distribution on your Greenplum 6 hosts as described in Installing PXF. Be sure to install a version of PXF that is the same, or newer, than the version of PXF installed in your Greenplum 5 cluster.

  5. If you installed Greenplum Database 6 on a new set of hosts, copy the $PXF_CONF directory from your Greenplum 5 installation to the master node. Consider copying the directory to the same file system location at which it resided in the 5 cluster. For example, if PXF_CONF=/usr/local/greenplum-pxf:

    gpadmin@gp6master$ scp -r gpadmin@<gp5master>:/usr/local/greenplum-pxf /usr/local/
    
  6. Initialize PXF on each segment host as described in Initializing PXF, specifying the PXF_CONF directory that you copied in the step above.

  7. Perform any version-applicable upgrade actions identified in Step 2 of the PXF 5.x upgrade documentation in your Greenplum Database 6 installation. Start with step 6 in the procedure.

  8. Synchronize the PXF configuration from the Greenplum Database 6 master host to the standby master and each segment host in the cluster. For example:

    gpadmin@gp6master$ pxf cluster sync
    
  9. Start PXF on each Greenplum Database 6 segment host as described in Starting PXF.

  10. Verify the migration by testing that each PXF external table can access the referenced data store.