Writing HDFS File Data with PXF

The PXF HDFS connector supports writable external tables using the HdfsTextSimple and SequenceWritable profiles. You might create a writable table to export data from a Greenplum Database internal table to binary or text HDFS files.

Use the HdfsTextSimple profile to write text data. Use the SequenceWritable profile to write binary data.

This section describes how to use these PXF profiles to write data to HDFS.

Note: Tables that you create with writable profiles can only be used for INSERT operations. To query the inserted data, you must create a separate readable external table that references the new HDFS file and specifies the equivalent readable profile.
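The write-then-read pattern described in this note can be sketched as follows. This is a minimal illustration, not a definitive recipe: the table names, column list, and HDFS directory data/pxf_examples/sales_export are hypothetical, and the LOCATION and FORMAT clauses assume the pxf:// URI syntax of the Greenplum 5.x PXF release, which may differ in your version.

```sql
-- Writable external table: accepts INSERT only.
-- The table name, columns, and HDFS path are illustrative assumptions.
CREATE WRITABLE EXTERNAL TABLE pxf_sales_w (location text, month text, num_orders int, total_sales float8)
    LOCATION ('pxf://data/pxf_examples/sales_export?PROFILE=HdfsTextSimple')
    FORMAT 'TEXT' (delimiter=',');

-- Export a row to HDFS through the writable table.
INSERT INTO pxf_sales_w VALUES ('Frankfurt', 'Mar', 777, 3956.98);

-- Separate readable external table over the same HDFS directory,
-- using the equivalent readable profile.
CREATE EXTERNAL TABLE pxf_sales_r (location text, month text, num_orders int, total_sales float8)
    LOCATION ('pxf://data/pxf_examples/sales_export?PROFILE=HdfsTextSimple')
    FORMAT 'CSV';

-- Query the exported data through the readable table.
SELECT * FROM pxf_sales_r ORDER BY total_sales;
```

A SELECT against pxf_sales_w itself would fail, which is why the second table is needed.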

Prerequisites

Before working with HDFS file data using PXF, ensure that:

  • You have installed and configured a Hadoop client on each Greenplum Database segment host. Refer to Installing and Configuring the Hadoop Client for PXF for instructions.
  • You have initialized and started PXF on your Greenplum Database segment hosts. See Configuring, Initializing, and Starting PXF for PXF initialization, configuration, and startup information.
  • You have granted the gpadmin user read and write permission to the appropriate directories in your HDFS file system.

Writing to PXF External Tables

The PXF HDFS connector supports two writable profiles: HdfsTextSimple and SequenceWritable.
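As a rough sketch of how the two profiles differ in a table definition, the statements below create one writable table per profile. The table names, HDFS paths, and the Java class named in DATA-SCHEMA are hypothetical placeholders. The DATA-SCHEMA option and the pxfwritable_export formatter are assumptions based on the Greenplum 5.x PXF release; verify them against the documentation for your version.

```sql
-- Text output: HdfsTextSimple with a delimited TEXT format.
-- The table name and HDFS path are illustrative assumptions.
CREATE WRITABLE EXTERNAL TABLE pxf_orders_text_w (location text, month text, num_orders int, total_sales float8)
    LOCATION ('pxf://data/pxf_examples/orders_text?PROFILE=HdfsTextSimple')
    FORMAT 'TEXT' (delimiter=',');

-- Binary output: SequenceWritable with a CUSTOM format.
-- com.example.pxf.OrdersWritable is a hypothetical Writable class that
-- you would supply to describe the record layout.
CREATE WRITABLE EXTERNAL TABLE pxf_orders_seq_w (location text, month text, num_orders int, total_sales real)
    LOCATION ('pxf://data/pxf_examples/orders_seq?PROFILE=SequenceWritable&DATA-SCHEMA=com.example.pxf.OrdersWritable')
    FORMAT 'CUSTOM' (formatter='pxfwritable_export');
```

In both cases the LOCATION path names an HDFS directory rather than a single file.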

Custom Options

HdfsTextSimple Profile

Example: Writing Data Using the HdfsTextSimple Profile

SequenceWritable Profile

Example: Writing Data Using the SequenceWritable Profile

Reading the Record Key

Example: Using Record Keys