About Accessing the S3 Object Store

PXF is installed with a connector to the S3 object store. PXF supports the following additional runtime features with this connector:

  • Overriding the S3 credentials specified in the server configuration by providing them in the CREATE EXTERNAL TABLE command DDL.
  • Using the Amazon S3 Select service to read certain CSV and Parquet data from S3.

Overriding the S3 Server Configuration with DDL

If you are accessing an S3-compatible object store, you can override the credentials in an S3 server configuration by specifying the S3 access key ID and secret key directly, using these custom options in the CREATE EXTERNAL TABLE LOCATION clause:

Custom Option    Value Description
accesskey        The AWS account access key ID.
secretkey        The secret key associated with the AWS access key ID.

For example:

CREATE EXTERNAL TABLE pxf_ext_tbl(name text, orders int)
  LOCATION ('pxf://S3_BUCKET/dir/file.txt?PROFILE=s3:text&SERVER=s3srvcfg&accesskey=YOURKEY&secretkey=YOURSECRET')
FORMAT 'TEXT' (delimiter=E',');

Credentials that you provide in this manner are visible as part of the external table definition. Do not use this method of passing credentials in a production environment.
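
For comparison, here is a minimal sketch of the same table defined without the override options, so that no credentials appear in the DDL. It assumes that the s3srvcfg server configuration already contains valid S3 credentials; the table name pxf_ext_tbl_srvcfg is hypothetical.

```sql
-- Hypothetical table name. Assumes the s3srvcfg server configuration
-- already holds the S3 credentials, so none are passed in the LOCATION URI.
CREATE EXTERNAL TABLE pxf_ext_tbl_srvcfg(name text, orders int)
  LOCATION ('pxf://S3_BUCKET/dir/file.txt?PROFILE=s3:text&SERVER=s3srvcfg')
FORMAT 'TEXT' (delimiter=E',');
```

Because the LOCATION URI names only the server configuration, the external table definition exposes no secrets to users who can view it.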

PXF does not currently support overriding Azure, Google Cloud Storage, or MinIO server credentials in this manner.

Refer to Configuration Property Precedence for detailed information about the precedence rules that PXF uses to obtain configuration property settings for a Greenplum Database user.

Using the Amazon S3 Select Service

Refer to Reading CSV and Parquet Data from S3 Using S3 Select for specific information on how PXF can use the Amazon S3 Select service to read CSV and Parquet files stored on S3.
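
As a rough illustration only, the sketch below shows what an external table that requests S3 Select might look like. It assumes that the S3_SELECT custom option covered in the referenced topic is available in your PXF version; the table name and file path are hypothetical, and the referenced topic remains the authoritative source for supported options and values.

```sql
-- Hypothetical table name and file path. S3_SELECT=ON is an assumption taken
-- from the referenced S3 Select topic; it asks PXF to use S3 Select for reads.
CREATE EXTERNAL TABLE pxf_s3select_csv(name text, orders int)
  LOCATION ('pxf://S3_BUCKET/dir/file.csv?PROFILE=s3:text&SERVER=s3srvcfg&S3_SELECT=ON')
FORMAT 'CSV';
```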