Using the S3 Storage Plugin with gpbackup and gprestore

Warning: The S3 storage plugin for gpbackup and gprestore is an experimental feature and is not intended for use in a production environment. Experimental features are subject to change without notice in future releases.

The S3 storage plugin application lets you use an Amazon Simple Storage Service (Amazon S3) location to store and retrieve backups when you run gpbackup and gprestore. Amazon S3 provides secure, durable, highly scalable object storage.

To use the S3 storage plugin application, you specify the location of the plugin, the Amazon Web Services (AWS) login credentials, and the backup location in a configuration file. When you run gpbackup or gprestore, you specify the configuration file with the option --plugin-config. For information about the configuration file, see S3 Storage Plugin Configuration File Format.

If you perform a backup operation with the gpbackup option --plugin-config, you must also specify the --plugin-config option when you restore the backup with gprestore.
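The restore requirement above can be sketched as a small shell wrapper. This is an illustration only: the function name restore_with_plugin is hypothetical, and the sketch assumes that you identify the backup to restore with the gprestore option --timestamp.

```shell
# Hypothetical helper: restore a backup that was created with --plugin-config.
# $1 = backup timestamp (YYYYMMDDHHMMSS)
# $2 = path to the same plugin configuration file that was passed to gpbackup
restore_with_plugin() {
  if [ "$#" -ne 2 ]; then
    echo "usage: restore_with_plugin <timestamp> <plugin-config-file>" >&2
    return 2
  fi
  # The backup was written through the plugin, so the restore must read through it too.
  gprestore --timestamp "$1" --plugin-config "$2"
}
```

For example, restore_with_plugin 20180521143000 /home/gpadmin/s3-test-config.yaml restores a backup with that timestamp. Both the timestamp and the path are placeholder values.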

S3 Storage Plugin Configuration File Format

The configuration file specifies the absolute path to the Greenplum Database S3 storage plugin executable, AWS connection credentials, and S3 location.

The S3 storage plugin configuration file uses the YAML 1.1 document format and implements its own schema for these settings.

The configuration file must be a valid YAML document. The gpbackup and gprestore utilities process the configuration file in order and use indentation (spaces) to determine the document hierarchy and the relationships of the sections to one another. White space is significant. Do not use white space simply for formatting purposes, and do not use tabs at all.
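Because tabs are not allowed, it can help to check a configuration file for them before running a backup. The following is a minimal sketch; the function name check_no_tabs is hypothetical and is not part of gpbackup or the plugin.

```shell
# Hypothetical helper: fail if the configuration file contains a tab character.
# $1 = path to the plugin configuration file
check_no_tabs() {
  if [ ! -r "$1" ]; then
    echo "error: cannot read $1" >&2
    return 2
  fi
  tab=$(printf '\t')
  # grep -n prints each offending line with its line number.
  if grep -n "$tab" "$1"; then
    echo "error: $1 contains tab characters; indent with spaces only" >&2
    return 1
  fi
  return 0
}
```

The function returns 0 when the file contains no tabs, and a nonzero status otherwise.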

This is the structure of an S3 storage plugin configuration file.

executablepath: <absolute-path-to-gpbackup_s3_plugin>
options: 
  region: <aws-region>
  aws_access_key_id: <aws-user-id>
  aws_secret_access_key: <aws-user-id-key>
  bucket: <s3-bucket>
  backupdir: <s3-location>

executablepath
Required. Absolute path to the plugin executable. For example, in a Pivotal Greenplum Database installation the plugin is located at $GPHOME/bin/gpbackup_s3_plugin.
options
Required. Begins the S3 storage plugin options section.
region
Required. The AWS region.
aws_access_key_id
Required. The AWS access key ID used to access the S3 bucket location that stores backup files.
aws_secret_access_key
Required. The AWS secret access key that belongs to the access key ID used to access the S3 bucket location.
bucket
Required. The name of the S3 bucket in the AWS region. The bucket must exist.
backupdir
Required. The S3 location for backups. During a backup operation, the plugin creates the S3 location if it does not exist in the S3 bucket.
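
Since every setting listed above is required, a quick text check can catch a missing key before a backup starts. This is a sketch under stated assumptions: the function name check_required_keys is hypothetical, and the check only searches for key names line by line. It does not parse YAML or validate the values.

```shell
# Hypothetical helper: report required settings missing from a plugin configuration file.
# $1 = path to the plugin configuration file. Returns 0 when nothing is missing.
check_required_keys() {
  missing=0
  # Top-level keys start in the first column.
  for key in executablepath options; do
    grep -q "^${key}:" "$1" || { echo "missing: $key" >&2; missing=1; }
  done
  # Option keys are indented with one or more spaces under "options:".
  for key in region aws_access_key_id aws_secret_access_key bucket backupdir; do
    grep -q "^  *${key}:" "$1" || { echo "missing: options.$key" >&2; missing=1; }
  done
  return "$missing"
}
```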

Example

This is an example S3 storage plugin configuration file that is used in the next gpbackup example command. The name of the file is s3-test-config.yaml.

executablepath: $GPHOME/bin/gpbackup_s3_plugin
options: 
  region: us-west-2
  aws_access_key_id: test-s3-user
  aws_secret_access_key: asdf1234asdf
  bucket: gpdb-backup
  backupdir: test/backup3

This gpbackup example backs up the database demo using the S3 storage plugin. The absolute path to the S3 storage plugin configuration file is /home/gpadmin/s3-test-config.yaml.

gpbackup --dbname demo --single-data-file --plugin-config /home/gpadmin/s3-test-config.yaml

The S3 storage plugin writes the backup files to this S3 location in the AWS region us-west-2.

gpdb-backup/test/backup3/backups/YYYYMMDD/YYYYMMDDHHMMSS/

Notes

The S3 storage plugin application must be in the same location on every Greenplum Database host. The configuration file is required only on the master host.

When running gpbackup, the --plugin-config option is supported only with --single-data-file or --metadata-only.

When you perform a backup with the S3 storage plugin, the plugin stores the backup files in this location in the S3 bucket.

<backupdir>/backups/<datestamp>/<timestamp>

Here, backupdir is the location you specified in the S3 configuration file, datestamp is the backup date in the form YYYYMMDD, and timestamp is the backup time stamp in the form YYYYMMDDHHMMSS.
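
The location pattern can be sketched as a shell function that prints where one backup is stored, including the bucket name. The function name s3_backup_location is hypothetical, and the sketch assumes that the datestamp is the first eight characters of the timestamp, as the YYYYMMDD/YYYYMMDDHHMMSS pattern suggests.

```shell
# Hypothetical helper: print the S3 location for one backup.
# $1 = bucket, $2 = backupdir, $3 = backup timestamp (YYYYMMDDHHMMSS)
s3_backup_location() {
  # Assumption: the datestamp is the YYYYMMDD prefix of the timestamp.
  datestamp=$(printf '%s' "$3" | cut -c1-8)
  printf '%s/%s/backups/%s/%s/\n' "$1" "$2" "$datestamp" "$3"
}
```

With the example configuration, s3_backup_location gpdb-backup test/backup3 20180521143000 prints gpdb-backup/test/backup3/backups/20180521/20180521143000/. The timestamp is a placeholder value.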

Using Amazon S3 to back up and restore data requires an AWS account with access to the Amazon S3 bucket. These are the Amazon S3 bucket permissions required for backing up and restoring data.
  • Upload/Delete for the S3 user ID that uploads the files
  • Open/Download and View for the S3 user ID that accesses the files
For information about Amazon S3, see Amazon S3.