gpbackup

Create a Greenplum Database backup for use with the gprestore utility.

Note: gpbackup and gprestore are experimental utilities and are not intended for use in a production environment. Experimental features are subject to change without notice in future releases.

Synopsis

gpbackup -dbname database_name
   [-backupdir directory]
   [-compression-level level]
   [-data-only]
   [-debug]
   [-exclude-schema schema_name]
   [-exclude-table-file file_name]
   [-include-schema schema_name]
   [-include-table-file file_name]
   [-leaf-partition-data]
   [-metadata-only]
   [-no-compression]
   [-quiet]
   [-single-data-file]
   [-verbose]
   [-version]
   [-with-stats]

Description

The gpbackup utility backs up the contents of a database into a collection of metadata files and data files that can be used to restore the database at a later time using gprestore. By default, gpbackup backs up objects in the specified database as well as global Greenplum Database system objects. You can optionally supply the -globals option with gprestore to restore global objects. See Objects Included in a Backup or Restore for additional information.

gpbackup stores the object metadata files and DDL files for a backup in the Greenplum Database master data directory by default. Greenplum Database segments use the COPY .. ON SEGMENT command to store their data for backed-up tables in compressed CSV data files, located in each segment's data directory. See Understanding Backup Files for additional information.

You can add the -backupdir option to copy all backup files from the Greenplum Database master and segment hosts to an absolute path for later use. Additional options are provided to filter the backup set in order to include or exclude specific tables.
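
When -backupdir is not supplied, the Options section gives the default metadata location as $MASTER_DATA_DIRECTORY/backups/YYYYMMDD/YYYYMMDDhhmmss/. The following is a minimal sketch of how a script might derive that path from a backup timestamp. The function name backup_metadata_dir is hypothetical and is not part of gpbackup; the sketch only formats a path and does not check that the directory exists.

```shell
# Hypothetical helper (not part of gpbackup): print the default metadata
# directory for a backup, given the master data directory and the
# 14-digit backup timestamp (YYYYMMDDhhmmss).
backup_metadata_dir() {
    master_dir=$1
    timestamp=$2
    # Reject anything that contains a non-digit character.
    case $timestamp in
        *[!0-9]*) echo "timestamp must be 14 digits" >&2; return 1 ;;
    esac
    # Reject anything that is not exactly 14 characters long.
    if [ "${#timestamp}" -ne 14 ]; then
        echo "timestamp must be 14 digits" >&2
        return 1
    fi
    day=${timestamp%??????}   # strip hhmmss, leaving YYYYMMDD
    printf '%s/backups/%s/%s/\n' "$master_dir" "$day" "$timestamp"
}
```

A call such as backup_metadata_dir "$MASTER_DATA_DIRECTORY" 20170915123456 (the timestamp is made up for illustration) prints the directory in which the metadata files for that backup would be expected.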

Each gpbackup task uses a single transaction on the Greenplum Database master host. Utility-mode connections are created for each segment host, which perform the associated COPY .. ON SEGMENT operations in parallel. The backup process acquires an ACCESS SHARE lock on each table that is backed up.

gpbackup sends status email notifications after a backup operation completes if you place a file named mail_contacts in the home directory of the Greenplum Database superuser (gpadmin) or in the same directory as the gpbackup utility ($GPHOME/bin). See Configuring Email Notifications.

Options

-dbname database_name
Required. Specifies the database to back up.
-backupdir directory
Optional. Copies all required backup files (metadata files and data files) to the specified directory. You must specify directory as an absolute path (not relative). If you do not supply this option, metadata files are created on the Greenplum Database master host in the $MASTER_DATA_DIRECTORY/backups/YYYYMMDD/YYYYMMDDhhmmss/ directory. Segment hosts create CSV data files in the <seg_dir>/backups/YYYYMMDD/YYYYMMDDhhmmss/ directory. When you specify a custom backup directory, files are copied to these paths in subdirectories of the backup directory.
-compression-level level
Optional. Specifies the gzip compression level (from 1 to 9) used to compress data files. The default is 1. Note that gpbackup uses compression by default.
-data-only
Optional. Backs up only the table data into CSV files, but does not back up the metadata files needed to recreate the tables and other database objects.
-debug
Optional. Displays verbose debug messages during operation.
-exclude-schema schema_name
Optional. Specifies a database schema to exclude from the backup. You can specify this option multiple times to exclude multiple schemas. You cannot combine this option with the -include-schema option. See Filtering the Contents of a Backup or Restore for more information.
-exclude-table-file file_name
Optional. Specifies a text file containing a list of tables to exclude from the backup. Each line in the text file must define a single table using the format <schema-name>.<table-name>. The file must not include trailing lines. If a table or schema name uses any character other than a lowercase letter, number, or an underscore character, then you must include that name in double quotes.
You cannot use this option in combination with -leaf-partition-data. Although you can specify leaf partition names in a file specified with -exclude-table-file, gpbackup ignores the partition names.
See Filtering the Contents of a Backup or Restore for more information.
-include-schema schema_name
Optional. Specifies a database schema to include in the backup. You can specify this option multiple times to include multiple schemas. If you specify this option, any schemas that are not included in subsequent -include-schema options are omitted from the backup set. You cannot combine this option with the -exclude-schema option. See Filtering the Contents of a Backup or Restore for more information.
-include-table-file file_name
Optional. Specifies a text file containing a list of tables to include in the backup. Each line in the text file must define a single table using the format <schema-name>.<table-name>. The file must not include trailing lines. If a table or schema name uses any character other than a lowercase letter, number, or an underscore character, then you must include that name in double quotes. Any tables not listed in this file are omitted from the backup set.
You can optionally specify a table leaf partition name in place of the table name, to include only specific leaf partitions in a backup with the -leaf-partition-data option.
See Filtering the Contents of a Backup or Restore for more information.
-leaf-partition-data
Optional. For partitioned tables, creates one data file per leaf partition instead of one data file for the entire table (the default). Using this option also enables you to specify individual leaf partitions to include in a backup, with the -include-table-file option. You cannot use this option in combination with -exclude-table-file.
-metadata-only
Optional. Creates only the metadata files (DDL) needed to recreate the database objects, but does not back up the actual table data.
-no-compression
Optional. Do not compress the table data CSV files.
-quiet
Optional. Suppress all non-warning, non-error log messages.
-single-data-file
Optional. Create a single data file on each segment host for all tables backed up on that segment. By default, gpbackup creates one compressed CSV file for each table that is backed up on the segment.
Note: If you use the -single-data-file option to combine table backups into a single file per segment, you cannot perform a parallel restore operation with gprestore, and you cannot use the -include-schema option with gprestore.
-verbose
Optional. Print verbose log messages.
-version
Optional. Print the version number and exit.
-with-stats
Optional. Include query plan statistics in the backup set.
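
Several of the options above cannot be combined: -include-schema with -exclude-schema, and -leaf-partition-data with -exclude-table-file. As a rough sketch, a wrapper script could check for these combinations before invoking gpbackup. The function check_gpbackup_args is hypothetical and not part of gpbackup; it only scans for option names and would misfire if an option value happened to equal one of those names.

```shell
# Hypothetical pre-flight check (not part of gpbackup): returns non-zero
# if the argument list combines options that the Options section says
# cannot be used together.
check_gpbackup_args() {
    inc_schema=0 exc_schema=0 leaf=0 exc_table=0
    for arg in "$@"; do
        case $arg in
            -include-schema)      inc_schema=1 ;;
            -exclude-schema)      exc_schema=1 ;;
            -leaf-partition-data) leaf=1 ;;
            -exclude-table-file)  exc_table=1 ;;
        esac
    done
    if [ "$inc_schema" -eq 1 ] && [ "$exc_schema" -eq 1 ]; then
        echo "-include-schema cannot be combined with -exclude-schema" >&2
        return 1
    fi
    if [ "$leaf" -eq 1 ] && [ "$exc_table" -eq 1 ]; then
        echo "-leaf-partition-data cannot be combined with -exclude-table-file" >&2
        return 1
    fi
    return 0
}
```

The check can be chained in front of the real command, for example: check_gpbackup_args -dbname demo -include-schema twitter && gpbackup -dbname demo -include-schema twitter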

Examples

Back up all schemas and tables in the "demo" database, including global Greenplum Database system objects:
$ gpbackup -dbname demo
Back up all schemas and tables in the "demo" database except for the "twitter" schema:
$ gpbackup -dbname demo -exclude-schema twitter
Back up only the "twitter" schema in the "demo" database:
$ gpbackup -dbname demo -include-schema twitter
Back up all schemas and tables in the "demo" database, including global Greenplum Database system objects and query statistics, and copy all backup files to the /home/gpadmin/backup directory:
$ gpbackup -dbname demo -with-stats -backupdir /home/gpadmin/backup
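
The table list format used by -include-table-file and -exclude-table-file can be illustrated with a short sketch. It assumes that "must not include trailing lines" means no blank lines after the last entry, and that a schema name and a table name are each double-quoted separately when they need quoting. The table names, schema names, and file location are made up for illustration.

```shell
# Write a table list for -include-table-file. The schema and table names
# are made up for illustration. Names that use characters other than
# lowercase letters, digits, or underscores are wrapped in double quotes.
list_file=${TMPDIR:-/tmp}/include_tables.txt
printf '%s\n' \
    'public.sales' \
    'twitter.tweets' \
    '"Sales"."Orders-2017"' > "$list_file"

# Guard against blank lines anywhere in the file, including at the end.
if grep -q '^[[:space:]]*$' "$list_file"; then
    echo "blank line found in $list_file" >&2
    exit 1
fi

# The backup itself would then be run as (not executed here):
#   gpbackup -dbname demo -include-table-file "$list_file"
```

Writing the list with printf, one argument per table, makes it harder to leave a stray empty line at the end of the file than editing it by hand.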

See Also

gprestore