Choosing the Table Storage Model
Choosing the Table Storage Model
Greenplum Database supports several storage models and a mix of storage models. When you create a table, you choose how to store its data. This topic explains the options for table storage and how to choose the best storage model for your workload.
- Heap Storage
- Append-Optimized Storage
- Choosing Row or Column-Oriented Storage
- Using Compression (Append-Optimized Tables Only)
- Checking the Compression and Distribution of an Append-Optimized Table
- Altering a Table
- Dropping a Table
For information about the parameter, see "Server Configuration Parameters" in the Greenplum Database Reference Guide.
By default, Greenplum Database uses the same heap storage model as PostgreSQL. Heap table storage works best with OLTP-type workloads where the data is often modified after it is initially loaded. UPDATE and DELETE operations require storing row-level versioning information to ensure reliable database transaction processing. Heap tables are best suited for smaller tables, such as dimension tables, that are often updated after they are initially loaded.
Append-optimized table storage works best with denormalized fact tables in a data warehouse environment. Denormalized fact tables are typically the largest tables in the system. Fact tables are usually loaded in batches and accessed by read-only queries. Moving large fact tables to an append-optimized storage model eliminates the storage overhead of the per-row update visibility information, saving about 20 bytes per row. This allows for a leaner and easier-to-optimize page structure. The storage model of append-optimized tables is optimized for bulk data loading. Single row INSERT statements are not recommended.
To create a heap table
Row-oriented heap tables are the default storage type.
=> CREATE TABLE foo (a int, b text) DISTRIBUTED BY (a);
Use the WITH clause of the CREATE TABLE command to declare the table storage options. The default is to create the table as a regular row-oriented heap-storage table. For example, to create an append-optimized table with no compression:
=> CREATE TABLE bar (a int, b text) WITH (appendoptimized=true) DISTRIBUTED BY (a);
UPDATE and DELETE are not allowed on append-optimized tables in a repeatable read or serializable transaction and will cause the transaction to abort. DECLARE...FOR UPDATE and triggers are not supported with append-optimized tables. CLUSTER on append-optimized tables is only supported over B-tree indexes.
Choosing Row or Column-Oriented Storage
Greenplum provides a choice of storage orientation models: row, column, or a combination of both. This topic provides general guidelines for choosing the optimum storage orientation for a table. Evaluate performance using your own data and query workloads.
- Row-oriented storage: good for OLTP types of workloads with many iterative transactions and many columns of a single row needed all at once, so retrieving is efficient.
- Column-oriented storage: good for data warehouse workloads with aggregations of data computed over a small number of columns, or for single columns that require regular updates without modifying other column data.
For most general purpose or mixed workloads, row-oriented storage offers the best combination of flexibility and performance. However, there are use cases where a column-oriented storage model provides more efficient I/O and storage. Consider the following requirements when deciding on the storage orientation model for a table:
Updates of table data. If you load and update the table data
frequently, choose a row-oriented heap table. Column-oriented table storage is only available on append-optimized
See Heap Storage for more information.
- Frequent INSERTs. If rows are frequently inserted into the table, consider a row-oriented model. Column-oriented tables are not optimized for write operations, as column values for a row must be written to different places on disk.
Number of columns requested in queries. If you typically request
all or the majority of columns in the SELECT list or
WHERE clause of your queries, consider a row-oriented model.
Column-oriented tables are best suited to queries that aggregate many values of a single
column where the WHERE or HAVING predicate is also on
the aggregate column. For
SELECT AVG(salary)... WHERE salary > 10000
Or where the WHERE predicate is on a single column and returns a relatively small number of rows. For example:
SELECT salary, dept ... WHERE state='CA'
- Number of columns in the table. Row-oriented storage is more efficient when many columns are required at the same time, or when the row-size of a table is relatively small. Column-oriented tables can offer better query performance on tables with many columns where you access a small subset of columns in your queries.
- Compression. Column data has the same data type, so storage size optimizations are available in column-oriented data that are not available in row-oriented data. For example, many compression schemes use the similarity of adjacent data to compress. However, the greater adjacent compression achieved, the more difficult random access can become, as data must be uncompressed to be read.
To create a column-oriented table
The WITH clause of the CREATE TABLE command specifies the table's storage options. The default is a row-oriented heap table. Tables that use column-oriented storage must be append-optimized tables. For example, to create a column-oriented table:
=> CREATE TABLE bar (a int, b text) WITH (appendoptimized=true, orientation=column) DISTRIBUTED BY (a);
Using Compression (Append-Optimized Tables Only)
There are two types of in-database compression available in the Greenplum Database for append-optimized tables:
- Table-level compression is applied to an entire table.
- Column-level compression is applied to a specific column. You can apply different column-level compression algorithms to different columns.
The following table summarizes the available compression algorithms.
|Table Orientation||Available Compression Types||Supported Algorithms|
|Row||Table||ZLIB, ZSTD, and QUICKLZ1|
|Column||Column and Table||RLE_TYPE, ZLIB, ZSTD, and QUICKLZ1|
When choosing a compression type and level for append-optimized tables, consider these factors:
- CPU usage. Your segment systems must have the available CPU power to compress and uncompress the data.
- Compression ratio/disk size. Minimizing disk size is one factor, but also consider the time and CPU capacity required to compress and scan data. Find the optimal settings for efficiently compressing data without causing excessively long compression times or slow scan rates.
- Speed of compression. QuickLZ compression generally uses less CPU capacity
and compresses data faster at a lower compression ratio than zlib. zlib provides higher
compression ratios at lower speeds.
For example, at compression level 1 (compresslevel=1), QuickLZ and zlib have comparable compression ratios, though at different speeds. Using zlib with compresslevel=6 can significantly increase the compression ratio compared to QuickLZ, though with lower compression speed. Zstandard compression can provide for either good compression ratio or speed, depending on compression level, or a good compromise on both.
- Speed of decompression/scan rate. Performance with compressed
append-optimized tables depends on hardware, query tuning settings, and other factors.
Perform comparison testing to determine the actual performance in your
environment.Note: Do not create compressed append-optimized tables on file systems that use compression. If the file system on which your segment data directory resides is a compressed file system, your append-optimized table must not use compression.
Performance with compressed append-optimized tables depends on hardware, query tuning settings, and other factors. You should perform comparison testing to determine the actual performance in your environment.
An ENCODING clause specifies compression type and level for individual columns. When an ENCODING clause conflicts with a WITH clause, the ENCODING clause has higher precedence than the WITH clause.
To create a compressed table
The WITH clause of the CREATE TABLE command declares the table storage options. Tables that use compression must be append-optimized tables. For example, to create an append-optimized table with zlib compression at a compression level of 5:
=> CREATE TABLE foo (a int, b text) WITH (appendoptimized=true, compresstype=zlib, compresslevel=5);
Checking the Compression and Distribution of an Append-Optimized Table
Greenplum provides built-in functions to check the compression ratio and the distribution of an append-optimized table. The functions take either the object ID or a table name. You can qualify the table name with a schema name.
|Set of (dbid, tuplecount) rows||Shows the distribution of an append-optimized table's rows across the array. Returns a set of rows, each of which includes a segment dbid and the number of tuples stored on the segment.|
|float8||Calculates the compression ratio for a compressed append-optimized table. If information is not available, this function returns a value of -1.|
The compression ratio is returned as a common ratio. For example, a returned value of 3.19, or 3.19:1, means that the uncompressed table is slightly larger than three times the size of the compressed table.
The distribution of the table is returned as a set of rows that indicate how many tuples are stored on each segment. For example, in a system with four primary segments with dbid values ranging from 0 - 3, the function returns four rows similar to the following:
=# SELECT get_ao_distribution('lineitem_comp'); get_ao_distribution --------------------- (0,7500721) (1,7501365) (2,7499978) (3,7497731) (4 rows)
Support for Run-length Encoding
Greenplum Database supports Run-length Encoding (RLE) for column-level compression. RLE data compression stores repeated data as a single data value and a count. For example, in a table with two columns, a date and a description, that contains 200,000 entries containing the value date1 and 400,000 entries containing the value date2, RLE compression for the date field is similar to date1 200000 date2 400000. RLE is not useful with files that do not have large sets of repeated data as it can greatly increase the file size.
There are four levels of RLE compression available. The levels progressively increase the compression ratio, but decrease the compression speed.
Greenplum Database versions 4.2.1 and later support column-oriented RLE compression. To backup a table with RLE compression that you intend to restore to an earlier version of Greenplum Database, alter the table to have no compression or a compression type supported in the earlier version (ZLIB or QUICKLZ) before you start the backup operation.
Greenplum Database combines delta compression with RLE compression for data in columns of type BIGINT, INTEGER, DATE, TIME, or TIMESTAMP. The delta compression algorithm is based on the change between consecutive column values and is designed to improve compression when data is loaded in sorted order or when the compression is applied to data in sorted order.
Adding Column-level Compression
You can add the following storage directives to a column for append-optimized tables with column orientation:
- Compression type
- Compression level
- Block size for a column
Add storage directives using the CREATE TABLE, ALTER TABLE, and CREATE TYPE commands.
The following table details the types of storage directives and possible values for each.
|compresstype||Type of compression.||
zlib: deflate algorithm
quicklz: fast compression
RLE_TYPE: run-length encoding
none: no compression
|Values are not case-sensitive.|
|compresslevel||Compression level.||zlib compression: 1-9||
1 is the fastest method with the least
compression. 1 is the default.
9 is the slowest method with the most compression.
|zstd compression: 1-19||
1 is the fastest method with the least
compression. 1 is the default.
19 is the slowest method with the most compression.
1 – use compression
|1 is the default.|
RLE_TYPE compression: 1 –
1 - apply RLE only
2 - apply RLE then apply zlib compression level 1
3 - apply RLE then apply zlib compression level 5
4 - apply RLE then apply zlib compression level 9
1 is the fastest method with the least
4 is the slowest method with the most compression. 1 is the default.
|blocksize||The size in bytes for each block in the table||8192 – 2097152||The value must be a multiple of 8192.|
The following is the format for adding storage directives.
[ ENCODING ( storage_directive [,…] ) ]
where the word ENCODING is required and the storage directive has three parts:
- The name of the directive
- An equals sign
- The specification
Separate multiple storage directives with a comma. Apply a storage directive to a single column or designate it as the default for all columns, as shown in the following CREATE TABLE clauses.
column_name data_type ENCODING ( storage_directive [, … ] ), …
COLUMN column_name ENCODING ( storage_directive [, … ] ), …
DEFAULT COLUMN ENCODING ( storage_directive [, … ] )
C1 char ENCODING (compresstype=quicklz, blocksize=65536)
COLUMN C1 ENCODING (compresstype=zlib, compresslevel=6, blocksize=65536)
DEFAULT COLUMN ENCODING (compresstype=quicklz)
Default Compression Values
If the compression type, compression level and block size are not defined, the default is no compression, and the block size is set to the Server Configuration Parameter block_size.
Precedence of Compression Settings
Column compression settings are inherited from the type level to the table level to the partition level to the subpartition level. The lowest-level settings have priority.
- Column compression settings defined at the table level override any compression settings for the type.
- Column compression settings specified at the table level override any compression settings for the entire table.
- Column compression settings specified for partitions override any compression settings at the column or table levels.
- Column compression settings specified for subpartitions override any compression settings at the partition, column or table levels.
- When an ENCODING clause conflicts with a WITH clause, the ENCODING clause has higher precedence than the WITH clause.
Tables created using the LIKE clause ignore storage directive and column reference storage directives.
Optimal Location for Column Compression Settings
The best practice is to set the column compression settings at the level where the data resides. See Example 5, which shows a table with a partition depth of 2. RLE_TYPE compression is added to a column at the subpartition level.
Storage Directives Examples
The following examples show the use of storage directives in CREATE TABLE statements.
In this example, column c1 is compressed using zstd and uses the block size defined by the system. Column c2 is compressed with quicklz, and uses a block size of 65536. Column c3 is not compressed and uses the block size defined by the system.
CREATE TABLE T1 (c1 int ENCODING (compresstype=zstd), c2 char ENCODING (compresstype=quicklz, blocksize=65536), c3 char) WITH (appendoptimized=true, orientation=column);
In this example, column c1 is compressed using zlib and uses the block size defined by the system. Column c2 is compressed with quicklz, and uses a block size of 65536. Column c3 is compressed using RLE_TYPE and uses the block size defined by the system.
CREATE TABLE T2 (c1 int ENCODING (compresstype=zlib), c2 char ENCODING (compresstype=quicklz, blocksize=65536), c3 char, COLUMN c3 ENCODING (compresstype=RLE_TYPE) ) WITH (appendoptimized=true, orientation=column);
In this example, column c1 is compressed using zlib and uses the block size defined by the system. Column c2 is compressed with quicklz, and uses a block size of 65536. Column c3 is compressed using zlib and uses the block size defined by the system. Note that column c3 uses zlib (not RLE_TYPE) in the partitions, because the column storage in the partition clause has precedence over the storage directive in the column definition for the table.
CREATE TABLE T3 (c1 int ENCODING (compresstype=zlib), c2 char ENCODING (compresstype=quicklz, blocksize=65536), c3 text, COLUMN c3 ENCODING (compresstype=RLE_TYPE) ) WITH (appendoptimized=true, orientation=column) PARTITION BY RANGE (c3) (START ('1900-01-01'::DATE) END ('2100-12-31'::DATE), COLUMN c3 ENCODING (compresstype=zlib));
In this example, CREATE TABLE assigns the zlib compresstype storage directive to c1. Column c2 has no storage directive and inherits the compression type (quicklz) and block size (65536) from the DEFAULT COLUMN ENCODING clause.
Column c3's ENCODING clause defines its compression type, RLE_TYPE. The ENCODING clause defined for a specific column overrides the DEFAULT ENCODING clause, so column c3 uses the default block size, 32768.
Column c4 has a compress type of none and uses the default block size.
CREATE TABLE T4 (c1 int ENCODING (compresstype=zlib), c2 char, c3 text, c4 smallint ENCODING (compresstype=none), DEFAULT COLUMN ENCODING (compresstype=quicklz, blocksize=65536), COLUMN c3 ENCODING (compresstype=RLE_TYPE) ) WITH (appendoptimized=true, orientation=column);
This example creates an append-optimized, column-oriented table, T5. T5 has two partitions, p1 and p2, each of which has subpartitions. Each subpartition has ENCODING clauses:
- The ENCODING clause for partition p1's subpartition sp1 defines column i's compression type as zlib and block size as 65536.
- The ENCODING clauses for partition
p2's subpartition sp1 defines column
i's compression type as rle_type and block size is
the default value. Column k uses the default compression and its
block size is
CREATE TABLE T5(i int, j int, k int, l int) WITH (appendoptimized=true, orientation=column) PARTITION BY range(i) SUBPARTITION BY range(j) ( partition p1 start(1) end(2) ( subpartition sp1 start(1) end(2) column i encoding(compresstype=zlib, blocksize=65536) ), partition p2 start(2) end(3) ( subpartition sp1 start(1) end(2) column i encoding(compresstype=rle_type) column k encoding(blocksize=8192) ) );
For an example showing how to add a compressed column to an existing table with the ALTER TABLE command, see Adding a Compressed Column to Table.
Adding Compression in a TYPE Command
When you create a new type, you can define default compression attributes for the type. For example, the following CREATE TYPE command defines a type named int33 that specifies quicklz compression.
First, you must define the input and output functions for the new type, int33_in and int33_out:
CREATE FUNCTION int33_in(cstring) RETURNS int33 STRICT IMMUTABLE LANGUAGE internal AS 'int4in'; CREATE FUNCTION int33_out(int33) RETURNS cstring STRICT IMMUTABLE LANGUAGE internal AS 'int4out';
Next, you define the type named int33:
CREATE TYPE int33 ( internallength = 4, input = int33_in, output = int33_out, alignment = int4, default = 123, passedbyvalue, compresstype="zlib", blocksize=65536, compresslevel=1 );
When you specify int33 as a column type in a CREATE TABLE command, the column is created with the storage directives you specified for the type:
CREATE TABLE t2 (c1 int33) WITH (appendoptimized=true, orientation=column);
Table- or column- level storage attributes that you specify in a table definition override type-level storage attributes. For information about creating and adding compression attributes to a type, see CREATE TYPE. For information about changing compression specifications in a type, see ALTER TYPE.
Choosing Block Size
The blocksize is the size, in bytes, for each block in a table. Block sizes must be between 8192 and 2097152 bytes, and be a multiple of 8192. The default is 32768.
Specifying large block sizes can consume large amounts of memory. Block size determines buffering in the storage layer. Greenplum maintains a buffer per partition, and per column in column-oriented tables. Tables with many partitions or columns consume large amounts of memory.
Altering a Table
The ALTER TABLE command changes the definition of a table. Use ALTER TABLE to change table attributes such as column definitions, distribution policy, storage model, and partition structure (see also Maintaining Partitioned Tables). For example, to add a not-null constraint to a table column:
=> ALTER TABLE address ALTER COLUMN street SET NOT NULL;
Altering Table Distribution
ALTER TABLE provides options to change a table's distribution policy. When the table distribution options change, the table data may be redistributed on disk, which can be resource intensive. You can also redistribute table data using the existing distribution policy.
Changing the Distribution Policy
For partitioned tables, changes to the distribution policy apply recursively to the child partitions. This operation preserves the ownership and all other attributes of the table. For example, the following command redistributes the table sales across all segments using the customer_id column as the distribution key:
ALTER TABLE sales SET DISTRIBUTED BY (customer_id);
When you change the hash distribution of a table, table data is automatically redistributed. Changing the distribution policy to a random distribution does not cause the data to be redistributed. For example, the following ALTER TABLE command has no immediate effect:
ALTER TABLE sales SET DISTRIBUTED RANDOMLY;
Changing the distribution policy of a table to DISTRIBUTED REPLICATED or from DISTRIBUTED REPLICATED automatically redistributes the table data.
Redistributing Table Data
To redistribute table data for tables with a random distribution policy (or when the hash distribution policy has not changed) use REORGANIZE=TRUE. Reorganizing data may be necessary to correct a data skew problem, or when segment resources are added to the system. For example, the following command redistributes table data across all segments using the current distribution policy, including random distribution.
ALTER TABLE sales SET WITH (REORGANIZE=TRUE);
Changing the distribution policy of a table to DISTRIBUTED REPLICATED or from DISTRIBUTED REPLICATED always redistributes the table data, even when you use REORGANIZE=FALSE.
Altering the Table Storage Model
Table storage, compression, and orientation can be declared only at creation. To change the storage model, you must create a table with the correct storage options, load the original table data into the new table, drop the original table, and rename the new table with the original table's name. You must also re-grant any table permissions. For example:
CREATE TABLE sales2 (LIKE sales) WITH (appendoptimized=true, compresstype=quicklz, compresslevel=1, orientation=column); INSERT INTO sales2 SELECT * FROM sales; DROP TABLE sales; ALTER TABLE sales2 RENAME TO sales; GRANT ALL PRIVILEGES ON sales TO admin; GRANT SELECT ON sales TO guest;
See Splitting a Partition to learn how to change the storage model of a partitioned table.
Adding a Compressed Column to Table
Use ALTER TABLE command to add a compressed column to a table. All of the options and constraints for compressed columns described in Adding Column-level Compression apply to columns added with the ALTER TABLE command.
The following example shows how to add a column with zlib compression to a table, T1.
ALTER TABLE T1 ADD COLUMN c4 int DEFAULT 0 ENCODING (compresstype=zlib);
Inheritance of Compression Settings
A partition added to a table that has subpartitions defined with compression settings inherits the compression settings from the subpartition. The following example shows how to create a table with subpartition encodings, then alter it to add a partition.
CREATE TABLE ccddl (i int, j int, k int, l int) WITH (appendoptimized = TRUE, orientation=COLUMN) PARTITION BY range(j) SUBPARTITION BY list (k) SUBPARTITION template( SUBPARTITION sp1 values(1, 2, 3, 4, 5), COLUMN i ENCODING(compresstype=ZLIB), COLUMN j ENCODING(compresstype=QUICKLZ), COLUMN k ENCODING(compresstype=ZLIB), COLUMN l ENCODING(compresstype=ZLIB)) (PARTITION p1 START(1) END(10), PARTITION p2 START(10) END(20)) ; ALTER TABLE ccddl ADD PARTITION p3 START(20) END(30) ;
Running the ALTER TABLE command creates partitions of table ccddl named ccddl_1_prt_p3 and ccddl_1_prt_p3_2_prt_sp1. Partition ccddl_1_prt_p3 inherits the different compression encodings of subpartition sp1.
Dropping a Table
The DROP TABLE command removes tables from the database. For example:
DROP TABLE mytable;
To empty a table of rows without removing the table definition, use DELETE or TRUNCATE. For example:
DELETE FROM mytable; TRUNCATE mytable;
DROP TABLE always removes any indexes, rules, triggers, and constraints that exist for the target table. Specify CASCADE to drop a table that is referenced by a view. CASCADE removes dependent views.