VMware Tanzu Greenplum 6.x Release Notes

This document contains release information about VMware Tanzu Greenplum Database 6.x releases. For previous versions of the release notes for Greenplum Database, go to VMware Tanzu Greenplum Database Documentation. For information about Greenplum Database end of life, see VMware Tanzu Greenplum Database end of life policy.

VMware Tanzu Greenplum 6 software is available for download from the VMware Tanzu Greenplum page on VMware Tanzu Network.

VMware Tanzu Greenplum 6 is based on the open source Greenplum Database project code.

Upgrading Greenplum

See Upgrading to Greenplum 6 to upgrade your existing Greenplum software.

Release 6.18

Release 6.18.2

Release Date: 2021-11-12

VMware Tanzu Greenplum 6.18.2 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.18.2 resolves these issues:

Cluster Management

179748268 : The gpstate utility was returning a FATAL error when the variable $PGDATABASE was not set. This fix resolves the issue by using postgres as the default value.

179611677 : The gprecoverseg utility now includes error handling for a mirrorless configuration by displaying the message GPDB Mirroring replication is not configured for this Greenplum Database instance.

178978375 : Fixed a bug that allowed non-superuser connections when starting Greenplum Database in master-only mode with the command gpstart -m -R.

173598676 : The gpstart utility now accepts the string -- as part of the data directory name.

31583 : Fixed a bug where gpcheckcat was incorrectly returning issues for pg_extension when PostGIS was being used.

Query Planner

179328648 : GPORCA always broadcast the inner child of a Left Outer Join instead of redistributing it. This issue has been addressed by allowing the join's inner child to match the distribution of the outer child via a redistribution motion.

Server

31888 : Resolved an issue where Greenplum returned different results, caused by an incorrect plan, when running the query as an attachment.

31838 : Fixed a bug where a mismatch between the lengths of subplan and subroot generated a PANIC when running certain queries.

12703 : Greenplum was generating a PANIC due to a null pointer dereference when creating an extension. This has been fixed by resetting the related global variables on the query executor.

12161 : Resolved an issue with dispatch conditions and redundant motion node that was generating an assert failure in execMotionUnsortedReceiver.

12648 : Resolved an issue where inserting into an AO/CO table failed when using an SP-GiST index, caused by an empty transaction ID.

Greenplum on Dell EMC VxRail

699 : The Terraform script now ensures that thick provisioning is enabled and disks are zeroed when deploying the Greenplum virtual machines.

655 : Fixed a bug where the size of the /gpdata disk partition was fixed at 1 TB. The deployment now provisions a 16 GB disk that can be expanded as needed.

654 : The file main.tf now consolidates input parameters and vApp options to avoid parameter inconsistencies.

639 : Resolved an issue where virtual machines were provisioned with a full memory reservation, which led to memory allocation issues if a host went down. They are now provisioned with a 50% memory reservation.

608 : Greenplum on Dell EMC VxRail now includes a virtual machine template OVA, downloadable from VMware Marketplace, that helps build the Greenplum cluster.

570 : Changed the recommended storage policy from Stripe 4 to Stripe 1.

Release 6.18.1

Release Date: 2021-10-29

VMware Tanzu Greenplum 6.18.1 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.18.1 resolves these issues:

Server

31756 : In some cases, an INSERT query that was run on a table containing a sequence column could not be cancelled. This issue is resolved; Greenplum Database now adds an interrupt check when it processes the next value of a sequence.

31740 : Resolved an issue where incremental gprecoverseg failed to bring a mirror segment back online. This was due to the checkpointer process on a newly recovered mirror failing because the recovery process was searching for a file that had been rightfully deleted.

12670 : Introduces a contentId (%c) parameter to the WAL archive_command and restore_command; this parameter identifies a WAL stream from a given segment.

12024 : Removes the redundant calls to Gpmon_Incr_Rows_Out() that made the rowsout field of gpperfmon qexec packets incorrect for Sort nodes.

11371 : Resolves an AOCO table inconsistency that occurred when an ALTER TABLE ... ADD COLUMN operation was rolled back.

179726460 : The pg_xlogdump utility was not providing sufficient information to assist in diagnosing bitmap index issues. This has been resolved; pg_xlogdump now includes bitmap-related information.

179719406 : The Greenplum Database logs (pg_log) were not providing sufficient information to assist in diagnosing bitmap index issues. This has been resolved; the bitmap index name is now included in Greenplum Database logs.

Query Optimizer

31784 : If ANALYZE was performed against a leaf partition as part of a transaction, statistics for the leaf were never merged. ANALYZE would first update the leaf statistics, but when checking to determine if all leaves have been analyzed, it would use a cache that did not include the updated number of rows. This would then trigger a sampling of the root table, which can be expensive. The problem was resolved by advancing the command counter to make the page and tuple updates visible when merging the leaf statistics.

31774 : Resolves an issue where the Query Optimizer generated an incorrect plan and returned wrong results on a select count(*) query when it incorrectly handled a nested set-returning function.

31767 : In some cases, the Query Optimizer ran out of memory when planning a query because it did not correctly propagate the OOM error that would ultimately cancel the query. This issue is resolved in the Query Optimizer runaway cleaner exception handler.

Release 6.18.0

Release Date: 2021-10-08

VMware Tanzu Greenplum 6.18.0 is a minor release that includes feature changes and resolves several issues.

Features

Greenplum Database 6.18.0 includes these new and changed features:

  • The gpstate -e and gpstate -s commands now provide more detailed output about the status of primary-mirror segment WAL synchronization. Moreover, as part of these output changes:
    • The gpstate -s output fields Change tracking data size, Estimated total data to synchronize, and Data synchronized have been removed.
    • The Mirror status field in gpstate -s output for primary segments now has just two valid values: Synchronized or Not in Sync.
  • Greenplum 6.18.0 introduces a new Query Optimizer server configuration parameter, optimizer_xform_bind_threshold. You can use this parameter to reduce the optimization time and overall memory usage of queries that include deeply nested expressions by specifying the maximum number of bindings per transform that GPORCA produces per group expression. A brief configuration sketch follows this list.
  • The new gp_autostats_allow_nonowner server configuration parameter enables you to configure Greenplum Database to trigger automatic statistics collection on a table when the table is updated by a non-owner. This parameter is off by default.
  • Greenplum 6.18.0 introduces the new contrib module gp_legacy_string_agg. This module implements the single-argument string_agg( text ) function that is available in Greenplum 5; you may choose to use the module to aid migration to Greenplum 6.
  • The license file for the Windows Client and Loader Tools Package was updated to the latest version.
  • Greenplum 6.18.0 removes the ~/.gphostcache file; the management utilities now use an alternate mechanism to map hostnames to interfaces.
  • To enhance the supportability of the product and to aid debugging efforts, Greenplum Database now reports both the reserved and maximum virtual memory allocation when it encounters an out of memory (OOM) condition.
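
The following is a hedged, session-level sketch of the configuration parameters and module introduced above; the values and the cities table are illustrative only, and some parameters may need to be set system-wide with gpconfig or require superuser privileges.

    -- Cap the number of bindings per transform per group expression (illustrative value).
    SET optimizer_xform_bind_threshold = 100;
    -- Allow automatic statistics collection when a non-owner updates a table.
    SET gp_autostats_allow_nonowner = on;

    -- Single-argument string_agg(text) from the gp_legacy_string_agg module,
    -- assuming the module is installed as an extension of the same name.
    CREATE EXTENSION IF NOT EXISTS gp_legacy_string_agg;
    SELECT string_agg(city) FROM cities;  -- Greenplum 5-style concatenation, no delimiter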

Resolved Issues

VMware Tanzu Greenplum 6.18.0 resolves these issues:

Server

Postgres CVE fixes : This release backports the following Postgres CVE fixes:

  • CVE-2020-25696: psql’s \gset allows overwriting specially treated variables.
  • CVE-2020-25695: Multiple features escape “security restricted operation” sandbox.
  • CVE-2021-32027: Buffer overrun from integer overflow in array subscripting calculations.

31736, 31727 : When the log_lock_waits GUC was enabled, it resulted in spurious deadlock reports and orphaned wait queue states which, in turn, could lead to memory corruption of certain internal tables. This issue is resolved by disabling the log_lock_waits GUC. The log_lock_waits GUC is not supported by Greenplum Database.

31736 - Resource Queues : Due to improper error handling, Greenplum Database raised a duplicate portal identifier warning that was in some cases immediately followed by an out of shared memory error. This issue is resolved; Greenplum Database now raises distinct errors for duplicate portal identifier and out of shared memory.

31734 : Greenplum was generating the error ERROR: interconnect error: A HTAB entry for motion node 77 already exists after creating the extension for the ltree module. This issue has been resolved.

31725 - Execution : In some cases, Greenplum Database generated a PANIC when the user cancelled a query on an AO table due to a double free of a visimap object. This issue is resolved.

31708 - Catalog and Metadata : Resolves an issue where a non-superuser who ran VACUUM FULL on a table that they had no permission to access could block further access to the table by currently running transactions. Greenplum Database now performs the permission check before it acquires a lock on the table.

31704 - Planner : In some cases, Greenplum Database returned an incorrect plan when an UPDATE statement included a subquery. This issue is resolved.

31679 - Functions and Languages : Resolves an issue where a function invocation failed with the error cannot create a unique ID for path type: 116 by disallowing the unique row id path when Greenplum Database encounters an index-only scan.

31654 - Storage : Resolves an issue where Greenplum Database did not honor the temp_tablespaces setting when it generated temporary files during a sort operation on a large data set. Greenplum now places these temporary files in a tablespace specified in temp_tablespaces.

31617 - Resource Groups : A database instance was failing to start up, with the message Command pg_ctl reports Master gdm instance active because a mirror segment couldn’t be recovered. This was due to the incorrect type of lock being used when creating or altering a resource group. This issue is resolved.

31615 - Analyze : Resolves an issue where Greenplum Database did not collect statistics on a table when the table was updated by a non-owner. Greenplum Database 6.18 introduces the new server configuration parameter gp_autostats_allow_nonowner (default is off) to enable automatic statistics collection when a non-owner updates a table.

31517 : Fixed a bug with function fixup_unknown_vars_in_setop that would cause Greenplum Database to PANIC.

31385 - Planner : Resolves an issue where Greenplum Database returned wrong results for a query because it chose an incorrect motion type during plan tree iteration.

12482 : Reinstates the Greenplum Database-specific --wrapper and --wrapper-args options to the pg_ctl command that were available in Greenplum Database 5.

12420 : Fixes an environment variable generation issue that occurred when symlinks pointed to locations other than the expected ones.

12419 : Resolves an issue where Greenplum Database, during node recovery, generated an error when the field standby_mode=on was set in the recovery.conf file. The error was similar to: "FATAL","22023","recovery command file ""recovery.conf"" request for standby mode not specified",,,,,,,0,,"xlog.c",5465. Greenplum Database now supports server recovery in a non-continuous mode using standby_mode=on.

12409 : The pg_relation_filenode() function did not provide proper output for Append Optimized (AO) auxiliary tables. This issue has been resolved.

12408 : Resolves an issue where Greenplum Database threw an assertion failure when the size of an AO table exceeded 1GB.

12402 : Resolves an issue where Greenplum Database failed to complete a query when the target relation of an UPDATE or DELETE operation was a partition table and Greenplum generated a unique row id plan.

12299 - gpexpand : Resolves an issue where WAL replication used excessive external bandwidth due to misuse of the hostname and address in gp_segment_configuration. Greenplum Database now uses the primary's address for WAL replication in gpexpand.

11371 : Greenplum Database would initialize pg_aocsseg table entries with frozen tuples to ensure these entries were implicitly visible even after a rollback. This strategy created issues with the roll back of Append Optimized Columnar (AOC) tables. This issue has now been resolved.

685, 9208, 9429 : In some cases, the value of a server configuration parameter could be inconsistent among the query dispatcher (QD) and/or query executors (QEs) when the parameter value was updated and then reset in the same session. This issue is resolved.

Query Optimizer

31733 : Queries were crashing because GPORCA prematurely terminated the motion node before the interconnect was torn down. This issue is now resolved.

31640 : Running queries with GPORCA enabled generated the error ERROR: could not open existing temporary file "base/pgsql_tmp/pgsql_tmp_SIRW_145011_97_0": No such file or directory, caused by temporary file name changes during cross-slice communication. This issue has been resolved.

179345712 : Resolved an issue with the minirepro utility where it generated incorrect input when inserting columns containing arrays.

179244161 : Resolved an issue where Greenplum processes crashed when trying to write an error message to the log due to a null pointer dereference.

167242211 : Resolved an issue that caused wrong results when running queries with GPORCA when the IS DISTINCT FROM FALSE predicate appeared inside a NOT EXISTS subquery.

Cluster Management

31746 : Resolves an issue where Greenplum Database threw cosmetic errors during gpstate execution when a database of the same name as the $USER running the command did not exist.

31581 : gpconfig would fail with an error similar to ValueError: filedescriptor out of range in select() due to liveness checks performed on all hostnames associated with each database ID. This issue has been resolved, and liveness checks are now performed only on unique hostnames.

31558 - gpinitsystem : If a gpinitsystem operation failed before creating the necessary segment backout scripts, manual steps were required to clean up the directories after the failure. To resolve this problem, gpinitsystem now creates a single backout script earlier in the process; the script can be used to clean up all segment directories even if a failure occurs later in the gpinitsystem operation.

179336062 : The Greenplum utilities gprecoverseg, gpinitstandby, gpstart, and gpstop failed when there was a message banner in .bashrc. This issue has been resolved; banner messages are no longer parsed.

178936675 - gpstate : Beginning in Greenplum 6.0, the gpstate utility no longer provided mirror synchronization status. This problem was resolved by adding the Mirror status field in gpstate -s output for primary segments.

178831426 - gpcheckperf : When executing gpcheckperf via gpssh, the gpcheckperf utility could issue a pkill -f command that killed the gpssh process and disconnected the established SSH connections. This problem was resolved by removing the use of pkill -f in gpcheckperf.

178761563 : Removes an unrelated output message regarding the netperf test that was shown when running gpcheckperf -r N.

178637128 : Fixes the issue of the missing --help option for gpcheckcat.

Data Loading

N/A : Reverts a previously-committed gpload improvement that removed a left-join merge because it introduced a performance regression in certain situations.

31613, 12454 : Resolves an issue where a COPY command failed to read a file that included special characters at the end of a line because Greenplum Database did not recognize the EOL character.

Release 6.17

Release 6.17.7

Release Date: 2021-10-02

VMware Tanzu Greenplum 6.17.7 is a maintenance release that resolves a single issue.

Resolved Issue

Server

31818 : The fix for issue 11308, added in Greenplum 6.16, introduced a regression that could cause incorrect results for queries having a bitmap index condition. This problem was resolved in 6.17.7 by reverting the original fix.

Release 6.17.6

Release Date: 2021-09-25

VMware Tanzu Greenplum 6.17.6 is a maintenance release that resolves a single issue.

Resolved Issue

Server

31807 : Incorrect bitmap index page data was written to WAL logs, resulting in bitmap index corruption after failover to mirror segments or after crash recovery on primaries. This issue has been resolved.

Release 6.17.5

Release Date: 2021-09-22

VMware Tanzu Greenplum 6.17.5 is a maintenance release that resolves a single issue.

Resolved Issue

Postgres Query Planner

12583 : A previous fix to resolve duplicate sequence values (178253995) resulted in a performance regression. An incomplete hash key in the fix caused most hash table lookups to fail, which increased the time required to insert data having columns with sequence values. The problem was resolved by ensuring that the hash key contains complete information.

Release 6.17.4

Release Date: 2021-09-21

VMware Tanzu Greenplum 6.17.4 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.17.4 resolves these issues:

Query Optimizer

31715 : A memory allocation function failed to handle a null pointer exception, which could lead to a system panic when creating the memory pools on Greenplum segments. This issue was resolved by raising an exception if the memory allocation fails.

31732 : A query plan with a nested loop join and a dynamic table/index scan on the inner side of the join could cause an out of memory exception during a high concurrency workload. This was caused by inefficient use of executor memory in the dynamic table/index scan operators. This issue has been resolved.

Server

31776 : Segment failover caused bitmap index corruption and returned the error Invalid page in block 0 of relation. This was caused by the metapage not being present in the shared buffers. This issue has been resolved.

Release 6.17.3

Release Date: 2021-08-30

VMware Tanzu Greenplum 6.17.3 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.17.3 resolves these issues:

Query Optimizer

31714 : In some cases, when processing queries that contained inner joins, the optimizer failed to process statistics, which resulted in a PANIC. This has been resolved.

Server

31726, 31696 : A lock table corruption issue caused the Greenplum Database master segment to PANIC if a query waiting for a resource queue lock was cancelled, or if the query errored with a deadlock report. This issue is resolved.

Release 6.17.2

Release Date: 2021-08-06

VMware Tanzu Greenplum 6.17.2 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.17.2 resolves these issues:

Cluster Management

31558 : This fix creates a single backout script when gpinitsystem runs so it is possible to clean up all segment directories in case of an error.

178761563 : The gpcheckperf output displayed a NOTICE about deprecated flags for gpnetbench, however these flags are still used in netperf. This fix removes the NOTICE message.

178637128 : The command gpcheckcat now has a --help option.

31581 : For large clusters, gpconfig returned the error filedescriptor out of range in select() because the liveness check was performed on all hostnames associated with each database ID. This fix addresses the issue by performing the liveness check only on unique hostnames.

Extensions

31610 : This version introduces a new extension, pljavat, which installs only a language handler for the trusted language pljava, and not for the untrusted language pljavau.

Query Optimizer

178747560 : For queries that insert from a replicated table into a distributed table, the query optimizer could generate a plan that produced incorrect results. Specifically, for operations involving a SEQUENCE function on the replicated table, the result was not guaranteed to be identical across different segments. This issue was resolved by identifying these scenarios and redistributing tuples from a single segment, rather than exploiting the advantages of the replicated table.

31609 : The query optimizer generated a better plan with optimizer_join_order set to “greedy” in comparison to the default value of “exhaustive.” This occurred because the default join order algorithm was partial to finding a plan that exploited dynamic partition elimination (DPE) and sometimes ignored cases where both static and dynamic partition elimination would improve plan performance. This was done to prevent a significant increase in optimization time, at the expense of a sub-optimal plan. To avoid generating sub-optimal plans, the query optimizer now generates both partition elimination options, but only for the greedy algorithm, regardless of the join order chosen. This helps to generate some lower-cost alternatives while incurring only a moderate increase in optimization time (15%).

31556 : Creating a temp table with DISTRIBUTED RANDOMLY from a DISTRIBUTED REPLICATED table put all records into one instance when the query optimizer was enabled. This issue was resolved by adding the necessary logic to ensure that, when inserting from a replicated table into a randomly-distributed table, the query optimizer randomizes the tuples to avoid data skew.
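
The following is a minimal, hypothetical illustration of the pattern this fix addresses; the table names are placeholders:

    -- A replicated source table and a randomly distributed copy created from it.
    CREATE TABLE ref_dim (id int, label text) DISTRIBUTED REPLICATED;
    CREATE TEMP TABLE work_copy AS SELECT * FROM ref_dim DISTRIBUTED RANDOMLY;
    -- With the fix, the rows of work_copy are spread across segments instead of
    -- all landing on a single segment instance.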

31515 : GPORCA could generate wrong results if a query used both a WITH clause and a join condition across different output columns of the same derived tables. The problem occurred because the distribution policy created for the WITH clause subquery was incorrectly propagated. GPORCA has resolved this problem by pushing properties into the WITH clause only when all columns come from the same derived table.

31448 : Incorrectly estimating cardinality for cast columns resulted in a bad join order, which could lead to a query that did not finish. This issue was resolved by updating the cardinality estimation model for cases where the input histogram is not well formed, as is the case when columns are cast.

Server

31564 : Resolved an issue where a database server process was not properly holding the partitionLock and ResQueueLock after releasing a resource queue lock following a deadlock error, which resulted in a segmentation fault.

gpload

31466 : Resolved an issue where gpload would fail with column names that used uppercase or mixed-case characters. gpload now automatically adds double quotes to column names that are not already quoted in the YAML control file.

Release 6.17.1

Release Date: 2021-07-23

VMware Tanzu Greenplum 6.17.1 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.17.1 resolves these issues:

Postgres Planner

31630 : Fixed a problem where the Postgres Planner failed to consider aggregates on the entry flow. This could cause queries to fail with: ERROR: MIN/MAX subplan has unexpected flowtype.

Query Optimizer

31522 : During a left outer join query, with an index on the join column, the Query Optimizer explored a plan search space that hit an exception, was unable to generate a plan, and reverted to the Postgres Planner. This issue has been fixed by changing the requested collocation property, and eliminating the invalid search space.

174732670 : The query optimizer added an unnecessary Explicit Redistribute Motion when generating a plan for queries using DELETE on a randomly distributed table. This issue has been resolved.

Server

12197 : Resolves a data corruption issue that could occur with concurrent DML queries on AO/AOCO tables when Greenplum Database was not aware of committed transactions that started after snapshot initiation.

Release 6.17.0

Release Date: 2021-07-09

VMware Tanzu Greenplum 6.17.0 is a minor release that includes feature changes and resolves several issues.

Features

Greenplum Database 6.17.0 includes these new and changed features:

  • The PXF version 6.1.0 distribution is available with this release; you can download it from the Release Download directory named Greenplum Platform Extension Framework on VMware Tanzu Network. Refer to the PXF documentation for information about this release and for installation and upgrade instructions.
  • The gprecoverseg, gpaddmirrors, and gpmovemirrors utilities now include a -b option to specify the maximum number of segments per host to operate on in parallel.
  • The gpcheckcat utility now permits users to skip one or more tests with the new -s option. In addition, the -R option now accepts a comma-separated list of tests to run.
  • The Progress DataDirect JDBC Driver v6.0.0+181 is included in this release. See the Readme file for a list of new features available in v6.0.0.

Resolved Issues

VMware Tanzu Greenplum 6.17.0 resolves these issues:

Server

178150185 : Resolved an issue where the message An exception was encountered during the execution of a statement could not be suppressed. Users can now control the verbosity of the exception message with the log_min_messages GUC.

31335 : Resolved an out of memory condition that could occur because the server held certain data contexts longer than necessary.

31185 : When altering a role from superuser to non-superuser, the role's resource group was not changed from admin_group to default_group. This fix addresses the issue.

31320 : Resolved an issue where issuing an UPDATE/DELETE statement while a VACUUM operation was running returned the error tuple concurrently updated, due to an out-of-date snapshot.

31435 : Fixed the error unrecognized parameter "appendoptimized" that was reported when using the appendoptimized clause for a table partition definition.

31584 : Resolved a bug that caused the database to crash when running queries against tables that had not been redistributed after an expansion.

31588 : Resolved an issue where a CTAS statement referencing a table created with the DISTRIBUTED REPLICATED clause was generating incorrect results with the Legacy Planner.

Query Optimizer

31481 : For partitioned tables with indexes, ORCA would cause a PANIC while trying to create an index plan, when root and leaf partitions did not have the same underlying dropped column structure. This issue has been resolved.

31463 : Running queries with the GPORCA query optimizer that used an INNER JOIN with an EXCEPT clause was causing a fallback to Legacy Planner and reporting the error: No plan has been computed for required properties. This issue has been resolved.

10967 : Resolved an issue where queries with a LEFT OUTER JOIN produced incorrect results because the join was being replaced with an INNER JOIN in the subquery plan.

31440 : Resolved an issue where running a query using UNION ALL and an external table returned incorrect results with the GPORCA query optimizer because not all information from the external table was gathered.

177874343 : The ORCA Query Optimizer was unnecessarily opting out of certain optimizations when it encountered a NO SQL function. It no longer does that.

178477120 : Introduced a fix to verify that materialize does not project in ORCA Query Optimizer.

11880 : Resolved an issue with array overflow for function argument types which was causing memory corruption and a database crash with ORCA Query Optimizer.

31527 : Resolved a bug in the implementation of Left Outer Joins that was causing a database crash when using ORCA Query Optimizer.

Cluster Management

178173416 : Resolved an issue where gprecoverseg -r did not correctly pass on the -B and -b options when it called gprecoverseg internally a second time.

31417 : Resolved an issue where gprecoverseg was updating the pg_hba.conf file on all primaries regardless of whether those primaries needed to be updated.

31350 : Resolved an issue where gpmovemirrors -B was calling gprecoverseg with an excessively large number of parallel processes.

31338, 31351 : Resolved an issue where old mirror directories were not being removed from old tablespace directories after a mirror was moved using gpmovemirrors.

31419 : Resolved an issue where gprecoverseg was printing out empty pg_rewind progress lines for mirrors.

31519 : gprecoverseg no longer adds pg_hba.conf entries if they are already present in the file.

31579 : For all GUCs with a vartype of string, you may no longer enclose the value you pass to gpconfig -c in single quotes.

178254153 : Resolved an issue where gprecoverseg -p was reporting an error despite the operation being successful.

177862886 : Resolved an issue where gprecoverseg -p would error out when the master could not connect to an individual segment host.

Postgres Query Planner

178238691 : Resolved an issue where an error occurred when the planner was executing a query that performed a UNION ALL between a replicated table and a subquery with an explicit sequence nextval.

178253995 : Resolved an issue where the sequence executor node was generating duplicate sequence values.

Release 6.16

Release 6.16.3

Release Date: 2021-07-02

VMware Tanzu Greenplum 6.16.3 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.16.3 resolves these issues:

Server

31549, 31577 : Resolved an issue that made Greenplum unable to cast unknown-type literals to cstring in certain circumstances. This problem could surface in several ways including query panics, out of memory conditions, and the following compression error: "ERROR","XX000","compressed data is corrupt (pg_lzcompress.c:737)".

Analyze

31589 : When running ANALYZE on a leaf table, Greenplum would compute statistics for all columns when merging statistics for the parent table, even if one or more columns had been configured with STATISTICS set to 0 (to disable statistics collection for that column). Greenplum no longer computes statistics for columns that have STATISTICS set to 0 when merging statistics for the parent table.
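
The following is a brief sketch of the column-level setting this fix respects; the table and partition names are hypothetical:

    -- Disable statistics collection for a single column of a partitioned table.
    ALTER TABLE sales ALTER COLUMN notes SET STATISTICS 0;
    -- Analyzing a leaf partition no longer computes statistics for the disabled
    -- column when merging statistics for the parent table.
    ANALYZE sales_1_prt_jan;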

Release 6.16.2

Release Date: 2021-06-04

VMware Tanzu Greenplum 6.16.2 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.16.2 resolves these issues:

Cluster Management

178140424 : Resolved an issue where, when gprecoverseg -v was invoked, pg_rewind’s log was generated but was not preserved in case of an incremental recovery failure.

Server

178280660 : Resolved an issue with the Postgres planner partition selection when the types of the partitioning key and the search values are different.

31436 : Resolved an issue in which, when running \df+ on a function that has exec location INITPLAN, the Execute on column did not properly display “initplan”.

31335 : Resolved an out of memory condition that could occur because the server held certain data contexts longer than necessary.

12015 : Fixed an inconsistency between master and segments regarding the value of collname when creating a DOMAIN.

11999 : Resolved an issue in which CREATE MATERIALIZED VIEW was failing with ERROR: division by zero when WITH NO DATA was being specified.

Query Executor

31439 : Resolved an issue which caused the database to PANIC due to a double free of the memory context TupleSort.

Release 6.16.1

Release Date: 2021-05-21

VMware Tanzu Greenplum 6.16.1 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.16.1 resolves these issues:

n/a - gprecoverseg : Resolved an issue where using gprecoverseg to perform an incremental recovery removed log files from the pg_log directory. gprecoverseg now retains files under pg_log so that they can be used for troubleshooting after an incremental recovery.

177336745 - Cluster Management : Resolved an issue with gpconfig where the process hung when cancelling it due to hosts being unreachable.

31453, 177970816 - Query Optimizer : Resolved an issue in which the optimizer was failing with a segfault while querying a view, if the view had a join between tables and any of the tables had a column that was dropped after view creation.

31415 - Server : Resolved an issue in which REFRESH MATERIALIZED VIEW was failing if the WHERE clause included an embedded query.

31405 - Server : Resolved an issue in which certain queries resulted in a master panic because of the way temporary files were being tracked.

31383 - Server : Resolved an issue that was causing slow query performance for view creation when using the NOT IN clause.

31380 - Cluster Management : The new functionality for gprecoverseg to ignore segments on unreachable hosts had introduced an issue for the gprecoverseg -p newhost option as it skipped the case of a new host replacing an unreachable host. This issue is now resolved.

31368 - Cluster Management : Resolved an issue with gpmovemirrors which caused the wrong error message to be printed when there was an issue with the provided configuration file.

31361 - Query Executor : Resolved an issue in which multiple segments failed with PANIC errors due to how newly allocated memory was being handled.

31337 - Query Optimizer : Resolved an issue in which the optimizer was crashing – instead of simply throwing an error – when it was running out of memory.

31297 - Server : Resolved an issue in which VACUUM FULL ANALYZE was failing to clear some tables of bloat.

31279 - Server : Resolved an issue with incremental recovery when a failed segment that was acting as primary had to be recovered. The issue was caused by WAL files required by the recovery being removed.

31110 - Server : Resolved an issue in which a View with an unknown field could not be restored because columns were not being correctly cast from type cstring to type date.

30762 - Server : Resolved an issue with gpload in which a global transaction was not aborted when the session was reset.

29998 - Query Optimizer : Resolved an issue in which the optimizer was returning incorrect results when the query involved a CTE with an EXCEPT clause. The issue occurred because the query optimizer did not add scalar casts for any input columns whose types did not match the output types of the SetOp.

Release 6.16.0

Release Date: 2021-04-22

VMware Tanzu Greenplum 6.16.0 is a minor release that includes feature changes and resolves several issues.

Features

Greenplum Database 6.16.0 includes these new and changed features:

  • Greenplum Resource Groups now include a new mode for assigning CPU resources by percentage, Ceiling enforcement mode, in addition to the existing Elastic mode.
  • The PXF version 6.0.0 distribution is available with this release; you can download it from the Release Download directory named Greenplum Platform Extension Framework on VMware Tanzu Network. Refer to the PXF documentation for information about this release and for installation and upgrade instructions.
  • Greenplum Streaming Server (GPSS) version 1.5.3 is included, which includes changes and bug fixes. Refer to the GPSS Documentation for more information about this release and for upgrade instructions.
  • Greenplum Database 6.16.0 includes MADlib version 1.18.0, which introduces new Deep Learning features, improvements, and bug fixes. See the MADlib 1.18.0 Release Notes for a complete list of changes.
  • The gp_sparse_vector module now installs its functions and objects into a schema named sparse_vector. See Resolved Issue 31360 and the brief session sketch after this list.

    Note: If you are using gp_sparse_vector in your current Greenplum Database distribution, review Upgrading the gp_sparse_vector Module for upgrade implications and instructions.
  • The default value of the optimizer_join_order server configuration parameter is changed from exhaustive2 to exhaustive, which was the default used in Greenplum versions prior to version 6.14.0. The Greenplum Database Query Optimizer (GPORCA) uses this configuration parameter to identify the join enumeration algorithm for a query. See Resolved Issue 31391.
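
The following is a short, hedged sketch of session adjustments related to these changes; the parameter and schema names come from the notes above:

    -- gp_sparse_vector objects now live in the sparse_vector schema; add it to
    -- the search path so unqualified calls keep working.
    SET search_path = sparse_vector, public;

    -- Confirm the restored join enumeration default (exhaustive as of 6.16.0).
    SHOW optimizer_join_order;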

Resolved Issues

VMware Tanzu Greenplum 6.16.0 resolves these issues:

31391 - Query Optimizer : Resolves an issue where a query using the GPORCA query optimizer took longer to complete on Greenplum Database version 6.15.0 with the optimizer_join_order server configuration parameter default value of exhaustive2. The default value of this configuration parameter is changed (back to) exhaustive.

31360 - gp_sparse_vector : Resolves an issue where the gp_sparse_vector array_agg() function overrode the system pg_catalog.array_agg() function and returned a different type of array. The gp_sparse_vector module now installs its functions and objects into a separate schema named sparse_vector.

31333 - Optimizer : GPORCA would generate invalid plans for certain queries with scalar subquery in the projection list, and a predicate that could use an underlying index. This could cause Greenplum Database to PANIC when running certain functions with the Optimizer enabled. This issue has now been resolved.

31308 - Server : Fixed an issue where queries hung when logging was configured to write NOTICE-level messages.

31296 - Server : Fixed an issue with gpcheckcat reporting an error when a VIEW is defined without a target and does not have an entry in pg_attribute.

31272 - Data Flow : In dual stack IPv4 and IPv6 hosts, gpfdist would bind to an IPv4 port but fail to bind to an IPv6 port, if there was another process listening on the same IPv6 port. This issue has been resolved.

31259 - Query Optimizer : When querying partitioned tables that had been defined with open boundaries, the Query Optimizer was performing unnecessary scans. This issue has been resolved.

31247 - Cluster Management : gpstate would log ERROR/FATAL messages relating to gpexpand.status when no expansion was in progress, and when gpstate -s output showed a healthy cluster state. This issue has been resolved.

31346 - Query Optimizer : Resolved an issue with incorrect group pathkey that was causing the Greenplum database to go into recovery mode when using GROUP BY GROUPING SETS.

11655 - Query Optimizer : Fixed the error Lookup of object 0.0.0.0 in cache failed when querying a partitioned table.

11308 - Server : Fixes an issue where a bitmap index scan running concurrently with an index INSERT on a full bitmap page, occasionally failed to read the correct tid.

Release 6.15

Release 6.15.0

Release Date: 2021-03-11

VMware Tanzu Greenplum 6.15.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.15.0 includes these new and changed features:

  • gprecoverseg now rebalances segments whose hosts are reachable even if there are other segments whose hosts are not; for segments on unreachable hosts, gprecoverseg now emits a warning message.
  • Greenplum Streaming Server (GPSS) version 1.5.2 is included, which introduces bug fixes. Refer to the GPSS Release Notes for more information on release content and to access the GPSS documentation.

    Note: If you have previously used GPSS in your Greenplum 6.x installation, you are required to perform upgrade actions as described in Upgrading the Streaming Server.
  • The analyzedb utility has a new --skip_orca_root_stats option. When this option is specified, analyzedb will not update root partition statistics. This option should be used only if GPORCA is disabled.

Resolved Issues

VMware Tanzu Greenplum 6.15.0 resolves these issues:

31267 - Server : Fixed error resource group wait queue is corrupted (resgroup.c:3502) that caused a PANIC on Greenplum Master.

31262 - Server : Resolved an issue with inconsistencies in returned results when using the grouping function.

31260/30895 - External Tables : Fixed an issue with Web External Tables when escape='OFF'.

31249 - Query Optimizer : Certain queries that selected random rows in a partitioned table, using functions like random() or timeofday(), would cause host PANIC. This issue has been resolved.

31244 - analyzedb : Added --skip_orca_root_stats option, which prevents analyzedb from updating root partition statistics.

31228 - Data Flow : dblink includes function dblink_connect_no_auth that skips authentication checks.

31223 - Query Optimizer : Fixed optimizer issue with cardinality estimation for point queries on CDOUBLE types, such as timestamp.

31220 - Data Flow : Resolved an issue in which gpfdist reads from external tables were resulting in “No route to host” errors.

31210 - Interconnect : Resolves a high memory usage issue with proxy background worker processes when gp_interconnect_type = proxy, and the pause/resume flow control process causes a busy receiver to cache or duplicate packets while waiting for the backend to consume them. Greenplum Database now uses a more active flow control mechanism to control packet buffering and transmission.

31191 - Server : Fixed a Postgres Planner issue that caused queries to crash with a FailedAssertion error.

31055 - Server : Prohibits the execution of queries with set-returning functions in the WHERE clause, which caused segment panics. Greenplum now generates an error similar to: set-returning functions are not allowed in WHERE clause.
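
The following is a minimal, hypothetical example of the query pattern that is now rejected instead of panicking a segment; the table name is a placeholder:

    -- generate_series() is a set-returning function; using it in WHERE now raises
    -- an error rather than crashing a segment.
    SELECT * FROM events WHERE generate_series(1, 10) > 5;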

176693327 - Cluster Management : gpinitsystem now treats equivalent locales as equivalent (for example, en_US.UTF-8 is now treated the same as en_US.utf8).

176521407 - Cluster Management : Because gprecoverseg now reports progress on both incremental and full segment recovery, there is no longer any need to call gpstate -m to determine recovery progress.

Release 6.14

Release 6.14.1

Release Date: 2021-02-22

VMware Tanzu Greenplum 6.14.1 is a maintenance release that resolves several issues and includes related changes.

Changed Features

Greenplum Database 6.14.1 includes this change:

  • PostGIS version 2.5.4+pivotal.4 is included, which resolves the segfault described in PostGIS ticket #4691.

Resolved Issues

VMware Tanzu Greenplum 6.14.1 resolves these issues:

31258 - Server : Resolves an issue where the array typecasts of operands in view definitions with operators were erroneously transformed into anyarray typecasts. This caused errors in the backup and restore of the view definitions.

31249 - Query Optimizer : Resolves an issue where Greenplum Database generated a PANIC during query execution when the Query Optimizer attempted to access the argument of a function (such as random() or timeofday()), but the query did not invoke the function with an argument.

31242 - Server : Optimized locking to resolve an issue with certain SELECT queries on the pg_partitions system view, which were waiting on locks taken by other operations.

31232 - Server : Resolves an issue where, after an upgrade from version 5.28 to 6.12, a query execution involving external tables resulted in a query PANIC and segment failover. This issue has been resolved by optimizing the query subplans.

31211 - gpfdist : When an external table was configured with a transform, gpfdist would sporadically return the error 404 Multiple reader to a pipe is forbidden. This issue is resolved.

176684985 - Query Optimizer : This release improves Greenplum Database’s performance for joins with multiple join predicates.

Release 6.14.0

Release Date: 2021-02-05

VMware Tanzu Greenplum 6.14.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.14.0 includes these new and changed features:

  • CentOS/RHEL 8 and SUSE Linux Enterprise Server x86_64 12 (SLES 12) Clients packages are available with this Greenplum Database release; you can download them from the Release Download directory named Greenplum Clients on VMware Tanzu Network.
  • The PXF version 5.16.1 distribution is available with this release; you can download it from the Release Download directory named Greenplum Platform Extension Framework on VMware Tanzu Network.
  • The default value of the optimizer_join_order server configuration parameter is changed from exhaustive to exhaustive2. The Greenplum Database Query Optimizer (GPORCA) uses this configuration parameter to identify the join enumeration algorithm for a query. With this new default, GPORCA operates with an emphasis on generating join orders that are suitable for dynamic partition elimination. This often results in faster optimization times and/or better execution plans, especially when GPORCA evaluates large joins. The Faster Optimization of Join Queries in ORCA blog provides additional information about this feature. A brief session sketch follows this list.
  • The default cost model for the optimizer_cost_model server configuration parameter, calibrated, has been enhanced; GPORCA is now more likely to choose a faster bitmap index with nested loop joins rather than hash joins.
  • GPORCA boosts query execution performance by improving its partition selection algorithm to more often eliminate the default partition.
  • GPORCA now generates a plan alternative for a right outer join transform from a left outer join when equivalent. GPORCA’s cost model determines if/when to pick this alternative; using such a plan can greatly improve query execution performance by introducing partition selectors that reduce the number of partitions scanned.
  • The output of the gprecoverseg -a -s command has been updated to show more verbose progress information. Users can now monitor the progress of the recovering segments in incremental mode.
  • The gpcheckperf command has been updated to support Internet Protocol version 6 (IPv6).
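
The following is a hedged, session-level sketch of inspecting and overriding the new join enumeration default; the parameter values are taken from the note above:

    -- 6.14.0 changes the default join enumeration algorithm to exhaustive2.
    SHOW optimizer_join_order;

    -- Sessions that prefer the previous behavior can override it explicitly.
    SET optimizer_join_order = 'exhaustive';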

Resolved Issues

VMware Tanzu Greenplum 6.14.0 resolves these issues:

31195 - Server: Execution : Resolves an issue where Greenplum Database generated a PANIC when the pg_get_viewdef_name_ext() function was invoked with a non-view relation.

31094 - Server: Execution : Resolves an issue where a query terminated abnormally with the error Context should be init first when gp_workfile_compression=on because Greenplum Database ignored a failing return value from a ZSTD initialization function.

31067 - Query Optimizer : Resolves a performance issue where GPORCA did not consistently eliminate the default partition when the filter condition in a query matched more than a single partition. GPORCA has improved its partition selection algorithm for predicates that contain only disjunctions of equal comparisons where one side is the partition key by categorizing these comparisons as equal filters.

31062 - Cluster Management : Resolves a documentation and --help output issue for the gprecoverseg, gpaddmirrors, gpmovemirrors, and gpinitstandby utilities, where the --hba-hostnames command line flag details were missing.

31044 - Query Optimizer : Fixes a plan optimizer issue where the query would fail due to the planning time being dominated by the sort process of irrelevant indexes.

30974 - Server: Execution : Greenplum Database generated a PANIC when a query run in a utility mode connection invoked the gp_toolkit.gp_param_setting() function. This issue is resolved; Greenplum now ignores a function’s EXECUTE ON options when in utility mode, and executes the function only on the local node.

30950 - Query Optimizer : Resolves an issue where GPORCA did not use dynamic partition elimination and spent a long time planning a query that included a mix of unions, outer joins, and subqueries. GPORCA now caches certain object pointers to avoid repeated metadata lookups, substantially decreasing planning time for such queries when optimizer_join_order is set to query or exhaustive2.

30947 - Query Optimizer : Resolves an issue where Greenplum Database returned the error no hash_seq_search scan for hash table "Dynamic Table Scan Pid Index" because GPORCA generated a query plan that incorrectly rescanned a partition selector during dynamic partition elimination. GPORCA now generates a plan that does not demand such a rescan.

11211 - Server : During the parallel recovery and rebalance of segment nodes after a failure, if an error occurred during segment resynchronization, the main recovery process would halt and wait indefinitely. This issue has been fixed.

11058 - Query Optimizer : Resolves an optimizer issue where CTE queries with a RETURNING clause would fail with the error INSERT/UPDATE/DELETE must be executed by a writer segworker group.

174873438 - Planner : Resolves an issue where an index scan generated for a query involving a system table and a replicated table could return incorrect results. Greenplum no longer generates the index scan in this situation.

Release 6.13

Release 6.13.0

Release Date: 2020-12-18

VMware Tanzu Greenplum 6.13.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.13.0 includes these new and changed features:

  • This release introduces the new VMware Tanzu Greenplum Connector for Apache NiFi 1.0.0. The Connector provides a fast and simple, UI-based way to build data ingestion pipelines for Greenplum Database, code-free. The Connector is available as a separate download from VMware Tanzu Network. Refer to the VMware Tanzu Greenplum Connector for Apache NiFi documentation for installation, configuration, and usage information for the Connector.
  • Greenplum Streaming Server (GPSS) version 1.5.0 is included, which introduces many new and changed features and bug fixes. Refer to the GPSS Release Notes for more information on release content and to access the GPSS documentation.

    Note: If you have previously used GPSS in your Greenplum 6.x installation, you are required to perform upgrade actions as described in Upgrading the Streaming Server.
  • The distribution includes the advanced_password_check contrib module; you can use this module to specify password string quality policies for Greenplum Database. Refer to the advanced_password_check module documentation for more information.

  • The Greenplum Database Query Optimizer exposes a new server configuration parameter to enable index-only scans. These scans answer queries from an index alone without requiring any heap access. This configuration parameter is named optimizer_enable_indexonlyscan, and is enabled by default. A brief session sketch follows this list.
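
The following is a brief, hedged illustration of toggling the new parameter at the session level; the parameter name comes from the note above:

    -- Check whether GPORCA may produce index-only scan plans in this session.
    SHOW optimizer_enable_indexonlyscan;

    -- Disable index-only scans for the current session, for example while comparing plans.
    SET optimizer_enable_indexonlyscan = off;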

Resolved Issues

VMware Tanzu Greenplum 6.13.0 resolves these issues:

30994 - Cluster Management : Resolved an issue where gpstart could fail with a nondescript error when a segment host was unreachable during Greenplum Database initialization. gpstart now continues the startup process even if some cluster hosts are unreachable.

31028 - Cluster Management : When using gpconfig -s client_min_messages to set the client messaging level (for example “notice” or “warning”), the output showed “error” instead of the level configured by the user. This issue has been resolved.

10720 - Planner : Resolved a problem that could cause the Postgres Planner to crash or return incorrect results if a query used a grouping expression (rather than a column name) along with other ungrouped targets that referenced the group key.

10961 - Locking : Improved the locking behavior to avoid deadlocks that can occur when creating multiple indexes on an Append-Only table.

30951 - Query Optimizer : Resolves an issue where a query performed on a serial-type column of a replicated Greenplum table (created DISTRIBUTED REPLICATED) failed to return a consistent value on all segments.

30976 - Server: Execution : In some cases, Greenplum Database returned incorrect results when it ran certain system functions in an EntryDb (a special query executor (QE) that runs on the master instance). This issue is resolved; Greenplum now returns the error “This query is not currently supported by GPDB.” when it encounters a function invocation that it does not support.

31011 - Server: Execution : Resolved a problem that could cause a segfault when spilling hash tables to disk. In Greenplum 6 this problem was reported with a materialized view query that produced a HashAggregate plan.

31073 - Segment Mirroring : Resolves an issue where the kernel Recv-Q buffer filled up on a Greenplum mirror segment instance when the mirror sent statistics to the statistics collector, but the collector was not running because the mirror was not in hot standby mode.

31082 - Server: Execution : Resolved an issue where the planner generated an incorrect plan for some queries that included a merge join and dynamic partition elimination. The storage needed for the merge join key was incorrectly allocated on the inner side of the plan tree. This led to a SIGSEGV when executing the sort node of the inner side of the merge join. The problem was resolved by ensuring that only the nodes that need to use the join key are provisioned with the right amount of storage, leaving the rest of the plan tree intact.

31090 - Storage: Segment Mirroring : pg_rewind was updated to reduce the number of lstat operations it performs, in order to improve the performance of incremental recovery with a large number of data files.

173157638 - Cluster Management : When the user specified a valid port range using the gpaddmirrors -p option, gpaddmirrors would generate an error similar to error: Value of port offset supplied via -p option produces ports outside of the valid range. Mirror port base range must be between 6432 and 61000. This issue has been resolved.

Release 6.12

Release 6.12.1

Release Date: 2020-11-20

VMware Tanzu Greenplum 6.12.1 is a maintenance release that resolves several issues.

Resolved Issues

VMware Tanzu Greenplum 6.12.1 resolves these issues:

9207 - Server Standby : Fixed a memory overflow condition that could cause a standby node to shut down with the error, "FATAL","XX000","the limit of 500 distributed transactions has been reached (cdbdtxrecovery.c:571)".

11003 - Postgres Planner : Resolved an issue where a query that selected a constant and that specified one or more empty GROUPING SETS returned incorrect results.

30923 - Server Execution, Planner : Resolved a problem where a query could return incorrect results if segments held a NULL value in an empty set.

30946 - Query Optimizer : A query on a table with a btree index ran longer than expected because the Query Optimizer did not perform partition elimination when the query included an index join with both a local and a join predicate. This issue is resolved; the Query Optimizer improves dynamic and static partition elimination when indexes are present.

30993 - Optimizer : Fixed an issue where certain IN queries performed slowly because full table scans were used instead of indexes.

30999 - Cluster Management : Fixes an issue where moving a mirror segment to an alternate host, using gprecoverseg -F, was failing.

31007 - Cluster Management : Fixes an issue where no error was logged or reported when the incremental recovery of a segment using gprecoverseg failed. The logs now include a message similar to: [WARNING]:-Incremental recovery failed for my-segment-name. You must use gprecoverseg -F to recover the segment.

31014 - analyzedb : Fixed a problem where analyzedb could fail if a table was dropped and recreated during the analyzedb operation.

31018 - gpexpand : The redistribute phase of a cluster expansion returned the error ... failed to expand: error ERROR: ... is not a table when it tried to expand a materialized view. This issue is resolved; Greenplum Database now supports expanding materialized views.

31026 - Metrics Collector : Resolved an issue where Greenplum Command Center version 6.3.0 and 6.3.1 could not show a visual query plan in the query monitor in certain cases.

31057 - Optimizer : When simplifying constraints during preprocessing, GPORCA did not consider the case where a constraint compared a column to an empty array (for example, EXPLAIN SELECT 1 FROM mytable WHERE mytable.mycolumn=ANY('{}');). These types of queries would cause GPORCA to crash. This problem was resolved by ensuring that GPORCA now skips merging a constraint with an empty array.

174906043 - Metrics Collector : Resolved an issue where the Greenplum Command Center could not show metrics for CREATE TABLE AS SELECT … FROM and COPY (SELECT ... FROM ...) TO ... statements.

175050196 - autovacuum : Resolved a fatal error that could occur when the autovacuum daemon performed a VACUUM operation on the template0 database.

175372920 - Resource Groups : Resolved an issue that could cause queries to fail if an earlier DROP RESOURCE GROUP command failed to drop the resource group.

175471857 - Query Dispatcher : Resolved a problem that could cause a transaction to incorrectly use single-phase commit instead of two-phase commit.

275569338 - Query Dispatcher : Resolved a problem that could cause the error, ERROR: unrecognized node type: 2139062143 (copyfuncs.c:6059), when the query dispatcher needed to refresh a materialized view.

Release 6.12.0

Release Date: 2020-10-30

VMware Tanzu Greenplum 6.12.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.12.0 includes these new and changed features:

  • Greenplum now supports using segment hostnames when defining proxy ports with the gp_interconnect_proxy_addresses parameter (previously, IP addresses were required). Note that if a segment instance hostname is bound to a different IP address at runtime, you must execute gpstop -U to re-load the gp_interconnect_proxy_addresses value. See Configuring Proxies for the Greenplum Interconnect.
  • Because Greenplum Database does not enforce referential integrity syntax (foreign key constraints), the TRUNCATE command was updated so that it truncates a table referenced in a foreign key constraint, even if the CASCADE option is omitted.
  • The Greenplum Database 6.12.0 distribution includes the Greenplum Magic Tool (gpmt), a diagnostics and data collection tool.
  • The Greenplum Database 6.12.0 distribution includes the postgres_fdw PostgreSQL contrib module. Refer to the postgres_fdw module documentation for more information.
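
For example, a minimal sketch of using the bundled postgres_fdw module to query a table in a remote PostgreSQL database; the server name, connection options, and remote table are hypothetical:

    CREATE EXTENSION postgres_fdw;

    -- Define the remote server and a user mapping (hypothetical host and credentials).
    CREATE SERVER pg_remote FOREIGN DATA WRAPPER postgres_fdw
        OPTIONS (host 'pg.example.com', port '5432', dbname 'salesdb');
    CREATE USER MAPPING FOR CURRENT_USER SERVER pg_remote
        OPTIONS (user 'remote_user', password 'secret');

    -- Expose a remote table and query it from Greenplum.
    CREATE FOREIGN TABLE remote_orders (id int, amount numeric)
        SERVER pg_remote OPTIONS (schema_name 'public', table_name 'orders');
    SELECT count(*) FROM remote_orders;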

Resolved Issues

VMware Tanzu Greenplum 6.12.0 resolves these issues:

174311661 - Query Execution : When executing a long query that contained multi-byte characters, Greenplum could incorrectly truncate the query string (removing multi-byte characters) and, if log_min_duration_statement was set to 0, could subsequently write an invalid symbol to segment logs. This behavior could cause errors in gp_toolkit and Command Center. This problem has been resolved.

173190958 - Transactions : In some cases, a Greenplum Database master reset generated one or more orphaned prepared transactions. This issue is resolved; Greenplum now periodically checks for, and aborts, these transactions.

171249005 - Transactions : Resolves an issue where Greenplum Database generated a PANIC when it exhausted retry attempts to abort prepared transactions.

30992 - Transactions : In some cases, Greenplum Database generated a PANIC after reboot due to a race condition between a checkpoint and xlog COMMIT PREPARE recording. When Greenplum encountered an orphaned prepared transaction that was committed after the xlog was recorded, it returned the error message: cannot abort transaction transaction_number, it was already committed. This issue is resolved.

30970 - Query Optimizer : In some cases, Greenplum Database crashed when the Query Optimizer attempted to generate an index scan from a predicate that contained a subquery. This issue is resolved; the Query Optimizer now disallows such plans.

30962 - Query Optimizer : The second SELECT on an external table within a transaction returned zero records because the Query Optimizer did not generate a unique scan number to differentiate the two queries to the external table. This issue is resolved.

30960 - Query Optimizer : Resolves an issue where the Query Optimizer entered an infinite loop when it merged statistics buckets for double values in UNION and UNION ALL queries due to an incorrect comparison of bucket boundary values with a small Epsilon.

30980 - diskquota Module : Resolved an issue that caused master and mirror segments to display the warning, Share memory is not enough for active tables, when TRUNCATE and CREATE TABLE statements were executed.

30942 - Postgres Planner : In some cases, the Postgres Planner crashed or produced incorrect results when the HAVING clause of a query included a subquery, and one or more columns referenced in the subquery were not also specified in the GROUP BY column set. This issue is resolved.

10813 - Postgres Planner : Resolved an issue that could cause the Query Dispatcher to crash when creating a query plan for a subquery that has GROUPING SETS.

10794 - Resource Groups : Resolves an issue where the result of a query on pg_resgroup_get_status(NULL:oid) could not be saved to a table.

10376 - Query Execution : Resolved an issue where executing CREATE UNIQUE INDEX on a table partition would implicitly change the partition’s distribution key.

425 - gpbackup : When inserting into a table that is distributed by a bpchar column and using the legacy bpchar hash operator, rows always used jump consistent hashing instead of legacy (modulo) hashing. This mismatch would cause gprestore operations to fail when restoring Greenplum 4.x/5.x data into Greenplum 6.x. The problem occurred because a required hashing function ID was missing from a check function that determined if an attribute used legacy hashing. Greenplum 6.12 resolves this issue by adding the required hashing function ID in the check function.

Release 6.11

Release 6.11.2

Release Date: 2020-10-02

Pivotal Greenplum 6.11.2 is a maintenance release that resolves several issues.

Resolved Issues

Pivotal Greenplum 6.11.2 resolves these issues:

30549 - Management and Monitoring : Greenplum excluded externally-routable loopback addresses from replication entries, which caused utilities such as gpinitstandby and gpaddmirrors to fail. This problem has been resolved.

30795 - GPORCA : Fixed a problem where GPORCA did not utilize an index scan for certain subqueries, which could lead to poor performance for affected queries.

30878 - GPORCA : If a CREATE TABLE .. AS statement was used to create a table with non-legacy (jump consistent) hash algorithm distribution from a source table that used the legacy (modulo) hash algorithm, GPORCA would distribute the data according to the value of gp_use_legacy_hashops; however, it would set the table’s distribution policy hash algorithm to the value of the original table. This could cause queries to give incorrect results if the distribution policy did not match the data distribution. This problem has been resolved.

30903 / 30966 - Metrics Collector : Workfile entries were sometimes freed prematurely, which could lead to the postmaster process being reset on segments and failures in query execution, or segment PANIC. This problem has been resolved.

30928 - GPORCA : If gp_use_legacy_hashops was enabled, GPORCA could crash when generating the query plan for certain queries that included an aggregate. This problem has been resolved.

174812955 - Query Execution : When executing a long query that contained multi-byte characters, Greenplum could incorrectly truncate the query string (removing multi-byte characters) and, if log_min_duration_statement was set to 0, could subsequently write an invalid symbol to segment logs. This behavior could cause errors in gp_toolkit and Command Center. This problem has been resolved.

Release 6.11.1

Release Date: 2020-09-17

Pivotal Greenplum 6.11.1 is a maintenance release that includes changes and resolves several issues.

Changed Features

Greenplum Database 6.11.1 includes this change:

  • Greenplum Platform Extension Framework (PXF) version 5.15.1 is included, which includes changes and bug fixes. Refer to the PXF Release Notes for more information on release content and to access the PXF documentation.

Resolved Issues

Pivotal Greenplum 6.11.1 resolves these issues:

30751, 173714727 - Query Optimizer : Resolves an issue where a correlated subquery that contained at least one left or right outer join caused the Greenplum Database master to crash when the server configuration parameter optimizer_join_order was set to exhaustive2.

30880 - gpload : Fixed a problem where gpload operations would fail if a table column name included capital letters or special characters.

30901 - GPORCA : For queries that included an outer ref in a subquery, such as select * from foo where foo.a = (select foo.b from bar), GPORCA always used the results of the subquery after unnesting the outer reference. This could cause a crash or incorrect results if the subquery returned no rows, or if the subquery contained a projection with multiple values below the outer reference. To address this problem, all such queries now fall back to using the Postgres planner instead of GPORCA. Note that this behavior occurs for cases where GPORCA would have returned correct results, as well as for cases that could cause crashes or return incorrect results.

30913, 170824967 - gpfdists : A command that accessed an external table using the gpfdists protocol failed if the external table did not use an IP address when specifying a host system in the LOCATION clause of the external table definition. This issue is resolved in Greenplum 6.11.1.

174609237 - gpstart : gpstart was updated so that it does not attempt to start a standby master segment when that segment is unreachable, preventing an associated stack trace during startup.

Release 6.11.0

Release Date: 2020-09-11

Pivotal Greenplum 6.11.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.11.0 includes these new and changed features:

  • GPORCA partition elimination has been enhanced to support a subset of lossy assignment casts that are order-preserving (increasing) functions, including timestamp::date and float::int. For example, GPORCA supports partition elimination when a partition column is defined with the timestamp datatype and the query contains a predicate such as WHERE ts::date = '2020-05-10' that performs a cast on the partitioned column (ts) to compare column data (a timestamp) to a date.
  • PXF version 5.15.0 is included, which includes new and changed features and bug fixes. Refer to the PXF Release Notes for more information on release content and supported platforms, and to access the PXF documentation.
  • Greenplum Command Center 6.3.0 and 4.11.0 are included, which include new workload management and other features, as well as bug fixes. See the Command Center Release Notes for more information.
  • The query dispatcher now supports the PostgreSQL LISTEN, UNLISTEN, and NOTIFY commands (see the short example after this list).
  • The DataDirect ODBC Drivers for Pivotal Greenplum were updated to version 07.16.0389 (B0562, U0408). This version introduces support for the following datatypes:

    Greenplum Datatype    ODBC Datatype
    citext                SQL_LONGVARCHAR
    float                 SQL_REAL
    tinyint               SQL_SMALLINT
    wchar                 SQL_CHAR
    wvarchar              SQL_VARCHAR
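
A minimal sketch of the LISTEN/UNLISTEN/NOTIFY support mentioned in the list above; the channel name and payload are hypothetical, and the commands are issued from client sessions connected to the master:

    -- Session 1: subscribe to a notification channel.
    LISTEN etl_done;

    -- Session 2: send a notification with an optional payload string.
    NOTIFY etl_done, 'nightly load finished';

    -- Session 1: stop listening when no longer needed.
    UNLISTEN etl_done;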

Resolved Issues

Pivotal Greenplum 6.11.0 resolves these issues:

30899 - Resource Groups : In some cases when running queries were managed by resource groups, Greenplum Database generated a PANIC while managing runaway queries (queries that use an excessive amount of memory) because of locking issues. This issue is resolved.

30877 - VACUUM : In some cases, running VACUUM returned the error ERROR: found xmin <xid> from before relfrozenxid <frozen_xid>. The error occurred when a previously run VACUUM FULL was interrupted and aborted on a query executor (QE) and corrupted catalog frozen XID information. This issue is resolved.

30870 - Segment Mirroring : In some cases, performing an incremental recovery of a Greenplum Database segment instance failed with the message requested WAL segment has already been removed because the recovery checkpoint was not created properly. This issue is resolved.

30858 - analyzedb : analyzedb failed if it attempted to update statistics for a set of tables and one of the tables was dropped and then recreated while analyzedb was running. analyzedb has been enhanced to better handle this situation.

30845 - Query Execution : Under heavy load when running multiple queries, some queries randomly failed with the error Error on receive from seg<ID>. The error was caused when Greenplum Database encountered a divide by 0 error while managing the backend processes that are used to run queries on the segment instances. This issue is resolved.

30761 - Postgres Planner : In some cases, Greenplum Database generated a PANIC when a DROP VIEW command was cancelled from the Greenplum Command Center. The PANIC was generated when Greenplum Database did not correctly handle the visibility of the relation.

30721 - gpcheckcat : Resolved a problem where gpcheckcat would fail with Missing or extraneous entries check errors if the gp_sparse_vector extension was installed.

30637 - Query Optimizer : For some queries against partitioned tables, GPORCA did not perform partition elimination when a predicate that includes the partition column also performs an explicit cast. For example, GPORCA would not perform partition elimination when a partition column is defined with the timestamp datatype and the query contains a predicate such as WHERE ts::date = '2020-05-10' that performs a cast on the partitioned column (ts) to compare column data (a timestamp) to a date. GPORCA partition elimination has been improved to support the specified type of query. See Features.

10491 - Postgres Planner : For some queries that contain nested subqueries that do not specify a relation and also contain nested GROUP BY clauses, Greenplum Database generated a PANIC because it did not manage the subquery correctly. This is an example of the specified type of query: SELECT * FROM (SELECT * FROM (SELECT c1, SUM(c2) c2 FROM mytbl GROUP BY c1 ) t2 ) t3 GROUP BY c2, ROLLUP((c1)) ORDER BY 1, 2; This issue is resolved.

10561 - Server : Greenplum Database does not support altering the datatype of a column defined as a distribution key or with a constraint. When attempting to change the datatype, the error message did not clearly indicate the cause. The error message has been altered to provide more information.

174505130 - Resource Groups : In some cases for a query managed by a resource group, the resource group cancelled the query with the message Canceling query because of high VMEM usage because it incorrectly calculated the memory used by the query. This issue is resolved.

174353156 - Interconnect : In some cases when Greenplum Database uses proxies for interconnect communication (the server configuration parameter gp_interconnect_type is set to proxy), a Greenplum background worker process became an orphaned process after the postmaster process was terminated. This issue is resolved.

174205590 - Interconnect : When Greenplum Database uses proxies for interconnect communication (the server configuration parameter gp_interconnect_type is set to proxy), a query might have hung if it contained multiple concurrent subplans running on the segment instances. The query hung when the Greenplum interconnect did not properly handle the communication among the concurrent subplans. This issue is resolved.

174483149 - Cluster Management - gpinitsystem : gpinitsystem now exports the MASTER_DATA_DIRECTORY environment variable before calling gpconfig, to avoid throwing warning messages when configuring system parameters on Greenplum Database appliances (DCA).

Release 6.10

Release 6.10.1

Release Date: 2020-08-13

Pivotal Greenplum 6.10.1 is a maintenance release that resolves known issues.

Resolved Issues

Pivotal Greenplum 6.10.1 resolves these issues:

n/a : Code changes and testing for the Greenplum interconnect proxy feature were introduced in version 6.10.0, but the feature was not enabled in the final release build. Version 6.10.1 resolves this problem and enables the feature.

10009 - External Tables : Fixed a problem where additional data could be included after the last column of an external table if the DELIMITER 'OFF' formatting option was used when creating the table.

Release 6.10.0

Release Date: 2020-08-07

Pivotal Greenplum 6.10.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.10.0 includes these new and changed features:

  • Greenplum Database 6.10 introduces the server configuration parameter max_slot_wal_keep_size that sets the maximum size in megabytes of replication WAL log files on disk per segment instance. The default is -1, which allows Greenplum to retain an unlimited number of WAL files on disk.
  • Greenplum Database 6.10 introduces the server configuration parameter gp_add_column_inherits_table_setting for append-optimized, column-oriented tables. When adding a column to a table with the ALTER TABLE command, the parameter controls whether the column’s data compression parameters (compresstype, compresslevel, and blocksize) are inherited from the WITH clause values specified when the table was created. The default is off: the table’s data compression settings are not considered when adding a column to the table. If the value is on, the table’s WITH clause values are used (see the example after this feature list).
  • When reading data, the gpload utility now supports the control file parameter FILL_MISSING_FIELDS that can add NULL values to a data row if the row has trailing field values that are missing. To enable this feature, set the parameter to true.
  • Greenplum Database supports using proxies for Greenplum Database interconnect communication to reduce the use of connections and ports during query processing. The interconnect proxies consume fewer connections and ports than TCP mode, and have better performance than UDPIFC mode in a high-latency network.

    To enable interconnect proxies for the Greenplum system, set these system configuration parameters.

    • List the proxy ports with the new parameter gp_interconnect_proxy_addresses. You must specify a proxy port for the master, standby master, and all segment instances.
    • Set the parameter gp_interconnect_type to proxy. The proxy value is new in Greenplum Database 6.10.
      Note: Code changes and testing for the Greenplum interconnect proxy feature were introduced in version 6.10.0, but the feature was not enabled in the final release build. Use version 6.10.1 to enable the feature.
  • The PgBouncer connection pooler was updated from version 1.8.1 to version 1.13. To support this change on RHEL/CentOS 6 platforms, the Greenplum Database package now requires libevent2 instead of libevent. RHEL/CentOS 7 requirements are unchanged. See the PgBouncer 1.13.x Release Notes for a summary of changes.

  • The gpcheckcat catalog verification utility adds a new test, aoseg_table, that you can use to check that the vertical partition information on append-optimized, column storage tables is consistent with pg_attribute.

  • Greenplum Database 6.10 introduces a new server configuration parameter, gp_fts_replication_attempt_count, that you can use to configure the maximum number of times that the Greenplum fault tolerance service (FTS) attempts to establish a primary-mirror replication connection.

  • PXF version 5.14.0 is included, which includes new and changed features and bug fixes. Refer to the PXF Release Notes for more information on release content and supported platforms, and to access the PXF documentation.

  • Greenplum Streaming Server (GPSS) version 1.4.1 is included, which includes changes and bug fixes. Refer to the GPSS Release Notes for more information on release content and to access the GPSS documentation.

    Note: If you have previously used GPSS in your Greenplum 6.x installation, you are required to perform upgrade actions as described in Upgrading the Streaming Server.
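
The following sketch illustrates the gp_add_column_inherits_table_setting behavior described in the list above. The table is hypothetical, and it assumes the parameter can be set at the session level (otherwise, set it with gpconfig):

    -- Assumed to be settable per session; the default is off.
    SET gp_add_column_inherits_table_setting = on;

    CREATE TABLE sales_ao (id int, amount numeric)
    WITH (appendonly=true, orientation=column, compresstype=zlib, compresslevel=5)
    DISTRIBUTED BY (id);

    -- With the parameter on, the new column inherits compresstype, compresslevel,
    -- and blocksize from the table's WITH clause; with it off, those WITH clause
    -- values are not considered.
    ALTER TABLE sales_ao ADD COLUMN note text;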

Resolved Issues

Pivotal Greenplum 6.10.0 resolves these issues:

30554 - ANALYZE, gprestore : In some cases, performing concurrent ANALYZE operations on large partitioned tables generated the error Canceling query because of high VMEM usage. This issue occurred during some restore operations with gprestore. This issue is resolved.

30583 - Transaction Management : In some cases, a segment mirror went offline when the replication WAL log files on the mirror system filled the disk. Greenplum Database introduces the server configuration parameter max_slot_wal_keep_size to limit the amount of WAL logs stored on disk. See Features.

30792 - Logging : Resolves a disk space issue encountered when a query that generated a large number of spill files also generated an excessive number of HashJoin: Too many batches computed log messages by decreasing the severity level of the message.

173680224 - gprecoverseg : In some cases, gprecoverseg hung when performing an incremental recovery while trying to perform a clean shutdown of the segment instance, due to a locking issue. This issue is resolved.

30781, 9427 - Postgres Planner : For some queries that perform joins between partitioned tables, Greenplum Database returned ERROR: unrecognized path type 106. This issue is resolved.

10419 - Postgres Planner : In some cases when generating query plans, the Postgres Planner did not handle volatile functions correctly and allowed multiple executions of the function during query execution. This could cause incorrect results. This issue is resolved.

30692 - System Catalog Functions : pg_get_viewdef() returned an incorrect view definition when the view was created with the Greenplum-specific CASE WHEN (arg1) IS NOT DISTINCT FROM (arg2) clause. This issue is resolved.
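
For example (hypothetical table and view), a view of this form is now deparsed correctly by pg_get_viewdef():

    CREATE TABLE t1 (a int, b int);
    CREATE VIEW v1 AS
        SELECT CASE WHEN (a) IS NOT DISTINCT FROM (b) THEN 'same' ELSE 'different' END AS cmp
        FROM t1;

    -- Returns the view text with the IS NOT DISTINCT FROM clause intact.
    SELECT pg_get_viewdef('v1'::regclass, true);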

30558 - Query Optimizer : Resolves an issue where execution time and spill file size increased for a query on a wide table because Greenplum Database overestimated row cardinality when the query specified multiple predicates that included distribution keys.

30512 - MPP: Dispatch : Resolves an issue where Greenplum Database hung while continuously retrying a primary-mirror replication connection. Greenplum 6.10 introduces a new server configuration parameter, gp_fts_replication_attempt_count, with which you can configure the maximum number of retry attempts.

10141 - ALTER TABLE … INHERIT : Greenplum Database restricted you from creating both a replicated table that inherits from another table, and a table that inherits from a replicated table, but Greenplum did allow you to use the ALTER TABLE ... INHERIT command to assign inheritance to/from a replicated table after it was created. Update commands on such a table returned the misleading error ModifyTable mixes distributed and entry-only tables. This issue is resolved; Greenplum 6.10 enforces replicated table inheritance restrictions uniformly, including on ALTER TABLE ... INHERIT statements.
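
A sketch of the now-rejected pattern; the tables are hypothetical:

    CREATE TABLE parent_rep (a int) DISTRIBUTED REPLICATED;
    CREATE TABLE child_tbl (a int) DISTRIBUTED BY (a);

    -- In 6.10 and later this statement is rejected up front, rather than being
    -- allowed and then failing later with the misleading ModifyTable error on UPDATE.
    ALTER TABLE child_tbl INHERIT parent_rep;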

10057 - Server: Transactions : Greenplum Database returned the error Too many distributed transactions for snapshot when processing many distributed transactions and the max_prepared_transactions server configuration parameter was set to a low value. This issue is resolved; Greenplum now uses a more robust method to determine the maximum number of distributed transactions.

10030 - Server: Query Execution : Greenplum Database incorrectly used the value of the gp_enable_global_deadlock_detector server configuration parameter on query executors to determine the lock mode. This issue is resolved; the parameter is now evaluated only on the master.

9896 - gpexpand : gpexpand failed on a partitioned Greenplum Database table that included an external table leaf child partition. This issue is resolved; Greenplum Database now skips external table partitions during expansion.

9207 - Server: Transactions : In certain situations, the Greenplum Database standby master host shut down and exited abnormally with the FATAL error the limit of N distributed transactions has been reached when the global transaction limit was exceeded. This issue is resolved; Greenplum now uses a more robust method to determine the maximum number of distributed transactions.

173219210 - Resource Groups : The Greenplum Database gp_toolkit.gp_resgroup_status_per_host view incorrectly reported CPU usage as the sum of resource group CPU usage on all segments on the same host. This issue is resolved; Greenplum Database now correctly reports the average CPU usage of resource groups on all segments on the host.

171849582 - Tablespace : Greenplum Database generated a PANIC when it did not check for the existence of a tablespace directory before attempting to delete the directory. This issue is resolved.

Release 6.9

Release 6.9.1

Release Date: 2020-07-24

Pivotal Greenplum 6.9.1 is a maintenance release that resolves several issues.

Note: Greenplum 6.9.1 also includes the Greenplum Database R Client (GreenplumR) version 1.1.0. This version of GreenplumR adds the input.signature argument to db.gpapply() and db.gptapply(), to match function argument names to table column names when the function input argument is not a single data frame. See Greenplum Database R Client.

Resolved Issues

Pivotal Greenplum 6.9.1 resolves these issues:

n/a : The DataDirect ODBC Drivers were updated to version 7.16.359 to incorporate hot fix changes and certify compatibility with SUSE Linux Enterprise Server 15 clients. See the DataDirect Release Notes for additional information.

10230, 31582 - Server: Execution : Greenplum Database generated a segmentation fault during execution of a hash aggregate query when it incorrectly initialized a tuple memory context for a partitioned table that contained a btree index. This issue is resolved.

30750 : The ANALYZE code did not exclude external tables (which cannot be analyzed) if an external table was part of a table partition hierarchy. This would result in the failure ERROR: unsupported table type when performing partitioned table operations that use ANALYZE, such as ALTER TABLE … EXCHANGE PARTITION. The problem was resolved by updating the ANALYZE code to correctly exclude these external tables.

Release 6.9.0

Release Date: 2020-06-26

Pivotal Greenplum 6.9.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.9.0 includes these new and changed features:

  • Greenplum Streaming Server (GPSS) version 1.4.0 is included, which introduces many new and changed features and bug fixes. Refer to the GPSS Release Notes for more information on release content and to access the GPSS documentation.

    Note: If you have previously used GPSS in your Greenplum 6.x installation, you are required to perform upgrade actions as described in Upgrading the Streaming Server.
  • PXF version 5.13.0 is included, the first PXF release to also provide a separate download package that enables you to install PXF in a file system location outside of the Greenplum install directory. Refer to the PXF Release Notes for more information on release content and supported platforms, and to access the PXF documentation.

Resolved Issues

Pivotal Greenplum 6.9.0 resolves these issues:

30630 - Segment Mirroring : In some cases during a failover of a segment instance, Greenplum Database returned the FATAL error requested WAL segment WAL_seg_ID has already been removed. Greenplum Database WAL replication incorrectly removed segment files before they were processed during failover. This issue is resolved.

10216 - ALTER TABLE, ALTER DOMAIN : In some cases, heap table data could be lost when performing concurrent ALTER TABLE or ALTER DOMAIN commands where one command alters a table column and the other rewrites or redistributes the table data. For example, performing concurrent ALTER TABLE commands where one command changes a column data type from int to text might cause data loss. This issue might also occur when altering a table column during the data distribution phase of a Greenplum system expansion. Greenplum Database did not correctly capture the current state of the table during command execution. This issue is resolved.

10224 - ALTER TABLE : For a leaf partition of a partitioned table, ALTER TABLE allowed the distribution policy to be changed to REPLICATED. This is resolved. ALTER TABLE no longer allows the change.

30647 - Postgres Planner : Some queries that performed multistage aggregation returned results in the incorrect order. For example, some queries that perform a COUNT in the select list and also contain a GROUP BY clause returned results in the incorrect order. This issue is resolved.

10013 - Postgres Planner : In some cases, Greenplum Database generated a PANIC when it encountered a lateral subquery that included a LIMIT 1 or GROUP BY clause. Greenplum 6.9 resolves this issue by forcing the gathering and materialization of any relation containing a GROUP BY or LIMIT clause.

10315 - Postgres Planner : For some queries that perform a FULL JOIN using a subselect that contains a COALESCE function, Greenplum Database returned "ERROR: could not find hash distribution key expressions in target list". This issue is resolved.

8919 - MPP, Query Execution : Greenplum Database did not properly handle concurrent updating operations to a table when one of the operations moved a table distribution key to another segment instance. Now when a table distribution key is moved to another segment instance, a concurrent updating operation returns an error.

173243811 - Resource Groups : When resource groups are enabled and a user attempted to move a running query from one resource group to a resource group configured using the memory_limit=0 with the pg_resgroup_move_query() function, Greenplum Database returned the error ERROR: group <group_ID> doesn't have enough memory on master. This issue is resolved.

172931886 - Transaction System : Greenplum 6.9 resolves an issue where restarting a primary could lead to a segment process hang when there were prepared, but not yet committed or aborted, transactions in progress at the time of shutdown.

Release 6.8

Release 6.8.1

Release Date: 2020-06-11

Pivotal Greenplum 6.8.1 is a maintenance release that contains a changed feature and resolves several issues.

Changed Feature

The Greenplum PostGIS extension package has been updated to postgis-2.5.4+pivotal.2. The release contains these changes:

  • Adds support for the PostGIS TIGER geocoder extension and the PostGIS address standardizer and address rules files extensions.
  • Removes PostGIS Raster function limitations.
  • Uses the CREATE EXTENSION and DROP EXTENSION commands to enable and disable support for the PostGIS extension and supported, optional PostGIS extensions.

    Note: The postgis_manager.sh script is deprecated and will be removed in a future release of Greenplum PostGIS. To enable or disable PostGIS support, use the CREATE EXTENSION or DROP EXTENSION command. See Enabling and Removing PostGIS Support.

Resolved Issues

Pivotal Greenplum 6.8.1 resolves these issues:

30664 - Query Optimizer : For a complex CTAS query that has implicit casts in the project list, GPORCA may generate a plan with duplicate eliminating motions, to ensure correctness. However, if a duplicate eliminating motion is performed under the hash operation of a Hash Join, an implicit cast operation creates an additional column that causes memtuple binding issues in the executor. To address this problem, GPORCA now generates a modified plan that prunes the output of any duplicate eliminating motions before sending the output to the hash operation.

30684 - Query Optimizer : GPORCA returned incorrect results for some queries when the query’s select list contains a window function and the window function contains a correlated subquery or an outer reference. Now the query falls back to the Postgres planner.

30615 - Query Optimizer : GPORCA query performance degraded when compared with Greenplum 5 for some queries that perform joins using an equality predicate and the equality predicate contains a function, for example coalesce(tbl1.a, '999999') = coalesce(tbl2.a, '999999'). The performance issue was caused by inaccurate cardinality estimates. GPORCA cardinality estimation has been improved for the specified type of query.

172732495, 9953 - query execution : Greenplum Database generated a PANIC when executing a query that executes multiple user-defined functions and more than one of the functions is defined with the EXECUTE ON INITPLAN attribute. This issue is resolved.

172098556 - psql : Resolved a problem where the psql client \dm command did not display materialized views.

172094194, 9837 - gprecoverseg : In some cases when recovering segment instances using the gprecoverseg utility with the -i <recover_config_file> option to specify details about failed segments to recover, the utility changed some segment instance dbid values in the Greenplum system configuration. This issue is resolved.

Release 6.8.0

Release Date: 2020-06-05

Pivotal Greenplum 6.8.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.8.0 includes these new and changed features:

  • Greenplum Streaming Server (GPSS) version 1.3.6 is included, which introduces many new and changed features and bug fixes since the last GPSS version installed in Greenplum 6.x (1.3.1). Refer to the GPSS Release Notes for more information on release content and to access the GPSS documentation.

    Note: If you have previously used GPSS in your Greenplum 6.x installation, you are required to perform upgrade actions as described in Upgrading the Streaming Server.
  • The gpinitsystem input configuration file specified with the -I option supports an additional format to specify hosts. The QD_PRIMARY_ARRAY, PRIMARY_ARRAY, and MIRROR_ARRAY host parameters may now be specified using either of the following formats:

    host~port~data_directory/seg_prefix<segment_id>~dbid~content_id
    
    hostname~address~port~data_directory/seg_prefix<segment_id>~dbid~content_id
    

    The first format, which is the pre-existing format, sets both the hostname and address columns of the gp_segment_configuration catalog table to the value in the host field. The second format sets the hostname and address columns of the gp_segment_configuration catalog table to the values in the respective hostname and address fields. A hypothetical example of both formats appears after this feature list.

  • PXF version 5.12.0 is included, which introduces new and changed features and bug fixes. See PXF Version 5.12.0 below.

  • PL/Container version 2.1.2 is included, which introduces the following new features:

    • Support for R version 3.6.3.
    • A new --use_local_copy option to the plcontainer add-image command that you can use to install the specified image only on the local host.
  • Greenplum Database 6.8 adds support for Moving a Query to a Different Resource Group.

  • Greenplum Database 6.8 includes a new metrics collector extension that is compatible with Greenplum Command Center 6.2 and above. If you are using Command Center 6.0 or 6.1 you must download and install Command Center 6.2 after you install Greenplum Database 6.8.
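
As referenced above, the same hypothetical primary segment (host sdw1, interconnect address sdw1-ic, port 40000, dbid 2, content ID 0) could be specified in either host format accepted by the gpinitsystem input configuration file:

    sdw1~40000~/data1/primary/gpseg0~2~0
    sdw1~sdw1-ic~40000~/data1/primary/gpseg0~2~0

The first line sets both the hostname and address columns of gp_segment_configuration to sdw1; the second sets hostname to sdw1 and address to sdw1-ic.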

PXF Version 5.12.0

PXF includes the following new and changed features:

  • PXF trims right-padded white space added by Greenplum before it writes Parquet data.
  • PXF bundles newer hive, jackson-databind, and supporting internal libraries.
  • A PXF server running on Java 11 can now read from Hive using an external table that specifies a Hive* profile.
  • PXF introduces the new custom option IGNORE_MISSING_PATH for external tables that you use to read file-based data. Setting this option may be useful when a PXF external table is a child partition of a partitioned Greenplum table. Refer to About PXF External Table Child Partitions for more information. A sketch of the option follows this list.
  • PXF bundles the jodd-core library to satisfy a missing transitive dependency that is required when PXF reads Parquet files that contain data in timestamp format.
  • PXF adds column projection support for the Hive and HiveRC profiles by changing the implementation to use column name-based, rather than column index-based, mapping.

    Note: If you have existing PXF external tables that specify a Hive* profile, you may be required to perform upgrade actions as described in Upgrading PXF.
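
A sketch of the IGNORE_MISSING_PATH option mentioned in the list above, assuming the hdfs:text profile and a hypothetical HDFS directory; with the option set to true, a read against a path that does not yet exist is not treated as an error:

    CREATE EXTERNAL TABLE sales_2020_ext (id int, amount numeric)
        LOCATION ('pxf://data/sales/2020?PROFILE=hdfs:text&IGNORE_MISSING_PATH=true')
        FORMAT 'TEXT' (DELIMITER ',');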

Resolved Issues

Pivotal Greenplum 6.8.0 resolves these issues:

329, 30602 - PXF : PXF did not correctly read a partitioned Hive table when the external table specified a Hive* profile and the external table and Hive table had a differing number of columns. This issue is resolved. PXF now supports column projection for the Hive* profiles and correctly handles this situation.

30611 - Query Optimizer : When falling back to the Postgres Planner, GPORCA incorrectly logged internal messages, which made the log file difficult to read and bloated the file. This issue is resolved; GPORCA message logging has been improved and internal messages are no longer written to the log files.

30585 - Locking : Resolved a problem that could corrupt resource queue locks, and potentially other types of locks, in shared memory. This problem could cause errors such as lock lock_name on object object_identifier is already held.

30557 - DDL : When performing a data reorganization with the ALTER TABLE command on a leaf partition of a partitioned table that did not change the distribution policy, Greenplum Database returned the error ERROR: can't set the distribution policy. This type of redistribution is allowed in Greenplum 5. Now Greenplum allows data reorganization on a leaf partition if the distribution policy is not changed.

30289 - Query Optimizer : When GPORCA performed dynamic partition elimination for some queries against partitioned tables that perform joins, GPORCA was not using the correct statistics. This caused a performance degradation when compared with Greenplum 5. GPORCA has improved how statistics are computed for the specified type of query.

172854840 - Interconnect : In some cases, a query that executes a stable function that contains an SQL statement might hang because the query dispatcher (QD) did not correctly manage the execution of the function and the dispatching of the query plan. This issue is resolved.

172832212 - Interconnect : In some cases, communication between a query dispatcher (QD) and a query executor (QE) on different segments was slow when the Greenplum interconnect type is set to the TCP networking protocol for Greenplum Database interconnect traffic. Now the communication between a QD and a QE is more efficient.

172615233 - Query Optimizer : For text data types, the GPORCA cardinality estimation algorithm has been improved for equality comparisons, for example, when a query contains an IN clause with text elements.

172576000 - COPY : If data format errors occurred while copying data into a partitioned table with a COPY FROM command in single row error isolation mode, Greenplum Database might crash when a query executor (QE) did not handle the data format error correctly. This issue is resolved.

30487 - Utility Commands : On a Greenplum Database 6 system with FIPS enabled, Greenplum utility commands such as gpinitsystem returned the error "ERROR:root:code for hash md5 was not found." This issue is resolved.

30484 - Utility Commands : When initializing a Greenplum Database system with gpinitsystem, the primary segments were erroneously named using DNS resolvable external hostnames instead of the internal interconnect interface hostnames. At the same time, the segment mirrors were correctly named. This issue is now resolved.

Release 6.7

Release 6.7.1

Release Date: 2020-04-30

Pivotal Greenplum 6.7.1 is a maintenance release that resolves several issues. In addition to these resolved issues:

  • Version 6.7.1 updates PostGIS to version 2.5.4, which removes several previous limitations. See Geospatial Analytics for more information.
  • The Greenplum R client is no longer considered a Beta feature.

Resolved Issues

Pivotal Greenplum 6.7.1 resolves these issues:

n/a - MADlib : In Greenplum 6.7.0 the MADlib download files that were originally provided, madlib-1.17.0+2-gp6-rhel7-x86_64.tar.gz and madlib-1.17.0+2-gp6-rhel6-x86_64.tar.gz, contained MADlib version 1.16 instead of version 1.17. This is resolved in Greenplum 6.7.1, and in Greenplum 6.7.0 with the newly-provided files madlib-1.17.0+3-gp6-rhel7-x86_64.tar.gz and madlib-1.17.0+3-gp6-rhel6-x86_64.tar.gz.

9790 - Server : A crash could occur when performing a SELECT query against a column-oriented table, when the table was created using the WITH NO DATA clause. The problem occurred because the WITH clause options were not correctly added to the pg_attribute_encoding table. This problem has been resolved.

30499 - Server: Execution : Fixed a memory leak that occurred when executing CHECKPOINT commands.

30559 - Query Optimizer : Queries that contain an IN clause with a large number of constants took a long time to generate a query plan. Most of the time was spent estimating the cardinality of the IN clause predicate. The cardinality estimation algorithm has been enhanced and significantly reduces the cardinality estimation time for the specified type of query.

30579 - Interconnect : In some cases during query execution, the query hung with the query dispatcher (QD) waiting for the query executor (QE) on a few segment instances to complete. This issue is resolved.

30844 - gpreload : gpreload returned the error more than one row returned when attempting to reload a table and a view with the same name exists in a different schema in the database. This issue is resolved.

172163076 - Server : A subtransaction would incorrectly use 1-phase commit instead of 2-phase commit if \set ON_ERROR_ROLLBACK interactive was enabled in a client’s .psqlrc file. This problem has been resolved.

172324858, 9891 - MPP: Locking, Signals, Processes : In some cases, Greenplum Database did not manage snapshots correctly when processing concurrent distributed transactions. This caused a concurrent transaction to access a distributed log file that was no longer available and generated the error message Could not open file "pg_distributedlog/<file-name>": No such file or directory. This issue is resolved.

172348849 - Postgres Planner : Some queries that contain a UNION ALL that combines the results of a SELECT command that uses a replicated table with the results of another SELECT command returned the error ERROR: could not build Motion path. This issue is resolved.

172284550, 9823 - ALTER DATABASE : The ALTER DATABASE ... FROM CURRENT command did not set a server configuration parameter for a database. This issue is resolved.

Release 6.7.0

Release Date: 2020-04-17

Pivotal Greenplum 6.7.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.7.0 includes these new and changed features:

  • Greenplum Database 6.7 introduces the new gp_resource_group_queuing_timeout server configuration parameter. When the resource group-based resource management scheme is active, gp_resource_group_queuing_timeout specifies the maximum amount of time a transaction waits for execution in a queue on a resource group before Greenplum Database cancels the transaction. By default, queued transactions in a resource group can wait indefinitely.
  • Greenplum Database 6.7 includes MADlib version 1.17, which introduces new Deep Learning features, k-Means clustering, and other improvements and bug fixes. See the Apache MADlib page for additional information and Release Notes.

    Note: In Greenplum 6.7.0 the MADlib download files that were originally provided, madlib-1.17.0+2-gp6-rhel7-x86_64.tar.gz and madlib-1.17.0+2-gp6-rhel6-x86_64.tar.gz, contained MADlib version 1.16 instead of version 1.17. This is resolved in Greenplum 6.7.1, and in Greenplum 6.7.0 with the newly-provided files madlib-1.17.0+3-gp6-rhel7-x86_64.tar.gz and madlib-1.17.0+3-gp6-rhel6-x86_64.tar.gz.

Resolved Issues

Pivotal Greenplum 6.7.0 resolves these issues:

8539 - Server : Using NOWAIT in a SELECT FOR UPDATE statement could result in the error, ERROR: relation "<name>" does not exist, because locking was not correctly handled for the NOWAIT clause. This problem has been resolved. Note, however, that NOWAIT only affects how the SELECT statement obtains row-level locks. A SELECT FOR UPDATE NOWAIT statement will always wait for the required table-level lock; it behaves as if NOWAIT was omitted.
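
A brief illustration of the clarified NOWAIT behavior; the table is hypothetical:

    BEGIN;
    -- NOWAIT applies only to the row-level locks acquired by FOR UPDATE;
    -- the statement still waits for the required table-level lock.
    SELECT balance FROM accounts WHERE id = 7 FOR UPDATE NOWAIT;
    COMMIT;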

9089 - Server : Fixed a problem where Greenplum Database failed to truncate an append-only, column-oriented table if the CREATE TABLE and TRUNCATE statements were executed in the same transaction.

30305 - Resource Groups : A transaction may be queued for execution on a resource group for an extended period of time, particularly when the resource group reached its concurrent transaction limit. This could prevent queries initiated by Greenplum Database superusers from executing. Greenplum Database 6.7 resolves this issue by introducing the gp_resource_group_queuing_timeout server configuration parameter, which specifies the maximum amount of time a queued transaction waits for execution in a resource group before Greenplum cancels the transaction.

30531 - Query Optimizer : An out of memory error occurred when running some queries that contain joins that perform a comparison operation on citext data. The error occurred because the query fell back to the Postgres Planner. This issue is resolved; the query no longer falls back to the Postgres Planner and is executed using GPORCA.

30536 - PL/pgSQL : In a PL/pgSQL procedure, output text from a RAISE NOTICE statement was not displayed correctly if the text contained a newline (line feed) character. Only the text before the newline character was displayed. This issue is resolved.

Release 6.6

Release 6.6.0

Release Date: 2020-04-06

Pivotal Greenplum 6.6.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.6.0 includes these new and changed features:

  • For the CREATE EXTERNAL TABLE command, the LOG ERRORS clause now supports the PERSISTENTLY keyword. The LOG ERRORS clause logs information about external table data rows with formatting errors. The error log data is stored internally. When you specify LOG ERRORS PERSISTENTLY, the log data persists after the external table is dropped.

    If you use the PERSISTENTLY keyword, you must install the functions that manage the persistent error log information.

    For information about the error log and the built-in functions for viewing and managing error log information, see CREATE EXTERNAL TABLE. A brief example follows this feature list.

  • PXF version 5.11.2 is included, which introduces these changes:

    • PXF no longer validates the JDBC BATCH_SIZE write option during a read operation.
    • PXF bundles a newer jackson-databind library.
    • PXF removes references to the unused pxf-public.classpath file. This in turn removes spurious WARNING: Failed to read classpath file ... log messages.
    • PXF now bundles Tomcat version 7.0.100.
  • Greenplum Database 6.6 includes MADlib version 1.17, which introduces new Deep Learning features, k-Means clustering, and other improvements and bug fixes. See the MADlib 1.17 Release Notes for a complete list of changes.
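
As noted in the list above, a minimal sketch of LOG ERRORS PERSISTENTLY; the gpfdist host and file path are hypothetical:

    CREATE EXTERNAL TABLE ext_expenses (name text, amount numeric)
        LOCATION ('gpfdist://etlhost:8081/expenses/*.csv')
        FORMAT 'CSV'
        LOG ERRORS PERSISTENTLY SEGMENT REJECT LIMIT 100 ROWS;

Because PERSISTENTLY is specified, the internally stored error log rows remain available after the external table is dropped.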

Resolved Issues

Pivotal Greenplum 6.6.0 resolves these issues:

30483 - Query Optimizer : A query that specified multiple constants in an IN clause generated a large number of spill files and returned the error workfile per query size limit exceeded when GPORCA incorrectly normalized a histogram that was not well-defined. This issue is resolved.

30488 - DDL : For some append-optimized partitioned tables, performance was poor when adding a column to the table with the ALTER TABLE... ADD COLUMN command because the command performed a full table rewrite. Now only data corresponding to the new column is rewritten.

30518 - Query Optimizer : A query that specified an aggregate function such as min() or count() that was invoked on a citext-type column failed with the error cache lookup failed for function 0 because GPORCA incorrectly generated a multi-stage aggregate for the query. This issue is resolved.

30525 - Logging : In some cases, Greenplum Database encountered a segmentation fault and rotated the log file early when the logging level was set to WARNING or less severe and Greenplum attempted to write to the alert log file after it failed to open the file. This issue is resolved.

171506474 - COPY : When a COPY ... FROM ... ON SEGMENT command copied data into an append-only table, the command did not update the append-only table metadata tupcount (the number of tuples on a segment, including invisible tuples) and modcount (the number of data modification operations performed). This issue is resolved.

n/a - gpperfmon : The Ubuntu build of Greenplum Database 6.5.0 did not include the gpperfmon database, which is required for using Greenplum Command Center. This issue is resolved in version 6.6.0.

Release 6.5

Release 6.5.0

Release Date: 2020-03-20

Pivotal Greenplum 6.5.0 is a minor release that includes changed features and resolves several issues.

Warning: The Ubuntu build of Greenplum Database 6.5.0 does not include the gpperfmon database, which is required for using Greenplum Command Center. Customers deploying to Ubuntu should not install or upgrade to Greenplum Database 6.5 until a maintenance release is provided to resolve this issue.

Features

Greenplum Database 6.5.0 includes these new and changed features:

  • When creating a user-defined function, you can specify the attribute EXECUTE ON INITPLAN to indicate that the function contains an SQL command that dispatches queries to the segment instances and requires special processing on the master instance by Greenplum Database. When possible, Greenplum Database handles the function on the master instance in the following manner:

    1. First, Greenplum Database executes the function as part of an InitPlan node on the master instance and holds the function output temporarily.
    2. Then, in the MainPlan of the query plan, the function is called in an EntryDB (a special query executor (QE) that runs on the master instance) and Greenplum Database returns the data that was captured when the function was executed as part of the InitPlan node. The function is not executed in the MainPlan. For more information about the attribute and limitations when using the attribute, see CREATE FUNCTION.
  • GPORCA introduces a new costing model for bitmap indexes. The new model is designed to choose faster, bitmap nested loop joins instead of hash joins. The new costing model is implemented as a Beta feature, and it is used as a default only if you enable it by setting the configuration parameter: set optimizer_cost_model = experimental

    The optimizer_cost_model parameter is required only during the Beta test period for this cost model. After further testing and validation, the new cost model will be enabled by default.

  • Greenplum Database includes the server configuration parameter plan_cache_mode that controls whether a prepared statement (either explicitly prepared or implicitly generated, for example by PL/pgSQL) can be executed using a custom plan or a generic plan.

    Custom plans are created for each execution using its specific set of parameter values, while generic plans do not rely on the parameter values and can be re-used across executions. By default, the choice between these options is made automatically, but it can be overridden by setting this parameter. If the prepared statement has no parameters, a generic plan is always used. The allowed values are auto (the default), force_custom_plan and force_generic_plan. This setting is considered when a cached plan is to be executed, not when it is prepared. A short example appears after this feature list.

  • PXF version 5.11.1 is included, which introduces new and changed features and bug fixes. See PXF Version 5.11.1 below.

  • The s3 external table protocol now automatically recognizes any file it reads that has a .deflate extension and uncompresses it as deflate-format data.

  • Greenplum Database introduces the Greenplum R Client Beta, an interactive in-database data analytics tool. Refer to the Greenplum Database R Client (Beta) documentation for installation and usage information for this tool.

    The Greenplum R Client (GreenplumR) is currently a Beta feature, and is not supported for production environments.

  • gpload adds the --max_retries option to specify the number of times the utility attempts to connect to Greenplum Database after a connection timeout. The default value, 0, does not attempt a connection after a timeout.

  • Greenplum Database introduces PL/Container version 3.0 Beta, which:

    • Provides support for the new GreenplumR interface.
    • Reduces the number of processes created by PL/Container, in order to save system resources.
    • Supports more containers running concurrently.
    • Includes improved log messages to help diagnose problems. PL/Container 3 is currently a Beta feature, and is not supported for production environments. It provides only an R Docker image for executing functions; Python images are not yet available. See PL/Container Language for installation changes related to PL/Container 3.
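
A short sketch of the plan_cache_mode parameter described in the list above; the prepared statement and table are hypothetical:

    -- Force re-use of a single generic plan for every execution of the statement.
    SET plan_cache_mode = force_generic_plan;

    PREPARE get_order (int) AS
        SELECT * FROM orders WHERE order_id = $1;

    -- Executed with the generic plan; with force_custom_plan, a new plan would be
    -- built for each EXECUTE using the supplied parameter value.
    EXECUTE get_order(42);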

PXF Version 5.11.1

PXF includes the following new and changed features:

  • PXF provides a restart command to stop, and then restart, all PXF server instances in the cluster. See Restarting PXF.
  • The pxf [cluster] sync command now recognizes a [-d | --delete] option. When specified, PXF deletes files on the remote host(s) that are not present in the PXF user configuration on the Greenplum Database master host. Refer to pxf and pxf cluster.
  • PXF supports filter predicate pushdown for Parquet data that you access with the Hadoop and Object Store Connectors. Parquet Data Type Mapping describes filter pushdown support for Parquet data types in PXF.
  • PXF includes improvements to error handling and error surfacing.
  • PXF bundles newer guava and Google Cloud Storage hadoop2 libraries.
  • The PXF pxf-log4j.properties template file updates a log filter and changes the level from INFO to WARN.
  • PXF removes unused and default Tomcat applications and files, hardening its default Tomcat security.
  • PXF no longer requires a $JAVA_HOME setting in gpadmin’s .bashrc file on the master, standby master, and segment hosts. You can now specify JAVA_HOME before or during PXF initialization. Refer to the Initialization Overview in the PXF initialization documentation.

Resolved Issues

Pivotal Greenplum 6.5.0 resolves these issues:

307 - PXF : PXF did not correctly handle an external table that was created with the ESCAPE 'OFF' or DELIMITER 'OFF' formatting options. This issue is resolved. PXF now correctly neither escapes nor adds delimiters when reading external data with an external table created with these options.

30155 - gpstart : On systems that use a custom tablespace or filespace, gpstart could fail to start a cluster if the standby master host was down (for example, if the standby was taken offline for maintenance), showing the error: Error occured while stopping the standby master: ExecutionError: 'non-zero rc: 255' occured. This problem occurred because gpstart was attempting to check and sync the filespace or tablespace on the unavailable standby master host. gpstart was modified to skip filespace and tablespace checks when the standby server is not available.

30255 - Query Optimizer : The GPORCA cost model for bitmap indexes would sometimes cost bitmap nested loop joins higher than hash joins, resulting in poor query performance. Greenplum Database 6.5 introduces a revised cost model for bitmap indexes to address this issue. See Features.

30287 - Server: Execution : When GPORCA was enabled, queries against an append-only, column-oriented table could cause a PANIC due to shared memory corruption. The code was modified to guard against out-of-bound writes that caused the memory corruption.

30367 - Query Optimizer : For GPORCA, query performance was poor for some queries against tables with columns that are defined with the citext datatype. The poor performance was because GPORCA did not gather statistics and calculate cardinalities for those columns. Now GPORCA gathers statistics and calculates cardinalities for columns defined with the citext datatype.

30369 - Query Execution : Greenplum Database generated a PANIC when executing a query that contains a JOIN LATERAL and the LATERAL subquery contains a LIMIT clause. Now the specified type of query completes.

30379 - ANALYZE : In some cases, performing an ANALYZE operation on a table with a column that is defined with the citext datatype returns the error permission denied for schema <name>. The error was generated when the user performing the operation did not have USAGE privilege in the schema where the citext datatype was defined with the CREATE EXTENSION citext command. Greenplum Database has been modified to not require USAGE privilege in the citext datatype schema for ANALYZE operations.

30382 - VACUUM,TRUNCATE : In some cases, performing a VACUUM FULL operation on the pg_class catalog table and concurrently performing a TRUNCATE operation on a user created heap table returned the error updated tuple is already HEAP_MOVED_OFF and caused the database to become unavailable. The TRUNCATE command did not properly manage the heap table entry in pg_class during the TRUNCATE operation. This issue is resolved.

30390 - gprecoverseg : In some cases, performance was poor when performing an incremental recovery with the gprecoverseg utility on a system with a large number of segment instances. Performance is improved; the utility now performs some recovery operations in parallel.

30405 - gpcheckcat : The gpcheckcat utility failed when the dbid of Greenplum Database master was not 1. Now the master dbid is not required to be 1.

30426 - Query Execution : Some queries that use the window function cume_dist() returned the error Backward scanning of tuplestores are not supported if the query generated spill files. This issue is resolved; backward scanning of tuplestore spill files is now allowed during query execution.
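
A simple example of the affected query shape, using a hypothetical employees table:

    -- cume_dist() ranks each salary within its department; on large inputs
    -- this window function can spill to tuplestore workfiles.
    SELECT dept, salary,
           cume_dist() OVER (PARTITION BY dept ORDER BY salary) AS salary_pct
    FROM employees
    ORDER BY dept, salary;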

30437 - Query Optimizer : Queries using dynamic partition elimination (DPE) with range predicates were running slow. This issue has been fixed by allowing only equality comparisons with DPE.

30438 - Catalog and Metadata : If the server configuration parameter gp_use_legacy_hashops was set to on, Greenplum Database incorrectly used the non-legacy opclasses when redistributing a table with an ALTER TABLE...(REORGANIZE = TRUE) command if the command contained a DISTRIBUTED BY clause that did not specify an opclass. This caused SELECT commands against a table with redistributed data to return incorrect results. This issue is resolved.

30441 - analyzedb : The analyzedb utility could fail with an error similar to ERROR: relation "pg_aoseg.pg_aocsseg_xxxxxx" does not exist if a table was dropped during the analyzedb operation. This problem was resolved by ensuring that analyzedb skips any dropped tables when determining the list of tables to analyze.

30450 - PXF : PXF initialization and reset failed when the default system Java version differed from that specified in PXF’s $JAVA_HOME. This issue is resolved; PXF has added flexibility to the specification of the $JAVA_HOME setting.

30452 - Dispatch : If the server configuration parameter check_function_bodies was set in a session on the master, the parameter setting did not persist when a related segment instance session was reset. This caused some functions to fail. Now the parameter setting persists when a segment instance session is reset.

30464 - Query Optimizer : GPORCA incorrectly determined that the plan for a query with a filter on a window function over the distribution key column was direct dispatchable. Now direct dispatch requires the filter to be on a table scan.

30471, 8987 - Postgres Planner : When the Postgres Planner executed some queries that contain a subquery with both a distinct-qualified aggregate expression and a GROUP BY clause, the Postgres Planner returned the error could not find pathkey item to sort. The error was returned when the Postgres Planner did not properly manage information used for sorting.

30474 - Query Execution : In some specific situations, certain queries generated a Greenplum Database PANIC. The PANIC occurred when Greenplum Database did not properly handle skew optimization for multi-batch hash joins. A query could trigger the PANIC when it contained a join, the join key had segment-local statistics (such as a catalog table), the join key value was one of the most common values, and the query plan used a rescannable, multi-batch hash join.

30477 - Query Analyze : While gathering statistics for a partitioned table, the pg_class columns relpages and reltuples were not populated for the root partition, only for the leaf partitions. This issue has been fixed by changing the method used to determine whether a partitioned table is empty.

30485 - gpinitsystem : When initializing a Greenplum Database system, the gpinitsystem utility failed to set the password for a user name when the name was numeric. This issue is resolved.

30493 - analyzedb : When used with the --config-file option, analyzedb did not enumerate the leaf partitions of a partitioned table and processed the root partition as a non-partitioned table. For heap tables this produced an error. For append-optimized tables, no error was raised, but DML changes to leaf partitions were not tracked properly. This issue has been resolved. Using the --config-file option correctly analyzes partitioned tables.

168199193 - COPY : In some cases, performance of the COPY command in Greenplum Database 6 was poor when compared to Greenplum Database 5. The performance of the COPY command is improved.

168828451, 8677 - Planner : Some queries returned incorrect results when the queries contain subqueries that perform a join and also contain one or more equality predicates, and optionally contain an IS NULL predicate. Incorrect results were returned when either a merge join or a nested loop join did not correctly process the predicates. This issue is resolved.

169030090 - Server : Superusers were limited to 3 connections by default, causing “too many clients” errors when users run maintenance scripts. The maximum number of superuser connections is set with the superuser_default_connections server configuration parameter. This issue is resolved. The default value for this parameter has changed from 3 to 10.

170745356, 9407 - Query Execution : With gp_enable_global_deadlock_detector set to on, concurrent updates to the same table could produce an incorrect query result. This issue is resolved. Segments report waited-for transaction IDs to the master so that the master has the same transaction order as the segments.

170861600 - Server : Using ALTER TABLE tablename SPLIT PARTITION could cause rows to be assigned to the wrong partition, or could cause a crash, if one or more columns before the partition key were dropped. This issue is resolved.

171481916 - gpinitstandby : In some cases, utilities that check for a host IP address, such as gpinitstandby, failed. A Python utility (ifaddrs) that is used by those Greenplum Database utilities caused the failure. ifaddrs has been updated.

171596248, 9679 - Query Execution : A Greenplum Database segment instance might generate a PANIC when a query that joins tables with a compound data type generates a query plan that performs a data motion and contains Nested Loop joins. The PANIC occurs due to an error in the prefetch logic for the motion. The prefetch logic issue has been corrected.

Release 6.4

Release 6.4.0

Release Date: 2020-02-11

Pivotal Greenplum 6.4.0 is a minor release that includes changed features and resolves several issues.

Features

Greenplum Database 6.4.0 includes these changed features:

  • DISCARD ALL is not supported. The command returns a message stating that it is not supported and suggesting alternatives such as DEALLOCATE ALL or DISCARD TEMP. See DISCARD.
  • Greenplum Database resource groups support automatic query termination when resource group global shared memory is enabled. For resource groups that use global shared memory, Greenplum Database gracefully cancels running queries that are managed by those resource groups when the queries consume a large portion of the global shared memory. At this point, the queries would have already consumed available resource group slot memory and group shared memory. Greenplum Database cancels queries in order of memory used, from highest to lowest until the percentage of utilized global shared memory is below the percentage specified by the server configuration parameter runaway_detector_activation_percent.

    Global shared memory is enabled for resource groups that are configured to use the vmtracker memory auditor, such as admin_group and default_group, when the sum of the MEMORY_LIMIT attribute values configured for all resource groups is less than 100. If resource groups are not configured to use the global shared memory pool, the resource group automatic query termination feature is not enabled.

    For information about resource groups, see Using Resource Groups. A configuration sketch appears after this list.

  • Greenplum Database supports creating the standby master or mirror segment instances on hosts that are in a different subnet from the master and primary segment instance hosts. Mirror segment instances can also be moved to hosts that are in a different subnet.
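
A minimal configuration sketch for the automatic query termination behavior described in the resource group item above; the group names and limits are hypothetical, and the global shared memory pool is available only because the MEMORY_LIMIT values of all resource groups still sum to less than 100:

    -- Threshold (as a percentage of global shared memory) at which
    -- Greenplum Database begins cancelling the largest queries.
    SHOW runaway_detector_activation_percent;

    -- Groups that use the vmtracker memory auditor and leave part of
    -- system memory in the global shared memory pool.
    CREATE RESOURCE GROUP rg_etl       WITH (CPU_RATE_LIMIT=20, MEMORY_LIMIT=30);
    CREATE RESOURCE GROUP rg_reporting WITH (CPU_RATE_LIMIT=20, MEMORY_LIMIT=30);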

Resolved Issues

Pivotal Greenplum 6.4.0 resolves these issues:

30209, 170762049 - gpaddmirrors : The gpaddmirrors utility failed to add mirror segments to a Greenplum Database system, when the mirror segments are in a different subnet from the primary segments. This issue is resolved. Now Greenplum Database supports mirror segments in a different subnet from the primary segments.

30266 - Vacuum : The vacuum workflow changed in Greenplum Database 6 to dispatch once per auxiliary table in addition to the dispatch for the main table. This could cause performance problems and permission verification failures on auxiliary tables. This is fixed. The VACUUM command again dispatches once per segment.

30282 - Resource Management : For some queries managed by Greenplum Database resource groups, the query failed due to an out of memory condition. This caused a segment failure and other issues. This issue is resolved. Resource groups have been enhanced to handle queries that consume a large amount of memory. See Features.

30299 - Server: Execution : For some queries against a partitioned table with multiple column indexes that perform a dynamic index scan, Greenplum Database generated a SIGSEGV. The error occurred when Greenplum Database did not correctly manage dynamic index scan pointer memory. This issue is resolved.

30325 - Query Optimizer : For some queries that reference a view that is defined with a CTE (common table expression) query, and the main query also contains a predicate that is a subquery that references the view, GPORCA returns an error that states could not open temporary file. The error was caused by incorrect predicate pushdown during preprocessing. This issue is resolved.

30359 - Server: Segment Mirroring : A panic occurred during crash recovery for a mirror when replaying WAL records for a transaction that created an append-optimized table, truncated the table, and then aborted the transaction. This issue is resolved.

30354 - DISCARD : DISCARD TEMP might not drop temporary tables on segment instances. This issue is resolved.

30360 - Server: Security : Greenplum Database incorrectly logged the message time constraints added on superuser role every time a superuser role was checked. Now the message is logged only when time constraints are added to a superuser role.

30366 - Server: Execution : Using GRANT commands on partitioned tables across segments would cause the system to PANIC due to alterations in the cached plan. This issue has been resolved.

30371 - Server: Execution : The to_timestamp() function did not return an error for an out of range value. For example, select to_timestamp('20200340123456','YYYYMMDDHH24MISS'); returned a valid date and time 2020-04-09 12:34:56-07. This issue is resolved. Now the function returns an error.

30387 - Server: DML : After completing the execution of a user-defined function that changed the value of a server configuration parameter, Greenplum Database restored the original parameter value on the query dispatcher, but did not synchronize and restore the value to the query executors. This issue is resolved.

170787232 - Server: Query Dispatcher : A Query Dispatcher (QD) would run into local deadlock for some prepared statements that execute an UPDATE or DELETE command. This issue has been fixed by taking into consideration the server configuration parameter gp_enable_global_deadlock_detector.

Release 6.3

Release 6.3.0

Release Date: 2020-01-12

Pivotal Greenplum 6.3.0 is a minor release that includes new features and resolves several issues.

Features

Greenplum Database 6.3.0 includes these new features:

  • The server configuration parameter wait_for_replication_threshold is introduced to improve performance for Greenplum Database systems with segment mirroring enabled. The parameter specifies the maximum amount (in KB) of Write-Ahead Log (WAL) records written by a transaction on the primary segment instance before the records are written to the mirror segment instance for replication. By default, Greenplum Database writes the records to the mirror segment instance when a checkpoint occurs or when the wait_for_replication_threshold value is reached. See wait_for_replication_threshold.
  • The PL/Container version has been updated to 2.1.0. This version supports Docker images with Python 3 installed. These new PL/Container features enable support for Python 3:

    • A Docker image that is installed with Python 3 - plcontainer-python3-images-2.1.0.tar.gz

      The Docker image can be downloaded from Pivotal Network.

    • The new value python3 for the --language option of plcontainer runtime-add command.

      You specify this value when you add a Docker image that has Python 3 installed on to the Greenplum Database hosts with the plcontainer runtime-add command.

    • The GluonTS module has been added to the Python Data Science Module package. The Python Data Science modules are installed with Python 3 in the Docker image on Pivotal Network.

      Note: PL/Container 2.0.x and earlier do not support Python 3.

    For information about PL/Container, see PL/Container Language.

  • PXF version 5.10.1 is included, which introduces bug fixes.

  • The metrics collector extension included with Greenplum Database 6.3.0 adds a new gpcc.enable_query_profiling server configuration parameter that can be enabled to help with performance troubleshooting. When off, the default, the metrics collector does not collect queries executed by the gpmon user in the gpperfmon database or plan node history for queries that run in less than ten seconds. If you enable gpcc.enable_query_profiling in a session the metrics collector collects those queries in that session. See Metrics Collector Server Configuration Parameters in the Greenplum Command Center documentation for more information.

Resolved Issues

Pivotal Greenplum 6.3.0 is a minor release that resolves these issues:

30214 - Query Optimizer : The GPORCA algebrizer might generate a PANIC when optimizing a query involving a window function where one or more of the columns selected was a subquery. This issue is resolved.

30252 - Storage: Filespace / Tablespace : Data distribution errors could lead to data corruption if UPDATE or DELETE statements resulted in data movement between segments (for example, if the UPDATE of an affected tuple was distributed to another segment). The code was modified to check for distribution problems during UPDATE and DELETE operations, and to error out and cancel the operation in order to prevent data corruption. Problems detected in this manner are reported with the error: distribution key of the tuple doesn't belong to current segment (actually from segment_id)

30264 - Segment Mirroring : In some cases when Greenplum Database segment mirroring is enabled, loading a large amount of data caused mirror segment instances to fail due to a timeout issue. This issue has been resolved.

30300 - gpbackup/gprestore : When a view was restored from a backup and the view definition contained the function gp_dist_random(), the definition of the restored view did not contain the function. Greenplum Database has been updated to resolve this issue. Now the restored view contains the correct definition.

30301 - analyzedb : The analyzedb command could take a long time to complete when Greenplum Database incorrectly determined that the statistics for a child partition were not up to date and subsequently resampled the statistics for all partitions. This issue is resolved.

30320 - COPY : In some cases, Greenplum Database returned a segment reject limit reached error when a COPY operation specified a SEGMENT REJECT LIMIT and Greenplum encountered a data formatting error. Because Greenplum rescanned the offending line, it returned the error even before the error limit had been reached. This issue is resolved.

30321 - Query Optimizer : When computing the join order for certain queries, ORCA tries to create a set of all the tables that have a predicate in common with the current join tree, and then pick one of the tables from this set. In certain cases involving left joins, ORCA would error out if it was unable to pick a table to join from this set. This has now been fixed and ORCA no longer falls back for such queries.

30334 - Storage: DDL : A system panic could occur if ALTER TABLE was used to add a new partition, and WITH (OIDS=FALSE) was specified as the only storage parameter for the new partition. The problem was caused by code that failed to handle the possibility of a null reloption value, which is generated when the single default storage parameter is specified. The partitioning code was modified to correctly handle the possibility of such null values.

30348 - gpload : Attempting to run gpload installed with Greenplum Clients 6.2.1 returned this error: No module named gppylib.gpversion. This issue has been resolved.

169749131 - Segment Mirroring : When Greenplum Database segment mirroring is enabled, loading a large amount of data into append-optimized tables caused database performance issues and might have caused mirror instance failures with a walsender replication timeout error. The issues occurred because Greenplum Database was not efficiently writing transaction log records from primary to mirror segment instances. This issue has been resolved. Writing transaction log records from primary to mirror segment instances has been improved.

170021921 - Workload Manager : Transaction performance issues occurred when the gp_wlm extension was loaded and the Greenplum Command Center workload management feature was not enabled. This is fixed in the gp_wlm extension included with Greenplum Database 6.3.0.

170346082 - PXF : The PXF JDBC Connector did not support dynamic session authorization in a remote SQL database. This issue is resolved; PXF now supports session authorization, and introduces the ${pxf.session.user} value and the jdbc.pool.qualifier property as described in About Session Authorization in the JDBC Connector configuration documentation.

170476535 - PXF : PXF incorrectly wrote a Parquet decimal value that was specified with precision and scale settings, and was unable to read the decimal value back. This issue is resolved.

Release 6.2

Release 6.2.1

Release Date: 2019-12-12

Pivotal Greenplum 6.2.1 is a minor release that includes new features and resolves several issues.

New Features

Greenplum Database 6.2.1 includes these new features:

  • Greenplum Database supports materialized views. Materialized views are similar to views. A materialized view enables you to save a frequently used or complex query, then access the query results in a SELECT statement as if they were a table. Materialized views persist the query results in a table-like form. Materialized view data cannot be directly updated. To refresh the materialized view data, use the REFRESH MATERIALIZED VIEW command. See Creating and Managing Materialized Views. A short sketch appears after this list.

    Note: Known Issues and Limitations describes a limitation of materialized view support in Greenplum 6.2.1.
  • The gpinitsystem utility supports the --ignore-warnings option. The option controls the value returned by gpinitsystem when warnings or an error occurs. If you specify this option, gpinitsystem returns 0 if warnings occurred during system initialization, and returns a non-zero value if a fatal error occurs. If this option is not specified, gpinitsystem returns 1 if initialization completes with warnings, and returns a value of 2 or greater if a fatal error occurs.

  • PXF version 5.10.0 is included, which introduces several new and changed features and bug fixes. See PXF Version 5.10.0 below.
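
The sketch below illustrates the materialized view support described in the first item of this list; the sales table and column names are hypothetical:

    -- Persist the results of a frequently used aggregation.
    CREATE MATERIALIZED VIEW sales_by_region AS
        SELECT region, sum(amount) AS total_amount
        FROM sales
        GROUP BY region;

    -- Query the stored results as if they were a table.
    SELECT * FROM sales_by_region WHERE region = 'EMEA';

    -- Re-run the stored query to pick up newly loaded data.
    REFRESH MATERIALIZED VIEW sales_by_region;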

PXF Version 5.10.0

PXF 5.10.0 includes the following new and changed features:

  • PXF has improved its performance when reading a large number of files from HDFS or an object store.
  • PXF bundles newer tomcat and jackson libraries.
  • The PXF JDBC Connector now supports pushdown of OR and NOT logical filter operators when specified in a JDBC named query or in an external table query filter condition.
  • PXF supports writing Avro-format data to Hadoop and object stores. Refer to Reading and Writing HDFS Avro Data for more information about this feature.
  • PXF is now certified with Hadoop 2.x and 3.1.x and Hive Server 2.x and 3.1, and bundles new and upgraded Hadoop libraries to support these versions.
  • PXF supports Kerberos authentication to Hive Server 2.x and 3.1.x.
  • PXF supports per-server user impersonation configuration.
  • PXF supports concurrent access to multiple Kerberized Hadoop clusters. In previous releases of Greenplum Database, PXF supported accessing a single Hadoop cluster secured with Kerberos, and this Hadoop cluster must have been configured as the default PXF server.
  • PXF introduces a new template file, pxf-site.xml, to specify the Kerberos and impersonation property settings for a Hadoop or JDBC server configuration. Refer to About Kerberos and User Impersonation Configuration (pxf-site.xml) for more information about this file.
  • PXF now supports connecting to Hadoop with a configurable Hadoop user identity. PXF previously supported only proxy access to Hadoop via the gpadmin Greenplum user.
  • PXF version 5.10.0 deprecates the following configuration properties.

    Note: These property settings continue to work.
    • The PXF_USER_IMPERSONATION, PXF_PRINCIPAL, and PXF_KEYTAB settings in the pxf-env.sh file. You can use the pxf-site.xml file to configure Kerberos and impersonation settings for your new Hadoop server configurations.
    • The pxf.impersonation.jdbc property setting in the jdbc-site.xml file. You can use the pxf.service.user.impersonation property to configure user impersonation for a new JDBC server configuration.
Note: If you have previously configured a PXF JDBC server to access Kerberos-secured Hive, you must upgrade the server definition. See Upgrading PXF in Greenplum 6.x for more information.

Changed Features

Greenplum Database 6.2.1 includes these changed features:

  • Greenplum Stream Server version 1.3.1 is included in the Greenplum distribution.

Resolved Issues

Pivotal Greenplum 6.2.1 is a minor release that resolves these issues:

29454 - gpstart : During Greenplum Database start up, the gpstart utility did not report when a segment instance failed to start. The utility always displayed 0 skipped segment starts. This issue has been resolved. gpstart output was also enhanced to provide additional warnings and summary information about the number of skipped segments. For example:

    [WARNING]:-****************************************************************************
    [WARNING]:-There are 1 segment(s) marked down in the database
    [WARNING]:-To recover from this current state, review usage of the gprecoverseg
    [WARNING]:-management utility which will recover failed segment instance databases.
    [WARNING]:-****************************************************************************

30248, 9022 - DDL : Greenplum Database might generate a PANIC when an index is created on a column of an append-optimized, column-oriented table if the index definition contains a WHERE clause that references multiple columns. This issue has been resolved.
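
A sketch of the affected DDL, using a hypothetical append-optimized, column-oriented table; the partial index predicate references more than one column:

    CREATE TABLE readings (id int, sensor int, val numeric, flagged boolean)
        WITH (appendoptimized=true, orientation=column)
        DISTRIBUTED BY (id);

    -- The WHERE clause of the index references two columns; this
    -- previously could PANIC and now succeeds.
    CREATE INDEX readings_flagged_idx ON readings (val)
        WHERE flagged AND sensor > 100;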

7545 - Postgres Planner : The Postgres Planner might return incorrect results for queries that contain a subquery in an EXISTS clause if the subquery includes a LIMIT [ 0| ALL| NULL] clause or an OFFSET NULL clause. This issue has been resolved.
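
The query shape in question, sketched with hypothetical customers and orders tables:

    -- EXISTS subquery containing a LIMIT ALL clause; such queries could
    -- previously return incorrect results under the Postgres Planner.
    SELECT c.customer_id, c.name
    FROM customers c
    WHERE EXISTS (
        SELECT 1
        FROM orders o
        WHERE o.customer_id = c.customer_id
        LIMIT ALL
    );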

8590 - Postgres Planner : A query that used the Postgres planner could return incorrect results if it specified a volatile function in a LIMIT clause (for example, LIMIT (random() * 10)). This occurred because Greenplum evaluated the LIMIT clause separately on each segment instance to obtain a preliminary limit, before evaluating it once again as the query was dispatched. The problem was fixed by ensuring that volatile functions in a LIMIT clause are not pushed to segment instances for evaluation.

30083 - Postgres Planner : Fixed a problem in the Postgres planner that could result in the error variable not found in subplan target list. The issue applied to join queries where a table column had a user-prescribed CAST applied to it while appearing both in the select list and in a join condition. At the same time, the column was also part of a motion operator in the query plan.

30200 - Metrics Collector : Greenplum Database 6 stores tablespaces with non-default names as symlinks in the $MASTER_DATA_DIRECTORY/pg_tblspc directory and the metrics collector did not detect these tablespaces. The metrics collector now follows the symlinks to find the names of the tablespace directories and the data directories located in those tablespaces. After enabling the new metrics collector the tablespaces may not be visible in Greenplum Command Center for up to four hours.

30203 - Query Optimizer : When updating a table’s distribution column, Greenplum Database returned an error that states an UPDATE statement cannot update distribution columns if a btree index is defined on the distribution column and the UPDATE command contains an IN clause. The error was returned when Greenplum Database fell back to the Postgres planner to attempt the UPDATE operation. This issue has been resolved. Now GPORCA supports the specified type of updates.
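
A sketch of the scenario, with a hypothetical accounts table; GPORCA (the default optimizer) now plans this update instead of falling back:

    CREATE TABLE accounts (id int, status text) DISTRIBUTED BY (id);
    CREATE INDEX accounts_id_idx ON accounts USING btree (id);

    -- Updating the distribution column with an IN clause previously
    -- produced the "cannot update distribution columns" error.
    UPDATE accounts SET id = id + 1000 WHERE id IN (1, 2, 3);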

30206 - gpinitsystem : An example in the gpinitsystem help output used an invalid option for specifying the placement of mirror segment instances in a spread configuration. The correct option is --mirror_mode=spread. This issue has been resolved.

30227 - Server : Greenplum Database with resource groups enabled might generate a PANIC when using an extension with improper debug_query_string settings. The cause was a message context issue and it has been resolved.

30256 - analyzedb : When executing some queries against partitioned tables, GPORCA would fail because of missing root partition statistics. This was caused by the analyzedb utility not updating the root partition statistics when generating the partitioned table statistics. This issue has now been resolved.

30292 - External Table : When Greenplum Database attempted to access data from an external table, a PANIC was generated when Greenplum Database could not resolve the host name that is specified in the external table definition. This issue has been resolved. Now Greenplum Database returns an error in the specified situation.

168881383 - PXF : PXF fixed a regression in file and directory name pattern matching that affected the *:text:multi profiles and S3 Select. This issue has been resolved. PXF now correctly handles wildcards specified in the LOCATION data path.

8918 - Postgres Planner : The Postgres Planner generated an incorrect result on a JOIN query when different data types were used in a table column or the query constraints included a constant, and the query required motion. This issue is resolved.

169694492 - Query Optimizer : For a table that has a column that is defined with a btree index, GPORCA fell back to the Postgres planner for queries that use IN clause against the column or an OR of simple comparisons on the column such as col = 5 OR col = 7. Now GPORCA attempts to generate a query plan that uses the index.
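
For illustration, the predicate shapes mentioned above, against a hypothetical events table with a btree index:

    CREATE TABLE events (col int, payload text) DISTRIBUTED RANDOMLY;
    CREATE INDEX events_col_idx ON events USING btree (col);

    -- GPORCA now attempts index plans for these forms rather than
    -- falling back to the Postgres Planner.
    SELECT * FROM events WHERE col IN (5, 7);
    SELECT * FROM events WHERE col = 5 OR col = 7;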

169806983 - Greenplum Stream Server : In some cases, reading from Kafka using the default MINIMAL_INTERVAL (0 seconds) caused GPSS to consume a large amount of CPU resources, even when no new messages existed in the Kafka topic. This issue is resolved in GPSS 1.3.1.

169807372, 169831558 - Greenplum Stream Server : GPSS 1.3.0 did not recognize internal history tables that were created with GPSS 1.2.6 and earlier. In some cases, this caused GPSS to load duplicate messages into Greenplum Database. This issue is resolved in GPSS 1.3.1.

170041280 - PXF : PXF was unable to read data from an encrypted HDFS zone and returned an org.apache.hadoop.crypto.CryptoInputStream cannot be cast to org.apache.hadoop.hdfs.DFSInputStream error in this situation. This issue is resolved.

Release 6.1

Release 6.1.0

Release Date: 2019-11-01

Pivotal Greenplum 6.1.0 is a minor release that includes new features and resolves several issues.

Features

Greenplum Database 6.1.0 includes these new features:

  • Greenplum Stream Server 1.3 is included, which introduces new features and bug fixes.

    Note: Greenplum Stream Server (GPSS) and Greenplum-Kafka Integration users: Do not upgrade to Greenplum Database 6.1 if you plan to re-submit Kafka load jobs that you initiated with GPSS in Greenplum 6.0.x. Due to a regression, GPSS may load duplicate Kafka messages into Greenplum. Refer to Known Issues and Limitations for more information.

    New GPSS features include:

    • GPSS now supports log rotation, utilizing a mechanism that you can easily integrate with the Linux logrotate system. See Managing GPSS Log Files for more information.
    • GPSS has added the new INPUT:FILTER load configuration property. This property enables you to specify a filter that GPSS applies to Kafka input data before loading it into Greenplum Database.
    • GPSS displays job progress by partition when you provide the --partition flag to the gpsscli progress command.
    • GPSS enables you to load Kafka data that was emitted since a specific timestamp into Greenplum Database. To use this feature, you provide the --force-reset-timestamp flag when you run gpsscli load, gpsscli start, or gpkafka load.
    • GPSS now supports update and merge operations on data stored in a Greenplum Database table. The load configuration file accepts MODE, MATCH_COLUMNS, UPDATE_COLUMNS, and UPDATE_CONDITION property values to direct these operations. Example: Merging Data from Kafka into Greenplum Using the Greenplum Stream Server provides an example merge scenario.
    • GPSS supports Kerberos authentication to both Kafka and Greenplum Database.
    • GPSS supports SSL encryption between GPSS and Kafka.
    • GPSS supports SSL encryption on the data channel between GPSS and Greenplum Database.
  • The DataDirect JDBC and ODBC drivers were updated to versions 5.1.4.000270 (F000450.U000214) and 07.16.0334 (B0510, U0363), respectively.

    The DataDirect JDBC driver introduces support for the prepareThreshold connection parameter, which specifies the number of prepared statement executions that can be performed before the driver switches to using server-side prepared statements. This parameter defaults to 0, which preserves the earlier driver behavior of always using server-side prepare for prepared statements. Set a number greater than 1 to set a threshold after which server-side prepare is used.

    Note: ExecuteBatch() always uses server-side prepare for prepared statements. This matches the behavior of the Postgres open source driver.

    When the prepareThreshold value is greater than 1, parameterized operations do not send any SQL prepare calls with connection.prepareStatement(). The driver instead sends the query all at once, at execution time. Because of this limitation, the driver must determine the type of every column using the JDBC API before sending the query to the server. This determination works for many data types, but does not work for the following types that could be mapped to multiple Greenplum data types:

    • BIT VARYING
    • BOOLEAN
    • JSON
    • TIME WITH TIME ZONE
    • UUIDCOL

    You must set prepareThreshold to 0 before using parameterized operations with any of the above types. Examine the ResultSetMetaData object in advance to determine if any of the above types are used in a query. Also keep in mind that GPORCA does not support prepared statements that have parameterized values, and will fall back to using the Postgres Planner.

    See PrepareThreshold in the DataDirect documentation.

Resolved Issues

Pivotal Greenplum 6.1.0 is a minor release that resolves these issues:

8804 - Server : In some cases, running the EXPLAIN ANALYZE command on a sorted query in utility mode would cause the segment to crash. This issue is fixed. Greenplum Database no longer crashes in this situation.

8636 - Server : Some users encountered Error: unrecognized parameter "appendoptimized" while creating a partitioned table that specified the appendoptimized=true storage parameter. This issue is fixed; the Greenplum Database server now properly recognizes the appendoptimized parameter when it is specified on partition table creation.
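
A sketch of partitioned table DDL that uses the appendoptimized storage parameter; the table definition is hypothetical:

    -- Previously returned "unrecognized parameter" when the storage
    -- parameter was specified at partitioned table creation.
    CREATE TABLE measurements (id int, reading numeric, logdate date)
        WITH (appendoptimized=true, orientation=column)
        DISTRIBUTED BY (id)
        PARTITION BY RANGE (logdate)
        (START (date '2019-01-01') INCLUSIVE
         END (date '2020-01-01') EXCLUSIVE
         EVERY (INTERVAL '1 month'));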

26225 - gpcheckcat : The gpcheckcat utility failed to generate a summary report if there was an orphan TOAST table entry in one of the segments. This is fixed. The string “N/A” is reported when there is no relation OID to report.

29580 - Management and Monitoring : During Greenplum Database startup, an extra empty log file was produced ahead of the current date while performing time-based rotation of log files. For example, if Greenplum started at midnight September 2nd, two log files were generated, gpdb-2019-09-02_000000.csv and gpdb-2019-09-03_000000.csv. This issue has now been fixed.

29984 - Server : During startup, idle query executor (QE) processes can commit up to 16MB of memory each, but they are not tracked by the Linux virtual memory tracker. In a worst-case scenario, these idle processes could trigger OOM errors that were difficult to diagnose. To prevent these situations, Greenplum now hard-codes a startup memory cost to account for untracked QE processes.

30112 - Query Optimizer : For some queries against partitioned tables that contain a large amount of data, GPORCA generated a sub-optimal query plan because of inaccurate cardinality estimation. This issue has been resolved. GPORCA cardinality estimation has been improved.

30183, 30184 - analyzedb : When running the analyzedb command with the --skip_root_stats option, the command could take a long time to finish when analyzing a partitioned table with many partitions due to how the root partition statistics were handled when the partitions were analyzed. This issue has been resolved. Now, only partition statistics are updated.

Note: GPORCA uses root partition statistics. If you use the --skip_root_stats option, you should ensure that root partition statistics are up to date so that GPORCA does not produce inferior query plans due to stale root partition statistics.

30149 - Query Execution : A query might fail and return an error with the message invalid seek in sequential BufFile when the server configuration parameter gp_workfile_compression is on and the query spills to temporary workfiles. The error was caused by an issue working with workfiles that contain compressed data. The issue has been resolved by correctly handling the compressed workfile data.

30150 - Query Execution : A query might fail and return an error with the message AssignTransactionId() called by Segment Reader process when the server configuration parameter temp_tablespaces is set. The error was caused by an internal locking and transaction ID issue. This issue has been resolved by removing the requirement to acquire the lock.

30160 - Query Optimizer : GPORCA might return incorrect results when the query contains a join predicate where one side is distributed on a citext column, and the other is not. GPORCA did not use the correct hash when generating a plan that redistributes the citext column. Now Greenplum Database falls back to the Postgres Planner for the specified type of query.

30183 - analyzedb : The analyzedb command could take a long time to finish when analyzing a table with many partitions. The command’s performance has been greatly improved by waiting to update the root partition statistics until all leaf partitions of a table have been analyzed.

164823612 - gpss : GPSS incorrectly treated Kafka jobs that specified the same Kafka topic and Greenplum output schema name and output table name, but different database names, as the same job. This issue has been resolved. GPSS now includes the Greenplum database name when constructing a job definition.

167997441 - gpss : GPSS did not save error data to the external table error log when it encountered an incorrectly-formatted JSON or Avro message. This issue has been fixed; invoking gp_read_error_log() on the external table now displays the offending data.

168130147 - gpss : In some situations, specifying the --force-reset-earliest flag when loading data failed to read from the correct offset. This problem has been fixed. (Using the --force-reset-xxx flags outside of an offset mismatch scenario is discouraged.)

168393571 - Query Optimizer : Certain queries with btree indexes on Append Optimized (AO) tables were unnecessarily slow due to GPORCA selecting a scan with high transformation and cost impact. This issue has been fixed by improving GPORCA handling of btree type indexes.

168393645 - Query Optimizer : In some situations, a query ran slow because GPORCA did not produce an optimal plan when it encountered a null-rejecting predicate where an operand could be false or null, but not true. This issue is fixed; GPORCA now produces a more optimal plan when evaluating null-rejecting predicates for AND and OR operands.

168705484 - Query Optimizer : For certain queries with a UNION operator over a large number of children, GPORCA query optimization required a long time. This issue has been addressed by adding the ability to derive scalar properties on demand.

168707515 - Query Optimizer : Some queries in GPORCA were consuming more memory than necessary due to suboptimal memory tracking. This has been fixed by optimizing memory accounting inside GPORCA.

169081574 - Interconnect : Greenplum Database might generate a PANIC when the server configuration parameter gp_interconnect_type is TCP due to an issue with memory management during interconnect setup. The issue has been resolved by properly managing the internal interconnect object memory.

169117536 - Execution : Greenplum Database might generate a PANIC when the server configuration parameter log_min_messages is set to debug5. Greenplum Database did not handle a debug5 message correctly. The issue is resolved.

169198230 - Plan Cache : A prepared statement might run slow because a cost model issue prevented Greenplum Database from generating a direct dispatch plan for the statement. This issue is fixed. Greenplum Database now introduces non-direct dispatch cost into the cost model only for cached plans, and tries to use direct dispatch for prepared statements when possible.

Release 6.0

Release 6.0.1

Release Date: 2019-10-11

Pivotal Greenplum 6.0.1 is a maintenance release that includes changed features and resolves several issues.

Changed Features

Greenplum Database 6.0.1 includes these changed features:

  • The default value for the server configuration parameter optimizer_use_gpdb_allocators has been changed to true. Now, as the default, GPORCA uses Greenplum Database memory management when executing queries instead of GPORCA-specific memory management. Greenplum Database memory management has several enhancements when compared to GPORCA-specific memory management. See optimizer_use_gpdb_allocators.
  • Writing parquet data using the PXF Hadoop and object store connectors is no longer considered a Beta feature in this release.

Resolved Issues

Pivotal Greenplum 6.0.1 is a maintenance release that resolves these issues:

29712 - Query Execution : Greenplum Database writes unnecessary could not unlink file log messages for spill files when gp_enable_query_metrics is on and log_min_messages is set to INFO. This issue has been resolved. Logging has been improved to not write the log message.

30058 - Query Execution : An internal EXPLAIN function, cdbexplain_localExecStats, operated under the assumption that it was always executed by a query dispatcher (QD) process. However, certain queries could generate plans where the function was executed by a query executor (QE) process. Running such queries with EXPLAIN ANALYZE would cause all segments to crash with segment faults, and error messages referencing cdbexplain_localExecStats. This problem has been resolved.

30094 - Resource Management : In some cases Greenplum Database generated a PANIC when a query was terminated and the query involved catalog tables. The PANIC was caused when a backend process did not properly clean up shared memory before exiting. This issue has been resolved. Now backend process memory management has been improved for the specified situation.

30098 - COPY : Greenplum Database generated a PANIC when a COPY command attempted to write to a catalog table. This issue has been resolved; Greenplum Database now returns an error.

30120 - GRANT : The command GRANT ALL ON ALL TABLES IN SCHEMA <schema> TO <role> caused a PANIC when tables are partitioned or inherited by child tables. This issue has been fixed.

30130 - gpexpand : The gpexpand utility might have failed with a Cannot allocate memory error when system memory or swap space is low. The error was due to an issue with a python process library. This has been resolved by updating the python library.

165660593 - Resource Groups : When resource groups are enabled, Greenplum Database might return an out of memory error when executing a SET or SHOW command, or when executing a query when the Greenplum Database server configuration parameter gp_resource_group_bypass is set to true. The error is due to an issue with resource group memory accounting. This issue has been resolved; resource group memory accounting has been improved for individual statements.

167847839 - gpconfig : The code dispatched from the master to segments to set a configuration parameter enclosed the value in single quotes. This did not handle values containing embedded single quotes, and parameters with the GUC_LIST_QUOTE flag, such as search_path, ended up having different values on the segments than on the master. For example, SELECT set_config('search_path', 'my_schema,public', false); was dispatched as SET search_path TO 'my_schema,public'; instead of SET search_path TO my_schema,public; This is fixed. The set_config() call is now dispatched to the segments as well, passing the (quoted) arguments directly so that the same code runs on the master and segments.

Known Issue 167851039 - PXF : pxf cluster reset did not reset PXF configuration on the standby master. This issue is fixed; the command now resets PXF configuration on all hosts in the Greenplum cluster, including the standby master.

Known Issue 167851065 - PXF : PXF allowed you to initialize PXF without setting PXF_CONF. This issue has been resolved. PXF now correctly checks for this setting before continuing with initialization.

Known Issue 167948506 - PXF : In some cases, accessing an S3 object store with PXF failed when PXF was configured for Kerberized Hadoop. This issue is fixed. PXF now handles token renewal appropriately on concurrent access to S3 and a Kerberized Hadoop cluster.

While not required, if you previously implemented any of the workarounds for this issue that are described in Known Issues and Limitations, you may consider reverting the s3-site.xml modifications or removing the yarn-site.xml file from your S3 server directory.

168167337 - gpinitsystem : When running gpinitsystem to initialize a Greenplum Database system with mirroring enabled, the utility configured the pg_hba.conf file in a manner that did not permit incremental recovery of the primary segment instances. This issue is fixed in release 6.0.1. The gpexpand, gprecoverseg, and gpaddmirrors utilities were also updated to ensure that primary and mirror segments always have compatible pg_hba.conf entries in place after performing their respective operations.

Known Issue 168271005 - PXF : The PXF JDBC Connector failed to access any database when PXF was configured to use the MapR Hadoop distribution, and MapR libraries were present in $PXF_CONF/lib. This issue has been resolved. A PXF MapR server configuration no longer affects JDBC access using PXF.

168759361 - psql : Greenplum Database was previously built and dynamically linked against libedit as a required dependency. However, the version of libedit that is available on Redhat 7 is not compatible with features such as tab-completion in psql. The readline library provides the same functionality as libedit, and all versions available across all supported platforms are compatible with the desired features. Therefore, instead of dynamically linking to libedit, Greenplum Database is now being built and dynamically linked to readline. This changes the required dependencies for the installers from libedit to readline.

Release 6.0.0

Release Date: 2019-09-03

Pivotal Greenplum 6.0.0 is a major new release of Greenplum that includes new and changed features.

New Features

Pivotal Greenplum 6 includes these new features.

PostgreSQL Core Features

Pivotal Greenplum 6 incorporates several new features from PostgreSQL versions 8.4 through version 9.4.

INTERVAL Data Type Handling

PostgreSQL 8.4 improves the parsing of INTERVAL literals to align with SQL standards. This changes the output for queries that use INTERVAL labels between versions 5.x and 6.x. For example:

$ psql
psql (8.3.23)
Type "help" for help.

gpadmin=# select INTERVAL '1' YEAR;
 interval
----------
 00:00:00
(1 row)
$ psql
psql (9.2beta2)
Type "help" for help.

gpadmin=# select INTERVAL '1' YEAR;
 interval
----------
 1 year
(1 row)

See Date/Time Types for more information.

Additional PostgreSQL Features

Greenplum Database 6.0 also includes these features and changes from PostgreSQL:

  • Support for user-defined I/O conversion casts (PostgreSQL 8.4).
  • Support for column-level privileges (PostgreSQL 8.4).
  • The pg_db_role_setting catalog table, which provides support for setting server configuration parameters for a specific database and role combination (PostgreSQL 9.0).
  • Values in the relkind column of the pg_class catalog table were changed to match entries in PostgreSQL 9.3.
  • Support for GIN index method (PostgreSQL 8.3).
  • Postgres Planner support for the SP-GiST index access method (PostgreSQL 9.2). (GPORCA ignores SP-GiST indexes.)
  • Postgres Planner support for ordered-set aggregates and moving-aggregates (PostgreSQL 9.4).
  • Support for jsonb data type (PostgreSQL 9.4).
  • DELETE, INSERT, and UPDATE support the WITH clause for CTEs (common table expressions) (PostgreSQL 9.1).
  • Collation support to specify sort order and character classification behavior for data at the column level (PostgreSQL 9.1).

    Note: GPORCA supports collation only when all columns in the query use the same collation. If columns in the query use different collations, then Greenplum uses the Postgres Planner.

Zstandard Compression Algorithm

Greenplum Database 6.0 adds support for zstd (Zstandard) compression for some database operations. See Enabling Compression.

Relaxed Rules for Specifying Table Distribution Columns

In previous releases, if you specified both a UNIQUE constraint and a DISTRIBUTED BY clause in a CREATE TABLE statement, then the DISTRIBUTED BY clause was required to be equal to or a left-subset of the UNIQUE columns. Greenplum 6.x relaxes this rule so that any subset of the UNIQUE columns is accepted.
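
For example, a sketch with a hypothetical orders table; the distribution key is a subset, but not a left-subset, of the UNIQUE columns, which Greenplum 6.x now accepts:

    CREATE TABLE orders (
        region_id  int,
        order_id   bigint,
        order_date date,
        UNIQUE (region_id, order_id)
    ) DISTRIBUTED BY (order_id);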

This change also affects the rules for how Greenplum 6.x selects a default distribution key. If gp_create_table_random_default_distribution is off (the default) and you do not include a DISTRIBUTED BY clause, then Greenplum chooses the table distribution key based on the command:

  • If a LIKE or INHERITS clause is specified, then Greenplum copies the distribution key from the source or parent table.
  • If PRIMARY KEY or UNIQUE constraints are specified, then Greenplum chooses the largest subset of all the key columns as the distribution key.
  • If neither constraints nor a LIKE or INHERITS clause is specified, then Greenplum chooses the first suitable column as the distribution key. (Columns with geometric or user-defined data types are not eligible as Greenplum distribution key columns.)

Resource Groups Features

Greenplum Database includes these new resource group features:

  • You are no longer required to specify a MEMORY_LIMIT when you configure a Greenplum Database resource group. When you specify MEMORY_LIMIT=0, Greenplum Database will use the resource group global shared memory pool to service queries running in the group.
  • When you specify MEMORY_SPILL_RATIO=0, Greenplum Database will now use the statement_mem server configuration parameter setting to identify the initial amount of query operator memory.

    When used together to configure a resource group (MEMORY_LIMIT=0 and MEMORY_SPILL_RATIO=0), these new capabilities provide a memory management scheme similar to that provided by Greenplum Database resource queues.

    The default values of the MEMORY_SHARED_QUOTA, MEMORY_SPILL_RATIO, and MEMORY_LIMIT attributes for the admin_group and default_group resource groups have been set to use this resource queue-like memory management scheme so that when you initially enable resource groups, your queries will run in a memory environment similar to before.

    Resource Group        admin_group   default_group
    MEMORY_LIMIT          10            0
    MEMORY_SHARED_QUOTA   80            80
    MEMORY_SPILL_RATIO    0             0

PL/pgSQL Procedural Language Enhancements

PL/pgSQL in Greenplum Database 6.0 includes support for the following new features:

  • Attaching DETAIL and HINT text to user-thrown error messages. You can also specify the SQLSTATE and SQLERRMSG codes to return on a user-thrown error (PostgreSQL 8.4).
  • The RETURN QUERY EXECUTE statement, which specifies a query to execute dynamically (PostgreSQL 8.4).
  • Conditional execution using the CASE statement (PostgreSQL 8.4). See Conditionals in the PostgreSQL documentation. A combined sketch follows this list.
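
A short combined sketch of these PL/pgSQL additions, using a hypothetical function; it raises a user-thrown error with DETAIL and HINT text and uses the CASE statement:

    CREATE OR REPLACE FUNCTION check_qty(qty int) RETURNS text AS $$
    BEGIN
        CASE
            WHEN qty < 0 THEN
                -- User-thrown error with attached DETAIL and HINT text.
                RAISE EXCEPTION 'negative quantity: %', qty
                      USING DETAIL = 'Quantities must be zero or greater.',
                            HINT = 'Check the input feed.';
            WHEN qty = 0 THEN
                RETURN 'empty';
            ELSE
                RETURN 'ok';
        END CASE;
    END;
    $$ LANGUAGE plpgsql;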

Replicated Table Data

The CREATE TABLE command supports DISTRIBUTED REPLICATED as a distribution policy. If this distribution policy is specified, Greenplum Database distributes all rows of the table to all segment instances in the Greenplum Database system.

Note: The hidden system columns (ctid, cmin, cmax, xmin, xmax, and gp_segment_id) cannot be referenced in user queries on replicated tables because they have no single, unambiguous value. Greenplum Database returns a column does not exist error for the query.
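
A minimal sketch of a replicated table; the table is hypothetical and would typically be a small dimension or lookup table:

    -- Every segment instance stores a full copy of the table's rows.
    CREATE TABLE country_codes (
        code text,
        name text
    ) DISTRIBUTED REPLICATED;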

Concurrency Improvements in Greenplum 6

Greenplum Database 6 includes the following concurrency improvements:

  • Global Deadlock Detector - Previous versions of Greenplum Database prevented global deadlock by holding exclusive table locks for UPDATE and DELETE operations. While this strategy did prevent deadlocks, it came at the cost of poor performance on concurrent updates. Greenplum Database 6 includes a global deadlock detector. This backend process collects and analyzes lock waiting data in the Greenplum cluster. If the Global Deadlock Detector determines that deadlock exists, it breaks the deadlock by cancelling one or more backend processes. By default, the global deadlock detector is disabled and table-level exclusive locks are held for table updates. When the global deadlock detector is enabled, Greenplum Database holds row-level exclusive locks and concurrent updates are allowed. See Global Deadlock Detector.
  • Transaction Lock Optimization - Greenplum Database 6 optimizes transaction lock usage both when you BEGIN and COMMIT a transaction. This benefits highly concurrent mixed workloads.
  • Upstream PostgreSQL Features - Greenplum 6 includes upstream PostgreSQL features, including those for fastpath lock, which reduce lock contention. This benefits concurrent short queries and mixed workloads.
  • VACUUM can more easily skip pages it cannot lock. This reduces the frequency of a vacuum appearing to be “stuck,” which occurs when VACUUM waits to lock a block for cleanup and another session has held a lock on the block for a long time. Now VACUUM skips a block it cannot lock and retries the block later.
  • VACUUM rechecks block visibility after it has removed dead tuples. If all remaining tuples in the block are visible to current and future transactions, the block is marked as all-visible.
  • The tables that are part of a partitioned table hierarchy, but that do not contain data, are age-frozen so that they do not have to be vacuumed separately and do not affect calculation of the number of remaining transaction IDs before wraparound occurs. These tables include the root and intermediate tables in the partition hierarchy and, if they are append-optimized, their associated meta-data tables. This makes it unnecessary to vacuum the root partition to reduce the table’s age, and eliminates the possibly needless vacuuming of all of the child tables.

Additional Contrib Modules

Greenplum Database 6 is distributed with these additional PostgreSQL and Greenplum contrib modules:

PXF Version 5.8.1

Greenplum Database 6.0 includes PXF 5.8.1, which introduces the following new and changed features:

  • The PXF S3 Connector now supports accessing CSV and Parquet data on S3 using the Amazon S3 Select service. Refer to Reading CSV and Parquet Data on S3 Using S3 Select.
  • PXF bundles new and upgraded libraries to provide Java 11 support.
  • PXF has added support for the timestamptz type when writing Parquet data to an external data source.
  • PXF now provides a reset command to reset your local PXF server instance, or all PXF server instances in the cluster, to an uninitialized state. See Resetting PXF.
  • PXF no longer supports specifying a DELIMITER in the CREATE EXTERNAL TABLE command LOCATION URI.

Additional Greenplum Database Features

Greenplum Database 6.0 also includes these features and changes from version 5.x:

  • Recursive WITH Queries (Common Table Expressions) are no longer considered a Beta feature, and are now enabled by default. See WITH Queries (Common Table Expressions) in the Pivotal Greenplum Database Documentation.
  • VACUUM was updated to more easily skip pages that cannot be locked. This change should greatly reduce the incidence of VACUUM getting “stuck” while waiting for other sessions.
  • appendoptimized alias for the appendonly table storage option.
  • New gp_resgroup_status_per_host and gp_resgroup_status_per_segment gp_toolkit views to display resource group CPU and memory usage on a per-host and/or per-segment basis.
  • The new gp_stat_replication view contains replication statistics when master or segment mirroring is enabled. The pg_stat_replication view contains only master replication statistics.
  • The gpfdists and psql programs in the Greenplum Client and Loader Tools package for Windows support OpenSSL encryption.
  • Greenplum 6 includes some PostgreSQL 9.6 aggregate-related performance improvements.
  • The gpload utility program provided in the Greenplum Client and Loader Tools package for Windows is compatible with Greenplum Database 5.

Greenplum 6.0 Beta Features

Because Pivotal Greenplum Database is based on the open source Greenplum Database project code, it includes several Beta features to allow interested developers to experiment with their use on development systems. Feedback will help drive development of these features, and they may become supported in future versions of the product.

Warning: Beta features are not recommended or supported for production deployments.

Key experimental features in Greenplum Database 6.0 include:

  • Storage plugin API for gpbackup and gprestore. Partners, customers, and OSS developers can develop plugins to use in conjunction with gpbackup and gprestore.

    For information about the storage plugin API, see Backup/Restore Storage Plugin API in the Pivotal Greenplum Database Documentation.

  • Using the Greenplum Platform Extension (PXF) connectors to write Parquet data is a Beta feature.

Changed Features

Greenplum Database 6.0 includes these feature changes:

  • The performance characteristics of Greenplum Database under heavy loads have changed in version 6 as compared to previous versions. In particular, you may notice increased I/O operations on primary segments for changes related to GPSS, WAL replication, and other features. All customers are encouraged to perform load testing with real-world data to ensure that the new Greenplum 6 cluster configuration meets their performance needs.
  • gpbackup and gprestore are no longer installed with Greenplum Database 6, but are available separately on Pivotal Network and can be upgraded separately from the core database installation.
  • Greenplum 6 uses a new jump consistent hash algorithm to map hashed data values to Greenplum segments. The new algorithm ensures that, after new segments are added to the Greenplum 6 cluster, only those rows that hash to the new segment need to be moved. Greenplum 6 hashing has performance characteristics similar to earlier Greenplum releases, but should enable faster database expansion. Note that the new algorithm is more CPU intensive than the previous algorithm, so COPY performance may degrade somewhat on CPU-bound systems.
  • The older, legacy hash functions are represented as non-default hash operator classes, named cdbhash_*_ops. The non-default operator classes are used when upgrading from Greenplum Database earlier than 6.0. The legacy operator classes are compatible with each other, but if you mix the legacy operator classes with the new ones, queries will require Redistribute Motions.

    The server configuration parameter gp_use_legacy_hashops controls whether the legacy or default hash functions are used when creating tables that are defined with a distribution column.

    The gp_distribution_policy system table now contains more information about Greenplum Database tables and the policy for distributing table data across the segments, including the operator class of the distribution hash functions.
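
    For example, a minimal sketch, assuming a hypothetical table named sales (the gp_distribution_policy column names reflect the Greenplum 6 catalog described above):

    -- use the legacy hash functions for tables created in this session
    SET gp_use_legacy_hashops = on;
    -- inspect the distribution policy and hash operator classes of a table
    SELECT localoid::regclass, policytype, distkey, distclass
    FROM gp_distribution_policy
    WHERE localoid = 'sales'::regclass;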

  • The gpcheck utility is no longer included in Greenplum Database 6.

  • The input file format for the gpmovemirrors, gpaddmirrors, gprecoverseg and gpexpand utilities has changed. Instead of using a colon character : as a separator, the new file format uses a pipe character |. For example, in previous releases a line in a gpexpand input file would resemble:

    sdw5:sdw5-1:50011:/gpdata/primary/gp9:11:9:p
    

    The updated file format is:

    sdw5|sdw5-1|50011|/gpdata/primary/gp9|11|9|p
    

    In addition, gpaddmirrors removes the mirror prefix from lines in its input file. Whereas a line from the previous release might resemble:

    mirror0=0:sdw1:sdw1-1:52001:53001:54001:/gpdata/mir1/gp0
    

    The revised format is:

    0=0|sdw1|sdw1-1|52001|53001|54001|/gpdata/mir1/gp0
    
  • Greenplum now uses direct dispatch for queries that filter on the table distribution key column(s) with IS NULL, just as it does for queries that filter on specific distribution key values.

  • The gpinitsystem option to specify the standby master data directory changed from -F to -S. The -S option no longer specifies spread mirroring. A new gpinitsystem option is introduced to specify the mirroring configuration: --mirror-mode={group|spread}.

  • The default value of the server configuration parameter log_rotation_size has changed from 0 to 1GB. This changes the default log rotation behavior so that a new log file is opened when more than 1GB has been written to the current log file or when the current log file has been open for 24 hours.

  • The default value of the server configuration parameter effective_cache_size has changed from 512MB to 16GB.

  • The gpssh-exkeys utility now requires that you have already set up passwordless SSH from the master host to every other host in the cluster. Running gpssh-exkeys then sets up passwordless SSH from every host to every other host.

  • The gpstop smart shutdown behavior has changed. Previously, if you ran gpstop -M smart (or just gpstop), the utility exited with a message if there were any active client connections. Now, gpstop waits for current connections to finish before completing the shutdown. If any connections remain open after the timeout period, or if you interrupt with CTRL-C, gpstop lists the open connections and prompts whether to continue waiting for connections to finish, or to perform a fast or immediate shutdown. The default timeout period is 120 seconds and can be changed with the -t timeout_seconds option.

  • In the pg_stat_activity and pg_stat_replication system views, the procpid column was renamed to pid to match the associated change in PostgreSQL 9.2.

  • In the pg_proc system table, the proiswin column was renamed to proiswindow and relocated in the table to match the pg_proc system table in PostgreSQL 8.4.

  • Queries that use SELECT DISTINCT and UNION/INTERSECT/EXCEPT no longer necessarily return sorted output. Previously these queries always removed duplicate rows by using Sort/Unique processing. They now implement hashing to conform to behavior introduced in PostgreSQL 8.4; this method does not produce sorted output. If your application requires sorted output for these queries, alter the queries to use an explicit ORDER BY clause. Note that SELECT DISTINCT ON never uses hashing, so its behavior is unchanged from previous versions.
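
    For example, a query that previously returned sorted output as a side effect should now request ordering explicitly (hypothetical table and column names):

    SELECT DISTINCT state
    FROM customers
    ORDER BY state;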

  • In the gp_toolkit schema, the gp_resgroup_config view no longer contains the columns proposed_concurrency, proposed_memory_limit, proposed_memory_shared_quota and proposed_memory_spill_ratio.

  • In the pg_resgroupcapability system table, the proposed column has been removed.

  • The pg_database system table datconfig column was removed. Greenplum Database now uses the pg_db_role_setting system table to keep track of per-database and per-role server configuration settings (PostgreSQL 9.0).

  • The pg_aggregate system table aggordered column was removed, and several new columns were added to the table to support ordered-set aggregates and moving-aggregates with the Postgres Planner (PostgreSQL 9.4). The ALTER/CREATE/DROP AGGREGATE SQL command signatures have also been updated to reflect the pg_aggregate catalog changes.

  • The pg_authid system table rolconfig column was removed. Greenplum Database now uses the pg_db_role_setting system table to keep track of per-database and per-role server configuration settings (PostgreSQL 9.0).
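
    For example, per-role settings are still defined with ALTER ROLE, and the stored values can be inspected in pg_db_role_setting (hypothetical role name and setting):

    ALTER ROLE report_user SET search_path = reporting, public;
    SELECT setdatabase, setrole, setconfig FROM pg_db_role_setting;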

  • When creating and altering a table that has a distribution column, you can now specify the hash function used to distribute data across segment instances.
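
    For example, a minimal sketch that distributes on a column using one of the legacy cdbhash_*_ops operator classes mentioned earlier; the table name and operator class shown are illustrative:

    CREATE TABLE legacy_dist (id int, note text)
    DISTRIBUTED BY (id cdbhash_int4_ops);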

  • Pivotal Greenplum Database 6 removes the RECHECK option from ALTER OPERATOR FAMILY and CREATE OPERATOR CLASS DDL (PostgreSQL 8.4). Greenplum now determines whether an index operator is “lossy” on-the-fly at runtime.

  • Operator-related system catalog tables are modified to support operator families, compatibility, and types (ordering or search).

  • System catalog table entries for HyperLogLog (HLL) functions, aggregates, and types are modified to prefix names with gp_. Renaming the HLL functions prevents name collisions with external Greenplum Database extensions that use HLL. Any user code written to use the built-in Greenplum Database HLL functions must be updated to use the new gp_ names.

  • The “legacy optimizer” from previous releases of Greenplum is now referred to as the Postgres planner in both the code and documentation.

  • The transaction isolation levels in Greenplum Database 6.0 are changed to align with PostgreSQL transaction isolation levels since the introduction of the serializable snapshot isolation (SSI) mode in PostgreSQL 9.1. The new SSI mode, which is not implemented in Greenplum Database, provides true serializability by monitoring concurrent transactions and rolling back transactions that could introduce a serialization anomaly. The existing snapshot isolation (SI) mode guarantees that transactions operate on a single, consistent snapshot of the database, but does not guarantee a consistent result when a set of concurrent transactions is executed in any given sequence.

    Greenplum Database 6.0 now allows the REPEATABLE READ keywords with SQL statements such as BEGIN and SET TRANSACTION. A SERIALIZABLE transaction in PostgreSQL 9.1 or later uses the new SSI mode. A SERIALIZABLE transaction in Greenplum Database 6.0 falls back to REPEATABLE READ, using the SI mode. The following table shows the SQL standard compliance for each transaction isolation level in Greenplum Database 6.0 and PostgreSQL 9.1.

    Requested Transaction Isolation Level | Greenplum Database 6.0 Compliance  | PostgreSQL 9.1 Compliance
    READ UNCOMMITTED                      | READ COMMITTED                     | READ COMMITTED
    READ COMMITTED                        | READ COMMITTED                     | READ COMMITTED
    REPEATABLE READ                       | REPEATABLE READ (SI)               | REPEATABLE READ (SI)
    SERIALIZABLE                          | Falls back to REPEATABLE READ (SI) | SERIALIZABLE (SSI)
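
    For example, requesting REPEATABLE READ explicitly (the level a SERIALIZABLE request also falls back to in Greenplum Database 6.0):

    BEGIN;
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    -- statements in this transaction operate on a single consistent snapshot (SI)
    COMMIT;
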
  • The CREATE TABLESPACE command has changed.

    • The command no longer requires a filespace created with the gpfilespace utility.
    • The FILESPACE clause has been removed.
    • The WITH clause has been added to allow specifying a tablespace location for a specific segment instance.
    • A primary-mirror pair sharing the same content ID must use the same tablespace location.
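
    For example, a sketch of the new command form; the directory paths and contentN option names shown are illustrative:

    CREATE TABLESPACE fastspace LOCATION '/data/master/fastspace'
      WITH (content0='/data/seg0/fastspace', content1='/data/seg1/fastspace');
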
  • The ALTER SEQUENCE SQL command has new clauses START [WITH] start and OWNER TO new_owner (PostgreSQL 8.4). The START clause sets the start value that will be used by future ALTER SEQUENCE RESTART commands, but does not change the current value of the sequence. The OWNER TO clause changes the sequence’s owner.
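
    For example (hypothetical sequence and role names):

    ALTER SEQUENCE orders_id_seq START WITH 1000;   -- used by a future RESTART; the current value is unchanged
    ALTER SEQUENCE orders_id_seq OWNER TO sales_admin;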

  • The ALTER TABLE SQL command has a SET WITH OIDS clause to add an oid system column to a table (PostgreSQL 8.4). Note that using oids with Greenplum Database tables is strongly discouraged.

  • The CREATE DATABASE SQL command has new parameters LC_COLLATE and LC_CTYPE to specify the collation order and character classification for the new database.

  • The CREATE FUNCTION SQL command has a new keyword WINDOW, which indicates that the function is a window function rather than a plain function (PostgreSQL 8.4).

  • Specifying the index name in the CREATE INDEX SQL command is now optional. Greenplum Database constructs a default index name from the table name and indexed columns.

  • In the CREATE TABLE command, the Greenplum Database parser allows commas to be placed between a SUBPARTITION TEMPLATE clause and its corresponding SUBPARTITION BY clause, and between consecutive SUBPARTITION BY clauses. Using this undocumented syntax will generate a deprecation warning message.

  • Superuser privileges are now required to create a protocol. See CREATE PROTOCOL.

  • The CREATE TYPE SQL command has a new LIKE=type clause that copies the new type’s representation (INTERNALLENGTH, PASSEDBYVALUE, ALIGNMENT, and STORAGE) from an existing type (PostgreSQL 8.4).

  • The GRANT SQL command has new syntax to grant privileges on truncate, foreign data wrappers, and foreign data servers (PostgreSQL 8.4).

  • The LOCK SQL command has an optional ONLY keyword (PostgreSQL 8.4). When specified, the table is locked without locking any tables that inherit from it.

  • Using the LOCK table statement outside of a transaction raises an error in Greenplum Database 6.0. In earlier releases, the statement executed, although it is only useful when executed inside of a transaction.
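
    For example, LOCK must now run inside a transaction block, and the ONLY keyword restricts the lock to the named table (hypothetical table name):

    BEGIN;
    LOCK TABLE ONLY parent_events IN EXCLUSIVE MODE;  -- tables that inherit from parent_events are not locked
    COMMIT;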

  • The SELECT and VALUES SQL commands support the SQL 2008 OFFSET and FETCH syntax (PostgreSQL 8.4). These clauses provide an alternative syntax for limiting the results returned by a query.
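
    For example (hypothetical table and column names):

    SELECT * FROM orders
    ORDER BY order_id
    OFFSET 20 ROWS FETCH NEXT 10 ROWS ONLY;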

  • The FROM clause can be omitted from a SELECT command, but Greenplum Database no longer allows queries that omit the FROM clause and also reference database tables.

  • The ROWS and RANGE SQL keywords have changed from reserved to unreserved, and may be used as table or column names without quoting.

  • In Greenplum 6, a query on an external table with descendants will by default recurse into the descendant tables. This is a change from previous Greenplum Database versions, which never recursed into descendants. To get the previous behavior in Greenplum 6, you must include the ONLY keyword in the query to restrict the query to the parent table.
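
    For example, to query only the parent table and skip its descendants (hypothetical table name):

    SELECT * FROM ONLY sales_parent;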

  • The default value for the optimizer_force_multistage_agg server configuration parameter has changed from true to false. GPORCA will now by default choose between a one-stage or two-stage aggregate plan for a scalar distinct qualified aggregate based on cost.

  • The TRUNCATE SQL command has an optional ONLY keyword (PostgreSQL 8.4). When specified, the table is truncated without truncating any tables that inherit from it.

  • The createdb command-line utility has new options -l (--locale), --lc-collate, and --lc-ctype to specify the locale and character classification for the database (PostgreSQL 8.4).

  • The pg_dump, pg_dumpall, and pg_restore utilities have a new --role=rolename option that instructs the utility to execute SET ROLE rolename after connecting to the database and before starting the dump or restore operation (PostgreSQL 8.4).

  • The pg_dump and pg_dumpall command-line utilities have a new option --lock-wait-timeout=timeout (PostgreSQL 8.4). When specified, instead of waiting indefinitely the dump fails if the utility cannot acquire shared table locks within the specified number of milliseconds.

  • The -d and -D command-line options are removed from the pg_dump and pg_dumpall utilities. The corresponding long versions, --inserts and --column-inserts, are still supported. A new --binary-upgrade option is added, for use by in-place upgrade utilities.

  • The -w (--no-password) option was added to the pg_dump, pg_dumpall, and pg_restore utilities.

  • The -D option is removed from the gpexpand utility. The expansion schema will be created in the postgres database.

  • The gpstate utility has a new -x option, which displays details of an in-progress system expansion. gpstate -s and gpstate with no options specified also report if a system expansion is in progress.

  • The pg_restore utility has a new -j (--number-of-jobs) option. This option can reduce the time needed to restore a large database by running tasks such as loading data, creating indexes, and creating constraints concurrently.

  • The vacuumdb utility has a new -F (--freeze) option to freeze row transaction information.

  • ALTER DATABASE includes the SET TABLESPACE clause to change the default tablespace.

  • CREATE DATABASE includes the COLLATE and CTYPE options for setting the collation order and character classification of the new database.

  • In the gp_toolkit schema, the gp_workfile_* views have changed due to Greenplum Database 6 workfile enhancements. See Checking Query Disk Spill Space Usage for information about gp_workfile_* views.

  • The server configuration parameter gp_workfile_compress_algorithm has been changed to gp_workfile_compression. When workfile compression is enabled, Greenplum Database uses Zstandard compression.

  • The Oracle Compatibility Functions are now available in Greenplum Database as an extension, based on the PostgreSQL orafce project at https://github.com/orafce/orafce. Instead of executing a SQL script to install the compatibility functions in a database, you now execute the SQL command CREATE EXTENSION orafce. The Greenplum Database 6.0 orafce extension is based on the orafce 3.7 release. See Oracle Compatibility Functions for information about differences between the Greenplum Database compatibility functions and the PostgreSQL orafce extension.

  • Greenplum Database 6 supports specifying a table column of the citext data type as a distribution key.
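
    For example, a sketch that assumes the citext module has been registered in the database (hypothetical table name):

    CREATE EXTENSION IF NOT EXISTS citext;
    CREATE TABLE app_users (email citext, full_name text)
    DISTRIBUTED BY (email);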

  • Greenplum Database 6 provides a single client and loader tool package that you can download and install on a client system. Previous Greenplum releases provided separate client and loader packages. For more information about the Greenplum 6 Clients package, refer to Client Tools in the platform requirements documentation.

  • Greenplum Database 6 includes both PostgreSQL-sourced and Greenplum-sourced contrib modules. Most of these modules are now packaged as extensions, and you register an extension in Greenplum with the CREATE EXTENSION name command. Refer to Installing Additional Supplied Modules for more information about registering contrib modules in Greenplum Database 6.

  • When Greenplum Database High Availability is enabled, a primary segment instance is kept up to date with the mirror segment instance using Write-Ahead Logging (WAL)-based streaming replication. See Overview of Segment Mirroring.

    The gp_stat_replication view contains replication statistics when master or segment mirroring is enabled.

    In previous releases, segment mirroring employed a physical file replication scheme.

  • In the gp_segment_configuration table, the replication_port column has been removed. The datadir column has been added to display the segment instance data directory. The mode column values are now s (synchronized) or n (not synchronized). Use the gp_stat_replication view to determine the synchronization state.
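
    For example, to check segment synchronization using the columns described above and the gp_stat_replication view (the column list shown is a sketch):

    SELECT content, role, preferred_role, mode, status, datadir
    FROM gp_segment_configuration
    ORDER BY content, role;
    SELECT gp_segment_id, state, sync_state FROM gp_stat_replication;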

  • The Greenplum Database 6 Client and Loader Tools package for Windows does not support running the gpfdist program as a native Windows service.

Removed Features

Pivotal Greenplum Database 6.0 removes these features:

  • The gpseginstall utility is no longer included. You must install the Greenplum software RPM on each segment host, as described in Installing the Greenplum Database Software.
  • The gptransfer utility is no longer included; use gpcopy for all functionality that was provided with gptransfer.
  • The gp_fault_strategy system table is no longer used. Greenplum Database now uses the gp_segment_configuration system table to determine if mirroring is enabled.
  • Pivotal Greenplum Database 6 removes the gpcrondump, gpdbrestore, and gpmfr management utilities. Use gpbackup and gprestore to back up and restore Greenplum Database.
  • Pivotal Greenplum Database 6 no longer supports Veritas NetBackup.
  • Pivotal Greenplum Database 6 no longer supports the use of direct I/O to bypass the buffering of memory within the file system cache for backup.
  • Pivotal Greenplum Database 6 no longer supports the gphdfs external table protocol to access a Hadoop system. Use the Greenplum Platform Extension Framework (PXF) to access Hadoop in version 6. Refer to pxf:// Protocol for information about using the pxf external table protocol.
  • Pivotal Greenplum Database 6 no longer supports SSLv3.
  • Pivotal Greenplum Database 6 removes the following server configuration parameters:
    • gp_analyze_relative_error
    • gp_backup_directIO
    • gp_backup_directIO_read_chunk_mb
    • gp_connections_per_thread
    • gp_enable_sequential_window_plans
    • gp_idf_deduplicate
    • gp_snmp_community
    • gp_snmp_monitor_address
    • gp_snmp_use_inform_or_trap
    • gp_workfile_checksumming
  • The undocumented gp_cancel_query() function, and the configuration parameters gp_cancel_query_print_log and gp_cancel_query_delay_time, are removed in Greenplum Database 6.
  • The string_agg(expression) function is removed from Greenplum 6. The function concatenates text values into a string. The string_agg(expression, delimiter) function is still supported.
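
    For example, the still-supported two-argument form (hypothetical table and column names):

    SELECT string_agg(city, ', ') FROM offices;
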
  • Pivotal Greenplum Database 6 no longer supports the ability to configure a Greenplum Database system to trigger SNMP (Simple Network Management Protocol) alerts or send email notifications to system administrators if certain database events occur. Use Pivotal Greenplum Command Center alerts to detect and respond to events that occur in a Greenplum system.
  • Pivotal Greenplum Database 6 removes the gpfilespace utility. The CREATE TABLESPACE command no longer requires a filespace created with the utility.
  • Pivotal Greenplum Database 6 no longer automatically casts text from the deprecated timestamp format YYYYMMDDHH24MISS. The format could not be parsed unambiguously in previous Greenplum Database releases. The format is not supported in PostgreSQL 9.4.

    For example, this command returns an error in Greenplum Database 6. In previous releases, a timestamp is returned.

    # select to_timestamp('20190905140000');
    

    In Greenplum Database 6, this command returns a timestamp.

    # select to_timestamp('20190905140000','YYYYMMDDHH24MISS');
    
  • Pivotal Greenplum Database 6 removes the --ignore-version option from the pg_dump, pg_dumpall, and pg_restore utilities.

  • The gpcheck utility is no longer included.

Deprecated Features

Deprecated features will be removed in a future major release of Greenplum Database. VMware Tanzu Greenplum 6.x deprecates:

  • The gpsys1 utility.
  • The analyzedb option --skip_root_stats (deprecated since 6.2).

    If the option is specified, a warning is issued stating that the option will be ignored.

  • The server configuration parameter gp_statistics_use_fkeys (deprecated since 6.2).

  • The server configuration parameter gp_ignore_error_table (deprecated since 6.0).

    To avoid a Greenplum Database syntax error, set the value of this parameter to true when you run applications that execute CREATE EXTERNAL TABLE or COPY commands that include the now removed Greenplum Database 4.3.x INTO ERROR TABLE clause.

  • Specifying => as an operator name in the CREATE OPERATOR command (deprecated since 6.0).

  • The Greenplum external table C API (deprecated since 6.0).

    Any developers using this API are encouraged to use the new Foreign Data Wrapper API in its place.

  • Commas placed between a SUBPARTITION TEMPLATE clause and its corresponding SUBPARTITION BY clause, and between consecutive SUBPARTITION BY clauses in a CREATE TABLE command (deprecated since 6.0).

    Using this undocumented syntax will generate a deprecation warning message.

  • The timestamp format YYYYMMDDHH24MISS (deprecated since 6.0).

    This format could not be parsed unambiguously in previous Greenplum Database releases, and is not supported in PostgreSQL 9.4.

  • The createlang and droplang utilities (deprecated since 6.0).

  • The pg_resqueue_status system view (deprecated since 6.0).

    Use the gp_toolkit.gp_resqueue_status view instead.

  • The GLOBAL and LOCAL modifiers when creating a temporary table with the CREATE TABLE and CREATE TABLE AS commands (deprecated since 6.0).

    These keywords are present for SQL standard compatibility, but have no effect in Greenplum Database.

  • Using WITH OIDS or oids=TRUE to assign an OID system column when creating or altering a table (deprecated since 6.0).

  • Allowing superusers to specify the SQL_ASCII encoding regardless of the locale settings (deprecated since 6.0).

    This choice may result in misbehavior of character-string functions when data that is not encoding-compatible with the locale is stored in the database.

  • The @@@ text search operator (deprecated since 6.0).

    This operator is currently a synonym for the @@ operator.

  • The unparenthesized syntax for option lists in the VACUUM command (deprecated since 6.0).

    This syntax requires that the options to the command be specified in a specific order.

  • The plain pgbouncer authentication type (auth_type = plain) (deprecated since 4.x).

Known Issues and Limitations

VMware Tanzu Greenplum 6 has these limitations:

  • Upgrading a Greenplum Database 4 release to VMware Tanzu Greenplum 6 is not supported. Upgrading a Greenplum Database 5.x release to VMware Tanzu Greenplum 6 is supported via gpupgrade. For more information, see the gpupgrade documentation.
  • MADlib, GPText, and PostGIS are not yet provided for installation on Ubuntu systems.
  • Greenplum for Kubernetes is not yet provided with this release.

The following table lists key known issues in VMware Tanzu Greenplum 6.x.

Table 1. Key Known Issues in VMware Tanzu Greenplum 6.x
Issue | Category | Description
31010 | Query Optimizer | A view created in Greenplum Database 5.28.3 or older that specified an external table in the FROM clause, and that was migrated to Greenplum Database 6.x, always falls back to the Postgres Planner when queried.

Workaround: If you migrated a view from Greenplum Database 5.28.3 or earlier, and the view specified an external table in the FROM clause, you must drop and recreate the view in Greenplum 6.x to ensure that the Query Optimizer is exercised when you query the view.

N/A | Backup/Restore | Restoring the Greenplum Database backup for a table fails in Greenplum 6 versions earlier than version 6.10 when a replicated table has an inheritance relationship to/from another table that was assigned via an ALTER TABLE ... INHERIT statement after table creation.
Workaround: Use the following SQL commands to determine if Greenplum Database includes any replicated tables that inherit from a parent table, or if there are replicated tables that are inherited by a child table:
SELECT inhrelid::regclass FROM pg_inherits,
  gp_distribution_policy dp
WHERE inhrelid=dp.localoid AND dp.policytype='r';
SELECT inhparent::regclass FROM pg_inherits,
  gp_distribution_policy dp
WHERE inhparent=dp.localoid AND dp.policytype='r';

If these queries return any tables, you may choose to run gprestore with the --on-error-continue flag so that the entire restore does not fail when this issue is encountered. Or, you can place the tables returned by the queries in a file and supply that file with the --exclude-table-file option to skip those tables during the restore. You must recreate and repopulate the affected tables after the restore.

N/A | Spark Connector | This version of Greenplum is not compatible with Greenplum-Spark Connector versions earlier than version 1.7.0, due to a change in how Greenplum handles distributed transaction IDs.
N/A | PXF | Starting in 6.x, Greenplum does not bundle cURL and instead loads the system-provided library. PXF requires cURL version 7.29.0 or newer. The officially-supported cURL for the CentOS 6.x and Red Hat Enterprise Linux 6.x operating systems is version 7.19.*. Greenplum Database 6 does not support running PXF on CentOS 6.x or RHEL 6.x due to this limitation.

Workaround: Upgrade the operating system of your Greenplum Database 6 hosts to CentOS 7+ or RHEL 7+, which provides a cURL version suitable to run PXF.

29703 | Loading Data from External Tables | Due to limitations in the Greenplum Database external table framework, Greenplum Database cannot log the following types of errors that it encounters while loading data:
  • data type parsing errors
  • unexpected value type errors
  • data type conversion errors
  • errors returned by native and user-defined functions
LOG ERRORS returns error information for data exceptions only. When it encounters a parsing error, Greenplum terminates the load job, but it cannot log and propagate the error back to the user via gp_read_error_log().

Workaround: Clean the input data before loading it into Greenplum Database.

30594 | Resource Management | Resource queue-related statistics may be inaccurate in certain cases. VMware recommends that you use the resource group resource management scheme that is available in Greenplum 6.
30522 | Logging | Greenplum Database may write a FATAL message to the standby master or mirror log stating that the database system is in recovery mode when the instance is synchronizing with the master and Greenplum attempts to contact it before the operation completes. Ignore these messages and use gpstate -f output to determine if the standby successfully synchronized with the Greenplum master; the command returns Sync state: sync if it is synchronized.
30537 | Postgres Planner | The Postgres Planner generates a very large query plan that causes out of memory issues for the following type of CTE (common table expression) query: the WITH clause of the CTE contains a partitioned table with a large number of partitions, and the WITH reference is used in a subquery that joins another partitioned table.

Workaround: If possible, use the GPORCA query optimizer. With the server configuration parameter optimizer=on, Greenplum Database attempts to use GPORCA for query planning and optimization when possible and falls back to the Postgres Planner when GPORCA cannot be used. Also, the specified type of query might require a long time to complete.

170824967 | gpfdists | For Greenplum Database 6.x, a command that accesses an external table that uses the gpfdists protocol fails if the external table does not use an IP address when specifying a host system in the LOCATION clause of the external table definition.
N/A | Materialized Views | By default, certain gp_toolkit views do not display data for materialized views. If you want to include this information in gp_toolkit view output, you must redefine a gp_toolkit internal view as described in Including Data for Materialized Views.
168957894 | PXF | The PXF Hive Connector does not support using the Hive* profiles to access Hive transactional tables.

Workaround: Use the PXF JDBC Connector to access Hive.

168548176 | gpbackup | When using gpbackup to back up a Greenplum Database 5.7.1 or earlier 5.x release with resource groups enabled, gpbackup returns a column not found error for t6.value AS memoryauditor.
164791118 | PL/R | PL/R cannot be installed using the deprecated createlang utility, and displays the error:
createlang: language installation failed: ERROR:
no schema has been selected to create in
Workaround: Use CREATE EXTENSION to install PL/R, as described in the documentation.
N/A | Greenplum Client/Load Tools on Windows | The Greenplum Database client and load tools on Windows have not been tested with Active Directory Kerberos authentication.
N/A | Greenplum Installation | For Greenplum Database 6.16.1 and 6.16.2, the library libevent is not listed as a dependency for the rpm for RHEL 7. Resolved in 6.16.3.
N/A | Greenplum on Dell EMC VxRail | The maximum number of virtual machines that can be deployed in a VMware Tanzu Greenplum on Dell EMC VxRail environment is 250.
N/A | Greenplum on Dell EMC VxRail | When provisioning virtual machines with Terraform, there is a 30-minute timeout for cloning the virtual machines in vSphere.