Platform Requirements

This topic describes the Pivotal Greenplum Database 6 platform and operating system software requirements.

Important: Pivotal Support does not provide support for open source versions of Greenplum Database. Only Pivotal Greenplum Database is supported by Pivotal Support.

Operating Systems

Pivotal Greenplum 6 runs on the following operating system platforms:

  • Red Hat Enterprise Linux 64-bit 7.x (See the following Note.)
  • Red Hat Enterprise Linux 64-bit 6.x
  • CentOS 64-bit 7.x
  • CentOS 64-bit 6.x
  • Ubuntu 18.04 LTS
Important: Significant Greenplum Database performance degradation has been observed when enabling resource group-based workload management on Red Hat 6.x and CentOS 6.x systems. This issue is caused by a Linux cgroup kernel bug that has been fixed in CentOS 7.x and Red Hat 7.x systems.

If you use Red Hat 6 and the performance with resource groups is acceptable for your use case, upgrade your kernel to version 2.6.32-696 or higher to benefit from other fixes to the cgroups implementation.
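As an illustration of the kernel recommendation above, the following is a minimal sketch (not a Pivotal-provided tool) that compares a kernel release string against 2.6.32-696. The function names are hypothetical, and the sketch assumes a RHEL/CentOS style release string such as the one printed by `uname -r`.

```python
import re

# Minimum from the text above: kernel 2.6.32-696 or higher.
MIN_KERNEL = (2, 6, 32, 696)


def parse_kernel_release(release):
    """Return the leading numeric fields of a release string such as
    '2.6.32-696.el6.x86_64' as a tuple of ints, e.g. (2, 6, 32, 696)."""
    match = re.match(r"(\d+)\.(\d+)\.(\d+)-(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % release)
    return tuple(int(field) for field in match.groups())


def kernel_meets_minimum(release, minimum=MIN_KERNEL):
    """True when the release is at or above the minimum (tuple comparison)."""
    return parse_kernel_release(release) >= minimum
```

On a host, the release string could be taken from `uname -r` or from Python's `platform.release()` and passed to `kernel_meets_minimum`.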

Note: When Greenplum Database is installed on Red Hat Enterprise Linux 7.x or CentOS 7.x prior to 7.3, Linux kernel bugs might cause Greenplum Database to hang while running large workloads.

RHEL 7.3 and CentOS 7.3 resolve the issue.
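To flag hosts that fall in the affected range described in the Note, a small check along these lines may help. It is a sketch under the assumption that the release text has the form found in `/etc/redhat-release` (for example `CentOS Linux release 7.2.1511 (Core)`); the function names are hypothetical.

```python
import re


def parse_el_release(release_text):
    """Return (major, minor) from text such as
    'CentOS Linux release 7.2.1511 (Core)'."""
    match = re.search(r"release (\d+)\.(\d+)", release_text)
    if match is None:
        raise ValueError("unrecognized release text: %r" % release_text)
    return int(match.group(1)), int(match.group(2))


def in_affected_7x_range(release_text):
    """True for RHEL/CentOS 7.x releases earlier than 7.3."""
    major, minor = parse_el_release(release_text)
    return major == 7 and minor < 3
```

A host reporting 7.0, 7.1, or 7.2 is flagged; 6.x releases and 7.3 or later are not.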

Greenplum Database supports TLS version 1.2.

Software Dependencies

Greenplum Database 6 requires the following software packages on RHEL/CentOS 6/7 systems, which are installed automatically as dependencies when you install the Pivotal Greenplum Database RPM package:
  • apr
  • apr-util
  • bash
  • bzip2
  • curl
  • krb5
  • libcurl
  • libevent
  • libxml2
  • libyaml
  • zlib
  • openldap
  • openssh
  • openssl
  • openssl-libs (RHEL 7/CentOS 7)
  • perl
  • readline
  • rsync
  • R
  • sed (used by gpinitsystem)
  • tar
  • zip
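Although the RPM installs these dependencies automatically, it can be useful to confirm that nothing is missing on a host. The sketch below compares the list above with a set of installed package names. It assumes the installed names are gathered separately (for example with `rpm -qa --queryformat '%{NAME}\n'`), and the names reported by the package manager may not match the list exactly. `missing_packages` is a hypothetical helper.

```python
# Package names copied from the RHEL/CentOS list above.
REQUIRED_PACKAGES = [
    "apr", "apr-util", "bash", "bzip2", "curl", "krb5", "libcurl",
    "libevent", "libxml2", "libyaml", "zlib", "openldap", "openssh",
    "openssl", "perl", "readline", "rsync", "R", "sed", "tar", "zip",
]

# Listed above as required on RHEL 7/CentOS 7 only.
RHEL7_ONLY_PACKAGES = ["openssl-libs"]


def missing_packages(installed, major_release):
    """Return the required package names absent from `installed`,
    in the order they appear in the list above."""
    required = list(REQUIRED_PACKAGES)
    if major_release == 7:
        required += RHEL7_ONLY_PACKAGES
    installed_set = set(installed)
    return [name for name in required if name not in installed_set]
```

An empty result means every listed name was found in the installed set.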
Greenplum Database 6 client software requires these operating system packages:
  • apr
  • apr-util
  • libyaml
  • libevent
On Ubuntu systems, Greenplum Database 6 requires the following software packages, which are installed automatically as dependencies when you install Greenplum Database with the Debian package installer:
  • libapr1
  • libaprutil1
  • bash
  • bzip2
  • krb5-multidev
  • libcurl3-gnutls
  • libcurl4
  • libevent-2.1-6
  • libxml2
  • libyaml-0-2
  • zlib1g
  • libldap-2.4-2
  • openssh-client
  • openssl
  • perl
  • readline
  • rsync
  • sed
  • tar
  • zip
  • net-tools
  • less
  • iproute2

Greenplum Database 6 uses Python 2.7.12, which is included with the product installation (and not installed as a package dependency).

Important: SSL is supported only on the Greenplum Database master host system. It cannot be used on the segment host systems.
Important: For all Greenplum Database host systems, SELinux must be disabled. You should also disable firewall software, although firewall software can be enabled if it is required for security purposes. See Disabling SELinux and Firewall Software.
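One way to verify the SELinux requirement is to inspect the `SELINUX=` setting in `/etc/selinux/config`. The following sketch parses the text of that file. It reflects the configured boot-time mode only, so the running state should still be confirmed separately (for example with `getenforce`). The function names are hypothetical.

```python
def selinux_config_mode(config_text):
    """Return the value of the SELINUX= setting (lowercased),
    or None if the setting is absent."""
    mode = None
    for line in config_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "SELINUX":
            mode = value.strip().lower()  # the last assignment wins
    return mode


def selinux_disabled(config_text):
    """True when the configuration sets SELINUX=disabled."""
    return selinux_config_mode(config_text) == "disabled"
```

The text can be read with `open("/etc/selinux/config").read()` on each host and passed to `selinux_disabled`.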

Java

Greenplum 6 supports these Java versions for PL/Java and PXF:
  • OpenJDK 8 or OpenJDK 11, available from AdoptOpenJDK
  • Oracle JDK 8 or Oracle JDK 11
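To check a host against the supported Java versions, the major version can be extracted from the output of `java -version` (which is written to standard error). This is a sketch with hypothetical function names. It assumes the usual `version "1.8.0_..."` and `version "11.0..."` output formats.

```python
import re

# Major versions listed above for PL/Java and PXF.
SUPPORTED_JAVA_MAJORS = {8, 11}


def java_major_version(version_output):
    """Return the Java major version from `java -version` output.
    Legacy strings such as "1.8.0_232" map to 8; "11.0.5" maps to 11."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        raise ValueError("no version string found")
    first = int(match.group(1))
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))  # legacy 1.x numbering
    return first


def java_supported(version_output):
    """True when the reported major version is 8 or 11."""
    return java_major_version(version_output) in SUPPORTED_JAVA_MAJORS
```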

Hardware and Network

The following table lists minimum recommended specifications for hardware servers intended to support Greenplum Database on Linux systems in a production environment. All host servers in your Greenplum Database system must have the same hardware and software configuration. Greenplum also provides hardware build guides for its certified hardware platforms. It is recommended that you work with a Greenplum Systems Engineer to review your anticipated environment to ensure an appropriate hardware configuration for Greenplum Database.
Warning: Running Pivotal Greenplum Database on hyper-converged infrastructure (HCI) has known issues with performance, scalability, and stability. HCI is not recommended as a scalable solution for Pivotal Greenplum, and Pivotal may not support it if stability problems appear related to the infrastructure. HCI virtualizes all of the elements of conventional hardware systems and includes, at a minimum, virtualized computing, a virtualized SAN, and virtualized networking.
Table 1. Minimum Hardware Requirements
Minimum CPU: Any x86_64 compatible CPU
Minimum Memory: 16 GB RAM per server
Disk Space Requirements:
  • 150 MB per host for Greenplum installation
  • Approximately 300 MB per segment instance for metadata
  • Appropriate free space for data with disks at no more than 70% capacity
Network Requirements:
  • 10 Gigabit Ethernet within the array
  • NIC bonding is recommended when multiple interfaces are present
  • Pivotal Greenplum can use either IPv4 or IPv6 protocols
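The disk space figures in Table 1 lend themselves to a quick calculation. The sketch below adds up the fixed per-host overhead and checks a disk against the 70% capacity guideline. The function names and the example of 8 segment instances per host are illustrative assumptions, not sizing guidance.

```python
# Figures taken from Table 1.
INSTALL_MB_PER_HOST = 150
METADATA_MB_PER_SEGMENT = 300   # approximate, per segment instance
MAX_DISK_UTILIZATION = 0.70


def fixed_overhead_mb(segments_per_host):
    """Installation plus metadata overhead for one host, in MB."""
    return INSTALL_MB_PER_HOST + METADATA_MB_PER_SEGMENT * segments_per_host


def disk_within_limit(used_bytes, total_bytes, limit=MAX_DISK_UTILIZATION):
    """True when used space is at or below the capacity limit."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    return used_bytes / total_bytes <= limit
```

For example, a host with 8 segment instances needs roughly 150 + 8 × 300 = 2550 MB before any user data. Current usage for a data directory can be obtained with `shutil.disk_usage(path)`, whose `used` and `total` fields can be passed to `disk_within_limit`.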

Storage

The only file system supported for running Greenplum Database is the XFS file system. All other file systems are explicitly not supported by Pivotal.
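A data directory's file system type can be confirmed from the mount table. The following sketch parses text in the format of `/proc/mounts` (device, mount point, file system type, options) and reports whether a given mount point uses XFS. The mount point `/data` used in the usage example is a hypothetical path, and the function names are illustrative.

```python
def filesystem_type(mounts_text, mount_point):
    """Return the file system type for `mount_point` from text in
    /proc/mounts format, or None if the mount point is not listed."""
    fs_type = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[1] == mount_point:
            fs_type = fields[2]  # later entries override earlier ones
    return fs_type


def is_xfs(mounts_text, mount_point):
    """True when the mount point is mounted with the XFS file system."""
    return filesystem_type(mounts_text, mount_point) == "xfs"
```

On a Linux host, `is_xfs(open("/proc/mounts").read(), "/data")` would report whether the hypothetical `/data` mount uses XFS.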

Greenplum Database is supported on network or shared storage if the shared storage is presented as a block device to the servers running Greenplum Database and the XFS file system is mounted on the block device. Network file systems are not supported. When using network or shared storage, Greenplum Database mirroring must be used in the same way as with local storage, and no modifications may be made to the mirroring scheme or the recovery scheme of the segments.

Other features of the shared storage, such as de-duplication and replication, are not directly supported by Pivotal Greenplum Database. At the discretion of Pivotal, they may be used with support from the storage vendor as long as they do not interfere with the expected operation of Greenplum Database.

Greenplum Database can be deployed to virtualized systems only if the storage is presented as block devices and the XFS file system is mounted for the storage of the segment directories.
Warning: Running Greenplum Database on hyper-converged infrastructure (HCI) has known issues with performance, scalability, and stability. HCI is not recommended as a scalable solution for Pivotal Greenplum Database, and Pivotal may not support it if stability problems appear related to the infrastructure. HCI virtualizes all of the elements of conventional hardware systems and includes, at a minimum, virtualized computing, a virtualized SAN, and virtualized networking.

Greenplum Database is supported on Amazon Web Services (AWS) servers using either Amazon instance store (Amazon uses the volume names ephemeral[0-20]) or Amazon Elastic Block Store (Amazon EBS) storage. If you use Amazon EBS storage, the storage should be a RAID of Amazon EBS volumes mounted with the XFS file system for it to be a supported configuration.

Data Domain Boost

Pivotal Greenplum 6.0.0 supports Data Domain Boost for backup on Red Hat Enterprise Linux. This table lists the versions of Data Domain Boost SDK and DDOS supported by Pivotal Greenplum 6.x.

Table 2. Data Domain Boost Compatibility
Pivotal Greenplum    Data Domain Boost    DDOS
6.x                  3.3                  6.1 (all versions)
                                          6.0 (all versions)

Note: In addition to the DDOS versions listed in the previous table, Pivotal Greenplum supports all minor patch releases (fourth digit releases) later than the certified version.

Tools and Extensions Compatibility

Client Tools

Greenplum Database 6 provides a Clients tool package that you can use to access Greenplum Database from a client system. The Greenplum 6 Clients tool package is supported on the following platforms:

  • Red Hat Enterprise Linux x86_64 6.x (RHEL 6)
  • Red Hat Enterprise Linux x86_64 7.x (RHEL 7)
  • Ubuntu 18.04 LTS
  • Windows 10 (32-bit and 64-bit)
  • Windows 8 (32-bit and 64-bit)
  • Windows Server 2012 (32-bit and 64-bit)
  • Windows Server 2012 R2 (32-bit and 64-bit)
  • Windows Server 2008 R2 (32-bit and 64-bit)

The Greenplum 6 Clients package includes the client and loader programs provided in the Greenplum 5 packages plus the addition of database/role/language commands and the Greenplum-Kafka Integration and Greenplum Stream Server command utilities. Refer to Greenplum Client and Loader Tools Package for installation and usage details of the Greenplum 6 Client tools.

Extensions

Table 3. Pivotal Greenplum 6.0.0 Extensions Compatibility
Pivotal Greenplum Extension                                           Versions
MADlib machine learning [1]                                           MADlib 1.16
PL/Java                                                               2.0.1
PL/R [2]                                                              3.0.3
Python Data Science Module Package [3]                                2.0.2
R Data Science Library Package [4]                                    2.0.2
PL/Container and PL/Container images for Python, R                    2.0.2
PostGIS Spatial and Geographic Objects for Greenplum Database 6.0.x   2.1.5+pivotal.2-2
Note: [1] For information about MADlib support and upgrade information, see the MADlib FAQ.

[2] PL/R supports R 3.5.1. On RHEL and CentOS, the PL/R package installs R 3.3.3. See PL/R Language.

[3] For information about the Python package, including the modules provided, see the Python Data Science Module Package.

[4] For information about the R package, including the libraries provided, see the R Data Science Library Package.

For information about the Oracle Compatibility Functions, see Oracle Compatibility Functions.

These Greenplum Database extensions are installed with Pivotal Greenplum Database:
  • Fuzzy String Match Extension
  • PL/Python Extension
  • pgcrypto Extension

Data Connectors

  • Greenplum Platform Extension Framework (PXF) v5.8.1 - PXF, integrated with Greenplum Database 6, provides access to Hadoop, object store, and SQL external data stores. Refer to Accessing External Data with PXF in the Greenplum Database Administrator Guide for PXF configuration and usage information.
  • Greenplum-Kafka Integration - The Pivotal Greenplum-Kafka Integration provides high speed, parallel data transfer from a Kafka cluster to a Pivotal Greenplum Database cluster for batch and streaming ETL operations. It requires Kafka version 0.11 or newer for exactly-once delivery assurance. Refer to the Pivotal Greenplum-Kafka Integration Documentation for more information about this feature.
  • Greenplum Stream Server v1.2.6 - The Pivotal Greenplum Stream Server is an ETL tool that provides high speed, parallel data transfer from Informatica, Kafka, and custom client data sources to a Pivotal Greenplum Database cluster. Refer to the Performing ETL Operations with the Pivotal Greenplum Stream Server Documentation for more information about this feature.
  • Greenplum Informatica Connector v1.0.5 - The Pivotal Greenplum Informatica Connector supports high speed data transfer from an Informatica PowerCenter cluster to a Pivotal Greenplum Database cluster for batch and streaming ETL operations.
  • Greenplum Spark Connector v1.6.1 - The Pivotal Greenplum Spark Connector supports high speed, parallel data transfer between Greenplum Database and an Apache Spark cluster using Spark’s Scala API.
  • Progress DataDirect JDBC Drivers v5.1.4.000223 - The Progress DataDirect JDBC drivers are compliant with the Type 4 architecture, but provide advanced features that define them as Type 5 drivers.
  • Progress DataDirect ODBC Drivers v7.1.6 (07.16.0301) - The Progress DataDirect ODBC drivers enable third party applications to connect via a common interface to the Pivotal Greenplum Database system.
Note: Pivotal Greenplum 6 does not support the ODBC driver for Cognos Analytics V11.

Connecting to IBM Cognos software with an ODBC driver is not supported. Greenplum Database supports connecting to IBM Cognos software with the DataDirect JDBC driver for Pivotal Greenplum. This driver is available as a download from Pivotal Network.

GPText

Pivotal Greenplum Database 6 is compatible with Pivotal Greenplum Text version 3.3.1 and later. See the Greenplum Text documentation for additional compatibility information.

Greenplum Command Center

Pivotal Greenplum Database 6 is compatible with Pivotal Greenplum Command Center 6.0.0 and later. See the Greenplum Command Center documentation for additional compatibility information.

Hadoop Distributions

Greenplum Database provides access to HDFS with the Greenplum Platform Extension Framework (PXF).

PXF can use Cloudera, Hortonworks Data Platform, MapR, and generic Apache Hadoop distributions. PXF bundles all of the JAR files on which it depends, including the following Hadoop libraries:

  • Hadoop version 2.9.2
  • Hive version 1.2.2
  • HBase version 1.3.2
Note: If you plan to access JSON format data stored in a Cloudera Hadoop cluster, PXF requires a Cloudera version 5.8 or later Hadoop distribution.