Memory and Resource Management with Resource Groups
Managing Greenplum Database resources with resource groups.
Memory, CPU, and concurrent transaction management have a significant impact on performance in a Greenplum Database cluster. Resource groups are a newer resource management scheme that enforces memory, CPU, and concurrent transaction limits in Greenplum Database.
- Configuring Memory for Greenplum Database
- Configuring Resource Groups
- Example Memory Configuration Calculations
- Low Memory Queries
- Administrative Utilities and admin_group Concurrency
Configuring Memory for Greenplum Database
While it is not always possible to increase system memory, you can avoid many out-of-memory conditions by configuring resource groups to manage expected workloads.
The following operating system and Greenplum Database memory settings are significant when you manage Greenplum Database resources with resource groups:
vm.overcommit_memory: This Linux kernel parameter, set in /etc/sysctl.conf, identifies the method that the operating system uses to determine how much memory can be allocated to processes. It must always be set to 2 for Greenplum Database systems.
vm.overcommit_ratio: This Linux kernel parameter, set in /etc/sysctl.conf, identifies the percentage of RAM that is used for application processes; the remainder is reserved for the operating system. The operating system default value (50 on Red Hat) is a good starting point for Greenplum Database clusters employing resource group-based resource management. If your memory utilization is too low, increase the value; if your memory or swap usage is too high, decrease the setting.
gp_resource_group_memory_limit: The percentage of system memory to allocate to Greenplum Database. The default value is .7 (70%).
Set gp_workfile_limit_files_per_query to limit the maximum number of temporary spill files (workfiles) allowed per query. Spill files are created when a query requires more memory than it is allocated. When the limit is exceeded, the query is terminated. The default is zero, which allows an unlimited number of spill files and may fill up the file system.
If a query creates numerous spill files, set gp_workfile_compress_algorithm to compress them. Compressing spill files may help to avoid overloading the disk subsystem with I/O operations.
- Do not configure the operating system to use huge pages.
- When you configure resource group memory, consider memory requirements for mirror segments that become primary segments during a failure to ensure that database operations can continue when primary segments or segment hosts fail.
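The two kernel settings described above can be checked from a script. The following is a minimal sketch, assuming a Linux host where sysctl values are exposed as files under /proc/sys. The function names read_sysctl and check_overcommit are illustrative only and are not part of Greenplum Database.

```python
from pathlib import Path


def read_sysctl(name, proc_root="/proc/sys"):
    """Read an integer sysctl such as 'vm.overcommit_memory'.

    Assumes the Linux convention that 'a.b' is exposed as <proc_root>/a/b.
    """
    return int(Path(proc_root, *name.split(".")).read_text().strip())


def check_overcommit(proc_root="/proc/sys"):
    """Return (overcommit_ratio, problems) for the current host.

    'problems' is a list of human-readable strings. It is empty when
    vm.overcommit_memory has the value 2 that this section requires.
    """
    problems = []
    mode = read_sysctl("vm.overcommit_memory", proc_root)
    if mode != 2:
        problems.append(f"vm.overcommit_memory is {mode}; expected 2")
    ratio = read_sysctl("vm.overcommit_ratio", proc_root)
    return ratio, problems
```

Calling check_overcommit() on a segment host returns the current vm.overcommit_ratio together with any problems found. The proc_root argument exists only so that the check can be pointed at a test directory.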
Configuring Resource Groups
Greenplum Database resource groups provide a powerful mechanism for managing the workload of the cluster. Consider these general guidelines when you configure resource groups for your system:
- A transaction submitted by any Greenplum Database role with SUPERUSER privileges runs under the default resource group named admin_group. Keep this in mind when scheduling and running Greenplum administration utilities.
- Ensure that you assign each non-admin role a resource group. If you do not assign a resource group to a role, queries submitted by the role are handled by the default resource group named default_group.
- Use the CONCURRENCY resource group parameter to limit the number of active queries that members of a particular resource group can run concurrently.
- Use the MEMORY_LIMIT and MEMORY_SHARED_QUOTA parameters to control the maximum amount of memory that queries running in the resource group can consume.
- Greenplum Database assigns unreserved memory (100 - (sum of all resource group MEMORY_LIMITs)) to a global shared memory pool. This memory is available to all queries on a first-come, first-served basis.
- Alter resource groups dynamically to match the real requirements of the group for the workload and the time of day.
- Use the gp_toolkit views to examine resource group resource usage and to monitor how the groups are working.
- Consider using Pivotal Greenplum Command Center to create and manage resource groups, and to define the criteria under which Command Center dynamically assigns a transaction to a resource group.
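The unreserved-memory rule in the guidelines above is simple arithmetic on the MEMORY_LIMIT percentages. The sketch below illustrates it. The group named reporting and all three limit values are hypothetical and chosen only for the example, and the helper global_shared_percent is not a Greenplum Database function.

```python
def global_shared_percent(memory_limits):
    """Return the percentage of memory left for the global shared pool.

    memory_limits maps a resource group name to its MEMORY_LIMIT,
    expressed as a percentage.
    """
    reserved = sum(memory_limits.values())
    if reserved > 100:
        raise ValueError(f"MEMORY_LIMIT values sum to {reserved}, which exceeds 100")
    return 100 - reserved


# Hypothetical groups and limits, chosen only for illustration.
limits = {"admin_group": 10, "default_group": 30, "reporting": 40}
print(global_shared_percent(limits))  # prints 20
```

With these example values, 20 percent of the memory remains unreserved and would be shared by all queries on a first-come, first-served basis.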
Example Memory Configuration Calculations
This section provides example memory calculations for a Greenplum Database system with the following specifications:
- Total RAM = 256GB
- Swap = 64GB
- 8 primary segments and 8 mirror segments per host, in blocks of 4 hosts
- Maximum number of primaries per host during failure is 11
The usable memory available on a host is a function of the amount of RAM and swap space configured for the system, as well as the vm.overcommit_ratio system parameter setting:
total_node_usable_memory = RAM * (vm.overcommit_ratio / 100) + Swap = 256GB * (50/100) + 64GB = 192GB
Assuming the default gp_resource_group_memory_limit value (.7), the memory allocated to a Greenplum Database host with the example configuration is:
total_gp_memory = total_node_usable_memory * gp_resource_group_memory_limit = 192GB * .7 = 134.4GB
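The two host-level formulas above can be reproduced with a short script, which makes it easy to try other RAM, swap, or ratio values. This is a sketch of the arithmetic only. The function names mirror the variable names used in the formulas and are not Greenplum Database APIs.

```python
def total_node_usable_memory(ram_gb, swap_gb, overcommit_ratio):
    """Usable host memory in GB: RAM * (vm.overcommit_ratio / 100) + swap."""
    return ram_gb * (overcommit_ratio / 100) + swap_gb


def total_gp_memory(usable_gb, memory_limit=0.7):
    """Memory in GB allocated to Greenplum Database on the host.

    memory_limit corresponds to gp_resource_group_memory_limit (default .7).
    """
    return usable_gb * memory_limit


# Values from the example system: 256GB RAM, 64GB swap, ratio 50.
usable = total_node_usable_memory(ram_gb=256, swap_gb=64, overcommit_ratio=50)
gp_mem = total_gp_memory(usable)
print(f"usable: {usable:.1f}GB, Greenplum: {gp_mem:.1f}GB")
# prints "usable: 192.0GB, Greenplum: 134.4GB"
```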
The memory available to a Greenplum Database segment on a segment host is a function of the memory reserved for Greenplum on the host and the number of active primary segments on the host. On cluster startup:
gp_seg_memory = total_gp_memory / number_of_active_primary_segments = 134.4GB / 8 = 16.8GB
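If mirror segments are promoted during a failure so that a host runs the maximum of 11 primary segments, and assuming each segment can still use up to 16.8GB, total memory usage on the host may approach:
total_gp_memory_with_primaries = 16.8GB * 11 = 184.8GB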
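The per-segment figures above can be reproduced in the same way. The sketch below assumes the 134.4GB host allocation computed earlier in this section and the example counts of 8 active primaries at startup and 11 during a failure. The function names are illustrative.

```python
def gp_seg_memory(total_gp_memory_gb, active_primaries):
    """Memory in GB available to each primary segment at cluster startup."""
    return total_gp_memory_gb / active_primaries


def memory_with_primaries(seg_memory_gb, primaries_during_failure):
    """Approximate host memory in GB if every primary uses its full share."""
    return seg_memory_gb * primaries_during_failure


seg = gp_seg_memory(134.4, 8)
peak = memory_with_primaries(seg, 11)
print(f"per segment: {seg:.1f}GB, with 11 primaries: {peak:.1f}GB")
# prints "per segment: 16.8GB, with 11 primaries: 184.8GB"
```

Substituting your own segment counts shows how much host memory a failover scenario could consume.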
Low Memory Queries
Administrative Utilities and admin_group Concurrency
The default resource group for database transactions initiated by Greenplum Database SUPERUSERs is the group named admin_group. The default CONCURRENCY value for the admin_group resource group is 10.