Monitoring a Greenplum System
Monitoring the Greenplum Database system regularly helps administrators plan workload and troubleshoot problems. This chapter discusses tools for monitoring database performance and activity.
Monitoring Database Activity and Performance
Greenplum provides an optional system monitoring and management tool, Greenplum Command Center, that administrators can enable within Greenplum Database 4.3.
Enabling Greenplum Command Center is a two-part process. First, enable the Greenplum Database server to collect and store system metrics. Next, install and configure the Greenplum Command Center Console, an online application used to view the system metrics collected and store them in the Command Center’s dedicated Greenplum Database.
The Greenplum Command Center Console ships separately from your Greenplum Database 4.3 installation. Download the Greenplum Command Center Console package from Pivotal Network and documentation from Pivotal Documentation. See the Greenplum Database Command Center Administrator Guide for more information on installing and using the Greenplum Command Center Console.
Monitoring System State
As a Greenplum Database administrator, you must monitor the system for problem events such as a segment going down or a segment host running out of disk space. The following topics describe how to monitor the health of a Greenplum Database system and examine certain state information.
Enabling System Alerts and Notifications
You can configure a Greenplum Database system to trigger SNMP (Simple Network Management Protocol) alerts or send email notifications to system administrators if certain database events occur. These events include:
- All PANIC-level error conditions
- All FATAL-level error conditions
- ERROR-level conditions that are "internal errors" (for example, SIGSEGV errors)
- Database system shutdown and restart
- Segment failure and recovery
- Standby master out-of-sync conditions
- Master host manual shutdown or other software problem (in certain failure scenarios, Greenplum Database cannot send an alert or notification)
Note that SNMP alerts and email notifications report the same event information; there is no difference in the event information that either tool reports.
Using SNMP with a Greenplum Database System
Greenplum’s gpsnmpd agent is an SNMP daemon that supports SNMP requests about the state of a Greenplum Database system using MIBs (Management Information Bases).
MIBs are collections of objects that describe an SNMP-manageable entity — in this case, a Greenplum Database system. An agent is any SNMP software running on a managed device that responds to queries or set requests. The gpsnmpd daemon currently supports the generic RDBMS MIB and typically operates on the master host.
gpsnmpd works with the SNMP support that already exists on the Greenplum Database system. Greenplum recommends you install and run gpsnmpd as an AgentX (Agent Extensibility Protocol) sub-agent to the operating system’s SNMP agent (usually called snmpd). This allows a Network Management System to obtain information about the hardware, operating system, and Greenplum Database from the same port (161) and IP address. It also enables the auto-discovery of Greenplum Database instances.
Alternatively, you can run the Greenplum SNMP agent as a stand-alone agent and use the gpsnmpd agent or SNMP notification features independently of each other. As a standalone SNMP agent, gpsnmpd listens for SNMP queries and requires the same configuration as the system SNMP agent.
When the gpsnmpd sub-agent starts, it registers itself with the system-level SNMP agent and tells the system agent which sections of the MIB it manages. The system agent communicates with the SNMP client/network monitoring application and forwards requests for those sections of the MIB to the gpsnmpd sub-agent.
To obtain information about a Greenplum Database instance, gpsnmpd logs into Greenplum Database through a normal libpq client session. SNMP GetRequests trigger SELECT statements against Greenplum Database to get the requested information. gpsnmpd does not send SNMP trap notifications; Greenplum Database itself sends this information to the network monitor application.
Before setting up SNMP support on Greenplum Database, ensure SNMP is installed on the master host. If the snmpd file is not present in the /usr/sbin directory (/usr/sfw/sbin/ on Solaris), then SNMP is not installed on the system. Depending on the platform on which you are running Greenplum Database, install the following:
| Platform | Required package |
|---|---|
| Red Hat Enterprise Linux | net-snmp |
| SUSE, Solaris, OSX | N/A |
The snmp.conf configuration file is located in /etc/snmp/. On Solaris platforms, the snmp.conf file is in /etc/sma/snmp/.
After you establish that SNMP is on the master host, log in as root, open a text editor, and edit the path_to/snmp/snmpd.conf file. To use SNMP with Greenplum Database, the minimum configuration change required to the snmpd.conf file is specifying a community name. For example:
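The example line itself does not appear above. With net-snmp, the read-only community is set with the rocommunity directive; a minimal line (the community name here is a placeholder) looks like:

```
rocommunity public
```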
Replace public with the name of your SNMP community. Greenplum also recommends configuring syslocation and syscontact. Configure other SNMP settings as required for your environment and save the file.
For more information about the snmpd.conf file, see the snmpd.conf man page (enter man snmpd.conf).
On SUSE Linux platforms, make sure to review and configure security settings in the snmp.conf file so snmpd accepts connections from sub-agents and returns all available Object IDs (OIDs).
After you finish configuring the snmpd.conf file, enable the system snmpd daemon so it starts at boot, and start it:
# /sbin/chkconfig snmpd on
# /etc/init.d/snmpd start
Then, verify the system snmpd daemon is running. Enter:
# snmpwalk -v 1 -c community_name localhost .1.3.6.1.2.1.1.1.0
# snmpwalk -v 1 -c public localhost .1.3.6.1.2.1.1.1.0
If this command returns “Timeout: No Response from localhost”, then the system snmpd daemon is not running. If the daemon is running, output similar to the following displays:
SNMPv2-MIB::sysDescr.0 = STRING: Linux hostname 2.6.18-92.el5 #1 SMP Tue Jun 10 18:51:06 EDT 2010 x86_64
Installing and Configuring the Greenplum SNMP Agent
The following procedure describes how to configure the Greenplum SNMP agent (gpsnmpd) to collect and return database information to a network monitor.
If required, gpsnmpd can run as a stand-alone agent on a port other than port 161.
- Log in as root and source the greenplum_path.sh file from your Greenplum installation:
$ su -
# source $GPHOME/greenplum_path.sh
- Copy NETWORK-SERVICES-MIB.txt, GPDB-MIB.txt, and RDBMS-MIB.txt from the $GPHOME/share/postgresql directory to the default location for MIBs on the system. The typical default location is /usr/share/snmp/mibs:
# cp NETWORK-SERVICES-MIB.txt GPDB-MIB.txt RDBMS-MIB.txt /usr/share/snmp/mibs
- Determine how you want to use gpsnmpd. Greenplum
recommends running gpsnmpd as a sub-agent. This allows the
monitoring application to use standard SNMP ports to communicate with both the
system agent and gpsnmpd. This enables you to monitor both system
and Greenplum Database status and events.
To run gpsnmpd as a sub-agent to the existing SNMP agent, open /etc/snmp/snmpd.conf in a text editor and enter the following:
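The directive itself is elided above. With net-snmp, running the system agent as an AgentX master is enabled by adding the following line to snmpd.conf:

```
master agentx
```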
On SUSE Linux, also enter the following:
Alternatively, you can run gpsnmpd as a stand-alone agent. To do this, skip this step and go to Step 6.
- Perform platform-specific tasks:
- On SUSE Linux platforms, create the following link:
# ln -s /var/run/agentx /var/agent
- On Solaris platforms, start the System Management Agent.
- Restart the snmpd daemon:
# /etc/init.d/snmpd restart
- Set up the Greenplum Database table used by the Greenplum SNMP daemon gpsnmpd by running the gpsnmpd.sql script. Greenplum recommends using the postgres database; the database name is specified in step 7 when you start the gpsnmpd daemon. For example, as gpadmin, run this script to create the table in the postgres database:
$ psql -d postgres -f $GPHOME/share/postgresql/gpsnmpd.sql
- Start gpsnmpd.
- To start gpsnmpd as an AgentX sub-agent, enter the following command (as root):
# gpsnmpd -s -b -C "dbname=postgres user=username password=password"
# gpsnmpd -s -b -C "dbname=postgres user=gpadmin password=secret"
- To start gpsnmpd as a stand-alone agent, enter the following command (as root):
# gpsnmpd -b -c path_to/snmp/snmpd.conf -x nic_ip_address:port -C "dbname=postgres user=username password=password"
# gpsnmpd -b -c /etc/snmp/snmpd.conf -x 192.0.2.24:10161 -C "dbname=postgres user=gpadmin password=secret"
Greenplum recommends using the postgres database in the connection string (dbname=postgres).
You do not need to specify the -C option if you create a database role (user identification) called root and add the following line to the pg_hba.conf file:
local postgres root ident
This allows the UNIX user root to connect to the Postgres database over the local connection. The root user does not require special permissions, but other users require the user and password parameters to start gpsnmpd.
You can specify any configuration file to run gpsnmpd as a stand-alone agent; you do not have to use the path_to/snmp/snmpd.conf file. The configuration file you use must include a value for rocommunity.
The -x option allows you to specify an IP address for a network interface card on the host and a port other than the default SNMP port of 161. This enables you to run gpsnmpd without root permissions, because root permissions are required to use ports lower than 1024. You do not need to specify this option if you are running gpsnmpd as an AgentX sub-agent (-s).
gpsnmpd loads a default set of MIBs automatically. However, you can use the -m option to load specific MIBs when starting gpsnmpd, and the -M option to specify a list of directories from which to load MIBs.
Best practice is not to enter a password in the gpsnmpd command as shown in Step 6, or to use the PGPASSWORD environment variable; some operating systems allow non-root users to see process environment variables and passwords through the ps command. Instead, use the .pgpass file. If you do not specify a password in the gpsnmpd command, the agent reads the password from the .pgpass file in the home directory of the user you specify in the command. See The Password File section in the PostgreSQL documentation for more information.
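As a sketch of the .pgpass approach (the role name, password, and path below are illustrative; in real use libpq reads ~/.pgpass in the home directory of the connecting user):

```shell
# libpq password file format: hostname:port:database:username:password
# A scratch path is used for illustration; gpsnmpd (via libpq) reads ~/.pgpass.
PGPASSFILE=/tmp/demo_pgpass
echo "localhost:5432:postgres:gpadmin:secret" > "$PGPASSFILE"
# libpq ignores the file unless group/world access is disallowed
chmod 0600 "$PGPASSFILE"
ls -l "$PGPASSFILE"
```

With the file in place, gpsnmpd can be started with `-C "dbname=postgres user=gpadmin"` and no password parameter.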
- To verify the gpsnmpd agent is enabled and responding, perform the following tests:
- Test server access to the Greenplum SNMP agent:
# snmpwalk -c communityname hostname:161 -v2c RDBMS-MIB::rdbmsRelState
You should see the following output:
Resetting connection
applIndex for 0 is 0
applIndex for 1 is 1
RDBMS-MIB::rdbmsRelState.1.1 = INTEGER: active(2)
RDBMS-MIB::rdbmsRelState.2.0 = INTEGER: active(2)
RDBMS-MIB::rdbmsRelState.2.1 = INTEGER: active(2)
- Verify Greenplum Database appears in the rdbmsDbTable. This table describes each database monitored by the agent:
# snmpwalk -c communityname hostname:161 -v2c RDBMS-MIB::rdbmsDbTable
You should see output similar to the following:
...
RDBMS-MIB::rdbmsDbPrivateMibOID.10888 = OID: SNMPv2-SMI::enterprises.31327.10888
RDBMS-MIB::rdbmsDbPrivateMibOID.10889 = OID: SNMPv2-SMI::enterprises.31327.10889
RDBMS-MIB::rdbmsDbVendorName.1 = STRING: Greenplum Corporation
RDBMS-MIB::rdbmsDbVendorName.10888 = STRING: Greenplum Corporation
RDBMS-MIB::rdbmsDbVendorName.10889 = STRING: Greenplum Corporation
RDBMS-MIB::rdbmsDbName.1 = STRING: postgres
RDBMS-MIB::rdbmsDbName.10888 = STRING: template0
RDBMS-MIB::rdbmsDbName.10889 = STRING: postgres
...
- In your network monitor application (such as Nagios, Cacti, or OpenView), import RDBMS-MIB.txt, GPDB-MIB.txt, and NETWORK-SERVICES-MIB.txt. Specify the host name of the Greenplum master in your monitoring application.
Setting up SNMP Notifications
- To configure a Greenplum Database system to send SNMP notifications when alerts occur, open the postgresql.conf file on the master host and uncomment and set the following parameters:
- gp_snmp_community: Set this parameter to the community name you specified for your environment.
- gp_snmp_monitor_address: Enter the hostname:port of your network monitor application. Typically, the port number is 162. If there are multiple monitor addresses, separate them with a comma.
- gp_snmp_use_inform_or_trap: Enter either trap or inform. Trap notifications are SNMP messages sent from one application to another (for example, between Greenplum Database and a network monitoring application). These messages are unacknowledged by the monitoring application, but generate less network traffic.
Inform notifications are the same as trap messages, except that the receiving application sends an acknowledgement to the application that generated the alert. In this case, the monitoring application acknowledges the trap notifications generated by Greenplum Database. While inform messages create more overhead, they confirm to Greenplum Database that the monitoring application has received the traps.
- To test SNMP notifications, you can use the
snmptrapd trap receiver. As root, enter:
# /usr/sbin/snmptrapd -m ALL -Lf ~/filename.log
On Solaris, enter:
# /usr/sfw/sbin/snmptrapd -m ALL -Lf ~/filename.log
-Lf indicates that traps are logged to a file. -Le indicates that traps are logged to stderr instead. -m ALL loads all available MIBs (you can also specify individual MIBs if required).
Enabling Email Notifications
Complete the following steps to enable Greenplum Database to send email notifications to system administrators whenever certain database events occur.
- Open $MASTER_DATA_DIRECTORY/postgresql.conf in a text editor.
- In the EMAIL ALERTS section, uncomment the following
parameters and enter the appropriate values for your email server and domain. For example:
gp_email_smtp_server='smtp.company.com:25'
gp_email_smtp_userid='firstname.lastname@example.org'
gp_email_smtp_password='mypassword'
gp_email_from='Greenplum Database <email@example.com>'
gp_email_to='firstname.lastname@example.org;John Smith <email@example.com>'
You may create specific email accounts or groups in your email system that send and receive email alerts from the Greenplum Database system. For example:
gp_email_from='GPDB Production Instance <firstname.lastname@example.org>'
gp_email_to='email@example.com'
You can also specify multiple email addresses for both gp_email parameters. Use a semi-colon ( ; ) to separate each email address.
- Save and close the postgresql.conf file.
- Restart Greenplum Database:
$ gpstop -r
Testing Email Notifications
The Greenplum Database master host must be able to connect to the SMTP email server you specify for the gp_email_smtp_server parameter. To test connectivity, use the ping command:
$ ping my_email_server
If the master host can contact the SMTP server, log in to psql and test email notifications with the following command:
$ psql template1
=# SELECT gp_elog('Test GPDB Email',true);
 gp_elog
The address you specified for the gp_email_to parameter should receive an email with Test GPDB Email in the subject line.
You can also test email notifications by using a public SMTP server, such as Google’s Gmail SMTP server, and an external email address. For example:
gp_email_smtp_server='smtp.gmail.com:25'
#gp_email_smtp_userid=''
#gp_email_smtp_password=''
gp_email_from='firstname.lastname@example.org'
gp_email_to='email@example.com'
Checking System State
A Greenplum Database system consists of multiple PostgreSQL instances (the master and segments) spanning multiple machines. To monitor a Greenplum Database system, you need information about the system as a whole, as well as status information for the individual instances. The gpstate utility provides status information about a Greenplum Database system.
Viewing Master and Segment Status and Configuration
The default behavior of gpstate is to check segment instances and show a brief status of the valid and failed segments. For example, to see a quick status of your Greenplum Database system:
$ gpstate
To see more detailed information about your Greenplum array configuration, use gpstate with the -s option:
$ gpstate -s
Viewing Your Mirroring Configuration and Status
If you are using mirroring for data redundancy, you may want to see the list of mirror segment instances in the system, their current synchronization status, and the mirror to primary mapping. For example, to see the mirror segments in the system and their status:
$ gpstate -m
To see the primary to mirror segment mappings:
$ gpstate -c
To see the status of the standby master mirror:
$ gpstate -f
Checking Disk Space Usage
A database administrator’s most important monitoring task is to make sure the file systems where the master and segment data directories reside do not grow to more than 70 percent full. A filled data disk will not result in data corruption, but it may prevent normal database activity from occurring. If the disk grows too full, it can cause the database server to shut down.
You can use the gp_disk_free external table in the gp_toolkit administrative schema to check for remaining free space (in bytes) on the segment host file systems. For example:
=# SELECT * FROM gp_toolkit.gp_disk_free ORDER BY dfsegment;
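You can also spot-check file system fullness from the operating system with df. In practice you would run this on every segment host (for example, via gpssh); the path below is illustrative:

```shell
# show usage for the file system holding a data directory
df -h /tmp
# print any file system over the recommended 70 percent threshold
df -P /tmp | awk 'NR>1 && $5+0 > 70 {print $6, "is", $5, "full"}'
```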
Checking Sizing of Distributed Databases and Tables
The gp_toolkit administrative schema contains several views that you can use to determine the disk space usage for a distributed Greenplum Database, schema, table, or index. For a list of the available sizing views for checking database object sizes and disk space, see the Greenplum Database Reference Guide.
Viewing Disk Space Usage for a Database
To see the total size of a database (in bytes), use the gp_size_of_database view in the gp_toolkit administrative schema. For example:
=> SELECT * FROM gp_toolkit.gp_size_of_database ORDER BY soddatname;
Viewing Disk Space Usage for a Table
The gp_toolkit administrative schema contains several views for checking the size of a table. The table sizing views list the table by object ID (not by name). To check the size of a table by name, you must look up the relation name (relname) in the pg_class table. For example:
=> SELECT relname AS name, sotdsize AS size, sotdtoastsize AS toast, sotdadditionalsize AS other FROM gp_size_of_table_disk as sotd, pg_class WHERE sotd.sotdoid=pg_class.oid ORDER BY relname;
For a list of the available table sizing views, see the Greenplum Database Reference Guide.
Viewing Disk Space Usage for Indexes
The gp_toolkit administrative schema contains a number of views for checking index sizes. To see the total size of all index(es) on a table, use the gp_size_of_all_table_indexes view. To see the size of a particular index, use the gp_size_of_index view. The index sizing views list tables and indexes by object ID (not by name). To check the size of an index by name, you must look up the relation name (relname) in the pg_class table. For example:
=> SELECT soisize, relname as indexname FROM pg_class, gp_size_of_index WHERE pg_class.oid=gp_size_of_index.soioid AND pg_class.relkind='i';
Checking for Data Distribution Skew
All tables in Greenplum Database are distributed, meaning their data is divided across all of the segments in the system. Unevenly distributed data may diminish query processing performance. A table's distribution policy is determined at table creation time. For information about choosing the table distribution policy, see the following topics:
The gp_toolkit administrative schema also contains a number of views for checking data distribution skew on a table. For information about how to check for uneven data distribution, see the Greenplum Database Reference Guide.
Viewing a Table’s Distribution Key
To see the columns used as the data distribution key for a table, you can use the \d+ meta-command in psql to examine the definition of a table. For example:
=# \d+ sales
          Table "retail.sales"
   Column   |     Type     | Modifiers | Description
------------+--------------+-----------+-------------
 sale_id    | integer      |           |
 amt        | float        |           |
 date       | date         |           |
Has OIDs: no
Distributed by: (sale_id)
Viewing Data Distribution
To see the data distribution of a table’s rows (the number of rows on each segment), you can run a query such as:
=# SELECT gp_segment_id, count(*) FROM table_name GROUP BY gp_segment_id;
A table is considered to have a balanced distribution if all segments have roughly the same number of rows.
Checking for Query Processing Skew
When a query is being processed, all segments should have equal workloads to ensure the best possible performance. If you identify a poorly performing query, you may need to investigate further using the EXPLAIN command. For information about using the EXPLAIN command and query profiling, see Query Profiling.
Query processing workload can be skewed if the table’s data distribution policy and the query predicates are not well matched. To check for processing skew, you can run a query such as:
=# SELECT gp_segment_id, count(*) FROM table_name WHERE column='value' GROUP BY gp_segment_id;
This shows the number of rows returned by each segment for the given WHERE predicate.
Viewing Metadata Information about Database Objects
Greenplum Database tracks various metadata information in its system catalogs about the objects stored in a database, such as tables, views, indexes and so on, as well as global objects such as roles and tablespaces.
Viewing the Last Operation Performed
You can use the system views pg_stat_operations and pg_stat_partition_operations to look up actions performed on an object, such as a table. For example, to see the actions performed on a table, such as when it was created and when it was last vacuumed and analyzed:
=> SELECT schemaname AS schema, objname AS table, usename AS role,
          actionname AS action, subtype AS type, statime AS time
   FROM pg_stat_operations
   WHERE objname='cust';

 schema | table | role | action  | type  |             time
--------+-------+------+---------+-------+-------------------------------
 sales  | cust  | main | CREATE  | TABLE | 2010-02-09 18:10:07.867977-08
 sales  | cust  | main | VACUUM  |       | 2010-02-10 13:32:39.068219-08
 sales  | cust  | main | ANALYZE |       | 2010-02-25 16:07:01.157168-08
(3 rows)
Viewing the Definition of an Object
To see the definition of an object, such as a table or view, you can use the \d+ meta command when working in psql. For example, to see the definition of a table:
=> \d+ mytable
Viewing Query Workfile Usage Information
The Greenplum Database administrative schema gp_toolkit contains views that display information about Greenplum Database workfiles. Greenplum Database creates workfiles on disk if it does not have sufficient memory to execute the query in memory. You can use this information for troubleshooting and tuning queries, and to set values for the Greenplum Database configuration parameters gp_workfile_limit_per_query and gp_workfile_limit_per_segment.
These are the views in the schema gp_toolkit:
- The gp_workfile_entries view contains one row for each operator using disk space for workfiles on a segment at the current time.
- The gp_workfile_usage_per_query view contains one row for each query using disk space for workfiles on a segment at the current time.
- The gp_workfile_usage_per_segment view contains one row for each segment. Each row displays the total amount of disk space used for workfiles on the segment at the current time.
For information about using gp_toolkit, see Using gp_toolkit.
Viewing the Database Server Log Files
Every database instance in Greenplum Database (master and segments) runs a PostgreSQL database server with its own server log file. Daily log files are created in the pg_log directory of the master and each segment data directory.
Log File Format
The server log files are written in comma-separated values (CSV) format. Some log entries will not have values for all log fields. For example, only log entries associated with a query worker process will have the slice_id populated. You can identify related log entries of a particular query by the query’s session identifier (gp_session_id) and command identifier (gp_command_count).
The following fields are written to the log:
| # | Field Name | Data Type | Description |
|---|---|---|---|
| 1 | event_time | timestamp with time zone | Time that the log entry was written to the log |
| 2 | user_name | varchar(100) | The database user name |
| 3 | database_name | varchar(100) | The database name |
| 4 | process_id | varchar(10) | The system process ID (prefixed with "p") |
| 5 | thread_id | varchar(50) | The thread count (prefixed with "th") |
| 6 | remote_host | varchar(100) | On the master, the hostname/address of the client machine. On the segment, the hostname/address of the master. |
| 7 | remote_port | varchar(10) | The segment or master port number |
| 8 | session_start_time | timestamp with time zone | Time session connection was opened |
| 9 | transaction_id | int | Top-level transaction ID on the master. This ID is the parent of any subtransactions. |
| 10 | gp_session_id | text | Session identifier number (prefixed with "con") |
| 11 | gp_command_count | text | The command number within a session (prefixed with "cmd") |
| 12 | gp_segment | text | The segment content identifier (prefixed with "seg" for primaries or "mir" for mirrors). The master always has a content ID of -1. |
| 13 | slice_id | text | The slice ID (portion of the query plan being executed) |
| 14 | distr_tranx_id | text | Distributed transaction ID |
| 15 | local_tranx_id | text | Local transaction ID |
| 16 | subtranx_id | text | Subtransaction ID |
| 17 | event_severity | varchar(10) | Values include: LOG, ERROR, FATAL, PANIC, DEBUG1, DEBUG2 |
| 18 | sql_state_code | varchar(10) | SQL state code associated with the log message |
| 19 | event_message | text | Log or error message text |
| 20 | event_detail | text | Detail message text associated with an error or warning message |
| 21 | event_hint | text | Hint message text associated with an error or warning message |
| 22 | internal_query | text | The internally-generated query text |
| 23 | internal_query_pos | int | The cursor index into the internally-generated query text |
| 24 | event_context | text | The context in which this message gets generated |
| 25 | debug_query_string | text | User-supplied query string with full detail for debugging. This string can be modified for internal use. |
| 26 | error_cursor_pos | int | The cursor index into the query string |
| 27 | func_name | text | The function in which this message is generated |
| 28 | file_name | text | The internal code file where the message originated |
| 29 | file_line | int | The line of the code file where the message originated |
| 30 | stack_trace | text | Stack trace text associated with this message |
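Because the log is CSV with fixed field positions, related entries for one query can be pulled out with the session identifier. A minimal sketch against a hypothetical, abbreviated two-line sample log (field values are invented for illustration):

```shell
# write a tiny sample of a CSV server log (fields abbreviated, values invented)
cat > /tmp/sample_gpdb_log.csv <<'EOF'
2010-02-09 18:10:07.867977-08,gpadmin,postgres,p12345,th-1,client1,5432,2010-02-09 18:00:00-08,100,con9,cmd1,seg-1,,,,,LOG,00000,"statement: SELECT 1"
2010-02-09 18:10:08.000000-08,gpadmin,postgres,p12345,th-1,client1,5432,2010-02-09 18:00:00-08,100,con9,cmd2,seg-1,,,,,LOG,00000,"statement: SELECT 2"
EOF
# all log entries for session con9:
grep ',con9,' /tmp/sample_gpdb_log.csv
```

On a real system you would run the grep against the current file in pg_log, or use gplogfilter as described below.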
Searching the Greenplum Database Server Log Files
Greenplum provides a utility called gplogfilter that can search through a Greenplum Database log file for entries matching specified criteria. By default, this utility searches through the Greenplum master log file in the default logging location. For example, to display the last three lines of the master log file:
$ gplogfilter -n 3
To search through all segment log files simultaneously, run gplogfilter through the gpssh utility. For example, to display the last three lines of each segment log file:
$ gpssh -f seg_host_file
=> source /usr/local/greenplum-db/greenplum_path.sh
=> gplogfilter -n 3 /gpdata/gp*/pg_log/gpdb*.log
Using gp_toolkit
Use Greenplum’s administrative schema gp_toolkit to query the system catalogs, log files, and operating environment for system status information. The gp_toolkit schema contains several views that you can access using SQL commands. The gp_toolkit schema is accessible to all database users, although some objects require superuser permissions. Use a command similar to the following to add the gp_toolkit schema to your schema search path:
=> ALTER ROLE myrole SET search_path TO myschema,gp_toolkit;
For a description of the available administrative schema views and their usages, see the Greenplum Database Reference Guide.