Checking for Tables that Need Routine Maintenance

Checking for Tables that Need Routine Maintenance

The following views can help identify tables that need routine table maintenance (VACUUM and/or ANALYZE).

The VACUUM or VACUUM FULL command reclaims disk space occupied by deleted or obsolete rows. Because of the MVCC transaction concurrency model used in Greenplum Database, data rows that are deleted or updated still occupy physical space on disk even though they are not visible to any new transactions. Expired rows increase table size on disk and eventually slow down scans of the table.

The ANALYZE command collects column-level statistics needed by the query optimizer. Greenplum Database uses a cost-based query optimizer that relies on database statistics. Accurate statistics allow the query optimizer to better estimate selectivity and the number of rows retrieved by a query operation in order to choose the most efficient query plan.

gp_bloat_diag

This view shows regular heap-storage tables that have bloat (the actual number of pages on disk exceeds the expected number of pages given the table statistics). Tables that are bloated require a VACUUM or a VACUUM FULL in order to reclaim disk space occupied by deleted or obsolete rows. This view is accessible to all users, however non-superusers will only be able to see the tables that they have permission to access.
Note: For diagnostic functions that return append-optimized table information, see Checking Append-Optimized Tables.
Table 1. gp_bloat_diag view
Column Description
bdirelid Table object id.
bdinspname Schema name.
bdirelname Table name.
bdirelpages Actual number of pages on disk.
bdiexppages Expected number of pages given the table data.
bdidiag Bloat diagnostic message.

gp_stats_missing

This view shows tables that do not have statistics and therefore may require an ANALYZE be run on the table.

Table 2. gp_stats_missing view
Column Description
smischema Schema name.
smitable Table name.
smisize Does this table have statistics? False if the table does not have row count and row sizing statistics recorded in the system catalog, which may indicate that the table needs to be analyzed. This will also be false if the table does not contain any rows. For example, the parent tables of partitioned tables are always empty and will always return a false result.
smicols Number of columns in the table.
smirecs Number of rows in the table.