Managing Spill Files Generated by Queries

Managing Spill Files Generated by Queries

Greenplum Database creates spill files, also known as workfiles, on disk if it does not have sufficient memory to execute an SQL query in memory. The default value of 100,000 spill files is sufficient for the majority of queries. However, if a query creates more than the specified number of spill files, Greenplum Database returns this error:
ERROR: number of workfiles per query limit exceeded
Reasons that cause a large number of spill files to be generated include:
  • Data skew is present in the queried data.
  • The amount memory allocated for the query is too low.

You might be able to run the query successfully by changing the query, changing the data distribution, or changing the system memory configuration. You can use the gp_workfile_* views to see spill file usage information. You can control the maximum amount of memory that can used by a query with the Greenplum Database server configuration parameters max_statement_mem, statement_mem, or through resource queues.

Monitoring a Greenplum System contains the following information:
  • Information about skew and how to check for data skew
  • Information about using the gp_workfile_* views

For information about server configuration parameters, see the Greenplum Database Reference Guide. For information about resource queues, see Workload Management with Resource Queues.

If you have determined that the query must create more spill files than allowed by the value of server configuration parameter gp_workfile_limit_files_per_query, you can increase the value of the parameter.