| <!-- |
| $PostgreSQL: pgsql/doc/src/sgml/ref/analyze.sgml,v 1.20 2006/09/16 00:30:16 momjian Exp $ |
| PostgreSQL documentation |
| --> |
| |
| <refentry id="SQL-ANALYZE"> |
| <refmeta> |
| <refentrytitle id="sql-analyze-title">ANALYZE</refentrytitle> |
| <refmiscinfo>SQL - Language Statements</refmiscinfo> |
| </refmeta> |
| |
| <refnamediv> |
| <refname>ANALYZE</refname> |
| <refpurpose>collect statistics about a database</refpurpose> |
| </refnamediv> |
| |
| <indexterm zone="sql-analyze"> |
| <primary>ANALYZE</primary> |
| </indexterm> |
| |
| <refsynopsisdiv> |
| <synopsis> |
| ANALYZE [ VERBOSE ] [ <replaceable class="PARAMETER">table</replaceable> [ (<replaceable class="PARAMETER">column</replaceable> [, ...] ) ] ] |
| </synopsis> |
| </refsynopsisdiv> |
| |
| <refsect1> |
| <title>Description</title> |
| |
| <para> |
| <command>ANALYZE</command> collects statistics about the contents |
| of tables in the database, and stores the results in the system |
| table <literal>pg_statistic</literal>. Subsequently, the query |
| planner uses these statistics to help determine the most efficient |
| execution plans for queries. |
| </para> |
| |
| <para> |
| With no parameter, <command>ANALYZE</command> examines every table in the |
| current database. With a parameter, <command>ANALYZE</command> examines |
| only that table. It is further possible to give a list of column names, |
| in which case only the statistics for those columns are collected. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>Parameters</title> |
| |
| <variablelist> |
| <varlistentry> |
| <term><literal>VERBOSE</literal></term> |
| <listitem> |
| <para> |
| Enables display of progress messages. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="PARAMETER">table</replaceable></term> |
| <listitem> |
| <para> |
| The name (possibly schema-qualified) of a specific table to |
| analyze. Defaults to all tables in the current database. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="PARAMETER">column</replaceable></term> |
| <listitem> |
| <para> |
| The name of a specific column to analyze. Defaults to all columns. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </refsect1> |
| |
| <refsect1> |
| <title>Outputs</title> |
| |
| <para> |
| When <literal>VERBOSE</> is specified, <command>ANALYZE</> emits |
| progress messages to indicate which table is currently being |
| processed. Various statistics about the tables are printed as well. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>Notes</title> |
| |
| <para> |
| It is a good idea to run <command>ANALYZE</command> periodically, or |
| just after making major changes in the contents of a table. Accurate |
| statistics will help the planner to choose the most appropriate query |
| plan, and thereby improve the speed of query processing. A common |
| strategy is to run <xref linkend="sql-vacuum" endterm="sql-vacuum-title"> |
| and <command>ANALYZE</command> once a day during a low-usage time of day. |
| </para> |
| |
| <para> |
| Unlike <command>VACUUM FULL</command>, <command>ANALYZE</command> |
| requires only a read lock on the target table, so it can run in |
| parallel with other activity on the table. |
| </para> |
| |
| <para> |
| The statistics collected by <command>ANALYZE</command> usually |
| include a list of some of the most common values in each column and |
| a histogram showing the approximate data distribution in each |
| column. One or both of these may be omitted if |
| <command>ANALYZE</command> deems them uninteresting (for example, |
| in a unique-key column, there are no common values) or if the |
| column data type does not support the appropriate operators. There |
| is more information about the statistics in <xref |
| linkend="maintenance">. |
| </para> |
| |
| <para> |
| For large tables, <command>ANALYZE</command> takes a random sample |
| of the table contents, rather than examining every row. This |
| allows even very large tables to be analyzed in a small amount of |
| time. Note, however, that the statistics are only approximate, and |
| will change slightly each time <command>ANALYZE</command> is run, |
| even if the actual table contents did not change. This may result |
| in small changes in the planner's estimated costs shown by |
| <command>EXPLAIN</command>. In rare situations, this |
| non-determinism will cause the query optimizer to choose a |
| different query plan between runs of <command>ANALYZE</command>. To |
| avoid this, raise the amount of statistics collected by |
| <command>ANALYZE</command>, as described below. |
| </para> |
| |
| <para> |
| The extent of analysis can be controlled by adjusting the |
| <xref linkend="guc-default-statistics-target"> configuration variable, or |
| on a column-by-column basis by setting the per-column statistics |
| target with <command>ALTER TABLE ... ALTER COLUMN ... SET |
| STATISTICS</command> (see <xref linkend="sql-altertable" |
| endterm="sql-altertable-title">). The target value sets the |
| maximum number of entries in the most-common-value list and the |
| maximum number of bins in the histogram. The default target value |
| is 10, but this can be adjusted up or down to trade off accuracy of |
| planner estimates against the time taken for |
| <command>ANALYZE</command> and the amount of space occupied in |
| <literal>pg_statistic</literal>. In particular, setting the |
| statistics target to zero disables collection of statistics for |
| that column. It may be useful to do that for columns that are |
| never used as part of the <literal>WHERE</>, <literal>GROUP BY</>, |
| or <literal>ORDER BY</> clauses of queries, since the planner will |
| have no use for statistics on such columns. |
| </para> |
| |
| <para> |
| The largest statistics target among the columns being analyzed determines |
| the number of table rows sampled to prepare the statistics. Increasing |
| the target causes a proportional increase in the time and space needed |
| to do <command>ANALYZE</command>. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>Compatibility</title> |
| |
| <para> |
| There is no <command>ANALYZE</command> statement in the SQL standard. |
| </para> |
| </refsect1> |
| </refentry> |