| <!-- |
| doc/src/sgml/ref/create_index.sgml |
| PostgreSQL documentation |
| --> |
| |
| <refentry id="sql-createindex"> |
| <indexterm zone="sql-createindex"> |
| <primary>CREATE INDEX</primary> |
| </indexterm> |
| |
| <refmeta> |
| <refentrytitle>CREATE INDEX</refentrytitle> |
| <manvolnum>7</manvolnum> |
| <refmiscinfo>SQL - Language Statements</refmiscinfo> |
| </refmeta> |
| |
| <refnamediv> |
| <refname>CREATE INDEX</refname> |
| <refpurpose>define a new index</refpurpose> |
| </refnamediv> |
| |
| <refsynopsisdiv> |
| <synopsis> |
| CREATE [ UNIQUE ] INDEX [ CONCURRENTLY ] [ [ IF NOT EXISTS ] <replaceable class="parameter">name</replaceable> ] ON [ ONLY ] <replaceable class="parameter">table_name</replaceable> [ USING <replaceable class="parameter">method</replaceable> ] |
| ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> [ ( <replaceable class="parameter">opclass_parameter</replaceable> = <replaceable class="parameter">value</replaceable> [, ... ] ) ] ] [ ASC | DESC ] [ NULLS { FIRST | LAST } ] [, ...] ) |
| [ INCLUDE ( <replaceable class="parameter">column_name</replaceable> [, ...] ) ] |
| [ WITH ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) ] |
| [ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ] |
| [ WHERE <replaceable class="parameter">predicate</replaceable> ] |
| </synopsis> |
| </refsynopsisdiv> |
| |
| <refsect1> |
| <title>Description</title> |
| |
| <para> |
| <command>CREATE INDEX</command> constructs an index on the specified column(s) |
| of the specified relation, which can be a table or a materialized view. |
| Indexes are primarily used to enhance database performance (though |
| inappropriate use can result in slower performance). |
| </para> |
| |
| <para> |
| The key field(s) for the index are specified as column names, |
| or alternatively as expressions written in parentheses. |
| Multiple fields can be specified if the index method supports |
| multicolumn indexes. |
| </para> |
| |
| <para> |
| An index field can be an expression computed from the values of |
| one or more columns of the table row. This feature can be used |
| to obtain fast access to data based on some transformation of |
| the basic data. For example, an index computed on |
| <literal>upper(col)</literal> would allow the clause |
| <literal>WHERE upper(col) = 'JIM'</literal> to use an index. |
| </para> |
| |
| <para> |
| <productname>PostgreSQL</productname> provides the index methods |
| B-tree, hash, GiST, SP-GiST, GIN, and BRIN. Users can also define their own |
| index methods, but that is fairly complicated. |
| </para> |
| |
| <para> |
| When the <literal>WHERE</literal> clause is present, a |
| <firstterm>partial index</firstterm> is created. |
| A partial index is an index that contains entries for only a portion of |
| a table, usually a portion that is more useful for indexing than the |
| rest of the table. For example, if you have a table that contains both |
| billed and unbilled orders where the unbilled orders take up a small |
| fraction of the total table and yet that is an often used section, you |
| can improve performance by creating an index on just that portion. |
| Another possible application is to use <literal>WHERE</literal> with |
| <literal>UNIQUE</literal> to enforce uniqueness over a subset of a |
| table. See <xref linkend="indexes-partial"/> for more discussion. |
| </para> |
| |
| <para> |
| The expression used in the <literal>WHERE</literal> clause can refer |
| only to columns of the underlying table, but it can use all columns, |
| not just the ones being indexed. Presently, subqueries and |
| aggregate expressions are also forbidden in <literal>WHERE</literal>. |
| The same restrictions apply to index fields that are expressions. |
| </para> |
| |
| <para> |
| All functions and operators used in an index definition must be |
| <quote>immutable</quote>, that is, their results must depend only on |
| their arguments and never on any outside influence (such as |
| the contents of another table or the current time). This restriction |
| ensures that the behavior of the index is well-defined. To use a |
| user-defined function in an index expression or <literal>WHERE</literal> |
| clause, remember to mark the function immutable when you create it. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>Parameters</title> |
| |
| <variablelist> |
| <varlistentry> |
| <term><literal>UNIQUE</literal></term> |
| <listitem> |
| <para> |
| Causes the system to check for |
| duplicate values in the table when the index is created (if data |
| already exist) and each time data is added. Attempts to |
| insert or update data which would result in duplicate entries |
| will generate an error. |
| </para> |
| |
| <para> |
| Additional restrictions apply when unique indexes are applied to |
| partitioned tables; see <xref linkend="sql-createtable" />. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>CONCURRENTLY</literal></term> |
| <listitem> |
| <para> |
| When this option is used, <productname>PostgreSQL</productname> will build the |
| index without taking any locks that prevent concurrent inserts, |
| updates, or deletes on the table; whereas a standard index build |
| locks out writes (but not reads) on the table until it's done. |
| There are several caveats to be aware of when using this option |
| — see <xref linkend="sql-createindex-concurrently"/> below. |
| </para> |
| <para> |
| For temporary tables, <command>CREATE INDEX</command> is always |
| non-concurrent, as no other session can access them, and |
| non-concurrent index creation is cheaper. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>IF NOT EXISTS</literal></term> |
| <listitem> |
| <para> |
| Do not throw an error if a relation with the same name already exists. |
| A notice is issued in this case. Note that there is no guarantee that |
| the existing index is anything like the one that would have been created. |
| Index name is required when <literal>IF NOT EXISTS</literal> is specified. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>INCLUDE</literal></term> |
| <listitem> |
| <para> |
| The optional <literal>INCLUDE</literal> clause specifies a |
| list of columns which will be included in the index |
| as <firstterm>non-key</firstterm> columns. A non-key column cannot |
| be used in an index scan search qualification, and it is disregarded |
| for purposes of any uniqueness or exclusion constraint enforced by |
| the index. However, an index-only scan can return the contents of |
| non-key columns without having to visit the index's table, since |
| they are available directly from the index entry. Thus, addition of |
| non-key columns allows index-only scans to be used for queries that |
| otherwise could not use them. |
| </para> |
| |
| <para> |
| It's wise to be conservative about adding non-key columns to an |
| index, especially wide columns. If an index tuple exceeds the |
| maximum size allowed for the index type, data insertion will fail. |
| In any case, non-key columns duplicate data from the index's table |
| and bloat the size of the index, thus potentially slowing searches. |
| Furthermore, B-tree deduplication is never used with indexes |
| that have a non-key column. |
| </para> |
| |
| <para> |
| Columns listed in the <literal>INCLUDE</literal> clause don't need |
| appropriate operator classes; the clause can include |
| columns whose data types don't have operator classes defined for |
| a given access method. |
| </para> |
| |
| <para> |
| Expressions are not supported as included columns since they cannot be |
| used in index-only scans. |
| </para> |
| |
| <para> |
| Currently, the B-tree, GiST and SP-GiST index access methods support |
| this feature. In these indexes, the values of columns listed |
| in the <literal>INCLUDE</literal> clause are included in leaf tuples |
| which correspond to heap tuples, but are not included in upper-level |
| index entries used for tree navigation. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">name</replaceable></term> |
| <listitem> |
| <para> |
| The name of the index to be created. No schema name can be included |
| here; the index is always created in the same schema as its parent |
| table. If the name is omitted, <productname>PostgreSQL</productname> chooses a |
| suitable name based on the parent table's name and the indexed column |
| name(s). |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>ONLY</literal></term> |
| <listitem> |
| <para> |
| Indicates not to recurse creating indexes on partitions, if the |
| table is partitioned. The default is to recurse. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">table_name</replaceable></term> |
| <listitem> |
| <para> |
| The name (possibly schema-qualified) of the table to be indexed. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">method</replaceable></term> |
| <listitem> |
| <para> |
| The name of the index method to be used. Choices are |
| <literal>btree</literal>, <literal>hash</literal>, |
| <literal>gist</literal>, <literal>spgist</literal>, <literal>gin</literal>, and |
| <literal>brin</literal>. |
| The default method is <literal>btree</literal>. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">column_name</replaceable></term> |
| <listitem> |
| <para> |
| The name of a column of the table. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">expression</replaceable></term> |
| <listitem> |
| <para> |
| An expression based on one or more columns of the table. The |
| expression usually must be written with surrounding parentheses, |
| as shown in the syntax. However, the parentheses can be omitted |
| if the expression has the form of a function call. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">collation</replaceable></term> |
| <listitem> |
| <para> |
| The name of the collation to use for the index. By default, |
| the index uses the collation declared for the column to be |
| indexed or the result collation of the expression to be |
| indexed. Indexes with non-default collations can be useful for |
| queries that involve expressions using non-default collations. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">opclass</replaceable></term> |
| <listitem> |
| <para> |
| The name of an operator class. See below for details. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">opclass_parameter</replaceable></term> |
| <listitem> |
| <para> |
| The name of an operator class parameter. See below for details. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>ASC</literal></term> |
| <listitem> |
| <para> |
| Specifies ascending sort order (which is the default). |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>DESC</literal></term> |
| <listitem> |
| <para> |
| Specifies descending sort order. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>NULLS FIRST</literal></term> |
| <listitem> |
| <para> |
| Specifies that nulls sort before non-nulls. This is the default |
| when <literal>DESC</literal> is specified. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>NULLS LAST</literal></term> |
| <listitem> |
| <para> |
| Specifies that nulls sort after non-nulls. This is the default |
| when <literal>DESC</literal> is not specified. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">storage_parameter</replaceable></term> |
| <listitem> |
| <para> |
| The name of an index-method-specific storage parameter. See |
| <xref linkend="sql-createindex-storage-parameters"/> below |
| for details. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">tablespace_name</replaceable></term> |
| <listitem> |
| <para> |
| The tablespace in which to create the index. If not specified, |
| <xref linkend="guc-default-tablespace"/> is consulted, or |
| <xref linkend="guc-temp-tablespaces"/> for indexes on temporary |
| tables. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">predicate</replaceable></term> |
| <listitem> |
| <para> |
| The constraint expression for a partial index. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| </variablelist> |
| |
| <refsect2 id="sql-createindex-storage-parameters" xreflabel="Index Storage Parameters"> |
| <title>Index Storage Parameters</title> |
| |
| <para> |
| The optional <literal>WITH</literal> clause specifies <firstterm>storage |
| parameters</firstterm> for the index. Each index method has its own set of allowed |
| storage parameters. The B-tree, hash, GiST and SP-GiST index methods all |
| accept this parameter: |
| </para> |
| |
| <variablelist> |
| <varlistentry id="index-reloption-fillfactor" xreflabel="fillfactor"> |
| <term><literal>fillfactor</literal> (<type>integer</type>) |
| <indexterm> |
| <primary><varname>fillfactor</varname> storage parameter</primary> |
| </indexterm> |
| </term> |
| <listitem> |
| <para> |
| The fillfactor for an index is a percentage that determines how full |
| the index method will try to pack index pages. For B-trees, leaf pages |
| are filled to this percentage during initial index builds, and also |
| when extending the index at the right (adding new largest key values). |
| If pages |
| subsequently become completely full, they will be split, leading to |
| fragmentation of the on-disk index structure. B-trees use a default |
| fillfactor of 90, but any integer value from 10 to 100 can be selected. |
| </para> |
| <para> |
| B-tree indexes on tables where many inserts and/or updates are |
| anticipated can benefit from lower fillfactor settings at |
| <command>CREATE INDEX</command> time (following bulk loading into the |
| table). Values in the range of 50 - 90 can usefully <quote>smooth |
| out</quote> the <emphasis>rate</emphasis> of page splits during the |
| early life of the B-tree index (lowering fillfactor like this may even |
| lower the absolute number of page splits, though this effect is highly |
| workload dependent). The B-tree bottom-up index deletion technique |
| described in <xref linkend="btree-deletion"/> is dependent on having |
| some <quote>extra</quote> space on pages to store <quote>extra</quote> |
| tuple versions, and so can be affected by fillfactor (though the effect |
| is usually not significant). |
| </para> |
| <para> |
| In other specific cases it might be useful to increase fillfactor to |
| 100 at <command>CREATE INDEX</command> time as a way of maximizing |
| space utilization. You should only consider this when you are |
| completely sure that the table is static (i.e. that it will never be |
| affected by either inserts or updates). A fillfactor setting of 100 |
| otherwise risks <emphasis>harming</emphasis> performance: even a few |
| updates or inserts will cause a sudden flood of page splits. |
| </para> |
| <para> |
| The other index methods use fillfactor in different but roughly |
| analogous ways; the default fillfactor varies between methods. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| |
| <para> |
| B-tree indexes additionally accept this parameter: |
| </para> |
| |
| <variablelist> |
| <varlistentry id="index-reloption-deduplicate-items" xreflabel="deduplicate_items"> |
| <term><literal>deduplicate_items</literal> (<type>boolean</type>) |
| <indexterm> |
| <primary><varname>deduplicate_items</varname> storage parameter</primary> |
| </indexterm> |
| </term> |
| <listitem> |
| <para> |
| Controls usage of the B-tree deduplication technique described |
| in <xref linkend="btree-deduplication"/>. Set to |
| <literal>ON</literal> or <literal>OFF</literal> to enable or |
| disable the optimization. (Alternative spellings of |
| <literal>ON</literal> and <literal>OFF</literal> are allowed as |
| described in <xref linkend="config-setting"/>.) The default is |
| <literal>ON</literal>. |
| </para> |
| |
| <note> |
| <para> |
| Turning <literal>deduplicate_items</literal> off via |
| <command>ALTER INDEX</command> prevents future insertions from |
| triggering deduplication, but does not in itself make existing |
| posting list tuples use the standard tuple representation. |
| </para> |
| </note> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| |
| <para> |
| GiST indexes additionally accept this parameter: |
| </para> |
| |
| <variablelist> |
| <varlistentry id="index-reloption-buffering" xreflabel="buffering"> |
| <term><literal>buffering</literal> (<type>enum</type>) |
| <indexterm> |
| <primary><varname>buffering</varname> storage parameter</primary> |
| </indexterm> |
| </term> |
| <listitem> |
| <para> |
| Determines whether the buffered build technique described in |
| <xref linkend="gist-buffering-build"/> is used to build the index. With |
| <literal>OFF</literal> buffering is disabled, with <literal>ON</literal> |
| it is enabled, and with <literal>AUTO</literal> it is initially disabled, |
| but is turned on on-the-fly once the index size reaches |
| <xref linkend="guc-effective-cache-size"/>. The default |
| is <literal>AUTO</literal>. |
| Note that if sorted build is possible, it will be used instead of |
| buffered build unless <literal>buffering=ON</literal> is specified. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| |
| <para> |
| GIN indexes accept different parameters: |
| </para> |
| |
| <variablelist> |
| <varlistentry id="index-reloption-fastupdate" xreflabel="fastupdate"> |
| <term><literal>fastupdate</literal> (<type>boolean</type>) |
| <indexterm> |
| <primary><varname>fastupdate</varname> storage parameter</primary> |
| </indexterm> |
| </term> |
| <listitem> |
| <para> |
| This setting controls usage of the fast update technique described in |
| <xref linkend="gin-fast-update"/>. It is a Boolean parameter: |
| <literal>ON</literal> enables fast update, <literal>OFF</literal> disables it. |
| The default is <literal>ON</literal>. |
| </para> |
| |
| <note> |
| <para> |
| Turning <literal>fastupdate</literal> off via <command>ALTER INDEX</command> prevents |
| future insertions from going into the list of pending index entries, |
| but does not in itself flush previous entries. You might want to |
| <command>VACUUM</command> the table or call <function>gin_clean_pending_list</function> |
| function afterward to ensure the pending list is emptied. |
| </para> |
| </note> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| |
| <variablelist> |
| <varlistentry id="index-reloption-gin-pending-list-limit" xreflabel="gin_pending_list_limit"> |
| <term><literal>gin_pending_list_limit</literal> (<type>integer</type>) |
| <indexterm> |
| <primary><varname>gin_pending_list_limit</varname></primary> |
| <secondary>storage parameter</secondary> |
| </indexterm> |
| </term> |
| <listitem> |
| <para> |
| Custom <xref linkend="guc-gin-pending-list-limit"/> parameter. |
| This value is specified in kilobytes. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| |
| <para> |
| <acronym>BRIN</acronym> indexes accept different parameters: |
| </para> |
| |
| <variablelist> |
| <varlistentry id="index-reloption-pages-per-range" xreflabel="pages_per_range"> |
| <term><literal>pages_per_range</literal> (<type>integer</type>) |
| <indexterm> |
| <primary><varname>pages_per_range</varname> storage parameter</primary> |
| </indexterm> |
| </term> |
| <listitem> |
| <para> |
| Defines the number of table blocks that make up one block range for |
| each entry of a <acronym>BRIN</acronym> index (see <xref linkend="brin-intro"/> |
| for more details). The default is <literal>128</literal>. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="index-reloption-autosummarize" xreflabel="autosummarize"> |
| <term><literal>autosummarize</literal> (<type>boolean</type>) |
| <indexterm> |
| <primary><varname>autosummarize</varname> storage parameter</primary> |
| </indexterm> |
| </term> |
| <listitem> |
| <para> |
| Defines whether a summarization run is queued for the previous page |
| range whenever an insertion is detected on the next one. |
| See <xref linkend="brin-operation"/> for more details. |
| The default is <literal>off</literal>. |
| </para> |
| </listitem> |
| </varlistentry> |
| </variablelist> |
| </refsect2> |
| |
| <refsect2 id="sql-createindex-concurrently" xreflabel="Building Indexes Concurrently"> |
| <title>Building Indexes Concurrently</title> |
| |
| <indexterm zone="sql-createindex-concurrently"> |
| <primary>index</primary> |
| <secondary>building concurrently</secondary> |
| </indexterm> |
| |
| <para> |
| Creating an index can interfere with regular operation of a database. |
| Normally <productname>PostgreSQL</productname> locks the table to be indexed against |
| writes and performs the entire index build with a single scan of the |
| table. Other transactions can still read the table, but if they try to |
| insert, update, or delete rows in the table they will block until the |
| index build is finished. This could have a severe effect if the system is |
| a live production database. Very large tables can take many hours to be |
| indexed, and even for smaller tables, an index build can lock out writers |
| for periods that are unacceptably long for a production system. |
| </para> |
| |
| <para> |
| <productname>PostgreSQL</productname> supports building indexes without locking |
| out writes. This method is invoked by specifying the |
| <literal>CONCURRENTLY</literal> option of <command>CREATE INDEX</command>. |
| When this option is used, |
| <productname>PostgreSQL</productname> must perform two scans of the table, and in |
| addition it must wait for all existing transactions that could potentially |
| modify or use the index to terminate. Thus |
| this method requires more total work than a standard index build and takes |
| significantly longer to complete. However, since it allows normal |
| operations to continue while the index is built, this method is useful for |
| adding new indexes in a production environment. Of course, the extra CPU |
| and I/O load imposed by the index creation might slow other operations. |
| </para> |
| |
| <para> |
| In a concurrent index build, the index is actually entered as an |
| <quote>invalid</quote> index into |
| the system catalogs in one transaction, then two table scans occur in |
| two more transactions. Before each table scan, the index build must |
| wait for existing transactions that have modified the table to terminate. |
| After the second scan, the index build must wait for any transactions |
| that have a snapshot (see <xref linkend="mvcc"/>) predating the second |
| scan to terminate, including transactions used by any phase of concurrent |
| index builds on other tables, if the indexes involved are partial or have |
| columns that are not simple column references. |
| Then finally the index can be marked <quote>valid</quote> and ready for use, |
| and the <command>CREATE INDEX</command> command terminates. |
| Even then, however, the index may not be immediately usable for queries: |
| in the worst case, it cannot be used as long as transactions exist that |
| predate the start of the index build. |
| </para> |
| |
| <para> |
| If a problem arises while scanning the table, such as a deadlock or a |
| uniqueness violation in a unique index, the <command>CREATE INDEX</command> |
| command will fail but leave behind an <quote>invalid</quote> index. This index |
| will be ignored for querying purposes because it might be incomplete; |
| however it will still consume update overhead. The <application>psql</application> |
| <command>\d</command> command will report such an index as <literal>INVALID</literal>: |
| |
| <programlisting> |
| postgres=# \d tab |
| Table "public.tab" |
| Column | Type | Collation | Nullable | Default |
| --------+---------+-----------+----------+--------- |
| col | integer | | | |
| Indexes: |
| "idx" btree (col) INVALID |
| </programlisting> |
| |
| The recommended recovery |
| method in such cases is to drop the index and try again to perform |
| <command>CREATE INDEX CONCURRENTLY</command>. (Another possibility is |
| to rebuild the index with <command>REINDEX INDEX CONCURRENTLY</command>). |
| </para> |
| |
| <para> |
| Another caveat when building a unique index concurrently is that the |
| uniqueness constraint is already being enforced against other transactions |
| when the second table scan begins. This means that constraint violations |
| could be reported in other queries prior to the index becoming available |
| for use, or even in cases where the index build eventually fails. Also, |
| if a failure does occur in the second scan, the <quote>invalid</quote> index |
| continues to enforce its uniqueness constraint afterwards. |
| </para> |
| |
| <para> |
| Concurrent builds of expression indexes and partial indexes are supported. |
| Errors occurring in the evaluation of these expressions could cause |
| behavior similar to that described above for unique constraint violations. |
| </para> |
| |
| <para> |
| Regular index builds permit other regular index builds on the |
| same table to occur simultaneously, but only one concurrent index build |
| can occur on a table at a time. In either case, schema modification of the |
| table is not allowed while the index is being built. Another difference is |
| that a regular <command>CREATE INDEX</command> command can be performed |
| within a transaction block, but <command>CREATE INDEX CONCURRENTLY</command> |
| cannot. |
| </para> |
| |
| <para> |
| Concurrent builds for indexes on partitioned tables are currently not |
| supported. However, you may concurrently build the index on each |
| partition individually and then finally create the partitioned index |
| non-concurrently in order to reduce the time where writes to the |
| partitioned table will be locked out. In this case, building the |
| partitioned index is a metadata only operation. |
| </para> |
| |
| </refsect2> |
| </refsect1> |
| |
| <refsect1> |
| <title>Notes</title> |
| |
| <para> |
| See <xref linkend="indexes"/> for information about when indexes can |
| be used, when they are not used, and in which particular situations |
| they can be useful. |
| </para> |
| |
| <para> |
| Currently, only the B-tree, GiST, GIN, and BRIN index methods support |
| multiple-key-column indexes. Whether there can be multiple key |
| columns is independent of whether <literal>INCLUDE</literal> columns |
| can be added to the index. Indexes can have up to 32 columns, |
| including <literal>INCLUDE</literal> columns. |
| (This limit can be altered when building |
| <productname>PostgreSQL</productname>.) Only B-tree currently |
| supports unique indexes. |
| </para> |
| |
| <para> |
| An <firstterm>operator class</firstterm> with optional parameters |
| can be specified for each column of an index. |
| The operator class identifies the operators to be |
| used by the index for that column. For example, a B-tree index on |
| four-byte integers would use the <literal>int4_ops</literal> class; |
| this operator class includes comparison functions for four-byte |
| integers. In practice the default operator class for the column's data |
| type is usually sufficient. The main point of having operator classes |
| is that for some data types, there could be more than one meaningful |
| ordering. For example, we might want to sort a complex-number data |
| type either by absolute value or by real part. We could do this by |
| defining two operator classes for the data type and then selecting |
| the proper class when creating an index. More information about |
| operator classes is in <xref linkend="indexes-opclass"/> and in <xref |
| linkend="xindex"/>. |
| </para> |
| |
| <para> |
| When <literal>CREATE INDEX</literal> is invoked on a partitioned |
| table, the default behavior is to recurse to all partitions to ensure |
| they all have matching indexes. |
| Each partition is first checked to determine whether an equivalent |
| index already exists, and if so, that index will become attached as a |
| partition index to the index being created, which will become its |
| parent index. |
| If no matching index exists, a new index will be created and |
| automatically attached; the name of the new index in each partition |
| will be determined as if no index name had been specified in the |
| command. |
| If the <literal>ONLY</literal> option is specified, no recursion |
| is done, and the index is marked invalid. |
| (<command>ALTER INDEX ... ATTACH PARTITION</command> marks the index |
| valid, once all partitions acquire matching indexes.) Note, however, |
| that any partition that is created in the future using |
| <command>CREATE TABLE ... PARTITION OF</command> will automatically |
| have a matching index, regardless of whether <literal>ONLY</literal> is |
| specified. |
| </para> |
| |
| <para> |
| For index methods that support ordered scans (currently, only B-tree), |
| the optional clauses <literal>ASC</literal>, <literal>DESC</literal>, <literal>NULLS |
| FIRST</literal>, and/or <literal>NULLS LAST</literal> can be specified to modify |
| the sort ordering of the index. Since an ordered index can be |
| scanned either forward or backward, it is not normally useful to create a |
| single-column <literal>DESC</literal> index — that sort ordering is already |
| available with a regular index. The value of these options is that |
| multicolumn indexes can be created that match the sort ordering requested |
| by a mixed-ordering query, such as <literal>SELECT ... ORDER BY x ASC, y |
| DESC</literal>. The <literal>NULLS</literal> options are useful if you need to support |
| <quote>nulls sort low</quote> behavior, rather than the default <quote>nulls |
| sort high</quote>, in queries that depend on indexes to avoid sorting steps. |
| </para> |
| |
| <para> |
| The system regularly collects statistics on all of a table's |
| columns. Newly-created non-expression indexes can immediately |
| use these statistics to determine an index's usefulness. |
| For new expression indexes, it is necessary to run <link |
| linkend="sql-analyze"><command>ANALYZE</command></link> or wait for |
| the <link linkend="autovacuum">autovacuum daemon</link> to analyze |
| the table to generate statistics for these indexes. |
| </para> |
| |
| <para> |
| For most index methods, the speed of creating an index is |
| dependent on the setting of <xref linkend="guc-maintenance-work-mem"/>. |
| Larger values will reduce the time needed for index creation, so long |
| as you don't make it larger than the amount of memory really available, |
| which would drive the machine into swapping. |
| </para> |
| |
| <para> |
| <productname>PostgreSQL</productname> can build indexes while |
| leveraging multiple CPUs in order to process the table rows faster. |
| This feature is known as <firstterm>parallel index |
| build</firstterm>. For index methods that support building indexes |
| in parallel (currently, only B-tree), |
| <varname>maintenance_work_mem</varname> specifies the maximum |
| amount of memory that can be used by each index build operation as |
| a whole, regardless of how many worker processes were started. |
| Generally, a cost model automatically determines how many worker |
| processes should be requested, if any. |
| </para> |
| |
| <para> |
| Parallel index builds may benefit from increasing |
| <varname>maintenance_work_mem</varname> where an equivalent serial |
| index build will see little or no benefit. Note that |
| <varname>maintenance_work_mem</varname> may influence the number of |
| worker processes requested, since parallel workers must have at |
| least a <literal>32MB</literal> share of the total |
| <varname>maintenance_work_mem</varname> budget. There must also be |
| a remaining <literal>32MB</literal> share for the leader process. |
| Increasing <xref linkend="guc-max-parallel-maintenance-workers"/> |
| may allow more workers to be used, which will reduce the time |
| needed for index creation, so long as the index build is not |
| already I/O bound. Of course, there should also be sufficient |
| CPU capacity that would otherwise lie idle. |
| </para> |
| |
| <para> |
| Setting a value for <literal>parallel_workers</literal> via <link |
| linkend="sql-altertable"><command>ALTER TABLE</command></link> directly controls how many parallel |
| worker processes will be requested by a <command>CREATE |
| INDEX</command> against the table. This bypasses the cost model |
| completely, and prevents <varname>maintenance_work_mem</varname> |
| from affecting how many parallel workers are requested. Setting |
| <literal>parallel_workers</literal> to 0 via <command>ALTER |
| TABLE</command> will disable parallel index builds on the table in |
| all cases. |
| </para> |
| |
| <tip> |
| <para> |
| You might want to reset <literal>parallel_workers</literal> after |
| setting it as part of tuning an index build. This avoids |
| inadvertent changes to query plans, since |
| <literal>parallel_workers</literal> affects |
| <emphasis>all</emphasis> parallel table scans. |
| </para> |
| </tip> |
| |
| <para> |
| While <command>CREATE INDEX</command> with the |
| <literal>CONCURRENTLY</literal> option supports parallel builds |
| without special restrictions, only the first table scan is actually |
| performed in parallel. |
| </para> |
| |
| <para> |
| Use <link linkend="sql-dropindex"><command>DROP INDEX</command></link> |
| to remove an index. |
| </para> |
| |
| <para> |
| Like any long-running transaction, <command>CREATE INDEX</command> on a |
| table can affect which tuples can be removed by concurrent |
| <command>VACUUM</command> on any other table. |
| </para> |
| |
| <para> |
| Prior releases of <productname>PostgreSQL</productname> also had an |
| R-tree index method. This method has been removed because |
| it had no significant advantages over the GiST method. |
| If <literal>USING rtree</literal> is specified, <command>CREATE INDEX</command> |
| will interpret it as <literal>USING gist</literal>, to simplify conversion |
| of old databases to GiST. |
| </para> |
| |
| <para> |
| Each backend running <command>CREATE INDEX</command> will report its |
| progress in the <structname>pg_stat_progress_create_index</structname> |
| view. See <xref linkend="create-index-progress-reporting"/> for details. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>Examples</title> |
| |
| <para> |
| To create a unique B-tree index on the column <literal>title</literal> in |
| the table <literal>films</literal>: |
| <programlisting> |
| CREATE UNIQUE INDEX title_idx ON films (title); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create a unique B-tree index on the column <literal>title</literal> |
| with included columns <literal>director</literal> |
| and <literal>rating</literal> in the table <literal>films</literal>: |
| <programlisting> |
| CREATE UNIQUE INDEX title_idx ON films (title) INCLUDE (director, rating); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create a B-Tree index with deduplication disabled: |
| <programlisting> |
| CREATE INDEX title_idx ON films (title) WITH (deduplicate_items = off); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index on the expression <literal>lower(title)</literal>, |
| allowing efficient case-insensitive searches: |
| <programlisting> |
| CREATE INDEX ON films ((lower(title))); |
| </programlisting> |
| (In this example we have chosen to omit the index name, so the system |
| will choose a name, typically <literal>films_lower_idx</literal>.) |
| </para> |
| |
| <caution> |
| <para> |
| Hash index operations are not presently WAL-logged, |
| so hash indexes might need to be rebuilt with <command>REINDEX</> |
| after a database crash if there were unwritten changes. |
| Also, changes to hash indexes are not replicated over warm standby |
| replication after the initial base backup, so they |
| give wrong answers to queries that subsequently use them. |
| For these reasons, hash index use is presently discouraged. |
| </para> |
| </caution> |
| |
| <para> |
| To create an index with non-default collation: |
| <programlisting> |
| CREATE INDEX title_idx_german ON films (title COLLATE "de_DE"); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index with non-default sort ordering of nulls: |
| <programlisting> |
| CREATE INDEX title_idx_nulls_low ON films (title NULLS FIRST); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index with non-default fill factor: |
| <programlisting> |
| CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create a <acronym>GIN</acronym> index with fast updates disabled: |
| <programlisting> |
| CREATE INDEX gin_idx ON documents_table USING GIN (locations) WITH (fastupdate = off); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index on the column <literal>code</literal> in the table |
| <literal>films</literal> and have the index reside in the tablespace |
| <literal>indexspace</literal>: |
| <programlisting> |
| CREATE INDEX code_idx ON films (code) TABLESPACE indexspace; |
| </programlisting> |
| </para> |
| |
| <para> |
| To create a GiST index on a point attribute so that we |
| can efficiently use box operators on the result of the |
| conversion function: |
| <programlisting> |
| CREATE INDEX pointloc |
| ON points USING gist (box(location,location)); |
| SELECT * FROM points |
| WHERE box(location,location) && '(0,0),(1,1)'::box; |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index without locking out writes to the table: |
| <programlisting> |
| CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity); |
| </programlisting></para> |
| |
| </refsect1> |
| |
| <refsect1> |
| <title>Compatibility</title> |
| |
| <para> |
| <command>CREATE INDEX</command> is a |
| <productname>PostgreSQL</productname> language extension. There |
| are no provisions for indexes in the SQL standard. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>See Also</title> |
| |
| <simplelist type="inline"> |
| <member><xref linkend="sql-alterindex"/></member> |
| <member><xref linkend="sql-dropindex"/></member> |
| <member><xref linkend="sql-reindex"/></member> |
| <member><xref linkend="create-index-progress-reporting"/></member> |
| </simplelist> |
| </refsect1> |
| </refentry> |