| <!-- |
| $PostgreSQL: pgsql/doc/src/sgml/ref/create_index.sgml,v 1.58 2006/09/16 00:30:17 momjian Exp $ |
| PostgreSQL documentation |
| --> |
| |
| <refentry id="SQL-CREATEINDEX"> |
| <refmeta> |
| <refentrytitle id="sql-createindex-title">CREATE INDEX</refentrytitle> |
| <refmiscinfo>SQL - Language Statements</refmiscinfo> |
| </refmeta> |
| |
| <refnamediv> |
| <refname>CREATE INDEX</refname> |
| <refpurpose>define a new index</refpurpose> |
| </refnamediv> |
| |
| <indexterm zone="sql-createindex"> |
| <primary>CREATE INDEX</primary> |
| </indexterm> |
| |
| <refsynopsisdiv> |
| <synopsis> |
| |
| CREATE [UNIQUE] INDEX name ON table |
| [USING btree|bitmap|gist] |
| ( {column | (expression)} [opclass] [, ...] ) |
| [ WITH ( FILLFACTOR = value ) ] |
| [TABLESPACE tablespace] |
| [WHERE predicate] |
| |
| |
| </synopsis> |
| </refsynopsisdiv> |
| |
| <refsect1> |
| <title>Description</title> |
| |
| <para> |
| <command>CREATE INDEX</command> constructs an index <replaceable |
| class="parameter">index_name</replaceable> on the specified table. |
| Indexes are primarily used to enhance database performance (though |
| inappropriate use can result in slower performance). |
| </para> |
| |
| <para> |
| The key field(s) for the index are specified as column names, |
| or alternatively as expressions written in parentheses. |
| Multiple fields can be specified if the index method supports |
| multicolumn indexes. |
| </para> |
| |
| <para> |
| An index field can be an expression computed from the values of |
| one or more columns of the table row. This feature can be used |
| to obtain fast access to data based on some transformation of |
| the basic data. For example, an index computed on |
| <literal>upper(col)</> would allow the clause |
| <literal>WHERE upper(col) = 'JIM'</> to use an index. |
| </para> |
| |
| <para> |
| <productname>PostgreSQL</productname> provides the index methods |
| B-tree, hash, GiST, and GIN. Users can also define their own index |
| methods, but that is fairly complicated. |
| </para> |
| |
| <para> |
| When the <literal>WHERE</literal> clause is present, a |
| <firstterm>partial index</firstterm> is created. |
| A partial index is an index that contains entries for only a portion of |
| a table, usually a portion that is more useful for indexing than the |
| rest of the table. For example, if you have a table that contains both |
| billed and unbilled orders where the unbilled orders take up a small |
| fraction of the total table and yet that is an often used section, you |
| can improve performance by creating an index on just that portion. |
| Another possible application is to use <literal>WHERE</literal> with |
| <literal>UNIQUE</literal> to enforce uniqueness over a subset of a |
| table. See <xref linkend="indexes-partial"> for more discussion. |
| </para> |
| |
| <para> |
| The expression used in the <literal>WHERE</literal> clause may refer |
| only to columns of the underlying table, but it can use all columns, |
| not just the ones being indexed. Presently, subqueries and |
| aggregate expressions are also forbidden in <literal>WHERE</literal>. |
| The same restrictions apply to index fields that are expressions. |
| </para> |
| |
| <para> |
| All functions and operators used in an index definition must be |
| <quote>immutable</>, that is, their results must depend only on |
| their arguments and never on any outside influence (such as |
| the contents of another table or the current time). This restriction |
| ensures that the behavior of the index is well-defined. To use a |
| user-defined function in an index expression or <literal>WHERE</literal> |
| clause, remember to mark the function immutable when you create it. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>Parameters</title> |
| |
| <variablelist> |
| <varlistentry> |
| <term><literal>UNIQUE</literal></term> |
| <listitem> |
| <para> |
| Causes the system to check for |
| duplicate values in the table when the index is created (if data |
| already exist) and each time data is added. Attempts to |
| insert or update data which would result in duplicate entries |
| will generate an error. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><literal>CONCURRENTLY</literal></term> |
| <listitem> |
| <para> |
| When this option is used, <productname>PostgreSQL</> will build the |
| index without taking any locks that prevent concurrent inserts, |
| updates, or deletes on the table; whereas a standard index build |
| locks out writes (but not reads) on the table until it's done. |
| There are several caveats to be aware of when using this option |
| — see <xref linkend="SQL-CREATEINDEX-CONCURRENTLY" |
| endterm="SQL-CREATEINDEX-CONCURRENTLY-title">. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">name</replaceable></term> |
| <listitem> |
| <para> |
| The name of the index to be created. No schema name can be included |
| here; the index is always created in the same schema as its parent |
| table. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">table</replaceable></term> |
| <listitem> |
| <para> |
| The name (possibly schema-qualified) of the table to be indexed. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">method</replaceable></term> |
| <listitem> |
| <para> |
| The name of the index method to be used. Choices are |
| <literal>btree</literal>, <literal>hash</literal>, |
| <literal>gist</literal>, and <literal>gin</>. The |
| default method is <literal>btree</literal>. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">column</replaceable></term> |
| <listitem> |
| <para> |
| The name of a column of the table. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">expression</replaceable></term> |
| <listitem> |
| <para> |
| An expression based on one or more columns of the table. The |
| expression usually must be written with surrounding parentheses, |
| as shown in the syntax. However, the parentheses may be omitted |
| if the expression has the form of a function call. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">opclass</replaceable></term> |
| <listitem> |
| <para> |
| The name of an operator class. See below for details. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">storage_parameter</replaceable></term> |
| <listitem> |
| <para> |
| The name of an index-method-specific storage parameter. See |
| below for details. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">tablespace</replaceable></term> |
| <listitem> |
| <para> |
| The tablespace in which to create the index. If not specified, |
| <xref linkend="guc-default-tablespace"> is used, or the database's |
| default tablespace if <varname>default_tablespace</> is an empty |
| string. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term><replaceable class="parameter">predicate</replaceable></term> |
| <listitem> |
| <para> |
| The constraint expression for a partial index. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| </variablelist> |
| |
| <refsect2 id="SQL-CREATEINDEX-storage-parameters"> |
| <title id="SQL-CREATEINDEX-storage-parameters-title">Index Storage Parameters</title> |
| |
| <para> |
| The <literal>WITH</> clause can specify <firstterm>storage parameters</> |
| for indexes. Each index method can have its own set of allowed storage |
| parameters. The built-in index methods all accept a single parameter: |
| </para> |
| |
| <variablelist> |
| |
| <varlistentry> |
| <term><literal>FILLFACTOR</></term> |
| <listitem> |
| <para> |
| The fillfactor for an index is a percentage that determines how full |
| the index method will try to pack index pages. For B-trees, leaf pages |
| are filled to this percentage during initial index build, and also |
| when extending the index at the right (largest key values). If pages |
| subsequently become completely full, they will be split, leading to |
| gradual degradation in the index's efficiency. B-trees use a default |
| fillfactor of 90, but any value from 10 to 100 can be selected. |
| If the table is static then fillfactor 100 is best to minimize the |
| index's physical size, but for heavily updated tables a smaller |
| fillfactor is better to minimize the need for page splits. The |
| other index methods use fillfactor in different but roughly analogous |
| ways; the default fillfactor varies between methods. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| </variablelist> |
| |
| </refsect2> |
| |
| <refsect2 id="SQL-CREATEINDEX-CONCURRENTLY"> |
| <title id="SQL-CREATEINDEX-CONCURRENTLY-title">Building Indexes Concurrently</title> |
| |
| <indexterm zone="SQL-CREATEINDEX-CONCURRENTLY"> |
| <primary>index</primary> |
| <secondary>building concurrently</secondary> |
| </indexterm> |
| |
| <para> |
| Creating an index can interfere with regular operation of a database. |
| Normally <productname>PostgreSQL</> locks the table to be indexed against |
| writes and performs the entire index build with a single scan of the |
| table. Other transactions can still read the table, but if they try to |
| insert, update, or delete rows in the table they will block until the |
| index build is finished. This could have a severe effect if the system is |
| a live production database. Large tables can take many hours to be |
| indexed, and even for smaller tables, an index build can lock out writers |
| for periods that are unacceptably long for a production system. |
| </para> |
| |
| <para> |
| <productname>PostgreSQL</> supports building indexes without locking |
| out writes. This method is invoked by specifying the |
| <literal>CONCURRENTLY</> option of <command>CREATE INDEX</>. |
| When this option is used, |
| <productname>PostgreSQL</> must perform two scans of the table, and in |
| addition it must wait for all existing transactions to terminate. Thus |
| this method requires more total work than a standard index build and takes |
| significantly longer to complete. However, since it allows normal |
| operations to continue while the index is built, this method is useful for |
| adding new indexes in a production environment. Of course, the extra CPU |
| and I/O load imposed by the index creation may slow other operations. |
| </para> |
| |
| <para> |
| If a problem arises during the second scan of the table, such as a |
| uniqueness violation in a unique index, the <command>CREATE INDEX</> |
| command will fail but leave behind an <quote>invalid</> index. This index |
| will be ignored for querying purposes because it may be incomplete; |
| however it will still consume update overhead. The recommended recovery |
| method in such cases is to drop the index and try again to perform |
| <command>CREATE INDEX CONCURRENTLY</>. (Another possibility is to rebuild |
| the index with <command>REINDEX</>. However, since <command>REINDEX</> |
| does not support concurrent builds, this option is unlikely to seem |
| attractive.) |
| </para> |
| |
| <para> |
| Another caveat when building a unique index concurrently is that the |
| uniqueness constraint is already being enforced against other transactions |
| when the second table scan begins. This means that constraint violations |
| could be reported in other queries prior to the index becoming available |
| for use, or even in cases where the index build eventually fails. Also, |
| if a failure does occur in the second scan, the <quote>invalid</> index |
| continues to enforce its uniqueness constraint afterwards. |
| </para> |
| |
| <para> |
| Concurrent builds of expression indexes and partial indexes are supported. |
| Errors occurring in the evaluation of these expressions could cause |
| behavior similar to that described above for unique constraint violations. |
| </para> |
| |
| <para> |
| Regular index builds permit other regular index builds on the |
| same table to occur in parallel, but only one concurrent index build |
| can occur on a table at a time. In both cases, no other types of schema |
| modification on the table are allowed meanwhile. Another difference |
| is that a regular <command>CREATE INDEX</> command can be performed within |
| a transaction block, but <command>CREATE INDEX CONCURRENTLY</> cannot. |
| </para> |
| </refsect2> |
| </refsect1> |
| |
| <refsect1> |
| <title>Notes</title> |
| |
| <para> |
| See <xref linkend="indexes"> for information about when indexes can |
| be used, when they are not used, and in which particular situations |
| they can be useful. |
| </para> |
| |
| <para> |
| Currently, only the B-tree and GiST index methods support |
| multicolumn indexes. Up to 32 fields may be specified by default. |
| (This limit can be altered when building |
| <productname>PostgreSQL</productname>.) Only B-tree currently |
| supports unique indexes. |
| </para> |
| |
| <para> |
| An <firstterm>operator class</firstterm> can be specified for each |
| column of an index. The operator class identifies the operators to be |
| used by the index for that column. For example, a B-tree index on |
| four-byte integers would use the <literal>int4_ops</literal> class; |
| this operator class includes comparison functions for four-byte |
| integers. In practice the default operator class for the column's data |
| type is usually sufficient. The main point of having operator classes |
| is that for some data types, there could be more than one meaningful |
| ordering. For example, we might want to sort a complex-number data |
| type either by absolute value or by real part. We could do this by |
| defining two operator classes for the data type and then selecting |
| the proper class when making an index. More information about |
| operator classes is in <xref linkend="indexes-opclass"> and in <xref |
| linkend="xindex">. |
| </para> |
| |
| <para> |
| Use <xref linkend="sql-dropindex" endterm="sql-dropindex-title"> |
| to remove an index. |
| </para> |
| |
| <para> |
| Indexes are not used for <literal>IS NULL</> clauses by default. |
| The best way to use indexes in such cases is to create a partial index |
| using an <literal>IS NULL</> predicate. |
| </para> |
| |
| <para> |
| Prior releases of <productname>PostgreSQL</productname> also had an |
| R-tree index method. This method has been removed because |
| it had no significant advantages over the GiST method. |
| If <literal>USING rtree</> is specified, <command>CREATE INDEX</> |
| will interpret it as <literal>USING gist</>, to simplify conversion |
| of old databases to GiST. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>Examples</title> |
| |
| <para> |
| To create a B-tree index on the column <literal>title</literal> in |
| the table <literal>films</literal>: |
| <programlisting> |
| CREATE UNIQUE INDEX title_idx ON films (title); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index on the expression <literal>lower(title)</>, |
| allowing efficient case-insensitive searches: |
| <programlisting> |
| CREATE INDEX lower_title_idx ON films ((lower(title))); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index with non-default fill factor: |
| <programlisting> |
| CREATE UNIQUE INDEX title_idx ON films (title) WITH (fillfactor = 70); |
| </programlisting> |
| </para> |
| |
| <para> |
| To create an index on the column <literal>code</> in the table |
| <literal>films</> and have the index reside in the tablespace |
| <literal>indexspace</>: |
| <programlisting> |
| CREATE INDEX code_idx ON films(code) TABLESPACE indexspace; |
| </programlisting> |
| </para> |
| |
| <!-- |
| <comment> |
| Is this example correct? |
| </comment> |
| <para> |
| To create a GiST index on a point attribute so that we |
| can efficiently use box operators on the result of the |
| conversion function: |
| <programlisting> |
| CREATE INDEX pointloc |
| ON points USING GIST (point2box(location) box_ops); |
| SELECT * FROM points |
| WHERE point2box(points.pointloc) = boxes.box; |
| </programlisting> |
| </para> |
| --> |
| |
| <para> |
| To create an index without locking out writes to the table: |
| <programlisting> |
| CREATE INDEX CONCURRENTLY sales_quantity_index ON sales_table (quantity); |
| </programlisting> |
| </para> |
| |
| </refsect1> |
| |
| <refsect1> |
| <title>Compatibility</title> |
| |
| <para> |
| <command>CREATE INDEX</command> is a |
| <productname>PostgreSQL</productname> language extension. There |
| are no provisions for indexes in the SQL standard. |
| </para> |
| </refsect1> |
| |
| <refsect1> |
| <title>See Also</title> |
| |
| <simplelist type="inline"> |
| <member><xref linkend="sql-alterindex" endterm="sql-alterindex-title"></member> |
| <member><xref linkend="sql-dropindex" endterm="sql-dropindex-title"></member> |
| </simplelist> |
| </refsect1> |
| </refentry> |