<!-- doc/src/sgml/dict-int.sgml -->

<sect1 id="dict-int" xreflabel="dict_int">
<title>dict_int</title>

<indexterm zone="dict-int">
<primary>dict_int</primary>
</indexterm>

<para>
<filename>dict_int</filename> is an example of an add-on dictionary template
for full-text search. The motivation for this example dictionary is to
control the indexing of integers (signed and unsigned), allowing such
numbers to be indexed while preventing excessive growth in the number of
unique words, which would significantly degrade search performance.
</para>

<para>
This module is considered <quote>trusted</quote>, that is, it can be
installed by non-superusers who have <literal>CREATE</literal> privilege
on the current database.
</para>

<sect2>
<title>Configuration</title>

<para>
The dictionary accepts three options:
</para>

<itemizedlist>
<listitem>
<para>
The <literal>maxlen</literal> parameter specifies the maximum number of
digits allowed in an integer word. The default value is 6.
</para>
</listitem>
<listitem>
<para>
The <literal>rejectlong</literal> parameter specifies whether an overlength
integer should be truncated or ignored. If <literal>rejectlong</literal> is
<literal>false</literal> (the default), the dictionary returns the first
<literal>maxlen</literal> digits of the integer. If <literal>rejectlong</literal> is
<literal>true</literal>, the dictionary treats an overlength integer as a stop
word, so that it will not be indexed. Note that this also means that
such an integer cannot be searched for.
</para>
</listitem>
<listitem>
<para>
The <literal>absval</literal> parameter specifies whether leading
<quote><literal>+</literal></quote> or <quote><literal>-</literal></quote>
signs should be removed from integer words. The default
is <literal>false</literal>. When <literal>true</literal>, the sign is
removed before <literal>maxlen</literal> is applied.
</para>
</listitem>
</itemizedlist>
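
<para>
As a minimal sketch of how <literal>absval</literal> and
<literal>maxlen</literal> combine, the following temporarily changes the
<literal>intdict</literal> dictionary that the extension installs. It assumes
that the extension is already installed and that <literal>maxlen</literal>
still has its default value of 6. The final statement omits the option value,
which removes the setting so that the default applies again:

<programlisting>
-- Strip a leading sign before the length limit is applied.
ALTER TEXT SEARCH DICTIONARY intdict (ABSVAL = true);

-- The sign is removed first, then the first 6 digits are kept.
SELECT ts_lexize('intdict', '-12345678');   -- {123456}

-- Remove the setting again so that absval reverts to its default (false).
ALTER TEXT SEARCH DICTIONARY intdict (ABSVAL);
</programlisting>
</para>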
</sect2>

<sect2>
<title>Usage</title>

<para>
Installing the <literal>dict_int</literal> extension creates a text search
template <literal>intdict_template</literal> and a dictionary <literal>intdict</literal>
based on it, with the default parameters. You can alter the
parameters, for example

<programlisting>
mydb# ALTER TEXT SEARCH DICTIONARY intdict (MAXLEN = 4, REJECTLONG = true);
ALTER TEXT SEARCH DICTIONARY
</programlisting>

or create new dictionaries based on the template.
</para>
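
<para>
Creating a separate dictionary from the template leaves
<literal>intdict</literal> untouched. The following is a sketch only; the
dictionary name <literal>intdict_strict</literal> is hypothetical and was
chosen for this example:

<programlisting>
-- Hypothetical dictionary: integers longer than 4 characters become stop words.
CREATE TEXT SEARCH DICTIONARY intdict_strict (
    TEMPLATE = intdict_template,
    MAXLEN = 4,
    REJECTLONG = true
);

SELECT ts_lexize('intdict_strict', '12345678');  -- {} (rejected as a stop word)
SELECT ts_lexize('intdict_strict', '1234');      -- {1234}
</programlisting>

An empty array from <function>ts_lexize</function> indicates that the
dictionary recognized the token as a stop word, so the overlength integer
would be neither indexed nor searchable.
</para>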

<para>
To test the dictionary with its default parameters, you can try

<programlisting>
mydb# select ts_lexize('intdict', '12345678');
ts_lexize
-----------
{123456}
</programlisting>

but real-world usage will involve including it in a text search
configuration as described in <xref linkend="textsearch"/>.
That might look like this:

<programlisting>
ALTER TEXT SEARCH CONFIGURATION english
ALTER MAPPING FOR int, uint WITH intdict;
</programlisting>

</para>
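
<para>
If you would rather not modify the built-in <literal>english</literal>
configuration, one possible approach is to apply the mapping to a copy and
inspect the result with <function>ts_debug</function>. The configuration name
<literal>english_int</literal> and the sample text are hypothetical, used
here for illustration only:

<programlisting>
-- Work on a copy so that the built-in english configuration stays unchanged.
CREATE TEXT SEARCH CONFIGURATION english_int (COPY = english);

ALTER TEXT SEARCH CONFIGURATION english_int
    ALTER MAPPING FOR int, uint WITH intdict;

-- Show which dictionary handles each token and the lexemes it produces.
SELECT alias, token, dictionary, lexemes
FROM ts_debug('english_int', 'invoice 12345678');
</programlisting>
</para>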
</sect2>

</sect1>