| <!-- doc/src/sgml/tableam.sgml --> |
| |
| <chapter id="tableam"> |
| <title>Table Access Method Interface Definition</title> |
| |
| <indexterm> |
| <primary>Table Access Method</primary> |
| </indexterm> |
| <indexterm> |
| <primary>tableam</primary> |
| <secondary>Table Access Method</secondary> |
| </indexterm> |
| |
| <para> |
| This chapter explains the interface between the core |
| <productname>PostgreSQL</productname> system and <firstterm>table access |
| methods</firstterm>, which manage the storage for tables. The core system |
| knows little about these access methods beyond what is specified here, so |
| it is possible to develop entirely new access method types by writing |
| add-on code. |
| </para> |
| |
| <para> |
| Each table access method is described by a row in the <link |
| linkend="catalog-pg-am"><structname>pg_am</structname></link> system |
| catalog. The <structname>pg_am</structname> entry specifies a name and a |
| <firstterm>handler function</firstterm> for the table access method. These |
| entries can be created and deleted using the <xref |
| linkend="sql-create-access-method"/> and <xref |
| linkend="sql-drop-access-method"/> SQL commands. |
| </para> |
| |
| <para> |
| A table access method handler function must be declared to accept a single |
| argument of type <type>internal</type> and to return the pseudo-type |
| <type>table_am_handler</type>. The argument is a dummy value that simply |
| serves to prevent handler functions from being called directly from SQL |
| commands. |
| </para> |
| |
| <para> |
| The result of the function must be a pointer to a struct of type |
| <structname>TableAmRoutine</structname>, which contains everything that the |
| core code needs to know to make use of the table access method. The return |
| value needs to be of server lifetime, which is typically achieved by |
| defining it as a <literal>static const</literal> variable in global |
| scope. The <structname>TableAmRoutine</structname> struct, also called the |
| access method's <firstterm>API struct</firstterm>, defines the behavior of |
| the access method using callbacks. These callbacks are pointers to plain C |
| functions and are not visible or callable at the SQL level. All the |
| callbacks and their behavior are defined in the |
| <structname>TableAmRoutine</structname> structure (with comments inside the |
| struct defining the requirements for callbacks). Most callbacks have |
| wrapper functions, which are documented from the point of view of a user |
| (rather than an implementor) of the table access method. For details, |
| please refer to the <ulink url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/access/tableam.h;hb=HEAD"> |
| <filename>src/include/access/tableam.h</filename></ulink> file. |
| </para> |
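
<para>
 The handler pattern described above can be sketched in standalone C. This
 is a minimal illustration under stated assumptions, not the actual
 interface: <structname>DemoAmRoutine</structname>, its two callbacks, and
 <function>demo_am_handler</function> are hypothetical stand-ins for
 <structname>TableAmRoutine</structname> and a real handler function, chosen
 so that the example compiles without the server headers. A real handler in
 an extension would normally be declared with
 <literal>PG_FUNCTION_INFO_V1</literal> and
 <literal>PG_FUNCTION_ARGS</literal>, and would return its struct with
 <literal>PG_RETURN_POINTER</literal>.
</para>

<programlisting><![CDATA[
```c
#include <stdbool.h>

/*
 * Hypothetical stand-in for TableAmRoutine: a struct of callback pointers.
 * The real struct is declared in src/include/access/tableam.h and has many
 * more members; the member names below are illustrative only.
 */
typedef struct DemoAmRoutine
{
    const char *(*am_name) (void);              /* hypothetical callback */
    bool        (*tuple_is_visible) (int id);   /* hypothetical callback */
} DemoAmRoutine;

/* Plain C functions; nothing here is visible at the SQL level. */
static const char *
demo_am_name(void)
{
    return "demo";
}

static bool
demo_tuple_is_visible(int id)
{
    /* Toy rule for this sketch: only non-negative ids are visible. */
    return id >= 0;
}

/*
 * Defined as static const at file scope, so the object exists for the
 * whole lifetime of the process and its address never changes.
 */
static const DemoAmRoutine demo_methods = {
    .am_name = demo_am_name,
    .tuple_is_visible = demo_tuple_is_visible,
};

/*
 * Stand-in for the handler function: it only hands out a pointer to the
 * API struct. (A real handler takes a dummy "internal" argument and
 * returns a Datum; that part is omitted in this sketch.)
 */
const DemoAmRoutine *
demo_am_handler(void)
{
    return &demo_methods;
}
```
]]></programlisting>

<para>
 Because the struct is <literal>static const</literal>, every call to the
 handler returns the same address, and a caller can keep the pointer and
 invoke the callbacks through it at any later time.
</para>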
| |
| <para> |
| To implement an access method, an implementor will typically need to |
| implement an AM-specific type of tuple table slot (see |
| <ulink url="https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/include/executor/tuptable.h;hb=HEAD"> |
| <filename>src/include/executor/tuptable.h</filename></ulink>), which allows |
| code outside the access method to hold references to tuples of the AM, and |
| to access the columns of the tuple. |
| </para> |
| |
| <para> |
| Currently, the way an AM actually stores data is fairly unconstrained. For |
| example, it is possible, but not required, to use |
| <productname>PostgreSQL</productname>'s shared buffer cache. If the buffer |
| cache is used, it likely makes sense to use |
| <productname>PostgreSQL</productname>'s standard page layout as described in |
| <xref linkend="storage-page-layout"/>. |
| </para> |
| |
| <para> |
| One fairly large constraint of the table access method API is that, |
| currently, if the AM wants to support modifications and/or indexes, it is |
| necessary for each tuple to have a tuple identifier (<acronym>TID</acronym>) |
| consisting of a block number and an item number (see also <xref |
| linkend="storage-page-layout"/>). It is not strictly necessary that the |
| sub-parts of <acronym>TIDs</acronym> have the same meaning as they do for |
| <literal>heap</literal>, but if bitmap scan support is desired (it is |
| optional), the block number needs to provide locality. |
| </para> |
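
<para>
 As an illustration of the <acronym>TID</acronym> constraint, the following
 sketch shows one way an AM that does not store tuples on heap-style pages
 might still map its rows onto block and item numbers while keeping
 neighboring rows in the same block. Everything in it is an assumption made
 for the example: <structname>DemoTid</structname>,
 <literal>DEMO_ROWS_PER_BLOCK</literal>, and both functions are hypothetical
 and are not part of <productname>PostgreSQL</productname>. The sketch
 starts item numbers at 1, leaving 0 free to mean <quote>invalid</quote>.
</para>

<programlisting><![CDATA[
```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical AM-specific constant: logical rows grouped per block. */
#define DEMO_ROWS_PER_BLOCK 256

/* Hypothetical TID: a block number plus an item number. */
typedef struct DemoTid
{
    uint32_t    block;      /* block number part */
    uint16_t    offset;     /* item number part, 1-based */
} DemoTid;

/*
 * Map a logical row number to a TID. Consecutive rows share a block
 * number, which is the kind of locality a bitmap scan relies on.
 */
static inline DemoTid
demo_tid_from_row(uint64_t row)
{
    DemoTid     tid;

    /* The block part must fit into 32 bits. */
    assert(row / DEMO_ROWS_PER_BLOCK <= UINT32_MAX);

    tid.block = (uint32_t) (row / DEMO_ROWS_PER_BLOCK);
    tid.offset = (uint16_t) (row % DEMO_ROWS_PER_BLOCK + 1);
    return tid;
}

/* Inverse mapping: recover the logical row number from a TID. */
static inline uint64_t
demo_row_from_tid(DemoTid tid)
{
    return (uint64_t) tid.block * DEMO_ROWS_PER_BLOCK + (tid.offset - 1);
}
```
]]></programlisting>

<para>
 With this mapping, rows 0 through 255 all fall into block 0, so fetching
 every tuple of one block touches one contiguous range of the AM's storage.
</para>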
| |
| <para> |
| For crash safety, an AM can use <productname>PostgreSQL</productname>'s <link |
| linkend="wal"><acronym>WAL</acronym></link>, or a custom implementation. |
| If <acronym>WAL</acronym> is chosen, either <link |
| linkend="generic-wal">Generic WAL Records</link> can be used, |
| or a new type of <acronym>WAL</acronym> record can be implemented. |
| Generic WAL Records are easy to use, but imply a higher WAL volume. |
| Implementation of a new type of WAL record |
| currently requires modifications to core code (specifically, |
| <filename>src/include/access/rmgrlist.h</filename>). |
| </para> |
| |
| <para> |
| To implement transactional support in a manner that allows different table |
| access methods to be accessed within a single transaction, it is likely |
| necessary to closely integrate with the machinery in |
| <filename>src/backend/access/transam/xlog.c</filename>. |
| </para> |
| |
| <para> |
| Any developer of a new table access method can refer to the existing |
| <literal>heap</literal> implementation in |
| <filename>src/backend/access/heap/heapam_handler.c</filename> as a working |
| example. |
| </para> |
| |
| </chapter> |