| <!-- doc/src/sgml/logical-replication.sgml --> |
| |
| <chapter id="logical-replication"> |
| <title>Logical Replication</title> |
| |
| <para> |
| Logical replication is a method of replicating data objects and their |
| changes, based upon their replication identity (usually a primary key). We |
| use the term logical in contrast to physical replication, which uses exact |
| block addresses and byte-by-byte replication. PostgreSQL supports both |
| mechanisms concurrently, see <xref linkend="high-availability"/>. Logical |
| replication allows fine-grained control over both data replication and |
| security. |
| </para> |
| |
| <para> |
| Logical replication uses a <firstterm>publish</firstterm> |
| and <firstterm>subscribe</firstterm> model with one or |
| more <firstterm>subscribers</firstterm> subscribing to one or more |
| <firstterm>publications</firstterm> on a <firstterm>publisher</firstterm> |
| node. Subscribers pull data from the publications they subscribe to and may |
| subsequently re-publish data to allow cascading replication or more complex |
| configurations. |
| </para> |
| |
| <para> |
| Logical replication of a table typically starts with taking a snapshot |
| of the data on the publisher database and copying that to the subscriber. |
| Once that is done, the changes on the publisher are sent to the subscriber |
| as they occur in real-time. The subscriber applies the data in the same |
| order as the publisher so that transactional consistency is guaranteed for |
| publications within a single subscription. This method of data replication |
| is sometimes referred to as transactional replication. |
| </para> |
| |
| <para> |
| The typical use-cases for logical replication are: |
| |
| <itemizedlist> |
| <listitem> |
| <para> |
| Sending incremental changes in a single database or a subset of a |
| database to subscribers as they occur. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Firing triggers for individual changes as they arrive on the |
| subscriber. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Consolidating multiple databases into a single one (for example for |
| analytical purposes). |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Replicating between different major versions of PostgreSQL. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Replicating between PostgreSQL instances on different platforms (for |
| example Linux to Windows) |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Giving access to replicated data to different groups of users. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Sharing a subset of the database between multiple databases. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| |
| <para> |
| The subscriber database behaves in the same way as any other PostgreSQL |
| instance and can be used as a publisher for other databases by defining its |
| own publications. When the subscriber is treated as read-only by |
| application, there will be no conflicts from a single subscription. On the |
| other hand, if there are other writes done either by an application or by other |
| subscribers to the same set of tables, conflicts can arise. |
| </para> |
| |
| <sect1 id="logical-replication-publication"> |
| <title>Publication</title> |
| |
| <para> |
| A <firstterm>publication</firstterm> can be defined on any physical |
| replication primary. The node where a publication is defined is referred to |
| as <firstterm>publisher</firstterm>. A publication is a set of changes |
| generated from a table or a group of tables, and might also be described as |
| a change set or replication set. Each publication exists in only one database. |
| </para> |
| |
| <para> |
| Publications are different from schemas and do not affect how the table is |
| accessed. Each table can be added to multiple publications if needed. |
| Publications may currently only contain tables. Objects must be added |
| explicitly, except when a publication is created for <literal>ALL |
| TABLES</literal>. |
| </para> |
| |
| <para> |
| Publications can choose to limit the changes they produce to |
| any combination of <command>INSERT</command>, <command>UPDATE</command>, |
| <command>DELETE</command>, and <command>TRUNCATE</command>, similar to how triggers are fired by |
| particular event types. By default, all operation types are replicated. |
| </para> |
| |
| <para> |
| A published table must have a <quote>replica identity</quote> configured in |
| order to be able to replicate <command>UPDATE</command> |
| and <command>DELETE</command> operations, so that appropriate rows to |
| update or delete can be identified on the subscriber side. By default, |
| this is the primary key, if there is one. Another unique index (with |
| certain additional requirements) can also be set to be the replica |
| identity. If the table does not have any suitable key, then it can be set |
| to replica identity <quote>full</quote>, which means the entire row becomes |
| the key. This, however, is very inefficient and should only be used as a |
| fallback if no other solution is possible. If a replica identity other |
| than <quote>full</quote> is set on the publisher side, a replica identity |
| comprising the same or fewer columns must also be set on the subscriber |
| side. See <xref linkend="sql-altertable-replica-identity"/> for details on |
| how to set the replica identity. If a table without a replica identity is |
| added to a publication that replicates <command>UPDATE</command> |
| or <command>DELETE</command> operations then |
| subsequent <command>UPDATE</command> or <command>DELETE</command> |
| operations will cause an error on the publisher. <command>INSERT</command> |
| operations can proceed regardless of any replica identity. |
| </para> |
| |
| <para> |
| Every publication can have multiple subscribers. |
| </para> |
| |
| <para> |
| A publication is created using the <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link> |
| command and may later be altered or dropped using corresponding commands. |
| </para> |
| |
| <para> |
| The individual tables can be added and removed dynamically using |
| <link linkend="sql-alterpublication"><command>ALTER PUBLICATION</command></link>. Both the <literal>ADD |
| TABLE</literal> and <literal>DROP TABLE</literal> operations are |
| transactional; so the table will start or stop replicating at the correct |
| snapshot once the transaction has committed. |
| </para> |
| </sect1> |
| |
| <sect1 id="logical-replication-subscription"> |
| <title>Subscription</title> |
| |
| <para> |
| A <firstterm>subscription</firstterm> is the downstream side of logical |
| replication. The node where a subscription is defined is referred to as |
| the <firstterm>subscriber</firstterm>. A subscription defines the connection |
| to another database and set of publications (one or more) to which it wants |
| to subscribe. |
| </para> |
| |
| <para> |
| The subscriber database behaves in the same way as any other PostgreSQL |
| instance and can be used as a publisher for other databases by defining its |
| own publications. |
| </para> |
| |
| <para> |
| A subscriber node may have multiple subscriptions if desired. It is |
| possible to define multiple subscriptions between a single |
| publisher-subscriber pair, in which case care must be taken to ensure |
| that the subscribed publication objects don't overlap. |
| </para> |
| |
| <para> |
| Each subscription will receive changes via one replication slot (see |
| <xref linkend="streaming-replication-slots"/>). Additional replication |
| slots may be required for the initial data synchronization of |
| pre-existing table data and those will be dropped at the end of data |
| synchronization. |
| </para> |
| |
| <para> |
| A logical replication subscription can be a standby for synchronous |
| replication (see <xref linkend="synchronous-replication"/>). The standby |
| name is by default the subscription name. An alternative name can be |
| specified as <literal>application_name</literal> in the connection |
| information of the subscription. |
| </para> |
| |
| <para> |
| Subscriptions are dumped by <command>pg_dump</command> if the current user |
| is a superuser. Otherwise a warning is written and subscriptions are |
| skipped, because non-superusers cannot read all subscription information |
| from the <structname>pg_subscription</structname> catalog. |
| </para> |
| |
| <para> |
| The subscription is added using <link linkend="sql-createsubscription"><command>CREATE SUBSCRIPTION</command></link> and |
| can be stopped/resumed at any time using the |
| <link linkend="sql-altersubscription"><command>ALTER SUBSCRIPTION</command></link> command and removed using |
| <link linkend="sql-dropsubscription"><command>DROP SUBSCRIPTION</command></link>. |
| </para> |
| |
| <para> |
| When a subscription is dropped and recreated, the synchronization |
| information is lost. This means that the data has to be resynchronized |
| afterwards. |
| </para> |
| |
| <para> |
| The schema definitions are not replicated, and the published tables must |
| exist on the subscriber. Only regular tables may be |
| the target of replication. For example, you can't replicate to a view. |
| </para> |
| |
| <para> |
| The tables are matched between the publisher and the subscriber using the |
| fully qualified table name. Replication to differently-named tables on the |
| subscriber is not supported. |
| </para> |
| |
| <para> |
| Columns of a table are also matched by name. The order of columns in the |
| subscriber table does not need to match that of the publisher. The data |
| types of the columns do not need to match, as long as the text |
| representation of the data can be converted to the target type. For |
| example, you can replicate from a column of type <type>integer</type> to a |
| column of type <type>bigint</type>. The target table can also have |
| additional columns not provided by the published table. Any such columns |
| will be filled with the default value as specified in the definition of the |
| target table. |
| </para> |
| |
| <sect2 id="logical-replication-subscription-slot"> |
| <title>Replication Slot Management</title> |
| |
| <para> |
| As mentioned earlier, each (active) subscription receives changes from a |
| replication slot on the remote (publishing) side. |
| </para> |
| <para> |
| Additional table synchronization slots are normally transient, created |
| internally to perform initial table synchronization and dropped |
| automatically when they are no longer needed. These table synchronization |
| slots have generated names: <quote><literal>pg_%u_sync_%u_%llu</literal></quote> |
| (parameters: Subscription <parameter>oid</parameter>, |
| Table <parameter>relid</parameter>, system identifier <parameter>sysid</parameter>) |
| </para> |
| <para> |
| Normally, the remote replication slot is created automatically when the |
| subscription is created using <command>CREATE SUBSCRIPTION</command> and it |
| is dropped automatically when the subscription is dropped using |
| <command>DROP SUBSCRIPTION</command>. In some situations, however, it can |
| be useful or necessary to manipulate the subscription and the underlying |
| replication slot separately. Here are some scenarios: |
| |
| <itemizedlist> |
| <listitem> |
| <para> |
| When creating a subscription, the replication slot already exists. In |
| that case, the subscription can be created using |
| the <literal>create_slot = false</literal> option to associate with the |
| existing slot. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| When creating a subscription, the remote host is not reachable or in an |
| unclear state. In that case, the subscription can be created using |
| the <literal>connect = false</literal> option. The remote host will then not |
| be contacted at all. This is what <application>pg_dump</application> |
| uses. The remote replication slot will then have to be created |
| manually before the subscription can be activated. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| When dropping a subscription, the replication slot should be kept. |
| This could be useful when the subscriber database is being moved to a |
| different host and will be activated from there. In that case, |
| disassociate the slot from the subscription using <command>ALTER |
| SUBSCRIPTION</command> before attempting to drop the subscription. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| When dropping a subscription, the remote host is not reachable. In |
| that case, disassociate the slot from the subscription |
| using <command>ALTER SUBSCRIPTION</command> before attempting to drop |
| the subscription. If the remote database instance no longer exists, no |
| further action is then necessary. If, however, the remote database |
| instance is just unreachable, the replication slot (and any still |
| remaining table synchronization slots) should then be |
| dropped manually; otherwise it/they would continue to reserve WAL and might |
| eventually cause the disk to fill up. Such cases should be carefully |
| investigated. |
| </para> |
| </listitem> |
| </itemizedlist> |
| </para> |
| </sect2> |
| </sect1> |
| |
| <sect1 id="logical-replication-conflicts"> |
| <title>Conflicts</title> |
| |
| <para> |
| Logical replication behaves similarly to normal DML operations in that |
| the data will be updated even if it was changed locally on the subscriber |
| node. If incoming data violates any constraints the replication will |
| stop. This is referred to as a <firstterm>conflict</firstterm>. When |
| replicating <command>UPDATE</command> or <command>DELETE</command> |
| operations, missing data will not produce a conflict and such operations |
| will simply be skipped. |
| </para> |
| |
| <para> |
| A conflict will produce an error and will stop the replication; it must be |
| resolved manually by the user. Details about the conflict can be found in |
| the subscriber's server log. |
| </para> |
| |
| <para> |
| The resolution can be done either by changing data on the subscriber so |
| that it does not conflict with the incoming change or by skipping the |
| transaction that conflicts with the existing data. The transaction can be |
| skipped by calling the <link linkend="pg-replication-origin-advance"> |
| <function>pg_replication_origin_advance()</function></link> function with |
| a <parameter>node_name</parameter> corresponding to the subscription name, |
| and a position. The current position of origins can be seen in the |
| <link linkend="view-pg-replication-origin-status"> |
| <structname>pg_replication_origin_status</structname></link> system view. |
| </para> |
| </sect1> |
| |
| <sect1 id="logical-replication-restrictions"> |
| <title>Restrictions</title> |
| |
| <para> |
| Logical replication currently has the following restrictions or missing |
| functionality. These might be addressed in future releases. |
| </para> |
| |
| <itemizedlist> |
| <listitem> |
| <para> |
| The database schema and DDL commands are not replicated. The initial |
| schema can be copied by hand using <literal>pg_dump |
| --schema-only</literal>. Subsequent schema changes would need to be kept |
| in sync manually. (Note, however, that there is no need for the schemas |
| to be absolutely the same on both sides.) Logical replication is robust |
| when schema definitions change in a live database: When the schema is |
| changed on the publisher and replicated data starts arriving at the |
| subscriber but does not fit into the table schema, replication will error |
| until the schema is updated. In many cases, intermittent errors can be |
| avoided by applying additive schema changes to the subscriber first. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Sequence data is not replicated. The data in serial or identity columns |
| backed by sequences will of course be replicated as part of the table, |
| but the sequence itself would still show the start value on the |
| subscriber. If the subscriber is used as a read-only database, then this |
| should typically not be a problem. If, however, some kind of switchover |
| or failover to the subscriber database is intended, then the sequences |
| would need to be updated to the latest values, either by copying the |
| current data from the publisher (perhaps |
| using <command>pg_dump</command>) or by determining a sufficiently high |
| value from the tables themselves. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Replication of <command>TRUNCATE</command> commands is supported, but |
| some care must be taken when truncating groups of tables connected by |
| foreign keys. When replicating a truncate action, the subscriber will |
| truncate the same group of tables that was truncated on the publisher, |
| either explicitly specified or implicitly collected via |
| <literal>CASCADE</literal>, minus tables that are not part of the |
| subscription. This will work correctly if all affected tables are part |
| of the same subscription. But if some tables to be truncated on the |
| subscriber have foreign-key links to tables that are not part of the same |
| (or any) subscription, then the application of the truncate action on the |
| subscriber will fail. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Large objects (see <xref linkend="largeobjects"/>) are not replicated. |
| There is no workaround for that, other than storing data in normal |
| tables. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| Replication is only supported by tables, including partitioned tables. |
| Attempts to replicate other types of relations, such as views, materialized |
| views, or foreign tables, will result in an error. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> |
| When replicating between partitioned tables, the actual replication |
| originates, by default, from the leaf partitions on the publisher, so |
| partitions on the publisher must also exist on the subscriber as valid |
| target tables. (They could either be leaf partitions themselves, or they |
| could be further subpartitioned, or they could even be independent |
| tables.) Publications can also specify that changes are to be replicated |
| using the identity and schema of the partitioned root table instead of |
| that of the individual leaf partitions in which the changes actually |
| originate (see <link linkend="sql-createpublication"><command>CREATE PUBLICATION</command></link>). |
| </para> |
| </listitem> |
| </itemizedlist> |
| </sect1> |
| |
| <sect1 id="logical-replication-architecture"> |
| <title>Architecture</title> |
| |
| <para> |
| Logical replication starts by copying a snapshot of the data on the |
| publisher database. Once that is done, changes on the publisher are sent |
| to the subscriber as they occur in real time. The subscriber applies data |
| in the order in which commits were made on the publisher so that |
| transactional consistency is guaranteed for the publications within any |
| single subscription. |
| </para> |
| |
| <para> |
| Logical replication is built with an architecture similar to physical |
| streaming replication (see <xref linkend="streaming-replication"/>). It is |
| implemented by <quote>walsender</quote> and <quote>apply</quote> |
| processes. The walsender process starts logical decoding (described |
| in <xref linkend="logicaldecoding"/>) of the WAL and loads the standard |
| logical decoding plugin (pgoutput). The plugin transforms the changes read |
| from WAL to the logical replication protocol |
| (see <xref linkend="protocol-logical-replication"/>) and filters the data |
| according to the publication specification. The data is then continuously |
| transferred using the streaming replication protocol to the apply worker, |
| which maps the data to local tables and applies the individual changes as |
| they are received, in correct transactional order. |
| </para> |
| |
| <para> |
| The apply process on the subscriber database always runs with |
| <varname>session_replication_role</varname> set |
| to <literal>replica</literal>, which produces the usual effects on triggers |
| and constraints. |
| </para> |
| |
| <para> |
| The logical replication apply process currently only fires row triggers, |
| not statement triggers. The initial table synchronization, however, is |
| implemented like a <command>COPY</command> command and thus fires both row |
| and statement triggers for <command>INSERT</command>. |
| </para> |
| |
| <sect2 id="logical-replication-snapshot"> |
| <title>Initial Snapshot</title> |
| <para> |
| The initial data in existing subscribed tables are snapshotted and |
| copied in a parallel instance of a special kind of apply process. |
| This process will create its own replication slot and copy the existing |
| data. As soon as the copy is finished the table contents will become |
| visible to other backends. Once existing data is copied, the worker |
| enters synchronization mode, which ensures that the table is brought |
| up to a synchronized state with the main apply process by streaming |
| any changes that happened during the initial data copy using standard |
| logical replication. During this synchronization phase, the changes |
| are applied and committed in the same order as they happened on the |
| publisher. Once synchronization is done, control of the |
| replication of the table is given back to the main apply process where |
| replication continues as normal. |
| </para> |
| </sect2> |
| </sect1> |
| |
| <sect1 id="logical-replication-monitoring"> |
| <title>Monitoring</title> |
| |
| <para> |
| Because logical replication is based on a similar architecture as |
| <link linkend="streaming-replication">physical streaming replication</link>, |
| the monitoring on a publication node is similar to monitoring of a |
| physical replication primary |
| (see <xref linkend="streaming-replication-monitoring"/>). |
| </para> |
| |
| <para> |
| The monitoring information about subscription is visible in |
| <link linkend="monitoring-pg-stat-subscription"> |
| <structname>pg_stat_subscription</structname></link>. |
| This view contains one row for every subscription worker. A subscription |
| can have zero or more active subscription workers depending on its state. |
| </para> |
| |
| <para> |
| Normally, there is a single apply process running for an enabled |
| subscription. A disabled subscription or a crashed subscription will have |
| zero rows in this view. If the initial data synchronization of any |
| table is in progress, there will be additional workers for the tables |
| being synchronized. |
| </para> |
| </sect1> |
| |
| <sect1 id="logical-replication-security"> |
| <title>Security</title> |
| |
| <para> |
| A user able to modify the schema of subscriber-side tables can execute |
| arbitrary code as a superuser. Limit ownership |
| and <literal>TRIGGER</literal> privilege on such tables to roles that |
| superusers trust. Moreover, if untrusted users can create tables, use only |
| publications that list tables explicitly. That is to say, create a |
| subscription <literal>FOR ALL TABLES</literal> only when superusers trust |
| every user permitted to create a non-temp table on the publisher or the |
| subscriber. |
| </para> |
| |
| <para> |
| The role used for the replication connection must have |
| the <literal>REPLICATION</literal> attribute (or be a superuser). If the |
| role lacks <literal>SUPERUSER</literal> and <literal>BYPASSRLS</literal>, |
| publisher row security policies can execute. If the role does not trust |
| all table owners, include <literal>options=-crow_security=off</literal> in |
| the connection string; if a table owner then adds a row security policy, |
| that setting will cause replication to halt rather than execute the policy. |
| Access for the role must be configured in <filename>pg_hba.conf</filename> |
| and it must have the <literal>LOGIN</literal> attribute. |
| </para> |
| |
| <para> |
| In order to be able to copy the initial table data, the role used for the |
| replication connection must have the <literal>SELECT</literal> privilege on |
| a published table (or be a superuser). |
| </para> |
| |
| <para> |
| To create a publication, the user must have the <literal>CREATE</literal> |
| privilege in the database. |
| </para> |
| |
| <para> |
| To add tables to a publication, the user must have ownership rights on the |
| table. To create a publication that publishes all tables automatically, |
| the user must be a superuser. |
| </para> |
| |
| <para> |
| To create a subscription, the user must be a superuser. |
| </para> |
| |
| <para> |
| The subscription apply process will run in the local database with the |
| privileges of a superuser. |
| </para> |
| |
| <para> |
| Privileges are only checked once at the start of a replication connection. |
| They are not re-checked as each change record is read from the publisher, |
| nor are they re-checked for each change when applied. |
| </para> |
| </sect1> |
| |
| <sect1 id="logical-replication-config"> |
| <title>Configuration Settings</title> |
| |
| <para> |
| Logical replication requires several configuration options to be set. |
| </para> |
| |
| <para> |
| On the publisher side, <varname>wal_level</varname> must be set to |
| <literal>logical</literal>, and <varname>max_replication_slots</varname> |
| must be set to at least the number of subscriptions expected to connect, |
| plus some reserve for table synchronization. And |
| <varname>max_wal_senders</varname> should be set to at least the same as |
| <varname>max_replication_slots</varname> plus the number of physical |
| replicas that are connected at the same time. |
| </para> |
| |
| <para> |
| <varname>max_replication_slots</varname> must also be set on the subscriber. |
| It should be set to at least the number of subscriptions that will be added |
| to the subscriber, plus some reserve for table synchronization. |
| <varname>max_logical_replication_workers</varname> must be set to at least |
| the number of subscriptions, again plus some reserve for the table |
| synchronization. Additionally the <varname>max_worker_processes</varname> |
| may need to be adjusted to accommodate for replication workers, at least |
| (<varname>max_logical_replication_workers</varname> |
| + <literal>1</literal>). Note that some extensions and parallel queries |
| also take worker slots from <varname>max_worker_processes</varname>. |
| </para> |
| </sect1> |
| |
| <sect1 id="logical-replication-quick-setup"> |
| <title>Quick Setup</title> |
| |
| <para> |
| First set the configuration options in <filename>postgresql.conf</filename>: |
| <programlisting> |
| wal_level = logical |
| </programlisting> |
| The other required settings have default values that are sufficient for a |
| basic setup. |
| </para> |
| |
| <para> |
| <filename>pg_hba.conf</filename> needs to be adjusted to allow replication |
| (the values here depend on your actual network configuration and user you |
| want to use for connecting): |
| <programlisting> |
| host all repuser 0.0.0.0/0 md5 |
| </programlisting> |
| </para> |
| |
| <para> |
| Then on the publisher database: |
| <programlisting> |
| CREATE PUBLICATION mypub FOR TABLE users, departments; |
| </programlisting> |
| </para> |
| |
| <para> |
| And on the subscriber database: |
| <programlisting> |
| CREATE SUBSCRIPTION mysub CONNECTION 'dbname=foo host=bar user=repuser' PUBLICATION mypub; |
| </programlisting> |
| </para> |
| |
| <para> |
| The above will start the replication process, which synchronizes the |
| initial table contents of the tables <literal>users</literal> and |
| <literal>departments</literal> and then starts replicating |
| incremental changes to those tables. |
| </para> |
| </sect1> |
| </chapter> |