| ------------------------------------------------------------------------------ |
| PG_UPGRADE: IN-PLACE UPGRADES FOR POSTGRESQL |
| ------------------------------------------------------------------------------ |
| |
| Upgrading a PostgreSQL database from one major release to another can be |
| an expensive process. For minor upgrades, you can simply install new |
| executables and forget about upgrading existing data. But for major |
| upgrades, you have to export all of your data using pg_dump, install the |
| new release, run initdb to create a new cluster, and then import your |
| old data. If you have a lot of data, that can take a considerable amount |
| of time. If you have too much data, you may have to buy more storage |
| since you need enough room to hold the original data plus the exported |
| data. pg_upgrade can reduce the amount of time and disk space required |
| for many upgrades. |
| |
| The URL http://momjian.us/main/writings/pgsql/pg_upgrade.pdf contains a |
| presentation about pg_upgrade internals that mirrors the text |
| description below. |
| |
| ------------------------------------------------------------------------------ |
| WHAT IT DOES |
| ------------------------------------------------------------------------------ |
| |
| pg_upgrade is a tool that performs an in-place upgrade of existing |
| data. Some upgrades change the on-disk representation of data; |
| pg_upgrade cannot help in those upgrades. However, many upgrades do |
| not change the on-disk representation of a user-defined table. In those |
| cases, pg_upgrade can move existing user-defined tables from the old |
| database cluster into the new cluster. |
| |
| There are two factors that determine whether an in-place upgrade is |
| practical. |
| |
| Every table in a cluster shares the same on-disk representation of the |
| table headers and trailers and the on-disk representation of tuple |
| headers. If this changes between the old version of PostgreSQL and the |
| new version, pg_upgrade cannot move existing tables to the new cluster; |
| you will have to pg_dump the old data and then import that data into the |
| new cluster. |
| |
| Second, all data types should have the same binary representation |
| between the two major PostgreSQL versions. |
| |
| ------------------------------------------------------------------------------ |
| HOW IT WORKS |
| ------------------------------------------------------------------------------ |
| |
| To use pg_upgrade during an upgrade, start by installing a fresh |
| cluster using the newest version in a new directory. When you've |
| finished installation, the new cluster will contain the new executables |
| and the usual template0, template1, and postgres, but no user-defined |
| tables. At this point, you can shut down the old and new postmasters and |
| invoke pg_upgrade. |
| |
| When pg_upgrade starts, it ensures that all required executables are |
| present and contain the expected version numbers. The verification |
| process also checks the old and new $PGDATA directories to ensure that |
| the expected files and subdirectories are in place. If the verification |
| process succeeds, pg_upgrade starts the old postmaster and runs |
| pg_dumpall --schema-only to capture the metadata contained in the old |
| cluster. The script produced by pg_dumpall will be used in a later step |
| to recreate all user-defined objects in the new cluster. |
| |
| Note that the script produced by pg_dumpall will only recreate |
| user-defined objects, not system-defined objects. The new cluster will |
| contain the system-defined objects created by the latest version of |
| PostgreSQL. |
| |
| Once pg_upgrade has extracted the metadata from the old cluster, it |
| performs a number of bookkeeping tasks required to 'sync up' the new |
| cluster with the existing data. |
| |
| First, pg_upgrade copies the commit status information and 'next |
| transaction ID' from the old cluster to the new cluster. This step |
| ensures that the proper tuples are visible from the new cluster. |
| Remember, pg_upgrade does not export/import the content of user-defined |
| tables so the transaction IDs in the new cluster must match the |
| transaction IDs in the old data. pg_upgrade also copies the starting |
| address for write-ahead logs from the old cluster to the new cluster. |
| |
| Now pg_upgrade begins reconstructing the metadata obtained from the old |
| cluster using the first part of the pg_dumpall output. |
| |
| Next, pg_upgrade executes the remainder of the script produced earlier |
| by pg_dumpall --- this script effectively creates the complete |
| user-defined metadata from the old cluster to the new cluster. It |
| preserves the relfilenode numbers so TOAST and other references |
| to relfilenodes in user data is preserved. (See binary-upgrade usage |
| in pg_dump). |
| |
| Finally, pg_upgrade links or copies each user-defined table and its |
| supporting indexes and toast tables from the old cluster to the new |
| cluster. |
| |
| An important feature of the pg_upgrade design is that it leaves the |
| original cluster intact --- if a problem occurs during the upgrade, you |
| can still run the previous version, after renaming the tablespaces back |
| to the original names. |