Fix upgrade test: replace data-integrity checks with catalog comparison (#2403)
The age_upgrade regression test (added in #2364, improved in #2377, #2397)
was designed to validate the upgrade template (age--<VER>--y.y.y.sql) by
creating graph data before the upgrade and verifying it survived afterward.
This approach had two fundamental problems:
1. It did not detect incomplete upgrade templates. The test verified that
graph data (vertices, edges, checksums, GIN indexes) survived ALTER
EXTENSION UPDATE, but never checked whether new SQL objects (functions,
views, relations, indexes, types, operators, casts, constraints) were
actually created by the template. A developer could add a new function
to sql/ and sql_files, forget to add it to the upgrade template, and
all tests would pass — the function existed via the fresh CREATE
EXTENSION install that ran before the upgrade test, but would be
missing for users who upgraded via ALTER EXTENSION UPDATE.
2. The data-integrity checks relied on cypher queries (MATCH/RETURN) within
the same backend session after DROP EXTENSION + CREATE EXTENSION. This
caused intermittent failures on some PostgreSQL versions where AGE's
internal type cache (agtype OID) was not properly refreshed after the
extension was dropped and recreated, resulting in 'type with OID 0
does not exist' errors. The data-integrity aspect was also redundant —
ALTER EXTENSION UPDATE runs DDL statements and does not touch heap data,
so data survival is guaranteed by PostgreSQL and not a meaningful test.
The fix replaces the entire test with a comprehensive catalog comparison:
1. Snapshot the ag_catalog schema from the fresh install across seven
PostgreSQL system catalogs:
- pg_proc: functions, aggregates, procedures (name, args, and
properties: volatility, strictness, kind, return type, setof)
- pg_class: tables, views, sequences, indexes (name, kind)
- pg_type: types (name, type category)
- pg_operator: operators (name, left/right operand types)
- pg_cast: casts involving AGE types (source, target, context)
- pg_opclass: operator classes (name, access method)
- pg_constraint: constraints (name, type, table, referenced table)
2. DROP EXTENSION, CREATE EXTENSION at the synthetic initial version,
then ALTER EXTENSION UPDATE to the current version via the stamped
upgrade template.
3. Snapshot the catalog again after upgrade.
4. Compare: any object present in the fresh snapshot but missing after
upgrade means the template is incomplete. Any object present after
upgrade but not in the fresh snapshot means the template creates
something unexpected. Function properties (volatility, strictness,
prokind, return type) are also compared for functions that exist in
both — catching cases where a CREATE OR REPLACE in the template
changes a function's signature or behavior.
Additional improvements from code review feedback:
- Graph cleanup in Step 1 uses a DO block with PERFORM and suppressed
NOTICEs to produce deterministic output regardless of prior test state.
- The pg_class snapshot includes indexes (relkind 'i') in addition to
tables, views, and sequences.
- Diagnostic output includes relkind/typtype suffixes for actionable diffs.
- Summary uses boolean equality checks (funcs_match, rels_match, etc.)
instead of absolute counts, so the expected output does not need
updating when new objects are added to AGE. Developers who correctly
add objects to both sql/ and the template will never need to modify
this test or its expected output.
This approach:
- Catches the actual failure mode: incomplete upgrade templates.
- Covers all SQL object categories: functions (including aggregates),
relations, types, operators, casts, operator classes, and constraints.
- Detects property changes on existing functions (volatility, strictness,
kind, return type changes).
- Uses only plain SQL catalog queries — no cypher, no .so cache issues.
- Works reliably across all PostgreSQL versions.
- Reports the exact missing/extra/changed object in the diff output.
- Is maintenance-free: no expected output changes needed when AGE grows.
Makefile: updated step 5 comment to reflect catalog comparison approach.
All 33 regression tests pass.
Co-authored-by: Claude <noreply@anthropic.com>
modified: Makefile
modified: regress/expected/age_upgrade.out
modified: regress/sql/age_upgrade.sqlApache AGE is an extension for PostgreSQL that enables users to leverage a graph database on top of the existing relational databases. AGE is an acronym for A Graph Extension and is inspired by Bitnine's AgensGraph, a multi-model database fork of PostgreSQL. The basic principle of the project is to create a single storage that handles both the relational and graph data model so that the users can use the standard ANSI SQL along with openCypher, one of the most popular graph query languages today. There is a strong need for cohesive, easy-to-implement multi-model databases. As an extension of PostgreSQL, AGE supports all the functionalities and features of PostgreSQL while also offering a graph model to boot.
Apache AGE is :
Refer to our latest Apache AGE documentation to learn about installation, features, built-in functions, and Cypher queries.
Install the following essential libraries according to each OS. Building AGE from the source depends on the following Linux libraries (Ubuntu package names shown below):
yum install gcc glibc glib-common readline readline-devel zlib zlib-devel flex bison
dnf install gcc glibc bison flex readline readline-devel zlib zlib-devel
sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison
Apache AGE is intended to be simple to install and run. It can be installed with Docker and other traditional ways.
You will need to install an AGE compatible version of Postgres, for now AGE supports Postgres 11, 12, 13, 14, 15, 16, 17 & 18. Supporting the latest versions is on AGE roadmap.
You can use a package management that your OS provides to download PostgreSQL.
sudo apt install postgresql
You can download the Postgres source code and install your own instance of Postgres. You can read instructions on how to install from source code for different versions on the official Postgres Website.
Clone the github repository or download the download an official release. Run the pg_config utility and check the version of PostgreSQL. Currently, only PostgreSQL versions 11, 12, 13, 14, 15, 16, 17 & 18 are supported. If you have any other version of Postgres, you will need to install PostgreSQL version 11, 12, 13, 14, 15, 16, 17 & 18.
pg_config
Run the following command in the source code directory of Apache AGE to build and install the extension.
make install
If the path to your Postgres installation is not in the PATH variable, add the path in the arguments:
make PG_CONFIG=/path/to/postgres/bin/pg_config install
docker pull apache/age
docker run \ --name age \ -p 5455:5432 \ -e POSTGRES_USER=postgresUser \ -e POSTGRES_PASSWORD=postgresPW \ -e POSTGRES_DB=postgresDB \ -d \ apache/age
docker exec -it age psql -d postgresDB -U postgresUser
For every connection of AGE you start, you will need to load the AGE extension.
CREATE EXTENSION age;
LOAD 'age';
SET search_path = ag_catalog, "$user", public;
If you are using AGE from a database client that does not default to autocommit — most commonly psycopg v3 or JDBC — you must understand how PostgreSQL‘s transaction semantics apply to AGE’s setup and DDL-like functions. Otherwise, you may see graphs or labels that appear to be created successfully, but are not visible from new connections.
This is not a bug in AGE — it is standard PostgreSQL behavior. AGE's DDL-like functions write to the catalog, and catalog writes only become visible to other sessions after the enclosing transaction is committed.
| Statement | Scope | Needs commit to be visible elsewhere? |
|---|---|---|
LOAD 'age' | Session-local (loads the .so into the current backend) | No |
SET search_path = ag_catalog, "$user", public | Session-local | No |
SELECT create_graph('g') | Writes to ag_graph and creates a schema | Yes |
SELECT create_vlabel('g', 'L') / create_elabel(...) | Writes to ag_label and creates a table | Yes |
SELECT drop_graph('g', true) / drop_label(...) | Writes to catalog | Yes |
SELECT load_labels_from_file(...) / load_edges_from_file(...) | Writes to catalog + data | Yes |
cypher('g', $$ CREATE (:L {...}) $$) | Writes data | Yes |
In a client that defaults to autocommit (e.g. psql), every statement commits automatically, so this is never noticed. In a non-autocommit client, the first statement you run implicitly opens a transaction that stays open until you call commit(), rollback(), or close the connection.
The common pitfall is that with connection.transaction(): in psycopg does not start a new top-level transaction when one is already open — it creates a savepoint inside the existing outer transaction. Releasing a savepoint is not a commit, so your create_graph write stays invisible to other sessions until the outer transaction is explicitly committed.
import psycopg params = {"host": "localhost", "port": 5432, "user": "postgres", "password": "pw", "dbname": "mydb"} # --- First connection --- conn = psycopg.connect(**params) conn.execute("LOAD 'age'") # implicitly opens a txn conn.execute("SET search_path = ag_catalog, '$user', public") with conn.transaction(), conn.cursor() as cur: # <-- SAVEPOINT, not a real txn cur.execute("SELECT * FROM create_graph('my_graph')") # outer transaction is STILL OPEN here conn.close() # outer transaction is rolled back on close → my_graph is gone # --- New connection --- conn = psycopg.connect(**params) conn.execute("LOAD 'age'") conn.execute("SET search_path = ag_catalog, '$user', public") with conn.cursor() as cur: cur.execute("SELECT name FROM ag_graph;") # 'my_graph' is NOT in the results
commit() after setupconn = psycopg.connect(**params) conn.execute("LOAD 'age'") conn.execute("SET search_path = ag_catalog, '$user', public") conn.commit() # <-- closes the implicit outer txn with conn.transaction(), conn.cursor() as cur: cur.execute("SELECT * FROM create_graph('my_graph')") # this transaction block is now top-level and commits on exit conn.close()
conn = psycopg.connect(**params, autocommit=True) conn.execute("LOAD 'age'") conn.execute("SET search_path = ag_catalog, '$user', public") conn.execute("SELECT * FROM create_graph('my_graph')") # commits immediately conn.close()
You can also toggle autocommit at runtime with conn.set_autocommit(True).
JDBC connections also default to autocommit true per the JDBC spec, but many frameworks (Spring, etc.) flip it off. If you are running AGE DDL-like calls from JDBC, either:
connection.setAutoCommit(true); // ... LOAD 'age'; SET search_path ...; SELECT create_graph(...);
or keep autocommit off and explicitly commit after DDL-like calls:
stmt.execute("LOAD 'age'"); stmt.execute("SET search_path = ag_catalog, \"$user\", public;"); stmt.execute("SELECT create_graph('my_graph');"); connection.commit(); // make the graph visible to other sessions
If an AGE call creates, drops, or modifies a graph, label, vertex, edge, or property, it is a transactional write. In a non-autocommit client, it will not be visible to other sessions until you explicitly
commit().
To create a graph, use the create_graph function located in the ag_catalog namespace.
SELECT create_graph('graph_name');
To create a single vertex with label and properties, use the CREATE clause.
SELECT * FROM cypher('graph_name', $$ CREATE (:label {property:"Node A"}) $$) as (v agtype);
SELECT * FROM cypher('graph_name', $$ CREATE (:label {property:"Node B"}) $$) as (v agtype);
To create an edge between two nodes and set its properties:
SELECT * FROM cypher('graph_name', $$ MATCH (a:label), (b:label) WHERE a.property = 'Node A' AND b.property = 'Node B' CREATE (a)-[e:RELTYPE {property:a.property + '<->' + b.property}]->(b) RETURN e $$) as (e agtype);
And to query the connected nodes:
SELECT * from cypher('graph_name', $$
MATCH (V)-[R]-(V2)
RETURN V,R,V2
$$) as (V agtype, R agtype, V2 agtype);
Starting with Apache AGE is very simple. You can easily select your platform and incorporate the relevant SDK into your code.
Apache AGE Viewer is a user interface for Apache AGE that provides visualization and exploration of data. This web visualization tool allows users to enter complex graph queries and explore the results in graph and table forms. Apache AGE Viewer is enhanced to proceed with extensive graph data and discover insights through various graph algorithms. Apache AGE Viewer will become a graph data administration and development platform for Apache AGE to support multiple relational databases: https://github.com/apache/age-viewer.
This is a visualization tool. After installing AGE Extension, you may use this tool to get access to the visualization features.
You can also get help from these videos.
You can improve ongoing efforts or initiate new ones by sending pull requests to this repository. Also, you can learn from the code review process, how to merge pull requests, and from code style compliance to documentation by visiting the Apache AGE official site - Developer Guidelines. Send all your comments and inquiries to the user mailing list, users@age.apache.org.