Fix upgrade test: allow function removal (#2422)

Fix upgrade test: allow function removal and detect more deficiencies.

The age_upgrade regression test (added in #2364, refined in #2377, #2397,
install and a synthetic-initial -> current upgrade. Three gaps surfaced
in practice:

1. Function removal forced permanent C stubs.
   The synthetic '_initial' install is built from a fixed historical
   commit. CREATE EXTENSION resolves every CREATE FUNCTION ... AS
   '$libdir/age', '<symbol>' via dlsym at install time when
   check_function_bodies is on (the default). If a developer retires a
   C entry point in HEAD's age.so, step 10 aborts with "could not find
   function ... in file age.so" -- even though the immediately-following
   ALTER EXTENSION UPDATE would DROP that SQL declaration. The only way
   to keep the test green was to leave a permanent error-raising stub
   in age.so, and to remember to add a DROP to the upgrade template.

2. Modifications were under-detected.
   The function-property-change query did not compare probin or prosrc,
   so a C function whose symbol was renamed in the upgrade template, or
   a SQL/plpgsql function whose body changed in either path, slipped
   through.

3. Extension membership was not checked.
   A template that CREATEs an object but never ALTER EXTENSION ADDs it
   leaves a row in pg_proc/pg_class but no pg_depend deptype='e' link.
   pg_dump --extension would diverge, but the existing per-catalog diff
   queries all returned 0 rows.

Changes (regress/sql/age_upgrade.sql + regress/expected/age_upgrade.out):

* Step 10 wraps the synthetic CREATE EXTENSION in
  SET check_function_bodies = off; ... RESET check_function_bodies;
  Symbol resolution is deferred to call time. Step 11's ALTER EXTENSION
  UPDATE then DROPs any retired functions before any plan can call them.
  Step 35's fresh CREATE EXTENSION runs at the GUC default, so HEAD's
  sql/ <-> HEAD's age.so consistency is still enforced on the production
  install path.

* Steps 2 and 13 add probin and prosrc to the function snapshot.
  Step 21 reports probin and prosrc divergences alongside the existing
  property-change columns.

* Steps 7b and 18b add an extension-membership snapshot from
  pg_depend deptype='e' filtered to the AGE extension OID. Every member
  is labeled by stable identity (regprocedure, regtype, regoperator,
  opfname+strategy+types, etc.), never by raw OID, so OID drift between
  fresh and upgrade installs cannot produce false positives. Steps 33a
  and 33b report MISSING / EXTRA members. Step 34 adds extmembers_match
  to the summary row.

* Section-header step ranges updated to include the new sub-steps.

The change is fully self-contained: only regress/sql/age_upgrade.sql and
regress/expected/age_upgrade.out are modified. No production C, SQL,
build, or test files are touched. All 34 regression tests pass on a
clean tree.

Mutation-tested with 8 cases against the unmutated tree: baseline pass;
remove-function-with-DROP pass (no stub needed); remove-function-forget-
DROP fail; add-function-with-CREATE pass; add-function-forget-CREATE
fail; volatility-change-no-template fail; volatility-change-with-CREATE-
OR-REPLACE pass; C-symbol-rename-no-template fail. All eight expected
outcomes observed.

All 34 regression tests pass.

Co-authored-by: Claude <noreply@anthropic.com>

modified:   regress/expected/age_upgrade.out
modified:   regress/sql/age_upgrade.sql
2 files changed
tree: b83ca631fbcb0a16389f84427d6e8a42e9a8653b
  1. .github/
  2. docker/
  3. drivers/
  4. img/
  5. regress/
  6. sql/
  7. src/
  8. tools/
  9. .asf.yaml
  10. .dockerignore
  11. .gitignore
  12. age--1.6.0--1.7.0.sql
  13. age--1.7.0--y.y.y.sql
  14. age.control
  15. clang-format.5
  16. CONTRIBUTING.md
  17. LICENSE
  18. Makefile
  19. META.json
  20. NOTICE
  21. README.md
  22. RELEASE
README.md

Apache AGE is an extension for PostgreSQL that enables users to leverage a graph database on top of the existing relational databases. AGE is an acronym for A Graph Extension and is inspired by Bitnine's AgensGraph, a multi-model database fork of PostgreSQL. The basic principle of the project is to create a single storage that handles both the relational and graph data model so that the users can use the standard ANSI SQL along with openCypher, one of the most popular graph query languages today. There is a strong need for cohesive, easy-to-implement multi-model databases. As an extension of PostgreSQL, AGE supports all the functionalities and features of PostgreSQL while also offering a graph model to boot.

Apache AGE is :

  • Powerful: adds graph database support to the already popular PostgreSQL database: PostgreSQL is used by organizations including Apple, Spotify, and NASA.
  • Flexible: allows you to perform openCypher queries, which makes complex queries much easier to write. It also enables querying multiple graphs at the same time.
  • Intelligent: allows you to perform graph queries that are the basis for many next-level web services such as fraud detection, master data management, product recommendations, identity and relationship management, experience personalization, knowledge management, and more.
  • Cypher Query: supports graph query language
  • Hybrid Querying: enables SQL and/or Cypher
  • Querying: enables multiple graphs
  • Hierarchical: graph label organization
  • Property Indexes: on both vertices(nodes) and edges
  • Full PostgreSQL: supports PG features

Refer to our latest Apache AGE documentation to learn about installation, features, built-in functions, and Cypher queries.

Install the following essential libraries according to each OS. Building AGE from the source depends on the following Linux libraries (Ubuntu package names shown below):

  • CentOS
yum install gcc glibc glib-common readline readline-devel zlib zlib-devel flex bison
  • Fedora
dnf install gcc glibc bison flex readline readline-devel zlib zlib-devel
  • Ubuntu
sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison

Apache AGE is intended to be simple to install and run. It can be installed with Docker and other traditional ways.

You will need to install an AGE compatible version of Postgres, for now AGE supports Postgres 11, 12, 13, 14, 15, 16, 17 & 18. Supporting the latest versions is on AGE roadmap.

You can use a package management that your OS provides to download PostgreSQL.

sudo apt install postgresql

You can download the Postgres source code and install your own instance of Postgres. You can read instructions on how to install from source code for different versions on the official Postgres Website.

Clone the github repository or download the download an official release. Run the pg_config utility and check the version of PostgreSQL. Currently, only PostgreSQL versions 11, 12, 13, 14, 15, 16, 17 & 18 are supported. If you have any other version of Postgres, you will need to install PostgreSQL version 11, 12, 13, 14, 15, 16, 17 & 18.

pg_config

Run the following command in the source code directory of Apache AGE to build and install the extension.

make install

If the path to your Postgres installation is not in the PATH variable, add the path in the arguments:

make PG_CONFIG=/path/to/postgres/bin/pg_config install
docker pull apache/age

docker run \
    --name age  \
    -p 5455:5432 \
    -e POSTGRES_USER=postgresUser \
    -e POSTGRES_PASSWORD=postgresPW \
    -e POSTGRES_DB=postgresDB \
    -d \
    apache/age
docker exec -it age psql -d postgresDB -U postgresUser

For every connection of AGE you start, you will need to load the AGE extension.

CREATE EXTENSION age;
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

If you are using AGE from a database client that does not default to autocommit — most commonly psycopg v3 or JDBC — you must understand how PostgreSQL‘s transaction semantics apply to AGE’s setup and DDL-like functions. Otherwise, you may see graphs or labels that appear to be created successfully, but are not visible from new connections.

This is not a bug in AGE — it is standard PostgreSQL behavior. AGE's DDL-like functions write to the catalog, and catalog writes only become visible to other sessions after the enclosing transaction is committed.

What is and isn't transactional

StatementScopeNeeds commit to be visible elsewhere?
LOAD 'age'Session-local (loads the .so into the current backend)No
SET search_path = ag_catalog, "$user", publicSession-localNo
SELECT create_graph('g')Writes to ag_graph and creates a schemaYes
SELECT create_vlabel('g', 'L') / create_elabel(...)Writes to ag_label and creates a tableYes
SELECT drop_graph('g', true) / drop_label(...)Writes to catalogYes
SELECT load_labels_from_file(...) / load_edges_from_file(...)Writes to catalog + dataYes
cypher('g', $$ CREATE (:L {...}) $$)Writes dataYes

In a client that defaults to autocommit (e.g. psql), every statement commits automatically, so this is never noticed. In a non-autocommit client, the first statement you run implicitly opens a transaction that stays open until you call commit(), rollback(), or close the connection.

psycopg v3 — the “savepoint gotcha”

The common pitfall is that with connection.transaction(): in psycopg does not start a new top-level transaction when one is already open — it creates a savepoint inside the existing outer transaction. Releasing a savepoint is not a commit, so your create_graph write stays invisible to other sessions until the outer transaction is explicitly committed.

❌ Broken: graph is not visible from a new connection

import psycopg

params = {"host": "localhost", "port": 5432, "user": "postgres",
          "password": "pw", "dbname": "mydb"}

# --- First connection ---
conn = psycopg.connect(**params)
conn.execute("LOAD 'age'")                                      # implicitly opens a txn
conn.execute("SET search_path = ag_catalog, '$user', public")

with conn.transaction(), conn.cursor() as cur:                  # <-- SAVEPOINT, not a real txn
    cur.execute("SELECT * FROM create_graph('my_graph')")
# outer transaction is STILL OPEN here

conn.close()  # outer transaction is rolled back on close → my_graph is gone

# --- New connection ---
conn = psycopg.connect(**params)
conn.execute("LOAD 'age'")
conn.execute("SET search_path = ag_catalog, '$user', public")
with conn.cursor() as cur:
    cur.execute("SELECT name FROM ag_graph;")
    # 'my_graph' is NOT in the results

✅ Fix 1: explicit commit() after setup

conn = psycopg.connect(**params)
conn.execute("LOAD 'age'")
conn.execute("SET search_path = ag_catalog, '$user', public")
conn.commit()   # <-- closes the implicit outer txn

with conn.transaction(), conn.cursor() as cur:
    cur.execute("SELECT * FROM create_graph('my_graph')")
# this transaction block is now top-level and commits on exit
conn.close()

✅ Fix 2: enable autocommit on the connection

conn = psycopg.connect(**params, autocommit=True)
conn.execute("LOAD 'age'")
conn.execute("SET search_path = ag_catalog, '$user', public")
conn.execute("SELECT * FROM create_graph('my_graph')")          # commits immediately
conn.close()

You can also toggle autocommit at runtime with conn.set_autocommit(True).

JDBC

JDBC connections also default to autocommit true per the JDBC spec, but many frameworks (Spring, etc.) flip it off. If you are running AGE DDL-like calls from JDBC, either:

connection.setAutoCommit(true);
// ... LOAD 'age'; SET search_path ...; SELECT create_graph(...);

or keep autocommit off and explicitly commit after DDL-like calls:

stmt.execute("LOAD 'age'");
stmt.execute("SET search_path = ag_catalog, \"$user\", public;");
stmt.execute("SELECT create_graph('my_graph');");
connection.commit();   // make the graph visible to other sessions

Rule of thumb

If an AGE call creates, drops, or modifies a graph, label, vertex, edge, or property, it is a transactional write. In a non-autocommit client, it will not be visible to other sessions until you explicitly commit().

To create a graph, use the create_graph function located in the ag_catalog namespace.

SELECT create_graph('graph_name');

To create a single vertex with label and properties, use the CREATE clause.

SELECT * 
FROM cypher('graph_name', $$
    CREATE (:label {property:"Node A"})
$$) as (v agtype);
SELECT * 
FROM cypher('graph_name', $$
    CREATE (:label {property:"Node B"})
$$) as (v agtype);

To create an edge between two nodes and set its properties:

SELECT * 
FROM cypher('graph_name', $$
    MATCH (a:label), (b:label)
    WHERE a.property = 'Node A' AND b.property = 'Node B'
    CREATE (a)-[e:RELTYPE {property:a.property + '<->' + b.property}]->(b)
    RETURN e
$$) as (e agtype);

And to query the connected nodes:

SELECT * from cypher('graph_name', $$
        MATCH (V)-[R]-(V2)
        RETURN V,R,V2
$$) as (V agtype, R agtype, V2 agtype);

Starting with Apache AGE is very simple. You can easily select your platform and incorporate the relevant SDK into your code.

Apache AGE Viewer is a user interface for Apache AGE that provides visualization and exploration of data. This web visualization tool allows users to enter complex graph queries and explore the results in graph and table forms. Apache AGE Viewer is enhanced to proceed with extensive graph data and discover insights through various graph algorithms. Apache AGE Viewer will become a graph data administration and development platform for Apache AGE to support multiple relational databases: https://github.com/apache/age-viewer.

This is a visualization tool. After installing AGE Extension, you may use this tool to get access to the visualization features.

Viewer gdb, and graph

You can also get help from these videos.

You can improve ongoing efforts or initiate new ones by sending pull requests to this repository. Also, you can learn from the code review process, how to merge pull requests, and from code style compliance to documentation by visiting the Apache AGE official site - Developer Guidelines. Send all your comments and inquiries to the user mailing list, users@age.apache.org.