Fix MATCH on brand-new label after CREATE returning 0 rows (#2341)

* Fix MATCH on brand-new label after CREATE returning 0 rows (issue #2193)

When CREATE introduces a new label and a subsequent MATCH references it
(e.g., CREATE (:Person) WITH ... MATCH (p:Person)), the query returns
0 rows on first execution but works on the second.

Root cause: match_check_valid_label() in transform_cypher_match() runs
before transform_prev_cypher_clause() processes the predecessor chain.
Since CREATE has not yet executed its transform (which creates the label
table as a side effect), the label is not in the cache and the check
generates a One-Time Filter: false plan that returns no rows.

Fix: Skip the early label validity check when the predecessor clause
chain contains a data-modifying operation (CREATE, SET, DELETE, MERGE).
After transform_prev_cypher_clause() completes and any new labels exist
in the cache, run a deferred label check. If the labels are still
invalid at that point, generate an empty result via makeBoolConst(false).

This preserves the existing behavior for MATCH without DML predecessors
(e.g., MATCH-MATCH chains still get the early check and proper error
messages for invalid labels).
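
The ordering problem and the deferred check can be sketched in standalone C. This is a toy model, not the AGE source: the `clause` and `label_cache` types and the functions `chain_has_dml`, `transform_prev`, and `match_needs_false_qual` are hypothetical stand-ins for the parse nodes, the label cache, clause_chain_has_dml(), transform_prev_cypher_clause(), and the label checks in transform_cypher_match(). The real early-check path may also report an error rather than only forcing an empty result.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for AGE's parse nodes; not the real types. */
typedef enum {
    CLAUSE_MATCH, CLAUSE_CREATE, CLAUSE_SET, CLAUSE_DELETE,
    CLAUSE_MERGE, CLAUSE_WITH, CLAUSE_RETURN
} clause_kind;

typedef struct clause {
    clause_kind kind;
    const char *label;          /* label created or referenced, or NULL */
    const struct clause *prev;  /* predecessor clause, or NULL */
} clause;

/* Toy label cache: a small fixed-size set of label names. */
#define MAX_LABELS 8
typedef struct {
    const char *names[MAX_LABELS];
    size_t count;
} label_cache;

static bool cache_has(const label_cache *cache, const char *label)
{
    if (label == NULL)
        return true;            /* no label to validate */
    for (size_t i = 0; i < cache->count; i++)
        if (strcmp(cache->names[i], label) == 0)
            return true;
    return false;
}

static void cache_add(label_cache *cache, const char *label)
{
    if (!cache_has(cache, label) && cache->count < MAX_LABELS)
        cache->names[cache->count++] = label;
}

/* Models clause_chain_has_dml(): walk the predecessor chain. */
static bool chain_has_dml(const clause *c)
{
    for (; c != NULL; c = c->prev)
        if (c->kind == CLAUSE_CREATE || c->kind == CLAUSE_SET ||
            c->kind == CLAUSE_DELETE || c->kind == CLAUSE_MERGE)
            return true;
    return false;
}

/* Models transform_prev_cypher_clause(): transforming a CREATE
 * registers its label in the cache as a side effect. */
static void transform_prev(const clause *c, label_cache *cache)
{
    if (c == NULL)
        return;
    transform_prev(c->prev, cache);
    if (c->kind == CLAUSE_CREATE)
        cache_add(cache, c->label);
}

/* Returns true when MATCH must be forced to yield zero rows. */
static bool match_needs_false_qual(const clause *match, label_cache *cache)
{
    assert(match != NULL && match->kind == CLAUSE_MATCH);

    bool has_dml = chain_has_dml(match->prev);  /* computed once, reused */
    bool force_empty = false;

    /* Early check: only safe when no predecessor can create labels. */
    if (!has_dml && !cache_has(cache, match->label))
        force_empty = true;

    transform_prev(match->prev, cache);

    /* Deferred check: runs after new labels have reached the cache. */
    if (has_dml && !cache_has(cache, match->label))
        force_empty = true;

    return force_empty;
}
```

In this model, a CREATE (:Person) ... MATCH (p:Person) chain against an empty cache no longer forces an empty result, while a MATCH on a label that no predecessor creates still does.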

Depends on: PR #2340 (clause_chain_has_dml helper)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address review feedback: fix variable registration for deferred label check

When the deferred label validity check (DML predecessor + non-existent
label) found an invalid label, the code skipped transform_match_pattern()
entirely, which meant MATCH-introduced variables were never registered
in the namespace. This would cause errors if a later clause referenced
those variables (e.g., RETURN p).

Fix: mirror the early-check strategy by injecting a paradoxical WHERE
(true = false) and always calling transform_match_pattern(). Variables
get registered normally; zero rows are returned via the impossible qual.
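
A minimal sketch of why the pattern transform must always run, again as a toy model in standalone C. The `query_state` type and the functions `register_pattern_vars`, `var_is_visible`, and `transform_match` are hypothetical; they only mimic the idea that transform_match_pattern() registers variables and that an impossible qual, not a skipped transform, is what produces zero rows.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_VARS 8

/* Toy transform state: the variable namespace plus a flag that
 * records whether an always-false WHERE was injected. */
typedef struct {
    const char *vars[MAX_VARS];
    size_t nvars;
    bool where_is_false;
} query_state;

/* Stand-in for the part of transform_match_pattern() that
 * registers a MATCH-introduced variable. */
static void register_pattern_vars(query_state *q, const char *var)
{
    assert(q->nvars < MAX_VARS);
    q->vars[q->nvars++] = var;
}

/* What a later clause such as RETURN p needs to succeed. */
static bool var_is_visible(const query_state *q, const char *var)
{
    for (size_t i = 0; i < q->nvars; i++)
        if (strcmp(q->vars[i], var) == 0)
            return true;
    return false;
}

static void transform_match(query_state *q, const char *var, bool label_valid)
{
    /* Invalid label: request zero rows through the qual ... */
    if (!label_valid)
        q->where_is_false = true;

    /* ... but always transform the pattern so variables exist. */
    register_pattern_vars(q, var);
}
```

With an invalid label, `p` is still visible to later clauses and the query returns no rows because of the injected qual.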

Also add ORDER BY to multi-row regression tests for deterministic output,
and add a test case for DML predecessor + non-existent label + returning
a MATCH-introduced variable.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address Copilot review: DRY false-where helper, cache has_dml, ORDER BY in tests

- Factor duplicated WHERE true=false construction into
  make_false_where_clause() helper (used in both early and deferred
  label validation paths)
- Compute clause_chain_has_dml() once and reuse, avoiding repeated
  clause chain traversal
- Add ORDER BY to the single-CREATE City regression test for
  deterministic result ordering

* Address Copilot review: volatile false predicate, DML side-effect test

1. Prevent plan elimination of DML predecessor: replace constant
   (true = false) with volatile (random() IS NULL) in the deferred
   label check path. PG's planner can constant-fold the former into
   a One-Time Filter: false, skipping the DML scan entirely.

2. Unify make_false_where_clause(bool volatile_needed): merge the
   constant and volatile variants into a single parameterized
   function. Call sites are now self-documenting:
   - make_false_where_clause(false) for non-DML path
   - make_false_where_clause(true) for DML predecessor path

3. Document why add_volatile_wrapper() cannot be reused here (it
   operates post-transform at the Expr level and returns agtype,
   while the WHERE clause is built at the parse-tree level).

4. Add regression test verifying CREATE side effects persist when
   MATCH references a non-existent label after a DML predecessor.

All regression tests pass (cypher_match: ok).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Replace non-ASCII em dashes with -- in C comments

ASCII-only codebase convention; avoids encoding/tooling issues.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
3 files changed
tree: 7c8e723de4c82fd28634dc50cf06591c0c88189a
README.md

Apache AGE is an extension for PostgreSQL that lets users run a graph database on top of an existing relational database. AGE is an acronym for A Graph Extension and is inspired by Bitnine's AgensGraph, a multi-model database fork of PostgreSQL. The basic principle of the project is to create a single storage that handles both the relational and graph data models, so that users can use standard ANSI SQL along with openCypher, one of the most popular graph query languages today. There is a strong need for cohesive, easy-to-implement multi-model databases. As an extension of PostgreSQL, AGE supports all the functionality and features of PostgreSQL while also offering a graph model.

Apache AGE is:

  • Powerful: adds graph database support to the already popular PostgreSQL database, which is used by organizations including Apple, Spotify, and NASA.
  • Flexible: allows you to perform openCypher queries, which makes complex queries much easier to write. It also enables querying multiple graphs at the same time.
  • Intelligent: allows you to perform graph queries that are the basis for many next-level web services such as fraud detection, master data management, product recommendations, identity and relationship management, experience personalization, knowledge management, and more.
  • Cypher Query: supports graph query language
  • Hybrid Querying: enables SQL and/or Cypher
  • Querying: enables multiple graphs
  • Hierarchical: graph label organization
  • Property Indexes: on both vertices (nodes) and edges
  • Full PostgreSQL: supports PG features

Refer to our latest Apache AGE documentation to learn about installation, features, built-in functions, and Cypher queries.

Building AGE from source depends on the following libraries. Install them with the command for your OS:

  • CentOS
yum install gcc glibc glibc-common readline readline-devel zlib zlib-devel flex bison
  • Fedora
dnf install gcc glibc bison flex readline readline-devel zlib zlib-devel
  • Ubuntu
sudo apt-get install build-essential libreadline-dev zlib1g-dev flex bison

Apache AGE is intended to be simple to install and run. It can be installed with Docker or built from source.

You will need an AGE-compatible version of Postgres. AGE currently supports Postgres 11, 12, 13, 14, 15, 16, 17, and 18. Supporting the latest versions is on the AGE roadmap.

You can use the package manager that your OS provides to install PostgreSQL.

sudo apt install postgresql

Alternatively, you can download the Postgres source code and install your own instance of Postgres. Instructions on how to install from source for different versions are available on the official Postgres website.

Clone the GitHub repository or download an official release. Run the pg_config utility and check the version of PostgreSQL. Currently, only PostgreSQL versions 11 through 18 are supported. If you have any other version of Postgres, you will need to install one of the supported versions.

pg_config
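
The version check can also be automated. The sketch below assumes the input is the single line printed by `pg_config --version` (for example "PostgreSQL 16.2"); the function names `pg_major_version` and `age_supports_pg` are hypothetical, and the supported range 11 to 18 is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Parse the major version from a line such as "PostgreSQL 16.2".
 * Returns -1 when the line does not have the expected prefix. */
static int pg_major_version(const char *version_line)
{
    int major = 0;

    assert(version_line != NULL);
    if (sscanf(version_line, "PostgreSQL %d", &major) != 1)
        return -1;
    return major;
}

/* True when the major version falls in the range this README lists
 * as supported (11 through 18). */
static bool age_supports_pg(const char *version_line)
{
    int major = pg_major_version(version_line);

    return major >= 11 && major <= 18;
}
```

A build script could call `age_supports_pg` on the captured output and stop early with a clear message before running make.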

Run the following command in the source code directory of Apache AGE to build and install the extension.

make install

If the path to your Postgres installation is not in the PATH variable, add the path in the arguments:

make PG_CONFIG=/path/to/postgres/bin/pg_config install
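
To run AGE with Docker instead, pull the official image:

docker pull apache/age

Create and start a container:

docker run \
    --name age \
    -p 5455:5432 \
    -e POSTGRES_USER=postgresUser \
    -e POSTGRES_PASSWORD=postgresPW \
    -e POSTGRES_DB=postgresDB \
    -d \
    apache/age

Then open a psql session inside the container:

docker exec -it age psql -d postgresDB -U postgresUser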

The CREATE EXTENSION statement is needed only once per database. For every connection you start, you will need to load the AGE extension and set the search path.

CREATE EXTENSION age;
LOAD 'age';
SET search_path = ag_catalog, "$user", public;

To create a graph, use the create_graph function located in the ag_catalog namespace.

SELECT create_graph('graph_name');

To create a single vertex with label and properties, use the CREATE clause.

SELECT *
FROM cypher('graph_name', $$
    CREATE (:label {property:"Node A"})
$$) as (v agtype);

SELECT *
FROM cypher('graph_name', $$
    CREATE (:label {property:"Node B"})
$$) as (v agtype);

To create an edge between two nodes and set its properties:

SELECT * 
FROM cypher('graph_name', $$
    MATCH (a:label), (b:label)
    WHERE a.property = 'Node A' AND b.property = 'Node B'
    CREATE (a)-[e:RELTYPE {property:a.property + '<->' + b.property}]->(b)
    RETURN e
$$) as (e agtype);

And to query the connected nodes:

SELECT *
FROM cypher('graph_name', $$
    MATCH (V)-[R]-(V2)
    RETURN V,R,V2
$$) as (V agtype, R agtype, V2 agtype);
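
Every query above has the same shape: a Cypher string wrapped in a call to cypher() with a column definition list. A client written in C might assemble that wrapper as sketched below. The function `build_cypher_call` is hypothetical and assumes trusted input; it only rejects the two characters sequences that would break the quoting used here.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "SELECT * FROM cypher('<graph>', $$ <cypher> $$) AS (<columns>);".
 * Returns the length written, or -1 if the result does not fit or the
 * inputs would break the single-quote or dollar-quote delimiters. */
static int build_cypher_call(char *buf, size_t buflen, const char *graph,
                             const char *cypher, const char *columns)
{
    int n;

    assert(buf != NULL && graph != NULL && cypher != NULL && columns != NULL);

    if (strchr(graph, '\'') != NULL)   /* would end the graph name literal */
        return -1;
    if (strstr(cypher, "$$") != NULL)  /* would end the dollar-quoted body */
        return -1;

    n = snprintf(buf, buflen,
                 "SELECT * FROM cypher('%s', $$ %s $$) AS (%s);",
                 graph, cypher, columns);
    return (n >= 0 && (size_t) n < buflen) ? n : -1;
}
```

The resulting string can then be sent with whatever client library is in use, for example libpq's PQexec, after the session has loaded AGE and set the search path.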

Starting with Apache AGE is very simple. You can easily select your platform and incorporate the relevant SDK into your code.

Apache AGE Viewer is a user interface for Apache AGE that provides visualization and exploration of data. This web visualization tool allows users to enter complex graph queries and explore the results in graph and table forms. Apache AGE Viewer is designed to handle large graph data sets and to surface insights through various graph algorithms. It is intended to become a graph data administration and development platform for Apache AGE with support for multiple relational databases: https://github.com/apache/age-viewer.

After installing the AGE extension, you can use this tool to access the visualization features.


You can also get help from these videos.

You can improve ongoing efforts or initiate new ones by sending pull requests to this repository. You can also learn about the code review process, how pull requests are merged, code style compliance, and documentation by visiting the Developer Guidelines on the official Apache AGE site. Send all your comments and inquiries to the user mailing list, users@age.apache.org.