| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| |
| # Apache Groovy — Repository Architecture |
| |
| A contributor-facing map of how the Groovy compiler and runtime are |
| organised in this repository. This document is for people working *on* |
| Groovy. For documentation aimed at people *using* Groovy, see |
| <https://groovy.apache.org/> and the AsciiDoc sources under |
| `src/spec/doc/` and `subprojects/<module>/src/spec/doc/`. |
| |
| This is an overview, not a reference. It exists to give a new |
| contributor, human or AI, enough orientation to read the code |
| productively and to avoid a small set of common missteps. Code is the |
| source of truth; this document is a pointer file. |
| |
| ## Repository layout (top level) |
| |
| | Path | What lives there | |
| |---|---| |
| | `src/main/java/org/codehaus/groovy/` | Core compiler and runtime (legacy package — most of the codebase) | |
| | `src/main/java/org/apache/groovy/` | Newer code added under the `org.apache.groovy.*` package convention | |
| | `src/main/java/groovy/` | User-facing API (`groovy.lang.*`, `groovy.util.*`, etc.) | |
| | `src/main/groovy/` | Groovy sources compiled into the core jar | |
| | `src/main/resources/` | Service files, META-INF, default scripts | |
| | `src/antlr/` | ANTLR4 grammar (`GroovyLexer.g4`, `GroovyParser.g4`) — see "Generated code" below | |
| | `src/spec/doc/` | User-facing AsciiDoc reference docs | |
| | `src/spec/test/` | Executable Groovy snippets `include::`'d by the AsciiDoc sources | |
| | `src/test/` | JUnit / Spock tests for the core jar | |
| | `subprojects/` | ~50 modular subprojects (groovy-json, groovy-sql, groovy-xml, groovy-typecheckers, parser-antlr4 wiring, etc.) | |
| | `subprojects/groovy-binary/` | Aggregator that produces the final distribution and the published spec | |
| | `subprojects/binary-compatibility/` | Enforces public-API binary compatibility across releases | |
| | `subprojects/tests-preview/` | Tests that depend on preview JDK features | |
| | `bootstrap/`, `buildSrc/`, `build-logic/` | Build infrastructure (Gradle convention plugins, bootstrap helpers) | |
| |
| When in doubt, add new code under `org.apache.groovy.*`. The older |
| `org.codehaus.groovy.*` packages remain for legacy reasons and are |
| kept stable for compatibility. |
| |
| ## Compilation pipeline |
| |
| The driver is `org.codehaus.groovy.control.CompilationUnit`. A |
| `SourceUnit` represents a single source file inside it. Compilation |
| proceeds in numbered phases declared in |
| [`Phases.java`](src/main/java/org/codehaus/groovy/control/Phases.java) |
| and exposed as the |
| [`CompilePhase`](src/main/java/org/codehaus/groovy/control/CompilePhase.java) |
| enum that AST transformations and customizers attach to: |
| |
| | # | Phase | What happens | Driver classes | |
| |---|---|---|---| |
| | 1 | `INITIALIZATION` | Source files opened, `CompilationUnit` configured, customizers registered | `CompilationUnit`, `CompilerConfiguration` | |
| | 2 | `PARSING` | ANTLR4 lexer + parser produce a CST (parse tree) | `Antlr4ParserPlugin`, `GroovyLangLexer`, `GroovyLangParser` | |
| | 3 | `CONVERSION` | CST → AST (`ModuleNode` / `ClassNode` / `MethodNode` / ...) | `AstBuilder` | |
| | 4 | `SEMANTIC_ANALYSIS` | Class resolution, import handling, validity checks the grammar can't catch | `ResolveVisitor`, `StaticImportVisitor`, `AnnotationConstantsVisitor` | |
| | 5 | `CANONICALIZATION` | Fill in the AST: synthesised members, generic types, most local AST transforms run here | `ASTTransformationVisitor`, `GenericsVisitor` | |
| | 6 | `INSTRUCTION_SELECTION` | Optimisations and instruction-set selection; `@CompileStatic` / `@TypeChecked` run here | `OptimizerVisitor`, `StaticTypeCheckingVisitor` | |
| | 7 | `CLASS_GENERATION` | AST → bytecode in memory | `AsmClassGenerator`, `Verifier`, classes under `classgen/asm/` | |
| | 8 | `OUTPUT` | Write generated `.class` files | `CompilationUnit` output stage | |
| | 9 | `FINALIZATION` | Cleanup, `Janitor` callbacks | `CompilationUnit`, `Janitor` | |
| |
| Each phase iterates over all `SourceUnit`s before the next phase |
| begins. AST transformations declare which phase they run in; the |
| canonical question to ask before adding one is *"what state must the |
| AST be in for this transform to make sense?"* — pick the earliest phase |
| where that holds. |
| |
| The phase enum is the right anchor for any documentation that talks |
| about "when X happens during compilation". Quoting the phase names |
| verbatim keeps the reference precise; paraphrasing tends to drift. |
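
The "earliest phase where that holds" rule can be sketched as a small standalone model. The code below does not use Groovy's own classes: `PhaseModel`, its nested `Phase` enum, and `earliestNotBefore` are hypothetical names that only mirror the phase numbers in the table above. It also ignores how operations are ordered inside a single phase, so treat the result as a lower bound rather than a guarantee.

```java
import java.util.Arrays;
import java.util.Comparator;

final class PhaseModel {

    /** Hypothetical mirror of the phase table; numbers match its "#" column. */
    enum Phase {
        INITIALIZATION(1), PARSING(2), CONVERSION(3), SEMANTIC_ANALYSIS(4),
        CANONICALIZATION(5), INSTRUCTION_SELECTION(6), CLASS_GENERATION(7),
        OUTPUT(8), FINALIZATION(9);

        final int number;

        Phase(int number) {
            this.number = number;
        }
    }

    /**
     * Returns the earliest phase that is not before any of the phases whose
     * work a transform relies on. With no requirements, the first phase is returned.
     */
    static Phase earliestNotBefore(Phase... required) {
        return Arrays.stream(required)
                .max(Comparator.comparingInt(p -> p.number))
                .orElse(Phase.INITIALIZATION);
    }

    public static void main(String[] args) {
        // A transform that needs an AST (CONVERSION) and resolved classes (SEMANTIC_ANALYSIS).
        System.out.println(earliestNotBefore(Phase.CONVERSION, Phase.SEMANTIC_ANALYSIS));
    }
}
```

For real transforms the phase is chosen from the `CompilePhase` enum; the model only makes the ordering argument explicit.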
| |
| ### Parser (phase 2) |
| |
| - Grammar lives in `src/antlr/GroovyLexer.g4` and |
| `src/antlr/GroovyParser.g4`. The generated parser is regenerated |
| from these sources on every build, so changes belong in the `.g4` |
| files. |
| - The ANTLR Gradle plugin generates `GroovyLexer`, `GroovyParser`, |
| `GroovyParserVisitor`, and `GroovyParserBaseVisitor` into |
| `build/generated/sources/antlr4/org/apache/groovy/parser/antlr4/`. |
| - Hand-written code that wires the parser into `CompilationUnit` lives |
| in `src/main/java/org/apache/groovy/parser/antlr4/` |
| (`Antlr4PluginFactory`, `Antlr4ParserPlugin`, `GroovyLangLexer`, |
| `GroovyLangParser`, `AstBuilder`, plus support classes: |
| `ModifierManager`, `GroovydocManager`, `SemanticPredicates`, |
| `PositionInfo`). |
| - `AstBuilder` is the hand-off from CST to AST. It is large; almost |
| every parser-visible language change touches it. |
| |
| ### AST (phase 3 onward) |
| |
| - Root: `org.codehaus.groovy.ast.ASTNode`. |
| - Sub-packages: |
| - `org.codehaus.groovy.ast.expr` — expression nodes |
| (`BinaryExpression`, `MethodCallExpression`, ...) |
| - `org.codehaus.groovy.ast.stmt` — statement nodes |
| (`BlockStatement`, `ForStatement`, ...) |
| - `org.codehaus.groovy.ast.tools` — helpers (`GeneralUtils` is the |
| common one — prefer its factory methods over hand-built nodes) |
| - Top-level structural nodes: `ModuleNode` (one per source file) → |
| `ClassNode` → `MethodNode` / `FieldNode` / `PropertyNode` / |
| `ConstructorNode`. |
| - `ClassNode` instances for primitive and common types should be |
| obtained from `ClassHelper`, not constructed directly. Constructing |
| fresh `ClassNode`s for `int`, `String`, `Object`, etc. is a frequent |
| source of equality and resolution bugs. |
| - Visitors: `GroovyCodeVisitor` (expression + statement), |
| `GroovyClassVisitor` (class members), with `ClassCodeVisitorSupport` |
| / `CodeVisitorSupport` as bases, and |
| `ClassCodeExpressionTransformer` for transforms that rewrite |
| expressions in place. |
| |
| ### Static type checker (phase 6) |
| |
| - Entry point: `org.codehaus.groovy.transform.stc.StaticTypeCheckingVisitor`. |
| - Driven by `@TypeChecked` and `@CompileStatic`. The latter runs the |
| same checker, then directs `AsmClassGenerator` to emit direct calls |
| rather than dynamic dispatch. |
| - Extensible from user code via type-checking extension scripts; see |
| `src/spec/doc/_type-checking-extensions.adoc` for the user-facing |
| documentation of that mechanism. |
| |
| ### Class generation (phase 7) |
| |
| - `org.codehaus.groovy.classgen.AsmClassGenerator` walks the AST and |
| emits bytecode via ASM. Supporting visitors live in the same |
| package: `Verifier` (synthesises bridge methods, accessors, default |
| constructors), `EnumVisitor`, `EnumCompletionVisitor`, |
| `InnerClassVisitor`, `InnerClassCompletionVisitor`, |
| `VariableScopeVisitor`, `ReturnAdder`. Not all of them run in |
| phase 7; several are scheduled in earlier phases by `CompilationUnit`. |
| - ASM-specific helpers: `org.codehaus.groovy.classgen.asm.*`. |
| - The class loader path for compiled classes goes through |
| `org.codehaus.groovy.reflection.*` and the meta-class system in |
| `groovy.lang.MetaClass*`. |
| |
| ## Extension points |
| |
| Most contributor work touches one of these. Each has a dedicated |
| mechanism — knowing which one applies tells you where the change |
| belongs: |
| |
| - **AST transformations** — annotation-driven AST rewrites. Local |
| transforms run in `CANONICALIZATION` by default; global transforms |
| apply to every compilation unit and are registered via |
| `META-INF/services/org.codehaus.groovy.transform.ASTTransformation`. |
| Implementations live in `org.codehaus.groovy.transform.*`. |
| `AbstractASTTransformation` is the usual base class, and |
| `org.codehaus.groovy.ast.tools.GeneralUtils` is the standard library |
| for building AST fragments. |
| - **Type-checking extensions** — DSL scripts that hook into the |
| static type checker. See |
| `org.codehaus.groovy.transform.stc.GroovyTypeCheckingExtensionSupport` |
| and the user docs at `src/spec/doc/_type-checking-extensions.adoc`. |
| - **Compilation customizers** — |
| `org.codehaus.groovy.control.customizers.*`. Programmatic |
| configuration registered on `CompilerConfiguration` before |
| compilation starts; each customizer declares the phase it runs in: |
| `ImportCustomizer`, `ASTTransformationCustomizer`, |
| `SecureASTCustomizer`, `CompilationCustomizer` (base class for |
| custom ones). |
| - **Extension modules** — add instance / static methods to existing |
| classes via descriptor files. Discovered through |
| `META-INF/groovy/org.codehaus.groovy.runtime.ExtensionModule`. The |
| GDK itself is built this way; see |
| `org.codehaus.groovy.runtime.DefaultGroovyMethods` and friends, and |
| the user-facing description in `src/spec/doc/core-gdk.adoc`. |
| - **Parser plugin** — `org.codehaus.groovy.control.ParserPluginFactory` |
| selects the parser. The ANTLR4 implementation is the only supported |
| one; the older Antlr2-based parser has been removed. |
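
As an illustration of the two descriptor-based mechanisms in the list above, the fragment below sketches what the registration files might contain. The class names, module name, and version are hypothetical placeholders. The file locations are the ones named in the list, and the keys are the ones the user documentation describes for extension module descriptors.

```
# File: META-INF/services/org.codehaus.groovy.transform.ASTTransformation
# One fully qualified global transform class per line (hypothetical class).
com.example.transform.ExampleGlobalTransformation

# File: META-INF/groovy/org.codehaus.groovy.runtime.ExtensionModule
# Properties-style descriptor (hypothetical module and classes).
moduleName=example-extensions
moduleVersion=1.0
extensionClasses=com.example.ext.ExampleStringExtensions
staticExtensionClasses=com.example.ext.ExampleStaticExtensions
```

Both files are looked up on the classpath, so in a typical layout they would sit under a module's resources directory.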
| |
| ## Generated code |
| |
| The following are produced by the build and regenerated on every |
| run, so direct edits to them are overwritten. Changes belong in the |
| source they're generated from. |
| |
| | Generated artefact | Source | |
| |---|---| |
| | `build/generated/sources/antlr4/org/apache/groovy/parser/antlr4/Groovy{Lexer,Parser,ParserVisitor,ParserBaseVisitor}.java` | `src/antlr/GroovyLexer.g4`, `src/antlr/GroovyParser.g4` | |
| | Anything under `build/`, `*/build/`, `out/`, `subprojects/*/build/` | The build itself; never committed | |
| | Repackaged dependency classes (ASM, ANTLR runtime, picocli) | Configured in `build.gradle` under `repackagedDependencies` | |
| |
| If a `.java` file under `build/generated/...` looks like the right |
| thing to change, you are looking at the wrong file. The grammar fix |
| goes in `src/antlr/`. |
| |
| ## Public API boundaries |
| |
| Groovy commits to a stable public API. The shape of a change determines |
| which review path applies; see [`CONTRIBUTING.md`](CONTRIBUTING.md). |
| |
| | Package convention | Audience | Stability | |
| |---|---|---| |
| | `groovy.*` | End users (the public API surface) | Strongly stable; breaking changes need a major version | |
| | `org.apache.groovy.*` | Mixed; preferred location for new code | Stable unless explicitly marked otherwise | |
| | `org.codehaus.groovy.*` | Historical core; some user-visible, much internal | Stable in practice for things users have come to rely on; treat as public unless marked `@Internal` | |
| | Anything annotated [`@groovy.transform.Internal`](src/main/java/groovy/transform/Internal.java) or in a package named `internal` | Implementation detail | No stability guarantee | |
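
The package rules in the table can be read as a simple decision procedure. The sketch below is a standalone illustration, not a tool that exists in the repository: `StabilityTiers`, `Tier`, and `classify` are hypothetical names. It works from the class name alone, so it cannot see an `@Internal` annotation; a real check would have to inspect the class itself.

```java
import java.util.Arrays;

final class StabilityTiers {

    /** Tier names are hypothetical labels for the rows of the table above. */
    enum Tier { PUBLIC_API, STABLE_UNLESS_MARKED, TREAT_AS_PUBLIC, NO_GUARANTEE, OUTSIDE_GROOVY }

    /** True when pkg is root itself or a sub-package of root. */
    private static boolean inPackageTree(String pkg, String root) {
        return pkg.equals(root) || pkg.startsWith(root + ".");
    }

    static Tier classify(String className) {
        int lastDot = className.lastIndexOf('.');
        String pkg = lastDot < 0 ? "" : className.substring(0, lastDot);

        boolean userApi = inPackageTree(pkg, "groovy");
        boolean newerCode = inPackageTree(pkg, "org.apache.groovy");
        boolean legacyCore = inPackageTree(pkg, "org.codehaus.groovy");
        if (!userApi && !newerCode && !legacyCore) {
            return Tier.OUTSIDE_GROOVY;
        }
        // A package segment named "internal" overrides the prefix-based tier.
        if (Arrays.asList(pkg.split("\\.")).contains("internal")) {
            return Tier.NO_GUARANTEE;
        }
        if (userApi) {
            return Tier.PUBLIC_API;
        }
        return newerCode ? Tier.STABLE_UNLESS_MARKED : Tier.TREAT_AS_PUBLIC;
    }

    public static void main(String[] args) {
        System.out.println(classify("groovy.lang.Closure"));
        System.out.println(classify("org.codehaus.groovy.ast.ClassNode"));
    }
}
```

The `internal` check runs before the prefix checks because the table treats it as an override regardless of which package tree the class sits in.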
| |
| Binary compatibility against a baseline release is checked by the |
| `subprojects/binary-compatibility/` module as part of the build. See |
| [`COMPATIBILITY.md`](COMPATIBILITY.md) for the full stability story: |
| what counts as breaking, the deprecation policy, and how the |
| `japicmp`-based check is wired up. |
| |
| ## Tests |
| |
| - Core: `src/test/`. New tests use JUnit 5 |
| (`org.junit.jupiter.api.Test`); older tests are a mix of JUnit 3 |
| (`extends GroovyTestCase`) and JUnit 4. Spock is bundled and |
| available, but the core repo's own tests are predominantly JUnit. |
| - Module-specific: `subprojects/<module>/src/test/`. Same conventions |
| apply unless the module documents otherwise. |
| - Documentation examples: `src/spec/test/` and |
| `subprojects/<module>/src/spec/test/`. These are real Groovy files |
| that the AsciiDoc sources `include::` to keep examples executable. |
| A change to a documented example normally touches both files |
| together. |
| - Preview-feature tests: |
| `subprojects/tests-preview/src/test/` — use this when a test |
| depends on a JDK preview feature. |
| - Regression tests for a fixed JIRA: standalone test classes follow |
| the `Groovy<NNNN>` naming (e.g. `Groovy11955.groovy`); a regression |
| added to an existing class gets a `// GROOVY-<NNNN>` comment |
| immediately above the new method. Either shape leaves the JIRA ID |
| searchable. See the "Tests" section in |
| [`CONTRIBUTING.md`](CONTRIBUTING.md) for the full convention. |
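
The regression-test naming convention is mechanical enough to sketch in code. The helper below is purely illustrative and is not part of the Groovy build; `RegressionNaming`, `testClassName`, and `markerComment` are hypothetical names. It derives both shapes described above from a JIRA key.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class RegressionNaming {

    // Matches keys such as GROOVY-11955 and captures the numeric part.
    private static final Pattern JIRA_KEY = Pattern.compile("GROOVY-(\\d+)");

    private static Matcher requireKey(String jiraKey) {
        Matcher matcher = JIRA_KEY.matcher(jiraKey);
        if (!matcher.matches()) {
            throw new IllegalArgumentException("Not a GROOVY JIRA key: " + jiraKey);
        }
        return matcher;
    }

    /** Name for a standalone regression test class, e.g. Groovy11955. */
    static String testClassName(String jiraKey) {
        return "Groovy" + requireKey(jiraKey).group(1);
    }

    /** Comment placed immediately above a method added to an existing class. */
    static String markerComment(String jiraKey) {
        requireKey(jiraKey);
        return "// " + jiraKey;
    }

    public static void main(String[] args) {
        System.out.println(testClassName("GROOVY-11955")); // prints Groovy11955
        System.out.println(markerComment("GROOVY-11955")); // prints // GROOVY-11955
    }
}
```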
| |
| Run a single test with: |
| |
| ``` |
| ./gradlew :test --tests <FullyQualifiedClassName> |
| ./gradlew :<subproject>:test --tests <FullyQualifiedClassName> |
| ``` |
| |
| ## Where to read next |
| |
| - [`CONTRIBUTING.md`](CONTRIBUTING.md) — how to build, test, and |
| submit a change. |
| - [`COMPATIBILITY.md`](COMPATIBILITY.md) — stability tiers, what |
| counts as a breaking change, deprecation policy, and the |
| binary-compatibility check. |
| - [`GOVERNANCE.md`](GOVERNANCE.md) — how decisions get made, where |
| discussions happen, review modes, and wait periods (placeholder |
| draft pending dev@ confirmation). |
| - [`AGENTS.md`](AGENTS.md) — supplemental guidance for AI coding |
| assistants; layered on top of this document, not a replacement for |
| it. |
| - `README.adoc` — the canonical build instructions. |
| - `src/spec/doc/core-metaprogramming.adoc` — user-facing description |
| of AST transformations and metaprogramming. |
| - `src/spec/doc/_type-checking-extensions.adoc` — user-facing |
| description of the type-checking extension mechanism. |
| - The Groovy issue tracker (<https://issues.apache.org/jira/browse/GROOVY>) |
| and the existing test suite are the best source of precedent for any |
| given change. `git log --grep GROOVY-NNNNN` finds the original fix |
| for an issue. |