A contributor-facing map of how the Groovy compiler and runtime are organised in this repository. This document is for people working on Groovy. For documentation aimed at people using Groovy, see https://groovy.apache.org/ and the AsciiDoc sources under src/spec/doc/ and subprojects/<module>/src/spec/doc/.
This is an overview, not a reference. It exists to give a new contributor — human or AI — enough orientation to read the code productively and to avoid a small set of common mis-steps. Code is the source of truth; this document is a pointer file.
| Path | What lives there |
|---|---|
src/main/java/org/codehaus/groovy/ | Core compiler and runtime (legacy package — most of the codebase) |
src/main/java/org/apache/groovy/ | Newer code added under the org.apache.groovy.* package convention |
src/main/java/groovy/ | User-facing API (groovy.lang.*, groovy.util.*, etc.) |
src/main/groovy/ | Groovy sources compiled into the core jar |
src/main/resources/ | Service files, META-INF, default scripts |
src/antlr/ | ANTLR4 grammar (GroovyLexer.g4, GroovyParser.g4) — see “Generated code” below |
src/spec/doc/ | User-facing AsciiDoc reference docs |
src/spec/test/ | Executable Groovy snippets include::'d by the AsciiDoc sources |
src/test/ | JUnit / Spock tests for the core jar |
subprojects/ | ~50 modular subprojects (groovy-json, groovy-sql, groovy-xml, groovy-typecheckers, parser-antlr4 wiring, etc.) |
subprojects/groovy-binary/ | Aggregator that produces the final distribution and the published spec |
subprojects/binary-compatibility/ | Enforces public-API binary compatibility across releases |
subprojects/tests-preview/ | Tests that depend on preview JDK features |
bootstrap/, buildSrc/, build-logic/ | Build infrastructure (Gradle convention plugins, bootstrap helpers) |
When in doubt, prefer adding new code under org.apache.groovy.*; the older org.codehaus.groovy.* packages remain for legacy reasons but are kept stable for compatibility.
Some subprojects with significant local complexity carry their own ARCHITECTURE.md alongside the source — see subprojects/groovy-groovysh/ARCHITECTURE.md for the worked example.
The driver is org.codehaus.groovy.control.CompilationUnit. A SourceUnit represents a single source file inside it. Compilation proceeds in numbered phases declared in Phases.java and exposed as the CompilePhase enum that AST transformations and customizers attach to:
| # | Phase | What happens | Driver classes |
|---|---|---|---|
| 1 | INITIALIZATION | Source files opened, CompilationUnit configured, customizers applied | CompilationUnit, CompilerConfiguration |
| 2 | PARSING | ANTLR4 lexer + parser produce a CST (parse tree) | Antlr4ParserPlugin, GroovyLangLexer, GroovyLangParser |
| 3 | CONVERSION | CST → AST (ModuleNode / ClassNode / MethodNode / ...) | AstBuilder |
| 4 | SEMANTIC_ANALYSIS | Class resolution, import handling, validity checks the grammar can't catch | ResolveVisitor, StaticImportVisitor, AnnotationConstantsVisitor |
| 5 | CANONICALIZATION | Fill in the AST: synthesized members, generic types, most local AST transforms run here | ASTTransformationVisitor, GenericsVisitor |
| 6 | INSTRUCTION_SELECTION | Optimisations and instruction-set selection; @CompileStatic / @TypeChecked run here | OptimizerVisitor, StaticTypeCheckingVisitor |
| 7 | CLASS_GENERATION | AST → bytecode in memory | AsmClassGenerator, Verifier, classes under classgen/asm/ |
| 8 | OUTPUT | Write generated .class files | CompilationUnit output stage |
| 9 | FINALIZATION | Cleanup, Janitor callbacks | CompilationUnit, Janitor |
Each phase iterates over all SourceUnits before the next phase begins. AST transformations declare which phase they run in; the canonical question to ask before adding one is “what state must the AST be in for this transform to make sense?” — pick the earliest phase where that holds.
The phase enum is the right anchor for any documentation that talks about “when X happens during compilation”. Quoting the phase names verbatim keeps the reference precise; paraphrasing tends to drift.
Since Groovy 2.0, dynamic method dispatch can be performed using the invokedynamic instruction. The core of this implementation lives in org.codehaus.groovy.vmplugin.v8.
| Class | Role |
|---|---|
IndyInterface | Bootstrap methods and optimization lifecycle management. |
CacheableCallSite | The stateful call site holding the PIC chain, MRU entry, and LRU cache. |
Selector | Logic for finding the target method/property and constructing the guarded MethodHandle. |
MethodHandleWrapper | Combines a MethodHandle with metadata like hit counts and target description. |
To maximize performance, CacheableCallSite uses three levels of caching:
Detailed technical documentation of this hierarchy can be found in the Javadoc of CacheableCallSite.
src/antlr/GroovyLexer.g4 and src/antlr/GroovyParser.g4. The generated parser is regenerated from these sources on every build, so changes belong in the .g4 files.GroovyLexer, GroovyParser, GroovyParserVisitor, and GroovyParserBaseVisitor into build/generated/sources/antlr4/org/apache/groovy/parser/antlr4/.CompilationUnit lives in src/main/java/org/apache/groovy/parser/antlr4/ (Antlr4PluginFactory, Antlr4ParserPlugin, GroovyLangLexer, GroovyLangParser, AstBuilder, plus support classes: ModifierManager, GroovydocManager, SemanticPredicates, PositionInfo).AstBuilder is the hand-off from CST to AST. It is large; almost every parser-visible language change touches it.org.codehaus.groovy.ast.ASTNode.org.codehaus.groovy.ast.expr — expression nodes (BinaryExpression, MethodCallExpression, ...)org.codehaus.groovy.ast.stmt — statement nodes (BlockStatement, ForStatement, ...)org.codehaus.groovy.ast.tools — helpers (GeneralUtils is the common one — prefer its factory methods over hand-built nodes)ModuleNode (one per source file) → ClassNode → MethodNode / FieldNode / PropertyNode / ConstructorNode.ClassNode instances for primitive and common types should be obtained from ClassHelper, not constructed directly. Constructing fresh ClassNodes for int, String, Object, etc. is a frequent source of equality and resolution bugs.GroovyCodeVisitor (expression + statement), GroovyClassVisitor (class members), with ClassCodeVisitorSupport / CodeVisitorSupport as bases, and ClassCodeExpressionTransformer for transforms that rewrite expressions in place.org.codehaus.groovy.transform.stc.StaticTypeCheckingVisitor.@TypeChecked and @CompileStatic. The latter runs the same checker, then directs AsmClassGenerator to emit direct calls rather than dynamic dispatch.src/spec/doc/_type-checking-extensions.adoc for the user-facing documentation of that mechanism.org.codehaus.groovy.classgen.AsmClassGenerator walks the AST and emits bytecode via ASM. Supporting visitors run here too: Verifier (synthesizes bridge methods, accessors, default constructors), EnumVisitor, EnumCompletionVisitor, InnerClassVisitor, InnerClassCompletionVisitor, VariableScopeVisitor, ReturnAdder.org.codehaus.groovy.classgen.asm.*.org.codehaus.groovy.reflection.* and the meta-class system in groovy.lang.MetaClass*.Most contributor work touches one of these. Each has a dedicated mechanism — knowing which one applies tells you where the change belongs:
CANONICALIZATION by default; global transforms apply to every compilation unit and are registered via META-INF/services/org.codehaus.groovy.transform.ASTTransformation. Implementations live in org.codehaus.groovy.transform.*. AbstractASTTransformation is the usual base class, and org.codehaus.groovy.ast.tools.GeneralUtils is the standard library for building AST fragments.org.codehaus.groovy.transform.stc.GroovyTypeCheckingExtensionSupport and the user docs at src/spec/doc/_type-checking-extensions.adoc.org.codehaus.groovy.control.customizers.*. Programmatic configuration applied at INITIALIZATION: ImportCustomizer, ASTTransformationCustomizer, SecureASTCustomizer, CompilationCustomizer (base class for custom ones).META-INF/groovy/org.codehaus.groovy.runtime.ExtensionModule. The GDK itself is built this way; see org.codehaus.groovy.runtime.DefaultGroovyMethods and friends, and the user-facing description in src/spec/doc/core-gdk.adoc.org.codehaus.groovy.control.ParserPluginFactory selects the parser. The ANTLR4 implementation is the only supported one; the older Antlr2-based parser has been removed.A few project-specific conventions on top of the architecture above. Each bites contributors quickly if missed:
public, not package-private. A static foo() method in a .groovy file is public static foo(). For “visible for testing” helpers in .groovy files, use @groovy.transform.PackageScope so same-package tests see the method while external callers don't. Some existing helpers in the tree (e.g. in groovy.grape.ivy.*) omit the modifier and are technically public; match that pattern only when you intend public exposure.org.apache.groovy.*. Legacy code under org.codehaus.groovy.* stays where it is for compatibility, but new packages should follow the org.apache.groovy.* convention.@groovy.transform.Internal or place the code in a package named internal to signal “no stability guarantee” — see the Public API boundaries table below.ClassHelper for known ClassNode instances. Fresh ClassNode instances for primitives (int, boolean) or common reference types (String, Object) break equality and resolution. Prefer ClassHelper.int_TYPE, ClassHelper.STRING_TYPE, ClassHelper.make(SomeType.class), etc.GeneralUtils factories over hand-built AST nodes. The helpers under org.codehaus.groovy.ast.tools.GeneralUtils cover the common construction patterns and avoid the shape-mismatch traps that hand-building tends to produce.CANONICALIZATION; transforms that need resolved types belong in INSTRUCTION_SELECTION or later. See the Compilation pipeline phase table.Several Groovy operators and expression forms are defined for multiple backing types, and behaviour across the members of a family is sometimes inconsistent. When investigating a bug reported for one type, probing the same expression across siblings often surfaces nuance the reporter missed — a hidden bug in a sibling type, confirmation that an asymmetry spans the whole family, or a project-wide spec gap that wasn't visible from a single-type report.
| Family | Members | Notes |
|---|---|---|
Range / index operators (agg[idx], agg[range]) | List / Object[] / primitive arrays (int[], long[], …) / String / CharSequence | Different exception classes (IndexOutOfBoundsException vs ArrayIndexOutOfBoundsException vs StringIndexOutOfBoundsException). Negative-endpoint and out-of-range-negative semantics have historically diverged across types — see GROOVY-3974 for a concrete example surfaced by cross-type probing. |
GPath expressions (x.y.z, x?.y, x*.y) | In-memory (Map, List, nested combinations) / JSON (JsonSlurper) / XML (XmlSlurper / XmlParser) / POGO / Java POJO / SQL result sets (groovy.sql.Sql) | XML has special handling for attributes (@attr syntax) and returns empty NodeChild collections on missing children rather than null. Map/JSON return null on missing keys. POGOs and POJOs throw MissingPropertyException for missing properties — the asymmetry is by-design (each backend‘s natural type semantics) but surfaces as a cross-type inconsistency from the user’s perspective. |
Numeric coercion (+, -, *, /, comparison) | int / long / BigInteger / BigDecimal / double / Float / Long (boxed) | Coercion rules vary; the result type of int + BigDecimal may surprise. |
Some operators have multiple syntactic variants that share a family but dispatch differently:
| Family | Variants | Dispatch notes |
|---|---|---|
| Safe navigation | ?. (SAFE_DOT) / ??. (SAFE_CHAIN_DOT — shorthand for chained ?.) / ?[..] (SAFE_INDEX) | ?. and ??. call getProperty(String). ?[..] calls getAt(Object), but on POGOs that routes through getProperty for missing keys, so the variants behave identically for POGO missing-property access. |
| Spread | *. / *[..] / *: | Different unpacking semantics across iteration / indexing / map-merge. |
| Equality / identity | == / .equals() / is | == is equals-based in Groovy (not reference-equality as in Java); is is Java's == (reference). |
| Coercion | as / asType() / constructor + from | Different conversion paths; as is statically-resolvable, asType is dynamic. |
| Range | .. / ..< / ..> | Endpoint inclusion / direction differences. |
| Elvis / null-coalesce | ?: and elaborations | Truthy-vs-null differences in the left-hand side. |
For an investigation of a bug in one family member, probing across siblings is a recurring technique. A ~50-line probe script (constructing each backend, running the same expression, recording outcomes in a table) is usually enough to:
See .agents/skills/groovy-reproducer/SKILL.md's “Cross-family probes” section for the AI-tooling pattern. The probe approach is equally useful when investigating by hand.
The following are produced by the build and regenerated on every run, so direct edits to them are overwritten. Changes belong in the source they're generated from.
| Generated artefact | Source |
|---|---|
build/generated/sources/antlr4/org/apache/groovy/parser/antlr4/Groovy{Lexer,Parser,ParserVisitor,ParserBaseVisitor}.java | src/antlr/GroovyLexer.g4, src/antlr/GroovyParser.g4 |
Anything under build/, */build/, out/, subprojects/*/build/ | The build itself; never committed |
| Repackaged dependency classes (ASM, ANTLR runtime, picocli) | Configured in build.gradle under repackagedDependencies |
If a .java file under build/generated/... looks like the right thing to change, you are looking at the wrong file. The grammar fix goes in src/antlr/.
The Gradle build is driven by:
build-logic/src/main/groovy/org.apache.groovy-*.gradle. These describe the shape every subproject takes: -base, -common, -core, -library, -published-library, -aggregating-project, -tested, and so on. A subproject applies the appropriate conventions and overrides only what's specific to it.build-logic/src/main/groovy/org/apache/groovy/. Versions, SharedConfiguration, and Services hold the canonical pinned versions and configuration the convention plugins read from.build.gradle, settings.gradle, and gradle.properties for project-wide settings, version pins, target bytecode, and build flags.A few build-side conventions:
build.gradle files. If two or more subprojects need the same configuration, add it to the right org.apache.groovy-*.gradle; conversely, don't push a one-off into a shared convention plugin.gradle.properties and the shared Versions type, not as 'group:artifact:1.2.3' literals in subproject builds. Ad-hoc version pins drift.targetJavaVersion sets the Java source/target compatibility level for the project‘s Java sources and Javadoc; groovyTargetBytecodeVersion sets the bytecode level the forked Groovy compiler emits (passed as -Dgroovy.target.bytecode). Both are pinned in gradle.properties and read through SharedConfiguration — that file is the single source of truth, so docs and skills cite the property names, never the values (they change per Groovy version). Neither controls which JVM runs the build or tests: the build compiles with whatever JDK launched Gradle, and tests run on that same JDK unless -Ptarget.java.home=<java-home> overrides the test JVM (the build then auto-derives the matching toolchain from that JDK’s release file). See CONTRIBUTING.md for the recipe.gradle/verification-metadata.xml. The build runs Gradle dependency verification; an unverified artifact fails the build. Regenerate with ./gradlew --write-verification-metadata sha256,pgp help and inspect the diff before committing.groovyjarjar* packages, configured via groovyLibrary { repackagedDependencies = ... }. A wrong repackaging rule produces a jar that compiles fine but fails at runtime — run ./gradlew :groovy-binary:installGroovy and exercise groovy / groovyc against a non-trivial script whenever a repackaging change lands.Project state at execution time. Prefer providers (providers.gradleProperty, providers.environmentVariable, Provider chains) and tasks.register / configureEach over eager realisation.plugins {} block; check the Develocity compatibility matrix linked in build.gradle before bumping.subprojects/binary-compatibility/ runs japicmp against a baseline release. Don't suppress it for green CI — either justify the change in the accepted-changes file or revert the API breakage. See COMPATIBILITY.md.For the canonical command sequence (targeted → subproject → full build, with the dependency-verification regeneration and installed-build smoke test), see CONTRIBUTING.md. For the public API contract that the binary-compatibility check defends, see COMPATIBILITY.md.
Groovy has a covenanted public API. The shape of a change determines which review path applies — see CONTRIBUTING.md.
| Package convention | Audience |
|---|---|
groovy.* | End users — the public API surface |
org.apache.groovy.* | Mixed; preferred location for new code |
org.codehaus.groovy.* | Historical core — some user-visible, much internal |
Anything annotated @groovy.transform.Internal or in a package named internal | Implementation detail |
For the stability commitment each tier carries — what counts as breaking, the deprecation policy, the four-tier stability model (Public / Incubating / Internal / Generated), and how the japicmp-based binary-compatibility check is wired up — see COMPATIBILITY.md. Binary compatibility against a baseline release is enforced by the subprojects/binary-compatibility/ module as part of the build.
Test code is laid out in parallel with source code:
| Directory | Purpose |
|---|---|
src/test/ | Core tests for the core jar |
subprojects/<module>/src/test/ | Module-specific tests |
src/spec/test/ and subprojects/<module>/src/spec/test/ | Executable Groovy snippets that AsciiDoc sources include:: to keep documentation examples runnable |
subprojects/tests-preview/src/test/ | Tests that depend on JDK preview features |
For test framework conventions (JUnit 5, regression-test naming, the fix-workflow ordering, the executable-AsciiDoc pattern, and test-writing pitfalls) and the targeted-run command sequence, see the “Tests” section in CONTRIBUTING.md.
CONTRIBUTING.md — how to build, test, and submit a change.COMPATIBILITY.md — stability tiers, what counts as a breaking change, deprecation policy, and the binary-compatibility check.GOVERNANCE.md — how decisions get made, where discussions happen, review modes, and wait periods (placeholder draft pending dev@ confirmation).AGENTS.md — supplemental guidance for AI coding assistants; layered on top of this document, not a replacement for it.README.adoc — the canonical build instructions.src/spec/doc/core-metaprogramming.adoc — user-facing description of AST transformations and metaprogramming.src/spec/doc/_type-checking-extensions.adoc — user-facing description of the type-checking extension mechanism.git log --grep GROOVY-NNNNN finds the original fix for an issue.