Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
“License”); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
“AS IS” BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.

name: ci-cd
description: Guides understanding and working with Apache Beam's CI/CD system using GitHub Actions. Use when debugging CI failures, understanding test workflows, or modifying CI configuration.

CI/CD in Apache Beam

Overview

Apache Beam uses GitHub Actions for CI/CD. Workflows are located in .github/workflows/.

Workflow Types

PreCommit Workflows

  • Run on PRs and merges
  • Validate code changes before merge
  • Naming: beam_PreCommit_*.yml

PostCommit Workflows

  • Run after merge and on schedule
  • More comprehensive testing
  • Naming: beam_PostCommit_*.yml
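
The two naming conventions above are enough to tell a workflow's type from its file name. The following is a minimal sketch of that classification, not part of Beam's tooling; the function name workflow_type is hypothetical.

```shell
# Classify a workflow file by the naming conventions described above.
# workflow_type is a hypothetical helper, not something Beam ships.
workflow_type() {
  case "$(basename "$1")" in
    beam_PreCommit_*.yml)  echo "precommit" ;;
    beam_PostCommit_*.yml) echo "postcommit" ;;
    *)                     echo "other" ;;
  esac
}
```

From the repository root it could be applied to every workflow with a loop such as: for f in .github/workflows/*.yml; do echo "$(workflow_type "$f") $f"; done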

Scheduled Workflows

  • Run nightly on master
  • Check for external dependency impacts
  • Tag master with nightly-master

Key Workflows

PreCommit

Workflow                      Description
beam_PreCommit_Java.yml       Java build and tests
beam_PreCommit_Python.yml     Python tests
beam_PreCommit_Go.yml         Go tests
beam_PreCommit_RAT.yml        License header checks
beam_PreCommit_Spotless.yml   Code formatting

PostCommit - Java

Workflow                                     Description
beam_PostCommit_Java.yml                     Full Java test suite
beam_PostCommit_Java_ValidatesRunner_*.yml   Runner validation tests
beam_PostCommit_Java_Examples_*.yml          Example pipeline tests

PostCommit - Python

Workflow                                       Description
beam_PostCommit_Python.yml                     Full Python test suite
beam_PostCommit_Python_ValidatesRunner_*.yml   Runner validation
beam_PostCommit_Python_Examples_*.yml          Examples

Load & Performance Tests

Workflow                     Description
beam_LoadTests_*.yml         Load testing
beam_PerformanceTests_*.yml  I/O performance

Triggering Tests

Automatic

  • PRs trigger PreCommit tests
  • Merges trigger PostCommit tests

Triggering Specific Workflows

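Use trigger files to run specific workflows on a PR: including a change to a workflow's trigger file (kept under .github/trigger_files/) in the PR causes that workflow to run.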
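
As a rough illustration, the sketch below maps a workflow file name to the trigger file it would be expected to watch. It assumes trigger files live under .github/trigger_files/ and share the workflow's base name with a .json extension; that layout and the helper name trigger_file_for are assumptions here, so check the paths listed in the workflow's own "on:" section before relying on it.

```shell
# Map a workflow file name to its presumed trigger file.
# Assumption: trigger files sit in .github/trigger_files/ and are named
# after the workflow with a .json extension. Verify against the workflow.
trigger_file_for() {
  name="$(basename "$1" .yml)"
  echo ".github/trigger_files/${name}.json"
}
```

For example, trigger_file_for beam_PostCommit_Python.yml prints the path to edit; a trivial change to that file, committed to the PR branch, is what would request the run.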

Workflow Dispatch

Most workflows support manual triggering (workflow_dispatch) from the Actions tab in the GitHub UI.
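
Manual dispatch can also be done from a terminal with the GitHub CLI. The sketch below wraps gh workflow run; it assumes gh is installed and authenticated with permission to dispatch workflows in the target repository. The helper name dispatch_workflow and the DRY_RUN switch are hypothetical, added only so the command can be inspected before it is run.

```shell
# Dispatch a workflow by file name on a given ref (default: master).
# Set DRY_RUN=1 to print the gh command instead of executing it.
# dispatch_workflow and DRY_RUN are illustrative, not Beam conventions.
dispatch_workflow() {
  workflow="$1"
  ref="${2:-master}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gh workflow run $workflow --ref $ref"
  else
    gh workflow run "$workflow" --ref "$ref"
  fi
}
```

Running DRY_RUN=1 dispatch_workflow beam_PreCommit_Java.yml shows the command that would be sent; dropping DRY_RUN executes it.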

Understanding Test Results

Finding Logs

  1. Open the PR and go to the Checks tab
  2. Click the failed workflow
  3. Expand the failed job
  4. View the step logs
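
The same steps can be approximated from a terminal with the GitHub CLI. This is a sketch that assumes gh is installed and authenticated; the helper names pr_checks and failed_logs and the DRY_RUN switch are hypothetical.

```shell
# List the check status of a PR (roughly steps 1 and 2 above).
# Set DRY_RUN=1 to print the gh command instead of executing it.
pr_checks() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gh pr checks $1"
  else
    gh pr checks "$1"
  fi
}

# Print only the logs of failed steps for a workflow run (steps 3 and 4).
failed_logs() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "gh run view $1 --log-failed"
  else
    gh run view "$1" --log-failed
  fi
}
```

A typical sequence would be pr_checks with the PR number to find the failing run, then failed_logs with that run's ID.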

Common Failure Patterns

Flaky Tests

  • Random failures unrelated to change
  • Solution: Use trigger files to re-run the specific workflow.

Timeout

  • Increase the timeout in the workflow file if the longer runtime is justified
  • Otherwise, optimize the test
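
Before changing a timeout it helps to see what is currently configured. GitHub Actions sets job and step limits with the timeout-minutes key; the sketch below lists every occurrence in a workflow file. The helper name show_timeouts is hypothetical, and a workflow with no explicit timeout-minutes will simply produce no output.

```shell
# Print each timeout-minutes setting in a workflow file, with line numbers.
# show_timeouts is an illustrative helper name.
show_timeouts() {
  grep -n 'timeout-minutes:' "$1"
}
```

For example, show_timeouts .github/workflows/beam_PreCommit_Java.yml points to the lines to edit if an increase is warranted.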

Resource Exhaustion

  • Usually caused by GCP quota limits
  • Check the quotas and settings of the GCP project

GCP Credentials

Workflows requiring GCP access use these secrets:

  • GCP_PROJECT_ID - Project ID (e.g., apache-beam-testing)
  • GCP_REGION - Region (e.g., us-central1)
  • GCP_TESTING_BUCKET - Temp storage bucket
  • GCP_PYTHON_WHEELS_BUCKET - Python wheels bucket
  • GCP_SA_EMAIL - Service account email
  • GCP_SA_KEY - Base64-encoded service account key
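
Because GCP_SA_KEY is described as Base64-encoded, a key file has to be encoded before it is stored as a secret. The following is a minimal sketch: the helper names are hypothetical, and stripping newlines so the value is a single line is an assumption, not a documented requirement.

```shell
# Encode a service account key file as a single-line Base64 string.
# Newlines are stripped so the value pastes cleanly into a secret field
# (assumption: the consumer accepts unwrapped Base64).
encode_sa_key() {
  base64 < "$1" | tr -d '\n'
}

# Decode a Base64 string read from stdin, e.g. to verify the round trip.
decode_sa_key() {
  base64 --decode
}
```

A quick check before uploading is encode_sa_key key.json | decode_sa_key | diff - key.json, which should print nothing when the round trip is lossless (key.json is a placeholder path).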

Required IAM roles:

  • Storage Admin
  • Dataflow Admin
  • Artifact Registry Writer
  • BigQuery Data Editor
  • Service Account User

Self-hosted vs GitHub-hosted Runners

Self-hosted (majority of workflows)

  • Pre-configured with dependencies
  • GCP credentials pre-configured
  • Naming: beam_*.yml

GitHub-hosted

  • Used for cross-platform testing (Linux, macOS, Windows)
  • May need explicit credential setup

Workflow Structure

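name: Workflow Name
on:
  # Run after commits land on master
  push:
    branches: [master]
  # Run on PRs that target master
  pull_request:
    branches: [master]
  # Run daily at 00:00 UTC
  schedule:
    - cron: '0 0 * * *'
  # Allow manual runs from the GitHub UI
  workflow_dispatch:

jobs:
  build:
    runs-on: [self-hosted, ...]
    steps:
      - uses: actions/checkout@v4
      - name: Run Gradle
        run: ./gradlew :task:name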

Local Debugging

Run Same Commands as CI

Find the run commands in the workflow file and execute them locally, for example:

./gradlew :sdks:java:core:test
./gradlew :sdks:python:test
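
To find which Gradle commands a workflow runs, one option is to search the workflow file for Gradle wrapper invocations. This sketch uses a hypothetical helper, gradle_commands, and only matches literal ./gradlew lines; a workflow that invokes Gradle through a reusable action would not match and needs to be read by hand.

```shell
# Print the ./gradlew invocations found in a workflow file.
# Only literal "./gradlew ..." text is matched; gradle_commands is an
# illustrative helper name.
gradle_commands() {
  grep -o '\./gradlew.*' "$1"
}
```

Each printed line can then be pasted into a local shell at the repository root.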

Common Issues

  • Clean the Gradle cache: rm -rf ~/.gradle .gradle
  • Remove the build directory: rm -rf build
  • Check that the local Java version matches the one used in CI

Snapshot Builds

Locations

Release Workflows

Workflow                          Purpose
cut_release_branch.yml            Create release branch
build_release_candidate.yml       Build RC
finalize_release.yml              Finalize release
publish_github_release_notes.yml  Publish notes