---
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
name: ci-cd
description: Guides understanding and working with Apache Beam's CI/CD system using GitHub Actions. Use when debugging CI failures, understanding test workflows, or modifying CI configuration.
---
# CI/CD in Apache Beam
## Overview
Apache Beam uses GitHub Actions for CI/CD. Workflows are located in `.github/workflows/`.
## Workflow Types
### PreCommit Workflows
- Run on PRs and merges
- Validate code changes before merge
- Naming: `beam_PreCommit_*.yml`
### PostCommit Workflows
- Run after merge and on schedule
- More comprehensive testing
- Naming: `beam_PostCommit_*.yml`
### Scheduled Workflows
- Run nightly on master
- Check for external dependency impacts
- Tag master with `nightly-master`
## Key Workflows
### PreCommit
| Workflow | Description |
|----------|-------------|
| `beam_PreCommit_Java.yml` | Java build and tests |
| `beam_PreCommit_Python.yml` | Python tests |
| `beam_PreCommit_Go.yml` | Go tests |
| `beam_PreCommit_RAT.yml` | License header checks |
| `beam_PreCommit_Spotless.yml` | Code formatting |
### PostCommit - Java
| Workflow | Description |
|----------|-------------|
| `beam_PostCommit_Java.yml` | Full Java test suite |
| `beam_PostCommit_Java_ValidatesRunner_*.yml` | Runner validation tests |
| `beam_PostCommit_Java_Examples_*.yml` | Example pipeline tests |
### PostCommit - Python
| Workflow | Description |
|----------|-------------|
| `beam_PostCommit_Python.yml` | Full Python test suite |
| `beam_PostCommit_Python_ValidatesRunner_*.yml` | Runner validation |
| `beam_PostCommit_Python_Examples_*.yml` | Examples |
### Load & Performance Tests
| Workflow | Description |
|----------|-------------|
| `beam_LoadTests_*.yml` | Load testing |
| `beam_PerformanceTests_*.yml` | I/O performance |
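
The naming conventions in the tables above can be turned into a quick classifier for workflow files. This is a minimal sketch: `classify_workflow` is a hypothetical helper, not something that ships with Beam, and it only recognizes the prefixes listed in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helper: classify a workflow file by the naming prefixes
# described in this guide. Anything else is reported as "other".
classify_workflow() {
  local name
  name="$(basename "$1")"
  case "$name" in
    beam_PreCommit_*.yml)        echo "precommit" ;;
    beam_PostCommit_*.yml)       echo "postcommit" ;;
    beam_LoadTests_*.yml)        echo "load-test" ;;
    beam_PerformanceTests_*.yml) echo "performance-test" ;;
    *)                           echo "other" ;;
  esac
}
```

For example, `classify_workflow .github/workflows/beam_PreCommit_Java.yml` prints `precommit`, while a release workflow such as `cut_release_branch.yml` falls through to `other`.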
## Triggering Tests
### Automatic
- PRs trigger PreCommit tests
- Merges trigger PostCommit tests
### Triggering Specific Workflows
Use [trigger files](https://github.com/apache/beam/blob/master/.github/workflows/README.md#running-workflows-manually) to run specific workflows.
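
As a rough illustration of the trigger-file mechanism, the sketch below assumes that each workflow has a JSON trigger file under `.github/trigger_files/`, named after the workflow, and that committing a trivial change to that file in a PR causes the workflow to run. Both points are assumptions here, and the linked README is the authoritative description. `trigger_file_for` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# ASSUMPTION: trigger files live in .github/trigger_files/ and share the
# workflow's base name with a .json extension. Verify against the README.
trigger_file_for() {
  local workflow
  workflow="$(basename "$1" .yml)"   # strip directory and .yml suffix
  echo ".github/trigger_files/${workflow}.json"
}
```

Under these assumptions, `trigger_file_for beam_PostCommit_Java.yml` prints `.github/trigger_files/beam_PostCommit_Java.json`, which is the file you would edit and commit in your PR.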
### Workflow Dispatch
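Most workflows declare a `workflow_dispatch` trigger, so they can be started manually from the GitHub UI (Actions tab → select the workflow → Run workflow).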
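
Besides the GitHub UI, a manual dispatch can also be sent from the command line with the GitHub CLI. This is a sketch, assuming `gh` is installed and authenticated with an account that has write access to the repository. `dispatch_workflow` is a hypothetical wrapper, and the default ref `master` is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper around `gh workflow run`.
# $1: workflow file name, e.g. beam_PreCommit_Java.yml
# $2: branch or tag to run against (defaults to master; an assumption)
dispatch_workflow() {
  local workflow="$1"
  local ref="${2:-master}"
  gh workflow run "$workflow" --repo apache/beam --ref "$ref"
}
```

The workflow must list `workflow_dispatch` under `on:` for the dispatch to be accepted.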
## Understanding Test Results
### Finding Logs
1. Go to PR → Checks tab
2. Click on failed workflow
3. Expand failed job
4. View step logs
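
The same steps can be followed from a terminal with the GitHub CLI. This is a minimal sketch, assuming `gh` is installed and authenticated. The function names are hypothetical, and the PR number and run ID are values you supply (the run ID appears in the URL of the workflow run).

```bash
#!/usr/bin/env bash
# List the status of every check on a PR.
# $1: PR number
pr_checks() {
  gh pr checks "$1" --repo apache/beam
}

# Print only the logs of the failed steps of one workflow run.
# $1: run ID
failed_logs() {
  gh run view "$1" --repo apache/beam --log-failed
}
```

Running `pr_checks` first shows which workflow failed; `failed_logs` then narrows the output to the failing steps instead of the full job log.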
### Common Failure Patterns
#### Flaky Tests
- Random failures unrelated to the change
- Solution: Use [trigger files](https://github.com/apache/beam/blob/master/.github/workflows/README.md#running-workflows-manually) to re-run the specific workflow.
#### Timeout
- Increase the timeout in the workflow file if the longer run time is justified
- Otherwise, optimize the test
#### Resource Exhaustion
- GCP quota issues
- Check project settings
## GCP Credentials
Workflows requiring GCP access use these secrets:
- `GCP_PROJECT_ID` - Project ID (e.g., `apache-beam-testing`)
- `GCP_REGION` - Region (e.g., `us-central1`)
- `GCP_TESTING_BUCKET` - Temp storage bucket
- `GCP_PYTHON_WHEELS_BUCKET` - Python wheels bucket
- `GCP_SA_EMAIL` - Service account email
- `GCP_SA_KEY` - Base64-encoded service account key
Required IAM roles for the service account:
- Storage Admin
- Dataflow Admin
- Artifact Registry Writer
- BigQuery Data Editor
- Service Account User
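
When reproducing a GCP-backed workflow outside CI, it can help to confirm that every value from the secrets list is present before starting a long build. The sketch below assumes each secret is exposed as an environment variable of the same name, which is an assumption because individual workflows may map secrets differently. `check_gcp_env` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Hypothetical pre-flight check: report which of the GCP settings named in
# this guide are missing from the environment.
# ASSUMPTION: each secret is exported under its own name.
check_gcp_env() {
  local missing=0 var
  for var in GCP_PROJECT_ID GCP_REGION GCP_TESTING_BUCKET \
             GCP_PYTHON_WHEELS_BUCKET GCP_SA_EMAIL GCP_SA_KEY; do
    # printenv prints nothing when the variable is not exported
    if [ -z "$(printenv "$var")" ]; then
      echo "missing: $var" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

The function returns non-zero if anything is missing, so it can guard a build: `check_gcp_env && ./gradlew :task:name`.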
## Self-hosted vs GitHub-hosted Runners
### Self-hosted (majority of workflows)
- Pre-configured with dependencies
- GCP credentials pre-configured
- Naming: `beam_*.yml`
### GitHub-hosted
- Used for cross-platform testing (Linux, macOS, Windows)
- May need explicit credential setup
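
To see which kind of runner a given workflow uses, one option is to inspect its `runs-on` lines. The sketch below is a heuristic, not an official tool: `runner_kind` is a hypothetical helper, and it assumes the `self-hosted` label appears on the same line as `runs-on:`, as in the structure shown in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical heuristic: classify a workflow file by its runs-on labels.
# $1: path to a workflow file
runner_kind() {
  if grep -Eq 'runs-on:.*self-hosted' "$1"; then
    echo "self-hosted"
  elif grep -q 'runs-on:' "$1"; then
    # Could be a GitHub-hosted label or a matrix expression
    echo "not-self-hosted"
  else
    echo "unknown"
  fi
}
```

A `not-self-hosted` result is a hint that the workflow may need the explicit credential setup mentioned above.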
## Workflow Structure
```yaml
name: Workflow Name
on:
  push:
    branches: [master]
  pull_request:
    branches: [master]
  schedule:
    - cron: '0 0 * * *'
  workflow_dispatch:
jobs:
  build:
    runs-on: [self-hosted, ...]
    steps:
      - uses: actions/checkout@v4
      - name: Run Gradle
        run: ./gradlew :task:name
```
## Local Debugging
### Run Same Commands as CI
Check the workflow file's `run` commands, then run the same Gradle tasks locally:
```bash
./gradlew :sdks:java:core:test
./gradlew :sdks:python:test
```
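
To find the commands to copy, the `run:` lines can be pulled out of a workflow file with `grep`. This is a minimal sketch. `list_run_commands` is a hypothetical helper, and it only shows the first line of multi-line `run: |` blocks.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every `run:` line of a workflow file, prefixed
# with its line number. Matches both "run: ..." and "- run: ..." forms.
# $1: path to a workflow file
list_run_commands() {
  grep -nE '^[[:space:]]*(-[[:space:]]+)?run:' "$1"
}
```

Steps that call a reusable action through `uses:` instead of `run:` will not appear in this output, so open the file itself if the list looks incomplete.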
### Common Issues
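- Clean the Gradle caches: `rm -rf ~/.gradle .gradle` (this also removes user-level settings such as `~/.gradle/gradle.properties`)
- Remove the build directory: `rm -rf build`
- Check that the local Java version matches the one used in CI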
## Snapshot Builds
### Locations
- Java SDK: https://repository.apache.org/content/groups/snapshots/org/apache/beam/
- SDK Containers: https://gcr.io/apache-beam-testing/beam-sdk
- Portable Runners: https://gcr.io/apache-beam-testing/beam_portability
- Python SDK: gs://beam-python-nightly-snapshots
## Release Workflows
| Workflow | Purpose |
|----------|---------|
| `cut_release_branch.yml` | Create release branch |
| `build_release_candidate.yml` | Build RC |
| `finalize_release.yml` | Finalize release |
| `publish_github_release_notes.yml` | Publish notes |