---
layout: section
title: "Beam Contribution Guide"
permalink: /contribute/
section_menu: section-menu/contribute.html
redirect_from:
  - /contribution-guide/
  - /contribute/contribution-guide/
  - /docs/contribute/
  - /contribute/source-repository/
  - /contribute/design-principles/
---

Apache Beam Contribution Guide

The Apache Beam community welcomes contributions from anyone!

If you have questions, please [reach out to the Beam community]({{ site.baseurl }}/contribute/get-help).

There are lots of opportunities to contribute:

  • ask or answer questions on [user@beam.apache.org]({{ site.baseurl }}/community/contact-us/) or Stack Overflow
  • review proposed design ideas on [dev@beam.apache.org]({{ site.baseurl }}/community/contact-us/)
  • improve the documentation
  • contribute bug reports
  • contribute by testing releases
  • contribute by reviewing changes
  • write new examples
  • improve your favorite language SDK (Java, Python, Go, etc.)
  • improve specific runners (Apache Apex, Apache Flink, Apache Spark, Google Cloud Dataflow, etc.)
  • improve or add IO connectors
  • add new transform libraries (statistics, ML, image processing, etc.)
  • work on the core programming model (what is a Beam pipeline and how does it run?)
  • improve the developer experience (for example, Windows guides)
  • add answers to the contribution FAQ
  • organize local meetups of users or contributors to Apache Beam

Most importantly, if you have an idea of how to contribute, then do it!

Contributing code

Below is a tutorial for contributing code to Beam, covering our tools and typical process in detail.

Prerequisites

To contribute code, you need:

  • a GitHub account
  • a Linux, macOS, or Microsoft Windows development environment with JDK 8 installed
  • Docker installed for some tasks, including building worker containers and testing website changes locally
  • Go 1.10 or later installed for Go SDK development
  • Python, virtualenv, and tox installed for Python SDK development
  • for large contributions, a signed Individual Contributor License Agreement (ICLA) to the Apache Software Foundation (ASF).
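
A quick way to see which of the tools listed above are present on your machine is to probe for them with `command -v`. The following is a minimal sketch, not part of the Beam tooling: the function name `missing_tools` is hypothetical, and the command names (`java`, `docker`, `go`, `python`, `virtualenv`, `tox`) are assumptions about how these tools appear on your `PATH`. It does not check versions, so you still need to confirm JDK 8 and Go 1.10 or later yourself.

```shell
# Hypothetical helper (not part of Beam): print each command name
# from the arguments that cannot be found on PATH.
missing_tools() {
  for tool in "$@"; do
    if ! command -v "$tool" >/dev/null 2>&1; then
      printf '%s\n' "$tool"
    fi
  done
}

# Assumed command names for the prerequisites listed above.
missing=$(missing_tools java docker go python virtualenv tox)
if [ -n "$missing" ]; then
  echo "Not found on PATH:"
  echo "$missing"
else
  echo "All listed tools were found on PATH."
fi
```

Only the tools for the SDK you plan to work on are needed, so a missing entry is not necessarily a problem.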

Connect with the Beam community

  1. Consider subscribing to the [dev@ mailing list]({{ site.baseurl}}/community/contact-us/), especially if you plan to make more than one change or the change will be large. All decisions happen on the public dev list.
  2. Optionally, join the [#beam channel of the ASF slack]({{ site.baseurl}}/community/contact-us/).
  3. Create an account on the Beam issue tracker (JIRA). Anyone can do this.

Share your intent

  1. Find or create an issue in the Beam issue tracker (JIRA). Tracking your work in an issue will avoid duplicated or conflicting work, and provide a place for notes. Later, your pull request will be linked to the issue as well.
  2. If you want to get involved but don't have a project in mind, check our list of open starter tasks, https://s.apache.org/beam-starter-tasks.
  3. Assign the issue to yourself. To get permission to do so, email the [dev@ mailing list]({{ site.baseurl }}/community/contact-us) to introduce yourself and ask to be added as a contributor in the Beam issue tracker. Include your ASF Jira username. For an example, see this welcome email.
  4. If your change is large or it is your first change, it is a good idea to [discuss it on the dev@ mailing list]({{ site.baseurl }}/community/contact-us/).
  5. For large changes, create a design doc (template, examples) and email it to the [dev@ mailing list]({{ site.baseurl }}/community/contact-us).

Development Setup

  1. If you need help with git forking, cloning, branching, committing, pull requests, and squashing commits, see Git workflow tips.

  2. Familiarize yourself with Gradle and the project structure. At the root of the git repository, run:

    $ ./gradlew projects
    

    Examine the available tasks in a project. For the default set of tasks, use:

    $ ./gradlew tasks
    

    For a given module, use:

    $ ./gradlew -p sdks/java/io/cassandra tasks
    

    For an exhaustive list of tasks, use:

    $ ./gradlew tasks --all
    
  3. Make sure you can build and run tests.

    Run the entire set of tests with:

    $ ./gradlew check
    

    You can limit testing to a particular module. Gradle will build just the necessary things to run those tests. For example:

    $ ./gradlew -p sdks/go check
    $ ./gradlew -p sdks/java/io/cassandra check
    $ ./gradlew -p runners/flink check
    
  4. Now you may want to set up your preferred IDE and other aspects of your development environment. See the Developers' wiki for tips, guides, and FAQs on these topics.

Make your change

  1. Make your code change. Every source file needs to include the Apache license header. Every new dependency needs to have an open source license compatible with Apache.

  2. Add unit tests for your change.

  3. When your change is ready to be reviewed and merged, create a pull request. Format commit messages and the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue. This will automatically link the pull request to the issue. Use descriptive commit messages that make it easy to identify changes and provide a clear history. To support efficient and quality review, avoid tiny or out-of-context changes as well as huge mega-changes.

  4. The pull request and any changes pushed to it will trigger pre-commit jobs. If a test fails and appears unrelated to your change, you can re-run the tests by adding a single-line comment on your PR:

    retest this please
    

    There are other trigger phrases for post-commit tests, found in .test-infra/jenkins, but use these sparingly because post-commit tests consume shared development resources.

  5. Pull requests can only be merged by a [Beam committer]({{ site.baseurl }}/contribute/team/). To find a committer for your area, either:

    • look in the OWNERS file of the directory where you changed files, or
    • look for similar code merges, or
    • ask on [dev@beam.apache.org]({{ site.baseurl }}/community/contact-us/)

    Use R: @username in the pull request to notify a reviewer.

  6. If you don't get any response in 3 business days, email the [dev@ mailing list]({{ site.baseurl }}/community/contact-us) to ask for someone to look at your pull request.

  7. Review feedback typically leads to follow-up changes. You can add these changes as additional “fixup” commits to the existing PR/branch. This allows reviewers to track the incremental progress. After the review is complete and the PR is accepted, squash the multiple commits (see Git workflow tips).
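
The title convention from step 3 can be checked locally before you push. The sketch below is an illustration rather than a Beam tool: the function name `check_title` is hypothetical, and the pattern assumes the issue key is `BEAM-` followed by digits, then a space and a non-empty description.

```shell
# Hypothetical helper (not part of Beam): succeed when the title starts
# with "[BEAM-<digits>] " followed by at least one more character.
# The closing "]" is left unescaped because it is an ordinary character
# outside a bracket expression in extended regular expressions.
check_title() {
  printf '%s\n' "$1" | grep -Eq '^\[BEAM-[0-9]+] .+'
}

if check_title "[BEAM-1234] Fixes bug in ApproximateQuantiles"; then
  echo "title format looks right"
fi

if ! check_title "Fixes bug in ApproximateQuantiles"; then
  echo "title is missing the [BEAM-<number>] prefix"
fi
```

A check like this only validates the shape of the title; it cannot tell whether the issue number refers to the right JIRA issue.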

When will my change show up in an Apache Beam release?

Apache Beam makes minor releases every 6 weeks. Apache Beam has a calendar for cutting the next release branch. Your change needs to be checked into master before the release branch is cut to make the next release.

Stale pull requests

The community will close stale pull requests in order to keep the project healthy. A pull request becomes stale after its author fails to respond to actionable comments for 60 days. The author of a closed pull request is welcome to reopen it in the future. The associated JIRA issues will be unassigned from the author but will stay open.
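
To make the 60-day rule concrete, the sketch below compares two timestamps. It is only an illustration under stated assumptions: the function name `is_stale` is hypothetical, both arguments are Unix epoch seconds, a day is treated as 86400 seconds, and exactly 60 days counts as stale. How the community measures the interval in practice is not specified in this guide.

```shell
# Hypothetical helper (not part of Beam): succeed when at least 60 days
# separate the author's last response from "now".
# Usage: is_stale <last_response_epoch_seconds> <now_epoch_seconds>
is_stale() {
  last_response=$1
  now=$2
  threshold=$((60 * 24 * 60 * 60))   # 60 days, assuming 86400-second days
  [ $((now - last_response)) -ge "$threshold" ]
}

# Example: a last response 61 days before the current time.
now=$(date +%s)
sixty_one_days_ago=$((now - 61 * 86400))
if is_stale "$sixty_one_days_ago" "$now"; then
  echo "stale"
else
  echo "active"
fi
```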

Accounts and Permissions

  • Beam issue tracker (JIRA): Anyone can access it and browse issues. Anyone can register an account and log in to create issues or add comments. Only contributors can be assigned issues. If you want to be assigned issues, a PMC member can add you to the project contributor group. Email the [dev@ mailing list]({{ site.baseurl }}/community/contact-us) to ask to be added as a contributor in the Beam issue tracker, and include your ASF Jira username.

  • Beam Wiki Space: Anyone has read access. If you wish to contribute changes, please create an account and request edit access on the [dev@ mailing list]({{ site.baseurl }}/community/contact-us) (include your Wiki account user ID).

  • Pull requests can only be merged by a [Beam committer]({{ site.baseurl }}/contribute/team/).

  • Voting on a release: Everyone can vote. Only [Beam PMC]({{ site.baseurl }}/contribute/team/) members should mark their votes as binding.

Communication

All communication is expected to align with the Code of Conduct.

Discussion about contributing code to Beam happens on the [dev@ mailing list]({{ site.baseurl }}/community/contact-us/). Introduce yourself!

Questions can be asked on the [#beam channel of the ASF slack]({{ site.baseurl }}/community/contact-us/). Introduce yourself!

Additional resources

If you are contributing a PTransform to Beam, we have an extensive [PTransform Style Guide]({{ site.baseurl }}/contribute/ptransform-style-guide).

If you are contributing a Runner to Beam, refer to the [Runner authoring guide]({{ site.baseurl }}/contribute/runner-guide/).

Review design documents.

A great way to contribute is to join an existing effort. For the most intensive efforts, check out the roadmap.

You can also find a more exhaustive list on the Beam developers' wiki.

Troubleshooting

If you run into any issues, check out the contribution FAQ or ask on the [dev@ mailing list]({{ site.baseurl}}/community/contact-us/) or the [#beam channel of the ASF slack]({{ site.baseurl}}/community/contact-us/).

If you didn't find the information you were looking for in this guide, please [reach out to the Beam community]({{ site.baseurl }}/community/contact-us).