This document summarizes information relevant to Storm committers and contributors. It includes information about the development processes and policies as well as the tools we use to facilitate those.
If you are reading this document then you are interested in contributing to the Storm project -- many thanks for that! All contributions are welcome: ideas, documentation, code, patches, bug reports, feature requests, etc. And you do not need to be a programmer to speak up.
This section explains how to perform common activities such as reporting a bug or merging a pull request.
To report a bug you should open an issue in our issue tracker that summarizes the bug. Set the form field “Issue type” to “Bug”. If you have not used the issue tracker before you will need to register an account (free), log in, and then click on the blue “Create Issue” button in the top navigation bar.
In order to help us understand and fix the bug it would be great if you could provide us with:
Feel free to search the issue tracker for existing issues (aka tickets) that already describe the problem; if there is such a ticket please add your information as a comment.
If you want to provide a patch along with your bug report: That is great! In this case please send us a pull request as described in section Create a pull request below. You can also opt to attach a patch file to the issue ticket, but we prefer pull requests because they are easier to work with.
To request a new feature you should open an issue in our issue tracker and summarize the desired functionality. Set the form field “Issue type” to “New feature”. If you have not used the issue tracker before you will need to register an account (free), log in, and then click on the blue “Create Issue” button in the top navigation bar.
You can also opt to send a message to the Storm Users mailing list.
Before you set out to contribute code we recommend that you familiarize yourself with the Storm codebase, notably by reading through the Implementation documentation.
If you are interested in contributing code to Storm but do not know where to begin, browse our issue tracker for open issues and tasks. You may want to start with beginner-friendly issues (newbie issues and trivial issues), because they require learning only an isolated portion of the codebase and involve a relatively small amount of work.
Please use idiomatic Clojure style, as explained in this Clojure style guide. Another useful reference is the Clojure Library Coding Standards. Perhaps the most important point is consistently writing a clear docstring for each function that explains its arguments and return value. As of this writing, the Storm codebase would benefit from various style improvements.
Contributions to the Storm codebase should be sent as GitHub pull requests. See section Create a pull request below for details. If there is any problem with the pull request we can iterate on it using the commenting features of GitHub.
For small patches, feel free to submit pull requests directly for those patches.
For larger code contributions, please use the following process. The idea behind this process is to prevent any wasted work and catch design issues early on.
Documentation contributions are very welcome! The best way to send contributions is as emails through the Storm Developers mailing list.
Pull requests should be done against the read-only git repository at https://github.com/apache/storm.
Take a look at Creating a pull request. In a nutshell you need to:
You may want to read Syncing a fork for instructions on how to keep your fork up to date with the latest changes of the upstream (official) repository.
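The usual GitHub fork-and-branch flow that this section refers to can be rehearsed offline. The following is a minimal sketch, not project tooling: `upstream.git` and `fork.git` are throwaway local stand-ins (hypothetical names) for https://github.com/apache/storm and your own fork, and the branch name `STORM-1234` and the commit contents are placeholders.

```shell
#!/usr/bin/env bash
# Sketch: fork-and-branch workflow rehearsed against throwaway local repositories.
# "upstream.git" stands in for the official repo, "fork.git" for your GitHub fork.
set -euo pipefail

work="$(mktemp -d)"
cd "$work"

# Identity for the throwaway commits below (placeholder values).
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.invalid"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

# Stand-in for the upstream repository, seeded with one commit on master.
git init --quiet seed
git -C seed symbolic-ref HEAD refs/heads/master
echo "hello" > seed/README
git -C seed add README
git -C seed commit --quiet -m "Initial commit"
git clone --quiet --bare seed upstream.git

# Stand-in for your fork (on GitHub this is what the "Fork" button creates).
git clone --quiet --bare upstream.git fork.git

# 1. Clone your fork and register the upstream repository as a second remote.
git clone --quiet fork.git storm
cd storm
git remote add upstream "$work/upstream.git"

# 2. Do the work on a topic branch, here named after the JIRA ticket.
git checkout --quiet -b STORM-1234
echo "fix" >> README
git commit --quiet -am "STORM-1234: describe the change here"

# 3. Push the branch to your fork; the pull request is opened from this branch.
git push --quiet origin STORM-1234

# 4. Later, bring your local master up to date with upstream.
git checkout --quiet master
git fetch --quiet upstream
git merge --quiet upstream/master
```

Against the real repositories only the remote URLs change; the pull request itself is then opened in the GitHub web interface from the pushed branch.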
NOTE: The information in this section may need to be formalized via proper project bylaws.
Pull requests are approved with two +1s from committers and need to be up for at least 72 hours so that all committers have a chance to comment. If the pull request was sent by a committer, then two different committers must +1 the request.
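The approval rule can be expressed as a small check. This is a sketch only, not an official tool: the function name `pr_is_approved` and its argument layout are hypothetical, and it assumes one reading of the rule, namely that a committer's +1 on their own pull request does not count.

```shell
#!/usr/bin/env bash
# Sketch of the approval rule described above (hypothetical helper).
# Usage: pr_is_approved AUTHOR HOURS_OPEN PLUS_ONE_COMMITTER...
# Each PLUS_ONE_COMMITTER is the name of a committer who voted +1.
pr_is_approved() {
  local author="$1" hours_open="$2"
  shift 2

  # The request must have been open for at least 72 hours.
  [ "$hours_open" -ge 72 ] || return 1

  # Count distinct +1 voters, ignoring the author's own vote (assumption).
  # A non-committer author never appears in the voter list, so the filter
  # only has an effect when the author is a committer.
  local distinct
  distinct="$(printf '%s\n' "$@" | grep -Fxv -- "$author" | sort -u | grep -c .)" || true

  # At least two distinct committers must have voted +1.
  [ "$distinct" -ge 2 ]
}
```

For example, `pr_is_approved alice 80 bob carol` succeeds, while `pr_is_approved alice 80 alice bob` fails because only one vote comes from someone other than the author.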
This section applies to committers only.
Important: A pull request must first be properly approved before you are allowed to merge it.
Committers that are integrating patches or pull requests should use the official Apache repository at https://git-wip-us.apache.org/repos/asf/storm.git.
To integrate a pull request you should generally follow the command line instructions sent out by GitHub.
Go to your local copy of the Apache git repo, switch to the master branch, and make sure it is up to date.
$ git checkout master
$ git fetch origin
$ git merge origin/master
Create a local branch for integrating and testing the pull request. You may want to name the branch according to the Storm JIRA ticket associated with the pull request (example: STORM-1234).
$ git checkout -b <local_test_branch> # e.g. git checkout -b STORM-1234
Merge the pull request into your local test branch.
$ git pull <remote_repo_url> <remote_branch>
Assuming that the pull request merges without any conflicts: Update the top-level CHANGELOG.md, and add in the JIRA ticket number (example: STORM-1234) and ticket description to the change log. Make sure that you place the JIRA ticket number in the commit comments where applicable.
Run any sanity tests that you think are needed.
Once you are confident that everything is ok, you can merge your local test branch into your local master branch, and push the changes back to the official Apache repo.
# Pull request looks ok, change log was updated, etc. We are ready for pushing.
$ git checkout master
$ git merge <local_test_branch> # e.g. git merge STORM-1234
# At this point our local master branch is ready, so now we will push the changes
# to the official Apache repo. Note: The read-only mirror on GitHub will be updated
# automatically a short while after you have pushed to the Apache repo.
$ git push origin master
The last step is updating the corresponding JIRA ticket. Go to JIRA and resolve the ticket. Be sure to set the Fix Version/s field to the version you pushed your changes to. It is usually good practice to thank the author of the pull request for their contribution if you have not done so already.
The following commands must be run from the top-level directory.
# Build the code and run the tests (requires nodejs, python and ruby installed)
$ mvn clean install
# Build the code and run the tests, specifying the default test timeout (in milliseconds)
$ export STORM_TEST_TIMEOUT_MS=10000
$ mvn clean install
# Build the code but skip the tests
$ mvn clean install -DskipTests=true
You can create a distribution (like what you can download from Apache) as follows. Note that the instructions below do not use the Maven release plugin because creating an official release is the task of our release manager.
# First, build the code.
$ mvn clean install # you may skip tests with `-DskipTests=true` to save time
# Create the binary distribution.
$ cd storm-dist/binary && mvn package
The last command will create Storm binaries at:
storm-dist/binary/target/apache-storm-<version>.pom
storm-dist/binary/target/apache-storm-<version>.tar.gz
storm-dist/binary/target/apache-storm-<version>.zip
These are accompanied by corresponding *.asc digital signature files.
When running mvn package you may be asked to enter your GPG/PGP credentials (once for each binary file, in fact). This happens because the packaging step will create *.asc digital signatures for all the binaries, and in the workflow above your GPG private key will be used to create those signatures.
You can verify whether the digital signatures match their corresponding files:
# Example: Verify the signature of the `.tar.gz` binary.
$ gpg --verify storm-dist/binary/target/apache-storm-<version>.tar.gz.asc
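To check every artifact in one go, a small loop over the signature files is enough. The following is a minimal sketch, assuming `gpg` is on the PATH and the signer's public key is already in your keyring; the helper name `verify_signatures` is hypothetical and not part of the Storm build.

```shell
#!/usr/bin/env bash
# verify_signatures DIR: run `gpg --verify` on every *.asc file in DIR
# (hypothetical helper). Returns non-zero if any signature fails to verify.
verify_signatures() {
  local dir="$1" sig failed=0
  for sig in "$dir"/*.asc; do
    # Skip the unexpanded glob pattern when DIR contains no signature files.
    [ -e "$sig" ] || continue
    # Detached signature: the signed file has the same name minus ".asc".
    if gpg --verify "$sig" "${sig%.asc}" >/dev/null 2>&1; then
      echo "OK   $sig"
    else
      echo "FAIL $sig"
      failed=1
    fi
  done
  return "$failed"
}
```

After building the distribution, this could be invoked as `verify_signatures storm-dist/binary/target`.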
Tests should never rely on timing in order to pass. In Storm we can properly test functionality that depends on time by simulating time, which means we do not have to worry about, e.g., random delays making our tests fail nondeterministically.
If you are testing topologies that do not do full tuple acking, then you should test them using the “tracked topologies” utilities in backtype.storm.testing.clj. For example, test-acking (around line 213) tests the acking system in Storm using tracked topologies. Here, the key is the tracked-wait function: it returns only when both the requested number of tuples has been emitted by the spouts and the topology is idle (i.e. no tuples have been emitted nor will be emitted without further input). Note that you should not use tracked topologies for topologies that have tick tuples.
The source code of Storm is managed via git. For a number of reasons there is more than one git repository associated with Storm.
An automated bot (called ASF GitHub Bot in Storm JIRA) runs periodically to merge changes in the official Apache repo to the read-only GitHub mirror repository, and to merge comments in GitHub pull requests to the Storm JIRA.
Issue tracking includes tasks such as reporting bugs, requesting and collaborating on new features, and administrative activities for release management. As an Apache software project we use JIRA as our issue tracking tool.
The Storm JIRA is available at:
If you do not have a JIRA account yet, then you can create one via the link above (registration is free).
If you have any questions after reading this document, then please reach out to us via the Storm Developers mailing list.