Development happens on the main branch, and most of the time, we depend on DataFusion using a git dependency (depending on a specific git revision) rather than using an official release from crates.io. This allows us to pick up new features and bug fixes frequently by creating PRs to move to a later revision of the code. It also means we can incrementally make updates that are required due to changes in DataFusion rather than having a large amount of work to do when the next official release is available.
When there is a new official release of DataFusion, we update the main branch to point to that release, update the version number, and create a new release branch, such as branch-0.11. Once this branch is created, we switch the main branch back to using GitHub dependencies. The release activity (such as generating the changelog) can then happen on the release branch without blocking ongoing development in the main branch.
We can cherry-pick commits from the main branch into branch-0.11 as needed and then create new patch releases from that branch.
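The branch naming and back-port step described above can be sketched as a pair of small shell helpers. This is only an illustration: the function names are hypothetical and not part of the repository, and the second helper prints the git commands for review instead of running them.

```bash
#!/usr/bin/env bash

# Hypothetical helper: derive the release branch name from a full version
# by dropping the patch component, e.g. 0.11.0 or 0.11.1 -> branch-0.11.
release_branch_for() {
  local version="$1"
  echo "branch-${version%.*}"
}

# Hypothetical helper: print (do not run) the commands for back-porting
# one commit from main onto the release branch.
print_backport_commands() {
  local version="$1" commit="$2" branch
  branch="$(release_branch_for "$version")"
  echo "git checkout ${branch}"
  # -x records the original commit hash in the cherry-picked commit message
  echo "git cherry-pick -x ${commit}"
}
```

For example, `print_backport_commands 0.11.1 <commit>` prints a checkout of branch-0.11 followed by the cherry-pick. The result would then go through a PR against the release branch, as listed in the task table.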
Although some tasks can only be performed by a PMC member, many tasks can be performed by committers and contributors.
Task | Role Required |
---|---|
Create PRs against main branch to update DataFusion dependencies | None |
Create PRs against main branch to update Ballista version | None |
Create release branch (e.g. branch-0.11) | Committer |
Create PRs against release branch with CHANGELOG | None |
Create PRs against release branch with cherry-picked commits | None |
Create release candidate tag | Committer |
Create release candidate tarball and publish to SVN | PMC |
Start vote on mailing list | PMC |
Call vote on mailing list | PMC |
Publish release tarball to SVN | PMC |
Publish binary artifacts to crates.io | PMC |
Create PR against arrow-site with updated documentation | None |
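Add the upstream repository git@github.com:apache/arrow-ballista.git as a git remote named apache.
Before creating a new release, update the main branch to depend on the official DataFusion release and update the Ballista version number:
./dev/update_ballista_versions.py 0.11.0
Then create the release branch, such as branch-0.11.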
Once the release branch has been created, the main branch can immediately go back to depending on DataFusion with a GitHub dependency.
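A quick way to see which kind of DataFusion dependency a manifest currently uses is to search it for a `git = ...` entry. The sketch below is a hypothetical helper, not a script from the repository. It assumes dependencies are written in the inline form `datafusion = { git = "...", rev = "..." }` and will not detect the `[dependencies.datafusion]` table form.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds (exit status 0) if the given Cargo.toml has a
# datafusion* dependency line that is pulled from git rather than crates.io.
# Assumption: dependencies use the inline-table form on a single line.
has_git_datafusion_dep() {
  local manifest="$1"
  grep -Eq '^[[:space:]]*datafusion[A-Za-z0-9_-]*[[:space:]]*=.*git[[:space:]]*=' "$manifest"
}

# Example usage (run from the repository root):
#   if has_git_datafusion_dep ballista/core/Cargo.toml; then
#     echo "ballista/core still depends on a git revision of DataFusion"
#   fi
```

On a release branch the check would be expected to fail for every crate manifest, and on main it would normally succeed again once development resumes.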
Define release branch (e.g. branch-0.11), base version tag (e.g. 0.7.0) and future version tag (e.g. 0.9.0). Commits between the base version tag and the release branch will be used to populate the changelog content.
# create the changelog
CHANGELOG_GITHUB_TOKEN=<TOKEN> ./dev/release/update_change_log-ballista.sh main 0.8.0 0.7.0
# review change log / edit issues and labels if needed, rerun until you are happy with the result
git commit -a -m 'Create changelog for release'
If you see the error "You have exceeded a secondary rate limit" when running this script, try reducing the CPU allocation to slow the process down and throttle the number of GitHub requests made per minute, by modifying the value of the --cpus argument in the update_change_log.sh script.
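A different, complementary option is to wrap the command in a retry loop that waits between attempts. This is not something the changelog script does itself. The sketch below is a generic shell helper, and `RETRY_MAX` and `RETRY_DELAY` are hypothetical environment variables introduced only for this example.

```bash
#!/usr/bin/env bash

# Hypothetical helper: run a command, retrying with a doubling delay.
# RETRY_MAX (default 5) and RETRY_DELAY in seconds (default 60) are knobs of
# this sketch only; they are not options of the changelog script.
retry_with_backoff() {
  local max="${RETRY_MAX:-5}" delay="${RETRY_DELAY:-60}" attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts" >&2
      return 1
    fi
    echo "attempt ${attempt} failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# Example usage:
#   export CHANGELOG_GITHUB_TOKEN=<TOKEN>
#   retry_with_backoff ./dev/release/update_change_log-ballista.sh main 0.8.0 0.7.0
```

Retrying only helps when the failure is transient. If the script fails for another reason, every attempt fails the same way, so it is worth reading the error output before relying on this.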
You can add the invalid or development-process label to exclude items from the release notes. Add the datafusion, ballista and python labels to group items into each sub-project's change log.
Send a PR to get these changes merged into the release branch (e.g. branch-0.11). If new commits that could change the change log content landed in the release branch before you could merge the PR, you need to rerun the changelog update script to regenerate the changelog and update the PR accordingly.
After the PR gets merged, you are ready to create release artifacts based off the merged commit.
(Note you need to be a committer to run these scripts as they upload to the apache svn distribution servers)
Pick numbers in sequential order, with 0 for rc0, 1 for rc1, etc.
While the official release artifacts are signed tarballs and zip files, we also tag the commit they were created from, for convenience and code archaeology.
Using a string such as 0.11.0 as the <version> and the release candidate number as <rc>, create and push the tag by running these commands:
git tag <version>-rc<rc>
# push tag to GitHub remote
git push apache <version>-rc<rc>
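The tag name can be built from the version and RC number with a small helper. This is a sketch with hypothetical function names. It assumes the tag format `<version>-rc<N>`, which matches the `0.11.0-rc1` tag used later in this document, and it prints the git commands for review instead of running them.

```bash
#!/usr/bin/env bash

# Hypothetical helper: build the release candidate tag name,
# e.g. version 0.11.0 and rc 1 -> 0.11.0-rc1.
rc_tag() {
  local version="$1" rc="$2"
  if ! [[ "$version" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ && "$rc" =~ ^[0-9]+$ ]]; then
    echo "usage: rc_tag <major.minor.patch> <rc-number>" >&2
    return 1
  fi
  echo "${version}-rc${rc}"
}

# Hypothetical helper: print (do not run) the tagging commands.
print_tag_commands() {
  local tag
  tag="$(rc_tag "$1" "$2")" || return 1
  echo "git tag ${tag}"
  echo "git push apache ${tag}"
}
```

Printing the commands first makes it easy to confirm that the tag being pushed is the same one that was created.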
See instructions at https://infra.apache.org/release-signing.html#generate for generating keys.
Committers can add signing keys using a Subversion client with their ASF account, e.g.:
$ svn co https://dist.apache.org/repos/dist/dev/arrow
$ cd arrow
$ editor KEYS
$ svn ci KEYS
Follow the instructions in the header of the KEYS file to append your key. Here is an example:
(gpg --list-sigs "John Doe" && gpg --armor --export "John Doe") >> KEYS
svn commit KEYS -m "Add key for John Doe"
Run create-tarball.sh with the <version> tag and <rc> number that you chose in the previous steps:
GH_TOKEN=<TOKEN> ./dev/release/create-tarball.sh 0.11.0 1
The create-tarball.sh script creates and uploads all release candidate artifacts to the arrow dev location on the Apache distribution svn server, and provides an email template to send to dev@arrow.apache.org for release voting.
Send the email output from the script to dev@arrow.apache.org. The email should look like this:
To: dev@arrow.apache.org
Subject: [VOTE][Ballista] Release Apache Arrow Ballista 0.8.0 RC0

Hi,

I would like to propose a release of Apache Arrow Ballista Implementation,
version 0.8.0.

This release candidate is based on commit: a5dd428f57e62db20a945e8b1895de91405958c4 [1]
The proposed release artifacts and signatures are hosted at [2].
The changelog is located at [3].

Please download, verify checksums and signatures, run the unit tests, and vote
on the release. The vote will be open for at least 72 hours.

[ ] +1 Release this as Apache Arrow Ballista 0.8.0
[ ] +0
[ ] -1 Do not release this as Apache Arrow Ballista 0.8.0 because...

[1]: https://github.com/apache/arrow-ballista/tree/a5dd428f57e62db20a945e8b1895de91405958c4
[2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-ballista-0.8.0
[3]: https://github.com/apache/arrow-ballista/blob/a5dd428f57e62db20a945e8b1895de91405958c4/CHANGELOG.md
For the release to become “official” it needs at least three PMC members to vote +1 on it.
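When tallying the thread, the pass condition can be written down as a small check. This is a sketch with a hypothetical function name. The three +1 minimum comes from the text above; the additional requirement of more binding +1 than binding -1 votes comes from the general ASF release policy rather than from this document.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds if a release vote passes.
#   $1 = number of binding (PMC) +1 votes
#   $2 = number of binding (PMC) -1 votes (defaults to 0)
# Rule: at least three binding +1 votes, and more +1 than -1 votes.
vote_passes() {
  local plus="$1" minus="${2:-0}"
  [ "$plus" -ge 3 ] && [ "$plus" -gt "$minus" ]
}

# Example usage:
#   if vote_passes 4 1; then echo "vote passed"; else echo "vote failed"; fi
```

Non-binding votes from contributors are welcome on the thread but are not counted by this check.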
The dev/release/verify-release-candidate.sh script in this repository can assist in the verification process. Run it like this:
./dev/release/verify-release-candidate.sh 0.11.0 0
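For a manual spot check of a downloaded artifact, a checksum comparison can be sketched as below. This is not the contents of the verification script. The helper names are hypothetical, and the sketch assumes a `<file>.sha256` file sits next to the artifact with the hash as its first field.

```bash
#!/usr/bin/env bash

# Print the SHA-256 hash of a file, using sha256sum where available and
# falling back to `shasum -a 256` (for example on macOS).
sha256_of() {
  if command -v sha256sum >/dev/null 2>&1; then
    sha256sum "$1" | awk '{print $1}'
  else
    shasum -a 256 "$1" | awk '{print $1}'
  fi
}

# Hypothetical helper: succeeds if <file> matches the hash stored in
# <file>.sha256 (assumed layout: hash is the first whitespace-separated field).
verify_sha256() {
  local file="$1" expected actual
  expected="$(awk '{print $1; exit}' "${file}.sha256")"
  actual="$(sha256_of "$file")"
  [ -n "$expected" ] && [ "$expected" = "$actual" ]
}

# Signatures are checked separately, typically with:
#   gpg --verify <file>.asc <file>
```

A matching checksum only shows that the download is intact. The signature check against the KEYS file is what ties the artifact to the release manager.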
If the release is not approved, fix the problem, merge any changelog changes into main, and try again with the next RC number.
NOTE: steps in this section can only be done by PMC members.
Move artifacts to the release location in SVN, e.g. https://dist.apache.org/repos/dist/release/arrow/arrow-ballista-0.8.0/, using the release-tarball.sh script:
./dev/release/release-tarball.sh 0.11.0 1
Congratulations! The release is now official!
Tag the same release candidate commit with the final release tag:
git checkout 0.11.0-rc1
git tag 0.11.0
git push apache 0.11.0
Only approved releases of the tarball should be published to crates.io, in order to conform to Apache Software Foundation governance standards.
After an official project release has been made, an Arrow committer can publish the crates to crates.io using the following instructions.
Follow these instructions to create an account and login to crates.io before asking to be added as an owner of the following crates:
Download and unpack the official release tarball
Verify that the Cargo.toml in the tarball contains the correct version (e.g. version = "0.8.0") and then publish the crates with the following commands. Crates need to be published in the correct order as shown in this diagram.
To update this diagram, manually edit the dependencies in crate-deps.dot and then run:
dot -Tsvg dev/release/crate-deps.dot > dev/release/crate-deps.svg
(cd ballista/core && cargo publish)
(cd ballista/executor && cargo publish)
(cd ballista/scheduler && cargo publish)
(cd ballista/client && cargo publish)
(cd ballista-cli && cargo publish)
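Because the order matters, the same sequence can be expressed as a loop that stops at the first failure, so a later crate is never published after an earlier one fails. This is a sketch: `CRATE_DIRS` and `for_each_crate` are hypothetical names, and the directory list is copied from the commands above.

```bash
#!/usr/bin/env bash

# Crate directories in publish order, copied from the commands above.
CRATE_DIRS=(
  ballista/core
  ballista/executor
  ballista/scheduler
  ballista/client
  ballista-cli
)

# Hypothetical helper: run the given command inside each crate directory,
# in order, and stop as soon as one invocation fails.
for_each_crate() {
  local dir
  for dir in "${CRATE_DIRS[@]}"; do
    echo "==> ${dir}"
    (cd "$dir" && "$@") || return 1
  done
}

# Example usage (from the root of the unpacked release tarball):
#   for_each_crate cargo publish
```

If a publish fails partway through, fix the problem and resume from the failed crate by hand, since versions that were already published cannot be published again.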
We release the Docker image that was voted on rather than build a new image. We do this by re-tagging the image.
$ docker pull ghcr.io/apache/arrow-ballista-standalone:0.10.0-rc3
$ docker tag ghcr.io/apache/arrow-ballista-standalone:0.10.0-rc3 ghcr.io/apache/arrow-ballista-standalone:0.10.0
$ docker push ghcr.io/apache/arrow-ballista-standalone:0.10.0
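The final image reference is the RC reference with its `-rc<N>` suffix removed, which can be derived mechanically. The helpers below are a hypothetical sketch that prints the three docker commands for review instead of running them.

```bash
#!/usr/bin/env bash

# Hypothetical helper: strip the trailing -rc<N> suffix from an image
# reference, e.g. ...-standalone:0.10.0-rc3 -> ...-standalone:0.10.0.
final_image_ref() {
  local rc_ref="$1"
  echo "${rc_ref%-rc*}"
}

# Hypothetical helper: print (do not run) the re-tagging commands.
print_retag_commands() {
  local rc_ref="$1" final
  final="$(final_image_ref "$rc_ref")"
  echo "docker pull ${rc_ref}"
  echo "docker tag ${rc_ref} ${final}"
  echo "docker push ${final}"
}

# Example usage:
#   print_retag_commands ghcr.io/apache/arrow-ballista-standalone:0.10.0-rc3
```

Deriving the final tag from the RC tag avoids typing the version twice and accidentally re-tagging a different image from the one that was voted on.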
Call the vote on the Arrow dev list by replying to the RC voting thread. The reply should have a new subject constructed by adding the [RESULT] prefix to the old subject line.
Sample announcement template:
The vote has passed with <NUMBER> +1 votes. Thank you to all who helped with the release verification.
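The subject line and body of the result email can be generated from the original vote subject and the vote count. This is a sketch with hypothetical function names. It assumes the prefix is placed directly in front of the old subject with no space, and it drops a leading "Re: " that a mail client may have added.

```bash
#!/usr/bin/env bash

# Hypothetical helper: build the result subject from the vote subject.
# A leading "Re: " (as added by mail clients when replying) is removed first.
result_subject() {
  local subject="$1"
  subject="${subject#Re: }"
  echo "[RESULT]${subject}"
}

# Hypothetical helper: fill the vote count into the announcement template.
result_body() {
  local count="$1"
  echo "The vote has passed with ${count} +1 votes. Thank you to all who helped with the release verification."
}

# Example usage:
#   result_subject "Re: [VOTE][Ballista] Release Apache Arrow Ballista 0.8.0 RC0"
#   result_body 5
```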
Add the release to https://reporter.apache.org/addrelease.html?arrow with a version name prefixed with RS-BALLISTA-, for example RS-BALLISTA-0.9.0.
The release information is used to generate a template for a board report (see example here).
See the ASF documentation on when to archive for more information.
Release candidates should be deleted from the dev svn once the release is published.
Get a list of Ballista release candidates:
svn ls https://dist.apache.org/repos/dist/dev/arrow | grep ballista
Delete a release candidate:
svn delete -m "delete old Ballista RC" https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-ballista-0.8.0-rc1/
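When several release candidates exist for one version, the listing can be filtered down to just those directories before deleting them. This is a sketch with a hypothetical function name. It reads `svn ls` output on standard input and assumes the `apache-arrow-ballista-<version>-rc<N>/` naming shown in the delete command above.

```bash
#!/usr/bin/env bash

# Hypothetical helper: read directory names (one per line, as printed by
# `svn ls`) on stdin and print only the release candidates of one version.
rc_dirs_for_version() {
  local version="$1" line
  while IFS= read -r line; do
    case "$line" in
      # quoted part is matched literally; [0-9]* requires at least one digit
      "apache-arrow-ballista-${version}-rc"[0-9]*) echo "$line" ;;
    esac
  done
}

# Example usage:
#   svn ls https://dist.apache.org/repos/dist/dev/arrow | rc_dirs_for_version 0.8.0
```

The function only prints names. Reviewing the list before passing each entry to `svn delete` keeps the destructive step manual.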
Only the latest release should be available in the release svn. Delete old releases after publishing the new release.
Get a list of Ballista releases:
svn ls https://dist.apache.org/repos/dist/release/arrow | grep ballista
Delete a release:
svn delete -m "delete old Ballista release" https://dist.apache.org/repos/dist/release/arrow/arrow-ballista-0.8.0
We typically crowdsource release announcements by collaborating on a Google document, usually starting with a copy of the previous release announcement.
Run the following commands to get the number of commits and number of unique contributors for inclusion in the blog post.
git log --pretty=oneline 0.10.0..0.11.0 ballista ballista-cli examples | wc -l
git shortlog -sn 0.10.0..0.11.0 ballista ballista-cli examples | wc -l
Once there is consensus on the contents of the post, create a PR to add a blog post to the arrow-site repository. Note that there is no need for a formal PMC vote on the blog post contents since this isn't considered to be a “release”.
Here is an example blog post PR:
Once the PR is merged, a GitHub action will publish the new blog post to https://arrow.apache.org/blog/.