This file documents the release process for the “Rust Arrow Crates”: arrow, arrow-flight, parquet, and parquet-derive.
The Rust Arrow Crates are interconnected (e.g. parquet has an optional dependency on arrow) so we increment and release all of them together.
If any code has been merged to main that has a breaking API change, as defined in Rust RFC 1105 he major version number is incremented (e.g. 9.0.2 to 10.0.2). Otherwise the new minor version incremented (e.g. 9.0.2 to 9.1.0).
As part of the Apache governance model, official releases consist of signed source tarballs approved by the Arrow PMC.
We then use the code in the approved source tarball to release to crates.io, the Rust ecosystem's package manager.
We create a CHANGELOG.md so our users know what has been changed between releases.
The CHANGELOG is created automatically using update_change_log.sh
This script creates a changelog using github issues and the labels associated with them.
Now prepare a PR to update CHANGELOG.md and versions on main to reflect the planned release.
Do this in the root of this repository. For example #2323
git checkout main git pull git checkout -b <RELEASE_BRANCH> # Update versions. Make sure to run it before the next step since we do not want CHANGELOG-old.md affected. sed -i '' -e 's/14.0.0/39.0.0/g' `find . -name 'Cargo.toml' -or -name '*.md' | grep -v CHANGELOG.md | grep -v README.md` git commit -a -m 'Update version' # ensure your github token is available export ARROW_GITHUB_API_TOKEN=<TOKEN> # manually edit ./dev/release/update_change_log.sh to reflect the release version # create the changelog ./dev/release/update_change_log.sh # commit the initial changes git commit -a -m 'Create changelog' # run automated script to copy labels to issues based on referenced PRs # (NOTE 1: this must be done by a committer / other who has # write access to the repository) # # NOTE 2: this must be done after creating the initial CHANGELOG file python dev/release/label_issues.py # review change log / edit issues and labels if needed, rerun, repeat as necessary # note you need to revert changes to CHANGELOG-old.md if you want to rerun the script ./dev/release/update_change_log.sh # Commit the changes git commit -a -m 'Update changelog' git push
Note that when reviewing the change log, rather than editing the CHANGELOG.md, it is preferred to update the issues and their labels (e.g. add invalid label to exclude them from release notes)
Merge this PR to main prior to the next step.
After you have merged the updates to the CHANGELOG and version, create a release candidate using the following steps. Note you need to be a committer to run these scripts as they upload to the apache svn distribution servers.
Pick numbers in sequential order, with 1 for rc1, 2 for rc2, etc.
While the official release artifact is a signed tarball, we also tag the commit it was created for convenience and code archaeology.
Use a string such as 43.0.0 as the <version>.
Create and push the tag thusly (for example, for version 4.1.0 and rc2 would be 4.1.0-rc2):
git fetch apache git tag <version>-<rc> apache/main # push tag to apache git push apache <version>-<rc>
Run create-tarball.sh with the <version> tag and <rc> and you found in previous steps.
Rust Arrow Crates:
./dev/release/create-tarball.sh 4.1.0 2
The create-tarball.sh script
creates and uploads a release candidate tarball to the arrow dev location on the apache distribution svn server
provide you an email template to send to dev@arrow.apache.org for release voting.
Send an email, based on the output from the script to dev@arrow.apache.org. The email should look like
To: dev@arrow.apache.org Subject: [VOTE][RUST] Release Apache Arrow Hi, I would like to propose a release of Apache Arrow Rust Implementation, version 4.1.0. This release candidate is based on commit: a5dd428f57e62db20a945e8b1895de91405958c4 [1] The proposed release tarball and signatures are hosted at [2]. The changelog is located at [3]. Please download, verify checksums and signatures, run the unit tests, and vote on the release. The vote will be open for at least 72 hours. [ ] +1 Release this as Apache Arrow Rust [ ] +0 [ ] -1 Do not release this as Apache Arrow Rust because... [1]: https://github.com/apache/arrow-rs/tree/a5dd428f57e62db20a945e8b1895de91405958c4 [2]: https://dist.apache.org/repos/dist/dev/arrow/apache-arrow-rs-4.1.0 [3]: https://github.com/apache/arrow-rs/blob/a5dd428f57e62db20a945e8b1895de91405958c4/CHANGELOG.md
For the release to become “official” it needs at least three Apache Arrow PMC members to vote +1 on it.
The dev/release/verify-release-candidate.sh script in this repository can assist in the verification process. Run it like:
./dev/release/verify-release-candidate.sh 4.1.0 2
If the release is not approved, fix whatever the problem is and try again with the next RC number
Then, create a new release on GitHub using the tag <version> (e.g. 4.1.0).
Push the release tag to github
git tag <version> <version>-<rc> git push apache <version>
Move tarball to the release location in SVN, e.g. https://dist.apache.org/repos/dist/release/arrow/arrow-rs-4.1.0/, using the release-tarball.sh script:
./dev/release/release-tarball.sh 4.1.0 2
Congratulations! The release is now official!
The release.yml workflow automatically creates a github release for the tag. Check that the release is created and contains the correct changelog here: https://github.com/apache/arrow-rs/releases
It is important that only approved releases of the tarball should be published to crates.io, in order to conform to Apache Software Foundation governance standards.
An Arrow committer can publish this crate after an official project release has been made to crates.io using the following instructions.
Follow these instructions to create an account and login to crates.io before asking to be added as an owner of the arrow crate.
Download and unpack the official release tarball
Verify that the Cargo.toml in the tarball contains the correct version (e.g. version = "0.11.0") and then publish the crate with the following commands
Rust Arrow Crates:
(cd arrow-buffer && cargo publish) (cd arrow-schema && cargo publish) (cd arrow-data && cargo publish) (cd arrow-array && cargo publish) (cd arrow-select && cargo publish) (cd arrow-ord && cargo publish) (cd arrow-cast && cargo publish) (cd arrow-ipc && cargo publish) (cd arrow-csv && cargo publish) (cd arrow-json && cargo publish) (cd arrow-avro && cargo publish) (cd arrow-arith && cargo publish) (cd arrow-string && cargo publish) (cd arrow-row && cargo publish) (cd arrow-pyarrow && cargo publish) (cd arrow && cargo publish) (cd arrow-avro && cargo publish) (cd arrow-flight && cargo publish) (cd parquet-variant && cargo publish) (cd parquet-variant-json && cargo publish) (cd parquet-variant-compute && cargo publish) (cd parquet-geospatial && cargo publish) (cd parquet && cargo publish) (cd parquet_derive && cargo publish) (cd arrow-integration-test && cargo publish)