The Provider packages are separate packages (one package per provider) that implement integrations with external services for Airflow in the form of installable Python packages.
The Release Manager prepares packages separately from the main Airflow Release, using breeze commands and accompanying scripts. This document provides an overview of the command line tools needed to prepare the packages.
The first thing the release manager has to do is change the version of the provider to the target version. Each provider has a provider.yaml file that, among other things, stores information about provider versions. When you attempt to release a provider, you should update that information based on the changes for the provider and its CHANGELOG.rst. It might be that CHANGELOG.rst already contains the right target version. This will be especially true if some changes in the provider add new features (then the minor version is increased) or introduce a backwards-incompatible, breaking change in the provider (then the major version is increased). Committers, when approving and merging changes to the providers, should make sure that the CHANGELOG.rst is updated whenever anything other than a bugfix is added.
If there are no new features or breaking changes, the release manager should simply increase the patch-level version for the provider.
The new version should be first on the list.
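The versioning rule above can be sketched in Python. This is only an illustration, not part of the release tooling: the function name next_version and the change labels "breaking", "feature" and "bugfix" are assumptions made for this example, as is resetting the lower components to zero on a major or minor bump.

```python
def next_version(current: str, change: str) -> str:
    """Return the next provider version for a given kind of change.

    `change` is one of the hypothetical labels "breaking", "feature" or "bugfix".
    """
    major, minor, patch = (int(part) for part in current.split("."))
    if change == "breaking":
        # Backwards-incompatible change: bump major (assumed: reset minor and patch).
        return f"{major + 1}.0.0"
    if change == "feature":
        # New feature: bump minor (assumed: reset patch).
        return f"{major}.{minor + 1}.0"
    if change == "bugfix":
        # No new features or breaking changes: bump the patch level only.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown change type: {change!r}")


print(next_version("1.2.3", "bugfix"))
```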
Each of the provider packages contains release notes in the form of the CHANGELOG.rst file, which is automatically generated from the history of the changes and the code of the provider. They are stored in the documentation directory. The README.md file generated during package preparation is not stored anywhere in the repository; it does, however, contain a link to the generated changelog.
The README.rst file contains the following information:
The index.rst stored in the docs/apache-airflow-providers-<PROVIDER> folder contains:
provider.yaml file.
When you want to prepare release notes for a package, you need to run:
./breeze prepare-provider-documentation <PACKAGE_ID> ...
<PACKAGE_ID> is usually the name of a directory in the airflow/providers folder (for example google), but in several cases it might be one level deeper, separated with “.” (for example apache.hive).
The index.rst is updated automatically in the docs/apache-airflow-providers-<provider> folder.
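The relation between a PACKAGE_ID and its folder can be sketched as follows. The helper names package_id_to_path and path_to_package_id are hypothetical and exist only for this illustration; the airflow/providers root is taken from the text above.

```python
from pathlib import PurePosixPath

# Root folder of the providers, as named in the text above.
PROVIDERS_ROOT = PurePosixPath("airflow/providers")


def package_id_to_path(package_id: str) -> PurePosixPath:
    """Map a dotted PACKAGE_ID such as 'apache.hive' to its folder."""
    return PROVIDERS_ROOT.joinpath(*package_id.split("."))


def path_to_package_id(path: str) -> str:
    """Map a provider folder back to its dotted PACKAGE_ID."""
    relative = PurePosixPath(path).relative_to(PROVIDERS_ROOT)
    return ".".join(relative.parts)


print(package_id_to_path("apache.hive"))
print(path_to_package_id("airflow/providers/google"))
```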
You can run the script with multiple package names if you want to prepare several packages at the same time. By default, the command runs in interactive mode, where you can decide one by one whether the package documentation should be prepared or not.
As soon as you are satisfied with the release notes generated you can commit generated changes/new files to the repository.
You should manually update the generated changelog, classify the commits, and re-run the prepare-provider-documentation command after all the changes.
You can repeat this several times; the generated changes will automatically include new commits that appeared since the last run.
You can also run it in non-interactive mode by adding the --non-interactive flag.
You build the packages in the breeze environment, so you do not have to worry about common environment.
Note that readme release notes have to be generated first, so that the package preparation script reads the provider.yaml.
The PACKAGE_ID values are subdirectories in the providers directory. Sometimes they are one level deeper (the apache/hive folder, for example), in which case PACKAGE_ID uses “.” to separate the folders (for example, Apache Hive's PACKAGE_ID is apache.hive). You can see the list of all available providers by running:
./breeze prepare-provider-packages -- --help
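As a rough illustration of how such ids relate to the folder layout, the sketch below scans a directory tree for provider.yaml files and derives a dotted id from each folder. The function find_package_ids is hypothetical and is not how breeze produces its list; the demo runs against a temporary directory rather than a real checkout.

```python
import tempfile
from pathlib import Path
from typing import List


def find_package_ids(providers_root: Path) -> List[str]:
    """Return dotted ids for every folder below the root that holds a provider.yaml."""
    ids = []
    for yaml_file in providers_root.rglob("provider.yaml"):
        relative = yaml_file.parent.relative_to(providers_root)
        ids.append(".".join(relative.parts))
    return sorted(ids)


# Demo on a throwaway tree that mimics the layout described above.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for folder in ("http", "apache/hive"):
        (root / folder).mkdir(parents=True)
        (root / folder / "provider.yaml").touch()
    demo_ids = find_package_ids(root)

print(demo_ids)
```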
The examples below show how you can build selected packages, but you can also build all packages by omitting the package ids altogether.
By default, you build both packages, but you can use --package-format wheel to generate only the wheel package, or --package-format sdist to generate only the sdist package.
./breeze prepare-provider-packages --version-suffix-for-svn=rc1 [PACKAGE_ID] ...
for example:
./breeze prepare-provider-packages --version-suffix-for-svn=rc1 http ...
./breeze prepare-provider-packages --version-suffix-for-pypi=rc1 [PACKAGE_ID] ...
for example:
./breeze prepare-provider-packages --version-suffix-for-pypi=rc1 http ...
./breeze prepare-provider-packages [--package-format PACKAGE_FORMAT] [PACKAGE_ID] ...
Where PACKAGE_FORMAT might be one of: wheel, sdist, both (wheel is the default format)
for example:
./breeze prepare-provider-packages http ...
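If you script these invocations, the argument list can be assembled programmatically. The sketch below is an assumption-laden helper (build_prepare_command is a made-up name) that only uses the flags shown in this document; it builds the command but does not run it.

```python
from typing import List, Optional, Sequence

ALLOWED_FORMATS = ("wheel", "sdist", "both")


def build_prepare_command(
    package_ids: Sequence[str],
    package_format: Optional[str] = None,
    suffix_for_svn: Optional[str] = None,
    suffix_for_pypi: Optional[str] = None,
) -> List[str]:
    """Assemble a './breeze prepare-provider-packages' argument list."""
    command = ["./breeze", "prepare-provider-packages"]
    if suffix_for_svn:
        command.append(f"--version-suffix-for-svn={suffix_for_svn}")
    if suffix_for_pypi:
        command.append(f"--version-suffix-for-pypi={suffix_for_pypi}")
    if package_format is not None:
        if package_format not in ALLOWED_FORMATS:
            raise ValueError(f"unsupported package format: {package_format!r}")
        command += ["--package-format", package_format]
    # Omitting the package ids altogether builds all packages.
    command += list(package_ids)
    return command


print(build_prepare_command(["http"], suffix_for_pypi="rc1"))
```

The resulting list can be passed to subprocess.run(command, check=True) from the root of the Airflow repository.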
For each package, this creates a wheel package and source distribution package in your dist folder with names following the patterns:
apache_airflow_providers_<PROVIDER>_YYYY.[M]M.[D]D[suffix]-py3-none-any.whl
apache-airflow-providers-<PROVIDER>-YYYY.[M]M.[D]D[suffix].tar.gz
Note! Even though we always use the two-digit month and day when generating the readme files, the version in PyPI does not contain the leading 0s in the version name. Therefore, the generated artifacts also do not contain the leading 0s.
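The effect of dropping the leading zeros can be illustrated with a short sketch. The helpers normalize_version and artifact_names are hypothetical; they simply fill in the two name patterns quoted above and insert the provider name verbatim.

```python
from typing import Tuple


def normalize_version(version: str) -> str:
    """Drop leading zeros from each dot-separated numeric component."""
    return ".".join(str(int(part)) for part in version.split("."))


def artifact_names(provider: str, version: str, suffix: str = "") -> Tuple[str, str]:
    """Return (wheel, sdist) file names following the patterns quoted above."""
    full_version = normalize_version(version) + suffix
    wheel = f"apache_airflow_providers_{provider}_{full_version}-py3-none-any.whl"
    sdist = f"apache-airflow-providers-{provider}-{full_version}.tar.gz"
    return wheel, sdist


print(normalize_version("2021.03.05"))
print(artifact_names("http", "2021.03.05", "rc1"))
```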
pip install <PACKAGE_FILE>
You can add the --verbose flag if you want to see detailed commands executed by the script.
The provider preparation is done using the Breeze development environment and CI image. This way we have a common environment for package preparation, and we can easily verify that provider packages are OK and can be installed for released versions of Airflow (including the 2.0.0 version).
The same scripts and environment are used in our CI workflow: the packages are prepared, installed and tested using the same CI image. The tests are performed via the Production image, also in the CI workflow. Our production images are built using Airflow and provider packages prepared on the CI so that they are as close as possible to what users will be using when they install from PyPI. Our scripts prepare wheel and sdist packages for both Airflow and provider packages and install them while building the images. This is very helpful when testing new providers that do not yet have a PyPI package released, and it also allows checking that providers' authors did not make breaking changes.
All classes from all providers must be importable; otherwise our CI will fail. Also, the image is verified: the expected providers should be installed (for the production image) and discoverable, and pip check with all the dependencies has to succeed.
You might occasionally want to modify the preparation scripts for providers. They are all present in the dev/provider_packages folder. The Breeze commands above perform the sequence of those steps automatically, but you can manually run the scripts as follows to debug them:
The commands are best executed in the Breeze environment as it has all the dependencies installed; the examples below describe that. However, for development you might run them in your local development environment as it makes them easier to debug. Just make sure you install your development environment with the ‘devel_all’ extra (make sure to use the right Python version).
Note that it is best to set INSTALL_PROVIDERS_FROM_SOURCES to true, to make sure that any newly added providers are not installed as packages (in case they are not yet available in PyPI).
INSTALL_PROVIDERS_FROM_SOURCES="true" pip install -e ".[devel_all]" \
    --constraint https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.6.txt
Note that you might need to add some extra dependencies to your system to install “devel_all”: many dependencies are needed to make a clean install. The Breeze environment has all the dependencies installed in case you have problems setting up your local virtualenv.
You can also use breeze to prepare your virtualenv (it will print extra information if some dependencies are missing or installation fails, and it will also reset your SQLite test db in the ${HOME}/airflow directory):
./breeze initialize-local-virtualenv
You can find a description of all the commands and more information about the “prepare” tool by running it with --help:
./dev/provider_packages/prepare_provider_packages.py --help
You can see, for example, the list of all provider packages:
./dev/provider_packages/prepare_provider_packages.py list-providers-packages
You can add the --verbose flag in the breeze command if you want to see the commands executed.
The script verifies if all provider's classes can be imported.
./dev/import_all_classes.py --path airflow/providers
It checks if all classes from provider packages can be imported.
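The idea behind such an import check can be sketched with the standard library. This is not the code of import_all_classes.py; the function import_all_modules is a hypothetical illustration that walks a package, imports every module, and records failures instead of stopping at the first one. The demo uses the standard json package so that it runs without Airflow installed.

```python
import importlib
import pkgutil
from typing import Dict, List, Tuple


def import_all_modules(package_name: str) -> Tuple[List[str], Dict[str, Exception]]:
    """Import every module below a package and collect any import failures."""
    package = importlib.import_module(package_name)
    imported: List[str] = []
    failed: Dict[str, Exception] = {}
    prefix = package.__name__ + "."
    for module_info in pkgutil.walk_packages(package.__path__, prefix=prefix):
        try:
            importlib.import_module(module_info.name)
            imported.append(module_info.name)
        except Exception as error:  # an import can fail with any exception type
            failed[module_info.name] = error
    return imported, failed


imported, failed = import_all_modules("json")
print(f"imported {len(imported)} modules, {len(failed)} failures")
```

Collecting failures in a dictionary rather than raising immediately makes it possible to report every broken module in a single run.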
The script verifies if all provider's classes are correctly named.
./dev/provider_packages/prepare_provider_packages.py verify-provider-classes
It checks if all provider Operators/Hooks etc. are correctly named.
The script updates the documentation of the provider packages. Note that it uses the Airflow git repository and pulls the latest version of tags available in Airflow, so you need to enter Breeze with the --mount-all-local-sources flag:
./dev/provider_packages/prepare_provider_packages.py update-package-documentation \
    --version-suffix <SUFFIX> \
    <PACKAGE>
This script will fetch the latest version of Airflow from Airflow's repo (it will automatically add the apache-https-for-providers remote and pull Airflow, read-only, from there). There is no need to set up any credentials for it.
If the version being prepared is already tagged in the repo, documentation preparation returns immediately and prints a warning.
You can add the --verbose flag if you want to see detailed commands executed by the script.
./dev/provider_packages/prepare_provider_packages.py update-changelog <PACKAGE>
You can add the --verbose flag if you want to see detailed commands executed by the script.
This script prepares the actual packages.
This cleanup is needed because setuptools does not clean those files, and generating packages one by one without cleanup might cause artifacts from a previous package to be included in the new one.
rm -rf -- *.egg-info build/
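When the cleanup is driven from a Python script rather than the shell, the same effect can be approximated as below. The function clean_build_artifacts is a hypothetical sketch, not part of the Airflow scripts; the demo runs in a temporary directory so nothing in a real checkout is touched.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def clean_build_artifacts(directory: Path) -> List[str]:
    """Remove *.egg-info entries and the build folder, like 'rm -rf -- *.egg-info build/'."""
    removed = []
    candidates = sorted(directory.glob("*.egg-info")) + [directory / "build"]
    for path in candidates:
        if path.is_dir():
            shutil.rmtree(path)
        elif path.exists():
            path.unlink()
        else:
            continue  # nothing to clean for this candidate
        removed.append(path.name)
    return removed


# Demo: create stale artifacts next to a file that must survive the cleanup.
with tempfile.TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    (workdir / "demo.egg-info").mkdir()
    (workdir / "build").mkdir()
    (workdir / "build" / "stale.py").touch()
    (workdir / "keep.txt").touch()
    removed = clean_build_artifacts(workdir)
    remaining = sorted(p.name for p in workdir.iterdir())

print(removed, remaining)
```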
The version suffix specified here will be appended to the version retrieved from provider.yaml. Note that this command will fail if the tag denoted by the version + suffix already exists. This means that the version was not updated since the last time it was generated. In the CI we always add the ‘dev’ suffix, and we never create a tag for it, so in the CI the setup.py is generated and should never fail.
./dev/provider_packages/prepare_provider_packages.py generate-setup-files \
    --version-suffix "<SUFFIX>" \
    <PACKAGE>
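The "fail if already tagged" rule can be pictured with a small sketch. The real tag naming scheme is not described here, so this example treats tags as plain version strings, which is an assumption; check_not_tagged is a made-up name.

```python
from typing import Iterable


def check_not_tagged(version: str, suffix: str, existing_tags: Iterable[str]) -> str:
    """Return version + suffix, or raise if that combination is already tagged.

    Assumption: tags are compared as plain version strings.
    """
    candidate = f"{version}{suffix}"
    if candidate in set(existing_tags):
        raise RuntimeError(f"{candidate} is already tagged; update the provider version")
    return candidate


# The 'dev' suffix is never tagged, so this succeeds even when rc1 exists.
print(check_not_tagged("1.0.0", "dev", ["1.0.0rc1"]))
```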
You can add the --verbose flag if you want to see detailed commands executed by the script.
The script prepares the package after sources have been copied and setup files generated. Note that it uses the Airflow git repository and pulls the latest version of tags available in Airflow, so you need to enter Breeze with the --mount-all-local-sources flag:
./dev/provider_packages/prepare_provider_packages.py build-provider-packages \
    --version-suffix <SUFFIX> \
    <PACKAGE>
If the version being prepared is already tagged in the repo, the preparation returns immediately and prints an error. You can bypass the error and build the packages even if the tag exists by specifying --version-suffix (for example --version-suffix dev).
By default, you prepare both packages, but you can add the --package-format argument and specify wheel or sdist to build only one of them.
The provider package import checks and tests execute within the “CI” environment of Airflow, the same image that is used by Breeze. They do, however, require special mounts (no sources of Airflow mounted to it) and the possibility to install all extras and packages in order to test whether all classes can be imported. It is rather simple but requires a semi-automated process:
./breeze prepare-provider-packages
This prepares all provider packages in the “dist” folder
./breeze prepare-airflow-packages
This prepares airflow package in the “dist” folder
export USE_AIRFLOW_VERSION="wheel"
export USE_PACKAGES_FROM_DIST="true"
./dev/provider_packages/enter_breeze_provider_package_tests.sh
(the rest of it is in the container)
pip install apache-beam[gcp]
pip install --no-deps /dist/apache_airflow_providers_*.whl
Note! --no-deps is used because Airflow itself is already installed from the wheel package.
python3 <<EOF 2>/dev/null
import airflow.providers;
path=airflow.providers.__path__
for p in path._path:
    print(p)
EOF