Backport packages

What the backport packages are

The Backport Provider packages are packages (per provider) that make it possible to easily use Hooks, Operators, Sensors, and Secrets from the 2.0 version of Airflow in the 1.10.* series.

The release manager prepares backport packages separately from the main Airflow Release, using breeze commands and accompanying scripts. This document provides an overview of the command line tools needed to prepare backport packages.

Content of the release notes

Each backport package contains release notes in the form of a README.md file that is automatically generated from the change history and the code of the provider.

The script generates all the necessary information:

  • summary of requirements for each backport package
  • list of dependencies (including extras to install them) when the package depends on other provider packages
  • table of new hooks/operators/sensors/protocols/secrets
  • table of moved hooks/operators/sensors/protocols/secrets with the information where they were moved from
  • changelog of all the changes to the provider package. This will be automatically updated with an incremental changelog whenever we decide to release separate packages.

The script generates two types of files:

  • BACKPORT_PROVIDERS_CHANGES_YYYY.MM.DD.md, which keeps information about the changes (commits) in a particular version of the provider package. The file for the latest release gets updated when you iterate with the same new date/version, but it never changes automatically for already released packages. This way, just before the final release, you can manually edit the changes file if you want to remove some changes from it.

  • README.md, which is regenerated every time you run the script (unless there are no changes since the last time you generated the release notes)

Note that our CI system builds the release notes for backport packages automatically with every build, using the current date, so you can be sure that the automated generation of the release notes continues to work. You can also preview the generated readme files by downloading artifacts from GitHub Actions. The script does not modify the README and CHANGES files if there is no change in the repo for that provider.

Generating release notes

When you want to prepare release notes for a package, you need to run:

./breeze prepare-provider-readme [YYYY.MM.DD] <PACKAGE_ID> ...
  • YYYY.MM.DD is the CalVer version of the package to prepare. Note that this date cannot be earlier than the already released version (the script fails if it is). It can be set in the future, anticipating the future release date. If you do not specify the date, it is taken from the last generated readme, and the last generated CHANGES file is updated.

  • <PACKAGE_ID> is usually a directory in the airflow/providers folder (for example google), but in several cases it might be one level deeper, separated with . (for example apache.hive).

You can run the script with multiple package names if you want to prepare several packages at the same time. Before you specify a new version, the last released version is updated: if any bug fixes were merged to master recently, they are automatically taken into account.
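The rule that the requested date cannot be earlier than the already released version can be sketched in a few lines of Python. This is only an illustration: the helper names `parse_calver` and `check_not_earlier` are hypothetical and are not taken from the actual breeze scripts, which may implement the check differently.

```python
from datetime import date
from typing import Optional


def parse_calver(version: str) -> date:
    """Parse a YYYY.MM.DD string into a date; raises ValueError if malformed."""
    year, month, day = (int(part) for part in version.split("."))
    return date(year, month, day)


def check_not_earlier(requested: str, last_released: Optional[str]) -> date:
    """Return the requested date, or raise if it precedes the last release.

    Hypothetical helper: a date equal to the last release is accepted here,
    since the text only rules out earlier dates.
    """
    requested_date = parse_calver(requested)
    if last_released is not None and requested_date < parse_calver(last_released):
        raise ValueError(
            f"{requested} is earlier than the last released version {last_released}"
        )
    return requested_date
```

For example, `check_not_earlier("2020.05.20", "2020.05.10")` is accepted, while swapping the two arguments raises `ValueError`.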

Typically, the first time you run the script before a release, you run it with the target release date:

./breeze prepare-provider-readme 2020.05.20 google

Then, while you iterate with merges and release candidates, you update the existing release notes by running the command without the date:

./breeze prepare-provider-readme google

Once you are satisfied with the generated release notes, you can commit the changed and new files to the repository.

Preparing backport packages

As part of the preparation for Airflow 2.0, we decided to prepare backports of the provider packages that can be installed in an Airflow 1.10.*, Python 3.6+ environment. Some of those packages will soon (after testing) be officially released via PyPI, but you can easily build and prepare such packages on your own.

You build those packages in the Breeze environment, so you do not have to worry about setting up a common environment.

Note that the readme release notes have to be generated first, because the package preparation script reads the version from the latest prepared release notes.

  • The provider package ids (PACKAGE_ID) are subdirectories in the providers directory. Sometimes they are one level deeper (the apache/hive folder, for example), in which case PACKAGE_ID uses “.” to separate the folders (for example, Apache Hive's PACKAGE_ID is apache.hive). You can see the list of all available providers by running:
./breeze prepare-provider-packages -- --help
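The mapping between a provider folder and its PACKAGE_ID can be illustrated with a short Python sketch. The function names are hypothetical, and the default root `airflow/providers` is assumed from the folder named earlier in this document. The real scripts may derive the ids differently.

```python
from pathlib import PurePosixPath


def package_id_from_path(provider_dir: str, providers_root: str = "airflow/providers") -> str:
    """Turn a provider folder such as airflow/providers/apache/hive into apache.hive."""
    relative = PurePosixPath(provider_dir).relative_to(providers_root)
    return ".".join(relative.parts)


def path_from_package_id(package_id: str, providers_root: str = "airflow/providers") -> str:
    """Turn a PACKAGE_ID such as apache.hive back into its folder path."""
    return str(PurePosixPath(providers_root, *package_id.split(".")))
```

With these helpers, `package_id_from_path("airflow/providers/apache/hive")` gives `apache.hive`, and `path_from_package_id("google")` gives `airflow/providers/google`.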

The examples below show how you can build selected packages, but you can also build all packages by omitting the package ids altogether.

By default, you build only wheel packages, but you can use --package-format both to generate both wheel and sdist packages, or --package-format sdist to only generate sdist packages.

  • To build the release candidate packages for SVN Apache upload run the following command:
./breeze prepare-provider-packages --package-format both --version-suffix-for-svn=rc1 [PACKAGE_ID] ...

for example:

./breeze prepare-provider-packages --package-format both --version-suffix-for-svn=rc1 http ...
  • To build the release candidate packages for PyPI upload run the following command:
./breeze prepare-provider-packages --package-format both --version-suffix-for-pypi=rc1 [PACKAGE_ID] ...

for example:

./breeze prepare-provider-packages --package-format both --version-suffix-for-pypi=rc1 http ...
  • To build the final release packages run the following command:
./breeze prepare-provider-packages [--package-format PACKAGE_FORMAT] [PACKAGE_ID] ...

Where PACKAGE_FORMAT is one of: wheel, sdist, both (wheel is the default format).

for example:

./breeze prepare-provider-packages --package-format both http ...
  • For each package, this creates a wheel package and source distribution package in your dist folder with names following the patterns:

    • apache_airflow_backport_providers_<PROVIDER>_YYYY.[M]M.[D]D[suffix]-py3-none-any.whl
    • apache-airflow-backport-providers-<PROVIDER>-YYYY.[M]M.[D]D[suffix].tar.gz

Note! Even though we always use a two-digit month and day when generating the readme files, the version in PyPI does not contain the leading 0s in the version name. Therefore the generated artifacts also do not contain the leading 0s.
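To make the leading-zero rule concrete, here is a minimal Python sketch that converts a readme-style date into the form that appears in artifact names. The function `strip_leading_zeros` is a hypothetical helper written for this illustration. The packaging tools perform this normalization themselves.

```python
def strip_leading_zeros(version: str, suffix: str = "") -> str:
    """Convert e.g. 2020.05.20 to 2020.5.20 and append an optional suffix such as rc1."""
    # int() drops the leading zeros of each dot-separated component
    return ".".join(str(int(part)) for part in version.split(".")) + suffix
```

For example, `strip_leading_zeros("2020.05.20")` returns `2020.5.20`, and `strip_leading_zeros("2020.10.01", "rc1")` returns `2020.10.1rc1`.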

  • You can install the .whl packages with pip install <PACKAGE_FILE>
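If you drive the builds from a script, the command variants described in this section can be assembled programmatically. The sketch below only uses the flags shown in this section. The function `build_breeze_command` and its parameter names are hypothetical and exist only for illustration.

```python
from typing import List, Optional, Sequence

VALID_FORMATS = ("wheel", "sdist", "both")


def build_breeze_command(
    package_ids: Sequence[str] = (),
    package_format: str = "wheel",
    svn_suffix: Optional[str] = None,
    pypi_suffix: Optional[str] = None,
) -> List[str]:
    """Build the argument list for ./breeze prepare-provider-packages."""
    if package_format not in VALID_FORMATS:
        raise ValueError(f"package_format must be one of {VALID_FORMATS}")
    command = ["./breeze", "prepare-provider-packages", "--package-format", package_format]
    if svn_suffix:
        command.append(f"--version-suffix-for-svn={svn_suffix}")
    if pypi_suffix:
        command.append(f"--version-suffix-for-pypi={pypi_suffix}")
    # An empty list of package ids means "build all packages"
    command.extend(package_ids)
    return command
```

The resulting list can be passed to `subprocess.run(command, check=True)` from the root of the Airflow sources.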

Testing provider package scripts

The backport package import checks and tests execute within the “CI” environment of Airflow, the same image that is used by Breeze. However, they require special mounts (no Airflow sources mounted) and the ability to install all extras and packages in order to test the importability of all the packages. The process is rather simple but semi-automated:

Backport packages

  1. Prepare backport packages
./breeze --backports prepare-provider-packages --package-format both

This prepares all backport packages in the “dist” folder

  2. Enter the container:
export INSTALL_AIRFLOW_VERSION=1.10.12
export BACKPORT_PACKAGES="true"

./dev/provider_packages/enter_breeze_provider_package_tests.sh

(the rest of it is in the container)

  3. [IN CONTAINER] Install all remaining dependencies and reinstall airflow 1.10:
cd /airflow_sources

pip install ".[devel_all]"

pip install "apache-airflow==${INSTALL_AIRFLOW_VERSION}"

cd
  4. [IN CONTAINER] Install the provider packages from /dist
pip install /dist/apache_airflow_backport_providers_*.whl
  5. [IN CONTAINER] Check the installation folder for providers:
python3 <<EOF 2>/dev/null
import airflow.providers
# Iterating __path__ works for both regular and namespace packages
for p in airflow.providers.__path__:
    print(p)
EOF
  6. [IN CONTAINER] Check if all the providers can be imported:
python3 /opt/airflow/dev/import_all_classes.py --path <PATH_REPORTED_IN_THE_PREVIOUS_STEP>
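The installation folder check can also be written as a small reusable Python function that avoids private attributes. This is a sketch only: `package_search_paths` is a hypothetical helper name, and inside the container you would call it with `airflow.providers`.

```python
import importlib
from typing import List


def package_search_paths(package_name: str) -> List[str]:
    """Return the folders that a package is loaded from.

    __path__ is a plain list for regular packages and an iterable
    path object for namespace packages, so iterating covers both.
    """
    module = importlib.import_module(package_name)
    return [str(entry) for entry in module.__path__]
```

For example, `package_search_paths("json")` returns the single folder of the standard library `json` package.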

Regular packages

  1. Prepare regular packages
./breeze prepare-provider-packages --package-format both

This prepares all regular provider packages in the “dist” folder

  2. Prepare airflow package from sources
python setup.py compile_assets sdist bdist_wheel
rm -rf -- *egg-info*

This prepares airflow package in the “dist” folder

  3. Enter the container:
export INSTALL_AIRFLOW_VERSION="wheel"
unset BACKPORT_PACKAGES

./dev/provider_packages/enter_breeze_provider_package_tests.sh

(the rest of it is in the container)

  4. [IN CONTAINER] Install apache-beam.
pip install "apache-beam[gcp]"
  5. [IN CONTAINER] Install the provider packages from /dist
pip install --no-deps /dist/apache_airflow_providers_*.whl

Note! The --no-deps flag is used because we are testing against the Airflow version installed from the wheel package.

  6. [IN CONTAINER] Check the installation folder for providers:
python3 <<EOF 2>/dev/null
import airflow.providers
# Iterating __path__ works for both regular and namespace packages
for p in airflow.providers.__path__:
    print(p)
EOF
  7. [IN CONTAINER] Check if all the providers can be imported:
python3 /opt/airflow/dev/import_all_classes.py --path <PATH_REPORTED_IN_THE_PREVIOUS_STEP>
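The idea behind the importability check can be sketched with the standard library alone. The code below is not the `import_all_classes.py` script and makes no claim about how that script works. It is a hypothetical `find_import_errors` helper that walks a package, imports every module, and collects the failures.

```python
import importlib
import pkgutil
from typing import Dict


def find_import_errors(package_name: str) -> Dict[str, str]:
    """Import every module under a package and map failing module names to errors."""
    package = importlib.import_module(package_name)
    errors: Dict[str, str] = {}
    # onerror is a no-op so that walk_packages keeps going when a
    # subpackage fails to import; the failure is recorded below instead
    for module_info in pkgutil.walk_packages(
        package.__path__, prefix=package_name + ".", onerror=lambda name: None
    ):
        try:
            importlib.import_module(module_info.name)
        except Exception as exc:  # any failure counts as "not importable"
            errors[module_info.name] = repr(exc)
    return errors
```

An empty dictionary means that every module under the package was imported successfully.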