| .. Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| .. http://www.apache.org/licenses/LICENSE-2.0 |
| |
| .. Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| |
| ************************ |
| Apache Airflow Providers |
| ************************ |
| |
| .. contents:: :local: |
| |
| What is a provider? |
| =================== |
| |
| Airflow 2.0 introduced the concept of providers. Providers are packages that contain integrations with |
| external systems. They are meant to extend the capabilities of the core "Apache Airflow". They are thus |
| part of the vision of Airflow-as-a-Platform, where the Airflow Core provides basic data-workflow scheduling |
| and management capabilities and can be extended by implementing the Open APIs Airflow supports, by adding |
| Plugins that add new features to the Core, and by adding Providers that allow Airflow to interact with |
| external systems. |
| |
| Providers are released separately from the Airflow Core and are versioned independently. The ways in |
| which providers can extend the Airflow Core, including the types of providers, are described on the |
| `Providers page <https://airflow.apache.org/docs/apache-airflow-providers/index.html>`_. That page also |
| explains how to create your own provider. |
| |
| Providers can be maintained and released by the Airflow community or by 3rd-party teams. In any case - |
| whether community-managed, or 3rd-party managed - they are released independently of the Airflow Core package. |
| |
| When the community releases the Airflow Core, it is released together with constraints. Those constraints |
| use the latest released versions of providers, and our published convenience images contain a subset of the |
| most popular community providers. However, our users are free to upgrade and downgrade providers |
| independently of the Airflow Core version as they see fit, as long as this does not cause conflicting |
| dependencies. |
| |
| You can read more about it in the |
| `Installation and upgrade scenarios <https://airflow.apache.org/docs/apache-airflow/stable/installation/installing-from-pypi.html#installation-and-upgrade-scenarios>`_ |
| chapter of our user documentation. |
| |
| Community managed providers |
| =========================== |
| |
| When providers are accepted by the community, the process of managing and releasing them must follow the |
| Apache Software Foundation rules and policies. This applies especially to accepting contributions and |
| releasing new versions of the providers. It means that code changes in the providers must be reviewed by |
| Airflow committers and merged once the committers accept them. We must also have sufficient test coverage |
| and documentation to allow us to maintain the providers and our users to use them. |
| |
| The latest versions of the community providers, as found in the "main" branch of the Airflow repository, |
| are installed and tested together with the other community providers. One of the key properties of |
| community providers is that their latest versions contribute their dependencies to the constraints of |
| Airflow, which are published when the Airflow Core is released. This means that users who install with the |
| constraints published by Airflow can install all the providers together without conflicting dependencies, |
| and the providers are less likely to interfere with each other. It also allows us to add an optional |
| "extra" to Airflow for each provider, so that a provider can be installed together with Airflow by |
| specifying the "extra" in the installation command. |
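| |
| As a minimal sketch of what such an installation command can look like, the Python snippet below assembles |
| a ``pip install`` invocation that combines provider extras with a constraints file. The helper names |
| ``constraints_url`` and ``pip_install_command`` are hypothetical, and the URL pattern, the example Airflow |
| and Python versions and the ``google`` and ``amazon`` extras are assumptions made for illustration; the |
| installation chapter of the user documentation remains the authoritative reference. |
| |

```python
from typing import List, Sequence


def constraints_url(airflow_version: str, python_version: str) -> str:
    """Build the URL of a constraints file (URL pattern is an assumption)."""
    return (
        "https://raw.githubusercontent.com/apache/airflow/"
        f"constraints-{airflow_version}/constraints-{python_version}.txt"
    )


def pip_install_command(
    airflow_version: str, python_version: str, extras: Sequence[str] = ()
) -> List[str]:
    """Return the argument list for installing Airflow with provider extras."""
    # Each provider "extra" is listed inside the square brackets.
    extras_part = f"[{','.join(extras)}]" if extras else ""
    requirement = f"apache-airflow{extras_part}=={airflow_version}"
    return [
        "pip",
        "install",
        requirement,
        "--constraint",
        constraints_url(airflow_version, python_version),
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(pip_install_command("2.7.0", "3.8", ["google", "amazon"])))
```

| |
| The function only builds the argument list; passing it to ``subprocess.run`` would perform the actual |
| installation. Keeping the constraints URL tied to the Airflow version is what lets the providers selected |
| through extras resolve to versions that were tested together. |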
| |
| Because of the constraints and potentially conflicting dependencies, the community providers have to be |
| updated regularly. The community might decide to suspend releases of a provider if we find that we have |
| trouble updating its dependencies, or if the provider is not compatible with other, more popular providers |
| and those popular providers end up being limited by the constraints of the less popular one. |
| See the section below for more details on suspending releases of the community providers. |
| |
| A list of all available community providers can be found at the `Providers index <https://airflow.apache.org/docs/>`_. |
| |
| |
| Community providers lifecycle |
| ============================= |
| |
| This document describes the complete life-cycle of community providers - from inception and acceptance into |
| the Airflow main branch to being decommissioned and removed from the main branch of the Airflow repository. |
| |
| .. note:: |
| |
| Technical details on how to manage the lifecycle of providers are described in the document: |
| |
| `Managing provider's lifecycle <https://github.com/apache/airflow/blob/main/airflow/providers/MANAGING_PROVIDERS_LIFECYCLE.rst>`_ |
| |
| |
| Accepting new community providers |
| --------------------------------- |
| |
| Accepting new community providers should be a deliberate process that requires a ``[DISCUSSION]`` thread |
| followed by a ``[VOTE]`` thread on the Airflow `devlist <https://airflow.apache.org/community/#mailing-list>`_. |
| |
| If the provider is an integration with open-source software rather than with a service, we can relax the |
| vote procedure a bit. In particular, if the open-source software belongs to the Apache Software Foundation, |
| the Linux Foundation or a similar organisation with well-established governance processes that are not |
| strictly vendor-controlled, and the software is well established and popular, it might be enough to |
| have a good and complete PR of the provider, ideally with great test coverage (including integration tests) |
| and documentation. It should then be enough to request acceptance of the provider with a ``[LAZY CONSENSUS]`` |
| mail on the devlist and, assuming nobody in the community objects to the lazy consensus, the provider |
| might be merged. |
| |
| For service providers, the ``[DISCUSSION]`` thread is aimed at gathering the reasons why the person |
| proposing the new provider thinks it should be accepted by the community. Maintaining a provider |
| in the community is a burden. Contrary to many people's beliefs, code is often a liability rather than an |
| asset, and accepting code to be managed by the community is often undesirable when it involves significant |
| maintenance effort, especially since the community consists of volunteers. There must be a really |
| good reason to believe that the provider is better maintained by the community if there |
| are 3rd-party teams that can be paid to manage it on their own. To accept the provider, we have to |
| believe that the community currently has an interest in managing it and that enough volunteers in the |
| community will be willing to maintain it in the future. |
| |
| The ``[VOTE]`` thread is aimed at gathering votes from the community on whether the provider should be |
| accepted, and it follows the usual Apache Software Foundation voting rules concerning |
| `Votes on Code Modification <https://www.apache.org/foundation/voting.html#votes-on-code-modification>`_. |
| |
| The Ecosystem page, registries, and the 3rd-party teams' own resources are the best places to increase |
| visibility of such providers, so getting a provider into the community does not bring any "great" gain in |
| visibility. It is also often easier for service providers to advertise and promote usage of a provider |
| when they own, manage and release it themselves, especially when they can synchronize releases |
| of their provider with new features added to the service. |
| |
| Community providers release process |
| ----------------------------------- |
| |
| The community providers are released regularly (usually every 2 weeks) in batches consisting of any providers |
| that need to be released because they changed since the last release. The release manager decides which |
| providers to include and whether some or all providers should be released. For example, the next chapter, |
| about upgrading the minimum version of Airflow, describes a case where we release all active (meaning |
| non-suspended) providers together in a single batch. The release manager also decides on the version bump of |
| each provider, depending on whether the changes since the previous version are classified as breaking |
| changes, new features or just bug fixes. |
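| |
| The version-bump decision described above can be sketched in Python as follows. This is only an |
| illustration: the ``Change`` enum and the ``bump_version`` helper are hypothetical names, and the mapping |
| (breaking changes bump ``MAJOR``, new features bump ``MINOR``, bug fixes bump ``PATCHLEVEL``) is an |
| assumption based on common semantic-versioning practice rather than a description of the actual release |
| tooling. |
| |

```python
from enum import IntEnum
from typing import Iterable


class Change(IntEnum):
    """Hypothetical classification of a change, ordered by severity."""

    BUGFIX = 0
    FEATURE = 1
    BREAKING = 2


def bump_version(version: str, changes: Iterable[Change]) -> str:
    """Return the next version, driven by the most severe change in the batch."""
    major, minor, patch = (int(part) for part in version.split("."))
    changes = list(changes)
    if not changes:
        # Nothing changed since the last release, so there is nothing to bump.
        return version
    most_severe = max(changes)
    if most_severe == Change.BREAKING:
        return f"{major + 1}.0.0"
    if most_severe == Change.FEATURE:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


if __name__ == "__main__":
    print(bump_version("3.2.1", [Change.BUGFIX, Change.FEATURE]))
```

| |
| Only the most severe change in a batch matters in this sketch: a single breaking change determines the |
| bump regardless of how many features or bug fixes accompany it. |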
| |
| Upgrading Minimum supported version of Airflow |
| ---------------------------------------------- |
| |
| One important limitation of the providers released by the community is that each of them declares a |
| minimum supported version of Airflow. The minimum version of Airflow is a ``MINOR`` version (2.4, 2.5 etc.), |
| indicating that the providers might use features that appeared in that release. By default (there could be |
| justified exceptions), we increase the minimum Airflow version to the next ``MINOR`` release once 12 months |
| have passed since the first release of the ``MINOR`` version that precedes it. |
| |
| For example, this means that by default we upgrade the minimum version of Airflow supported by providers |
| to 2.8.0 in the first provider release after the 18th of August 2024, because the 18th of August 2023 is |
| the date when the first ``PATCHLEVEL`` of 2.7 (2.7.0) was released. |
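| |
| The date arithmetic behind this example can be checked with a short Python sketch. The helper name |
| ``earliest_minimum_bump`` is hypothetical, and the 12-month window is simply the default stated above; |
| justified exceptions are not modelled. |
| |

```python
import calendar
from datetime import date


def earliest_minimum_bump(first_minor_release: date, months: int = 12) -> date:
    """Return the date that lies `months` months after a MINOR version's first release."""
    # Work with a zero-based month index so that year rollover is plain integer math.
    total_months = first_minor_release.month - 1 + months
    year = first_minor_release.year + total_months // 12
    month = total_months % 12 + 1
    # Clamp the day for shorter target months (for example 29 February).
    day = min(first_minor_release.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


if __name__ == "__main__":
    # Airflow 2.7.0 was first released on 18 August 2023.
    print(earliest_minimum_bump(date(2023, 8, 18)))
```

| |
| With the 2.7.0 release date as input, the function returns 18 August 2024, which matches the date after |
| which the first provider release raises the minimum Airflow version to 2.8.0. |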
| |
| Increasing the minimum Airflow version is not, on its own, a reason to bump the ``MAJOR`` version of the |
| providers (unless there are other breaking changes in the provider). The reason is that people who use |
| an older version of Airflow will not be able to use that version of the provider (so it is not a breaking |
| change for them), and for people who use a supported version of Airflow it is not a breaking change on its |
| own - they will be able to use the new version without breaking their workflows. When we upgraded the |
| min-version to 2.2+, our approach was different, but as of the 2.3+ upgrade (November 2022) we only bump the |
| ``MINOR`` version of the provider when we increase the minimum Airflow version. |
| |
| Increasing the minimum Airflow version of the providers is one of the reasons why 3rd-party provider |
| maintainers might want to maintain their own providers - they can decide to support older versions of Airflow. |
| |
| 3rd-parties relation to community providers |
| ------------------------------------------- |
| |
| Providers can also be maintained and released by 3rd parties (and this is recommended for 3rd-party |
| services), but for multiple reasons we might decide to keep some providers community managed - mostly |
| due to the prevalence and popularity of the 3rd-party services and the use cases they serve among our |
| community. There are, however, certain conditions and expectations we have in order to do so. |
| |
| There is no difference between community and 3rd-party providers - they all have the same capabilities |
| and limitations. The consensus in the Airflow community is that it is usually better for the community |
| and for the health of the provider when it is managed by the 3rd-party team rather than by the |
| Airflow community. This is especially true when the provider concerns a 3rd-party service that has a team |
| able to manage the provider on their own. For the Airflow community, managing and releasing a 3rd-party |
| provider that we cannot test and verify involves a lot of effort and uncertainty, especially when the |
| external service is live and going to evolve in the future. It is better to let the 3rd-party team manage |
| it, as they can better keep pace with the changes in the service. |
| |
| Information about such 3rd-party providers is usually published on the |
| `Ecosystem: plugins and providers <https://airflow.apache.org/ecosystem/#third-party-airflow-plugins-and-providers>`_ |
| page of the Airflow website and we encourage the service providers to publish their providers there. You can |
| also find 3rd-party registries of such providers, which you can use when searching for existing providers |
| (they are also listed on the "Ecosystem" page in the same chapter). |
| |
| While we already have - historically - a number of 3rd-party service providers managed by the community, |
| most of those services have dedicated teams that keep an eye on the community providers and not only take |
| active part in managing them (see mixed-governance model below), but also provide a way that we can |
| verify whether the provider works with the latest version of the service via dashboards that show |
| status of System Tests for the provider. This allows us to have a high level of confidence that when we |
| release the provider it works with the latest version of the service. System Tests are part of the Airflow |
| code, but they are executed and verified by those 3rd party service teams. We are working with the 3rd |
| party service teams (who are often important stakeholders of the Apache Airflow project) to add dashboards |
| for the historical providers that are managed by the community. The current set of dashboards can be |
| found at |
| `Ecosystem: system test dashboards <https://airflow.apache.org/ecosystem/#airflow-provider-system-test-dashboards>`_. |
| |
| Mixed governance model for 3rd-party related community providers |
| ---------------------------------------------------------------- |
| |
| Providers are often connected with stakeholders who are vitally interested in maintaining backwards |
| compatibility in their integrations (for example cloud providers, or specific service providers). But |
| we are also bound by the `Apache Software Foundation release policy <https://www.apache.org/legal/release-policy.html>`_, |
| which describes who releases ASF software and how. We call the resulting governance model of such providers |
| ``mixed governance`` - we follow the release policies, while the burden of maintaining and testing |
| the cherry-picked versions is on those who commit to perform the cherry-picks and make PRs to older |
| branches. |
| |
| The "mixed governance" (optional, per-provider) means that: |
| |
| * The Airflow Community and release manager decide when to release those providers. |
| This is fully managed by the community and the usual release-management process following the |
| `Apache Software Foundation release policy <https://www.apache.org/legal/release-policy.html>`_ |
| * The contributors (who might or might not be direct stakeholders in the provider) will carry the burden |
| of cherry-picking and testing the older versions of providers. |
| * There is no "selection" and acceptance process to determine which version of the provider is released. |
| It is determined by the actions of contributors raising a PR with cherry-picked changes, and it follows |
| the usual PR review process where a maintainer approves (or not) and merges (or not) such a PR. Simply |
| speaking - the completed action of cherry-picking and testing the older version of the provider makes |
| it eligible to be released. Unless someone volunteers and performs the cherry-picking and |
| testing, the provider is not released. |
| * Branches to raise PRs against are created when a contributor commits to perform the cherry-picking |
| (for example in a comment on the PR to be cherry-picked) |
| |
| Usually, community effort is focused on the most recent version of each provider. The community approach is |
| that we should rather aggressively remove deprecations in "major" versions of the providers - whenever |
| there is an opportunity to increase the major version of a provider, we attempt to remove all deprecations. |
| However, sometimes there is a contributor (who might or might not represent a stakeholder) |
| willing to invest effort in cherry-picking and testing non-breaking changes on a selected, |
| previous major branch of the provider. This results in releasing at most two versions of a |
| provider at a time: |
| |
| * potentially breaking "latest" major version |
| * selected past major version with non-breaking changes applied by the contributor |
| |
| Cherry-picking such changes follows the same process as releasing Airflow |
| patch-level releases for a previous minor Airflow version. Usually such cherry-picking is done when |
| there is an important bugfix and the latest version contains breaking changes that are not |
| coupled with the bugfix. Releasing them together in the latest version of the provider would effectively |
| couple them, and therefore they are released separately. The cherry-picked changes have to be merged by a |
| committer following the usual rules of the community. |
| |
| There is no obligation to cherry-pick and release older versions of the providers. |
| The community continues to release such older versions of the providers for as long as contributors make |
| the effort to perform the cherry-picks and carry on testing the older provider version. |
| |
| The availability of a stakeholder who can manage "service-oriented" maintenance and agrees to take on such |
| a responsibility will also drive our willingness to accept new providers as community managed in the future. |
| |
| Suspending releases for providers |
| --------------------------------- |
| |
| If a provider is found to require old dependencies that are not compatible with upcoming versions of |
| Apache Airflow or with newer dependencies required by other providers, the provider's release |
| process can be suspended. |
| |
| This means: |
| |
| * The provider's state in ``provider.yaml`` is set to "suspended" |
| * No new releases of the provider will be made until the problem with dependencies is solved |
| * Sources of the provider remain in the repository for now (in the future we might add a process to remove them) |
| * No new changes will be accepted for the provider (other than the ones that fix the dependencies) |
| * The provider will be removed from the list of Apache Airflow extras in the next Airflow release |
| (including patch-level release if it is possible/easy to cherry-pick the suspension change) |
| * Tests of the provider will not be run on our CI (in main branch) |
| * Dependencies of the provider will not be installed in our main branch CI image nor included in constraints |
| * We can still decide to apply security fixes to released providers - by adding fixes to the main branch |
| and then cherry-picking, testing and releasing them in the patch-level branch of the provider, similarly to |
| the mixed governance model described above. |
| |
| The suspension may be triggered by any committer after the following criteria are met: |
| |
| * The maintainers of dependencies of the provider are notified about the issue and are given a reasonable |
| time to resolve it (at least 1 week) |
| * Other options to resolve the issue have been exhausted and there are good reasons for upgrading |
| the old dependencies in question |
| * An explanation of why we need to suspend the provider is stated in a public discussion on the devlist, |
| followed by a ``[LAZY CONSENSUS]`` or ``[VOTE]`` thread on the devlist (with the majority of the binding |
| votes agreeing that we should suspend the provider) |
| |
| The suspension will be lifted when the dependencies of the provider are made compatible with Apache |
| Airflow and with other providers - by merging a PR that removes the suspension, once that PR succeeds. |
| |
| Removing community providers |
| ---------------------------- |
| |
| Providers can be removed from the main branch of Airflow when the community agrees that the community |
| should make no more updates to them - with the possible exception of fixes for security issues that are |
| found. There might be various reasons for removing a provider: |
| |
| * the service they connect to is no longer available |
| * the dependencies for the provider are not maintained anymore and there is no viable alternative |
| * there is another, more popular provider that supersedes the community provider |
| * etc. etc. |
| |
| Each case of removing a provider should be discussed individually and a separate ``[VOTE]`` thread should be |
| started, where the regular rules for code modification apply (following the |
| `Apache Software Foundation voting rules <https://www.apache.org/foundation/voting.html#votes-on-code-modification>`_). |
| In cases where the reasons for removal are "obvious" and have been discussed before, a ``[LAZY CONSENSUS]`` |
| thread can be started instead. Generally speaking, a ``[DISCUSS]`` thread is advised before such a removal, |
| and sufficient time should pass (at least a week) to give community members a chance to express their |
| opinion on the removal. |
| |
| Removing a provider has the following consequences (or lack of them): |
| |
| * One last release of the provider is done with documentation updated informing that the provider is no |
| longer maintained by the Apache Airflow community - linking to this page. This information should also |
| find its way to the package documentation and consequently - to the description of the package in PyPI. |
| * An ``[ANNOUNCE]`` thread is sent to the devlist and user list announcing removal of the provider |
| * The released provider packages remain available on PyPI and in the |
| `Archives <https://archive.apache.org/dist/airflow/providers/>`_ of the Apache |
| Software Foundation, while they are removed from the |
| `Downloads <https://downloads.apache.org/airflow/providers/>`_. |
| The provider also remains in the index of the Apache Airflow Providers documentation at |
| `Airflow Documentation <https://airflow.apache.org/docs/>`_ with the note ``(not maintained)`` next to it. |
| * The code of the provider is removed from ``main`` branch of the Apache Airflow repository - including |
| the tests and documentation. It is no longer built in CI and dependencies of the provider no longer |
| contribute to the CI image/constraints of Apache Airflow for development and future ``MINOR`` release. |
| * The provider is removed from the list of Apache Airflow extras in the next ``MINOR`` Airflow release |
| * The dependencies of the provider are removed from the constraints of the Apache Airflow |
| (and the constraints are updated in the next ``MINOR`` release of Airflow) |
| * In case of confirmed security issues that need fixing and are reported for the provider after it has been |
| removed, there are two options: |
|   * if there is a viable alternative, or the provider is no longer useful to install anyway, we |
|     might issue an advisory to the users to remove the provider (and use alternatives if applicable) |
|   * if the users might still need the provider, we might still decide to release a new version of the |
|     provider with the security issue fixed, starting from the source code in Git history where the provider |
|     was last released. This, however, should only be done if there are no viable alternatives for the users. |
| * A removed provider may be reinstated as a maintained provider, but it needs to go through the regular |
| process of accepting a new provider described above. |