blob: e5802b24c41d1683ea5afee6f5f8c8cbdde0466c [file] [log] [blame]
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
.. contents:: :local:
Why non-standard pull request workflow?
---------------------------------------
This document describes the Pull Request Workflow we've implemented in Airflow. The workflow is slightly
more complex than regular workflow you might encounter in most of the projects because after experiencing
some huge delays in processing queues in October 2020 with GitHub Actions, we've decided to optimize the
workflow to minimize the use of GitHub Actions build time by utilising selective approach on which tests
and checks in the CI system are run depending on analysis of which files changed in the incoming PR and
allowing the Committers to control the scope of the tests during the approval/review process.
Just to give a bit of context, we started off with the approach that we always run all tests for all the
incoming PRs, however due to our matrix of tests growing, this approach did not scale with the increasing
number of PRs and when we had to compete with other Apache Software Foundation projects for the 180
slots that are available for the whole organization. More Apache Software Foundation projects started
to use GitHub Actions and we've started to experience long queues when our jobs waited for free slots.
We approached the problem by:
1) Improving mechanism of cancelling duplicate workflow runs more efficiently in case of queue conditions
(duplicate workflow runs are generated when someone pushes a fixup quickly - leading to running both
out-dated and current run to completion, taking precious slots. This has been implemented by improving
`cancel-workflow-run <https://github.com/potiuk/cancel-workflow-runs/>`_ action we are using. In version
4.1 it got a new feature of cancelling all duplicates even if there is a long queue of builds.
2) Heavily decreasing strain on the GitHub Actions jobs by introducing selective checks - mechanism
to control which parts of the tests are run during the tests. This is implemented by the
``scripts/ci/selective_ci_checks.sh`` script in our repository. This script analyses which part of the
code has changed and based on that it sets the right outputs that control which tests are executed in
the ``Tests`` workflow, and whether we need to build CI images necessary to run those steps. This allowed to
heavily decrease the strain especially for the Pull Requests that were not touching code (in which case
the builds can complete in < 2 minutes) but also by limiting the number of tests executed in PRs that do
not touch the "core" of Airflow, or only touching some - standalone - parts of Airflow such as
"Providers", "WWW" or "CLI". This solution is not yet perfect as there are likely some edge cases but
it is easy to maintain and we have an escape-hatch - all the tests are always executed in main pushes,
so contributors can easily spot if there is a "missed" case and fix it - both by fixing the problem and
adding those exceptions to the code. More about it can be found in `Selective checks <SELECTIVE_CHECKS.md>`_
3) Even more optimisation came from limiting the scope of tests to only "default" matrix parameters. So far
in Airflow we always run all tests for all matrix combinations. The primary matrix components are:
* Python versions (currently 3.6, 3.7, 3.8, 3.9)
* Backend types (currently MySQL/Postgres)
* Backed version (currently MySQL 5.7, MySQL 8, Postgres 13
We've decided that instead of running all the combinations of parameters for all matrix component we will
only run default values (Python 3.6, Mysql 5.7, Postgres 13) for all PRs which are not approved yet by
the committers. This has a nice effect, that full set of tests (though with limited combinations of
the matrix) are still run in the CI for every Pull Request that needs tests at all - allowing the
contributors to make sure that their PR is "good enough" to be reviewed.
Even after approval, the automated workflows we've implemented, check if the PR seems to need
"full test matrix" and provide helpful information to both contributors and committers in the form of
explanatory comments and labels set automatically showing the status of the PR. Committers have still
control whether they want to merge such requests automatically or ask for rebase or re-run the tests
and run "full tests" by applying the "full tests needed" label and re-running such request.
The "full tests needed" label is also applied automatically after approval when the change touches
the "core" of Airflow - also a separate check is added to the PR so that the "merge" button status
will indicate to the committer that full tests are still needed. The committer might still decide,
whether to merge such PR without the "full matrix". The "escape hatch" we have - i.e. running the full
matrix of tests in the "merge push" will enable committers to catch and fix such problems quickly.
More about it can be found in `Approval workflow and Matrix tests <#approval-workflow-and-matrix-tests>`_
chapter.
4) We've also applied (and received) funds to run self-hosted runners. They are used for ``main`` runs
and whenever the PRs are done by one of the maintainers. Maintainers can force using Public GitHub runners
by applying "use public runners" label to the PR before submitting it.
Approval Workflow and Matrix tests
----------------------------------
As explained above the approval and matrix tests workflow works according to the algorithm below:
1) In case of "no-code" changes - so changes that do not change any of the code or environment of
the application, no test are run (this is done via selective checks). Also no CI/PROD images are
build saving extra minutes. Such build takes less than 2 minutes currently and only few jobs are run
which is a very small fraction of the "full build" time.
2) When new PR is created, only a "default set" of matrix test are running. Only default
values for each of the parameters are used effectively limiting it to running matrix builds for only
one python version and one version of each of the backends. In this case only one CI and one PROD
image is built, saving precious job slots. This build takes around 50% less time than the "full matrix"
build.
3) When such PR gets approved, the system further analyses the files changed in this PR and further
decision is made that should be communicated to both Committer and Reviewer.
3a) In case of "no-code" builds, a message is communicated that the PR is ready to be merged and
no tests are needed.
.. image:: images/pr/pr-no-tests-needed-comment.png
:align: center
:alt: No tests needed for "no-code" builds
3b) In case of "non-core" builds a message is communicated that such PR is likely OK to be merged as is with
limited set of tests, but that the committer might decide to re-run the PR after applying
"full tests needed" label, which will trigger full matrix build for tests for this PR. The committer
might make further decision on what to do with this PR.
.. image:: images/pr/pr-likely-ok-to-merge.png
:align: center
:alt: Likely ok to merge the PR with only small set of tests
3c) In case of "core" builds (i. e. when the PR touches some "core" part of Airflow) a message is
communicated that this PR needs "full test matrix", the "full tests needed" label is applied
automatically and either the contributor might rebase the request to trigger full test build or the
committer might re-run the build manually to trigger such full test rebuild. Also a check "in-progress"
is added, so that the committer realises that the PR is not yet "green to merge". Pull requests with
"full tests needed" label always trigger the full matrix build when rebased or re-run so if the
PR gets rebased, it will continue triggering full matrix build.
.. image:: images/pr/pr-full-tests-needed.png
:align: center
:alt: Full tests are needed for the PR
4) If this or another committer "request changes" in a previously approved PR with "full tests needed"
label, the bot automatically removes the label, moving it back to "run only default set of parameters"
mode. For PRs touching core of airflow once the PR gets approved back, the label will be restored.
If it was manually set by the committer, it has to be restored manually.
.. note:: Note that setting the labels and adding comments might be delayed, due to limitation of GitHub Actions,
in case of queues, processing of Pull Request reviews might take some time, so it is advised not to merge
PR immediately after approval. Luckily, the comments describing the status of the PR trigger notifications
for the PRs and they provide good "notification" for the committer to act on a PR that was recently
approved.
The PR approval workflow is possible thanks to two custom GitHub Actions we've developed:
* `Get workflow origin <https://github.com/potiuk/get-workflow-origin/>`_
* `Label when approved <https://github.com/TobKed/label-when-approved-action>`_
Next steps
----------
We are planning to also propose the approach to other projects from Apache Software Foundation to
make it a common approach, so that our effort is not limited only to one project.
Discussion about it in `this discussion <https://lists.apache.org/thread.html/r1708881f52adbdae722afb8fea16b23325b739b254b60890e72375e1%40%3Cbuilds.apache.org%3E>`_