.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
Airflow Unit Tests
==================
All unit tests for Apache Airflow are run using `pytest <http://doc.pytest.org/en/latest/>`_.
**The outline for this document in GitHub is available at the top-right corner button (with 3 dots and 3 lines).**
Writing Unit Tests
------------------
Follow these guidelines when writing unit tests:
* For standard unit tests that do not require integrations with external systems, make sure to mock all communications with those systems.
* All Airflow tests are run with ``pytest``. Make sure to set your IDE/runners (see below) to use ``pytest`` by default.
* For tests, use standard "asserts" of Python and ``pytest`` decorators/context managers for testing
rather than ``unittest`` ones. See `pytest docs <http://doc.pytest.org/en/latest/assert.html>`__ for details.
* Use a ``pytest.mark.parametrize`` marker for tests that have variations in parameters.
See `pytest docs <https://docs.pytest.org/en/latest/how-to/parametrize.html>`__ for details.
* Use ``pytest.warns`` to capture warnings rather than the ``recwarn`` fixture. We are aiming for zero warnings in our
  tests, so we run pytest with ``--disable-warnings`` and instead use a custom warning capture system.
Handling warnings
.................
By default, selected warnings are prohibited in new tests:
* ``airflow.exceptions.AirflowProviderDeprecationWarning``
This means that if one of these warnings appears during a test run and is not captured, the test will fail.
.. code-block:: console
root@91e633d08aa8:/opt/airflow# pytest tests/models/test_dag.py::TestDag::test_clear_dag
...
FAILED tests/models/test_dag.py::TestDag::test_clear_dag[None-None] - airflow.exceptions.RemovedInAirflow3Warning: Calling `DAG.create_dagrun()` without an explicit data interval is deprecated
**NOTE:** As of Airflow 3.0 the test file ``tests/models/test_dag.py`` has been relocated to ``airflow-core/tests/unit/models/test_dag.py``.
To avoid this, make sure that:
* You do not use deprecated methods, classes and arguments in your test cases;
* Your change does not affect other components, e.g. deprecating one part of Airflow Core or one of the Community Supported
  Providers might be a reason for new deprecation warnings. In this case, changes should also be made in all affected
  components in a backward compatible way.
* You use the ``pytest.warns`` context manager (see the `pytest docs <https://docs.pytest.org/en/latest/how-to/capture-warnings.html#warns>`__)
  to catch warnings when testing deprecated components.
  (Yes, we still need to test legacy/deprecated functionality until it is completely removed.)
.. code-block:: python
def test_deprecated_argument():
with pytest.warns(AirflowProviderDeprecationWarning, match="expected warning pattern"):
SomeDeprecatedClass(foo="bar", spam="egg")
Mocking time-related functionality in tests
-------------------------------------------
Mocking sleep calls
...................
To speed up test execution and avoid unnecessary delays, you should mock sleep calls in tests or set the sleep time to 0.
If the method you're testing includes a call to ``time.sleep()`` or ``asyncio.sleep()``, mock these calls instead.
How to mock ``sleep()`` depends on how it's imported:
* If ``time.sleep`` is imported as ``import time``:
.. code-block:: python
@mock.patch("time.sleep", return_value=None)
def test_your_test():
pass
* If ``sleep`` is imported directly using ``from time import sleep``:
.. code-block:: python
@mock.patch("path.to.the.module.sleep", return_value=None)
def test_your_test():
pass
For methods that use ``asyncio.sleep()`` for async sleep calls, you can proceed in the same way.
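The sketch below is a minimal example, assuming ``pytest-asyncio`` is available in the test environment and the tested coroutine awaits ``asyncio.sleep()`` directly; adjust the patch target if ``sleep`` is imported differently:

.. code-block:: python

    from unittest import mock

    import pytest


    @pytest.mark.asyncio
    @mock.patch("asyncio.sleep", new_callable=mock.AsyncMock)
    async def test_async_code_without_waiting(mock_sleep):
        # The tested coroutine awaits asyncio.sleep(); the AsyncMock returns
        # immediately, so the test does not actually wait.
        ...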
**NOTE:** There are certain cases in which the method functioning correctly depends on actual time passing.
In those cases the test with the mock will fail. Then it's okay to leave it unmocked.
Use your judgment and prefer mocking whenever possible.
Controlling date and time
.........................
Some features rely on the current date and time, e.g. a function that generates timestamps, or the passage of time.
To test such features reliably, we use the ``time-machine`` library to control the system's time:
.. code-block:: python
@time_machine.travel(datetime(2025, 3, 27, 21, 58, 1, 2345), tick=False)
def test_log_message(self):
"""
The tested code uses datetime.now() to generate a timestamp.
Freezing time ensures the timestamp is predictable and testable.
"""
By setting ``tick=False``, time is frozen at the specified moment and does not advance during the test.
If you want time to progress from a fixed starting point, you can set ``tick=True``.
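For example, a minimal sketch (hypothetical test name) where time starts at the given moment and then keeps advancing because ``tick=True``:

.. code-block:: python

    from datetime import datetime

    import time_machine


    @time_machine.travel(datetime(2025, 3, 27, 21, 58, 1), tick=True)
    def test_measures_elapsed_time():
        # Time starts at the moment above and keeps ticking, so durations
        # measured by the tested code still behave realistically.
        ...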
Airflow configuration for unit tests
------------------------------------
Some of the unit tests require special configuration set as the ``default``. This is done automatically by
adding ``AIRFLOW__CORE__UNIT_TEST_MODE=True`` to the environment variables in a pytest auto-used
fixture. This in turn makes Airflow load the test configuration from the file
``airflow/config_templates/unit_tests.cfg``. The test configuration from there replaces the original
defaults from ``airflow/config_templates/config.yml``. If you want to add some test-only configuration
as a default for all tests, you should add the value to this file.
You can also, of course, override the values in an individual test by patching environment variables following
the usual ``AIRFLOW__SECTION__KEY`` pattern or by using the ``conf_vars`` context manager, as shown below.
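A minimal sketch of both approaches, assuming the ``conf_vars`` helper is importable from ``tests_common.test_utils.config`` (the exact import path may differ between Airflow versions):

.. code-block:: python

    from unittest import mock

    from tests_common.test_utils.config import conf_vars


    def test_with_patched_env_var():
        # Override a configuration value via the AIRFLOW__SECTION__KEY pattern.
        with mock.patch.dict("os.environ", {"AIRFLOW__CORE__EXECUTOR": "SequentialExecutor"}):
            ...


    def test_with_conf_vars():
        # Override the same value via conf_vars, keyed by (section, key) tuples.
        with conf_vars({("core", "executor"): "SequentialExecutor"}):
            ...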
Airflow unit test types
-----------------------
Airflow tests in the CI environment are split into several test types. You can narrow down which
test types you want to use in various ``breeze testing`` sub-commands in the following ways:
* via specifying ``--test-type`` when you run a single test type in the ``breeze testing core-tests``,
  ``breeze testing providers-tests`` or ``breeze testing integration-tests`` commands
* via specifying a space-separated list of test types via the ``--parallel-test-types`` or
  ``--excluded-parallel-test-types`` options when you run tests in parallel (in several testing commands)
The following test types are defined:
* ``Always`` - those are tests that should be always executed (always sub-folder)
* ``API`` - Tests for the Airflow API (api, api_internal, api_fastapi sub-folders)
* ``CLI`` - Tests for the Airflow CLI (cli folder)
* ``Core`` - for the core Airflow functionality (core, executors, jobs, models, ti_deps, utils sub-folders)
* ``Operators`` - tests for the operators (operators folder)
* ``WWW`` - Tests for the Airflow webserver (www folder)
* ``Providers`` - Tests for all Providers of Airflow (providers folder)
* ``Other`` - all other tests remaining after the above tests are selected
We also have test types that run "all" tests (they do not look at the folder, but at the ``pytest`` markers
the tests are marked with) with some filters applied:
* ``All-Postgres`` - tests that require Postgres database. They are only run when backend is Postgres (``backend("postgres")`` marker)
* ``All-MySQL`` - tests that require MySQL database. They are only run when backend is MySQL (``backend("mysql")`` marker)
* ``All-Quarantined`` - tests that are flaky and need to be fixed (``quarantined`` marker)
* ``All`` - all tests are run (this is the default)
We also have ``Integration`` tests that run integration tests against external software, enabled
via the ``--integration`` flag in the ``breeze`` environment - via ``breeze testing integration-tests``.
* ``Integration`` - tests that require external integration images running in docker-compose
This is done for two reasons:
1. in order to selectively run only a subset of the test types for some PRs
2. in order to allow efficient parallel execution of the tests on Self-Hosted runners
For case 2, we can utilize the memory and CPUs available on both CI and local development machines to run
tests in parallel, but we cannot use the pytest-xdist plugin for that. Instead, we split the tests into test
types and run each test type in a separate container with its own instance of the database, so that the tests
in each type have exclusive access to their database and each test within a test type runs sequentially.
This is required by the nature of those tests - they rely on shared databases, and they update/reset/clean up data in the
databases while they are executing.
DB and non-DB tests
-------------------
There are two kinds of unit tests in Airflow - DB and non-DB tests. This chapter describes the differences
between those two types.
Airflow non-DB tests
....................
The non-DB tests are run once for each tested Python version with the ``none`` database backend (which
causes any database access to fail). Those tests are run in parallel with the ``pytest-xdist`` plugin, which
means that we can efficiently utilise multi-processor machines (including the ``self-hosted`` runners with
8 CPUs that we have) to run the tests with maximum parallelism.
It's usually straightforward to run those tests in a local virtualenv because they do not require any
setup or a running database. They also run much faster than DB tests. You can run them with the ``pytest`` command
or with ``breeze``, which has all the dependencies needed to run the tests automatically installed. Of course,
you can also select just a specific test, folder or module for pytest to collect/run tests from.
The example below shows how to run all tests, parallelizing them with ``pytest-xdist``
(by specifying the ``tests`` folder):
.. code-block:: bash
pytest airflow-core/tests --skip-db-tests -n auto
The ``--skip-db-tests`` flag will only run tests that are not marked as DB tests.
You can also run a ``breeze`` command to run all the tests (they will run in a separate container, with
the selected Python version and without access to any database). Adding the ``--use-xdist`` flag will run all
tests in parallel using the ``pytest-xdist`` plugin.
You can run parallel commands via ``breeze testing core-tests`` or ``breeze testing providers-tests``
- by adding the parallel flags:
.. code-block:: bash
breeze testing core-tests --skip-db-tests --backend none --use-xdist
You can pass a ``--parallel-test-types`` list of test types to execute or ``--excluded-parallel-test-types``
to exclude them from the default set:
.. code-block:: bash
breeze testing providers-tests --run-in-parallel --skip-db-tests --backend none --parallel-test-types "Providers[google] Providers[amazon]"
You can also enter the interactive shell with ``breeze`` and run tests from there if you want to iterate
on the tests. Source files in ``breeze`` are mounted as volumes, so you can modify them locally and
rerun them in Breeze as you wish (``-n auto`` will parallelize tests using the ``pytest-xdist`` plugin):
.. code-block:: bash
breeze shell --backend none --python 3.10
> pytest airflow-core/tests --skip-db-tests -n auto
Airflow DB tests
................
Some of the tests of Airflow require a database to connect to in order to run. Those tests store and read data
from the Airflow DB using Airflow's core code, and it's crucial to run them against all real databases
that Airflow supports in order to check if the SQLAlchemy queries are correct and if the database schema is
correct.
Those tests should be marked with the ``@pytest.mark.db_test`` decorator on one of the levels:
* a test method can be marked with the ``@pytest.mark.db_test`` decorator
* a test class can be marked with the ``@pytest.mark.db_test`` decorator
* a test module can be marked with ``pytestmark = pytest.mark.db_test`` at the top level of the module
The DB tests are run against the multiple databases Airflow supports, multiple versions of those databases,
and the multiple Python versions Airflow supports. In order to save testing time, not all combinations are
tested, but enough combinations are tested to detect potential problems.
By default, the DB tests use sqlite and the ``airflow.db`` database created and populated in the
``${AIRFLOW_HOME}`` folder. You do not need to do anything to get the database created and initialized,
but if you need to clean and restart the db, you can run tests with the ``--with-db-init`` flag - then the
database will be re-initialized. You can also set the ``AIRFLOW__DATABASE__SQL_ALCHEMY_CONN`` environment
variable to point to a supported database (Postgres, MySQL, etc.) and the tests will use that database. You
might need to run ``airflow db reset`` to initialize the database in that case.
The "non-DB" tests are perfectly fine to run when you have database around but if you want to just run
DB tests (as happens in our CI for the ``Database`` runs) you can use ``--run-db-tests-only`` flag to filter
out non-DB tests (and obviously you can specify not only on the whole ``tests`` directory but on any
folders/files/tests selection, ``pytest`` supports).
.. code-block:: bash
pytest airflow-core/tests --run-db-tests-only
You can also run DB tests with the ``breeze`` dockerized environment. You can choose the backend to use with
the ``--backend`` flag. The default is ``sqlite``, but you can also use others such as ``postgres`` or ``mysql``.
You can also select the backend version and the Python version to use. You can specify the ``test-type`` to run -
breeze will list the test types you can run with ``--help`` and provide auto-complete for them. The examples
below run tests with the ``postgres`` backend and Python ``3.10``.
You can run the commands via ``breeze testing core-tests`` or ``breeze testing providers-tests``
- by adding the parallel flags manually:
.. code-block:: bash
breeze testing core-tests --run-db-tests-only --backend postgres --run-in-parallel
You can pass a ``--parallel-test-types`` list of test types to execute or ``--excluded-parallel-test-types``
to exclude them from the default set:
.. code-block:: bash
breeze testing providers-tests --run-in-parallel --run-db-tests-only --parallel-test-types "Providers[google] Providers[amazon]"
Also, if you want to iterate on the tests, you can enter the interactive shell and run the tests iteratively -
either by package/module/test or by test type - whatever ``pytest`` supports.
.. code-block:: bash
breeze shell --backend postgres --python 3.10
> pytest airflow-core/tests --run-db-tests-only
As explained before, you cannot run DB tests in parallel using ``pytest-xdist`` plugin, but ``breeze`` has
support to split all the tests into test-types to run in separate containers and with separate databases
and you can run the tests using ``--run-in-parallel`` flag.
.. code-block:: bash
breeze testing core-tests --run-db-tests-only --backend postgres --python 3.10 --run-in-parallel
Examples of marking test as DB test
...................................
You can apply the marker on method/function/class level with ``@pytest.mark.db_test`` decorator or
at the module level with ``pytestmark = pytest.mark.db_test`` at the top level of the module.
It's up to the author to decide whether to mark the test, class, or module as a "DB test" - generally, the
fewer DB tests the better, and if we can clearly separate the parts that are DB from non-DB, we should.
However, it's also OK if a few tests are marked as DB tests even when they are not, because they are part of a class
or module that is "mostly DB".
Sometimes, when your class can be clearly split into DB and non-DB parts, it's better to split the class
into two separate classes and mark only the DB class as a DB test, as sketched below.
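A minimal sketch of such a split, with hypothetical class and test names:

.. code-block:: python

    import pytest


    class TestMyOperatorLogic:
        # Pure unit tests - no database access needed.
        def test_render_command(self):
            ...


    @pytest.mark.db_test
    class TestMyOperatorWithDagRun:
        # These tests create DagRuns / TaskInstances, so they need the database.
        def test_execute_with_dagrun(self):
            ...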
Method level:
.. code-block:: python
import pytest
@pytest.mark.db_test
def test_add_tagging(self, sentry, task_instance): ...
Class level:
.. code-block:: python
import pytest
@pytest.mark.db_test
class TestDatabricksHookAsyncAadTokenSpOutside: ...
Module level (at the top of the module):
.. code-block:: python
import pytest
from airflow.models.baseoperator import BaseOperator
from airflow.models.dag import DAG
from airflow.ti_deps.dep_context import DepContext
from airflow.ti_deps.deps.task_concurrency_dep import TaskConcurrencyDep
pytestmark = pytest.mark.db_test
Best practices for DB tests
...........................
Usually when you add new tests, you add tests "similar" to the ones that are already there. In most cases,
therefore, you do not have to worry about the test type - it will be automatically selected for you by the
fact that the test class to which you add the tests, or the whole module, is already marked with the ``db_test`` marker.
You should strive to write "pure" non-DB unit tests, but sometimes it's just better to plug into
the existing framework of DagRuns, Dags, Connections and Variables and use the database directly rather
than having to mock the DB access, for example. It's up to you to decide.
However, if you choose to write DB tests you have to make sure you add the ``db_test`` marker - either to
the test method, class (with decorator) or whole module (with pytestmark at the top level of the module).
In most cases when you add tests to existing modules or classes, you follow similar tests so you do not
have to do anything, but in some cases you need to decide if your test should be marked as DB test or
whether it should be changed to not use the database at all.
If your test accesses the database but is not marked properly, the non-DB test run in CI will fail with this message:
.. code-block:: text

    Your test accessed the DB but `_AIRFLOW_SKIP_DB_TESTS` is set.
    Either make sure your test does not use database or mark your test with `@pytest.mark.db_test`.
How to verify if DB test is correctly classified
................................................
If you want to see if your DB test is correctly classified, you can run the test or group
of tests with the ``--skip-db-tests`` flag.
You can run all (or a subset of) the test types if you want to make sure all of the problems are fixed:
.. code-block:: bash
breeze testing core-tests --skip-db-tests tests/your_test.py
For the whole test suite you can run:
.. code-block:: bash
breeze testing core-tests --skip-db-tests
For selected test types (in the example below, the tests will run for selected ``Providers`` code only):
.. code-block:: bash
breeze testing providers-tests --skip-db-tests --parallel-test-types "Providers[google] Providers[amazon]"
You can also enter the interactive shell with the ``--skip-db-tests`` flag and run the tests iteratively:
.. code-block:: bash
breeze shell --skip-db-tests
> pytest tests/your_test.py
How to make your test not depend on DB
......................................
This is tricky and there is no single solution. Sometimes we can mock out the methods that require
DB access or the objects that normally require a database. Sometimes we can decide to test just a single method
of a class rather than a more complex set of steps. Generally speaking, it's good to have as many "pure"
unit tests that require no DB as possible, compared to DB tests. They are usually faster and more
reliable as well.
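For example, a minimal sketch (hypothetical test name) where the connection lookup is patched so the test never touches the database:

.. code-block:: python

    from unittest import mock

    from airflow.models import Connection


    @mock.patch("airflow.hooks.base.BaseHook.get_connection")
    def test_hook_uses_connection_host(mock_get_connection):
        # Return an in-memory Connection instead of reading one from the DB.
        mock_get_connection.return_value = Connection(conn_id="my_conn", host="example.com")
        ...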
Special cases
.............
There are some tricky test cases that require special handling. Here are some of them:
Parameterized tests stability
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The parameterized tests require stable order of parameters if they are run via ``xdist`` - because the parameterized
tests are distributed among multiple processes and handled separately. In some cases the parameterized tests
have undefined / random order (or parameters are not hashable - for example set of enums). In such cases
the xdist execution of the tests will fail and you will get an error mentioning "Known Limitations of xdist".
You can see details about the limitation `here <https://pytest-xdist.readthedocs.io/en/latest/known-limitations.html>`_
The error in this case will look similar to:
.. code-block::
Different tests were collected between gw0 and gw7. The difference is:
The fix for that is to sort the parameters in ``parametrize``. For example instead of this:
.. code-block:: python
@pytest.mark.parametrize("status", ALL_STATES)
def test_method(): ...
do that:
.. code-block:: python
@pytest.mark.parametrize("status", sorted(ALL_STATES))
def test_method(): ...
Similarly if your parameters are defined as result of ``utcnow()`` or other dynamic method - you should
avoid that, or assign unique IDs for those parametrized tests. Instead of this:
.. code-block:: python
@pytest.mark.parametrize(
"url, expected_dag_run_ids",
[
(
f"api/v1/dags/TEST_DAG_ID/dagRuns?end_date_gte="
f"{urllib.parse.quote((timezone.utcnow() + timedelta(days=1)).isoformat())}",
[],
),
(
f"api/v1/dags/TEST_DAG_ID/dagRuns?end_date_lte="
f"{urllib.parse.quote((timezone.utcnow() + timedelta(days=1)).isoformat())}",
["TEST_DAG_RUN_ID_1", "TEST_DAG_RUN_ID_2"],
),
],
)
def test_end_date_gte_lte(url, expected_dag_run_ids): ...
Do this:
.. code-block:: python
@pytest.mark.parametrize(
"url, expected_dag_run_ids",
[
pytest.param(
f"api/v1/dags/TEST_DAG_ID/dagRuns?end_date_gte="
f"{urllib.parse.quote((timezone.utcnow() + timedelta(days=1)).isoformat())}",
[],
id="end_date_gte",
),
pytest.param(
f"api/v1/dags/TEST_DAG_ID/dagRuns?end_date_lte="
f"{urllib.parse.quote((timezone.utcnow() + timedelta(days=1)).isoformat())}",
["TEST_DAG_RUN_ID_1", "TEST_DAG_RUN_ID_2"],
id="end_date_lte",
),
],
)
def test_end_date_gte_lte(url, expected_dag_run_ids): ...
Problems with Non-DB test collection
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Sometimes, even if the whole module is marked as ``@pytest.mark.db_test``, parsing the file and collecting
tests will fail when ``--skip-db-tests`` is used because some of the imports or objects created in the
module will read the database.
Usually what helps is to move such initialization code inside the tests or pytest fixtures (and pass
objects needed by tests as fixtures rather than importing them from the module). Similarly, you might
use DB-bound objects (like ``Connection``) in your ``parametrize`` specification - this will also fail pytest
collection. Move the creation of such objects inside the tests:
Moving object creation from the top level to inside the tests: the code below will break collection of tests even if
the test is marked as a DB test:
.. code-block:: python
TI = TaskInstance(
task=BashOperator(task_id="test", bash_command="true", dag=DAG(dag_id="id"), start_date=datetime.now()),
run_id="fake_run",
state=State.RUNNING,
)
class TestCallbackRequest:
@pytest.mark.parametrize(
"input,request_class",
[
(CallbackRequest(full_filepath="filepath", msg="task_failure"), CallbackRequest),
(
TaskCallbackRequest(
full_filepath="filepath",
simple_task_instance=SimpleTaskInstance.from_ti(ti=TI),
is_failure_callback=True,
),
TaskCallbackRequest,
),
(
DagCallbackRequest(
full_filepath="filepath",
dag_id="fake_dag",
run_id="fake_run",
is_failure_callback=False,
),
DagCallbackRequest,
),
(
SlaCallbackRequest(
full_filepath="filepath",
dag_id="fake_dag",
),
SlaCallbackRequest,
),
],
)
def test_from_json(self, input, request_class): ...
Instead - this will not break collection. The ``TaskInstance`` is not initialized when the module is parsed,
it will only be initialized when the test gets executed because we moved initialization of it from
top level / parametrize to inside the test:
.. code-block:: python
pytestmark = pytest.mark.db_test
class TestCallbackRequest:
@pytest.mark.parametrize(
"input,request_class",
[
(CallbackRequest(full_filepath="filepath", msg="task_failure"), CallbackRequest),
(
None, # to be generated when test is run
TaskCallbackRequest,
),
(
DagCallbackRequest(
full_filepath="filepath",
dag_id="fake_dag",
run_id="fake_run",
is_failure_callback=False,
),
DagCallbackRequest,
),
(
SlaCallbackRequest(
full_filepath="filepath",
dag_id="fake_dag",
),
SlaCallbackRequest,
),
],
)
def test_from_json(self, input, request_class):
if input is None:
ti = TaskInstance(
task=BashOperator(
task_id="test", bash_command="true", dag=DAG(dag_id="id"), start_date=datetime.now()
),
run_id="fake_run",
state=State.RUNNING,
)
input = TaskCallbackRequest(
full_filepath="filepath",
simple_task_instance=SimpleTaskInstance.from_ti(ti=ti),
is_failure_callback=True,
)
Sometimes it is difficult to rewrite the tests, so you might add conditional handling and mock out some
database-bound methods or objects to avoid hitting the database during test collection. The code below
will hit the database while parsing the tests, because this is what ``Variable.setdefault`` does when the
parametrize specification is being parsed - even if the test is marked as a DB test.
.. code-block:: python
from airflow.models.variable import Variable
pytestmark = pytest.mark.db_test
initial_db_init()
@pytest.mark.parametrize(
"env, expected",
[
pytest.param(
{"plain_key": "plain_value"},
"{'plain_key': 'plain_value'}",
id="env-plain-key-val",
),
pytest.param(
{"plain_key": Variable.setdefault("plain_var", "banana")},
"{'plain_key': 'banana'}",
id="env-plain-key-plain-var",
),
pytest.param(
{"plain_key": Variable.setdefault("secret_var", "monkey")},
"{'plain_key': '***'}",
id="env-plain-key-sensitive-var",
),
pytest.param(
{"plain_key": "{{ var.value.plain_var }}"},
"{'plain_key': '{{ var.value.plain_var }}'}",
id="env-plain-key-plain-tpld-var",
),
],
)
def test_rendered_task_detail_env_secret(patch_app, admin_client, request, env, expected): ...
You can make the code conditional and mock out the ``Variable`` to avoid hitting the database.
.. code-block:: python
from airflow.models.variable import Variable
pytestmark = pytest.mark.db_test
if os.environ.get("_AIRFLOW_SKIP_DB_TESTS") == "true":
# Handle collection of the test by non-db case
Variable = mock.MagicMock() # type: ignore[misc] # noqa: F811
else:
initial_db_init()
@pytest.mark.parametrize(
"env, expected",
[
pytest.param(
{"plain_key": "plain_value"},
"{'plain_key': 'plain_value'}",
id="env-plain-key-val",
),
pytest.param(
{"plain_key": Variable.setdefault("plain_var", "banana")},
"{'plain_key': 'banana'}",
id="env-plain-key-plain-var",
),
pytest.param(
{"plain_key": Variable.setdefault("secret_var", "monkey")},
"{'plain_key': '***'}",
id="env-plain-key-sensitive-var",
),
pytest.param(
{"plain_key": "{{ var.value.plain_var }}"},
"{'plain_key': '{{ var.value.plain_var }}'}",
id="env-plain-key-plain-tpld-var",
),
],
)
def test_rendered_task_detail_env_secret(patch_app, admin_client, request, env, expected): ...
You can also use a fixture to create an object that needs database access, like this:
.. code-block:: python
from airflow.models import Connection
pytestmark = pytest.mark.db_test
@pytest.fixture()
def get_connection1():
return Connection()
@pytest.fixture()
def get_connection2():
return Connection(host="apache.org", extra={})
@pytest.mark.parametrize(
"conn",
[
"get_connection1",
"get_connection2",
],
)
def test_as_json_from_connection(request, conn: str):
    conn = request.getfixturevalue(conn)
    ...
Running Unit tests
------------------
Running Unit Tests from PyCharm IDE
...................................
To run unit tests from the PyCharm IDE, create the `local virtualenv <../07_local_virtualenv.rst>`_,
select it as the default project's environment, then configure your test runner:
.. image:: images/pycharm/configure_test_runner.png
:align: center
:alt: Configuring test runner
and run unit tests as follows:
.. image:: images/pycharm/running_unittests.png
:align: center
:alt: Running unit tests
**NOTE:** You can run the unit tests in the standalone local virtualenv
(with no Breeze installed) if they do not have dependencies such as
Postgres/MySQL/Hadoop/etc.
Running Unit Tests from PyCharm IDE using Breeze
................................................
Ideally, all unit tests should be run using the standardized Breeze environment. While not
as convenient as the one-click "play button" in PyCharm, the IDE can be configured to do
this in two clicks.
1. Add Breeze as an "External Tool":
a. From the settings menu, navigate to ``Tools > External Tools``
b. Click the little plus symbol to open the ``Create Tool`` popup and fill it out:
.. image:: images/pycharm/pycharm_create_tool.png
:align: center
:alt: Installing Python extension
2. Add the tool to the context menu:
a. From the settings menu, navigate to ``Appearance & Behavior > Menus & Toolbars > Project View Popup Menu``
b. Click on the list of entries where you would like it to be added. Right above or below ``Project View Popup Menu Run Group`` may be a good choice; you can drag and drop this list to rearrange the placement later as desired.
c. Click the little plus at the top of the popup window
d. Find your ``External Tool`` in the new ``Choose Actions to Add`` popup and click OK. If you followed the image above, it will be at ``External Tools > External Tools > Breeze``
**Note:** This only adds the option to that one menu. If you would like to add it to the context menu
when right-clicking on a tab at the top of the editor, for example, follow the steps above again
and place it in the ``Editor Tab Popup Menu``.
.. image:: images/pycharm/pycharm_add_to_context.png
:align: center
:alt: Installing Python extension
3. To run tests in Breeze, right click on the file or directory in the ``Project View`` and click Breeze.
Running Unit Tests from Visual Studio Code
..........................................
To run unit tests from Visual Studio Code:
1. Using the ``Extensions`` view, install the Python extension, and reload if required
.. image:: images/vscode_install_python_extension.png
:align: center
:alt: Installing Python extension
2. Using the ``Testing`` view, click on ``Configure Python Tests`` and select the ``pytest`` framework
.. image:: images/vscode_configure_python_tests.png
:align: center
:alt: Configuring Python tests
.. image:: images/vscode_select_pytest_framework.png
:align: center
:alt: Selecting pytest framework
3. Open ``/.vscode/settings.json`` and add ``"python.testing.pytestArgs": ["tests"]`` to enable tests discovery
.. image:: images/vscode_add_pytest_settings.png
:align: center
:alt: Enabling tests discovery
4. Now you are able to run and debug tests from both the ``Testing`` view and test files
.. image:: images/vscode_run_tests.png
:align: center
:alt: Running tests
Running Unit Tests in local virtualenv
......................................
To run unit, integration, and system tests from Breeze or from your local
virtualenv, you can use the `pytest <http://doc.pytest.org/en/latest/>`_ framework.
A custom ``pytest`` plugin runs ``airflow db init`` and ``airflow db reset`` the first
time you launch the tests, so you can count on the database being initialized. Currently,
when you run tests that are not supported in the local virtualenv, they may either fail
or provide an error message.
There are many available options for selecting a specific test in ``pytest``. Details can be found
in the official documentation, but here are a few basic examples:
.. code-block:: bash
pytest airflow-core/tests/unit/core -k "TestCore and not check"
This runs the ``TestCore`` class but skips tests of this class that include 'check' in their names.
For better performance (due to shorter test collection), run:
.. code-block:: bash
pytest airflow-core/tests/unit/core/test_core.py -k "TestCore and not bash"
This flag is useful when used to run a single test like this:
.. code-block:: bash
pytest airflow-core/tests/unit/core/test_core.py -k "test_check_operators"
This can also be done by specifying a full path to the test:
.. code-block:: bash
pytest airflow-core/tests/unit/core/test_core.py::TestCore::test_dag_params_and_task_params
To run the whole test class, enter:
.. code-block:: bash
pytest airflow-core/tests/unit/core/test_core.py::TestCore
You can use all available ``pytest`` flags. For example, to increase a log level
for debugging purposes, enter:
.. code-block:: bash
pytest --log-cli-level=DEBUG airflow-core/tests/unit/core/test_core.py::TestCore
Running Tests using Breeze interactive shell
............................................
You can run tests interactively using regular pytest commands inside the Breeze shell. This has the
advantage that the Breeze container has all the dependencies installed that are needed to run the tests,
and it will ask you to rebuild the image if needed and if some new dependencies should be installed.
By using the interactive shell and iterating over the tests, you can re-run tests one by one
or group by group right after you modify them.
Entering the shell is as easy as:
.. code-block:: bash
breeze
This should drop you into the container.
You can also use other switches (like ``--backend``, for example) to configure the environment for your
tests (for example, to switch to a different database backend - see ``--help`` for more details).
Once you enter the container, you can run regular pytest commands. For example:
.. code-block:: bash
pytest --log-cli-level=DEBUG airflow-core/tests/unit/core/test_core.py::TestCore
Running Tests using Breeze from the Host
........................................
If you wish to only run tests and not drop into the shell, use the ``breeze testing`` commands.
You can add extra targets and pytest flags after the command. Note that
often you want to run the tests with a clean/reset db, so usually you want to add the ``--db-reset`` flag
to the breeze command. The Breeze image usually has all the dependencies needed, and it
will ask you to rebuild the image if needed and if some new dependencies should be installed.
.. code-block:: bash
breeze testing providers-tests providers/http/tests/http/hooks/test_http.py airflow-core/tests/unit/core/test_core.py --db-reset --log-cli-level=DEBUG
You can run the whole core test suite without adding the test target:
.. code-block:: bash
breeze testing core-tests --db-reset
You can run the whole providers test suite without adding the test target:
.. code-block:: bash
breeze testing providers-tests --db-reset
You can also specify individual tests or a group of tests:
.. code-block:: bash
breeze testing core-tests --db-reset airflow-core/tests/unit/core/test_core.py::TestCore
You can also limit the tests to execute to a specific group of tests:
.. code-block:: bash
breeze testing core-tests --test-type Other
In the case of Providers tests, you can run tests for all providers:
.. code-block:: bash
breeze testing providers-tests --test-type Providers
You can limit the set of providers for which you would like to run tests:
.. code-block:: bash
breeze testing providers-tests --test-type "Providers[airbyte,http]"
You can also run all providers but exclude the providers you would like to skip:
.. code-block:: bash
breeze testing providers-tests --test-type "Providers[-amazon,google]"
Sometimes you need to inspect the docker compose environment after the tests command completes,
for example when the test environment could not be set up properly due to
failed health checks. This can be achieved with the ``--skip-docker-compose-down``
flag:
.. code-block:: bash
breeze testing core-tests --skip-docker-compose-down
Running full Airflow unit test suite in parallel
................................................
If you run ``breeze testing core-tests --run-in-parallel`` or
``breeze testing providers-tests --run-in-parallel``, tests are executed in parallel
on your development machine, using as many cores as are available to the Docker engine.
If your Docker environment has limited memory (less than 8 GB), then ``Integration``, ``Provider``,
and ``Core`` tests are run sequentially, with the Docker setup cleaned between test runs
to minimize memory usage.
This approach allows for a massive speedup in full test execution. On a machine with 8 CPUs
(16 cores), 64 GB of RAM, and a fast SSD, the full suite of tests can complete in about
5 minutes (!) — compared to more than 30 minutes when run sequentially.
.. note::
On MacOS you might have fewer CPUs and less memory available to run the tests than you have on the host,
simply because your Docker engine runs in a Linux virtual machine under the hood. If you want to make
use of the parallelism and memory for the CI tests, you might want to increase the resources available
to your Docker engine. See the `Resources <https://docs.docker.com/docker-for-mac/#resources>`_ chapter
in the ``Docker for Mac`` documentation on how to do it.
You can also limit the parallelism by specifying the maximum number of parallel jobs via
``MAX_PARALLEL_TEST_JOBS`` variable. If you set it to "1", all the test types will be run sequentially.
.. code-block:: bash
MAX_PARALLEL_TEST_JOBS="1" ./scripts/ci/testing/ci_run_airflow_testing.sh
.. note::
In case you would like to clean up after execution of such tests, you might have to clean up
some of the running docker containers, especially if you used ctrl-c to stop the execution. You can easily do it by
running this command (it will kill all running docker containers, so do not use it if you want to keep some
of them running):
.. code-block:: bash
docker kill $(docker ps -q)
Running Backend-Specific Tests
..............................
Tests that are using a specific backend are marked with a custom pytest marker ``pytest.mark.backend``.
The marker has a single parameter - the name of a backend. It corresponds to the ``--backend`` switch of
the Breeze environment (one of ``mysql``, ``sqlite``, or ``postgres``). Backend-specific tests only run when
the Breeze environment is running with the right backend. If you specify more than one backend
in the marker, the test runs for all specified backends.
Example of the ``postgres`` only test:
.. code-block:: python
@pytest.mark.backend("postgres")
def test_copy_expert(self): ...
Example of the ``postgres,mysql`` test (they are skipped with the ``sqlite`` backend):
.. code-block:: python
@pytest.mark.backend("postgres", "mysql")
def test_celery_executor(self): ...
You can use the custom ``--backend`` switch in pytest to only run tests specific for that backend.
Here is an example of running only postgres-specific backend tests:
.. code-block:: bash
pytest --backend postgres
Running Long-running tests
..........................
Some of the tests run for a long time. Such tests are marked with the ``@pytest.mark.long_running`` annotation.
Those tests are skipped by default. You can enable them with the ``--include-long-running`` flag. You
can also run only those tests using the ``-m long_running`` flag.
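A minimal sketch of marking a test as long-running (hypothetical test name):

.. code-block:: python

    import pytest


    @pytest.mark.long_running
    def test_full_backfill_of_large_dag():
        # Skipped by default; enabled with --include-long-running
        # or selected with -m long_running.
        ...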
Running Quarantined tests
.........................
Some of our tests are quarantined. This means that the test will be run in isolation and that it will be
re-run several times. Also, when quarantined tests fail, the whole test suite will not fail. The quarantined
tests are usually flaky tests that need some attention and fixing.
Those tests are marked with ``@pytest.mark.quarantined`` annotation.
Those tests are skipped by default. You can enable them with the ``--include-quarantined`` flag. You
can also run only those tests using the ``-m quarantined`` flag.
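A minimal sketch (hypothetical test name):

.. code-block:: python

    import pytest


    @pytest.mark.quarantined
    def test_flaky_scheduler_behaviour():
        # Skipped by default; enabled with --include-quarantined
        # or selected with -m quarantined.
        ...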
Compatibility Provider unit tests against older Airflow releases
----------------------------------------------------------------
Why we run provider compatibility tests
.......................................
Our CI runs provider tests against previous compatible Airflow releases. This allows us to check
whether the providers still work when installed with older Airflow versions.
The back-compatibility tests are based on the configuration specified in the
``PROVIDERS_COMPATIBILITY_TESTS_MATRIX`` constant in the ``./dev/breeze/src/airflow_breeze/global_constants.py``
file - where we specify:
* Python version
* Airflow version
* which providers should be removed for the tests (exclusions)
* whether to run tests for this Airflow/Python version
Those tests can be used to test compatibility of the providers with past (and future!) releases of Airflow.
For example, they could be used to run the latest provider versions with released or main
Airflow 3 if the providers are developed independently.
The tests use the current source version of the ``tests`` folder and the current ``providers`` - so care should be
taken that the tests implemented for providers in the sources allow running them against previous versions
of Airflow and against Airflow installed from the PyPI package rather than from the sources.
Running the compatibility tests locally
.......................................
Running the tests can easily be done locally by running the appropriate ``breeze`` command. In CI the command
is slightly different, as it is run using providers built as wheel packages, but it is faster
to run the tests locally and easier to iterate if you need to fix a provider, using provider sources mounted
directly into the container.
1. Make sure to build the latest Breeze CI image:
.. code-block:: bash
breeze ci-image build --python 3.9
2. Enter breeze environment by selecting the appropriate Airflow version and choosing
``providers-and-tests`` option for ``--mount-sources`` flag.
.. code-block:: bash
breeze shell --use-airflow-version 2.9.1 --mount-sources providers-and-tests
3. You can then run tests as usual:
.. code-block:: bash
pytest providers/<provider>/tests/.../test.py
4. Iterate on the tests and providers. Both providers and tests are mounted from local sources, so
changes you make locally in both tests and provider sources are immediately reflected inside the
breeze container, and you can re-run the tests inside the ``breeze`` container without restarting the
container (which makes it faster to iterate).
.. note::
Since providers are installed from sources rather than from packages, plugins from providers are not
recognised by ProvidersManager for Airflow < 2.10 and tests that expect plugins to work might not work.
In such case you should follow the ``CI`` way of running the tests (see below).
Implementing compatibility for provider tests for older Airflow versions
........................................................................
When you implement tests for providers, you should make sure that they are compatible with older Airflow versions.
Note that some of the tests, if written without taking compatibility into account, might not work with older
versions of Airflow - this is because of refactorings, renames, and tests relying on internals of Airflow that
are not part of the public API. We deal with this in one of the following ways:
1) If the whole provider is supposed to only work for later Airflow version, we remove the whole provider
by excluding it from compatibility test configuration (see below)
2) Some compatibility shims are defined in ``devel-common/src/tests_common/test_utils/compat.py`` - and
they can be used to make the tests compatible - for example, importing ``ParseImportError`` (renamed from
``ImportError``) would fail in Airflow 2.9, but we have a fallback
import in ``compat.py`` that falls back to the old import automatically, so all tests testing / expecting
``ParseImportError`` should import it from the ``tests_common.test_utils.compat`` module. There are a few
other compatibility shims defined there and you can add more if needed in a similar way.
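A minimal sketch of using such a shim:

.. code-block:: python

    # Import the shim instead of the core class whose import path changed
    # between Airflow versions.
    from tests_common.test_utils.compat import ParseImportError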
3) If only some tests are not compatible and use features that are available only in newer Airflow version,
we can mark those tests with appropriate ``AIRFLOW_V_2_X_PLUS`` boolean constant defined in ``version_compat.py``
For example:
.. code-block:: python
from tests_common.test_utils.version_compat import AIRFLOW_V_2_10_PLUS
@pytest.mark.skipif(not AIRFLOW_V_2_10_PLUS, reason="The tests should be skipped for Airflow < 2.10")
def some_test_that_only_works_for_airflow_2_10_plus():
pass
4) Sometimes, the tests should only be run when Airflow is installed from the sources in main.
In this case you can add a conditional ``skipif`` marker for ``RUNNING_TESTS_AGAINST_AIRFLOW_PACKAGES``
to the test. For example:
.. code-block:: python
from tests_common import RUNNING_TESTS_AGAINST_AIRFLOW_PACKAGES
@pytest.mark.skipif(
RUNNING_TESTS_AGAINST_AIRFLOW_PACKAGES, reason="Plugin initialization is done early in case of packages"
)
def test_plugin():
pass
5) Sometimes pytest collection fails when certain imports used by the tests either do not exist
or fail with a RuntimeError about compatibility ("minimum Airflow version is required") or because they
raise AirflowOptionalProviderFeatureException. In such a case you should wrap the imports in the
``ignore_provider_compatibility_error`` context manager, adding the ``__file__``
module name as a parameter. This will stop pytest collection from failing and automatically skip the whole
module from unit tests.
For example:
.. code-block:: python
with ignore_provider_compatibility_error("2.8.0", __file__):
from airflow.providers.common.io.xcom.backend import XComObjectStorageBackend
6) In some cases, in order to enable pytest collection on an older Airflow version, you might need to convert
a top-level import into a local import, so that the pytest parser does not fail on collection.
Running provider compatibility tests in CI
..........................................
In CI those tests are run in a slightly more complex way because we want to run them against the built
providers, rather than providers mounted from sources.
In the case of canary runs, we add the ``--clean-airflow-installation`` flag that removes all packages before
installing the older Airflow version, and then installs development dependencies
from the latest Airflow - in order to avoid the case where a provider depends on a new dependency added in the latest
version of Airflow. This clean removal and re-installation takes quite some time though, and in order to
speed up the tests in regular PRs, we only do that in the canary runs.
The exact way CI tests are run can be reproduced locally by building providers from a selected tag/commit and
using them to install and run tests against the selected Airflow version.
Here is how to reproduce it.
1. Make sure to build the latest Breeze CI image:
.. code-block:: bash
breeze ci-image build --python 3.10
2. Build providers from latest sources:
.. code-block:: bash
rm dist/*
breeze release-management prepare-provider-distributions --include-not-ready-providers \
--skip-tag-check --distribution-format wheel
3. Prepare provider constraints
.. code-block:: bash
breeze release-management generate-constraints --airflow-constraints-mode constraints-source-providers --answer yes
4. Remove providers that are not compatible with Airflow version installed by default. You can look up
the incompatible providers in the ``PROVIDERS_COMPATIBILITY_TESTS_MATRIX`` constant in the
``./dev/breeze/src/airflow_breeze/global_constants.py`` file.
5. Enter breeze environment, installing selected Airflow version and the providers prepared from main
.. code-block:: bash
breeze shell --use-distributions-from-dist --distribution-format wheel --use-airflow-version 2.9.1 \
--install-airflow-with-constraints --providers-skip-constraints --mount-sources tests
In case you want to reproduce canary run, you need to add ``--clean-airflow-installation`` flag:
.. code-block:: bash
breeze shell --use-distributions-from-dist --distribution-format wheel --use-airflow-version 2.9.1 \
--install-airflow-with-constraints --providers-skip-constraints --mount-sources tests --clean-airflow-installation
6. You can then run tests as usual:
.. code-block:: bash
pytest providers/<provider>/tests/.../test.py
7. Iterate on the tests.
The tests are run using:
* Airflow installed from PyPI
* tests coming from the current Airflow sources (they are mounted inside the breeze image)
* providers built from the current Airflow sources and placed in dist
This means that you can modify and run tests and re-run them because sources are mounted from the host,
but if you want to modify provider code you need to exit breeze, rebuild the provider package and
restart breeze using the command above.
Rebuilding a single provider package can be done using this command:
.. code-block:: bash
breeze release-management prepare-provider-distributions \
--skip-tag-check --distribution-format wheel <provider>
Lowest direct dependency resolution tests
-----------------------------------------
We have special tests that run with the lowest-direct resolution of dependencies for Airflow and providers.
These are run in order to check that we are not using a feature that is not available in an
older version of some of our dependencies.
Tests with lowest-direct dependency resolution for Airflow
..........................................................
You can test minimum dependencies that are installed by Airflow by running (for example to run "Core" tests):
.. code-block:: bash
breeze testing core-tests --force-lowest-dependencies --test-type "Core"
You can also iterate on the tests and versions of the dependencies by entering breeze shell and
running the tests from there, after manually downgrading the dependencies:
.. code-block:: bash
breeze shell # enter the container
cd airflow-core
uv sync --resolution lowest-direct
or run ``--force-lowest-dependencies`` switch directly from the breeze command line:
.. code-block:: bash
breeze shell --force-lowest-dependencies --test-type "Core"
The way it works: after you enter the breeze container, you run ``uv sync`` in the ``airflow-core``
folder to downgrade the dependencies to the lowest versions that are compatible
with what is specified in the ``airflow-core`` dependencies. You will see it in the output of the
command as a sequence of downgrades like this:
.. code-block:: diff
- aiohttp==3.9.5
+ aiohttp==3.9.2
- anyio==4.4.0
+ anyio==3.7.1
Tests with lowest-direct dependency resolution for a Provider
.............................................................
Similarly, we can test whether the provider tests work with the lowest dependencies of a specific provider.
Those tests can be easily run locally with breeze (replace PROVIDER_ID with the id of the provider):
.. code-block:: bash
breeze testing providers-tests --force-lowest-dependencies --test-type "Providers[PROVIDER_ID]"
If you find that the tests are failing for some dependencies, make sure to add a minimum version for
the dependency in the ``provider.yaml`` file of the appropriate provider and re-run the tests.
You can also iterate on the tests and versions of the dependencies by entering breeze shell and
manually downgrading dependencies for the provider and running the tests after that:
.. code-block:: bash
breeze shell
cd providers/PROVIDER_ID
uv sync --resolution lowest-direct
or run ``--force-lowest-dependencies`` switch directly from the breeze command line:
.. code-block:: bash
breeze shell --force-lowest-dependencies --test-type "Providers[google]"
Similarly to the "Core" tests, the dependencies will be downgraded to the lowest versions that are
compatible with the dependencies specified in the provider dependencies, and you will see the list of
downgrades in the output of the breeze command. Note that this will be a combined downgrade of both
Airflow and the selected provider dependencies, so the list will be longer than in the case of "Core" tests
and longer than **just** the dependencies of the provider. For example, for the ``google`` provider, part of the
downgraded dependencies will contain both Airflow and Google provider dependencies:
.. code-block:: diff
- flask-login==0.6.3
+ flask-login==0.6.2
- flask-session==0.5.0
+ flask-session==0.4.0
- flask-wtf==1.2.1
+ flask-wtf==1.1.0
- fsspec==2023.12.2
+ fsspec==2023.10.0
- gcloud-aio-bigquery==7.1.0
+ gcloud-aio-bigquery==6.1.2
- gcloud-aio-storage==9.2.0
You can also (if your local virtualenv can install the dependencies for the provider)
reproduce the same set of dependencies in your local virtual environment by:
.. code-block:: bash
cd airflow-core
uv sync --resolution lowest-direct
for Airflow core, and
.. code-block:: bash
cd providers/PROVIDER_ID
uv sync --resolution lowest-direct
for the providers.
How to fix failing lowest-direct dependency resolution tests
............................................................
When your tests pass in regular test runs but fail in the "lowest-direct" dependency resolution tests, you need
to figure out which of these problems you have:
* lower bounds missing in the ``pyproject.toml`` file (in ``airflow-core`` or the corresponding provider).
  This is usually easy to figure out, especially if you
  just added a new feature from a library that you use: check in the release notes what the minimum
  version of the library you can use is, and set it as ``>=VERSION`` in ``pyproject.toml``.
* airflow-core or the provider needing additional providers or additional dependencies in the dev
  dependency group of the provider - sometimes tests need another provider to be installed that is not
  normally needed as a required dependency of the provider being tested. Those dependencies
  should be added after the ``# Additional devel dependencies`` comment in the case of providers. Adding the
  dependencies there means that when ``uv sync`` is run, the packages and their dependencies will be installed.
.. code-block:: toml
[dependency-groups]
dev = [
"apache-airflow",
"apache-airflow-task-sdk",
"apache-airflow-devel-common",
"apache-airflow-providers-common-sql",
"apache-airflow-providers-fab",
# Additional devel dependencies (do not remove this line and add extra development dependencies)
"deltalake>=0.12.0",
"apache-airflow-providers-microsoft-azure",
]
Sometimes it might get a bit tricky to know what the minimum version of the library you should be using is,
but in this case you can easily find it by looking at the error and the list of downgraded packages and
guessing which one is causing the problem. You can then look at the release notes of the
library and find the minimum version, but you can also resort to a technique known as bisecting, which allows
you to quickly figure out the right version without knowing the root cause of the problem.
Assume you suspect that library "foo", which was downgraded from 1.0.0 to 0.1.0, is causing the problem. The bisecting
technique works as follows:
* Run ``uv sync --resolution lowest-direct`` (the ``foo`` library is downgraded to 0.1.0). Your test should
  fail.
* Upgrade just the ``foo`` library to 1.0.0, re-run the failing test (with ``pytest <test>``)
  and see that it passes.
* Downgrade the ``foo`` library to 0.1.0, re-run the failing test (with ``pytest <test>``) and see that it
  fails.
* Look at the list of versions available for the library between 0.1.0 and 1.0.0 (for example via the
  `<https://pypi.org/project/foo/#history>`_ link - where ``foo`` is your library).
* Find a middle version between 1.0.0 and 0.1.0 and upgrade the library to this version - see if the
  test passes or fails. If it passes, continue with finding the middle version between the current version
  and the lower version; if it fails, continue with finding the middle version between the current version and the
  higher version.
* Continue that way until you find the lowest version that passes the test.
* Set this version in the ``pyproject.toml`` file, run ``uv sync --resolution lowest-direct`` and see if the test
  passes. If it does, you are done. If it does not, repeat the process.
You can also skip some tests when forced lowest dependencies are used for tests run in
breeze, by adding the marker below. This is sometimes needed if your "core" or "provider" tests depend on
all or many providers being installed (for example tests loading multiple examples or connections):
.. code-block:: python
from tests_common.pytest_plugin import skip_if_force_lowest_dependencies_marker
@skip_if_force_lowest_dependencies_marker
def test_my_test_that_should_be_skipped():
assert 1 == 1
You can also set the ``FORCE_LOWEST_DEPENDENCIES`` environment variable to ``true`` before
running ``pytest`` locally to skip those tests there as well.
Other Settings
--------------
Enable masking secrets in tests
...............................
By default, masking secrets in tests is disabled because it might have side effects
on other tests that intend to check ``logging``/``stdout``/``stderr`` values.
If you need to test masking secrets in test cases,
you have to apply the ``pytest.mark.enable_redact`` marker to the specific test case, class or module.
.. code-block:: python
@pytest.mark.enable_redact
def test_masking(capsys):
mask_secret("eggs")
RedactedIO().write("spam eggs and potatoes")
assert "spam *** and potatoes" in capsys.readouterr().out
Skip test on unsupported platform / environment
...............................................
You can apply the ``pytest.mark.platform(name)`` marker to a specific test case, class or module
to prevent it from running on an unsupported platform.
- ``linux``: Run the test only on the linux platform
- ``breeze``: Run the test only inside a Breeze container; this might be useful when a test runs
  some potentially dangerous things or expects to use common Breeze features.
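A minimal sketch (hypothetical test names):

.. code-block:: python

    import pytest


    @pytest.mark.platform("linux")
    def test_uses_linux_only_features():
        ...


    @pytest.mark.platform("breeze")
    def test_requires_breeze_environment():
        # Only runs inside the Breeze container.
        ...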
Warnings capture system
.......................
By default, all warnings captured during the test runs are saved into ``tests/warnings.txt``.
If required, you can change the path by providing the ``--warning-output-path`` pytest CLI argument
or by setting the ``CAPTURE_WARNINGS_OUTPUT`` environment variable.
.. code-block:: console
root@3f98e75b1ebe:/opt/airflow# pytest airflow-core/tests/unit/core/ --warning-output-path=/foo/bar/spam.egg
...
========================= Warning summary. Total: 28, Unique: 12 ==========================
airflow: total 11, unique 1
runtest: total 11, unique 1
other: total 7, unique 1
runtest: total 7, unique 1
tests: total 10, unique 10
runtest: total 10, unique 10
Warnings saved into /foo/bar/spam.egg file.
================================= short test summary info =================================
You can also disable capturing warnings by providing the ``--disable-capture-warnings`` pytest CLI argument
or by setting the `global warnings filter <https://docs.python.org/3/library/warnings.html#the-warnings-filter>`__
to **ignore**, e.g. by setting the ``PYTHONWARNINGS`` environment variable to ``ignore``.
.. code-block:: bash
pytest airflow-core/tests/unit/core/ --disable-capture-warnings
Keep tests using environment variables
......................................
By default, all environment variables related to Airflow (starting with ``AIRFLOW__``) are cleared before running tests
to avoid potential side effects. However, in some scenarios you might want to disable this mechanism and keep the
environment variables you defined to configure your Airflow environment. For example, you might want to run tests
against a specific database configured through the ``AIRFLOW__DATABASE__SQL_ALCHEMY_CONN`` environment variable,
or run tests using a specific executor for tasks, configured through ``AIRFLOW__CORE__EXECUTOR``.
To keep using environment variables you defined in your environment, you need to provide ``--keep-env-variables`` as
pytest CLI argument.
.. code-block:: bash
pytest airflow-core/tests/unit/core/ --keep-env-variables
This parameter is also available in Breeze.
.. code-block:: bash
breeze testing core-tests --keep-env-variables
Disable database cleanup before each test module
................................................
By default, the database is cleared of all items before running tests. This is to avoid potential conflicts with
existing resources in the database when running tests using the database. However, in some scenarios you might want to
disable this mechanism and keep the database as is. For example, you might want to run tests in parallel against the
same database. In that case, you need to disable the database cleanup, otherwise the tests are going to conflict with
each other (one test will delete the resources that another one is creating).
To disable the database cleanup, you need to provide ``--no-db-cleanup`` as pytest CLI argument.
.. code-block:: bash
pytest airflow-core/tests/unit/core/ --no-db-cleanup
This parameter is also available in Breeze.
.. code-block:: bash
breeze testing core-tests --no-db-cleanup airflow-core/tests/unit/core/
Code Coverage
-------------
Airflow's CI process automatically uploads the code coverage report to codecov.io.
For the most recent coverage report of the main branch, visit: https://codecov.io/gh/apache/airflow.
Generating Local Coverage Reports:
..................................
If you wish to obtain coverage reports for specific areas of the codebase on your local machine, follow these steps:
a. Initiate a breeze shell.
b. Execute one of the commands below based on the desired coverage area:
- **Core:** ``python scripts/cov/core_coverage.py``
- **REST API:** ``python scripts/cov/restapi_coverage.py``
- **CLI:** ``python scripts/cov/cli_coverage.py``
- **Other:** ``python scripts/cov/other_coverage.py``
c. After execution, run the following commands from the repository root
(inside the Breeze shell):
.. code-block:: bash
cd htmlcov/
python -m http.server 5555
The Breeze container maps port ``5555`` inside the container to
``25555`` on the host, so you can open the coverage report at
http://localhost:25555 in your browser.
.. note::
You no longer need to start the Airflow web server to view the
coverage report. The lightweight HTTP server above is sufficient and
avoids an extra service. If port 25555 on the host is already in use,
adjust the container-to-host mapping with
``BREEZE_PORTS_EXTRA="<host_port>:5555" breeze start-airflow``.
Modules Not Fully Covered:
..........................
Each coverage command provides a list of modules that aren't fully covered. If you wish to enhance coverage for a particular module:
a. Work on the module to improve its coverage.
b. Once coverage reaches 100%, you can safely remove the module from the list of modules that are not fully covered.
This list is inside each command's source code.
Tracking SQL statements
-----------------------
You can run tests with SQL statements tracking. To do this, use the ``--trace-sql`` option and pass the
columns to be displayed as an argument. Each query will be displayed on a separate line.
Supported values:
* ``num`` - displays the query number;
* ``time`` - displays the query execution time;
* ``trace`` - displays the simplified (one-line) stack trace;
* ``sql`` - displays the SQL statements;
* ``parameters`` - displays the SQL statement parameters.
If you only provide ``num``, then only the final number of queries will be displayed.
By default, pytest does not display output for successful tests; if you still want to see it, you must
pass the ``--capture=no`` option.
If you run the following command:
.. code-block:: bash
pytest --trace-sql=num,sql,parameters --capture=no \
airflow-core/tests/unit/jobs/test_scheduler_job.py -k test_process_dags_queries_count_05
On the screen you will see database queries for the given test.
SQL query tracking does not work properly if your test runs subprocesses. Only queries from the main process
are tracked.
-----
For other kinds of tests, look at the `Testing document <../09_testing.rst>`__.