blob: 986213df8187f292a93fb82b2498da9351fdfff7 [file] [log] [blame]
.. Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
.. http://www.apache.org/licenses/LICENSE-2.0
.. Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
Timetables
==========
A DAG's scheduling strategy is determined by its internal "timetable". This
timetable can be created by specifying the DAG's ``schedule_interval`` argument,
as described in :doc:`DAG Run </dag-run>`, or by passing a ``timetable`` argument
directly. The timetable also dictates the data interval and the logical time of each
run created for the DAG.
Cron expressions and timedeltas are still supported (using
``CronDataIntervalTimetable`` and ``DeltaDataIntervalTimetable`` under the hood
respectively), however, there are situations where they cannot properly express
the schedule. Some examples are:
* Data intervals with "holes" between. (Instead of continuous, as both the cron
expression and ``timedelta`` schedules represent.)
* Run tasks at different times each day. For example, an astronomer may find it
useful to run a task at dawn to process data collected from the previous
night-time period.
* Schedules not following the Gregorian calendar. For example, create a run for
each month in the `Traditional Chinese Calendar`_. This is conceptually
similar to the sunset case above, but for a different time scale.
* Rolling windows, or overlapping data intervals. For example, one may want to
have a run each day, but make each run cover the period of the previous seven
days. It is possible to "hack" this with a cron expression, but a custom data
interval would be a more natural representation.
.. _`Traditional Chinese Calendar`: https://en.wikipedia.org/wiki/Chinese_calendar
As such, Airflow allows for custom timetables to be written in plugins and used by
DAGs. An example demonstrating a custom timetable can be found in the
:doc:`/howto/timetable` how-to guide.
Built In Timetables
-------------------
Airflow comes with several common timetables built in to cover the most common use cases. Additional timetables
may be available in plugins.
CronDataIntervalTimetable
^^^^^^^^^^^^^^^^^^^^^^^^^
Set schedule based on a cron expression. Can be selected by providing a string that is a valid
cron expression to the ``schedule_interval`` parameter of a DAG as described in the :doc:`/concepts/dags` documentation.
.. code-block:: python
@dag(schedule_interval="0 1 * * 3") # At 01:00 on Wednesday.
def example_dag():
pass
DeltaDataIntervalTimetable
^^^^^^^^^^^^^^^^^^^^^^^^^^
Schedules data intervals with a time delta. Can be selected by providing a
:class:`datetime.timedelta` or ``dateutil.relativedelta.relativedelta`` to the ``schedule_interval`` parameter of a DAG.
.. code-block:: python
@dag(schedule_interval=datetime.timedelta(minutes=30))
def example_dag():
pass
EventsTimetable
^^^^^^^^^^^^^^^
Simply pass a list of ``datetime``\s for the DAG to run after. Useful for timing based on sporting
events, planned communication campaigns, and other schedules that are arbitrary and irregular but predictable.
The list of events must be finite and of reasonable size as it must be loaded every time the DAG is parsed. Optionally,
the ``restrict_to_events`` flag can be used to force manual runs of the DAG to use the time of the most recent (or very
first) event for the data interval, otherwise manual runs will run with a ``data_interval_start`` and
``data_interval_end`` equal to the time at which the manual run was begun. You can also name the set of events using the
``description`` parameter, which will be displayed in the Airflow UI.
.. code-block:: python
from airflow.timetables.events import EventsTimetable
@dag(
timetable=EventsTimetable(
event_dates=[
pendulum.datetime(2022, 4, 5, 8, 27, tz="America/Chicago"),
pendulum.datetime(2022, 4, 17, 8, 27, tz="America/Chicago"),
pendulum.datetime(2022, 4, 22, 20, 50, tz="America/Chicago"),
],
description="My Team's Baseball Games",
restrict_to_events=False,
),
)
def example_dag():
pass