blob: bd9c121990090b129f6466d2f654f7ae4434fc49 [file] [log] [blame]
:mod:`airflow.operators.python`
===============================
.. py:module:: airflow.operators.python
Module Contents
---------------
.. py:class:: PythonOperator(*, python_callable: Callable, op_args: Optional[List] = None, op_kwargs: Optional[Dict] = None, templates_dict: Optional[Dict] = None, templates_exts: Optional[List[str]] = None, **kwargs)
Bases: :class:`airflow.models.BaseOperator`
Executes a Python callable
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:PythonOperator`
:param python_callable: A reference to an object that is callable
:type python_callable: python callable
:param op_kwargs: a dictionary of keyword arguments that will get unpacked
in your function
:type op_kwargs: dict (templated)
:param op_args: a list of positional arguments that will get unpacked when
calling your callable
:type op_args: list (templated)
:param templates_dict: a dictionary where the values are templates that
will get templated by the Airflow engine sometime between
``__init__`` and ``execute`` takes place and are made available
in your callable's context after the template has been applied. (templated)
:type templates_dict: dict[str]
:param templates_exts: a list of file extensions to resolve while
processing templated fields, for examples ``['.sql', '.hql']``
:type templates_exts: list[str]
.. attribute:: template_fields
:annotation: = ['templates_dict', 'op_args', 'op_kwargs']
.. attribute:: template_fields_renderers
.. attribute:: ui_color
:annotation: = #ffefeb
.. attribute:: shallow_copy_attrs
:annotation: = ['python_callable', 'op_kwargs']
.. method:: execute(self, context: Dict)
.. method:: execute_callable(self)
Calls the python callable with the given arguments.
:return: the return value of the call.
:rtype: any
.. py:class:: _PythonDecoratedOperator(*, python_callable: Callable, task_id: str, op_args: Tuple[Any], op_kwargs: Dict[str, Any], multiple_outputs: bool = False, **kwargs)
Bases: :class:`airflow.models.BaseOperator`
Wraps a Python callable and captures args/kwargs when called for execution.
:param python_callable: A reference to an object that is callable
:type python_callable: python callable
:param op_kwargs: a dictionary of keyword arguments that will get unpacked
in your function (templated)
:type op_kwargs: dict
:param op_args: a list of positional arguments that will get unpacked when
calling your callable (templated)
:type op_args: list
:param multiple_outputs: if set, function return value will be
unrolled to multiple XCom values. Dict will unroll to xcom values with keys as keys.
Defaults to False.
:type multiple_outputs: bool
.. attribute:: template_fields
:annotation: = ['op_args', 'op_kwargs']
.. attribute:: template_fields_renderers
.. attribute:: ui_color
.. attribute:: shallow_copy_attrs
:annotation: = ['python_callable']
.. staticmethod:: _get_unique_task_id(task_id: str, dag: Optional[DAG] = None, task_group: Optional[TaskGroup] = None)
Generate unique task id given a DAG (or if run in a DAG context)
Ids are generated by appending a unique number to the end of
the original task id.
Example:
task_id
task_id__1
task_id__2
...
task_id__20
.. staticmethod:: validate_python_callable(python_callable)
Validate that python callable can be wrapped by operator.
Raises exception if invalid.
:param python_callable: Python object to be validated
:raises: TypeError, AirflowException
.. method:: execute(self, context: Dict)
.. data:: T
.. function:: task(python_callable: Optional[Callable] = None, multiple_outputs: Optional[bool] = None, **kwargs) -> Callable[[T], T]
Python operator decorator. Wraps a function into an Airflow operator.
Accepts kwargs for operator kwarg. Can be reused in a single DAG.
:param python_callable: Function to decorate
:type python_callable: Optional[Callable]
:param multiple_outputs: if set, function return value will be
unrolled to multiple XCom values. List/Tuples will unroll to xcom values
with index as key. Dict will unroll to xcom values with keys as XCom keys.
Defaults to False.
:type multiple_outputs: bool
.. py:class:: BranchPythonOperator
Bases: :class:`airflow.operators.python.PythonOperator`, :class:`airflow.models.skipmixin.SkipMixin`
Allows a workflow to "branch" or follow a path following the execution
of this task.
It derives the PythonOperator and expects a Python function that returns
a single task_id or list of task_ids to follow. The task_id(s) returned
should point to a task directly downstream from {self}. All other "branches"
or directly downstream tasks are marked with a state of ``skipped`` so that
these paths can't move forward. The ``skipped`` states are propagated
downstream to allow for the DAG state to fill up and the DAG run's state
to be inferred.
.. method:: execute(self, context: Dict)
.. py:class:: ShortCircuitOperator
Bases: :class:`airflow.operators.python.PythonOperator`, :class:`airflow.models.skipmixin.SkipMixin`
Allows a workflow to continue only if a condition is met. Otherwise, the
workflow "short-circuits" and downstream tasks are skipped.
The ShortCircuitOperator is derived from the PythonOperator. It evaluates a
condition and short-circuits the workflow if the condition is False. Any
downstream tasks are marked with a state of "skipped". If the condition is
True, downstream tasks proceed as normal.
The condition is determined by the result of `python_callable`.
.. method:: execute(self, context: Dict)
.. py:class:: PythonVirtualenvOperator(*, python_callable: Callable, requirements: Optional[Iterable[str]] = None, python_version: Optional[Union[str, int, float]] = None, use_dill: bool = False, system_site_packages: bool = True, op_args: Optional[List] = None, op_kwargs: Optional[Dict] = None, string_args: Optional[Iterable[str]] = None, templates_dict: Optional[Dict] = None, templates_exts: Optional[List[str]] = None, **kwargs)
Bases: :class:`airflow.operators.python.PythonOperator`
Allows one to run a function in a virtualenv that is created and destroyed
automatically (with certain caveats).
The function must be defined using def, and not be
part of a class. All imports must happen inside the function
and no variables outside of the scope may be referenced. A global scope
variable named virtualenv_string_args will be available (populated by
string_args). In addition, one can pass stuff through op_args and op_kwargs, and one
can use a return value.
Note that if your virtualenv runs in a different Python major version than Airflow,
you cannot use return values, op_args, op_kwargs, or use any macros that are being provided to
Airflow through plugins. You can use string_args though.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:PythonVirtualenvOperator`
:param python_callable: A python function with no references to outside variables,
defined with def, which will be run in a virtualenv
:type python_callable: function
:param requirements: A list of requirements as specified in a pip install command
:type requirements: list[str]
:param python_version: The Python version to run the virtualenv with. Note that
both 2 and 2.7 are acceptable forms.
:type python_version: Optional[Union[str, int, float]]
:param use_dill: Whether to use dill to serialize
the args and result (pickle is default). This allow more complex types
but requires you to include dill in your requirements.
:type use_dill: bool
:param system_site_packages: Whether to include
system_site_packages in your virtualenv.
See virtualenv documentation for more information.
:type system_site_packages: bool
:param op_args: A list of positional arguments to pass to python_callable.
:type op_args: list
:param op_kwargs: A dict of keyword arguments to pass to python_callable.
:type op_kwargs: dict
:param string_args: Strings that are present in the global var virtualenv_string_args,
available to python_callable at runtime as a list[str]. Note that args are split
by newline.
:type string_args: list[str]
:param templates_dict: a dictionary where the values are templates that
will get templated by the Airflow engine sometime between
``__init__`` and ``execute`` takes place and are made available
in your callable's context after the template has been applied
:type templates_dict: dict of str
:param templates_exts: a list of file extensions to resolve while
processing templated fields, for examples ``['.sql', '.hql']``
:type templates_exts: list[str]
.. attribute:: BASE_SERIALIZABLE_CONTEXT_KEYS
.. attribute:: PENDULUM_SERIALIZABLE_CONTEXT_KEYS
.. attribute:: AIRFLOW_SERIALIZABLE_CONTEXT_KEYS
.. method:: execute(self, context: Dict)
.. method:: execute_callable(self)
.. method:: _write_args(self, filename)
.. method:: _get_serializable_context_keys(self)
.. method:: _write_string_args(self, filename)
.. method:: _read_result(self, filename)
.. function:: get_current_context() -> Dict[str, Any]
Obtain the execution context for the currently executing operator without
altering user method's signature.
This is the simplest method of retrieving the execution context dictionary.
**Old style:**
.. code:: python
def my_task(**context):
ti = context["ti"]
**New style:**
.. code:: python
from airflow.task.context import get_current_context
def my_task():
context = get_current_context()
ti = context["ti"]
Current context will only have value if this method was called after an operator
was starting to execute.