blob: fd2f6699baf7f9664de55a538772cd548546e922 [file] [log] [blame]
:py:mod:`airflow.models.dataset`
================================
.. py:module:: airflow.models.dataset
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.models.dataset.DatasetModel
airflow.models.dataset.DagScheduleDatasetReference
airflow.models.dataset.TaskOutletDatasetReference
airflow.models.dataset.DatasetDagRunQueue
airflow.models.dataset.DatasetEvent
Attributes
~~~~~~~~~~
.. autoapisummary::
airflow.models.dataset.association_table
.. py:class:: DatasetModel(uri, **kwargs)
Bases: :py:obj:`airflow.models.base.Base`
A table to store datasets.
:param uri: a string that uniquely identifies the dataset
:param extra: JSON field for arbitrary extra info
.. py:attribute:: id
.. py:attribute:: uri
.. py:attribute:: extra
.. py:attribute:: created_at
.. py:attribute:: updated_at
.. py:attribute:: consuming_dags
.. py:attribute:: producing_tasks
.. py:attribute:: __tablename__
:annotation: = dataset
.. py:attribute:: __table_args__
.. py:method:: from_public(obj)
:classmethod:
.. py:method:: __eq__(other)
.. py:method:: __hash__()
.. py:method:: __repr__()
.. py:class:: DagScheduleDatasetReference
Bases: :py:obj:`airflow.models.base.Base`
References from a DAG to a dataset of which it is a consumer.
.. py:attribute:: dataset_id
.. py:attribute:: dag_id
.. py:attribute:: created_at
.. py:attribute:: updated_at
.. py:attribute:: dataset
.. py:attribute:: queue_records
.. py:attribute:: __tablename__
:annotation: = dag_schedule_dataset_reference
.. py:attribute:: __table_args__
.. py:method:: __eq__(other)
.. py:method:: __hash__()
.. py:method:: __repr__()
.. py:class:: TaskOutletDatasetReference
Bases: :py:obj:`airflow.models.base.Base`
References from a task to a dataset that it updates / produces.
.. py:attribute:: dataset_id
.. py:attribute:: dag_id
.. py:attribute:: task_id
.. py:attribute:: created_at
.. py:attribute:: updated_at
.. py:attribute:: dataset
.. py:attribute:: __tablename__
:annotation: = task_outlet_dataset_reference
.. py:attribute:: __table_args__
.. py:method:: __eq__(other)
.. py:method:: __hash__()
.. py:method:: __repr__()
.. py:class:: DatasetDagRunQueue
Bases: :py:obj:`airflow.models.base.Base`
Model for storing dataset events that need processing.
.. py:attribute:: dataset_id
.. py:attribute:: target_dag_id
.. py:attribute:: created_at
.. py:attribute:: __tablename__
:annotation: = dataset_dag_run_queue
.. py:attribute:: __table_args__
.. py:method:: __eq__(other)
.. py:method:: __hash__()
.. py:method:: __repr__()
.. py:data:: association_table
.. py:class:: DatasetEvent
Bases: :py:obj:`airflow.models.base.Base`
A table to store datasets events.
:param dataset_id: reference to DatasetModel record
:param extra: JSON field for arbitrary extra info
:param source_task_id: the task_id of the TI which updated the dataset
:param source_dag_id: the dag_id of the TI which updated the dataset
:param source_run_id: the run_id of the TI which updated the dataset
:param source_map_index: the map_index of the TI which updated the dataset
:param timestamp: the time the event was logged
We use relationships instead of foreign keys so that dataset events are not deleted even
if the foreign key object is.
.. py:attribute:: id
.. py:attribute:: dataset_id
.. py:attribute:: extra
.. py:attribute:: source_task_id
.. py:attribute:: source_dag_id
.. py:attribute:: source_run_id
.. py:attribute:: source_map_index
.. py:attribute:: timestamp
.. py:attribute:: __tablename__
:annotation: = dataset_event
.. py:attribute:: __table_args__
.. py:attribute:: created_dagruns
.. py:attribute:: source_task_instance
.. py:attribute:: source_dag_run
.. py:attribute:: dataset
.. py:method:: uri()
:property:
.. py:method:: __repr__()