| :py:mod:`airflow.models.serialized_dag` |
| ======================================= |
| |
| .. py:module:: airflow.models.serialized_dag |
| |
| .. autoapi-nested-parse:: |
| |
| Serialized DAG table in database. |
| |
| |
| |
| Module Contents |
| --------------- |
| |
| Classes |
| ~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.models.serialized_dag.SerializedDagModel |
| |
| |
| |
| |
| Attributes |
| ~~~~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.models.serialized_dag.log |
| |
| |
| .. py:data:: log |
| |
| |
| |
| |
| .. py:class:: SerializedDagModel(dag: airflow.models.dag.DAG) |
| |
| Bases: :py:obj:`airflow.models.base.Base` |
| |
| A table for serialized DAGs. |
| |
| The ``serialized_dag`` table is a snapshot of the DAG files, kept in sync by the scheduler. |
| This feature is controlled by: |
| |
| * ``[core] min_serialized_dag_update_interval = 30`` (s): |
| serialized DAGs are updated in the DB when the scheduler processes a file; |
| to reduce the DB write rate, this sets a minimum interval between updates of a serialized DAG. |
| * ``[scheduler] dag_dir_list_interval = 300`` (s): |
| interval at which serialized DAGs are deleted from the DB after their files are deleted; |
| a smaller interval such as 60 is suggested. |
| |
| The webserver uses this table to load DAGs. Because reading from the database is |
| lightweight compared to importing from files, this addresses the webserver |
| scalability issue. |
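| |
| As a rough illustration of the two settings above, the sketch below parses a hypothetical |
| ``airflow.cfg`` fragment with the standard-library ``configparser``. The sample text, the |
| helper name ``read_serialization_intervals`` and the dictionary keys are assumptions made |
| for this example; Airflow itself reads its configuration through its own machinery. |

```python
import configparser

# Hypothetical airflow.cfg fragment. The section and option names are the ones
# quoted above; 60 is the smaller dag_dir_list_interval suggested there.
SAMPLE_CFG = """
[core]
min_serialized_dag_update_interval = 30

[scheduler]
dag_dir_list_interval = 60
"""


def read_serialization_intervals(cfg_text: str) -> dict:
    """Return both intervals (in seconds) from an INI-style config string."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    return {
        "min_update_interval": parser.getint(
            "core", "min_serialized_dag_update_interval"
        ),
        "dir_list_interval": parser.getint("scheduler", "dag_dir_list_interval"),
    }


intervals = read_serialization_intervals(SAMPLE_CFG)
```

| Lowering ``dag_dir_list_interval`` shortens the time a deleted file's serialized DAG can |
| linger in the table, at the cost of listing the DAG directory more often. |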
| |
| .. py:attribute:: __tablename__ |
| :annotation: = serialized_dag |
| |
| |
| |
| .. py:attribute:: dag_id |
| |
| |
| |
| |
| .. py:attribute:: fileloc |
| |
| |
| |
| |
| .. py:attribute:: fileloc_hash |
| |
| |
| |
| |
| .. py:attribute:: data |
| |
| |
| |
| |
| .. py:attribute:: last_updated |
| |
| |
| |
| |
| .. py:attribute:: dag_hash |
| |
| |
| |
| |
| .. py:attribute:: __table_args__ |
| |
| |
| |
| |
| .. py:attribute:: dag_runs |
| |
| |
| |
| |
| .. py:attribute:: dag_model |
| |
| |
| |
| |
| .. py:attribute:: load_op_links |
| :annotation: = True |
| |
| |
| |
| .. py:method:: __repr__(self) |
| |
| Return repr(self). |
| |
| |
| .. py:method:: write_dag(cls, dag: airflow.models.dag.DAG, min_update_interval: Optional[int] = None, session: sqlalchemy.orm.Session = None) -> bool |
| :classmethod: |
| |
| Serializes a DAG and writes it into database. |
| If the record already exists, this checks whether the serialized DAG has changed. |
| If it has, the record is updated; otherwise it is left untouched. |
| |
| :param dag: a DAG to be written into database |
| :param min_update_interval: minimal interval in seconds to update serialized DAG |
| :param session: ORM Session |
| |
| :returns: Boolean indicating if the DAG was written to the DB |
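| |
| The write-or-skip behaviour described above can be sketched without a database. This is |
| a minimal stand-in, not the method's actual implementation: the in-memory ``Table`` dict, |
| the helper names, the use of SHA-256 as the content hash, and the order of the two checks |
| are all assumptions made for illustration. |

```python
import hashlib
import json
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional, Tuple

# In-memory stand-in for the serialized_dag table: dag_id -> (dag_hash, last_updated).
Table = Dict[str, Tuple[str, datetime]]


def content_hash(serialized: dict) -> str:
    """Stable hash of a serialized DAG (the hash function is an assumption)."""
    payload = json.dumps(serialized, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def write_dag_sketch(
    table: Table,
    dag_id: str,
    serialized: dict,
    now: datetime,
    min_update_interval: Optional[int] = None,
) -> bool:
    """Return True if the record was written, False if the write was skipped."""
    new_hash = content_hash(serialized)
    existing = table.get(dag_id)
    if existing is not None:
        old_hash, last_updated = existing
        # Skip if the record was updated less than min_update_interval seconds ago.
        if min_update_interval is not None and now - last_updated < timedelta(
            seconds=min_update_interval
        ):
            return False
        # Skip if the serialized content has not changed.
        if old_hash == new_hash:
            return False
    table[dag_id] = (new_hash, now)
    return True


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
table: Table = {}
first = write_dag_sketch(table, "example", {"tasks": ["a"]}, t0, 30)
throttled = write_dag_sketch(
    table, "example", {"tasks": ["a", "b"]}, t0 + timedelta(seconds=10), 30
)
unchanged = write_dag_sketch(
    table, "example", {"tasks": ["a"]}, t0 + timedelta(seconds=60), 30
)
changed = write_dag_sketch(
    table, "example", {"tasks": ["a", "b"]}, t0 + timedelta(seconds=60), 30
)
```

| In this sketch the first call inserts the record, the second is skipped because it falls |
| inside the 30-second interval, the third is skipped because the content is unchanged, and |
| the fourth updates the record. |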
| |
| |
| .. py:method:: read_all_dags(cls, session: sqlalchemy.orm.Session = None) -> Dict[str, airflow.serialization.serialized_objects.SerializedDAG] |
| :classmethod: |
| |
| Reads all DAGs in the serialized_dag table. |
| |
| :param session: ORM Session |
| :returns: a dict of DAGs read from database |
| |
| |
| .. py:method:: dag(self) |
| :property: |
| |
| The DAG deserialized from the ``data`` column |
| |
| |
| .. py:method:: remove_dag(cls, dag_id: str, session: sqlalchemy.orm.Session = None) |
| :classmethod: |
| |
| Deletes the DAG with the given dag_id. |
| |
| :param dag_id: dag_id to be deleted |
| :param session: ORM Session |
| |
| |
| .. py:method:: remove_deleted_dags(cls, alive_dag_filelocs: List[str], session=None) |
| :classmethod: |
| |
| Deletes DAGs not included in alive_dag_filelocs. |
| |
| :param alive_dag_filelocs: file paths of alive DAGs |
| :param session: ORM Session |
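| |
| The clean-up rule can be illustrated with plain dictionaries. The sketch below assumes a |
| hypothetical ``rows`` mapping of ``dag_id`` to ``fileloc`` and compares paths directly; it |
| ignores the ``fileloc_hash`` column and does not touch a real session. |

```python
from typing import Dict, Iterable, List


def remove_deleted_dags_sketch(
    rows: Dict[str, str], alive_dag_filelocs: Iterable[str]
) -> List[str]:
    """Drop rows whose fileloc is not alive and return the removed dag_ids."""
    alive = set(alive_dag_filelocs)
    stale = sorted(dag_id for dag_id, fileloc in rows.items() if fileloc not in alive)
    for dag_id in stale:
        del rows[dag_id]
    return stale


# Hypothetical table contents: dag_id -> fileloc.
rows = {
    "etl_daily": "/dags/etl_daily.py",
    "etl_hourly": "/dags/etl_hourly.py",
    "old_report": "/dags/old_report.py",
}
removed = remove_deleted_dags_sketch(
    rows, ["/dags/etl_daily.py", "/dags/etl_hourly.py"]
)
```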
| |
| |
| .. py:method:: has_dag(cls, dag_id: str, session: sqlalchemy.orm.Session = None) -> bool |
| :classmethod: |
| |
| Checks whether a DAG exists in the serialized_dag table. |
| |
| :param dag_id: the DAG to check |
| :param session: ORM Session |
| |
| |
| .. py:method:: get(cls, dag_id: str, session: sqlalchemy.orm.Session = None) -> Optional[SerializedDagModel] |
| :classmethod: |
| |
| Get the SerializedDAG for the given dag ID. |
| It will cope with being passed the ID of a subdag by looking up the |
| root dag_id from the DAG table. |
| |
| :param dag_id: the DAG to fetch |
| :param session: ORM Session |
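| |
| A minimal sketch of the subdag fallback described above, using two hypothetical |
| dictionaries in place of the ``serialized_dag`` and DAG tables. Trying the direct lookup |
| first and only then resolving the root ``dag_id`` is an assumption about ordering, not a |
| statement of how the method is implemented. |

```python
from typing import Dict, Optional


def get_sketch(
    serialized_rows: Dict[str, dict],
    root_dag_ids: Dict[str, str],
    dag_id: str,
) -> Optional[dict]:
    """Return the serialized row for dag_id, falling back to its root DAG."""
    row = serialized_rows.get(dag_id)
    if row is not None:
        return row
    # dag_id may name a subdag: resolve its root dag_id and retry.
    root_id = root_dag_ids.get(dag_id)
    if root_id is None:
        return None
    return serialized_rows.get(root_id)


# Hypothetical data: only the root DAG has a serialized row.
serialized_rows = {"parent": {"dag_id": "parent"}}
root_dag_ids = {"parent.section_1": "parent"}

direct = get_sketch(serialized_rows, root_dag_ids, "parent")
via_root = get_sketch(serialized_rows, root_dag_ids, "parent.section_1")
missing = get_sketch(serialized_rows, root_dag_ids, "unknown")
```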
| |
| |
| .. py:method:: bulk_sync_to_db(dags: List[airflow.models.dag.DAG], session: sqlalchemy.orm.Session = None) |
| :staticmethod: |
| |
| Saves DAGs as Serialized DAG objects in the database. Each |
| DAG is saved in a separate database query. |
| |
| :param dags: the DAG objects to save to the DB |
| :type dags: List[airflow.models.dag.DAG] |
| :param session: ORM Session |
| :type session: Session |
| :return: None |
| |
| |
| .. py:method:: get_last_updated_datetime(cls, dag_id: str, session: sqlalchemy.orm.Session = None) -> Optional[datetime.datetime] |
| :classmethod: |
| |
| Get the datetime at which the serialized DAG for the given DAG ID was last updated |
| in the serialized_dag table. |
| |
| :param dag_id: DAG ID |
| :type dag_id: str |
| :param session: ORM Session |
| :type session: Session |
| |
| |
| .. py:method:: get_max_last_updated_datetime(cls, session: sqlalchemy.orm.Session = None) -> Optional[datetime.datetime] |
| :classmethod: |
| |
| Get the most recent datetime at which any DAG was last updated in the serialized_dag table. |
| |
| :param session: ORM Session |
| :type session: Session |
| |
| |
| .. py:method:: get_latest_version_hash(cls, dag_id: str, session: sqlalchemy.orm.Session = None) -> Optional[str] |
| :classmethod: |
| |
| Get the latest DAG version for a given DAG ID. |
| |
| :param dag_id: DAG ID |
| :type dag_id: str |
| :param session: ORM Session |
| :type session: Session |
| :return: DAG Hash, or None if the DAG is not found |
| :rtype: str | None |
| |
| |
| .. py:method:: get_dag_dependencies(cls, session: sqlalchemy.orm.Session = None) -> Dict[str, List[airflow.serialization.serialized_objects.DagDependency]] |
| :classmethod: |
| |
| Get the dependencies between DAGs. |
| |
| :param session: ORM Session |
| :type session: Session |
| |
| |
| |