| :mod:`airflow.models.dag` |
| ========================= |
| |
| .. py:module:: airflow.models.dag |
| |
| |
| Module Contents |
| --------------- |
| |
| .. data:: log |
| |
| |
| |
| |
| .. data:: ScheduleInterval |
| |
| |
| |
| |
| .. function:: get_last_dagrun(dag_id, session, include_externally_triggered=False) |
| Returns the last dag run for a dag, or None if there was none. |
| The last dag run can be of any type, e.g. scheduled or backfilled. |
| Overridden DagRuns are ignored. |
| |
| |
| .. py:class:: DAG(dag_id, description=None, schedule_interval=timedelta(days=1), start_date=None, end_date=None, full_filepath=None, template_searchpath=None, template_undefined=None, user_defined_macros=None, user_defined_filters=None, default_args=None, concurrency=conf.getint('core', 'dag_concurrency'), max_active_runs=conf.getint('core', 'max_active_runs_per_dag'), dagrun_timeout=None, sla_miss_callback=None, default_view=None, orientation=conf.get('webserver', 'dag_orientation'), catchup=conf.getboolean('scheduler', 'catchup_by_default'), on_success_callback=None, on_failure_callback=None, doc_md=None, params=None, access_control=None, is_paused_upon_creation=None, jinja_environment_kwargs=None, tags=None) |
| |
| Bases: :class:`airflow.dag.base_dag.BaseDag`, :class:`airflow.utils.log.logging_mixin.LoggingMixin` |
| |
| A dag (directed acyclic graph) is a collection of tasks with directional |
| dependencies. A dag also has a schedule, a start date and an end date |
| (optional). For each schedule interval (say daily or hourly), the DAG needs to run |
| each individual task once its dependencies are met. Certain tasks have |
| the property of depending on their own past, meaning that they can't run |
| until their previous scheduled run (and upstream tasks) has completed. |
| |
| DAGs essentially act as namespaces for tasks. A task_id can only be |
| added once to a DAG. |
| |
| :param dag_id: The id of the DAG |
| :type dag_id: str |
| :param description: The description for the DAG to e.g. be shown on the webserver |
| :type description: str |
| :param schedule_interval: Defines how often the DAG runs. This |
| timedelta object is added to your latest task instance's |
| execution_date to figure out the next schedule |
| :type schedule_interval: datetime.timedelta or |
| dateutil.relativedelta.relativedelta or str that acts as a cron |
| expression |
| :param start_date: The timestamp from which the scheduler will |
| attempt to backfill |
| :type start_date: datetime.datetime |
| :param end_date: A date beyond which your DAG won't run, leave to None |
| for open ended scheduling |
| :type end_date: datetime.datetime |
| :param template_searchpath: This list of folders (absolute paths) |
| defines where jinja will look for your templates. Order matters. |
| Note that jinja/airflow includes the path of your DAG file by |
| default |
| :type template_searchpath: str or list[str] |
| :param template_undefined: Template undefined type. |
| :type template_undefined: jinja2.Undefined |
| :param user_defined_macros: a dictionary of macros that will be exposed |
| in your jinja templates. For example, passing ``dict(foo='bar')`` |
| to this argument allows you to use ``{{ foo }}`` in all jinja |
| templates related to this DAG. Note that you can pass any |
| type of object here. |
| :type user_defined_macros: dict |
| :param user_defined_filters: a dictionary of filters that will be exposed |
| in your jinja templates. For example, passing |
| ``dict(hello=lambda name: 'Hello %s' % name)`` to this argument allows |
| you to use ``{{ 'world' | hello }}`` in all jinja templates related to |
| this DAG. |
| :type user_defined_filters: dict |
| :param default_args: A dictionary of default parameters to be used |
| as constructor keyword parameters when initialising operators. |
| Note that operators have the same hook, and their values take precedence over those defined |
| here, meaning that if your dict contains ``'depends_on_past': True`` |
| here and ``'depends_on_past': False`` in the operator's own |
| ``default_args``, the actual value will be ``False``. |
| :type default_args: dict |
| :param params: a dictionary of DAG level parameters that are made |
| accessible in templates, namespaced under `params`. These |
| params can be overridden at the task level. |
| :type params: dict |
| :param concurrency: the number of task instances allowed to run |
| concurrently |
| :type concurrency: int |
| :param max_active_runs: maximum number of active DAG runs. Beyond this |
| number of DAG runs in a running state, the scheduler won't create |
| new active DAG runs |
| :type max_active_runs: int |
| :param dagrun_timeout: specify how long a DagRun should be up before |
| timing out / failing, so that new DagRuns can be created. The timeout |
| is only enforced for scheduled DagRuns, and only once the |
| number of active DagRuns equals max_active_runs. |
| :type dagrun_timeout: datetime.timedelta |
| :param sla_miss_callback: specify a function to call when reporting SLA |
| timeouts. |
| :type sla_miss_callback: types.FunctionType |
| :param default_view: Specify DAG default view (tree, graph, duration, |
| gantt, landing_times) |
| :type default_view: str |
| :param orientation: Specify DAG orientation in graph view (LR, TB, RL, BT) |
| :type orientation: str |
| :param catchup: Perform scheduler catchup (or only run latest)? Defaults to True |
| :type catchup: bool |
| :param on_failure_callback: A function to be called when a DagRun of this dag fails. |
| A context dictionary is passed as a single parameter to this function. |
| :type on_failure_callback: callable |
| :param on_success_callback: Much like the ``on_failure_callback`` except |
| that it is executed when the dag succeeds. |
| :type on_success_callback: callable |
| :param access_control: Specify optional DAG-level permissions, e.g., |
| "{'role1': {'can_dag_read'}, 'role2': {'can_dag_read', 'can_dag_edit'}}" |
| :type access_control: dict |
| :param is_paused_upon_creation: Specifies if the dag is paused when created for the first time. |
| If the dag exists already, this flag will be ignored. If this optional parameter |
| is not specified, the global config setting will be used. |
| :type is_paused_upon_creation: bool or None |
| :param jinja_environment_kwargs: additional configuration options to be passed to Jinja |
| ``Environment`` for template rendering |
| |
| **Example**: to prevent Jinja from removing a trailing newline from template strings :: |
| |
| DAG(dag_id='my-dag', |
| jinja_environment_kwargs={ |
| 'keep_trailing_newline': True, |
| # some other jinja2 Environment options here |
| } |
| ) |
| |
| **See**: `Jinja Environment documentation |
| <https://jinja.palletsprojects.com/en/master/api/#jinja2.Environment>`_ |
| |
| :type jinja_environment_kwargs: dict |
| :param tags: List of tags to help filter DAGs in the UI. |
| :type tags: List[str] |
| |
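The precedence rule described for ``default_args`` can be illustrated with a plain dictionary merge. This is a minimal sketch rather than Airflow's implementation: the helper ``resolve_default_args`` is hypothetical and only mirrors the documented behaviour, in which values from the operator's own ``default_args`` win over the DAG-level ones.

```python
def resolve_default_args(dag_default_args, operator_default_args=None):
    """Merge two default_args dicts; operator-level values take precedence.

    Hypothetical helper for illustration only, not part of Airflow.
    """
    merged = dict(dag_default_args or {})
    # Later updates overwrite earlier keys, so operator values win.
    merged.update(operator_default_args or {})
    return merged


dag_level = {"depends_on_past": True, "retries": 2}
operator_level = {"depends_on_past": False}

resolved = resolve_default_args(dag_level, operator_level)
print(resolved)
```

Keys that the operator does not override, such as ``retries`` here, fall through from the DAG-level dictionary unchanged.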
| .. attribute:: _comps |
| |
| |
| |
| |
| .. attribute:: __serialized_fields |
| :annotation: :Optional[FrozenSet[str]] |
| |
| |
| |
| .. attribute:: dag_id |
| |
| |
| |
| |
| .. attribute:: full_filepath |
| |
| |
| |
| |
| .. attribute:: concurrency |
| |
| |
| |
| |
| .. attribute:: access_control |
| |
| |
| |
| |
| .. attribute:: description |
| |
| |
| |
| |
| .. attribute:: description_unicode |
| |
| |
| |
| |
| .. attribute:: pickle_id |
| |
| |
| |
| |
| .. attribute:: tasks |
| |
| |
| |
| |
| .. attribute:: task_ids |
| |
| |
| |
| |
| .. attribute:: filepath |
| |
| |
| File location of where the dag object is instantiated |
| |
| |
| .. attribute:: folder |
| |
| |
| Folder location of where the DAG object is instantiated. |
| |
| |
| .. attribute:: owner |
| |
| |
| Return list of all owners found in DAG tasks. |
| |
| :return: Comma separated list of owners in DAG tasks |
| :rtype: str |
| |
| |
| .. attribute:: allow_future_exec_dates |
| |
| |
| |
| |
| .. attribute:: concurrency_reached |
| |
| |
| Returns a boolean indicating whether the concurrency limit for this DAG |
| has been reached |
| |
| |
| .. attribute:: is_paused |
| |
| |
| Returns a boolean indicating whether this DAG is paused |
| |
| |
| .. attribute:: normalized_schedule_interval |
| |
| |
| Returns Normalized Schedule Interval. This is used internally by the Scheduler to |
| schedule DAGs. |
| |
| 1. Converts a Cron Preset to a Cron Expression (e.g. ``@monthly`` to ``0 0 1 * *``) |
| 2. If the Schedule Interval is ``@once``, returns ``None`` |
| 3. If neither (1) nor (2) applies, returns schedule_interval unchanged |
| |
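The three normalization rules can be sketched as a small standalone function. This is an illustrative sketch only: the function name and the contents of the ``CRON_PRESETS`` table are assumptions (only the ``@monthly`` mapping is confirmed by the text above), and the real property may handle further cases.

```python
from datetime import timedelta

# Assumed preset table for illustration; only "@monthly" is taken from the docs.
CRON_PRESETS = {
    "@hourly": "0 * * * *",
    "@daily": "0 0 * * *",
    "@weekly": "0 0 * * 0",
    "@monthly": "0 0 1 * *",
    "@yearly": "0 0 1 1 *",
}


def normalize_schedule_interval(schedule_interval):
    """Apply the three documented rules in order (hypothetical helper)."""
    # Rule 1: translate a known cron preset into its cron expression.
    if isinstance(schedule_interval, str) and schedule_interval in CRON_PRESETS:
        return CRON_PRESETS[schedule_interval]
    # Rule 2: "@once" has no recurring schedule.
    if schedule_interval == "@once":
        return None
    # Rule 3: anything else (cron string, timedelta, None) is returned as-is.
    return schedule_interval


print(normalize_schedule_interval("@monthly"))
print(normalize_schedule_interval("@once"))
print(normalize_schedule_interval(timedelta(hours=6)))
```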
| |
| .. attribute:: latest_execution_date |
| |
| |
| Returns the latest date for which at least one dag run exists |
| |
| |
| .. attribute:: subdags |
| |
| |
| Returns a list of the subdag objects associated to this DAG |
| |
| |
| .. attribute:: roots |
| |
| |
| Return nodes with no parents. These are first to execute and are called roots or root nodes. |
| |
| |
| .. attribute:: leaves |
| |
| |
| Return nodes with no children. These are last to execute and are called leaves or leaf nodes. |
| |
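The definitions of ``roots`` and ``leaves`` can be made concrete with a plain adjacency mapping. The sketch below does not use Airflow objects: the graph is an assumed dictionary from each task id to the set of its downstream task ids, and ``find_roots_and_leaves`` is a hypothetical helper.

```python
def find_roots_and_leaves(downstream):
    """Return (roots, leaves) for a mapping task_id -> set of downstream ids."""
    # Any task that appears as someone's child has at least one parent.
    has_parent = set()
    for children in downstream.values():
        has_parent.update(children)
    roots = sorted(task for task in downstream if task not in has_parent)
    leaves = sorted(task for task, children in downstream.items() if not children)
    return roots, leaves


graph = {
    "extract": {"transform"},
    "transform": {"load", "report"},
    "load": set(),
    "report": set(),
}
roots, leaves = find_roots_and_leaves(graph)
print(roots, leaves)
```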
| |
| |
| .. method:: __repr__(self) |
| |
| |
| |
| |
| .. method:: __eq__(self, other) |
| |
| |
| |
| |
| .. method:: __ne__(self, other) |
| |
| |
| |
| |
| .. method:: __lt__(self, other) |
| |
| |
| |
| |
| .. method:: __hash__(self) |
| |
| |
| |
| |
| .. method:: __enter__(self) |
| |
| |
| |
| |
| .. method:: __exit__(self, _type, _value, _tb) |
| |
| |
| |
| |
| .. method:: get_default_view(self) |
| |
| This exists only for backward-compatible jinja2 templates |
| |
| |
| |
| |
| .. method:: date_range(self, start_date, num=None, end_date=timezone.utcnow()) |
| |
| |
| |
| |
| .. method:: is_fixed_time_schedule(self) |
| |
| Figures out if the DAG schedule has a fixed time (e.g. 3 AM). |
| |
| :return: True if the schedule has a fixed time, False if not. |
| |
| |
| |
| |
| .. method:: following_schedule(self, dttm) |
| |
| Calculates the following schedule for this dag in UTC. |
| |
| :param dttm: utc datetime |
| :return: utc datetime |
| |
| |
| |
| |
| .. method:: previous_schedule(self, dttm) |
| |
| Calculates the previous schedule for this dag in UTC |
| |
| :param dttm: utc datetime |
| :return: utc datetime |
| |
| |
| |
| |
| .. method:: get_run_dates(self, start_date, end_date=None) |
| |
| Returns a list of dates between the interval received as parameter using this |
| dag's schedule interval. Returned dates can be used for execution dates. |
| |
| :param start_date: the start date of the interval |
| :type start_date: datetime |
| :param end_date: the end date of the interval, defaults to timezone.utcnow() |
| :type end_date: datetime |
| :return: a list of dates within the interval following the dag's schedule |
| :rtype: list |
| |
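For a schedule expressed as a fixed ``timedelta``, the list of run dates can be pictured as repeated addition of the interval. The following is a simplified sketch under stated assumptions: ``run_dates`` is a hypothetical helper, it ignores cron expressions and timezones, it assumes ``start_date`` already lies on the schedule, and it treats ``end_date`` as inclusive.

```python
from datetime import datetime, timedelta


def run_dates(start_date, end_date, interval):
    """List datetimes from start_date to end_date (inclusive), stepping by interval."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    dates = []
    current = start_date
    while current <= end_date:
        dates.append(current)
        current += interval
    return dates


dates = run_dates(datetime(2020, 1, 1), datetime(2020, 1, 4), timedelta(days=1))
for d in dates:
    print(d.isoformat())
```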
| |
| |
| |
| .. method:: normalize_schedule(self, dttm) |
| |
| Returns dttm + interval unless dttm is the first interval, in which case it returns dttm |
| |
| |
| |
| |
| .. method:: get_last_dagrun(self, session=None, include_externally_triggered=False) |
| |
| |
| |
| |
| .. method:: _get_concurrency_reached(self, session=None) |
| |
| |
| |
| |
| .. method:: _get_is_paused(self, session=None) |
| |
| |
| |
| |
| .. method:: handle_callback(self, dagrun, success=True, reason=None, session=None) |
| |
| Triggers the appropriate callback depending on the value of success, namely the |
| on_failure_callback or on_success_callback. This method gets the context of a |
| single TaskInstance part of this DagRun and passes that to the callable along |
| with a 'reason', primarily to differentiate DagRun failures. |
| |
| .. note:: The logs end up in |
| ``$AIRFLOW_HOME/logs/scheduler/latest/PROJECT/DAG_FILE.py.log`` |
| |
| :param dagrun: DagRun object |
| :param success: Flag to specify if failure or success callback should be called |
| :param reason: Completion reason |
| :param session: Database session |
| |
| |
| |
| |
| .. method:: get_active_runs(self) |
| |
| Returns a list of dag run execution dates currently running |
| |
| :return: List of execution dates |
| |
| |
| |
| |
| .. method:: get_num_active_runs(self, external_trigger=None, session=None) |
| |
| Returns the number of active "running" dag runs |
| |
| :param external_trigger: True for externally triggered active dag runs |
| :type external_trigger: bool |
| :param session: |
| :return: number greater than 0 for active dag runs |
| |
| |
| |
| |
| .. method:: get_dagrun(self, execution_date, session=None) |
| |
| Returns the dag run for a given execution date if it exists, otherwise |
| None. |
| |
| :param execution_date: The execution date of the DagRun to find. |
| :param session: |
| :return: The DagRun if found, otherwise None. |
| |
| |
| |
| |
| .. method:: get_dagruns_between(self, start_date, end_date, session=None) |
| |
| Returns the list of dag runs between start_date (inclusive) and end_date (inclusive). |
| |
| :param start_date: The starting execution date of the DagRun to find. |
| :param end_date: The ending execution date of the DagRun to find. |
| :param session: |
| :return: The list of DagRuns found. |
| |
| |
| |
| |
| .. method:: _get_latest_execution_date(self, session=None) |
| |
| |
| |
| |
| .. method:: resolve_template_files(self) |
| |
| |
| |
| |
| .. method:: get_template_env(self) |
| |
| Build a Jinja2 environment. |
| |
| |
| |
| |
| .. method:: set_dependency(self, upstream_task_id, downstream_task_id) |
| |
| Simple utility method to set dependency between two tasks that |
| already have been added to the DAG using add_task() |
| |
| |
| |
| |
| .. method:: get_task_instances(self, start_date=None, end_date=None, state=None, session=None) |
| |
| |
| |
| |
| .. method:: topological_sort(self) |
| |
| Sorts tasks in topological order, such that a task comes after any of its |
| upstream dependencies. |
| |
| Heavily inspired by: |
| http://blog.jupo.org/2012/04/06/topological-sorting-acyclic-directed-graphs/ |
| |
| :return: list of tasks in topological order |
| |
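The ordering guarantee can be illustrated with a small standalone sort over task ids. This sketch is not the method's actual implementation: it assumes the graph is given as a dictionary mapping every task id to the set of its upstream task ids, and it repeatedly emits the tasks whose upstream dependencies have all been emitted already.

```python
def topological_sort(upstream):
    """Order task ids so that each one comes after all of its upstream ids.

    Assumes every task id, including those without dependencies, is a key.
    """
    remaining = {task: set(deps) for task, deps in upstream.items()}
    ordered = []
    while remaining:
        # Tasks with no unresolved upstream dependencies are ready to emit.
        ready = sorted(task for task, deps in remaining.items() if not deps)
        if not ready:
            raise ValueError("graph contains a cycle")
        for task in ready:
            ordered.append(task)
            del remaining[task]
        # Emitted tasks no longer block the tasks that depend on them.
        for deps in remaining.values():
            deps.difference_update(ready)
    return ordered


graph = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"transform"},
}
order = topological_sort(graph)
print(order)
```

Sorting the ready set alphabetically is a choice made here only to keep the output deterministic; any order among tasks that are ready at the same time is equally valid.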
| |
| |
| |
| .. method:: set_dag_runs_state(self, state=State.RUNNING, session=None, start_date=None, end_date=None) |
| |
| |
| |
| |
| .. method:: clear(self, start_date=None, end_date=None, only_failed=False, only_running=False, confirm_prompt=False, include_subdags=True, include_parentdag=True, reset_dag_runs=True, dry_run=False, session=None, get_tis=False, recursion_depth=0, max_recursion_depth=None, dag_bag=None) |
| |
| Clears a set of task instances associated with the current dag for |
| a specified date range. |
| |
| :param start_date: The minimum execution_date to clear |
| :type start_date: datetime.datetime or None |
| :param end_date: The maximum execution_date to clear |
| :type end_date: datetime.datetime or None |
| :param only_failed: Only clear failed tasks |
| :type only_failed: bool |
| :param only_running: Only clear running tasks. |
| :type only_running: bool |
| :param confirm_prompt: Ask for confirmation |
| :type confirm_prompt: bool |
| :param include_subdags: Clear tasks in subdags and clear external tasks |
| indicated by ExternalTaskMarker |
| :type include_subdags: bool |
| :param include_parentdag: Clear tasks in the parent dag of the subdag. |
| :type include_parentdag: bool |
| :param reset_dag_runs: Set state of dag to RUNNING |
| :type reset_dag_runs: bool |
| :param dry_run: Find the tasks to clear but don't clear them. |
| :type dry_run: bool |
| :param session: The sqlalchemy session to use |
| :type session: sqlalchemy.orm.session.Session |
| :param get_tis: Return the sqlalchemy query for finding the TaskInstance without clearing the tasks |
| :type get_tis: bool |
| :param recursion_depth: The recursion depth of nested calls to DAG.clear(). |
| :type recursion_depth: int |
| :param max_recursion_depth: The maximum recursion depth allowed. This is determined by the |
| first encountered ExternalTaskMarker. Default is None indicating no ExternalTaskMarker |
| has been encountered. |
| :type max_recursion_depth: int |
| :param dag_bag: The DagBag used to find the dags |
| :type dag_bag: airflow.models.dagbag.DagBag |
| |
| |
| |
| |
| .. classmethod:: clear_dags(cls, dags, start_date=None, end_date=None, only_failed=False, only_running=False, confirm_prompt=False, include_subdags=True, include_parentdag=False, reset_dag_runs=True, dry_run=False) |
| |
| |
| |
| |
| .. method:: __deepcopy__(self, memo) |
| |
| |
| |
| |
| .. method:: sub_dag(self, task_regex, include_downstream=False, include_upstream=True) |
| |
| Returns a subset of the current dag as a deep copy of the current dag |
| based on a regex that should match one or many tasks, and includes |
| upstream and downstream neighbours based on the flag passed. |
| |
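The selection logic behind ``sub_dag`` can be sketched on task ids alone. This is a hedged illustration, not the method itself: ``select_task_ids`` is a hypothetical helper, the graph is an assumed dictionary from task id to the set of upstream ids, ``re.search`` is used for matching, and relatives are followed transitively, which may differ from the real behaviour.

```python
import re


def select_task_ids(upstream, task_regex, include_upstream=True, include_downstream=False):
    """Pick task ids matching task_regex, optionally adding their relatives."""
    # Build the reverse mapping so downstream relatives can be walked too.
    downstream = {task: set() for task in upstream}
    for task, parents in upstream.items():
        for parent in parents:
            downstream[parent].add(task)

    def relatives(start, edges):
        # Iterative graph walk collecting everything reachable from start.
        seen = set()
        stack = list(edges[start])
        while stack:
            task = stack.pop()
            if task not in seen:
                seen.add(task)
                stack.extend(edges[task])
        return seen

    matched = {task for task in upstream if re.search(task_regex, task)}
    selected = set(matched)
    for task in matched:
        if include_upstream:
            selected |= relatives(task, upstream)
        if include_downstream:
            selected |= relatives(task, downstream)
    return selected


graph = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"transform"},
}
with_upstream = select_task_ids(graph, r"^trans")
with_downstream = select_task_ids(
    graph, r"^trans", include_upstream=False, include_downstream=True
)
print(sorted(with_upstream), sorted(with_downstream))
```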
| |
| |
| |
| .. method:: has_task(self, task_id) |
| |
| |
| |
| |
| .. method:: get_task(self, task_id) |
| |
| |
| |
| |
| .. method:: pickle_info(self) |
| |
| |
| |
| |
| .. method:: pickle(self, session=None) |
| |
| |
| |
| |
| .. method:: tree_view(self) |
| |
| Print an ASCII tree representation of the DAG. |
| |
| |
| |
| |
| .. method:: add_task(self, task) |
| |
| Add a task to the DAG |
| |
| :param task: the task you want to add |
| :type task: task |
| |
| |
| |
| |
| .. method:: add_tasks(self, tasks) |
| |
| Add a list of tasks to the DAG |
| |
| :param tasks: a list of tasks you want to add |
| :type tasks: list of tasks |
| |
| |
| |
| |
| .. method:: run(self, start_date=None, end_date=None, mark_success=False, local=False, executor=None, donot_pickle=conf.getboolean('core', 'donot_pickle'), ignore_task_deps=False, ignore_first_depends_on_past=False, pool=None, delay_on_limit_secs=1.0, verbose=False, conf=None, rerun_failed_tasks=False, run_backwards=False) |
| |
| Runs the DAG. |
| |
| :param start_date: the start date of the range to run |
| :type start_date: datetime.datetime |
| :param end_date: the end date of the range to run |
| :type end_date: datetime.datetime |
| :param mark_success: True to mark jobs as succeeded without running them |
| :type mark_success: bool |
| :param local: True to run the tasks using the LocalExecutor |
| :type local: bool |
| :param executor: The executor instance to run the tasks |
| :type executor: airflow.executor.BaseExecutor |
| :param donot_pickle: True to avoid pickling the DAG object and sending it to workers |
| :type donot_pickle: bool |
| :param ignore_task_deps: True to skip upstream tasks |
| :type ignore_task_deps: bool |
| :param ignore_first_depends_on_past: True to ignore depends_on_past |
| dependencies for the first set of tasks only |
| :type ignore_first_depends_on_past: bool |
| :param pool: Resource pool to use |
| :type pool: str |
| :param delay_on_limit_secs: Time in seconds to wait before next attempt to run |
| dag run when max_active_runs limit has been reached |
| :type delay_on_limit_secs: float |
| :param verbose: Make logging output more verbose |
| :type verbose: bool |
| :param conf: user defined dictionary passed from CLI |
| :type conf: dict |
| :param rerun_failed_tasks: |
| :type rerun_failed_tasks: bool |
| :param run_backwards: |
| :type run_backwards: bool |
| |
| |
| |
| |
| .. method:: cli(self) |
| |
| Exposes a CLI specific to this DAG |
| |
| |
| |
| |
| .. method:: create_dagrun(self, run_id, state, execution_date=None, start_date=None, external_trigger=False, conf=None, session=None) |
| |
| Creates a dag run from this dag including the tasks associated with this dag. |
| Returns the dag run. |
| |
| :param run_id: defines the run id for this dag run |
| :type run_id: str |
| :param execution_date: the execution date of this dag run |
| :type execution_date: datetime.datetime |
| :param state: the state of the dag run |
| :type state: airflow.utils.state.State |
| :param start_date: the date this dag run should be evaluated |
| :type start_date: datetime |
| :param external_trigger: whether this dag run is externally triggered |
| :type external_trigger: bool |
| :param conf: Dict containing configuration/parameters to pass to the DAG |
| :type conf: dict |
| :param session: database session |
| :type session: sqlalchemy.orm.session.Session |
| |
| |
| |
| |
| .. method:: sync_to_db(self, owner=None, sync_time=None, session=None) |
| |
| Save attributes about this DAG to the DB. Note that this method |
| can be called for both DAGs and SubDAGs. A SubDag is actually a |
| SubDagOperator. |
| |
| :param dag: the DAG object to save to the DB |
| :type dag: airflow.models.DAG |
| :param sync_time: The time that the DAG should be marked as sync'ed |
| :type sync_time: datetime |
| :return: None |
| |
| |
| |
| |
| .. method:: get_dagtags(self, session=None) |
| |
| Creates a list of DagTags, inserting any that are missing from the DB. |
| |
| :return: The DagTag list. |
| :rtype: list |
| |
| |
| |
| |
| .. staticmethod:: deactivate_unknown_dags(active_dag_ids, session=None) |
| |
| Given a list of known DAGs, deactivate any other DAGs that are |
| marked as active in the ORM |
| |
| :param active_dag_ids: list of DAG IDs that are active |
| :type active_dag_ids: list[unicode] |
| :return: None |
| |
| |
| |
| |
| .. staticmethod:: deactivate_stale_dags(expiration_date, session=None) |
| |
| Deactivate any DAGs that were last touched by the scheduler before |
| the expiration date. These DAGs were likely deleted. |
| |
| :param expiration_date: set inactive DAGs that were touched before this |
| time |
| :type expiration_date: datetime |
| :return: None |
| |
| |
| |
| |
| .. staticmethod:: get_num_task_instances(dag_id, task_ids=None, states=None, session=None) |
| |
| Returns the number of task instances in the given DAG. |
| |
| :param session: ORM session |
| :param dag_id: ID of the DAG to get the task concurrency of |
| :type dag_id: unicode |
| :param task_ids: A list of valid task IDs for the given DAG |
| :type task_ids: list[unicode] |
| :param states: A list of states to filter by if supplied |
| :type states: list[state] |
| :return: The number of running tasks |
| :rtype: int |
| |
| |
| |
| |
| .. method:: test_cycle(self) |
| |
| Check to see if there are any cycles in the DAG. Returns False if no cycle found, |
| otherwise raises exception. |
| |
| |
| |
| |
| .. method:: _test_cycle_helper(self, visit_map, task_id) |
| |
| Checks if a cycle exists from the input task using DFS traversal |
| |
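A depth-first cycle check driven by a visit map can be sketched as follows. This is a minimal illustration and not the code behind ``test_cycle``: the graph is an assumed dictionary from task id to a list of downstream ids, ``CycleError`` is a hypothetical exception type, and the return value of ``False`` for an acyclic graph simply mirrors the contract described above.

```python
class CycleError(Exception):
    """Hypothetical exception type used only by this sketch."""


NOT_VISITED, IN_PROGRESS, DONE = 0, 1, 2


def check_cycle(downstream):
    """Return False if the graph is acyclic, otherwise raise CycleError."""
    visit_map = {task_id: NOT_VISITED for task_id in downstream}

    def visit(task_id):
        if visit_map[task_id] == DONE:
            return
        # Reaching a task that is still being explored means a back edge.
        if visit_map[task_id] == IN_PROGRESS:
            raise CycleError("cycle detected at task %r" % task_id)
        visit_map[task_id] = IN_PROGRESS
        for child_id in downstream[task_id]:
            visit(child_id)
        visit_map[task_id] = DONE

    for task_id in downstream:
        visit(task_id)
    return False


acyclic = {"a": ["b", "c"], "b": ["c"], "c": []}
cyclic = {"a": ["b"], "b": ["c"], "c": ["a"]}
print(check_cycle(acyclic))
```

The recursion keeps the sketch short; a very deep graph would need an explicit stack to stay within Python's recursion limit.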
| |
| |
| |
| .. classmethod:: get_serialized_fields(cls) |
| |
| Stringified DAGs and operators contain exactly these fields. |
| |
| |
| |
| |
| .. py:class:: DagTag |
| |
| Bases: :class:`airflow.models.base.Base` |
| |
| A tag name per dag, to allow quick filtering in the DAG view. |
| |
| .. attribute:: __tablename__ |
| :annotation: = dag_tag |
| |
| |
| |
| .. attribute:: name |
| |
| |
| |
| |
| .. attribute:: dag_id |
| |
| |
| |
| |
| |
| .. method:: __repr__(self) |
| |
| |
| |
| |
| .. py:class:: DagModel |
| |
| Bases: :class:`airflow.models.base.Base` |
| |
| .. attribute:: __tablename__ |
| :annotation: = dag |
| |
| These items are stored in the database for state related information |
| |
| |
| .. attribute:: dag_id |
| |
| |
| |
| |
| .. attribute:: root_dag_id |
| |
| |
| |
| |
| .. attribute:: is_paused_at_creation |
| |
| |
| |
| |
| .. attribute:: is_paused |
| |
| |
| |
| |
| .. attribute:: is_subdag |
| |
| |
| |
| |
| .. attribute:: is_active |
| |
| |
| |
| |
| .. attribute:: last_scheduler_run |
| |
| |
| |
| |
| .. attribute:: last_pickled |
| |
| |
| |
| |
| .. attribute:: last_expired |
| |
| |
| |
| |
| .. attribute:: scheduler_lock |
| |
| |
| |
| |
| .. attribute:: pickle_id |
| |
| |
| |
| |
| .. attribute:: fileloc |
| |
| |
| |
| |
| .. attribute:: owners |
| |
| |
| |
| |
| .. attribute:: description |
| |
| |
| |
| |
| .. attribute:: default_view |
| |
| |
| |
| |
| .. attribute:: schedule_interval |
| |
| |
| |
| |
| .. attribute:: tags |
| |
| |
| |
| |
| .. attribute:: __table_args__ |
| |
| |
| |
| |
| .. attribute:: timezone |
| |
| |
| |
| |
| .. attribute:: safe_dag_id |
| |
| |
| |
| |
| |
| .. method:: __repr__(self) |
| |
| |
| |
| |
| .. staticmethod:: get_dagmodel(dag_id, session=None) |
| |
| |
| |
| |
| .. classmethod:: get_current(cls, dag_id, session=None) |
| |
| |
| |
| |
| .. method:: get_default_view(self) |
| |
| |
| |
| |
| .. method:: get_last_dagrun(self, session=None, include_externally_triggered=False) |
| |
| |
| |
| |
| .. staticmethod:: get_paused_dag_ids(dag_ids, session) |
| |
| Given a list of dag_ids, get a set of Paused Dag Ids |
| |
| :param dag_ids: List of Dag ids |
| :param session: ORM Session |
| :return: Paused Dag_ids |
| |
| |
| |
| |
| .. method:: get_dag(self, store_serialized_dags=False) |
| |
| Creates a dagbag to load and return a DAG. |
| Calling it from the UI should set store_serialized_dags = STORE_SERIALIZED_DAGS. |
| There may be a delay before the scheduler writes the serialized DAG into the database; |
| in that case the DAG is loaded from file. |
| FIXME: remove this once the webserver no longer accesses the DAG folder. |
| |
| |
| |
| |
| .. method:: create_dagrun(self, run_id, state, execution_date, start_date=None, external_trigger=False, conf=None, session=None) |
| |
| Creates a dag run from this dag including the tasks associated with this dag. |
| Returns the dag run. |
| |
| :param run_id: defines the run id for this dag run |
| :type run_id: str |
| :param execution_date: the execution date of this dag run |
| :type execution_date: datetime.datetime |
| :param state: the state of the dag run |
| :type state: airflow.utils.state.State |
| :param start_date: the date this dag run should be evaluated |
| :type start_date: datetime.datetime |
| :param external_trigger: whether this dag run is externally triggered |
| :type external_trigger: bool |
| :param session: database session |
| :type session: sqlalchemy.orm.session.Session |
| |
| |
| |
| |
| .. method:: set_is_paused(self, is_paused, including_subdags=True, store_serialized_dags=False, session=None) |
| |
| Pause/Un-pause a DAG. |
| |
| :param is_paused: Is the DAG paused |
| :param including_subdags: whether to include the DAG's subdags |
| :param store_serialized_dags: whether to serialize DAGs and store them in the DB |
| :param session: session |
| |
| |
| |
| |
| .. classmethod:: deactivate_deleted_dags(cls, alive_dag_filelocs, session=None) |
| |
| Set ``is_active=False`` on the DAGs for which the DAG files have been removed. |
| Additionally change ``is_active=False`` to ``True`` if the DAG file exists. |
| |
| :param alive_dag_filelocs: file paths of alive DAGs |
| :param session: ORM Session |
| |
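The effect of ``deactivate_deleted_dags`` can be pictured without a database. The sketch below is an assumption-laden stand-in: plain dictionaries take the place of ``DagModel`` rows, ``refresh_active_flags`` is a hypothetical helper, and the real method operates through an ORM session instead.

```python
def refresh_active_flags(dag_rows, alive_dag_filelocs):
    """Set is_active according to whether each row's file is still present."""
    alive = set(alive_dag_filelocs)
    for row in dag_rows:
        # Removed files deactivate the DAG; existing files (re)activate it.
        row["is_active"] = row["fileloc"] in alive
    return dag_rows


rows = [
    {"dag_id": "a", "fileloc": "/dags/a.py", "is_active": True},
    {"dag_id": "b", "fileloc": "/dags/b.py", "is_active": False},
    {"dag_id": "c", "fileloc": "/dags/c.py", "is_active": True},
]
refresh_active_flags(rows, ["/dags/a.py", "/dags/b.py"])
print({row["dag_id"]: row["is_active"] for row in rows})
```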
| |
| |
| |