:py:mod:`airflow.providers.databricks.operators.databricks_repos`
==================================================================

.. py:module:: airflow.providers.databricks.operators.databricks_repos

.. autoapi-nested-parse::

   This module contains Databricks operators.
Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   airflow.providers.databricks.operators.databricks_repos.DatabricksReposCreateOperator
   airflow.providers.databricks.operators.databricks_repos.DatabricksReposUpdateOperator
   airflow.providers.databricks.operators.databricks_repos.DatabricksReposDeleteOperator
.. py:class:: DatabricksReposCreateOperator(*, git_url, git_provider = None, branch = None, tag = None, repo_path = None, ignore_existing_repo = False, databricks_conn_id = 'databricks_default', databricks_retry_limit = 3, databricks_retry_delay = 1, **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   Creates a Databricks Repo using the
   `POST api/2.0/repos <https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/create-repo>`_
   API endpoint and optionally checks it out to a specific branch or tag.

   :param git_url: Required HTTPS URL of a Git repository.
   :param git_provider: Optional name of the Git provider. Must be provided if it cannot be guessed from the URL.
   :param repo_path: Optional path for the repository. Must be in the format ``/Repos/{folder}/{repo-name}``.
       If not specified, the repository is created in the user's directory.
   :param branch: Optional name of the branch to check out.
   :param tag: Optional name of the tag to check out.
   :param ignore_existing_repo: Do not raise an exception if a repository with the given path already exists.
   :param databricks_conn_id: Reference to the :ref:`Databricks connection <howto/connection:databricks>`.
       By default, and in the common case, this will be ``databricks_default``. To use
       token based authentication, provide the key ``token`` in the extra field for the
       connection, add the key ``host`` to the extra field, and leave the connection's ``host`` field empty. (templated)
   :param databricks_retry_limit: Number of times to retry if the Databricks backend is
       unreachable. Its value must be greater than or equal to 1.
   :param databricks_retry_delay: Number of seconds to wait between retries (it
       might be a floating point number).
   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['repo_path', 'tag', 'branch', 'databricks_conn_id']

   .. py:attribute:: __git_providers__

   .. py:attribute:: __aws_code_commit_regexp__

   .. py:attribute:: __repos_path_regexp__

   .. py:method:: __detect_repo_provider__(url)
      :staticmethod:

   .. py:method:: execute(context)

      Creates a Databricks Repo.

      :param context: context
      :return: Repo ID
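
   A minimal usage sketch (not part of the upstream docstring): the DAG id, Git URL, and ``repo_path`` below are hypothetical placeholders, and the Databricks connection is assumed to be configured as ``databricks_default``.

   .. code-block:: python

      from datetime import datetime

      from airflow import DAG
      from airflow.providers.databricks.operators.databricks_repos import (
          DatabricksReposCreateOperator,
      )

      with DAG(
          dag_id="example_databricks_repos_create",  # hypothetical DAG id
          start_date=datetime(2022, 1, 1),
          schedule_interval=None,
          catchup=False,
      ):
          # Clone the repository into the workspace and check out the "main" branch.
          create_repo = DatabricksReposCreateOperator(
              task_id="create_repo",
              git_url="https://github.com/example/example-repo.git",  # placeholder URL
              repo_path="/Repos/user@example.com/example-repo",  # placeholder path
              branch="main",
              ignore_existing_repo=True,
          )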
.. py:class:: DatabricksReposUpdateOperator(*, branch = None, tag = None, repo_id = None, repo_path = None, databricks_conn_id = 'databricks_default', databricks_retry_limit = 3, databricks_retry_delay = 1, **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   Updates the specified repository to a given branch or tag using the
   `PATCH api/2.0/repos <https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/update-repo>`_
   API endpoint.

   :param branch: Optional name of the branch to update to. Should be specified if ``tag`` is omitted.
   :param tag: Optional name of the tag to update to. Should be specified if ``branch`` is omitted.
   :param repo_id: Optional ID of an existing repository. Should be specified if ``repo_path`` is omitted.
   :param repo_path: Optional path of an existing repository. Should be specified if ``repo_id`` is omitted.
   :param databricks_conn_id: Reference to the :ref:`Databricks connection <howto/connection:databricks>`.
       By default, and in the common case, this will be ``databricks_default``. To use
       token based authentication, provide the key ``token`` in the extra field for the
       connection, add the key ``host`` to the extra field, and leave the connection's ``host`` field empty. (templated)
   :param databricks_retry_limit: Number of times to retry if the Databricks backend is
       unreachable. Its value must be greater than or equal to 1.
   :param databricks_retry_delay: Number of seconds to wait between retries (it
       might be a floating point number).
   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['repo_path', 'tag', 'branch', 'databricks_conn_id']

   .. py:method:: execute(context)

      This is the main method to derive when creating an operator.

      Context is the same dictionary used as when rendering jinja templates.

      Refer to get_template_context for more context.
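
   A minimal usage sketch (not part of the upstream docstring): the DAG id, ``repo_path``, and tag name are hypothetical placeholders; exactly one of ``branch`` or ``tag`` should be passed.

   .. code-block:: python

      from datetime import datetime

      from airflow import DAG
      from airflow.providers.databricks.operators.databricks_repos import (
          DatabricksReposUpdateOperator,
      )

      with DAG(
          dag_id="example_databricks_repos_update",  # hypothetical DAG id
          start_date=datetime(2022, 1, 1),
          schedule_interval=None,
          catchup=False,
      ):
          # Check out a release tag in an existing repo identified by its workspace path.
          update_repo = DatabricksReposUpdateOperator(
              task_id="update_repo",
              repo_path="/Repos/user@example.com/example-repo",  # placeholder path
              tag="v1.0.0",  # placeholder tag; pass branch instead to track a branch
          )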
.. py:class:: DatabricksReposDeleteOperator(*, repo_id = None, repo_path = None, databricks_conn_id = 'databricks_default', databricks_retry_limit = 3, databricks_retry_delay = 1, **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   Deletes the specified repository using the
   `DELETE api/2.0/repos <https://docs.databricks.com/dev-tools/api/latest/repos.html#operation/delete-repo>`_
   API endpoint.

   :param repo_id: Optional ID of an existing repository. Should be specified if ``repo_path`` is omitted.
   :param repo_path: Optional path of an existing repository. Should be specified if ``repo_id`` is omitted.
   :param databricks_conn_id: Reference to the :ref:`Databricks connection <howto/connection:databricks>`.
       By default, and in the common case, this will be ``databricks_default``. To use
       token based authentication, provide the key ``token`` in the extra field for the
       connection, add the key ``host`` to the extra field, and leave the connection's ``host`` field empty. (templated)
   :param databricks_retry_limit: Number of times to retry if the Databricks backend is
       unreachable. Its value must be greater than or equal to 1.
   :param databricks_retry_delay: Number of seconds to wait between retries (it
       might be a floating point number).
   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['repo_path', 'databricks_conn_id']

   .. py:method:: execute(context)

      This is the main method to derive when creating an operator.

      Context is the same dictionary used as when rendering jinja templates.

      Refer to get_template_context for more context.
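
   A minimal usage sketch (not part of the upstream docstring): the DAG id and ``repo_path`` are hypothetical placeholders; either ``repo_id`` or ``repo_path`` can be used to identify the repository.

   .. code-block:: python

      from datetime import datetime

      from airflow import DAG
      from airflow.providers.databricks.operators.databricks_repos import (
          DatabricksReposDeleteOperator,
      )

      with DAG(
          dag_id="example_databricks_repos_delete",  # hypothetical DAG id
          start_date=datetime(2022, 1, 1),
          schedule_interval=None,
          catchup=False,
      ):
          # Remove the repo from the workspace, identified here by its path.
          delete_repo = DatabricksReposDeleteOperator(
              task_id="delete_repo",
              repo_path="/Repos/user@example.com/example-repo",  # placeholder path
          )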