blob: ea148ab74c8e1c98bf8eeaf3f546ce95945d6338 [file] [log] [blame]
:py:mod:`airflow.providers.google.cloud.operators.dataproc_metastore`
=====================================================================
.. py:module:: airflow.providers.google.cloud.operators.dataproc_metastore
.. autoapi-nested-parse::
This module contains Google Dataproc Metastore operators.
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreLink
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreDetailedLink
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreCreateBackupOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreCreateMetadataImportOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreCreateServiceOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreDeleteBackupOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreDeleteServiceOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreExportMetadataOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreGetServiceOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreListBackupsOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreRestoreServiceOperator
airflow.providers.google.cloud.operators.dataproc_metastore.DataprocMetastoreUpdateServiceOperator
Attributes
~~~~~~~~~~
.. autoapisummary::
airflow.providers.google.cloud.operators.dataproc_metastore.BASE_LINK
airflow.providers.google.cloud.operators.dataproc_metastore.METASTORE_BASE_LINK
airflow.providers.google.cloud.operators.dataproc_metastore.METASTORE_BACKUP_LINK
airflow.providers.google.cloud.operators.dataproc_metastore.METASTORE_BACKUPS_LINK
airflow.providers.google.cloud.operators.dataproc_metastore.METASTORE_EXPORT_LINK
airflow.providers.google.cloud.operators.dataproc_metastore.METASTORE_IMPORT_LINK
airflow.providers.google.cloud.operators.dataproc_metastore.METASTORE_SERVICE_LINK
.. py:data:: BASE_LINK
:annotation: = https://console.cloud.google.com
.. py:data:: METASTORE_BASE_LINK
.. py:data:: METASTORE_BACKUP_LINK
.. py:data:: METASTORE_BACKUPS_LINK
.. py:data:: METASTORE_EXPORT_LINK
.. py:data:: METASTORE_IMPORT_LINK
.. py:data:: METASTORE_SERVICE_LINK
.. py:class:: DataprocMetastoreLink
Bases: :py:obj:`airflow.models.BaseOperatorLink`
Helper class for constructing Dataproc Metastore resource link
.. py:attribute:: name
:annotation: = Dataproc Metastore
.. py:attribute:: key
:annotation: = conf
.. py:method:: persist(context, task_instance, url)
:staticmethod:
.. py:method:: get_link(operator, dttm = None, ti_key = None)
Link to external system.
Note: The old signature of this function was ``(self, operator, dttm: datetime)``. That is still
supported at runtime but is deprecated.
:param operator: The Airflow operator object this link is associated to.
:param ti_key: TaskInstance ID to return link for.
:return: link to external system
.. py:class:: DataprocMetastoreDetailedLink
Bases: :py:obj:`airflow.models.BaseOperatorLink`
Helper class for constructing Dataproc Metastore detailed resource link
.. py:attribute:: name
:annotation: = Dataproc Metastore resource
.. py:attribute:: key
:annotation: = config
.. py:method:: persist(context, task_instance, url, resource)
:staticmethod:
.. py:method:: get_link(operator, dttm = None, ti_key = None)
Link to external system.
Note: The old signature of this function was ``(self, operator, dttm: datetime)``. That is still
supported at runtime but is deprecated.
:param operator: The Airflow operator object this link is associated to.
:param ti_key: TaskInstance ID to return link for.
:return: link to external system
.. py:class:: DataprocMetastoreCreateBackupOperator(*, project_id, region, service_id, backup, backup_id, request_id = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Creates a new backup in a given project and location.
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param region: Required. The ID of the Google Cloud region that the service belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param backup: Required. The backup to create. The ``name`` field is ignored. The ID of the created
backup must be provided in the request's ``backup_id`` field.
This corresponds to the ``backup`` field on the ``request`` instance; if ``request`` is provided, this
should not be set.
:param backup_id: Required. The ID of the backup, which is used as the final component of the backup's
name. This value must be between 1 and 64 characters long, begin with a letter, end with a letter or
number, and consist of alphanumeric ASCII characters or hyphens.
This corresponds to the ``backup_id`` field on the ``request`` instance; if ``request`` is provided,
this should not be set.
:param request_id: Optional. A unique id used to identify the request.
:param retry: Optional. Designation of what errors, if any, should be retried.
:param timeout: Optional. The timeout for this request.
:param metadata: Optional. Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'backup', 'impersonation_chain']
.. py:attribute:: template_fields_renderers
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreCreateMetadataImportOperator(*, project_id, region, service_id, metadata_import, metadata_import_id, request_id = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Creates a new MetadataImport in a given project and location.
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param region: Required. The ID of the Google Cloud region that the service belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param metadata_import: Required. The metadata import to create. The ``name`` field is ignored. The ID of
the created metadata import must be provided in the request's ``metadata_import_id`` field.
This corresponds to the ``metadata_import`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param metadata_import_id: Required. The ID of the metadata import, which is used as the final component
of the metadata import's name. This value must be between 1 and 64 characters long, begin with a
letter, end with a letter or number, and consist of alphanumeric ASCII characters or hyphens.
This corresponds to the ``metadata_import_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param request_id: Optional. A unique id used to identify the request.
:param retry: Optional. Designation of what errors, if any, should be retried.
:param timeout: Optional. The timeout for this request.
:param metadata: Optional. Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'metadata_import', 'impersonation_chain']
.. py:attribute:: template_fields_renderers
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreCreateServiceOperator(*, region, project_id, service, service_id, request_id = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Creates a metastore service in a project and location.
:param region: Required. The ID of the Google Cloud region that the service belongs to.
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param service: Required. The Metastore service to create. The ``name`` field is ignored. The ID of
the created metastore service must be provided in the request's ``service_id`` field.
This corresponds to the ``service`` field on the ``request`` instance; if ``request`` is provided,
this should not be set.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param request_id: Optional. A unique id used to identify the request.
:param retry: Designation of what errors, if any, should be retried.
:param timeout: The timeout for this request.
:param metadata: Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'service', 'impersonation_chain']
.. py:attribute:: template_fields_renderers
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreDeleteBackupOperator(*, project_id, region, service_id, backup_id, request_id = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Deletes a single backup.
:param project_id: Required. The ID of the Google Cloud project that the backup belongs to.
:param region: Required. The ID of the Google Cloud region that the backup belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param backup_id: Required. The ID of the backup, which is used as the final component of the backup's
name. This value must be between 1 and 64 characters long, begin with a letter, end with a letter or
number, and consist of alphanumeric ASCII characters or hyphens.
This corresponds to the ``backup_id`` field on the ``request`` instance; if ``request`` is provided,
this should not be set.
:param request_id: Optional. A unique id used to identify the request.
:param retry: Optional. Designation of what errors, if any, should be retried.
:param timeout: Optional. The timeout for this request.
:param metadata: Optional. Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'impersonation_chain']
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreDeleteServiceOperator(*, region, project_id, service_id, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Deletes a single service.
:param request: The request object. Request message for
[DataprocMetastore.DeleteService][google.cloud.metastore.v1.DataprocMetastore.DeleteService].
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param retry: Designation of what errors, if any, should be retried.
:param timeout: The timeout for this request.
:param metadata: Strings which should be sent along with the request as metadata.
:param gcp_conn_id:
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'impersonation_chain']
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreExportMetadataOperator(*, destination_gcs_folder, project_id, region, service_id, request_id = None, database_dump_type = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Exports metadata from a service.
:param destination_gcs_folder: A Cloud Storage URI of a folder, in the format
``gs://<bucket_name>/<path_inside_bucket>``. A sub-folder
``<export_folder>`` containing exported files will be
created below it.
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param region: Required. The ID of the Google Cloud region that the service belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param request_id: Optional. A unique id used to identify the request.
:param retry: Optional. Designation of what errors, if any, should be retried.
:param timeout: Optional. The timeout for this request.
:param metadata: Optional. Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'impersonation_chain']
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreGetServiceOperator(*, region, project_id, service_id, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Gets the details of a single service.
:param region: Required. The ID of the Google Cloud region that the service belongs to.
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param retry: Designation of what errors, if any, should be retried.
:param timeout: The timeout for this request.
:param metadata: Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'impersonation_chain']
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreListBackupsOperator(*, project_id, region, service_id, page_size = None, page_token = None, filter = None, order_by = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Lists backups in a service.
:param project_id: Required. The ID of the Google Cloud project that the backup belongs to.
:param region: Required. The ID of the Google Cloud region that the backup belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param retry: Optional. Designation of what errors, if any, should be retried.
:param timeout: Optional. The timeout for this request.
:param metadata: Optional. Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'impersonation_chain']
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreRestoreServiceOperator(*, project_id, region, service_id, backup_project_id, backup_region, backup_service_id, backup_id, restore_type = None, request_id = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Restores a service from a backup.
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param region: Required. The ID of the Google Cloud region that the service belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param backup_project_id: Required. The ID of the Google Cloud project that the metastore
service backup to restore from.
:param backup_region: Required. The ID of the Google Cloud region that the metastore
service backup to restore from.
:param backup_service_id: Required. The ID of the metastore service backup to restore from, which is
used as the final component of the metastore service's name. This value must be between 2 and 63
characters long inclusive, begin with a letter, end with a letter or number, and consist
of alphanumeric ASCII characters or hyphens.
:param backup_id: Required. The ID of the metastore service backup to restore from
:param restore_type: Optional. The type of restore. If unspecified, defaults to
``METADATA_ONLY``
:param request_id: Optional. A unique id used to identify the request.
:param retry: Optional. Designation of what errors, if any, should be retried.
:param timeout: Optional. The timeout for this request.
:param metadata: Optional. Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'impersonation_chain']
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: DataprocMetastoreUpdateServiceOperator(*, project_id, region, service_id, service, update_mask, request_id = None, retry = DEFAULT, timeout = None, metadata = (), gcp_conn_id = 'google_cloud_default', impersonation_chain = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Updates the parameters of a single service.
:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
:param region: Required. The ID of the Google Cloud region that the service belongs to.
:param service_id: Required. The ID of the metastore service, which is used as the final component of
the metastore service's name. This value must be between 2 and 63 characters long inclusive, begin
with a letter, end with a letter or number, and consist of alphanumeric ASCII characters or
hyphens.
This corresponds to the ``service_id`` field on the ``request`` instance; if ``request`` is
provided, this should not be set.
:param service: Required. The metastore service to update. The server only merges fields in the service
if they are specified in ``update_mask``.
The metastore service's ``name`` field is used to identify the metastore service to be updated.
This corresponds to the ``service`` field on the ``request`` instance; if ``request`` is provided,
this should not be set.
:param update_mask: Required. A field mask used to specify the fields to be overwritten in the metastore
service resource by the update. Fields specified in the ``update_mask`` are relative to the resource
(not to the full request). A field is overwritten if it is in the mask.
This corresponds to the ``update_mask`` field on the ``request`` instance; if ``request`` is provided,
this should not be set.
:param request_id: Optional. A unique id used to identify the request.
:param retry: Optional. Designation of what errors, if any, should be retried.
:param timeout: Optional. The timeout for this request.
:param metadata: Optional. Strings which should be sent along with the request as metadata.
:param gcp_conn_id: The connection ID to use connecting to Google Cloud.
:param impersonation_chain: Optional service account to impersonate using short-term
credentials, or chained list of accounts required to get the access_token
of the last account in the list, which will be impersonated in the request.
If set as a string, the account must grant the originating account
the Service Account Token Creator IAM role.
If set as a sequence, the identities from the list must grant
Service Account Token Creator IAM role to the directly preceding identity, with first
account from the list granting this role to the originating account (templated).
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['project_id', 'impersonation_chain']
.. py:attribute:: operator_extra_links
.. py:method:: execute(context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.