| :mod:`airflow.contrib.operators.gcp_transfer_operator` |
| ====================================================== |
| |
| .. py:module:: airflow.contrib.operators.gcp_transfer_operator |
| |
| |
| Module Contents |
| --------------- |
| |
| .. data:: AwsHook |
| |
| |
| |
| |
| .. py:class:: TransferJobPreprocessor(body, aws_conn_id='aws_default', default_schedule=False) |
| |
| |
| .. method:: _inject_aws_credentials(self) |
| |
| |
| |
| |
| .. method:: _reformat_date(self, field_key) |
| |
| |
| |
| |
| .. method:: _reformat_time(self, field_key) |
| |
| |
| |
| |
| .. method:: _reformat_schedule(self) |
| |
| |
| |
| |
| .. method:: process_body(self) |
| |
| |
| |
| |
| .. staticmethod:: _convert_date_to_dict(field_date) |
| |
| Convert native python ``datetime.date`` object to a format supported by the API |
| |
| |
| |
| |
| .. staticmethod:: _convert_time_to_dict(time) |
| |
| Convert native python ``datetime.time`` object to a format supported by the API |
| |
| |
| |
| |
| .. py:class:: TransferJobValidator(body) |
| |
| |
| .. method:: _verify_data_source(self) |
| |
| |
| |
| |
| .. method:: _restrict_aws_credentials(self) |
| |
| |
| |
| |
| .. method:: _restrict_empty_body(self) |
| |
| |
| |
| |
| .. method:: validate_body(self) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceJobCreateOperator(body, aws_conn_id='aws_default', gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Creates a transfer job that runs periodically. |
| |
| .. warning:: |
| |
| This operator is NOT idempotent. If you run it many times, many transfer |
| jobs will be created in the Google Cloud Platform. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceJobCreateOperator` |
| |
| :param body: (Required) The request body, as described in |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs/create#request-body |
| With three additional improvements: |
| |
| * dates can be given in the form :class:`datetime.date` |
| * times can be given in the form :class:`datetime.time` |
| * credentials to Amazon Web Service should be stored in the connection and indicated by the |
| aws_conn_id parameter |
| |
| :type body: dict |
| :param aws_conn_id: The connection ID used to retrieve credentials to |
| Amazon Web Service. |
| :type aws_conn_id: str |
| :param gcp_conn_id: The connection ID used to connect to Google Cloud |
| Platform. |
| :type gcp_conn_id: str |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['body', 'gcp_conn_id', 'aws_conn_id'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceJobUpdateOperator(job_name, body, aws_conn_id='aws_default', gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Updates a transfer job that runs periodically. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceJobUpdateOperator` |
| |
| :param job_name: (Required) Name of the job to be updated |
| :type job_name: str |
| :param body: (Required) The request body, as described in |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs/patch#request-body |
| With three additional improvements: |
| |
| * dates can be given in the form :class:`datetime.date` |
| * times can be given in the form :class:`datetime.time` |
| * credentials to Amazon Web Service should be stored in the connection and indicated by the |
| aws_conn_id parameter |
| |
| :type body: dict |
| :param aws_conn_id: The connection ID used to retrieve credentials to |
| Amazon Web Service. |
| :type aws_conn_id: str |
| :param gcp_conn_id: The connection ID used to connect to Google Cloud |
| Platform. |
| :type gcp_conn_id: str |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['job_name', 'body', 'gcp_conn_id', 'aws_conn_id'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceJobDeleteOperator(job_name, gcp_conn_id='google_cloud_default', api_version='v1', project_id=None, *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Delete a transfer job. This is a soft delete. After a transfer job is |
| deleted, the job and all the transfer executions are subject to garbage |
| collection. Transfer jobs become eligible for garbage collection |
| 30 days after soft delete. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceJobDeleteOperator` |
| |
| :param job_name: (Required) Name of the TRANSFER operation |
| :type job_name: str |
| :param project_id: (Optional) the ID of the project that owns the Transfer |
| Job. If set to None or missing, the default project_id from the GCP |
| connection is used. |
| :type project_id: str |
| :param gcp_conn_id: The connection ID used to connect to Google Cloud |
| Platform. |
| :type gcp_conn_id: str |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['job_name', 'project_id', 'gcp_conn_id', 'api_version'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceOperationGetOperator(operation_name, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Gets the latest state of a long-running operation in Google Storage Transfer |
| Service. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceOperationGetOperator` |
| |
| :param operation_name: (Required) Name of the transfer operation. |
| :type operation_name: str |
| :param gcp_conn_id: The connection ID used to connect to Google |
| Cloud Platform. |
| :type gcp_conn_id: str |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['operation_name', 'gcp_conn_id'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceOperationsListOperator(filter, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Lists long-running operations in Google Storage Transfer |
| Service that match the specified filter. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceOperationsListOperator` |
| |
| :param filter: (Required) A request filter, as described in |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs/list#body.QUERY_PARAMETERS.filter |
| :type filter: dict |
| :param gcp_conn_id: The connection ID used to connect to Google |
| Cloud Platform. |
| :type gcp_conn_id: str |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['filter', 'gcp_conn_id'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceOperationPauseOperator(operation_name, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Pauses a transfer operation in Google Storage Transfer Service. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceOperationPauseOperator` |
| |
| :param operation_name: (Required) Name of the transfer operation. |
| :type operation_name: str |
| :param gcp_conn_id: The connection ID used to connect to Google Cloud Platform. |
| :type gcp_conn_id: str |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['operation_name', 'gcp_conn_id', 'api_version'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceOperationResumeOperator(operation_name, gcp_conn_id='google_cloud_default', api_version='v1', *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Resumes a transfer operation in Google Storage Transfer Service. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceOperationResumeOperator` |
| |
| :param operation_name: (Required) Name of the transfer operation. |
| :type operation_name: str |
| :param gcp_conn_id: The connection ID used to connect to Google Cloud Platform. |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| :type gcp_conn_id: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['operation_name', 'gcp_conn_id', 'api_version'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: GcpTransferServiceOperationCancelOperator(operation_name, api_version='v1', gcp_conn_id='google_cloud_default', *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Cancels a transfer operation in Google Storage Transfer Service. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GcpTransferServiceOperationCancelOperator` |
| |
| :param operation_name: (Required) Name of the transfer operation. |
| :type operation_name: str |
| :param api_version: API version used (e.g. v1). |
| :type api_version: str |
| :param gcp_conn_id: The connection ID used to connect to Google |
| Cloud Platform. |
| :type gcp_conn_id: str |
| |
| .. attribute:: template_fields |
| :annotation: = ['operation_name', 'gcp_conn_id', 'api_version'] |
| |
| |
| |
| |
| .. method:: _validate_inputs(self) |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. py:class:: S3ToGoogleCloudStorageTransferOperator(s3_bucket, gcs_bucket, project_id=None, aws_conn_id='aws_default', gcp_conn_id='google_cloud_default', delegate_to=None, description=None, schedule=None, object_conditions=None, transfer_options=None, wait=True, timeout=None, *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Synchronizes an S3 bucket with a Google Cloud Storage bucket using the |
| GCP Storage Transfer Service. |
| |
| .. warning:: |
| |
| This operator is NOT idempotent. If you run it many times, many transfer |
| jobs will be created in the Google Cloud Platform. |
| |
| **Example**: |
| |
| .. code-block:: python |
| |
| s3_to_gcs_transfer_op = S3ToGoogleCloudStorageTransferOperator( |
| task_id='s3_to_gcs_transfer_example', |
| s3_bucket='my-s3-bucket', |
| project_id='my-gcp-project', |
| gcs_bucket='my-gcs-bucket', |
| dag=my_dag) |
| |
| :param s3_bucket: The S3 bucket where to find the objects. (templated) |
| :type s3_bucket: str |
| :param gcs_bucket: The destination Google Cloud Storage bucket |
| where you want to store the files. (templated) |
| :type gcs_bucket: str |
| :param project_id: Optional ID of the Google Cloud Platform Console project that |
| owns the job |
| :type project_id: str |
| :param aws_conn_id: The source S3 connection |
| :type aws_conn_id: str |
| :param gcp_conn_id: The destination connection ID to use |
| when connecting to Google Cloud Storage. |
| :type gcp_conn_id: str |
| :param delegate_to: The account to impersonate, if any. |
| For this to work, the service account making the request must have |
| domain-wide delegation enabled. |
| :type delegate_to: str |
| :param description: Optional transfer service job description |
| :type description: str |
| :param schedule: Optional transfer service schedule; |
| If not set, run transfer job once as soon as the operator runs |
| The format is described |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs. |
| With two additional improvements: |
| |
| * dates they can be passed as :class:`datetime.date` |
| * times they can be passed as :class:`datetime.time` |
| |
| :type schedule: dict |
| :param object_conditions: Optional transfer service object conditions; see |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec |
| :type object_conditions: dict |
| :param transfer_options: Optional transfer service transfer options; see |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec |
| :type transfer_options: dict |
| :param wait: Wait for transfer to finish |
| :type wait: bool |
| :param timeout: Time to wait for the operation to end in seconds |
| :type timeout: int |
| |
| .. attribute:: template_fields |
| :annotation: = ['gcp_conn_id', 's3_bucket', 'gcs_bucket', 'description', 'object_conditions'] |
| |
| |
| |
| .. attribute:: ui_color |
| :annotation: = #e09411 |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. method:: _create_body(self) |
| |
| |
| |
| |
| .. py:class:: GoogleCloudStorageToGoogleCloudStorageTransferOperator(source_bucket, destination_bucket, project_id=None, gcp_conn_id='google_cloud_default', delegate_to=None, description=None, schedule=None, object_conditions=None, transfer_options=None, wait=True, timeout=None, *args, **kwargs) |
| |
| Bases: :class:`airflow.models.BaseOperator` |
| |
| Copies objects from a bucket to another using the GCP Storage Transfer |
| Service. |
| |
| .. warning:: |
| |
| This operator is NOT idempotent. If you run it many times, many transfer |
| jobs will be created in the Google Cloud Platform. |
| |
| **Example**: |
| |
| .. code-block:: python |
| |
| gcs_to_gcs_transfer_op = GoogleCloudStorageToGoogleCloudStorageTransferOperator( |
| task_id='gcs_to_gcs_transfer_example', |
| source_bucket='my-source-bucket', |
| destination_bucket='my-destination-bucket', |
| project_id='my-gcp-project', |
| dag=my_dag) |
| |
| :param source_bucket: The source Google cloud storage bucket where the |
| object is. (templated) |
| :type source_bucket: str |
| :param destination_bucket: The destination Google cloud storage bucket |
| where the object should be. (templated) |
| :type destination_bucket: str |
| :param project_id: The ID of the Google Cloud Platform Console project that |
| owns the job |
| :type project_id: str |
| :param gcp_conn_id: Optional connection ID to use when connecting to Google Cloud |
| Storage. |
| :type gcp_conn_id: str |
| :param delegate_to: The account to impersonate, if any. |
| For this to work, the service account making the request must have |
| domain-wide delegation enabled. |
| :type delegate_to: str |
| :param description: Optional transfer service job description |
| :type description: str |
| :param schedule: Optional transfer service schedule; |
| If not set, run transfer job once as soon as the operator runs |
| See: |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/transferJobs. |
| With two additional improvements: |
| |
| * dates they can be passed as :class:`datetime.date` |
| * times they can be passed as :class:`datetime.time` |
| |
| :type schedule: dict |
| :param object_conditions: Optional transfer service object conditions; see |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#ObjectConditions |
| :type object_conditions: dict |
| :param transfer_options: Optional transfer service transfer options; see |
| https://cloud.google.com/storage-transfer/docs/reference/rest/v1/TransferSpec#TransferOptions |
| :type transfer_options: dict |
| :param wait: Wait for transfer to finish; defaults to `True` |
| :type wait: bool |
| :param timeout: Time to wait for the operation to end in seconds |
| :type timeout: int |
| |
| .. attribute:: template_fields |
| :annotation: = ['gcp_conn_id', 'source_bucket', 'destination_bucket', 'description', 'object_conditions'] |
| |
| |
| |
| .. attribute:: ui_color |
| :annotation: = #e09411 |
| |
| |
| |
| |
| .. method:: execute(self, context) |
| |
| |
| |
| |
| .. method:: _create_body(self) |
| |
| |
| |
| |