:py:mod:`airflow.providers.amazon.aws.transfers.dynamodb_to_s3`
===============================================================
.. py:module:: airflow.providers.amazon.aws.transfers.dynamodb_to_s3
.. autoapi-nested-parse::
This module contains operators to replicate records from
a DynamoDB table to S3.
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.amazon.aws.transfers.dynamodb_to_s3.DynamoDBToS3Operator
.. py:class:: DynamoDBToS3Operator(*, dynamodb_table_name, s3_bucket_name, file_size, dynamodb_scan_kwargs = None, s3_key_prefix = '', process_func = _convert_item_to_json_bytes, aws_conn_id = 'aws_default', **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Replicates records from a DynamoDB table to S3.
It scans a DynamoDB table and writes the received records to a file
on the local filesystem. It flushes the file to S3 once the file size
reaches or exceeds the limit specified by the user in ``file_size``.
Users can also specify filtering criteria using ``dynamodb_scan_kwargs``
to replicate only the records that satisfy those criteria.
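The buffer-and-flush behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the provider's implementation: ``replicate_pages``, ``item_to_json_bytes``, and the ``upload`` callback are hypothetical names, the scan is replaced by an in-memory iterable of pages, and the upload to S3 is replaced by an arbitrary callable that receives a local file path.

```python
import json
import os
import tempfile
from typing import Any, Callable, Dict, Iterable, List


def item_to_json_bytes(item: Dict[str, Any]) -> bytes:
    """Serialize one item as a newline-terminated JSON document (assumed format)."""
    return (json.dumps(item) + "\n").encode("utf-8")


def replicate_pages(
    pages: Iterable[List[Dict[str, Any]]],
    upload: Callable[[str], None],
    file_size: int,
    process_func: Callable[[Dict[str, Any]], bytes] = item_to_json_bytes,
) -> int:
    """Write pages of items to a local temp file and hand the file to
    ``upload`` whenever its size reaches ``file_size`` bytes.

    Returns the number of files handed to ``upload``.
    """
    flushed = 0
    fd, path = tempfile.mkstemp()
    out = os.fdopen(fd, "wb")
    for page in pages:
        for item in page:
            out.write(process_func(item))
        out.flush()
        # The size check happens once per page, so a file can overshoot the limit.
        if os.path.getsize(path) >= file_size:
            out.close()
            upload(path)
            os.remove(path)
            flushed += 1
            fd, path = tempfile.mkstemp()
            out = os.fdopen(fd, "wb")
    out.close()
    # Hand over whatever is left; skip the upload if nothing was written.
    if os.path.getsize(path) > 0:
        upload(path)
        flushed += 1
    os.remove(path)
    return flushed
```

Because the size is checked only after a whole page has been written, ``file_size`` acts as a flush threshold rather than a hard cap on the size of each uploaded object.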
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/transfer:DynamoDBToS3Operator`
:param dynamodb_table_name: DynamoDB table to replicate data from
:param s3_bucket_name: S3 bucket to replicate data to
:param file_size: Flush the file to S3 once its size (in bytes) is >= file_size
:param dynamodb_scan_kwargs: Keyword arguments passed to `DynamoDB.Table.scan <https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/dynamodb.html#DynamoDB.Table.scan>`_
:param s3_key_prefix: Prefix of the S3 object key
:param process_func: Function that transforms a DynamoDB item into bytes. By default, the item is dumped as JSON.
:param aws_conn_id: The Airflow connection used for AWS credentials.
If this is None or empty then the default boto3 behaviour is used. If
running Airflow in a distributed manner and aws_conn_id is None or
empty, then default boto3 configuration would be used (and must be
maintained on each worker node).
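As an illustration of how these parameters fit together, the sketch below defines a custom ``process_func`` and a filtered scan, then passes them to the operator. The table name, bucket name, key prefix, task id, and filter attribute are all hypothetical. The custom serializer is an assumption about what a caller might need: boto3's resource layer returns DynamoDB numbers as ``decimal.Decimal``, which plain ``json.dumps`` cannot encode. The Airflow import is deferred into ``build_export_task`` so that the helper can be exercised without Airflow installed.

```python
import json
from decimal import Decimal
from typing import Any, Dict


def _decimal_default(value: Any) -> Any:
    """Convert Decimal to int or float; reject anything else json cannot encode."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    raise TypeError(f"Cannot serialize {type(value).__name__}")


def item_to_json_line(item: Dict[str, Any]) -> bytes:
    """Hypothetical process_func: one JSON document per line, UTF-8 encoded."""
    line = json.dumps(item, default=_decimal_default, sort_keys=True)
    return (line + "\n").encode("utf-8")


# Hypothetical filter: replicate only items whose "price" attribute exceeds 10.
SCAN_KWARGS = {
    "FilterExpression": "price > :min_price",
    "ExpressionAttributeValues": {":min_price": 10},
}


def build_export_task():
    """Create the operator; call this inside a DAG definition."""
    from airflow.providers.amazon.aws.transfers.dynamodb_to_s3 import (
        DynamoDBToS3Operator,
    )

    return DynamoDBToS3Operator(
        task_id="export_orders",               # hypothetical task id
        dynamodb_table_name="orders",          # hypothetical table
        s3_bucket_name="my-export-bucket",     # hypothetical bucket
        s3_key_prefix="exports/orders-",       # hypothetical prefix
        file_size=20 * 1024 * 1024,            # flush threshold in bytes
        dynamodb_scan_kwargs=SCAN_KWARGS,
        process_func=item_to_json_line,
        aws_conn_id="aws_default",
    )
```

Keeping ``process_func`` a plain module-level function makes it easy to unit test separately from the DAG.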
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['s3_bucket_name', 'dynamodb_table_name']
.. py:attribute:: template_fields_renderers
.. py:method:: execute(self, context)
This is the main method to override when creating an operator.
The context is the same dictionary that is used when rendering Jinja templates.
Refer to ``get_template_context`` for more information.