| :py:mod:`airflow.providers.amazon.aws.operators.glue` |
| ===================================================== |
| |
| .. py:module:: airflow.providers.amazon.aws.operators.glue |
| |
| |
| Module Contents |
| --------------- |
| |
| Classes |
| ~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.providers.amazon.aws.operators.glue.GlueJobOperator |
| airflow.providers.amazon.aws.operators.glue.AwsGlueJobOperator |
| |
| |
| |
| |
| .. py:class:: GlueJobOperator(*, job_name = 'aws_glue_default_job', job_desc = 'AWS Glue Job with Airflow', script_location, concurrent_run_limit = None, script_args = None, retry_limit = 0, num_of_dpus = None, aws_conn_id = 'aws_default', region_name = None, s3_bucket = None, iam_role_name = None, create_job_kwargs = None, run_job_kwargs = None, wait_for_completion = True, **kwargs) |
| |
| Bases: :py:obj:`airflow.models.BaseOperator` |
| |
| Creates an AWS Glue Job. AWS Glue is a serverless Spark |
| ETL service for running Spark Jobs on the AWS cloud. |
| Language support: Python and Scala |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:GlueJobOperator` |
| |
| :param job_name: unique job name per AWS Account |
| :param script_location: location of ETL script. Must be a local or S3 path |
| :param job_desc: job description details |
| :param concurrent_run_limit: The maximum number of concurrent runs allowed for a job |
| :param script_args: etl script arguments and AWS Glue arguments (templated) |
| :param retry_limit: The maximum number of times to retry this job if it fails |
| :param num_of_dpus: Number of AWS Glue DPUs to allocate to this Job. |
| :param region_name: aws region name (example: us-east-1) |
| :param s3_bucket: S3 bucket where logs and local etl script will be uploaded |
| :param iam_role_name: AWS IAM Role for Glue Job Execution |
| :param create_job_kwargs: Extra arguments for Glue Job Creation |
| :param run_job_kwargs: Extra arguments for Glue Job Run |
| :param wait_for_completion: Whether or not wait for job run completion. (default: True) |
| |
| .. py:attribute:: template_fields |
| :annotation: :Sequence[str] = ['script_args'] |
| |
| |
| |
| .. py:attribute:: template_ext |
| :annotation: :Sequence[str] = [] |
| |
| |
| |
| .. py:attribute:: template_fields_renderers |
| |
| |
| |
| |
| .. py:attribute:: ui_color |
| :annotation: = #ededed |
| |
| |
| |
| .. py:method:: execute(self, context) |
| |
| Executes AWS Glue Job from Airflow |
| |
| :return: the id of the current glue job. |
| |
| |
| |
| .. py:class:: AwsGlueJobOperator(*args, **kwargs) |
| |
| Bases: :py:obj:`GlueJobOperator` |
| |
| This operator is deprecated. |
| Please use :class:`airflow.providers.amazon.aws.operators.glue.GlueJobOperator`. |
| |
| |