:py:mod:`airflow.providers.amazon.aws.hooks.glue`
=================================================
.. py:module:: airflow.providers.amazon.aws.hooks.glue
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.amazon.aws.hooks.glue.GlueJobHook
airflow.providers.amazon.aws.hooks.glue.AwsGlueJobHook
.. py:class:: GlueJobHook(s3_bucket = None, job_name = None, desc = None, concurrent_run_limit = 1, script_location = None, retry_limit = 0, num_of_dpus = None, iam_role_name = None, create_job_kwargs = None, *args, **kwargs)
Bases: :py:obj:`airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
Interact with AWS Glue - create job, trigger, crawler
:param s3_bucket: S3 bucket where logs and the local ETL script will be uploaded
:param job_name: unique job name per AWS account
:param desc: job description
:param concurrent_run_limit: The maximum number of concurrent runs allowed for a job
:param script_location: Path to the ETL script on S3
:param retry_limit: Maximum number of times to retry this job if it fails
:param num_of_dpus: Number of AWS Glue DPUs to allocate to this Job
:param region_name: AWS region name (example: us-east-1)
:param iam_role_name: AWS IAM Role for Glue Job Execution
:param create_job_kwargs: Extra arguments for Glue Job Creation
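A minimal sketch of how the constructor parameters above might be assembled. Every value (bucket, job name, role, script path, region) is a hypothetical placeholder, and ``make_hook`` is an illustrative helper, not part of the provider. The import is deferred so the snippet can be loaded without Airflow installed.

```python
# Hypothetical values; replace them with resources from your own AWS account.
HOOK_KWARGS = {
    "job_name": "example-etl-job",
    "desc": "Example Glue job created from Airflow",
    "s3_bucket": "example-glue-bucket",
    "script_location": "s3://example-glue-bucket/scripts/etl.py",
    "iam_role_name": "ExampleGlueExecutionRole",
    "concurrent_run_limit": 1,
    "retry_limit": 0,
    "num_of_dpus": 10,
    "region_name": "us-east-1",
}


def make_hook(**overrides):
    """Build a GlueJobHook from HOOK_KWARGS, letting callers override any key."""
    # Imported lazily so this sketch can be loaded without Airflow installed.
    from airflow.providers.amazon.aws.hooks.glue import GlueJobHook

    return GlueJobHook(**{**HOOK_KWARGS, **overrides})
```

Calling ``make_hook(retry_limit=2)`` would then build a hook with a single value changed, assuming Airflow and the Amazon provider are installed.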
.. py:attribute:: JOB_POLL_INTERVAL
:annotation: = 6
.. py:method:: list_jobs(self)
:return: List of jobs
.. py:method:: get_iam_execution_role(self)
:return: IAM role for job execution
.. py:method:: initialize_job(self, script_arguments = None, run_kwargs = None)
Initializes the connection with AWS Glue
to run the job
:return:
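A sketch of how ``initialize_job`` might be paired with ``job_completion``. It assumes that ``initialize_job`` returns a dict containing a ``JobRunId`` key, which the ``:return:`` field above does not state. ``run_glue_job`` and its arguments are illustrative names, and ``hook`` is any object exposing the two methods, for example a ``GlueJobHook``.

```python
def run_glue_job(hook, job_name, script_arguments=None):
    """Start a run on ``hook`` and block until it reaches a final state."""
    # Assumption: the returned dict carries the run ID under "JobRunId".
    job_run = hook.initialize_job(script_arguments or {})
    run_id = job_run["JobRunId"]
    # job_completion polls until the run finishes and raises if it failed.
    return hook.job_completion(job_name, run_id)
```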
.. py:method:: get_job_state(self, job_name, run_id)
Get the state of the Glue job. The job state can be
running, finished, failed, stopped, or timeout.
:param job_name: unique job name per AWS account
:param run_id: The job-run ID of the predecessor job run
:return: State of the Glue job
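As a hedged illustration of where such a state could come from, the sketch below reads ``JobRunState`` from the response of the boto3 Glue client's ``get_job_run`` call. That the hook does exactly this is an assumption, and ``read_job_state`` is a hypothetical helper. The API reports states as uppercase strings such as ``RUNNING`` or ``SUCCEEDED``.

```python
def read_job_state(glue_client, job_name, run_id):
    """Return the JobRunState string for one run of a Glue job."""
    # glue_client is expected to behave like boto3.client("glue").
    response = glue_client.get_job_run(JobName=job_name, RunId=run_id)
    return response["JobRun"]["JobRunState"]
```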
.. py:method:: job_completion(self, job_name, run_id)
Waits until the Glue job with job_name completes or
fails, and returns the final state if it finished.
Raises AirflowException when the job fails
:param job_name: unique job name per AWS account
:param run_id: The job-run ID of the predecessor job run
:return: Dict of JobRunState and JobRunId
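The waiting behaviour described above can be sketched as a simple polling loop. This is an illustration, not the hook's source: the split into finished and failed states is an assumption, ``RuntimeError`` stands in for ``AirflowException``, ``get_state`` is a hypothetical callable returning the current state string, and ``JOB_POLL_INTERVAL`` is assumed to be in seconds.

```python
import time

JOB_POLL_INTERVAL = 6  # mirrors the class attribute; assumed to be seconds

FINISHED_STATES = ("SUCCEEDED", "STOPPED")  # assumed grouping for this sketch
FAILED_STATES = ("FAILED", "TIMEOUT")  # assumed grouping for this sketch


def wait_for_completion(get_state, run_id, poll_interval=JOB_POLL_INTERVAL):
    """Poll ``get_state`` until the run finishes; raise if it failed."""
    while True:
        state = get_state()
        if state in FINISHED_STATES:
            return {"JobRunState": state, "JobRunId": run_id}
        if state in FAILED_STATES:
            # The real hook is documented to raise AirflowException here.
            raise RuntimeError(f"Glue job run {run_id} ended in state {state}")
        # Any other state means the run is still in progress.
        time.sleep(poll_interval)
```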
.. py:method:: get_or_create_glue_job(self)
Creates the job if it does not already exist and returns the job name
:return: Name of the job
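A minimal sketch of the get-or-create pattern against a boto3-style Glue client, using its ``get_job`` and ``create_job`` calls. Whether the hook follows these exact steps is an assumption, and ``get_or_create_job`` is a hypothetical helper. With a real client, ``create_kwargs`` would need to carry the fields Glue requires, such as ``Role`` and ``Command``.

```python
def get_or_create_job(glue_client, job_name, **create_kwargs):
    """Return the job name, creating the job first if Glue does not know it."""
    try:
        # get_job raises EntityNotFoundException when the job is missing.
        return glue_client.get_job(JobName=job_name)["Job"]["Name"]
    except glue_client.exceptions.EntityNotFoundException:
        # create_job responds with the name assigned to the new job.
        return glue_client.create_job(Name=job_name, **create_kwargs)["Name"]
```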
.. py:class:: AwsGlueJobHook(*args, **kwargs)
Bases: :py:obj:`GlueJobHook`
This hook is deprecated.
Please use :class:`airflow.providers.amazon.aws.hooks.glue.GlueJobHook`.
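Because the deprecated class subclasses ``GlueJobHook`` and accepts the same arguments, migrating should usually amount to changing the class name. The sketch below illustrates the general deprecated-alias pattern with stand-in classes; ``NewHook`` and ``OldHook`` are hypothetical names, and this is not the provider's source.

```python
import warnings


class NewHook:
    """Stand-in for the current class."""

    def __init__(self, *args, **kwargs):
        self.kwargs = kwargs


class OldHook(NewHook):
    """Stand-in for a deprecated alias that forwards everything to NewHook."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "OldHook is deprecated; use NewHook instead.",
            DeprecationWarning,
            stacklevel=2,  # attribute the warning to the caller's line
        )
        super().__init__(*args, **kwargs)
```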