:py:mod:`airflow.providers.amazon.aws.hooks.glue_crawler`
=========================================================
.. py:module:: airflow.providers.amazon.aws.hooks.glue_crawler
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.amazon.aws.hooks.glue_crawler.GlueCrawlerHook
airflow.providers.amazon.aws.hooks.glue_crawler.AwsGlueCrawlerHook
.. py:class:: GlueCrawlerHook(*args, **kwargs)
Bases: :py:obj:`airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
Interacts with AWS Glue Crawler.
Additional arguments (such as ``aws_conn_id``) may be specified and
are passed down to the underlying AwsBaseHook.
.. seealso::
:class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
.. py:method:: glue_client(self)
:return: AWS Glue client
.. py:method:: has_crawler(self, crawler_name)
Checks whether the crawler already exists.
:param crawler_name: unique crawler name per AWS account
:return: True if the crawler already exists, False otherwise.
.. py:method:: get_crawler(self, crawler_name)
Gets crawler configurations
:param crawler_name: unique crawler name per AWS account
:return: Nested dictionary of crawler configurations
.. py:method:: update_crawler(self, **crawler_kwargs)
Updates the crawler's configuration.
:param crawler_kwargs: Keyword args that define the configurations used for the crawler
:return: True if the crawler was updated, False otherwise
.. py:method:: create_crawler(self, **crawler_kwargs)
Creates an AWS Glue Crawler
:param crawler_kwargs: Keyword args that define the configurations used to create the crawler
:return: Name of the crawler
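The ``has_crawler``, ``update_crawler`` and ``create_crawler`` methods above can be combined into a create-or-update step. The following is a minimal sketch, not part of the provider: ``create_or_update_crawler`` and ``FakeCrawlerHook`` are hypothetical names, the fake hook only mimics the documented method signatures in memory, and the sketch assumes the crawler name is passed under the ``Name`` key, as in the AWS Glue API.

```python
from typing import Any, Dict


def create_or_update_crawler(hook: Any, **crawler_kwargs: Any) -> str:
    """Create the crawler if it is missing, otherwise update it.

    ``hook`` is any object exposing has_crawler / update_crawler /
    create_crawler with the signatures documented above.
    Assumption: the crawler name is supplied as ``Name``.
    """
    crawler_name = crawler_kwargs["Name"]
    if hook.has_crawler(crawler_name):
        hook.update_crawler(**crawler_kwargs)
    else:
        hook.create_crawler(**crawler_kwargs)
    return crawler_name


class FakeCrawlerHook:
    """Hypothetical in-memory stand-in for GlueCrawlerHook (no AWS calls)."""

    def __init__(self) -> None:
        self.crawlers: Dict[str, Dict[str, Any]] = {}

    def has_crawler(self, crawler_name: str) -> bool:
        return crawler_name in self.crawlers

    def create_crawler(self, **crawler_kwargs: Any) -> str:
        self.crawlers[crawler_kwargs["Name"]] = dict(crawler_kwargs)
        return crawler_kwargs["Name"]

    def update_crawler(self, **crawler_kwargs: Any) -> bool:
        name = crawler_kwargs["Name"]
        changed = self.crawlers[name] != crawler_kwargs
        self.crawlers[name] = dict(crawler_kwargs)
        return changed
```

With a real ``GlueCrawlerHook`` instance in place of the fake, the same function would issue the corresponding AWS calls; the fake only makes the control flow easy to exercise locally.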
.. py:method:: start_crawler(self, crawler_name)
Triggers the AWS Glue crawler
:param crawler_name: unique crawler name per AWS account
:return: Empty dictionary
.. py:method:: wait_for_crawler_completion(self, crawler_name, poll_interval = 5)
Waits until the Glue crawler completes and returns the status of the latest crawl run.
Raises AirflowException if the crawler fails or is cancelled.
:param crawler_name: unique crawler name per AWS account
:param poll_interval: Time (in seconds) to wait between two consecutive calls to check crawler status
:return: Crawler's status
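The polling behaviour described for ``wait_for_crawler_completion`` can be sketched as a plain loop. This is an illustration of the technique under stated assumptions, not the provider's implementation: ``wait_for_crawler`` and ``CrawlerRunError`` are hypothetical names (the latter stands in for ``AirflowException``), and the sketch assumes the crawler dictionary carries a ``State`` key that becomes ``READY`` when the run ends and a ``LastCrawl`` entry with a ``Status`` value, as in the AWS Glue ``GetCrawler`` response.

```python
import time
from typing import Any, Callable, Dict


class CrawlerRunError(RuntimeError):
    """Hypothetical stand-in for AirflowException in this sketch."""


def wait_for_crawler(
    get_crawler: Callable[[str], Dict[str, Any]],
    crawler_name: str,
    poll_interval: float = 5,
    sleep: Callable[[float], Any] = time.sleep,
) -> str:
    """Poll until the crawler is idle, then return the last crawl status.

    ``get_crawler`` is expected to behave like the hook's ``get_crawler``
    method. ``sleep`` is injectable so the loop can be tested without waiting.
    """
    while True:
        crawler = get_crawler(crawler_name)
        # Assumption: "READY" means no crawl is currently in progress.
        if crawler["State"] == "READY":
            status = crawler["LastCrawl"]["Status"]
            if status in ("FAILED", "CANCELLED"):
                raise CrawlerRunError(
                    f"Crawler {crawler_name!r} finished with status {status}"
                )
            return status
        # Still running or stopping: wait before checking again.
        sleep(poll_interval)
```

Passing the state lookup and the sleep function as arguments keeps the loop independent of AWS, so a sequence of canned responses is enough to check both the success path and the failure path.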
.. py:class:: AwsGlueCrawlerHook(*args, **kwargs)
Bases: :py:obj:`GlueCrawlerHook`
This hook is deprecated.
Please use :class:`airflow.providers.amazon.aws.hooks.glue_crawler.GlueCrawlerHook`.
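A deprecated alias of this kind is commonly written as a thin subclass that emits a ``DeprecationWarning`` and then defers to the replacement class. The sketch below shows that general pattern only; ``NewHook`` and ``OldHook`` are hypothetical names, and the actual warning text and behaviour of ``AwsGlueCrawlerHook`` are not confirmed by this page.

```python
import warnings
from typing import Any


class NewHook:
    """Hypothetical replacement class (plays the role of GlueCrawlerHook)."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        self.args = args
        self.kwargs = kwargs


class OldHook(NewHook):
    """Hypothetical deprecated alias that forwards everything to NewHook."""

    def __init__(self, *args: Any, **kwargs: Any) -> None:
        # stacklevel=2 attributes the warning to the caller's line,
        # not to this constructor.
        warnings.warn(
            "OldHook is deprecated. Please use NewHook.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

Code that still instantiates the old name keeps working while the warning points users at the replacement class.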