| :mod:`airflow.providers.amazon.aws.hooks.glue_crawler` |
| ====================================================== |
| |
| .. py:module:: airflow.providers.amazon.aws.hooks.glue_crawler |
| |
| |
| Module Contents |
| --------------- |
| |
| .. py:class:: AwsGlueCrawlerHook(*args, **kwargs) |
| |
| Bases: :class:`airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook` |
| |
| Interacts with AWS Glue Crawler. |
| |
| Additional arguments (such as ``aws_conn_id``) may be specified and |
| are passed down to the underlying AwsBaseHook. |
| |
| .. seealso:: |
| :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook` |
| |
| |
| .. method:: glue_client(self) |
| |
| :return: AWS Glue client |
| |
| |
| |
| |
| .. method:: has_crawler(self, crawler_name) |
| |
| Checks whether the crawler already exists. |
| |
| :param crawler_name: unique crawler name per AWS account |
| :type crawler_name: str |
| :return: True if the crawler already exists, False otherwise. |
| |
| |
| |
| |
| .. method:: get_crawler(self, crawler_name: str) |
| |
| Gets the crawler's configuration. |
| |
| :param crawler_name: unique crawler name per AWS account |
| :type crawler_name: str |
| :return: Nested dictionary of crawler configurations |
| |
| |
| |
| |
| .. method:: update_crawler(self, **crawler_kwargs) |
| |
| Updates the crawler's configuration. |
| |
| :param crawler_kwargs: Keyword args that define the configurations used for the crawler |
| :type crawler_kwargs: any |
| :return: True if the crawler was updated, False otherwise. |
| |
| |
| |
| |
| .. method:: create_crawler(self, **crawler_kwargs) |
| |
| Creates an AWS Glue crawler. |
| |
| :param crawler_kwargs: Keyword args that define the configurations used to create the crawler |
| :type crawler_kwargs: any |
| :return: Name of the crawler |
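
The three methods above (``has_crawler``, ``update_crawler`` and ``create_crawler``) are often combined into a "create if missing, otherwise update" step. The following is a minimal sketch of that pattern, not the hook's own code. ``FakeCrawlerHook`` and ``ensure_crawler`` are hypothetical names introduced for illustration so the example runs without Airflow or AWS credentials, and it assumes the crawler name is passed under the ``Name`` key, as in the boto3 Glue ``create_crawler`` call.

```python
from typing import Any, Dict


class FakeCrawlerHook:
    """Hypothetical in-memory stand-in exposing the three methods documented above."""

    def __init__(self) -> None:
        self._crawlers: Dict[str, Dict[str, Any]] = {}

    def has_crawler(self, crawler_name: str) -> bool:
        return crawler_name in self._crawlers

    def create_crawler(self, **crawler_kwargs: Any) -> str:
        # Assumption: the crawler name is supplied under the "Name" key.
        self._crawlers[crawler_kwargs["Name"]] = dict(crawler_kwargs)
        return crawler_kwargs["Name"]

    def update_crawler(self, **crawler_kwargs: Any) -> bool:
        current = self._crawlers[crawler_kwargs["Name"]]
        # Report True only when at least one supplied value differs.
        changed = any(current.get(key) != value for key, value in crawler_kwargs.items())
        current.update(crawler_kwargs)
        return changed


def ensure_crawler(hook: Any, **crawler_kwargs: Any) -> str:
    """Create the crawler if it is missing, otherwise update it.

    Returns "created", "updated" or "unchanged".
    """
    name = crawler_kwargs["Name"]
    if hook.has_crawler(name):
        return "updated" if hook.update_crawler(**crawler_kwargs) else "unchanged"
    hook.create_crawler(**crawler_kwargs)
    return "created"


hook = FakeCrawlerHook()
# Placeholder configuration values, chosen only for the example.
config = {"Name": "example-crawler", "Role": "example-role", "DatabaseName": "example_db"}
first = ensure_crawler(hook, **config)
second = ensure_crawler(hook, **config)
third = ensure_crawler(hook, **{**config, "DatabaseName": "other_db"})
print(first, second, third)
```

With a real ``AwsGlueCrawlerHook`` instance, ``ensure_crawler`` should work the same way as long as the methods behave as documented here. The keyword arguments would then need to form a valid Glue crawler definition.
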
| |
| |
| |
| |
| .. method:: start_crawler(self, crawler_name: str) |
| |
| Triggers the AWS Glue crawler. |
| |
| :param crawler_name: unique crawler name per AWS account |
| :type crawler_name: str |
| :return: Empty dictionary |
| |
| |
| |
| |
| .. method:: wait_for_crawler_completion(self, crawler_name: str, poll_interval: int = 5) |
| |
| Waits until the Glue crawler completes and |
| returns the status of the latest crawl run. |
| Raises ``AirflowException`` if the crawler fails or is cancelled. |
| |
| :param crawler_name: unique crawler name per AWS account |
| :type crawler_name: str |
| :param poll_interval: Time (in seconds) to wait between two consecutive calls to check crawler status |
| :type poll_interval: int |
| :return: Crawler's status |
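
The polling behaviour described above can be sketched as a loop that fetches the crawler, checks whether it is idle again, and sleeps for ``poll_interval`` seconds otherwise. This is an illustrative sketch under stated assumptions, not the hook's implementation. ``wait_until_ready`` and ``make_fake_get_crawler`` are hypothetical helpers, ``RuntimeError`` stands in for ``AirflowException`` so the example runs without Airflow, and the ``State`` and ``LastCrawl["Status"]`` keys are assumed to follow the Glue ``GetCrawler`` response shape.

```python
import time
from typing import Any, Callable, Dict, Iterable, List


def wait_until_ready(
    get_crawler: Callable[[str], Dict[str, Any]],
    crawler_name: str,
    poll_interval: int = 5,
    sleep: Callable[[float], Any] = time.sleep,
) -> str:
    """Poll until the crawler is READY, then return the status of its last crawl."""
    while True:
        crawler = get_crawler(crawler_name)
        if crawler["State"] == "READY":
            status = crawler["LastCrawl"]["Status"]
            if status in ("FAILED", "CANCELLED"):
                # Stand-in for AirflowException in this sketch.
                raise RuntimeError(f"Crawler {crawler_name} ended with status {status}")
            return status
        sleep(poll_interval)


def make_fake_get_crawler(states: Iterable[str], final_status: str) -> Callable[[str], Dict[str, Any]]:
    """Hypothetical fake: yields the given states in order, then READY forever."""
    remaining = iter(states)

    def get_crawler(crawler_name: str) -> Dict[str, Any]:
        state = next(remaining, "READY")
        return {"Name": crawler_name, "State": state, "LastCrawl": {"Status": final_status}}

    return get_crawler


# Record sleeps instead of actually waiting, so the example finishes instantly.
naps: List[float] = []
status = wait_until_ready(
    make_fake_get_crawler(["RUNNING", "STOPPING"], "SUCCEEDED"),
    "example-crawler",
    poll_interval=5,
    sleep=naps.append,
)
print(status, naps)
```

Injecting ``sleep`` is a choice made for this sketch to keep it testable. The documented method takes only ``crawler_name`` and ``poll_interval``.
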
| |
| |
| |
| |