| :py:mod:`airflow.providers.amazon.aws.hooks.athena` |
| =================================================== |
| |
| .. py:module:: airflow.providers.amazon.aws.hooks.athena |
| |
| .. autoapi-nested-parse:: |
| |
| This module contains AWS Athena hook. |
| |
| .. spelling:: |
| |
| PageIterator |
| |
| |
| |
| Module Contents |
| --------------- |
| |
| Classes |
| ~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.providers.amazon.aws.hooks.athena.AthenaHook |
| airflow.providers.amazon.aws.hooks.athena.AWSAthenaHook |
| |
| |
| |
| |
| .. py:class:: AthenaHook(*args, sleep_time = 30, **kwargs) |
| |
| Bases: :py:obj:`airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook` |
| |
| Interact with AWS Athena to run, poll queries and return query results |
| |
| Additional arguments (such as ``aws_conn_id``) may be specified and |
| are passed down to the underlying AwsBaseHook. |
| |
| .. seealso:: |
| :class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook` |
| |
| :param sleep_time: Time (in seconds) to wait between two consecutive calls to check query status on Athena |
| |
| .. py:attribute:: INTERMEDIATE_STATES |
| :annotation: = ['QUEUED', 'RUNNING'] |
| |
| |
| |
| .. py:attribute:: FAILURE_STATES |
| :annotation: = ['FAILED', 'CANCELLED'] |
| |
| |
| |
| .. py:attribute:: SUCCESS_STATES |
| :annotation: = ['SUCCEEDED'] |
| |
| |
| |
| .. py:attribute:: TERMINAL_STATES |
| :annotation: = ['SUCCEEDED', 'FAILED', 'CANCELLED'] |
| |
| |
| |
| .. py:method:: run_query(self, query, query_context, result_configuration, client_request_token = None, workgroup = 'primary') |
| |
| Run Presto query on athena with provided config and return submitted query_execution_id |
| |
| :param query: Presto query to run |
| :param query_context: Context in which query need to be run |
| :param result_configuration: Dict with path to store results in and config related to encryption |
| :param client_request_token: Unique token created by user to avoid multiple executions of same query |
| :param workgroup: Athena workgroup name, when not specified, will be 'primary' |
| :return: str |
| |
| |
| .. py:method:: check_query_status(self, query_execution_id) |
| |
| Fetch the status of submitted athena query. Returns None or one of valid query states. |
| |
| :param query_execution_id: Id of submitted athena query |
| :return: str |
| |
| |
| .. py:method:: get_state_change_reason(self, query_execution_id) |
| |
| Fetch the reason for a state change (e.g. error message). Returns None or reason string. |
| |
| :param query_execution_id: Id of submitted athena query |
| :return: str |
| |
| |
| .. py:method:: get_query_results(self, query_execution_id, next_token_id = None, max_results = 1000) |
| |
| Fetch submitted athena query results. returns none if query is in intermediate state or |
| failed/cancelled state else dict of query output |
| |
| :param query_execution_id: Id of submitted athena query |
| :param next_token_id: The token that specifies where to start pagination. |
| :param max_results: The maximum number of results (rows) to return in this request. |
| :return: dict |
| |
| |
| .. py:method:: get_query_results_paginator(self, query_execution_id, max_items = None, page_size = None, starting_token = None) |
| |
| Fetch submitted athena query results. returns none if query is in intermediate state or |
| failed/cancelled state else a paginator to iterate through pages of results. If you |
| wish to get all results at once, call build_full_result() on the returned PageIterator |
| |
| :param query_execution_id: Id of submitted athena query |
| :param max_items: The total number of items to return. |
| :param page_size: The size of each page. |
| :param starting_token: A token to specify where to start paginating. |
| :return: PageIterator |
| |
| |
| .. py:method:: poll_query_status(self, query_execution_id, max_tries = None) |
| |
| Poll the status of submitted athena query until query state reaches final state. |
| Returns one of the final states |
| |
| :param query_execution_id: Id of submitted athena query |
| :param max_tries: Number of times to poll for query state before function exits |
| :return: str |
| |
| |
| .. py:method:: get_output_location(self, query_execution_id) |
| |
| Function to get the output location of the query results |
| in s3 uri format. |
| |
| :param query_execution_id: Id of submitted athena query |
| :return: str |
| |
| |
| .. py:method:: stop_query(self, query_execution_id) |
| |
| Cancel the submitted athena query |
| |
| :param query_execution_id: Id of submitted athena query |
| :return: dict |
| |
| |
| |
| .. py:class:: AWSAthenaHook(*args, **kwargs) |
| |
| Bases: :py:obj:`AthenaHook` |
| |
| This hook is deprecated. |
| Please use :class:`airflow.providers.amazon.aws.hooks.athena.AthenaHook`. |
| |
| |