| :mod:`airflow.contrib.hooks.aws_athena_hook` |
| ============================================ |
| |
| .. py:module:: airflow.contrib.hooks.aws_athena_hook |
| |
| .. autoapi-nested-parse:: |
| |
| This module contains the AWS Athena hook. |
| |
| |
| |
| Module Contents |
| --------------- |
| |
| .. py:class:: AWSAthenaHook(aws_conn_id='aws_default', sleep_time=30, *args, **kwargs) |
| |
| Bases: :class:`airflow.contrib.hooks.aws_hook.AwsHook` |
| |
| Interact with AWS Athena to run queries, poll their status, and return query results. |
| |
| :param aws_conn_id: AWS connection to use. |
| :type aws_conn_id: str |
| :param sleep_time: Time, in seconds, to wait between two consecutive calls that check the query status on Athena. |
| :type sleep_time: int |
| |
| .. attribute:: INTERMEDIATE_STATES |
| :annotation: = ['QUEUED', 'RUNNING'] |
| |
| |
| |
| .. attribute:: FAILURE_STATES |
| :annotation: = ['FAILED', 'CANCELLED'] |
| |
| |
| |
| .. attribute:: SUCCESS_STATES |
| :annotation: = ['SUCCEEDED'] |
| |
| |
| |
| |
| .. method:: get_conn(self) |
| |
| Return the existing AWS connection, creating it first if it does not exist yet. |
| |
| :return: boto3 Athena client |
| |
| |
| |
| |
| .. method:: run_query(self, query, query_context, result_configuration, client_request_token=None, workgroup='primary') |
| |
| Run a Presto query on Athena with the provided configuration and return the submitted query_execution_id. |
| |
| :param query: Presto query to run |
| :type query: str |
| :param query_context: Context in which the query needs to be run |
| :type query_context: dict |
| :param result_configuration: Dict with the path to store results in and encryption-related settings |
| :type result_configuration: dict |
| :param client_request_token: Unique token created by the user to avoid multiple executions of the same query |
| :type client_request_token: str |
| :param workgroup: Athena workgroup name; defaults to 'primary' when not specified |
| :type workgroup: str |
| :return: str |
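
The arguments described above can be assembled as in the following minimal sketch. It assumes that ``query_context`` and ``result_configuration`` take the shapes that boto3's Athena ``start_query_execution`` call documents for ``QueryExecutionContext`` and ``ResultConfiguration``. The helper name ``build_run_query_args``, the database name, and the S3 bucket are hypothetical and exist only for illustration.

```python
import uuid


def build_run_query_args(query, database, output_location, workgroup="primary"):
    """Assemble keyword arguments for run_query (hypothetical helper)."""
    return {
        "query": query,
        # Assumed shape: Athena's QueryExecutionContext.
        "query_context": {"Database": database},
        # Assumed shape: Athena's ResultConfiguration.
        "result_configuration": {"OutputLocation": output_location},
        # A fresh UUID string; reusing a token for a retry avoids a duplicate run.
        "client_request_token": str(uuid.uuid4()),
        "workgroup": workgroup,
    }


args = build_run_query_args(
    "SELECT 1",
    database="example_db",  # assumed database name
    output_location="s3://example-bucket/athena-results/",  # assumed bucket
)
# With a configured Airflow connection, one would then call something like:
#   query_execution_id = AWSAthenaHook().run_query(**args)
```

Generating the token on the caller's side keeps it available for a retry of the same submission.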
| |
| |
| |
| |
| .. method:: check_query_status(self, query_execution_id) |
| |
| Fetch the status of a submitted Athena query. Returns None or one of the valid query states. |
| |
| :param query_execution_id: ID of the submitted Athena query |
| :type query_execution_id: str |
| :return: str |
| |
| |
| |
| |
| .. method:: get_state_change_reason(self, query_execution_id) |
| |
| Fetch the reason for a state change (e.g. an error message). Returns None or the reason string. |
| |
| :param query_execution_id: ID of the submitted Athena query |
| :type query_execution_id: str |
| :return: str |
| |
| |
| |
| |
| .. method:: get_query_results(self, query_execution_id) |
| |
| Fetch the results of a submitted Athena query. Returns None if the query is in an intermediate, |
| failed, or cancelled state; otherwise returns a dict of the query output. |
| |
| :param query_execution_id: ID of the submitted Athena query |
| :type query_execution_id: str |
| :return: dict |
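
The returned dict can be flattened into rows as in the sketch below. It assumes the dict follows the Athena ``GetQueryResults`` response layout (``ResultSet`` → ``Rows`` → ``Data`` → ``VarCharValue``) and that the first row holds the column names, which is the usual case for ``SELECT`` output. The function name ``rows_as_dicts`` and the sample data are hypothetical, and the sketch handles a single page of results only.

```python
def rows_as_dicts(results):
    """Convert a GetQueryResults-style dict into a list of row dicts."""
    if results is None:
        # No results: the query is still running, failed, or was cancelled.
        return []
    rows = results["ResultSet"]["Rows"]
    if not rows:
        return []
    # Assumption: the first row carries the column names.
    header = [cell.get("VarCharValue") for cell in rows[0]["Data"]]
    return [
        dict(zip(header, (cell.get("VarCharValue") for cell in row["Data"])))
        for row in rows[1:]
    ]


# Hand-written sample in the assumed response shape, not real Athena output.
sample = {
    "ResultSet": {
        "Rows": [
            {"Data": [{"VarCharValue": "id"}, {"VarCharValue": "name"}]},
            {"Data": [{"VarCharValue": "1"}, {"VarCharValue": "alice"}]},
            {"Data": [{"VarCharValue": "2"}, {}]},  # cell without a value
        ]
    }
}
parsed = rows_as_dicts(sample)
```

All values arrive as strings in this layout, so any type conversion is left to the caller.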
| |
| |
| |
| |
| .. method:: poll_query_status(self, query_execution_id, max_tries=None) |
| |
| Poll the status of a submitted Athena query until the query reaches a final state. |
| Returns one of the final states. |
| |
| :param query_execution_id: ID of the submitted Athena query |
| :type query_execution_id: str |
| :param max_tries: Number of times to poll for the query state before the function exits |
| :type max_tries: int |
| :return: str |
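
The polling behaviour described above can be sketched as follows. This is not the hook's source code: ``poll_until_final`` is a hypothetical stand-alone function, and the ``check_status`` callable stands in for ``check_query_status(query_execution_id)``. The final states are taken from the ``SUCCESS_STATES`` and ``FAILURE_STATES`` attributes listed earlier. Returning the last observed state once ``max_tries`` runs out is a choice made for this sketch.

```python
import time

# Final states, combining the documented SUCCESS_STATES and FAILURE_STATES.
FINAL_STATES = ("SUCCEEDED", "FAILED", "CANCELLED")


def poll_until_final(check_status, sleep_time=30, max_tries=None):
    """Call check_status() until it reports a final state or tries run out."""
    tries = 0
    while True:
        state = check_status()
        if state in FINAL_STATES:
            return state
        tries += 1
        if max_tries is not None and tries >= max_tries:
            # Sketch choice: give up and return the last observed state.
            return state
        time.sleep(sleep_time)


# Made-up status sequence standing in for real Athena responses.
fake_states = iter(["QUEUED", "RUNNING", "SUCCEEDED"])
final_state = poll_until_final(lambda: next(fake_states), sleep_time=0)

# With a try limit, polling stops before a final state is seen.
stuck_states = iter(["QUEUED", "RUNNING", "RUNNING"])
limited_state = poll_until_final(lambda: next(stuck_states), sleep_time=0, max_tries=2)
```

Passing the status check in as a callable keeps the loop testable without an AWS connection.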
| |
| |
| |
| |
| .. method:: stop_query(self, query_execution_id) |
| |
| Cancel the submitted Athena query. |
| |
| :param query_execution_id: ID of the submitted Athena query |
| :type query_execution_id: str |
| :return: dict |
| |
| |
| |
| |