blob: 29fd07902e282f42231efca541e41833546faf88 [file] [log] [blame]
:py:mod:`airflow.providers.amazon.aws.hooks.athena`
===================================================
.. py:module:: airflow.providers.amazon.aws.hooks.athena
.. autoapi-nested-parse::
This module contains AWS Athena hook.
.. spelling::
PageIterator
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.amazon.aws.hooks.athena.AthenaHook
.. py:class:: AthenaHook(*args, sleep_time = 30, **kwargs)
Bases: :py:obj:`airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
Interact with AWS Athena to run, poll queries and return query results
Additional arguments (such as ``aws_conn_id``) may be specified and
are passed down to the underlying AwsBaseHook.
.. seealso::
:class:`~airflow.providers.amazon.aws.hooks.base_aws.AwsBaseHook`
:param sleep_time: Time (in seconds) to wait between two consecutive calls to check query status on Athena
.. py:attribute:: INTERMEDIATE_STATES
:annotation: = ['QUEUED', 'RUNNING']
.. py:attribute:: FAILURE_STATES
:annotation: = ['FAILED', 'CANCELLED']
.. py:attribute:: SUCCESS_STATES
:annotation: = ['SUCCEEDED']
.. py:attribute:: TERMINAL_STATES
:annotation: = ['SUCCEEDED', 'FAILED', 'CANCELLED']
.. py:method:: run_query(query, query_context, result_configuration, client_request_token = None, workgroup = 'primary')
Run Presto query on athena with provided config and return submitted query_execution_id
:param query: Presto query to run
:param query_context: Context in which query need to be run
:param result_configuration: Dict with path to store results in and config related to encryption
:param client_request_token: Unique token created by user to avoid multiple executions of same query
:param workgroup: Athena workgroup name, when not specified, will be 'primary'
:return: str
.. py:method:: check_query_status(query_execution_id)
Fetch the status of submitted athena query. Returns None or one of valid query states.
:param query_execution_id: Id of submitted athena query
:return: str
.. py:method:: get_state_change_reason(query_execution_id)
Fetch the reason for a state change (e.g. error message). Returns None or reason string.
:param query_execution_id: Id of submitted athena query
:return: str
.. py:method:: get_query_results(query_execution_id, next_token_id = None, max_results = 1000)
Fetch submitted athena query results. returns none if query is in intermediate state or
failed/cancelled state else dict of query output
:param query_execution_id: Id of submitted athena query
:param next_token_id: The token that specifies where to start pagination.
:param max_results: The maximum number of results (rows) to return in this request.
:return: dict
.. py:method:: get_query_results_paginator(query_execution_id, max_items = None, page_size = None, starting_token = None)
Fetch submitted athena query results. returns none if query is in intermediate state or
failed/cancelled state else a paginator to iterate through pages of results. If you
wish to get all results at once, call build_full_result() on the returned PageIterator
:param query_execution_id: Id of submitted athena query
:param max_items: The total number of items to return.
:param page_size: The size of each page.
:param starting_token: A token to specify where to start paginating.
:return: PageIterator
.. py:method:: poll_query_status(query_execution_id, max_tries = None)
Poll the status of submitted athena query until query state reaches final state.
Returns one of the final states
:param query_execution_id: Id of submitted athena query
:param max_tries: Number of times to poll for query state before function exits
:return: str
.. py:method:: get_output_location(query_execution_id)
Function to get the output location of the query results
in s3 uri format.
:param query_execution_id: Id of submitted athena query
:return: str
.. py:method:: stop_query(query_execution_id)
Cancel the submitted athena query
:param query_execution_id: Id of submitted athena query
:return: dict