:py:mod:`airflow.providers.apache.hive.operators.hive`
======================================================
.. py:module:: airflow.providers.apache.hive.operators.hive
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.apache.hive.operators.hive.HiveOperator
.. py:class:: HiveOperator(*, hql, hive_cli_conn_id = 'hive_cli_default', schema = 'default', hiveconfs = None, hiveconf_jinja_translate = False, script_begin_tag = None, run_as_owner = False, mapred_queue = None, mapred_queue_priority = None, mapred_job_name = None, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
Executes HQL code or a Hive script in a specific Hive database.
:param hql: The HQL to execute. This may also be a path, relative to the
DAG file, to a (templated) Hive script. (templated)
:param hive_cli_conn_id: Reference to the
:ref:`Hive CLI connection id <howto/connection:hive_cli>`. (templated)
:param schema: Hive database (schema) in which the HQL is executed. (templated)
:param hiveconfs: If defined, these key-value pairs are passed
to Hive as ``-hiveconf "key"="value"``.
:param hiveconf_jinja_translate: When True, hiveconf-style templating is
translated into Jinja templating: ``${var}`` and ``${hiveconf:var}``
both become ``{{ var }}``.
You may want to use this along with the
``DAG(user_defined_macros=myargs)`` parameter. See the DAG
object documentation for more details.
:param script_begin_tag: If defined, the operator discards the
part of the script before the first occurrence of ``script_begin_tag``.
:param run_as_owner: Run the HQL code as the DAG's owner.
:param mapred_queue: Queue used by the Hadoop CapacityScheduler. (templated)
:param mapred_queue_priority: Priority within the CapacityScheduler queue.
Possible settings include: ``VERY_HIGH``, ``HIGH``, ``NORMAL``, ``LOW``, ``VERY_LOW``
:param mapred_job_name: This name will appear in the jobtracker.
This can make monitoring easier.
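As an illustration of the ``hiveconfs`` parameter described above, the following minimal sketch shows how a dictionary of settings could be flattened into ``-hiveconf key=value`` command-line arguments. The helper name ``build_hiveconf_args`` is hypothetical and is not part of the provider; the actual hook may build its command line differently.

```python
from typing import Dict, List, Optional


def build_hiveconf_args(hiveconfs: Optional[Dict[str, str]]) -> List[str]:
    """Flatten {"key": "value"} pairs into ["-hiveconf", "key=value", ...].

    Hypothetical helper for illustration only; not the provider's API.
    """
    args: List[str] = []
    # Treat None the same as an empty mapping.
    for key, value in (hiveconfs or {}).items():
        args.extend(["-hiveconf", f"{key}={value}"])
    return args


if __name__ == "__main__":
    print(build_hiveconf_args({"day": "2024-01-01", "table": "events"}))
```

Returning a list of separate arguments, rather than one joined string, avoids shell-quoting problems if the list is later handed to a subprocess call.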
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['hql', 'schema', 'hive_cli_conn_id', 'mapred_queue', 'hiveconfs', 'mapred_job_name',...
.. py:attribute:: template_ext
:annotation: :Sequence[str] = ['.hql', '.sql']
.. py:attribute:: template_fields_renderers
.. py:attribute:: ui_color
:annotation: = #f0e4ec
.. py:method:: get_hook(self)
Get the Hive CLI hook.
.. py:method:: prepare_template(self)
Hook triggered after the templated fields get replaced by their content.
If you need your operator to alter the content of the file before the
template is rendered, it should override this method to do so.
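For ``HiveOperator``, the ``hiveconf_jinja_translate`` and ``script_begin_tag`` options listed in the class parameters both describe rewrites of the script text. The following is a minimal sketch of those two rewrites in plain Python. The function name ``preprocess_hql`` and the regular expression are assumptions made for illustration and may differ from the provider's actual implementation.

```python
import re
from typing import Optional

# Matches ${var} and ${hiveconf:var}; the exact pattern is an assumption.
_HIVECONF_VAR = re.compile(r"\$\{(?:hiveconf:)?([A-Za-z0-9_]+)\}")


def preprocess_hql(
    hql: str,
    hiveconf_jinja_translate: bool = False,
    script_begin_tag: Optional[str] = None,
) -> str:
    """Hypothetical stand-in for the rewrites applied to the script text."""
    if hiveconf_jinja_translate:
        # ${var} and ${hiveconf:var} both become {{ var }}.
        hql = _HIVECONF_VAR.sub(r"{{ \1 }}", hql)
    if script_begin_tag and script_begin_tag in hql:
        # Keep only what follows the first occurrence of the tag.
        # Assumption: the tag itself is dropped along with the preamble.
        hql = hql.split(script_begin_tag, 1)[1]
    return hql
```

If the tag does not occur in the script, this sketch leaves the text unchanged instead of raising an error.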
.. py:method:: execute(self, context)
This is the main method to override when creating an operator.
The context is the same dictionary used when rendering Jinja templates.
Refer to ``get_template_context`` for more context.
.. py:method:: dry_run(self)
Performs a dry run for the operator: only renders the template fields.
.. py:method:: on_kill(self)
Override this method to clean up subprocesses when a task instance
gets killed. Any use of the threading, subprocess or multiprocessing
module within an operator needs to be cleaned up, or it will leave
ghost processes behind.
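To illustrate the kind of cleanup described above, here is a minimal sketch of a runner that tracks one child process and terminates it on request. The class ``SubprocessRunner`` and the function ``demo`` are hypothetical. They only show a general pattern that an ``on_kill`` override could delegate to, not how the Hive hook manages its process.

```python
import subprocess
import sys
from typing import List, Optional


class SubprocessRunner:
    """Hypothetical helper that owns a single child process."""

    def __init__(self) -> None:
        self.proc: Optional[subprocess.Popen] = None

    def start(self, argv: List[str]) -> None:
        self.proc = subprocess.Popen(argv)

    def kill(self, timeout: float = 5.0) -> None:
        """Terminate the child if it is still running, escalating if needed."""
        if self.proc is None or self.proc.poll() is not None:
            return  # Nothing to clean up.
        self.proc.terminate()
        try:
            self.proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            # The child ignored the polite request; force it to stop.
            self.proc.kill()
            self.proc.wait()


def demo() -> Optional[int]:
    """Start a long-running child, kill it, and return its exit status."""
    runner = SubprocessRunner()
    runner.start([sys.executable, "-c", "import time; time.sleep(60)"])
    runner.kill()
    return runner.proc.poll() if runner.proc else None
```

Waiting on the child after terminating it matters: without the ``wait`` call the process may linger as a zombie until the parent exits.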
.. py:method:: clear_airflow_vars(self)
Reset Airflow environment variables to prevent existing ones from impacting behavior.
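A minimal sketch of the idea, assuming the variables in question are the ``AIRFLOW_CTX_*`` context variables that Airflow exports for a running task, and assuming that "reset" means overwriting them with empty strings. The helper name ``blank_env_vars`` and the list of names below are illustrative; the provider derives its own list internally.

```python
import os
from typing import Iterable, MutableMapping

# Illustrative names only; treat this list as an assumption.
EXAMPLE_CONTEXT_VARS = (
    "AIRFLOW_CTX_DAG_ID",
    "AIRFLOW_CTX_TASK_ID",
    "AIRFLOW_CTX_EXECUTION_DATE",
)


def blank_env_vars(
    names: Iterable[str] = EXAMPLE_CONTEXT_VARS,
    env: MutableMapping[str, str] = os.environ,
) -> None:
    """Overwrite each named variable with an empty string in ``env``."""
    for name in names:
        env[name] = ""
```

Passing a plain dictionary as ``env`` makes it possible to try the helper without touching the real process environment.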