 :py:mod:`airflow.providers.apache.livy.operators.livy`
 ======================================================

 .. py:module:: airflow.providers.apache.livy.operators.livy

 .. autoapi-nested-parse::

    This module contains the Apache Livy operator.


 Module Contents
 ---------------

 Classes
 ~~~~~~~

 .. autoapisummary::

    airflow.providers.apache.livy.operators.livy.LivyOperator


 .. py:class:: LivyOperator(*, file, class_name = None, args = None, conf = None, jars = None, py_files = None, files = None, driver_memory = None, driver_cores = None, executor_memory = None, executor_cores = None, num_executors = None, archives = None, queue = None, name = None, proxy_user = None, livy_conn_id = 'livy_default', polling_interval = 0, extra_options = None, extra_headers = None, retry_args = None, **kwargs)

    Bases: :py:obj:`airflow.models.BaseOperator`

    This operator wraps the Apache Livy batch REST API, allowing you to submit a Spark
    application to the underlying cluster.

    :param file: path of the file containing the application to execute (required).
    :param class_name: name of the application Java/Spark main class.
    :param args: application command line arguments.
    :param jars: jars to be used in this session.
    :param py_files: python files to be used in this session.
    :param files: files to be used in this session.
    :param driver_memory: amount of memory to use for the driver process.
    :param driver_cores: number of cores to use for the driver process.
    :param executor_memory: amount of memory to use per executor process.
    :param executor_cores: number of cores to use for each executor.
    :param num_executors: number of executors to launch for this session.
    :param archives: archives to be used in this session.
    :param queue: name of the YARN queue to which the application is submitted.
    :param name: name of this session.
    :param conf: Spark configuration properties.
    :param proxy_user: user to impersonate when running the job.
    :param livy_conn_id: reference to a pre-defined Livy Connection.
    :param polling_interval: time in seconds between polls for job completion. Values <= 0 disable polling.
    :param extra_options: A dictionary of options, where key is string and value
        depends on the option that's being modified.
    :param extra_headers: A dictionary of headers passed to the HTTP request to Livy.
    :param retry_args: Arguments which define the retry behaviour.
            See Tenacity documentation at https://github.com/jd/tenacity

    .. py:attribute:: template_fields
       :annotation: :Sequence[str] = ['spark_params']


    .. py:method:: get_hook(self)

       Get valid hook.

       :return: hook
       :rtype: LivyHook


    .. py:method:: execute(self, context)

       This is the main method to derive when creating an operator.
       Context is the same dictionary used as when rendering jinja templates.

       Refer to get_template_context for more context.


    .. py:method:: poll_for_termination(self, batch_id)

       Poll Livy for batch termination.

       :param batch_id: id of the batch session to monitor.


    .. py:method:: on_kill(self)

       Override this method to clean up subprocesses when a task instance
       gets killed. Any use of the threading, subprocess or multiprocessing
       module within an operator needs to be cleaned up or it will leave
       ghost processes behind.


    .. py:method:: kill(self)

       Delete the current batch session.
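
The following is a minimal, standard-library-only sketch of the flow that ``LivyOperator`` wraps: turning the documented parameters into a Livy batch request body, then polling until the batch reaches a terminal state. It is not the provider's implementation. The helper names ``build_batch_body`` and ``poll_until_terminal``, the ``get_state`` callable, and the path ``/jobs/pi.py`` are all hypothetical, and the set of terminal states is an assumption based on the states Livy reports for batch sessions.

```python
import time
from typing import Any, Callable, Dict

# Assumed terminal batch states for this sketch.
TERMINAL_STATES = {"success", "dead", "killed", "error"}

# Operator-style parameter names mapped to Livy batch request field names.
FIELD_NAMES = {
    "file": "file",
    "class_name": "className",
    "args": "args",
    "jars": "jars",
    "py_files": "pyFiles",
    "files": "files",
    "driver_memory": "driverMemory",
    "driver_cores": "driverCores",
    "executor_memory": "executorMemory",
    "executor_cores": "executorCores",
    "num_executors": "numExecutors",
    "archives": "archives",
    "queue": "queue",
    "name": "name",
    "conf": "conf",
    "proxy_user": "proxyUser",
}


def build_batch_body(**params: Any) -> Dict[str, Any]:
    """Build a batch request body, dropping parameters left as None."""
    unknown = set(params) - set(FIELD_NAMES)
    if unknown:
        raise ValueError(f"unknown parameters: {sorted(unknown)}")
    if not params.get("file"):
        raise ValueError("'file' is required")
    return {FIELD_NAMES[k]: v for k, v in params.items() if v is not None}


def poll_until_terminal(
    get_state: Callable[[int], str],
    batch_id: int,
    polling_interval: float,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Call get_state repeatedly until it returns a terminal state."""
    state = get_state(batch_id)
    while state not in TERMINAL_STATES:
        sleep(polling_interval)
        state = get_state(batch_id)
    return state


# Hypothetical job: only the parameters that are set end up in the body.
body = build_batch_body(file="/jobs/pi.py", args=["10"], num_executors=2, queue=None)

# Stand-in for a Livy server: yields a fixed sequence of batch states.
fake_states = iter(["starting", "running", "success"])
final_state = poll_until_terminal(
    lambda batch_id: next(fake_states),
    batch_id=0,
    polling_interval=0,
    sleep=lambda seconds: None,  # skip real sleeping in the demo
)
```

Passing ``get_state`` and ``sleep`` as callables keeps the sketch runnable without a Livy server. In a real deployment the operator obtains batch state through its hook, and per the ``polling_interval`` parameter it only polls when the interval is positive.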