:py:mod:`airflow.providers.amazon.aws.operators.emr`
====================================================

.. py:module:: airflow.providers.amazon.aws.operators.emr


Module Contents
---------------

Classes
~~~~~~~

.. autoapisummary::

   airflow.providers.amazon.aws.operators.emr.EmrAddStepsOperator
   airflow.providers.amazon.aws.operators.emr.EmrContainerOperator
   airflow.providers.amazon.aws.operators.emr.EmrClusterLink
   airflow.providers.amazon.aws.operators.emr.EmrCreateJobFlowOperator
   airflow.providers.amazon.aws.operators.emr.EmrModifyClusterOperator
   airflow.providers.amazon.aws.operators.emr.EmrTerminateJobFlowOperator




.. py:class:: EmrAddStepsOperator(*, job_flow_id = None, job_flow_name = None, cluster_states = None, aws_conn_id = 'aws_default', steps = None, **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   An operator that adds steps to an existing EMR job_flow.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:EmrAddStepsOperator`

   :param job_flow_id: id of the JobFlow to add steps to. (templated)
   :param job_flow_name: name of the JobFlow to add steps to. Use as an alternative to passing
       job_flow_id. will search for id of JobFlow with matching name in one of the states in
       param cluster_states. Exactly one cluster like this should exist or will fail. (templated)
   :param cluster_states: Acceptable cluster states when searching for JobFlow id by job_flow_name.
       (templated)
   :param aws_conn_id: aws connection to uses
   :param steps: boto3 style steps or reference to a steps file (must be '.json') to
       be added to the jobflow. (templated)
   :param do_xcom_push: if True, job_flow_id is pushed to XCom with key job_flow_id.

   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['job_flow_id', 'job_flow_name', 'cluster_states', 'steps']

      

   .. py:attribute:: template_ext
      :annotation: :Sequence[str] = ['.json']

      

   .. py:attribute:: template_fields_renderers
      

      

   .. py:attribute:: ui_color
      :annotation: = #f9c915

      

   .. py:method:: execute(self, context)

      This is the main method to derive when creating an operator.
      Context is the same dictionary used as when rendering jinja templates.

      Refer to get_template_context for more context.



.. py:class:: EmrContainerOperator(*, name, virtual_cluster_id, execution_role_arn, release_label, job_driver, configuration_overrides = None, client_request_token = None, aws_conn_id = 'aws_default', wait_for_completion = True, poll_interval = 30, max_tries = None, tags = None, **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   An operator that submits jobs to EMR on EKS virtual clusters.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:EmrContainerOperator`

   :param name: The name of the job run.
   :param virtual_cluster_id: The EMR on EKS virtual cluster ID
   :param execution_role_arn: The IAM role ARN associated with the job run.
   :param release_label: The Amazon EMR release version to use for the job run.
   :param job_driver: Job configuration details, e.g. the Spark job parameters.
   :param configuration_overrides: The configuration overrides for the job run,
       specifically either application configuration or monitoring configuration.
   :param client_request_token: The client idempotency token of the job run request.
       Use this if you want to specify a unique ID to prevent two jobs from getting started.
       If no token is provided, a UUIDv4 token will be generated for you.
   :param aws_conn_id: The Airflow connection used for AWS credentials.
   :param wait_for_completion: Whether or not to wait in the operator for the job to complete.
   :param poll_interval: Time (in seconds) to wait between two consecutive calls to check query status on EMR
   :param max_tries: Maximum number of times to wait for the job run to finish.
       Defaults to None, which will poll until the job is *not* in a pending, submitted, or running state.
   :param tags: The tags assigned to job runs.
       Defaults to None

   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['name', 'virtual_cluster_id', 'execution_role_arn', 'release_label', 'job_driver']

      

   .. py:attribute:: ui_color
      :annotation: = #f9c915

      

   .. py:method:: hook(self)

      Create and return an EmrContainerHook.


   .. py:method:: execute(self, context)

      Run job on EMR Containers


   .. py:method:: on_kill(self)

      Cancel the submitted job run



.. py:class:: EmrClusterLink

   Bases: :py:obj:`airflow.models.BaseOperatorLink`

   Operator link for EmrCreateJobFlowOperator. It allows users to access the EMR Cluster

   .. py:attribute:: name
      :annotation: = EMR Cluster

      

   .. py:method:: get_link(self, operator, dttm = None, ti_key = None)

      Get link to EMR cluster.

      :param operator: operator
      :param dttm: datetime
      :return: url link



.. py:class:: EmrCreateJobFlowOperator(*, aws_conn_id = 'aws_default', emr_conn_id = 'emr_default', job_flow_overrides = None, region_name = None, **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   Creates an EMR JobFlow, reading the config from the EMR connection.
   A dictionary of JobFlow overrides can be passed that override
   the config from the connection.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:EmrCreateJobFlowOperator`

   :param aws_conn_id: aws connection to uses
   :param emr_conn_id: emr connection to use
   :param job_flow_overrides: boto3 style arguments or reference to an arguments file
       (must be '.json') to override emr_connection extra. (templated)
   :param region_name: Region named passed to EmrHook

   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['job_flow_overrides']

      

   .. py:attribute:: template_ext
      :annotation: :Sequence[str] = ['.json']

      

   .. py:attribute:: template_fields_renderers
      

      

   .. py:attribute:: ui_color
      :annotation: = #f9c915

      

   .. py:attribute:: operator_extra_links
      

      

   .. py:method:: execute(self, context)

      This is the main method to derive when creating an operator.
      Context is the same dictionary used as when rendering jinja templates.

      Refer to get_template_context for more context.



.. py:class:: EmrModifyClusterOperator(*, cluster_id, step_concurrency_level, aws_conn_id = 'aws_default', **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   An operator that modifies an existing EMR cluster.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:EmrModifyClusterOperator`

   :param cluster_id: cluster identifier
   :param step_concurrency_level: Concurrency of the cluster
   :param aws_conn_id: aws connection to uses
   :param do_xcom_push: if True, cluster_id is pushed to XCom with key cluster_id.

   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['cluster_id', 'step_concurrency_level']

      

   .. py:attribute:: template_ext
      :annotation: :Sequence[str] = []

      

   .. py:attribute:: ui_color
      :annotation: = #f9c915

      

   .. py:method:: execute(self, context)

      This is the main method to derive when creating an operator.
      Context is the same dictionary used as when rendering jinja templates.

      Refer to get_template_context for more context.



.. py:class:: EmrTerminateJobFlowOperator(*, job_flow_id, aws_conn_id = 'aws_default', **kwargs)

   Bases: :py:obj:`airflow.models.BaseOperator`

   Operator to terminate EMR JobFlows.

   .. seealso::
       For more information on how to use this operator, take a look at the guide:
       :ref:`howto/operator:EmrTerminateJobFlowOperator`

   :param job_flow_id: id of the JobFlow to terminate. (templated)
   :param aws_conn_id: aws connection to uses

   .. py:attribute:: template_fields
      :annotation: :Sequence[str] = ['job_flow_id']

      

   .. py:attribute:: template_ext
      :annotation: :Sequence[str] = []

      

   .. py:attribute:: ui_color
      :annotation: = #f9c915

      

   .. py:method:: execute(self, context)

      This is the main method to derive when creating an operator.
      Context is the same dictionary used as when rendering jinja templates.

      Refer to get_template_context for more context.



