| :py:mod:`airflow.providers.amazon.aws.operators.emr` |
| ==================================================== |
| |
| .. py:module:: airflow.providers.amazon.aws.operators.emr |
| |
| |
| Module Contents |
| --------------- |
| |
| Classes |
| ~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.providers.amazon.aws.operators.emr.EmrAddStepsOperator |
| airflow.providers.amazon.aws.operators.emr.EmrContainerOperator |
| airflow.providers.amazon.aws.operators.emr.EmrClusterLink |
| airflow.providers.amazon.aws.operators.emr.EmrCreateJobFlowOperator |
| airflow.providers.amazon.aws.operators.emr.EmrModifyClusterOperator |
| airflow.providers.amazon.aws.operators.emr.EmrTerminateJobFlowOperator |
| |
| |
| |
| |
| .. py:class:: EmrAddStepsOperator(*, job_flow_id = None, job_flow_name = None, cluster_states = None, aws_conn_id = 'aws_default', steps = None, **kwargs) |
| |
| Bases: :py:obj:`airflow.models.BaseOperator` |
| |
| An operator that adds steps to an existing EMR job_flow. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:EmrAddStepsOperator` |
| |
| :param job_flow_id: id of the JobFlow to add steps to. (templated) |
| :param job_flow_name: name of the JobFlow to add steps to. Use as an alternative to passing |
| job_flow_id. will search for id of JobFlow with matching name in one of the states in |
| param cluster_states. Exactly one cluster like this should exist or will fail. (templated) |
| :param cluster_states: Acceptable cluster states when searching for JobFlow id by job_flow_name. |
| (templated) |
| :param aws_conn_id: aws connection to uses |
| :param steps: boto3 style steps or reference to a steps file (must be '.json') to |
| be added to the jobflow. (templated) |
| :param do_xcom_push: if True, job_flow_id is pushed to XCom with key job_flow_id. |
| |
| .. py:attribute:: template_fields |
| :annotation: :Sequence[str] = ['job_flow_id', 'job_flow_name', 'cluster_states', 'steps'] |
| |
| |
| |
| .. py:attribute:: template_ext |
| :annotation: :Sequence[str] = ['.json'] |
| |
| |
| |
| .. py:attribute:: template_fields_renderers |
| |
| |
| |
| |
| .. py:attribute:: ui_color |
| :annotation: = #f9c915 |
| |
| |
| |
| .. py:method:: execute(self, context) |
| |
| This is the main method to derive when creating an operator. |
| Context is the same dictionary used as when rendering jinja templates. |
| |
| Refer to get_template_context for more context. |
| |
| |
| |
| .. py:class:: EmrContainerOperator(*, name, virtual_cluster_id, execution_role_arn, release_label, job_driver, configuration_overrides = None, client_request_token = None, aws_conn_id = 'aws_default', poll_interval = 30, max_tries = None, **kwargs) |
| |
| Bases: :py:obj:`airflow.models.BaseOperator` |
| |
| An operator that submits jobs to EMR on EKS virtual clusters. |
| |
| :param name: The name of the job run. |
| :param virtual_cluster_id: The EMR on EKS virtual cluster ID |
| :param execution_role_arn: The IAM role ARN associated with the job run. |
| :param release_label: The Amazon EMR release version to use for the job run. |
| :param job_driver: Job configuration details, e.g. the Spark job parameters. |
| :param configuration_overrides: The configuration overrides for the job run, |
| specifically either application configuration or monitoring configuration. |
| :param client_request_token: The client idempotency token of the job run request. |
| Use this if you want to specify a unique ID to prevent two jobs from getting started. |
| If no token is provided, a UUIDv4 token will be generated for you. |
| :param aws_conn_id: The Airflow connection used for AWS credentials. |
| :param poll_interval: Time (in seconds) to wait between two consecutive calls to check query status on EMR |
| :param max_tries: Maximum number of times to wait for the job run to finish. |
| Defaults to None, which will poll until the job is *not* in a pending, submitted, or running state. |
| |
| .. py:attribute:: template_fields |
| :annotation: :Sequence[str] = ['name', 'virtual_cluster_id', 'execution_role_arn', 'release_label', 'job_driver'] |
| |
| |
| |
| .. py:attribute:: ui_color |
| :annotation: = #f9c915 |
| |
| |
| |
| .. py:method:: hook(self) |
| |
| Create and return an EmrContainerHook. |
| |
| |
| .. py:method:: execute(self, context) |
| |
| Run job on EMR Containers |
| |
| |
| .. py:method:: on_kill(self) |
| |
| Cancel the submitted job run |
| |
| |
| |
| .. py:class:: EmrClusterLink |
| |
| Bases: :py:obj:`airflow.models.BaseOperatorLink` |
| |
| Operator link for EmrCreateJobFlowOperator. It allows users to access the EMR Cluster |
| |
| .. py:attribute:: name |
| :annotation: = EMR Cluster |
| |
| |
| |
| .. py:method:: get_link(self, operator, dttm = None, ti_key = None) |
| |
| Get link to EMR cluster. |
| |
| :param operator: operator |
| :param dttm: datetime |
| :return: url link |
| |
| |
| |
| .. py:class:: EmrCreateJobFlowOperator(*, aws_conn_id = 'aws_default', emr_conn_id = 'emr_default', job_flow_overrides = None, region_name = None, **kwargs) |
| |
| Bases: :py:obj:`airflow.models.BaseOperator` |
| |
| Creates an EMR JobFlow, reading the config from the EMR connection. |
| A dictionary of JobFlow overrides can be passed that override |
| the config from the connection. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:EmrCreateJobFlowOperator` |
| |
| :param aws_conn_id: aws connection to uses |
| :param emr_conn_id: emr connection to use |
| :param job_flow_overrides: boto3 style arguments or reference to an arguments file |
| (must be '.json') to override emr_connection extra. (templated) |
| :param region_name: Region named passed to EmrHook |
| |
| .. py:attribute:: template_fields |
| :annotation: :Sequence[str] = ['job_flow_overrides'] |
| |
| |
| |
| .. py:attribute:: template_ext |
| :annotation: :Sequence[str] = ['.json'] |
| |
| |
| |
| .. py:attribute:: template_fields_renderers |
| |
| |
| |
| |
| .. py:attribute:: ui_color |
| :annotation: = #f9c915 |
| |
| |
| |
| .. py:attribute:: operator_extra_links |
| |
| |
| |
| |
| .. py:method:: execute(self, context) |
| |
| This is the main method to derive when creating an operator. |
| Context is the same dictionary used as when rendering jinja templates. |
| |
| Refer to get_template_context for more context. |
| |
| |
| |
| .. py:class:: EmrModifyClusterOperator(*, cluster_id, step_concurrency_level, aws_conn_id = 'aws_default', **kwargs) |
| |
| Bases: :py:obj:`airflow.models.BaseOperator` |
| |
| An operator that modifies an existing EMR cluster. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:EmrModifyClusterOperator` |
| |
| :param cluster_id: cluster identifier |
| :param step_concurrency_level: Concurrency of the cluster |
| :param aws_conn_id: aws connection to uses |
| :param do_xcom_push: if True, cluster_id is pushed to XCom with key cluster_id. |
| |
| .. py:attribute:: template_fields |
| :annotation: :Sequence[str] = ['cluster_id', 'step_concurrency_level'] |
| |
| |
| |
| .. py:attribute:: template_ext |
| :annotation: :Sequence[str] = [] |
| |
| |
| |
| .. py:attribute:: ui_color |
| :annotation: = #f9c915 |
| |
| |
| |
| .. py:method:: execute(self, context) |
| |
| This is the main method to derive when creating an operator. |
| Context is the same dictionary used as when rendering jinja templates. |
| |
| Refer to get_template_context for more context. |
| |
| |
| |
| .. py:class:: EmrTerminateJobFlowOperator(*, job_flow_id, aws_conn_id = 'aws_default', **kwargs) |
| |
| Bases: :py:obj:`airflow.models.BaseOperator` |
| |
| Operator to terminate EMR JobFlows. |
| |
| .. seealso:: |
| For more information on how to use this operator, take a look at the guide: |
| :ref:`howto/operator:EmrTerminateJobFlowOperator` |
| |
| :param job_flow_id: id of the JobFlow to terminate. (templated) |
| :param aws_conn_id: aws connection to uses |
| |
| .. py:attribute:: template_fields |
| :annotation: :Sequence[str] = ['job_flow_id'] |
| |
| |
| |
| .. py:attribute:: template_ext |
| :annotation: :Sequence[str] = [] |
| |
| |
| |
| .. py:attribute:: ui_color |
| :annotation: = #f9c915 |
| |
| |
| |
| .. py:method:: execute(self, context) |
| |
| This is the main method to derive when creating an operator. |
| Context is the same dictionary used as when rendering jinja templates. |
| |
| Refer to get_template_context for more context. |
| |
| |
| |