blob: 74f734d055e8ba758e39f48c3da9c7f5d75cc154 [file] [log] [blame]
:py:mod:`airflow.providers.amazon.aws.operators.sagemaker`
==========================================================
.. py:module:: airflow.providers.amazon.aws.operators.sagemaker
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.amazon.aws.operators.sagemaker.SageMakerBaseOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerProcessingOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointConfigOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerEndpointOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerTransformOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerTuningOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerModelOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerTrainingOperator
airflow.providers.amazon.aws.operators.sagemaker.SageMakerDeleteModelOperator
Attributes
~~~~~~~~~~
.. autoapisummary::
airflow.providers.amazon.aws.operators.sagemaker.DEFAULT_CONN_ID
airflow.providers.amazon.aws.operators.sagemaker.CHECK_INTERVAL_SECOND
.. py:data:: DEFAULT_CONN_ID
:annotation: = aws_default
.. py:data:: CHECK_INTERVAL_SECOND
:annotation: = 30
.. py:class:: SageMakerBaseOperator(*, config, **kwargs)
Bases: :py:obj:`airflow.models.BaseOperator`
This is the base operator for all SageMaker operators.
:param config: The configuration necessary to start a training job (templated)
.. py:attribute:: template_fields
:annotation: :Sequence[str] = ['config']
.. py:attribute:: template_ext
:annotation: :Sequence[str] = []
.. py:attribute:: template_fields_renderers
.. py:attribute:: ui_color
:annotation: = #ededed
.. py:attribute:: integer_fields
:annotation: :List[List[Any]] = []
.. py:method:: parse_integer(self, config, field)
Recursive method for parsing string fields holding integer values to integers.
.. py:method:: parse_config_integers(self)
Parse the integer fields of training config to integers in case the config is rendered by Jinja and
all fields are str
.. py:method:: expand_role(self)
Placeholder for calling boto3's `expand_role`, which expands an IAM role name into an ARN.
.. py:method:: preprocess_config(self)
Process the config into a usable form.
.. py:method:: execute(self, context)
:abstractmethod:
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:method:: hook(self)
Return SageMakerHook
.. py:class:: SageMakerProcessingOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, print_log = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, action_if_job_exists = 'increment', **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
Use Amazon SageMaker Processing to analyze data and evaluate machine learning
models on Amazon SageMake. With Processing, you can use a simplified, managed
experience on SageMaker to run your data processing workloads, such as feature
engineering, data validation, model evaluation, and model interpretation.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerProcessingOperator`
:param config: The configuration necessary to start a processing job (templated).
For details of the configuration parameter see :py:meth:`SageMaker.Client.create_processing_job`
:param aws_conn_id: The AWS connection ID to use.
:param wait_for_completion: If wait is set to True, the time interval, in seconds,
that the operation waits to check the status of the processing job.
:param print_log: if the operator should print the cloudwatch log during processing
:param check_interval: if wait is set to be true, this is the time interval
in seconds which the operator will check the status of the processing job
:param max_ingestion_time: If wait is set to True, the operation fails if the processing job
doesn't finish within max_ingestion_time seconds. If you set this parameter to None,
the operation does not timeout.
:param action_if_job_exists: Behaviour if the job name already exists. Possible options are "increment"
(default) and "fail".
:return Dict: Returns The ARN of the processing job created in Amazon SageMaker.
.. py:method:: expand_role(self)
Placeholder for calling boto3's `expand_role`, which expands an IAM role name into an ARN.
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: SageMakerEndpointConfigOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
Creates an endpoint configuration that Amazon SageMaker hosting
services uses to deploy models. In the configuration, you identify
one or more models, created using the CreateModel API, to deploy and
the resources that you want Amazon SageMaker to provision.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerEndpointConfigOperator`
:param config: The configuration necessary to create an endpoint config.
For details of the configuration parameter see :py:meth:`SageMaker.Client.create_endpoint_config`
:param aws_conn_id: The AWS connection ID to use.
:return Dict: Returns The ARN of the endpoint config created in Amazon SageMaker.
.. py:attribute:: integer_fields
:annotation: = [['ProductionVariants', 'InitialInstanceCount']]
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: SageMakerEndpointOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, operation = 'create', **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
When you create a serverless endpoint, SageMaker provisions and manages
the compute resources for you. Then, you can make inference requests to
the endpoint and receive model predictions in response. SageMaker scales
the compute resources up and down as needed to handle your request traffic.
Requires an Endpoint Config.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerEndpointOperator`
:param config:
The configuration necessary to create an endpoint.
If you need to create a SageMaker endpoint based on an existed
SageMaker model and an existed SageMaker endpoint config::
config = endpoint_configuration;
If you need to create all of SageMaker model, SageMaker endpoint-config and SageMaker endpoint::
config = {
'Model': model_configuration,
'EndpointConfig': endpoint_config_configuration,
'Endpoint': endpoint_configuration
}
For details of the configuration parameter of model_configuration see
:py:meth:`SageMaker.Client.create_model`
For details of the configuration parameter of endpoint_config_configuration see
:py:meth:`SageMaker.Client.create_endpoint_config`
For details of the configuration parameter of endpoint_configuration see
:py:meth:`SageMaker.Client.create_endpoint`
:param wait_for_completion: Whether the operator should wait until the endpoint creation finishes.
:param check_interval: If wait is set to True, this is the time interval, in seconds, that this operation
waits before polling the status of the endpoint creation.
:param max_ingestion_time: If wait is set to True, this operation fails if the endpoint creation doesn't
finish within max_ingestion_time seconds. If you set this parameter to None it never times out.
:param operation: Whether to create an endpoint or update an endpoint. Must be either 'create or 'update'.
:param aws_conn_id: The AWS connection ID to use.
:return Dict: Returns The ARN of the endpoint created in Amazon SageMaker.
.. py:method:: create_integer_fields(self)
Set fields which should be casted to integers.
.. py:method:: expand_role(self)
Placeholder for calling boto3's `expand_role`, which expands an IAM role name into an ARN.
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: SageMakerTransformOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
Starts a transform job. A transform job uses a trained model to get inferences
on a dataset and saves these results to an Amazon S3 location that you specify.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerTransformOperator`
:param config: The configuration necessary to start a transform job (templated).
If you need to create a SageMaker transform job based on an existed SageMaker model::
config = transform_config
If you need to create both SageMaker model and SageMaker Transform job::
config = {
'Model': model_config,
'Transform': transform_config
}
For details of the configuration parameter of transform_config see
:py:meth:`SageMaker.Client.create_transform_job`
For details of the configuration parameter of model_config, See:
:py:meth:`SageMaker.Client.create_model`
:param aws_conn_id: The AWS connection ID to use.
:param wait_for_completion: Set to True to wait until the transform job finishes.
:param check_interval: If wait is set to True, the time interval, in seconds,
that this operation waits to check the status of the transform job.
:param max_ingestion_time: If wait is set to True, the operation fails
if the transform job doesn't finish within max_ingestion_time seconds. If you
set this parameter to None, the operation does not timeout.
:return Dict: Returns The ARN of the model created in Amazon SageMaker.
.. py:method:: create_integer_fields(self)
Set fields which should be casted to integers.
.. py:method:: expand_role(self)
Placeholder for calling boto3's `expand_role`, which expands an IAM role name into an ARN.
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: SageMakerTuningOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
Starts a hyperparameter tuning job. A hyperparameter tuning job finds the
best version of a model by running many training jobs on your dataset using
the algorithm you choose and values for hyperparameters within ranges that
you specify. It then chooses the hyperparameter values that result in a model
that performs the best, as measured by an objective metric that you choose.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerTuningOperator`
:param config: The configuration necessary to start a tuning job (templated).
For details of the configuration parameter see
:py:meth:`SageMaker.Client.create_hyper_parameter_tuning_job`
:param aws_conn_id: The AWS connection ID to use.
:param wait_for_completion: Set to True to wait until the tuning job finishes.
:param check_interval: If wait is set to True, the time interval, in seconds,
that this operation waits to check the status of the tuning job.
:param max_ingestion_time: If wait is set to True, the operation fails
if the tuning job doesn't finish within max_ingestion_time seconds. If you
set this parameter to None, the operation does not timeout.
:return Dict: Returns The ARN of the tuning job created in Amazon SageMaker.
.. py:attribute:: integer_fields
:annotation: = [['HyperParameterTuningJobConfig', 'ResourceLimits', 'MaxNumberOfTrainingJobs'],...
.. py:method:: expand_role(self)
Placeholder for calling boto3's `expand_role`, which expands an IAM role name into an ARN.
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: SageMakerModelOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
Creates a model in Amazon SageMaker. In the request, you name the model and
describe a primary container. For the primary container, you specify the Docker
image that contains inference code, artifacts (from prior training), and a custom
environment map that the inference code uses when you deploy the model for predictions.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerModelOperator`
:param config: The configuration necessary to create a model.
For details of the configuration parameter see :py:meth:`SageMaker.Client.create_model`
:param aws_conn_id: The AWS connection ID to use.
:return Dict: Returns The ARN of the model created in Amazon SageMaker.
.. py:method:: expand_role(self)
Placeholder for calling boto3's `expand_role`, which expands an IAM role name into an ARN.
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: SageMakerTrainingOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, wait_for_completion = True, print_log = True, check_interval = CHECK_INTERVAL_SECOND, max_ingestion_time = None, check_if_job_exists = True, action_if_job_exists = 'increment', **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
Starts a model training job. After training completes, Amazon SageMaker saves
the resulting model artifacts to an Amazon S3 location that you specify.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerTrainingOperator`
:param config: The configuration necessary to start a training job (templated).
For details of the configuration parameter see :py:meth:`SageMaker.Client.create_training_job`
:param aws_conn_id: The AWS connection ID to use.
:param wait_for_completion: If wait is set to True, the time interval, in seconds,
that the operation waits to check the status of the training job.
:param print_log: if the operator should print the cloudwatch log during training
:param check_interval: if wait is set to be true, this is the time interval
in seconds which the operator will check the status of the training job
:param max_ingestion_time: If wait is set to True, the operation fails if the training job
doesn't finish within max_ingestion_time seconds. If you set this parameter to None,
the operation does not timeout.
:param check_if_job_exists: If set to true, then the operator will check whether a training job
already exists for the name in the config.
:param action_if_job_exists: Behaviour if the job name already exists. Possible options are "increment"
(default) and "fail".
This is only relevant if check_if
:return Dict: Returns The ARN of the training job created in Amazon SageMaker.
.. py:attribute:: integer_fields
:annotation: = [['ResourceConfig', 'InstanceCount'], ['ResourceConfig', 'VolumeSizeInGB'],...
.. py:method:: expand_role(self)
Placeholder for calling boto3's `expand_role`, which expands an IAM role name into an ARN.
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.
.. py:class:: SageMakerDeleteModelOperator(*, config, aws_conn_id = DEFAULT_CONN_ID, **kwargs)
Bases: :py:obj:`SageMakerBaseOperator`
Deletes a SageMaker model.
.. seealso::
For more information on how to use this operator, take a look at the guide:
:ref:`howto/operator:SageMakerDeleteModelOperator`
:param config: The configuration necessary to delete the model.
For details of the configuration parameter see :py:meth:`SageMaker.Client.delete_model`
:param aws_conn_id: The AWS connection ID to use.
.. py:method:: execute(self, context)
This is the main method to derive when creating an operator.
Context is the same dictionary used as when rendering jinja templates.
Refer to get_template_context for more context.