docs-archive/apache-airflow-providers-google/8.0.0/_sources/_api/airflow/providers/google/cloud/hooks/vertex_ai/batch_prediction_job/index.rst.txt - airflow-site - Git at Google

 :py:mod:`airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job`
 =============================================================================

 .. py:module:: airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job

 .. autoapi-nested-parse::

    This module contains a Google Cloud Vertex AI hook.

    .. spelling::

        jsonl
        codepoints
        aiplatform
        gapic


 Module Contents
 ---------------

 Classes
 ~~~~~~~

 .. autoapisummary::

    airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job.BatchPredictionJobHook


 .. py:class:: BatchPredictionJobHook(gcp_conn_id = 'google_cloud_default', delegate_to = None, impersonation_chain = None)

    Bases: :py:obj:`airflow.providers.google.common.hooks.base_google.GoogleBaseHook`

    Hook for Google Cloud Vertex AI Batch Prediction Job APIs.

    .. py:method:: get_job_service_client(self, region = None)

       Returns JobServiceClient.


    .. py:method:: wait_for_operation(self, operation, timeout = None)

       Waits for long-lasting operation to complete.


    .. py:method:: extract_batch_prediction_job_id(obj)
       :staticmethod:

       Returns unique id of the batch_prediction_job.


    .. py:method:: cancel_batch_prediction_job(self)

       Cancel BatchPredictionJob


    .. py:method:: create_batch_prediction_job(self, project_id, region, job_display_name, model_name, instances_format = 'jsonl', predictions_format = 'jsonl', gcs_source = None, bigquery_source = None, gcs_destination_prefix = None, bigquery_destination_prefix = None, model_parameters = None, machine_type = None, accelerator_type = None, accelerator_count = None, starting_replica_count = None, max_replica_count = None, generate_explanation = False, explanation_metadata = None, explanation_parameters = None, labels = None, encryption_spec_key_name = None, sync = True)

       Create a batch prediction job.

       :param project_id: Required. Project to run training in.
       :param region: Required. Location to run training in.
       :param job_display_name: Required. The user-defined name of the BatchPredictionJob. The name can be
           up to 128 characters long and can be consist of any UTF-8 characters.
       :param model_name: Required. A fully-qualified model resource name or model ID.
       :param instances_format: Required. The format in which instances are provided. Must be one of the
           formats listed in `Model.supported_input_storage_formats`. Default is "jsonl" when using
           `gcs_source`. If a `bigquery_source` is provided, this is overridden to "bigquery".
       :param predictions_format: Required. The format in which Vertex AI outputs the predictions, must be
           one of the formats specified in `Model.supported_output_storage_formats`. Default is "jsonl" when
           using `gcs_destination_prefix`. If a `bigquery_destination_prefix` is provided, this is
           overridden to "bigquery".
       :param gcs_source: Google Cloud Storage URI(-s) to your instances to run batch prediction on. They
           must match `instances_format`. May contain wildcards. For more information on wildcards, see
           https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames.
       :param bigquery_source: BigQuery URI to a table, up to 2000 characters long.
           For example: `bq://projectId.bqDatasetId.bqTableId`
       :param gcs_destination_prefix: The Google Cloud Storage location of the directory where the output is
           to be written to. In the given directory a new directory is created. Its name is
           ``prediction-<model-display-name>-<job-create-time>``, where timestamp is in
           YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. Inside of it files ``predictions_0001.<extension>``,
           ``predictions_0002.<extension>``, ..., ``predictions_N.<extension>`` are created where
           ``<extension>`` depends on chosen ``predictions_format``, and N may equal 0001 and depends on the
           total number of successfully predicted instances. If the Model has both ``instance`` and
           ``prediction`` schemata defined then each such file contains predictions as per the
           ``predictions_format``. If prediction for any instance failed (partially or completely), then an
           additional ``errors_0001.<extension>``, ``errors_0002.<extension>``,..., ``errors_N.<extension>``
           files are created (N depends on total number of failed predictions). These files contain the
           failed instances, as per their schema, followed by an additional ``error`` field which as value
           has ```google.rpc.Status`` <Status>`__ containing only ``code`` and ``message`` fields.
       :param bigquery_destination_prefix: The BigQuery project location where the output is to be written
           to. In the given project a new dataset is created with name
           ``prediction_<model-display-name>_<job-create-time>`` where is made BigQuery-dataset-name
           compatible (for example, most special characters become underscores), and timestamp is in
           YYYY_MM_DDThh_mm_ss_sssZ "based on ISO-8601" format. In the dataset two tables will be created,
           ``predictions``, and ``errors``. If the Model has both ``instance`` and ``prediction`` schemata
           defined then the tables have columns as follows: The ``predictions`` table contains instances for
           which the prediction succeeded, it has columns as per a concatenation of the Model's instance and
           prediction schemata. The ``errors`` table contains rows for which the prediction has failed, it
           has instance columns, as per the instance schema, followed by a single "errors" column, which as
           values has ```google.rpc.Status`` <Status>`__ represented as a STRUCT, and containing only
           ``code`` and ``message``.
       :param model_parameters: The parameters that govern the predictions. The schema of the parameters may
           be specified via the Model's `parameters_schema_uri`.
       :param machine_type: The type of machine for running batch prediction on dedicated resources. Not
           specifying machine type will result in batch prediction job being run with automatic resources.
       :param accelerator_type: The type of accelerator(s) that may be attached to the machine as per
           `accelerator_count`. Only used if `machine_type` is set.
       :param accelerator_count: The number of accelerators to attach to the `machine_type`. Only used if
           `machine_type` is set.
       :param starting_replica_count: The number of machine replicas used at the start of the batch
           operation. If not set, Vertex AI decides starting number, not greater than `max_replica_count`.
           Only used if `machine_type` is set.
       :param max_replica_count: The maximum number of machine replicas the batch operation may be scaled
           to. Only used if `machine_type` is set. Default is 10.
       :param generate_explanation: Optional. Generate explanation along with the batch prediction results.
           This will cause the batch prediction output to include explanations based on the
           `prediction_format`:
           - `bigquery`: output includes a column named `explanation`. The value is a struct that conforms
           to the [aiplatform.gapic.Explanation] object.
           - `jsonl`: The JSON objects on each line include an additional entry keyed `explanation`. The
           value of the entry is a JSON object that conforms to the [aiplatform.gapic.Explanation] object.
           - `csv`: Generating explanations for CSV format is not supported.
       :param explanation_metadata: Optional. Explanation metadata configuration for this
           BatchPredictionJob. Can be specified only if `generate_explanation` is set to `True`.
           This value overrides the value of `Model.explanation_metadata`. All fields of
           `explanation_metadata` are optional in the request. If a field of the `explanation_metadata`
           object is not populated, the corresponding field of the `Model.explanation_metadata` object is
           inherited. For more details, see `Ref docs <http://tinyurl.com/1igh60kt>`
       :param explanation_parameters: Optional. Parameters to configure explaining for Model's predictions.
           Can be specified only if `generate_explanation` is set to `True`.
           This value overrides the value of `Model.explanation_parameters`. All fields of
           `explanation_parameters` are optional in the request. If a field of the `explanation_parameters`
           object is not populated, the corresponding field of the `Model.explanation_parameters` object is
           inherited. For more details, see `Ref docs <http://tinyurl.com/1an4zake>`
       :param labels: Optional. The labels with user-defined metadata to organize your BatchPredictionJobs.
           Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain
           lowercase letters, numeric characters, underscores and dashes. International characters are
           allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
       :param encryption_spec_key_name: Optional. The Cloud KMS resource identifier of the customer managed
           encryption key used to protect the job. Has the form:
           ``projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key``. The key needs to be
           in the same region as where the compute resource is created.
           If this is set, then all resources created by the BatchPredictionJob will be encrypted with the
           provided encryption key.
           Overrides encryption_spec_key_name set in aiplatform.init.
       :param sync: Whether to execute this method synchronously. If False, this method will be executed in
           concurrent Future and any downstream object will be immediately returned and synced when the
           Future has completed.


    .. py:method:: delete_batch_prediction_job(self, project_id, region, batch_prediction_job, retry = DEFAULT, timeout = None, metadata = ())

       Deletes a BatchPredictionJob. Can only be called on jobs that already finished.

       :param project_id: Required. The ID of the Google Cloud project that the service belongs to.
       :param region: Required. The ID of the Google Cloud region that the service belongs to.
       :param batch_prediction_job: The name of the BatchPredictionJob resource to be deleted.
       :param retry: Designation of what errors, if any, should be retried.
       :param timeout: The timeout for this request.
       :param metadata: Strings which should be sent along with the request as metadata.


    .. py:method:: get_batch_prediction_job(self, project_id, region, batch_prediction_job, retry = DEFAULT, timeout = None, metadata = ())

       Gets a BatchPredictionJob

       :param project_id: Required. The ID of the Google Cloud project that the service belongs to.
       :param region: Required. The ID of the Google Cloud region that the service belongs to.
       :param batch_prediction_job: Required. The name of the BatchPredictionJob resource.
       :param retry: Designation of what errors, if any, should be retried.
       :param timeout: The timeout for this request.
       :param metadata: Strings which should be sent along with the request as metadata.


    .. py:method:: list_batch_prediction_jobs(self, project_id, region, filter = None, page_size = None, page_token = None, read_mask = None, retry = DEFAULT, timeout = None, metadata = ())

       Lists BatchPredictionJobs in a Location.

       :param project_id: Required. The ID of the Google Cloud project that the service belongs to.
       :param region: Required. The ID of the Google Cloud region that the service belongs to.
       :param filter: The standard list filter.
           Supported fields:
           -  ``display_name`` supports = and !=.
           -  ``state`` supports = and !=.
           -  ``model_display_name`` supports = and !=
           Some examples of using the filter are:
           -  ``state="JOB_STATE_SUCCEEDED" AND display_name="my_job"``
           -  ``state="JOB_STATE_RUNNING" OR display_name="my_job"``
           -  ``NOT display_name="my_job"``
           -  ``state="JOB_STATE_FAILED"``
       :param page_size: The standard list page size.
       :param page_token: The standard list page token.
       :param read_mask: Mask specifying which fields to read.
       :param retry: Designation of what errors, if any, should be retried.
       :param timeout: The timeout for this request.
       :param metadata: Strings which should be sent along with the request as metadata.
	:py:mod:`airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job`
	=============================================================================

	.. py:module:: airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job

	.. autoapi-nested-parse::

	This module contains a Google Cloud Vertex AI hook.

	.. spelling::

	jsonl
	codepoints
	aiplatform
	gapic



	Module Contents
	---------------

	Classes
	~~~~~~~

	.. autoapisummary::

	airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job.BatchPredictionJobHook




	.. py:class:: BatchPredictionJobHook(gcp_conn_id = 'google_cloud_default', delegate_to = None, impersonation_chain = None)

	Bases: :py:obj:`airflow.providers.google.common.hooks.base_google.GoogleBaseHook`

	Hook for Google Cloud Vertex AI Batch Prediction Job APIs.

	.. py:method:: get_job_service_client(self, region = None)

	Returns JobServiceClient.


	.. py:method:: wait_for_operation(self, operation, timeout = None)

	Waits for long-lasting operation to complete.


	.. py:method:: extract_batch_prediction_job_id(obj)
	:staticmethod:

	Returns unique id of the batch_prediction_job.


	.. py:method:: cancel_batch_prediction_job(self)

	Cancel BatchPredictionJob


	.. py:method:: create_batch_prediction_job(self, project_id, region, job_display_name, model_name, instances_format = 'jsonl', predictions_format = 'jsonl', gcs_source = None, bigquery_source = None, gcs_destination_prefix = None, bigquery_destination_prefix = None, model_parameters = None, machine_type = None, accelerator_type = None, accelerator_count = None, starting_replica_count = None, max_replica_count = None, generate_explanation = False, explanation_metadata = None, explanation_parameters = None, labels = None, encryption_spec_key_name = None, sync = True)

	Create a batch prediction job.

	:param project_id: Required. Project to run training in.
	:param region: Required. Location to run training in.
	:param job_display_name: Required. The user-defined name of the BatchPredictionJob. The name can be
	up to 128 characters long and can be consist of any UTF-8 characters.
	:param model_name: Required. A fully-qualified model resource name or model ID.
	:param instances_format: Required. The format in which instances are provided. Must be one of the
	formats listed in `Model.supported_input_storage_formats`. Default is "jsonl" when using
	`gcs_source`. If a `bigquery_source` is provided, this is overridden to "bigquery".
	:param predictions_format: Required. The format in which Vertex AI outputs the predictions, must be
	one of the formats specified in `Model.supported_output_storage_formats`. Default is "jsonl" when
	using `gcs_destination_prefix`. If a `bigquery_destination_prefix` is provided, this is
	overridden to "bigquery".
	:param gcs_source: Google Cloud Storage URI(-s) to your instances to run batch prediction on. They
	must match `instances_format`. May contain wildcards. For more information on wildcards, see
	https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames.
	:param bigquery_source: BigQuery URI to a table, up to 2000 characters long.
	For example: `bq://projectId.bqDatasetId.bqTableId`
	:param gcs_destination_prefix: The Google Cloud Storage location of the directory where the output is
	to be written to. In the given directory a new directory is created. Its name is
	``prediction-<model-display-name>-<job-create-time>``, where timestamp is in
	YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. Inside of it files ``predictions_0001.<extension>``,
	``predictions_0002.<extension>``, ..., ``predictions_N.<extension>`` are created where
	``<extension>`` depends on chosen ``predictions_format``, and N may equal 0001 and depends on the
	total number of successfully predicted instances. If the Model has both ``instance`` and
	``prediction`` schemata defined then each such file contains predictions as per the
	``predictions_format``. If prediction for any instance failed (partially or completely), then an
	additional ``errors_0001.<extension>``, ``errors_0002.<extension>``,..., ``errors_N.<extension>``
	files are created (N depends on total number of failed predictions). These files contain the
	failed instances, as per their schema, followed by an additional ``error`` field which as value
	has ```google.rpc.Status`` <Status>`__ containing only ``code`` and ``message`` fields.
	:param bigquery_destination_prefix: The BigQuery project location where the output is to be written
	to. In the given project a new dataset is created with name
	``prediction_<model-display-name>_<job-create-time>`` where is made BigQuery-dataset-name
	compatible (for example, most special characters become underscores), and timestamp is in
	YYYY_MM_DDThh_mm_ss_sssZ "based on ISO-8601" format. In the dataset two tables will be created,
	``predictions``, and ``errors``. If the Model has both ``instance`` and ``prediction`` schemata
	defined then the tables have columns as follows: The ``predictions`` table contains instances for
	which the prediction succeeded, it has columns as per a concatenation of the Model's instance and
	prediction schemata. The ``errors`` table contains rows for which the prediction has failed, it
	has instance columns, as per the instance schema, followed by a single "errors" column, which as
	values has ```google.rpc.Status`` <Status>`__ represented as a STRUCT, and containing only
	``code`` and ``message``.
	:param model_parameters: The parameters that govern the predictions. The schema of the parameters may
	be specified via the Model's `parameters_schema_uri`.
	:param machine_type: The type of machine for running batch prediction on dedicated resources. Not
	specifying machine type will result in batch prediction job being run with automatic resources.
	:param accelerator_type: The type of accelerator(s) that may be attached to the machine as per
	`accelerator_count`. Only used if `machine_type` is set.
	:param accelerator_count: The number of accelerators to attach to the `machine_type`. Only used if
	`machine_type` is set.
	:param starting_replica_count: The number of machine replicas used at the start of the batch
	operation. If not set, Vertex AI decides starting number, not greater than `max_replica_count`.
	Only used if `machine_type` is set.
	:param max_replica_count: The maximum number of machine replicas the batch operation may be scaled
	to. Only used if `machine_type` is set. Default is 10.
	:param generate_explanation: Optional. Generate explanation along with the batch prediction results.
	This will cause the batch prediction output to include explanations based on the
	`prediction_format`:
	- `bigquery`: output includes a column named `explanation`. The value is a struct that conforms
	to the [aiplatform.gapic.Explanation] object.
	- `jsonl`: The JSON objects on each line include an additional entry keyed `explanation`. The
	value of the entry is a JSON object that conforms to the [aiplatform.gapic.Explanation] object.
	- `csv`: Generating explanations for CSV format is not supported.
	:param explanation_metadata: Optional. Explanation metadata configuration for this
	BatchPredictionJob. Can be specified only if `generate_explanation` is set to `True`.
	This value overrides the value of `Model.explanation_metadata`. All fields of
	`explanation_metadata` are optional in the request. If a field of the `explanation_metadata`
	object is not populated, the corresponding field of the `Model.explanation_metadata` object is
	inherited. For more details, see `Ref docs <http://tinyurl.com/1igh60kt>`
	:param explanation_parameters: Optional. Parameters to configure explaining for Model's predictions.
	Can be specified only if `generate_explanation` is set to `True`.
	This value overrides the value of `Model.explanation_parameters`. All fields of
	`explanation_parameters` are optional in the request. If a field of the `explanation_parameters`
	object is not populated, the corresponding field of the `Model.explanation_parameters` object is
	inherited. For more details, see `Ref docs <http://tinyurl.com/1an4zake>`
	:param labels: Optional. The labels with user-defined metadata to organize your BatchPredictionJobs.
	Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain
	lowercase letters, numeric characters, underscores and dashes. International characters are
	allowed. See https://goo.gl/xmQnxf for more information and examples of labels.
	:param encryption_spec_key_name: Optional. The Cloud KMS resource identifier of the customer managed
	encryption key used to protect the job. Has the form:
	``projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key``. The key needs to be
	in the same region as where the compute resource is created.
	If this is set, then all resources created by the BatchPredictionJob will be encrypted with the
	provided encryption key.
	Overrides encryption_spec_key_name set in aiplatform.init.
	:param sync: Whether to execute this method synchronously. If False, this method will be executed in
	concurrent Future and any downstream object will be immediately returned and synced when the
	Future has completed.


	.. py:method:: delete_batch_prediction_job(self, project_id, region, batch_prediction_job, retry = DEFAULT, timeout = None, metadata = ())

	Deletes a BatchPredictionJob. Can only be called on jobs that already finished.

	:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
	:param region: Required. The ID of the Google Cloud region that the service belongs to.
	:param batch_prediction_job: The name of the BatchPredictionJob resource to be deleted.
	:param retry: Designation of what errors, if any, should be retried.
	:param timeout: The timeout for this request.
	:param metadata: Strings which should be sent along with the request as metadata.


	.. py:method:: get_batch_prediction_job(self, project_id, region, batch_prediction_job, retry = DEFAULT, timeout = None, metadata = ())

	Gets a BatchPredictionJob

	:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
	:param region: Required. The ID of the Google Cloud region that the service belongs to.
	:param batch_prediction_job: Required. The name of the BatchPredictionJob resource.
	:param retry: Designation of what errors, if any, should be retried.
	:param timeout: The timeout for this request.
	:param metadata: Strings which should be sent along with the request as metadata.


	.. py:method:: list_batch_prediction_jobs(self, project_id, region, filter = None, page_size = None, page_token = None, read_mask = None, retry = DEFAULT, timeout = None, metadata = ())

	Lists BatchPredictionJobs in a Location.

	:param project_id: Required. The ID of the Google Cloud project that the service belongs to.
	:param region: Required. The ID of the Google Cloud region that the service belongs to.
	:param filter: The standard list filter.
	Supported fields:
	- ``display_name`` supports = and !=.
	- ``state`` supports = and !=.
	- ``model_display_name`` supports = and !=
	Some examples of using the filter are:
	- ``state="JOB_STATE_SUCCEEDED" AND display_name="my_job"``
	- ``state="JOB_STATE_RUNNING" OR display_name="my_job"``
	- ``NOT display_name="my_job"``
	- ``state="JOB_STATE_FAILED"``
	:param page_size: The standard list page size.
	:param page_token: The standard list page token.
	:param read_mask: Mask specifying which fields to read.
	:param retry: Designation of what errors, if any, should be retried.
	:param timeout: The timeout for this request.
	:param metadata: Strings which should be sent along with the request as metadata.