| :py:mod:`airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job` |
| ============================================================================= |
| |
| .. py:module:: airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job |
| |
| .. autoapi-nested-parse:: |
| |
| This module contains a Google Cloud Vertex AI hook. |
| |
| .. spelling:: |
| |
| jsonl |
| codepoints |
| aiplatform |
| gapic |
| |
| |
| |
| Module Contents |
| --------------- |
| |
| Classes |
| ~~~~~~~ |
| |
| .. autoapisummary:: |
| |
| airflow.providers.google.cloud.hooks.vertex_ai.batch_prediction_job.BatchPredictionJobHook |
| |
| |
| |
| |
| .. py:class:: BatchPredictionJobHook(gcp_conn_id = 'google_cloud_default', delegate_to = None, impersonation_chain = None) |
| |
| Bases: :py:obj:`airflow.providers.google.common.hooks.base_google.GoogleBaseHook` |
| |
| Hook for Google Cloud Vertex AI Batch Prediction Job APIs. |
| |
| .. py:method:: get_job_service_client(self, region = None) |
| |
| Returns JobServiceClient. |
| |
| |
| .. py:method:: wait_for_operation(self, operation, timeout = None) |
| |
| Waits for a long-running operation to complete. |
| |
| |
| .. py:method:: extract_batch_prediction_job_id(obj) |
| :staticmethod: |
| |
| Returns the unique ID of the batch prediction job. |
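
As a rough illustration of what extracting the ID involves, the sketch below assumes the object is a dict-like representation of the job whose ``name`` field follows the usual Vertex AI resource-name pattern ``projects/<project>/locations/<region>/batchPredictionJobs/<id>``. The helper name ``job_id_from_resource`` is hypothetical and is not presented as the hook's own implementation.

```python
def job_id_from_resource(obj: dict) -> str:
    """Return the trailing segment of the job's resource name."""
    # Assumption: obj["name"] looks like
    # projects/<project>/locations/<region>/batchPredictionJobs/<id>
    return obj["name"].rpartition("/")[-1]


# Illustrative value only; the project, region, and ID are made up.
example = {
    "name": "projects/my-project/locations/us-central1/batchPredictionJobs/1234567890"
}
print(job_id_from_resource(example))
```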
| |
| |
| .. py:method:: cancel_batch_prediction_job(self) |
| |
| Cancels the BatchPredictionJob. |
| |
| |
| .. py:method:: create_batch_prediction_job(self, project_id, region, job_display_name, model_name, instances_format = 'jsonl', predictions_format = 'jsonl', gcs_source = None, bigquery_source = None, gcs_destination_prefix = None, bigquery_destination_prefix = None, model_parameters = None, machine_type = None, accelerator_type = None, accelerator_count = None, starting_replica_count = None, max_replica_count = None, generate_explanation = False, explanation_metadata = None, explanation_parameters = None, labels = None, encryption_spec_key_name = None, sync = True) |
| |
| Create a batch prediction job. |
| |
| :param project_id: Required. Project to run the batch prediction job in. |
| :param region: Required. Location to run the batch prediction job in. |
| :param job_display_name: Required. The user-defined name of the BatchPredictionJob. The name can be |
| up to 128 characters long and can consist of any UTF-8 characters. |
| :param model_name: Required. A fully-qualified model resource name or model ID. |
| :param instances_format: Required. The format in which instances are provided. Must be one of the |
| formats listed in `Model.supported_input_storage_formats`. Default is "jsonl" when using |
| `gcs_source`. If a `bigquery_source` is provided, this is overridden to "bigquery". |
| :param predictions_format: Required. The format in which Vertex AI outputs the predictions, must be |
| one of the formats specified in `Model.supported_output_storage_formats`. Default is "jsonl" when |
| using `gcs_destination_prefix`. If a `bigquery_destination_prefix` is provided, this is |
| overridden to "bigquery". |
| :param gcs_source: Google Cloud Storage URI(-s) to your instances to run batch prediction on. They |
| must match `instances_format`. May contain wildcards. For more information on wildcards, see |
| https://cloud.google.com/storage/docs/gsutil/addlhelp/WildcardNames. |
| :param bigquery_source: BigQuery URI to a table, up to 2000 characters long. |
| For example: `bq://projectId.bqDatasetId.bqTableId` |
| :param gcs_destination_prefix: The Google Cloud Storage location of the directory where the output is |
| to be written to. In the given directory a new directory is created. Its name is |
| ``prediction-<model-display-name>-<job-create-time>``, where timestamp is in |
| YYYY-MM-DDThh:mm:ss.sssZ ISO-8601 format. Inside of it files ``predictions_0001.<extension>``, |
| ``predictions_0002.<extension>``, ..., ``predictions_N.<extension>`` are created where |
| ``<extension>`` depends on chosen ``predictions_format``, and N may equal 0001 and depends on the |
| total number of successfully predicted instances. If the Model has both ``instance`` and |
| ``prediction`` schemata defined then each such file contains predictions as per the |
| ``predictions_format``. If prediction for any instance failed (partially or completely), then an |
| additional ``errors_0001.<extension>``, ``errors_0002.<extension>``,..., ``errors_N.<extension>`` |
| files are created (N depends on total number of failed predictions). These files contain the |
| failed instances, as per their schema, followed by an additional ``error`` field whose value is a |
| ``google.rpc.Status`` containing only the ``code`` and ``message`` fields. |
| :param bigquery_destination_prefix: The BigQuery project location where the output is to be written |
| to. In the given project a new dataset is created with name |
| ``prediction_<model-display-name>_<job-create-time>``, where the model display name is made BigQuery-dataset-name |
| compatible (for example, most special characters become underscores), and timestamp is in |
| YYYY_MM_DDThh_mm_ss_sssZ "based on ISO-8601" format. In the dataset two tables will be created, |
| ``predictions``, and ``errors``. If the Model has both ``instance`` and ``prediction`` schemata |
| defined then the tables have columns as follows: The ``predictions`` table contains instances for |
| which the prediction succeeded, it has columns as per a concatenation of the Model's instance and |
| prediction schemata. The ``errors`` table contains rows for which the prediction has failed, it |
| has instance columns, as per the instance schema, followed by a single "errors" column, whose |
| values are ``google.rpc.Status`` represented as a STRUCT and containing only |
| ``code`` and ``message``. |
| :param model_parameters: The parameters that govern the predictions. The schema of the parameters may |
| be specified via the Model's `parameters_schema_uri`. |
| :param machine_type: The type of machine for running batch prediction on dedicated resources. Not |
| specifying machine type will result in batch prediction job being run with automatic resources. |
| :param accelerator_type: The type of accelerator(s) that may be attached to the machine as per |
| `accelerator_count`. Only used if `machine_type` is set. |
| :param accelerator_count: The number of accelerators to attach to the `machine_type`. Only used if |
| `machine_type` is set. |
| :param starting_replica_count: The number of machine replicas used at the start of the batch |
| operation. If not set, Vertex AI decides starting number, not greater than `max_replica_count`. |
| Only used if `machine_type` is set. |
| :param max_replica_count: The maximum number of machine replicas the batch operation may be scaled |
| to. Only used if `machine_type` is set. Default is 10. |
| :param generate_explanation: Optional. Generate explanation along with the batch prediction results. |
| This will cause the batch prediction output to include explanations based on the |
| `predictions_format`: |
| - `bigquery`: output includes a column named `explanation`. The value is a struct that conforms |
| to the [aiplatform.gapic.Explanation] object. |
| - `jsonl`: The JSON objects on each line include an additional entry keyed `explanation`. The |
| value of the entry is a JSON object that conforms to the [aiplatform.gapic.Explanation] object. |
| - `csv`: Generating explanations for CSV format is not supported. |
| :param explanation_metadata: Optional. Explanation metadata configuration for this |
| BatchPredictionJob. Can be specified only if `generate_explanation` is set to `True`. |
| This value overrides the value of `Model.explanation_metadata`. All fields of |
| `explanation_metadata` are optional in the request. If a field of the `explanation_metadata` |
| object is not populated, the corresponding field of the `Model.explanation_metadata` object is |
| inherited. For more details, see `Ref docs <http://tinyurl.com/1igh60kt>`_ |
| :param explanation_parameters: Optional. Parameters to configure explaining for Model's predictions. |
| Can be specified only if `generate_explanation` is set to `True`. |
| This value overrides the value of `Model.explanation_parameters`. All fields of |
| `explanation_parameters` are optional in the request. If a field of the `explanation_parameters` |
| object is not populated, the corresponding field of the `Model.explanation_parameters` object is |
| inherited. For more details, see `Ref docs <http://tinyurl.com/1an4zake>`_ |
| :param labels: Optional. The labels with user-defined metadata to organize your BatchPredictionJobs. |
| Label keys and values can be no longer than 64 characters (Unicode codepoints), can only contain |
| lowercase letters, numeric characters, underscores and dashes. International characters are |
| allowed. See https://goo.gl/xmQnxf for more information and examples of labels. |
| :param encryption_spec_key_name: Optional. The Cloud KMS resource identifier of the customer managed |
| encryption key used to protect the job. Has the form: |
| ``projects/my-project/locations/my-region/keyRings/my-kr/cryptoKeys/my-key``. The key needs to be |
| in the same region as where the compute resource is created. |
| If this is set, then all resources created by the BatchPredictionJob will be encrypted with the |
| provided encryption key. |
| Overrides encryption_spec_key_name set in aiplatform.init. |
| :param sync: Whether to execute this method synchronously. If False, this method will be executed in |
| a concurrent Future and any downstream object will be immediately returned and synced when the |
| Future has completed. |
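
The format-override and ``machine_type`` rules described above can be sketched as a small helper that assembles keyword arguments before calling the hook. The function ``build_batch_prediction_kwargs`` is hypothetical, and the check that exactly one source and one destination is supplied is an assumption of this sketch rather than something stated in this reference. The example values (project, bucket, table) are placeholders.

```python
from typing import Any, Dict, Optional


def build_batch_prediction_kwargs(
    project_id: str,
    region: str,
    job_display_name: str,
    model_name: str,
    gcs_source: Optional[str] = None,
    bigquery_source: Optional[str] = None,
    gcs_destination_prefix: Optional[str] = None,
    bigquery_destination_prefix: Optional[str] = None,
    machine_type: Optional[str] = None,
    max_replica_count: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical helper mirroring the documented parameter rules."""
    # Assumption of this sketch: exactly one source and one destination.
    if (gcs_source is None) == (bigquery_source is None):
        raise ValueError("Provide exactly one of gcs_source or bigquery_source")
    if (gcs_destination_prefix is None) == (bigquery_destination_prefix is None):
        raise ValueError(
            "Provide exactly one of gcs_destination_prefix or "
            "bigquery_destination_prefix"
        )

    kwargs: Dict[str, Any] = {
        "project_id": project_id,
        "region": region,
        "job_display_name": job_display_name,
        "model_name": model_name,
        "gcs_source": gcs_source,
        "bigquery_source": bigquery_source,
        "gcs_destination_prefix": gcs_destination_prefix,
        "bigquery_destination_prefix": bigquery_destination_prefix,
        # A BigQuery source or destination switches the format to "bigquery";
        # otherwise the documented default "jsonl" applies.
        "instances_format": "bigquery" if bigquery_source else "jsonl",
        "predictions_format": "bigquery" if bigquery_destination_prefix else "jsonl",
    }

    # Replica settings are only used when machine_type is set.
    if machine_type is not None:
        kwargs["machine_type"] = machine_type
        if max_replica_count is not None:
            kwargs["max_replica_count"] = max_replica_count
    return kwargs


kwargs = build_batch_prediction_kwargs(
    project_id="my-project",
    region="us-central1",
    job_display_name="nightly-scoring",
    model_name="1234567890",
    bigquery_source="bq://my-project.my_dataset.instances",
    gcs_destination_prefix="gs://my-bucket/predictions",
)
print(kwargs["instances_format"], kwargs["predictions_format"])

# With Airflow and Google credentials available, the result could be passed on:
# hook.create_batch_prediction_job(**kwargs)
```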
| |
| |
| .. py:method:: delete_batch_prediction_job(self, project_id, region, batch_prediction_job, retry = DEFAULT, timeout = None, metadata = ()) |
| |
| Deletes a BatchPredictionJob. Can only be called on jobs that already finished. |
| |
| :param project_id: Required. The ID of the Google Cloud project that the service belongs to. |
| :param region: Required. The ID of the Google Cloud region that the service belongs to. |
| :param batch_prediction_job: The name of the BatchPredictionJob resource to be deleted. |
| :param retry: Designation of what errors, if any, should be retried. |
| :param timeout: The timeout for this request. |
| :param metadata: Strings which should be sent along with the request as metadata. |
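
Because deletion is only possible once a job has finished, a caller may want to poll the job state first. The sketch below is one possible approach, not the hook's behavior: ``delete_when_finished`` is a hypothetical helper, the set of terminal state names is an assumption based on the ``JOB_STATE_*`` values used elsewhere in this reference, and it assumes the returned job exposes its state as ``job.state.name``.

```python
import time

# Assumed terminal states; extend this set if your jobs can end in other states.
TERMINAL_STATES = {"JOB_STATE_SUCCEEDED", "JOB_STATE_FAILED", "JOB_STATE_CANCELLED"}


def delete_when_finished(hook, project_id, region, job_id, poll_seconds=30.0, max_polls=20):
    """Poll the job until it reaches a terminal state, then delete it.

    `hook` is expected to behave like BatchPredictionJobHook.
    Returns the terminal state name observed before deletion.
    """
    for _ in range(max_polls):
        job = hook.get_batch_prediction_job(
            project_id=project_id, region=region, batch_prediction_job=job_id
        )
        # Assumption: the state is an enum-like object with a `name` attribute.
        state = job.state.name
        if state in TERMINAL_STATES:
            operation = hook.delete_batch_prediction_job(
                project_id=project_id, region=region, batch_prediction_job=job_id
            )
            # Block until the delete operation itself has completed.
            hook.wait_for_operation(operation)
            return state
        time.sleep(poll_seconds)
    raise TimeoutError(f"Job {job_id} did not finish after {max_polls} polls")
```

Passing the hook in as an argument keeps the helper easy to exercise with a stand-in object before running it against a real project.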
| |
| |
| .. py:method:: get_batch_prediction_job(self, project_id, region, batch_prediction_job, retry = DEFAULT, timeout = None, metadata = ()) |
| |
| Gets a BatchPredictionJob. |
| |
| :param project_id: Required. The ID of the Google Cloud project that the service belongs to. |
| :param region: Required. The ID of the Google Cloud region that the service belongs to. |
| :param batch_prediction_job: Required. The name of the BatchPredictionJob resource. |
| :param retry: Designation of what errors, if any, should be retried. |
| :param timeout: The timeout for this request. |
| :param metadata: Strings which should be sent along with the request as metadata. |
| |
| |
| .. py:method:: list_batch_prediction_jobs(self, project_id, region, filter = None, page_size = None, page_token = None, read_mask = None, retry = DEFAULT, timeout = None, metadata = ()) |
| |
| Lists BatchPredictionJobs in a Location. |
| |
| :param project_id: Required. The ID of the Google Cloud project that the service belongs to. |
| :param region: Required. The ID of the Google Cloud region that the service belongs to. |
| :param filter: The standard list filter. |
| Supported fields: |
| - ``display_name`` supports = and !=. |
| - ``state`` supports = and !=. |
| - ``model_display_name`` supports = and !=. |
| Some examples of using the filter are: |
| - ``state="JOB_STATE_SUCCEEDED" AND display_name="my_job"`` |
| - ``state="JOB_STATE_RUNNING" OR display_name="my_job"`` |
| - ``NOT display_name="my_job"`` |
| - ``state="JOB_STATE_FAILED"`` |
| :param page_size: The standard list page size. |
| :param page_token: The standard list page token. |
| :param read_mask: Mask specifying which fields to read. |
| :param retry: Designation of what errors, if any, should be retried. |
| :param timeout: The timeout for this request. |
| :param metadata: Strings which should be sent along with the request as metadata. |
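
Filter strings like the examples above can be assembled programmatically. The following is a minimal sketch using a hypothetical helper, ``build_job_filter``, that joins equality clauses with ``AND``. It covers only the ``=`` operator on the three supported fields and does not escape quotes inside values.

```python
from typing import Optional


def build_job_filter(
    state: Optional[str] = None,
    display_name: Optional[str] = None,
    model_display_name: Optional[str] = None,
) -> Optional[str]:
    """Join equality clauses for the supported fields with AND."""
    clauses = []
    for field, value in (
        ("state", state),
        ("display_name", display_name),
        ("model_display_name", model_display_name),
    ):
        if value is not None:
            clauses.append(f'{field}="{value}"')
    # Return None when no clause was given, matching the default `filter = None`.
    return " AND ".join(clauses) or None


print(build_job_filter(state="JOB_STATE_SUCCEEDED", display_name="my_job"))
# prints: state="JOB_STATE_SUCCEEDED" AND display_name="my_job"
```

The resulting string can then be passed as the ``filter`` argument of ``list_batch_prediction_jobs``.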
| |
| |
| |