:py:mod:`airflow.providers.google.cloud.hooks.dataflow`
=======================================================
.. py:module:: airflow.providers.google.cloud.hooks.dataflow
.. autoapi-nested-parse::
This module contains a Google Dataflow Hook.
Module Contents
---------------
Classes
~~~~~~~
.. autoapisummary::
airflow.providers.google.cloud.hooks.dataflow.DataflowJobStatus
airflow.providers.google.cloud.hooks.dataflow.DataflowJobType
airflow.providers.google.cloud.hooks.dataflow.DataflowHook
Functions
~~~~~~~~~
.. autoapisummary::
airflow.providers.google.cloud.hooks.dataflow.process_line_and_extract_dataflow_job_id_callback
Attributes
~~~~~~~~~~
.. autoapisummary::
airflow.providers.google.cloud.hooks.dataflow.DEFAULT_DATAFLOW_LOCATION
airflow.providers.google.cloud.hooks.dataflow.JOB_ID_PATTERN
airflow.providers.google.cloud.hooks.dataflow.T
.. py:data:: DEFAULT_DATAFLOW_LOCATION
:annotation: = us-central1
.. py:data:: JOB_ID_PATTERN
.. py:data:: T
.. py:function:: process_line_and_extract_dataflow_job_id_callback(on_new_job_id_callback)
Returns a callback which triggers the function passed as `on_new_job_id_callback` when the Dataflow job ID is found.
To be used for `process_line_callback` in
:py:class:`~airflow.providers.apache.beam.hooks.beam.BeamCommandRunner`
:param on_new_job_id_callback: Callback called when the job ID is known
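A minimal usage sketch; the log line fed to the callback below is illustrative, since in practice the lines come from the Beam runner's output:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import (
        process_line_and_extract_dataflow_job_id_callback,
    )

    captured = {}


    def on_new_job_id(job_id: str) -> None:
        captured["job_id"] = job_id


    # The returned callable can be passed as ``process_line_callback`` to the Beam command runner.
    process_line = process_line_and_extract_dataflow_job_id_callback(on_new_job_id)

    # Illustrative log line; real lines come from the Dataflow/Beam runner output.
    process_line("INFO: Created job with id: [2023-01-01_00_00_00-1234567890]")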
.. py:class:: DataflowJobStatus
Helper class with Dataflow job statuses.
Reference: https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.jobs#Job.JobState
.. py:attribute:: JOB_STATE_DONE
:annotation: = JOB_STATE_DONE
.. py:attribute:: JOB_STATE_UNKNOWN
:annotation: = JOB_STATE_UNKNOWN
.. py:attribute:: JOB_STATE_STOPPED
:annotation: = JOB_STATE_STOPPED
.. py:attribute:: JOB_STATE_RUNNING
:annotation: = JOB_STATE_RUNNING
.. py:attribute:: JOB_STATE_FAILED
:annotation: = JOB_STATE_FAILED
.. py:attribute:: JOB_STATE_CANCELLED
:annotation: = JOB_STATE_CANCELLED
.. py:attribute:: JOB_STATE_UPDATED
:annotation: = JOB_STATE_UPDATED
.. py:attribute:: JOB_STATE_DRAINING
:annotation: = JOB_STATE_DRAINING
.. py:attribute:: JOB_STATE_DRAINED
:annotation: = JOB_STATE_DRAINED
.. py:attribute:: JOB_STATE_PENDING
:annotation: = JOB_STATE_PENDING
.. py:attribute:: JOB_STATE_CANCELLING
:annotation: = JOB_STATE_CANCELLING
.. py:attribute:: JOB_STATE_QUEUED
:annotation: = JOB_STATE_QUEUED
.. py:attribute:: FAILED_END_STATES
.. py:attribute:: SUCCEEDED_END_STATES
.. py:attribute:: TERMINAL_STATES
.. py:attribute:: AWAITING_STATES
.. py:class:: DataflowJobType
Helper class with Dataflow job types.
.. py:attribute:: JOB_TYPE_UNKNOWN
:annotation: = JOB_TYPE_UNKNOWN
.. py:attribute:: JOB_TYPE_BATCH
:annotation: = JOB_TYPE_BATCH
.. py:attribute:: JOB_TYPE_STREAMING
:annotation: = JOB_TYPE_STREAMING
.. py:class:: DataflowHook(gcp_conn_id = 'google_cloud_default', delegate_to = None, poll_sleep = 10, impersonation_chain = None, drain_pipeline = False, cancel_timeout = 5 * 60, wait_until_finished = None)
Bases: :py:obj:`airflow.providers.google.common.hooks.base_google.GoogleBaseHook`
Hook for Google Dataflow.
All the methods in the hook where project_id is used must be called with
keyword arguments rather than positional.
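A minimal instantiation sketch, assuming an Airflow connection named ``google_cloud_default`` is configured for Google Cloud:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook(
        gcp_conn_id="google_cloud_default",
        poll_sleep=10,          # seconds between job status polls
        drain_pipeline=False,   # cancel instead of drain when stopping streaming jobs
        cancel_timeout=5 * 60,  # seconds to wait for a successful cancellation
    )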
.. py:method:: get_conn()
Returns a Google Cloud Dataflow service object.
.. py:method:: start_java_dataflow(job_name, variables, jar, project_id, job_class = None, append_job_name = True, multiple_jobs = False, on_new_job_id_callback = None, location = DEFAULT_DATAFLOW_LOCATION)
Starts a Dataflow Java job.
:param job_name: The name of the job.
:param variables: Variables passed to the job.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param jar: Name of the jar for the job
:param job_class: Name of the java class for the job.
:param append_job_name: True if unique suffix has to be appended to job name.
:param multiple_jobs: True if the hook should check for multiple jobs in Dataflow
:param on_new_job_id_callback: Callback called when the job ID is known.
:param location: Job location.
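A hedged sketch of launching a Java pipeline; the jar path, main class, bucket and project below are placeholders, and ``project_id`` is passed as a keyword argument as the hook requires:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    hook.start_java_dataflow(
        job_name="java-wordcount",
        variables={"tempLocation": "gs://my-bucket/tmp"},  # placeholder bucket
        jar="/files/jars/wordcount.jar",                    # placeholder jar path
        project_id="my-gcp-project",                        # placeholder project
        job_class="org.apache.beam.examples.WordCount",    # placeholder main class
        append_job_name=True,
        location="us-central1",
    )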
.. py:method:: start_template_dataflow(job_name, variables, parameters, dataflow_template, project_id, append_job_name = True, on_new_job_id_callback = None, on_new_job_callback = None, location = DEFAULT_DATAFLOW_LOCATION, environment = None)
Starts Dataflow template job.
:param job_name: The name of the job.
:param variables: Map of job runtime environment options.
These values will update the ``environment`` argument if it is passed.
.. seealso::
For more information on possible configurations, look at the API documentation
`Dataflow RuntimeEnvironment <https://cloud.google.com/dataflow/docs/reference/rest/v1b3/RuntimeEnvironment>`__
:param parameters: Parameters for the template
:param dataflow_template: GCS path to the template.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param append_job_name: True if unique suffix has to be appended to job name.
:param on_new_job_id_callback: (Deprecated) Callback called when the Job is known.
:param on_new_job_callback: Callback called when the Job is known.
:param location: Job location.
.. seealso::
For more information on possible configurations, look at the API documentation
`Dataflow RuntimeEnvironment <https://cloud.google.com/dataflow/docs/reference/rest/v1b3/RuntimeEnvironment>`__
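A sketch of launching a classic template with placeholder project and bucket; the template path points at the Google-provided Word Count template:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    job = hook.start_template_dataflow(
        job_name="template-wordcount",
        variables={"tempLocation": "gs://my-bucket/tmp"},  # runtime environment options
        parameters={
            "inputFile": "gs://my-bucket/input/kinglear.txt",
            "output": "gs://my-bucket/output/results",
        },
        dataflow_template="gs://dataflow-templates/latest/Word_Count",
        project_id="my-gcp-project",   # placeholder project
        location="us-central1",
    )
    print(job["id"], job["currentState"])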
.. py:method:: start_flex_template(body, location, project_id, on_new_job_id_callback = None, on_new_job_callback = None)
Starts a Flex Template job with the Dataflow pipeline.
:param body: The request body. See:
https://cloud.google.com/dataflow/docs/reference/rest/v1b3/projects.locations.flexTemplates/launch#request-body
:param location: The location of the Dataflow job (for example europe-west1)
:param project_id: The ID of the GCP project that owns the job.
If set to ``None`` or missing, the default project_id from the GCP connection is used.
:param on_new_job_id_callback: (Deprecated) A callback that is called when a Job ID is detected.
:param on_new_job_callback: A callback that is called when a Job is detected.
:return: the Job
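A hedged example of a Flex Template launch; the container spec path, parameters and project are placeholders, and the request body follows the ``launch`` request body schema linked above:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    body = {
        "launchParameter": {
            "jobName": "flex-template-example",
            "containerSpecGcsPath": "gs://my-bucket/templates/streaming-beam.json",  # placeholder
            "parameters": {"output": "gs://my-bucket/output"},
        }
    }
    job = hook.start_flex_template(
        body=body,
        location="europe-west1",
        project_id="my-gcp-project",   # placeholder project
    )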
.. py:method:: start_python_dataflow(job_name, variables, dataflow, py_options, project_id, py_interpreter = 'python3', py_requirements = None, py_system_site_packages = False, append_job_name = True, on_new_job_id_callback = None, location = DEFAULT_DATAFLOW_LOCATION)
Starts a Dataflow Python job.
:param job_name: The name of the job.
:param variables: Variables passed to the job.
:param dataflow: Name of the Dataflow process.
:param py_options: Additional options.
:param project_id: The ID of the GCP project that owns the job.
If set to ``None`` or missing, the default project_id from the GCP connection is used.
:param py_interpreter: Python version of the Beam pipeline.
If None, this defaults to python3.
To track python versions supported by beam and related
issues check: https://issues.apache.org/jira/browse/BEAM-1251
:param py_requirements: Additional python package(s) to install.
If a value is passed to this parameter, a new virtual environment will be created with the
additional packages installed.
You can also install the apache-beam package if it is not installed on your system, or if you want
to use a different version.
:param py_system_site_packages: Whether to include system_site_packages in your virtualenv.
See virtualenv documentation for more information.
This option is only relevant if the ``py_requirements`` parameter is not None.
:param append_job_name: True if unique suffix has to be appended to job name.
:param on_new_job_id_callback: Callback called when the job ID is known.
:param location: Job location.
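A sketch of launching a Python pipeline; file paths, bucket and project are placeholders, and ``py_requirements`` triggers creation of a temporary virtual environment as described above:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    hook.start_python_dataflow(
        job_name="python-wordcount",
        variables={
            "temp_location": "gs://my-bucket/tmp",    # placeholder bucket
            "output": "gs://my-bucket/output",
        },
        dataflow="/files/pipelines/wordcount.py",     # placeholder path to the pipeline file
        py_options=[],
        project_id="my-gcp-project",                  # placeholder project
        py_interpreter="python3",
        py_requirements=["apache-beam[gcp]"],
        location="us-central1",
    )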
.. py:method:: build_dataflow_job_name(job_name, append_job_name = True)
:staticmethod:
Builds Dataflow job name.
.. py:method:: is_job_dataflow_running(name, project_id, location = DEFAULT_DATAFLOW_LOCATION, variables = None)
Helper method to check if a job is still running in Dataflow.
:param name: The name of the job.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param location: Job location.
:return: True if job is running.
:rtype: bool
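For example, with a placeholder project and job name:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    if hook.is_job_dataflow_running(
        name="python-wordcount",        # placeholder job name
        project_id="my-gcp-project",    # placeholder project
        location="us-central1",
    ):
        print("Job is still running")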
.. py:method:: cancel_job(project_id, job_name = None, job_id = None, location = DEFAULT_DATAFLOW_LOCATION)
Cancels the job with the specified name prefix or Job ID.
Parameters ``job_name`` and ``job_id`` are mutually exclusive.
:param job_name: Name prefix specifying which jobs are to be canceled.
:param job_id: Job ID specifying which jobs are to be canceled.
:param location: Job location.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
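A sketch of cancelling by name prefix (placeholder values); pass ``job_id`` instead of ``job_name`` to cancel a specific job, since the two are mutually exclusive:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    hook.cancel_job(
        project_id="my-gcp-project",    # placeholder project
        job_name="python-wordcount",    # name prefix; mutually exclusive with job_id
        location="us-central1",
    )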
.. py:method:: start_sql_job(job_name, query, options, project_id, location = DEFAULT_DATAFLOW_LOCATION, on_new_job_id_callback = None, on_new_job_callback = None)
Starts Dataflow SQL query.
:param job_name: The unique name to assign to the Cloud Dataflow job.
:param query: The SQL query to execute.
:param options: Job parameters to be executed.
For more information, look at the
`gcloud beta dataflow sql query <https://cloud.google.com/sdk/gcloud/reference/beta/dataflow/sql/query>`__
command reference.
:param location: The location of the Dataflow job (for example europe-west1)
:param project_id: The ID of the GCP project that owns the job.
If set to ``None`` or missing, the default project_id from the GCP connection is used.
:param on_new_job_id_callback: (Deprecated) Callback called when the job ID is known.
:param on_new_job_callback: Callback called when the job is known.
:return: the new job object
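A hedged sketch of a Dataflow SQL launch; the query, BigQuery table names and project are placeholders, and the ``options`` keys are assumed to mirror the ``--bigquery-*`` flags of ``gcloud beta dataflow sql query`` (verify against the command reference above):
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    job = hook.start_sql_job(
        job_name="sql-aggregation",
        query="""
            SELECT sales_region, SUM(amount) AS total_sales
            FROM bigquery.table.`my-gcp-project`.`my_dataset`.`sales`
            GROUP BY sales_region
        """,
        options={
            # Assumed flag names, matching the gcloud command's --bigquery-* options.
            "bigquery-project": "my-gcp-project",
            "bigquery-dataset": "my_dataset",
            "bigquery-table": "sales_by_region",
        },
        project_id="my-gcp-project",   # placeholder project
        location="us-central1",
    )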
.. py:method:: get_job(job_id, project_id, location = DEFAULT_DATAFLOW_LOCATION)
Gets the job with the specified Job ID.
:param job_id: Job ID to get.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param location: The location of the Dataflow job (for example europe-west1). See:
https://cloud.google.com/dataflow/docs/concepts/regional-endpoints
:return: the Job
:rtype: dict
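For example, with a placeholder job ID and project:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook()
    job = hook.get_job(
        job_id="2023-01-01_00_00_00-1234567890",   # placeholder job ID
        project_id="my-gcp-project",               # placeholder project
        location="us-central1",
    )
    print(job["currentState"])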
.. py:method:: fetch_job_metrics_by_id(job_id, project_id, location = DEFAULT_DATAFLOW_LOCATION)
Gets the job metrics with the specified Job ID.
:param job_id: Job ID to get.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param location: The location of the Dataflow job (for example europe-west1). See:
https://cloud.google.com/dataflow/docs/concepts/regional-endpoints
:return: the JobMetrics. See:
https://cloud.google.com/dataflow/docs/reference/rest/v1b3/JobMetrics
:rtype: dict
.. py:method:: fetch_job_messages_by_id(job_id, project_id, location = DEFAULT_DATAFLOW_LOCATION)
Gets the job messages with the specified Job ID.
:param job_id: Job ID to get.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param location: Job location.
:return: the list of JobMessages. See:
https://cloud.google.com/dataflow/docs/reference/rest/v1b3/ListJobMessagesResponse#JobMessage
:rtype: List[dict]
.. py:method:: fetch_job_autoscaling_events_by_id(job_id, project_id, location = DEFAULT_DATAFLOW_LOCATION)
Gets the job autoscaling events with the specified Job ID.
:param job_id: Job ID to get.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param location: Job location.
:return: the list of AutoscalingEvents. See:
https://cloud.google.com/dataflow/docs/reference/rest/v1b3/ListJobMessagesResponse#autoscalingevent
:rtype: List[dict]
.. py:method:: wait_for_done(job_name, location, project_id, job_id = None, multiple_jobs = False)
Waits for a Dataflow job to complete.
:param job_name: The 'jobName' to use when executing the DataFlow job
(templated). This ends up being set in the pipeline options, so any entry
with key ``'jobName'`` in ``options`` will be overwritten.
:param location: The location the job is running in.
:param project_id: Optional, the Google Cloud project ID in which to start a job.
If set to None or missing, the default project_id from the Google Cloud connection is used.
:param job_id: a Dataflow job ID
:param multiple_jobs: If pipeline creates multiple jobs then monitor all jobs
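A sketch of blocking until the job finishes, with placeholder values; when ``job_id`` is omitted the hook falls back to matching jobs by the given name:
.. code-block:: python

    from airflow.providers.google.cloud.hooks.dataflow import DataflowHook

    hook = DataflowHook(poll_sleep=10)
    hook.wait_for_done(
        job_name="python-wordcount",                 # placeholder job name
        location="us-central1",
        project_id="my-gcp-project",                 # placeholder project
        job_id="2023-01-01_00_00_00-1234567890",     # placeholder job ID
        multiple_jobs=False,
    )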