.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements.  See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership.  The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License.  You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied.  See the License for the
   specific language governing permissions and limitations
   under the License.

Production Deployment
^^^^^^^^^^^^^^^^^^^^^

It is time to deploy your DAG in production. To do this, first you need to make sure that Airflow
itself is production-ready. Let's see what precautions you need to take.

Database backend
================
Airflow comes with an ``SQLite`` backend by default. This allows the user to run Airflow without any external
database. However, such a setup is meant to be used for testing purposes only; running the default setup
in production can lead to data loss in multiple scenarios. If you want to run production-grade Airflow,
make sure you :doc:`configure the backend <../howto/set-up-database>` to be an external database
such as PostgreSQL or MySQL.

You can change the backend using the following config:

.. code-block:: ini

    [database]
    sql_alchemy_conn = my_conn_string
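
For example, a minimal sketch of pointing Airflow at a PostgreSQL database - the user, password,
host, and database name below are placeholders rather than values Airflow ships with - shown here
in the environment-variable form described in the Configuration section further down:

.. code-block:: bash

    # Hypothetical PostgreSQL URI - replace user, password, host and database name with your own.
    export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN="postgresql+psycopg2://airflow_user:airflow_pass@192.168.0.10:5432/airflow_db"
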
Once you have changed the backend, Airflow needs to create all the tables required for operation.
Create an empty database and give Airflow's user permission to ``CREATE/ALTER`` it.
Once that is done, you can run:

.. code-block:: bash

    airflow db migrate

``migrate`` keeps track of migrations already applied, so it's safe to run as often as you need.
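
For the "create an empty database and grant permissions" step above, a minimal sketch for
PostgreSQL might look as follows - the role, password, and database names are placeholders and the
exact statements depend on your database; see :doc:`../howto/set-up-database` for the authoritative
instructions:

.. code-block:: bash

    # Hypothetical names - keep them consistent with your sql_alchemy_conn value.
    psql -U postgres -c "CREATE DATABASE airflow_db;"
    psql -U postgres -c "CREATE USER airflow_user WITH PASSWORD 'airflow_pass';"
    psql -U postgres -c "GRANT ALL PRIVILEGES ON DATABASE airflow_db TO airflow_user;"
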
.. note::

    Prior to Airflow version 2.7.0, ``airflow db upgrade`` was used to apply migrations.
    It has since been deprecated in favor of ``airflow db migrate``.

Multi-Node Cluster
==================

Airflow uses :class:`~airflow.executors.sequential_executor.SequentialExecutor` by default. However, by its
nature, it limits you to running at most one task at a time. The ``SequentialExecutor`` also pauses
the scheduler when it runs a task, hence it is not recommended in a production setup. You should use the
:class:`~airflow.executors.local_executor.LocalExecutor` for a single machine.
For a multi-node setup, you should use the :doc:`Kubernetes executor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>` or
the :doc:`Celery executor <apache-airflow-providers-celery:celery_executor>`.
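
For example, assuming you settle on the Celery executor, selecting it is a single configuration
change, shown here in the environment-variable form described in the Configuration section below:

.. code-block:: bash

    # Pick the executor class that matches your deployment model.
    export AIRFLOW__CORE__EXECUTOR=CeleryExecutor
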
Once you have configured the executor, make sure that every node in the cluster contains
the same configuration and DAGs. Airflow only sends simple instructions such as "execute task X of DAG Y",
but does not send any DAG files or configuration. You can use a simple cron job or any other mechanism to sync
DAGs and configs across your nodes, e.g., check out DAGs from a Git repository every 5 minutes on all nodes.
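
As an illustration of the cron-based approach, a crontab entry that pulls DAGs from a Git repository
every 5 minutes could look like the sketch below - the repository checkout location is an assumption,
and dedicated tools such as git-sync are a common alternative:

.. code-block:: bash

    # Hypothetical crontab entry on every node; /opt/airflow/dags is an assumed DAGs folder.
    */5 * * * * cd /opt/airflow/dags && git pull --ff-only >> /var/log/dag-sync.log 2>&1
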
Logging
=======

If you are using disposable nodes in your cluster, configure the log storage to be a distributed file system
(DFS) such as ``S3`` or ``GCS``, or an external service such as Stackdriver Logging, Elasticsearch, or
Amazon CloudWatch. This way, the logs are available even after a node goes down or gets replaced.
See :doc:`logging-monitoring/logging-tasks` for configurations.

.. note::

    The logs only appear in your DFS after the task has finished. You can view the logs while the task is
    running in the UI itself.
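
For instance, a minimal sketch of enabling remote logging to an S3 bucket via environment variables -
the bucket name and connection id are placeholders, and other remote handlers use analogous options:

.. code-block:: bash

    # Hypothetical bucket and connection id - see the task logging docs for the full set of options.
    export AIRFLOW__LOGGING__REMOTE_LOGGING=True
    export AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=s3://my-airflow-logs/logs
    export AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=aws_default
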
Configuration
=============

Airflow comes bundled with a default ``airflow.cfg`` configuration file.
You should use environment variables for configurations that change across deployments,
e.g. metadata DB, password, etc. You can accomplish this using the format :envvar:`AIRFLOW__{SECTION}__{KEY}`:

.. code-block:: bash

    AIRFLOW__DATABASE__SQL_ALCHEMY_CONN=my_conn_id
    AIRFLOW__WEBSERVER__BASE_URL=http://host:port

Some configurations such as the Airflow Backend connection URI can be derived from bash commands as well:

.. code-block:: ini

    sql_alchemy_conn_cmd = bash_command_to_run
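
The same ``_cmd`` pattern works in the environment-variable form as well. For example, a sketch that
reads the connection URI from a mounted secret file (the path is an assumption):

.. code-block:: bash

    # Hypothetical secret path - the command's stdout becomes the connection URI.
    export AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_CMD="cat /run/secrets/sql_alchemy_conn"
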
Scheduler Uptime
================
Airflow users occasionally report instances of the scheduler hanging without a trace, for example in these issues:
* `Scheduler gets stuck without a trace <https://github.com/apache/airflow/issues/7935>`_
* `Scheduler stopping frequently <https://github.com/apache/airflow/issues/13243>`_

To mitigate these issues, make sure you have a :doc:`health check <logging-monitoring/check-health>` set up that will detect when your scheduler has not sent a heartbeat in a while.
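
As a quick sketch, assuming the webserver is reachable on ``localhost:8080``, you can probe scheduler
health either through the webserver's health endpoint or with the ``airflow jobs check`` command
(see :doc:`logging-monitoring/check-health` for all the options):

.. code-block:: bash

    # Assumes a locally reachable webserver; the health endpoint reports the latest scheduler heartbeat.
    curl -s http://localhost:8080/health

    # Exits non-zero if no recently heartbeating scheduler job is found.
    airflow jobs check --job-type SchedulerJob
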
.. _docker_image:
Production Container Images
===========================
We provide :doc:`a Docker Image (OCI) for Apache Airflow <docker-stack:index>` for use in a containerized environment. Consider using it to guarantee that software will always run the same no matter where it's deployed.
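
For example, you can pull the reference image and check the version it ships - the tag below is only
an example, pick the Airflow version you actually deploy:

.. code-block:: bash

    # The tag is an example - use the Airflow version you plan to deploy.
    docker run --rm apache/airflow:2.9.2 airflow version
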
Helm Chart for Kubernetes
=========================

`Helm <https://helm.sh/>`__ provides a simple mechanism to deploy software to a Kubernetes cluster. We maintain
:doc:`an official Helm chart <helm-chart:index>` for Airflow that helps you define, install, and upgrade your
Airflow deployment. The Helm chart uses :doc:`our official Docker image and Dockerfile <docker-stack:index>`,
which is also maintained and released by the community.
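
A minimal sketch of installing the chart into its own namespace - the release and namespace names are
arbitrary choices:

.. code-block:: bash

    helm repo add apache-airflow https://airflow.apache.org
    helm repo update
    # "airflow" is used here both as the release name and as the namespace.
    helm upgrade --install airflow apache-airflow/airflow --namespace airflow --create-namespace
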
Live-upgrading Airflow
======================

Airflow is by design a distributed system. While the
:ref:`basic Airflow deployment <overview-basic-airflow-architecture>` usually requires a complete Airflow
restart to upgrade, it is possible to upgrade Airflow without any downtime when you run Airflow in a
:ref:`distributed deployment <overview-basic-airflow-architecture>`.

Such a live upgrade is only possible when there are no changes in the Airflow metadata database schema.
You should therefore aim to do it when upgrading between patch-level (bugfix) versions of the same minor
Airflow version, or between adjacent minor (feature) versions of Airflow after reviewing the
:doc:`release notes <../release_notes>` and :doc:`../migrations-ref` and making sure there are no changes
in the database schema between them.

In some cases, when the database migration is not significant, such a live upgrade could also be possible
across MINOR versions by upgrading the Airflow database first. However, this is not recommended and you
should only do it at your own risk, carefully reviewing the modifications to be applied to the database
schema and assessing the risk of the upgrade - it requires deep knowledge of the Airflow
database (:doc:`../database-erd-ref`) and a review of the :doc:`../migrations-ref`. Usually the cost of
preparing such a live upgrade is higher than the cost of a short Airflow downtime, so we strongly
discourage it. In any case, test the live upgrade procedure in a staging environment before you do it in
production, to avoid any surprises and side effects.

When it comes to live-upgrading the ``Webserver`` and ``Triggerer`` components, if you run them in separate
environments and have more than one instance of each, you can rolling-restart them one by one,
without any downtime. This should usually be done as the first step in your upgrade procedure.

When you are running a deployment with a separate ``DAG processor``, in a
:ref:`separate DAG processing deployment <overview-separate-dag-processing-airflow-architecture>`,
the ``DAG processor`` is not horizontally scaled - even if you run several of them, usually only one
``DAG processor`` runs at a time per specific folder, so you can simply stop it and start the new one.
Since the ``DAG processor`` is not a critical component, it is fine for it to experience a short downtime.

When it comes to upgrading the schedulers and workers, you can use the live upgrade capabilities
of the executor you use:

* For the :doc:`Local executor <../core-concepts/executor/local>`, your tasks run as subprocesses of the
  scheduler and you cannot upgrade the scheduler without killing the tasks it runs. You can either
  pause all your DAGs and wait for the running tasks to complete, or just stop the scheduler and kill all
  the tasks it runs - then you will need to clear and restart those tasks manually after the upgrade
  is completed (or rely on ``retry`` being set for the stopped tasks).
* For the :doc:`Celery executor <apache-airflow-providers-celery:celery_executor>`, you first have to put your workers in
  offline mode (usually by sending a single ``TERM`` signal to the workers), wait until the workers
  finish all the running tasks, and then upgrade the code (for example by replacing the image the workers run
  in) and restart the workers. You can monitor your workers via the ``flower`` monitoring tool and watch the
  number of running tasks go down to zero. Once the workers are upgraded and restarted, they are automatically
  put back in online mode and start picking up new tasks. You can then upgrade the ``Scheduler`` in a rolling
  restart mode. A minimal sketch of this drain-and-restart sequence is shown after this list.
* For the :doc:`Kubernetes executor <apache-airflow-providers-cncf-kubernetes:kubernetes_executor>`, you can upgrade the scheduler,
  triggerer, and webserver in a rolling restart mode, and generally you do not need to worry about the workers,
  as they are managed by the Kubernetes cluster and will be automatically adopted by the ``Schedulers`` when
  they are upgraded and restarted.
* For the :doc:`CeleryKubernetesExecutor <apache-airflow-providers-celery:celery_kubernetes_executor>`, you follow the
  same procedure as for the ``CeleryExecutor`` - you put the workers in offline mode, wait for the running
  tasks to complete, upgrade the workers, and then upgrade the scheduler, triggerer, and webserver in a
  rolling restart mode - which should also adopt tasks run via the ``KubernetesExecutor`` part of the
  executor.
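
As referenced in the Celery item above, a minimal sketch of draining and upgrading a single Celery
worker could look like the following - it assumes the worker's main process id is available in
``WORKER_PID`` and that you watch the active task count in ``flower``; adapt it to your process
manager or container runtime:

.. code-block:: bash

    # Stop accepting new tasks; the worker keeps finishing the tasks it is already running.
    kill -TERM "$WORKER_PID"

    # ...wait until flower shows zero running tasks on this worker, then upgrade the
    # Airflow installation (or swap the container image) and start the worker again:
    airflow celery worker
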
Most of the rolling-restart upgrade scenarios are implemented in the :doc:`helm-chart:index`, so you can
use it to upgrade your Airflow deployment without any downtime - especially when you do patch-level
upgrades of Airflow.

.. _production-deployment:kerberos:
Kerberos-authenticated workers
==============================

Apache Airflow has a built-in mechanism for authenticating operations with a KDC (Key Distribution Center).
Airflow provides a separate command, ``airflow kerberos``, that acts as a token refresher. It uses the
pre-configured Kerberos keytab to authenticate with the KDC and obtain a valid token, and then refreshes
that token at regular intervals, within the current token's expiry window.

Each refresh request uses a configured principal, and only a keytab valid for that principal
is capable of retrieving the authentication token.

The best practice to implement proper security in this case is to make sure that worker
workloads have no access to the keytab and only have access to the periodically refreshed, temporary
authentication tokens. This can be achieved in a Docker environment by running the ``airflow kerberos``
command and the worker command in separate containers, where only the ``airflow kerberos`` container has
access to the keytab file (preferably configured as a secret resource). Those two containers should share
a volume where the temporary token is written by ``airflow kerberos`` and read by the workers.
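
A rough sketch of this split, expressed as the commands run in each container - the principal, keytab
path, and shared volume path are assumptions, the worker command shown is the Celery one, and only the
first container mounts the keytab secret:

.. code-block:: bash

    # Container 1 - the token refresher (the only one with access to the keytab secret):
    export AIRFLOW__KERBEROS__PRINCIPAL="airflow@EXAMPLE.COM"
    export AIRFLOW__KERBEROS__KEYTAB="/secrets/airflow.keytab"
    export AIRFLOW__KERBEROS__CCACHE="/shared-tokens/ccache"   # written to the shared volume
    airflow kerberos

    # Container 2 - the worker (no keytab, only the refreshed token on the shared volume):
    export AIRFLOW__KERBEROS__CCACHE="/shared-tokens/ccache"
    airflow celery worker
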
In a Kubernetes environment, this can be realized with a sidecar container, where both the Kerberos
token refresher and the worker are part of the same Pod. Only the Kerberos sidecar has access to the
keytab secret, and both containers in the same Pod share a volume where the temporary token is written by
the sidecar container and read by the worker container.

This concept is implemented in :doc:`the Helm Chart for Apache Airflow <helm-chart:index>`.

.. spelling:word-list::

    pypirc
    dockerignore

Secured Server and Service Access on Google Cloud
=================================================

This section describes techniques and solutions for securely accessing servers and services when your Airflow
environment is deployed on Google Cloud, when you connect to Google services, or when you are connecting
to the Google API.
IAM and Service Accounts
------------------------

You should not rely on internal network segmentation or firewalling as your primary security mechanisms.
To protect your organization's data, every request you make should carry the sender's identity. In the case of
Google Cloud, the identity is provided by
`the IAM and Service account <https://cloud.google.com/iam/docs/service-accounts>`__. Each Compute Engine
instance has an associated service account identity. It provides cryptographic credentials that your workload
can use to prove its identity when making calls to Google APIs or third-party services. Each instance has
access only to short-lived credentials. If you use Google-managed service account keys, then the private
key is always held in escrow and is never directly accessible.
If you are using Kubernetes Engine, you can use
`Workload Identity <https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity>`__ to assign
an identity to individual pods.

For more information about service accounts in Airflow, see :ref:`howto/connection:gcp`.

Impersonate Service Accounts
----------------------------

If you need access to other service accounts, you can
:ref:`impersonate other service accounts <howto/connection:gcp:impersonation>` to exchange the token of
the default identity for a token of another service account. Thus, the account keys are still managed by Google
and cannot be read by your workload.

It is not recommended to generate service account keys and store them in the metadata database or a
secrets backend. Even when using a secrets backend, the service account key is still available to
your workload.
Access to Compute Engine Instance
---------------------------------

If you want to establish an SSH connection to a Compute Engine instance, you must have the network address
of this instance and credentials to access it. To simplify this task, you can use
:class:`~airflow.providers.google.cloud.hooks.compute.ComputeEngineHook`
instead of :class:`~airflow.providers.ssh.hooks.ssh.SSHHook`.

The :class:`~airflow.providers.google.cloud.hooks.compute.ComputeEngineHook` supports authorization with the
Google OS Login service. It is an extremely robust way to manage Linux access properly as it stores
short-lived SSH keys in the metadata service, offers PAM modules for access and sudo privilege checking,
and offers ``nsswitch`` user lookup into the metadata service as well.

It also solves the discovery problem that arises as your infrastructure grows: you can use the
instance name instead of the network address.

Access to Amazon Web Services
-----------------------------

Thanks to
`Web Identity Federation <https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_providers_oidc.html>`__,
you can exchange the Google Cloud Platform identity for an Amazon Web Services identity,
which effectively means access to the Amazon Web Services platform.

For more information, see :ref:`howto/connection:aws:gcp-federation`.

.. spelling:word-list::

    nsswitch
    cryptographic
    firewalling
    ComputeEngineHook