.. Licensed to the Apache Software Foundation (ASF) under one
   or more contributor license agreements. See the NOTICE file
   distributed with this work for additional information
   regarding copyright ownership. The ASF licenses this file
   to you under the Apache License, Version 2.0 (the
   "License"); you may not use this file except in compliance
   with the License. You may obtain a copy of the License at

..   http://www.apache.org/licenses/LICENSE-2.0

.. Unless required by applicable law or agreed to in writing,
   software distributed under the License is distributed on an
   "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   KIND, either express or implied. See the License for the
   specific language governing permissions and limitations
   under the License.

.. _write-logs-azure:

Writing logs to Azure Blob Storage
----------------------------------

Airflow can be configured to read and write task logs in Azure Blob Storage. It uses an existing
Airflow connection to read and write logs. If you don't have a connection properly set up,
this process will fail.

Follow the steps below to enable Azure Blob Storage logging. First, ``airflow.cfg`` must be
configured as in this example:

.. code-block:: ini

    [logging]
    # Airflow can store logs remotely in AWS S3, Azure Blob Storage, Google Cloud Storage or Elasticsearch.
    # Users must supply an Airflow connection id that provides access to the storage
    # location. If remote_logging is set to true, see UPDATING.md for additional
    # configuration requirements.
    remote_logging = True
    remote_log_conn_id = <your_wasb_connection_id>
    remote_base_log_folder = wasb://path/to/logs

    [azure_remote_logging]
    remote_wasb_log_container = my-container

.. note::

    If you are using environment variables, the equivalent configuration is:

    .. code-block:: bash

        export AIRFLOW__LOGGING__REMOTE_LOGGING=True
        export AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=<your_wasb_connection_id>
        export AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=wasb://path/to/logs
        export AIRFLOW__AZURE_REMOTE_LOGGING__REMOTE_WASB_LOG_CONTAINER=<your_container_name>
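
The connection referenced by ``remote_log_conn_id`` can also be built programmatically and exported
in URI form through an ``AIRFLOW_CONN_<CONN_ID>`` environment variable. A minimal sketch, assuming
account-key authentication; the connection id ``wasb_logs``, the account name, and the key below are
all placeholders:

.. code-block:: python

    import json

    from airflow.models.connection import Connection

    # Placeholder values; substitute your own connection id, storage
    # account name, and account key.
    conn = Connection(
        conn_id="wasb_logs",
        conn_type="wasb",
        login="mystorageaccount",  # Azure storage account name
        extra=json.dumps({"account_key": "your_account_key"}),
    )

    # URI form, suitable for: export AIRFLOW_CONN_WASB_LOGS='<uri>'
    print(conn.get_uri())
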
The ``remote_base_log_folder`` must be prefixed with ``wasb://`` for Airflow to use the correct log handler. An incorrect format can cause a misleading ``ResourceNotFoundError``, even if the container exists.
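
Before restarting Airflow, a quick sanity check of that value can save a confusing debugging
session. A minimal sketch; the hard-coded string stands in for whatever you configured:

.. code-block:: python

    from urllib.parse import urlparse

    # Stand-in for the configured remote_base_log_folder value.
    remote_base_log_folder = "wasb://path/to/logs"

    # The wasb:// scheme is what selects the Azure Blob Storage log handler.
    if urlparse(remote_base_log_folder).scheme != "wasb":
        raise ValueError("remote_base_log_folder must be prefixed with wasb://")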

Setup Steps
'''''''''''

#. Install the provider package with ``pip install apache-airflow-providers-microsoft-azure``.
#. Ensure the :ref:`connection <howto/connection:wasb>` is already set up with read and write access to Azure Blob Storage in the ``remote_wasb_log_container`` container and path ``remote_base_log_folder``. The connection should be configured with appropriate authentication credentials (such as an account key, a shared access signature, or a managed identity). For account key authentication, you can add ``account_key`` to the connection's extra fields as a JSON dictionary: ``{"account_key": "your_account_key"}``. A verification sketch follows this list.
#. Set up the above configuration values. Please note that the container must already exist.
#. Restart the Airflow webserver and scheduler, and trigger (or wait for) a new task execution.
#. Verify that logs are showing up for newly executed tasks in the container at the base path you defined.
#. Verify that the Azure Blob Storage viewer is working in the UI. Pull up a newly executed task, and verify that you see something like:

   .. code-block:: none

      *** Found remote logs:
      ***   * https://my-container.blob.core.windows.net/path/to/logs/dag_id=tutorial_dag/run_id=manual.../task_id=load/attempt=1.log
      [2023-07-23, 03:52:47] {taskinstance.py:1144} INFO - Dependencies all met...

**Note** that the path to the remote log file is listed in the second line.
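
To confirm that the connection really has the read and write access that step 2 requires, you can
run a small smoke test against the container before relying on remote logging. A minimal sketch,
assuming the placeholder connection id ``wasb_logs`` and the container and path values from the
example configuration above:

.. code-block:: python

    from airflow.providers.microsoft.azure.hooks.wasb import WasbHook

    # Placeholder values; substitute your own connection id, container, and path.
    hook = WasbHook(wasb_conn_id="wasb_logs")
    container = "my-container"
    blob = "path/to/logs/_connectivity_check.log"

    # Write, read back, and clean up a test blob to prove read/write access.
    hook.load_string("connectivity check", container_name=container, blob_name=blob)
    print(hook.read_file(container_name=container, blob_name=blob))
    hook.delete_file(container_name=container, blob_name=blob)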