blob: 7fc9d1097cce492a919f06f26426c3af312f39f2 [file] [log] [blame]
# INSTALL / BUILD instructions for Apache Airflow
This is a generic installation method that requires a number of dependencies to be installed.
Depending on your system you might need different prerequisites, but the following
systems/prerequisites are known to work:
Linux (Debian Bullseye and Linux Mint Debbie):
sudo apt install build-essential python3-dev libsqlite3-dev openssl \
sqlite default-libmysqlclient-dev libmysqlclient-dev postgresql
On Ubuntu 20.04 you may get an error of mariadb_config not found
and mysql_config not found.
Install MariaDB development headers:
sudo apt-get install libmariadb-dev libmariadbclient-dev
MacOS (Mojave/Catalina):
brew install sqlite mysql postgresql
# [required] fetch the tarball and untar the source move into the directory that was untarred.
# [optional] run Apache RAT (release audit tool) to validate license headers
# RAT docs here: https://creadur.apache.org/rat/. Requires Java and Apache Rat
java -jar apache-rat.jar -E ./.rat-excludes -d .
# [optional] Airflow pulls in quite a lot of dependencies in order
# to connect to other services. You might want to test or run Airflow
# from a virtual env to make sure those dependencies are separated
# from your system wide versions
python3 -m venv PATH_TO_YOUR_VENV
source PATH_TO_YOUR_VENV/bin/activate
# [required] building and installing by pip (preferred)
pip install .
# or directly
python setup.py install
# You can also install recommended version of the dependencies by using
# constraint-python<PYTHON_MAJOR_MINOR_VERSION>.txt files as constraint file. This is needed in case
# you have problems with installing the current requirements from PyPI.
# There are different constraint files for different python versions. For example"
pip install . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
By default `pip install` in Airflow 2.0 installs only the provider packages that are needed by the extras and
install them as packages from PyPI rather than from local sources:
pip install .[google,amazon] \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
You can upgrade just airflow, without paying attention to provider's dependencies by using 'constraints-no-providers'
constraint files. This allows you to keep installed provider packages.
pip install . --upgrade \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-no-providers-3.7.txt"
You can also install airflow in "editable mode" (with -e) flag and then provider packages are
available directly from the sources (and the provider packages installed from PyPI are UNINSTALLED in
order to avoid having providers in two places. And `provider.yaml` files are used to discover capabilities
of the providers which are part of the airflow source code.
You can read more about `provider.yaml` and community-managed providers in
https://airflow.apache.org/docs/apache-airflow-providers/index.html for developing custom providers
and in ``CONTRIBUTING.rst`` for developing community maintained providers.
This is useful if you want to develop providers:
pip install -e . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
You can also skip installing provider packages from PyPI by setting INSTALL_PROVIDERS_FROM_SOURCE to "true".
In this case Airflow will be installed in non-editable mode with all providers installed from the sources.
Additionally `provider.yaml` files will also be copied to providers folders which will make the providers
discoverable by Airflow even if they are not installed from packages in this case.
INSTALL_PROVIDERS_FROM_SOURCES="true" pip install . \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
Airflow can be installed with extras to install some additional features (for example 'async' or 'doc' or
to install automatically providers and all dependencies needed by that provider:
pip install .[async,google,amazon] \
--constraint "https://raw.githubusercontent.com/apache/airflow/constraints-main/constraints-3.7.txt"
The list of available extras:
# START EXTRAS HERE
aiobotocore, airbyte, alibaba, all, all_dbs, amazon, apache.atlas, apache.beam, apache.cassandra,
apache.drill, apache.druid, apache.flink, apache.hdfs, apache.hive, apache.impala, apache.kafka,
apache.kylin, apache.livy, apache.pig, apache.pinot, apache.spark, apache.sqoop, apache.webhdfs,
arangodb, asana, async, atlas, atlassian.jira, aws, azure, cassandra, celery, cgroups, cloudant,
cncf.kubernetes, common.sql, crypto, dask, databricks, datadog, dbt.cloud, deprecated_api, devel,
devel_all, devel_ci, devel_hadoop, dingding, discord, doc, doc_gen, docker, druid, elasticsearch,
exasol, facebook, ftp, gcp, gcp_api, github, github_enterprise, google, google_auth, grpc,
hashicorp, hdfs, hive, http, imap, influxdb, jdbc, jenkins, kerberos, kubernetes, ldap, leveldb,
microsoft.azure, microsoft.mssql, microsoft.psrp, microsoft.winrm, mongo, mssql, mysql, neo4j, odbc,
openfaas, openlineage, opsgenie, oracle, otel, pagerduty, pandas, papermill, password, pinot,
plexus, postgres, presto, qds, qubole, rabbitmq, redis, s3, salesforce, samba, segment, sendgrid,
sentry, sftp, singularity, slack, smtp, snowflake, spark, sqlite, ssh, statsd, tableau, tabular,
telegram, trino, vertica, virtualenv, webhdfs, winrm, zendesk
# END EXTRAS HERE
# For installing Airflow in development environments - see CONTRIBUTING.rst
# COMPILING FRONT-END ASSETS (in case you see "Please make sure to build the frontend in static/ directory and then restart the server")
# Optional : Installing yarn - https://classic.yarnpkg.com/en/docs/install
python setup.py compile_assets