| --- |
| title: Docker Builds |
| hide_title: true |
| sidebar_position: 6 |
| version: 1 |
| --- |
| |
| # Docker builds, images and tags |
| |
| The Apache Superset community extensively uses Docker for development, release, |
| and productionizing Superset. This page details our Docker builds and tag naming |
| schemes to help users navigate our offerings. |
| |
| Images are built and pushed to the [Superset Docker Hub repository]( |
| https://hub.docker.com/r/apache/superset) using GitHub Actions. |
| Different sets of images are built and/or published at different times: |
| |
| - **Published releases** (`release`): published using |
| tags like `3.0.0` and the `latest` tag. |
| - **Pull request iterations** (`pull_request`): for each pull request, we build |
| the Docker images to validate the build, but we do not publish them, for |
| security reasons; we simply run `docker build --load`. |
| - **Merges to the main branch** (`push`): resulting in new SHAs, with tags |
| prefixed with `master` for the latest `master` version. |
| |
| ## Build presets |
| |
| We have a set of build "presets", each representing a combination of build |
| parameters, mostly a different target layer and/or a different base image. |
| |
| Here are the build presets that are exposed through the `supersetbot docker` utility: |
| |
| - `lean`: The default Docker image, including both frontend and backend. Tags |
| without a build preset suffix are `lean` builds (e.g. `latest`, `4.0.0`, `3.0.0`, ...). `lean` |
| builds do not contain database |
| drivers, meaning you need to install your own. That applies to analytics databases **AND |
| the metadata database**. You'll likely want to layer either `mysqlclient` or `psycopg2-binary` |
| depending on the metadata database you choose for your installation, plus the required |
| drivers to connect to your analytics database(s). |
| - `dev`: For development, with a headless browser, dev-related utilities and root access. This |
| includes some commonly used database drivers like `mysqlclient` and `psycopg2-binary`, plus |
| some others used for development/CI. |
| - `py311` (for example): Similar to `lean` but with a different Python version (in this example, 3.11). |
| - `ci`: For certain CI workloads. |
| - `websocket`: For Superset clusters supporting advanced features. |
| - `dockerize`: Used by Helm. |
| |
| ## Key tag examples |
| |
| - `latest`: The latest official release build. |
| - `latest-dev`: The `-dev` image of the latest official release build, with a |
| headless browser and root access. |
| - `master`: The latest build from the `master` branch, implicitly the `lean` build |
| preset. |
| - `master-dev`: Similar to `master` but includes a headless browser and root access. |
| - `pr-5252`: The latest commit in PR 5252. |
| - `30948dc401b40982cb7c0dbf6ebbe443b2748c1b-dev`: A build for |
| this specific SHA, which could come from a `master` merge or a release. |
| - `websocket-latest`: The WebSocket image for use in a Superset cluster. |
| |
| |
| |
| For insights or modifications to the build matrix and tagging conventions, |
| check the [supersetbot docker](https://github.com/apache-superset/supersetbot) |
| subcommand and the [docker.yml](https://github.com/apache/superset/blob/master/.github/workflows/docker.yml) |
| GitHub action. |
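| |
| As a quick illustration of the tag scheme, the tags listed above can be pulled directly from the `apache/superset` repository. This is only a sketch; whether a given tag is currently published is best confirmed on Docker Hub. |
| |
| ```bash |
| # Latest official release, lean preset |
| docker pull apache/superset:latest |
| # Same release, dev preset (headless browser, root access) |
| docker pull apache/superset:latest-dev |
| # Latest build from the master branch, dev preset |
| docker pull apache/superset:master-dev |
| ``` |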
| |
| ## Key ARGs in Dockerfile |
| - `BUILD_TRANSLATIONS`: whether to build the translations into the image. When |
| translations are not built, the frontend build tells webpack to strip all locales |
| other than `en` from the `moment-timezone` library, and the backend build skips |
| compiling the `*.po` translation files. |
| - `DEV_MODE`: whether to skip the frontend build. This is used by our `docker-compose` dev |
| setup, which mounts the local volume and runs `webpack` in `--watch` mode from within a |
| dedicated Docker image, so the frontend is rebuilt continuously as you alter code on the |
| local file system. Setting this ARG lets the initial `docker-compose` build take much less |
| time and fewer resources. |
| - `INCLUDE_CHROMIUM`: whether to include Chromium in the backend build so that it can be |
| used as a headless browser for workloads related to "Alerts & Reports" and thumbnail generation. |
| - `INCLUDE_FIREFOX`: same as above, but for Firefox. |
| - `PY_VER`: specifies the base image for the Python backend. We don't recommend altering |
| this setting unless you're working on forwards or backwards compatibility. |
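| |
| A minimal sketch of overriding these ARGs in a local build, run from the root of a Superset checkout. The `true`/`false` values, the `--target lean` stage name and the `superset-local` image name are assumptions for illustration; the `Dockerfile` itself is the reference for accepted values and defaults. |
| |
| ```bash |
| # Build the (assumed) lean target with translations compiled and without Chromium |
| docker build --target lean --build-arg BUILD_TRANSLATIONS=true --build-arg INCLUDE_CHROMIUM=false -t superset-local:lean . |
| ``` |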
| |
| ## Caching |
| |
| To accelerate builds, we follow Docker best practices and use `apache/superset-cache`. |
| |
| ## About database drivers |
| |
| Our Docker images come with little to no database driver support since |
| each environment requires different drivers, and maintaining a build with |
| wide database support would be both challenging (dozens of databases, |
| Python drivers, and OS dependencies) and inefficient (longer |
| build times, larger images, lower layer cache hit rate, ...). |
| |
| For production use cases, we recommend that you derive your own image from our |
| `lean` image(s) and add support for the databases you need. |
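| |
| One way to derive such an image is sketched here, passing a small Dockerfile to `docker build` on stdin. The `my-superset` image name is hypothetical, and the sketch assumes that `pip` is usable as root in the base image and that the image defines a `superset` user; the install step may differ between image versions. |
| |
| ```bash |
| # Build a derived image from a Dockerfile read from stdin (no build context needed) |
| docker build -t my-superset:4.0.0 - <<'EOF' |
| FROM apache/superset:4.0.0 |
| # Switch to root to install the driver for a PostgreSQL metadata database |
| USER root |
| RUN pip install psycopg2-binary |
| # Drop back to the unprivileged user (assumed to exist in the base image) |
| USER superset |
| EOF |
| ``` |
| |
| Add further `pip install` lines for the drivers your analytics databases need. |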
| |
| ## On supporting different platforms (namely arm64 AND amd64) |
| |
| Currently all automated builds are multi-platform, supporting both `linux/arm64` |
| and `linux/amd64`. This enables higher level constructs like `helm` and |
| `docker compose` to point to these images and effectively be multi-platform |
| as well. |
| |
| Pull request and `master` builds are one-image-per-platform so that they can |
| be parallelized. The build matrix for those is sparser, since we don't need |
| to build every build preset on every platform and can generally be more |
| selective. For those builds, we suffix tags with `-arm` where it applies. |
| |
| ### Working with Apple silicon |
| |
| Apple's current generation of computers uses ARM-based CPUs, and Docker |
| running on Macs seems to require `linux/arm64/v8` (at least one user's M2 was |
| configured that way). Setting the environment |
| variable `DOCKER_DEFAULT_PLATFORM` to `linux/amd64` seems to work for |
| using and building upon the Superset builds provided here. |
| |
| ```bash |
| export DOCKER_DEFAULT_PLATFORM=linux/amd64 |
| ``` |
| |
| Presumably, `linux/arm64/v8` would be more optimized for this generation |
| of chips, but less compatible across the ARM ecosystem. |
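| |
| As an alternative to the environment variable, the platform can be requested per command. This is a sketch using the standard `--platform` flag; the choice of the `latest` tag is only an example. |
| |
| ```bash |
| # Pull the amd64 variant explicitly |
| docker pull --platform linux/amd64 apache/superset:latest |
| # Check which OS/architecture the local image was built for |
| docker image inspect --format '{{.Os}}/{{.Architecture}}' apache/superset:latest |
| ``` |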