hadoop-hdds/docs/content/beyond/Containers.md

title: "Ozone Containers"
summary: Ozone uses containers extensively for testing. This page documents the usage and best practices of Ozone.
weight: 2

Docker is used heavily in Ozone development, with three principal use cases:

dev:
- We use Docker to start local pseudo-clusters (Docker provides a unified environment, and no image creation is required).

test:
- We create Docker images from the dev branches to test Ozone in Kubernetes and other container orchestrator systems.
- We provide apache/ozone images for each release to make it easier to evaluate Ozone. These images are not intended for production use.

production:
- We have documentation on how you can create your own Docker image for your production cluster.

Let's check out each of the use-cases in more detail:

Development

The Ozone artifact contains example docker-compose directories that make it easier to start an Ozone cluster on your local machine.

From distribution:

cd compose/ozone
docker-compose up -d

After a local build:

cd hadoop-ozone/dist/target/ozone-*/compose/ozone
docker-compose up -d

These environments are important tools for starting different types of Ozone clusters at any time.

To be sure that the compose files are up to date, we also provide acceptance test suites which start the cluster and check its basic behaviour.

The acceptance tests are part of the distribution, and you can find the test definitions in the smoketest directory.

You can start the tests from any compose directory. For example:

cd compose/ozone
./test.sh

Implementation details

The compose tests are based on the apache/hadoop-runner Docker image. The image itself does not contain any Ozone jar file or binary, only the helper scripts needed to start Ozone.

hadoop-runner provides a fixed environment to run Ozone everywhere, but the Ozone distribution itself is mounted from the enclosing directory:

(Example docker-compose fragment)

 scm:
      image: apache/hadoop-runner:jdk11
      volumes:
         - ../..:/opt/hadoop
      ports:
         - 9876:9876

The containers are configured through environment variables. Because the same environment variables should be set for every container, we maintain the list of environment variables in a separate file:

 scm:
      image: apache/hadoop-runner:jdk11
      #...
      env_file:
          - ./docker-config

The docker-config file contains the list of the required environment variables:

OZONE-SITE.XML_ozone.om.address=om
OZONE-SITE.XML_ozone.om.http-address=om:9874
OZONE-SITE.XML_ozone.scm.names=scm
OZONE-SITE.XML_ozone.enabled=True
#...

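As you can see, we use a naming convention: the part of the variable name before the underscore identifies the target config file, and the rest is the configuration key. Based on the name of the environment variable, the appropriate Hadoop config XML (ozone-site.xml in our case) is generated by a script which is included in the hadoop-runner base image.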
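The naming convention can be illustrated with a minimal Python sketch. This is not the script shipped in hadoop-runner; the function names `parse_env_lines`, `group_by_file` and `render_xml` are hypothetical, and the sketch assumes that the text before the first underscore is the target file name (lower-cased) and that only prefixes ending in `.XML` are handled.

```python
from xml.sax.saxutils import escape


def parse_env_lines(text):
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if sep:
            env[key] = value
    return env


def group_by_file(env):
    """Group variables named <FILE>_<key> by lower-cased target file name."""
    grouped = {}
    for name, value in env.items():
        prefix, sep, key = name.partition("_")
        # Assumption: only prefixes that look like an XML file name are config entries.
        if not sep or not prefix.upper().endswith(".XML"):
            continue
        grouped.setdefault(prefix.lower(), {})[key] = value
    return grouped


def render_xml(properties):
    """Render a Hadoop-style <configuration> document from a key/value mapping."""
    lines = ["<configuration>"]
    for key in sorted(properties):
        lines.append("  <property>")
        lines.append(f"    <name>{escape(key)}</name>")
        lines.append(f"    <value>{escape(properties[key])}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


docker_config = """\
OZONE-SITE.XML_ozone.om.address=om
OZONE-SITE.XML_ozone.om.http-address=om:9874
OZONE-SITE.XML_ozone.scm.names=scm
OZONE-SITE.XML_ozone.enabled=True
#...
"""
configs = group_by_file(parse_env_lines(docker_config))
ozone_site_xml = render_xml(configs["ozone-site.xml"])
```

Here `ozone_site_xml` holds one `<property>` element per variable, and variables that do not follow the convention (for example `PATH`) would simply be ignored.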

The entrypoint of the hadoop-runner image contains a helper shell script which triggers this transformation and can perform additional actions (e.g. initialize SCM/OM storage, download required keytabs, etc.) based on environment variables.

Test/Staging

The docker-compose based approach is recommended only for local testing, not for multi-node clusters. To use containers on a multi-node cluster we need a container orchestrator like Kubernetes.

Kubernetes example files are included in the kubernetes folder.

Please note: all the provided images are based on the hadoop-runner image, which contains all the tools required for testing in staging environments. For production we recommend creating your own hardened image with your own base image.

Test the release

The release can be tested by deploying any of the example clusters:

cd kubernetes/examples/ozone
kubectl apply -f .

Please note that in this case the latest released container image will be downloaded from Docker Hub.

Test the development build

To test a development build you can create your own image and upload it to your own docker registry:

mvn clean install -f pom.ozone.xml -DskipTests -Pdocker-build,docker-push -Ddocker.image=myregistry:9000/name/ozone

The configured image will be used in all the generated Kubernetes resource files (the image: keys are adjusted during the build):
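To make the effect on the resource files concrete, here is a minimal Python sketch of what adjusting the image: keys amounts to. It is only an illustration, not the mechanism used by the Maven build; the function name `set_image` and the sample manifest are hypothetical.

```python
import re

# Matches a whole "image: <value>" line, optionally starting a YAML list item.
IMAGE_LINE = re.compile(r"^([ \t]*(?:-[ \t]*)?image:[ \t]*)\S+[ \t]*$", re.MULTILINE)


def set_image(manifest_text, image):
    """Replace the value of every `image:` key, keeping the original indentation."""
    return IMAGE_LINE.sub(lambda m: m.group(1) + image, manifest_text)


# Hypothetical fragment of a generated Kubernetes resource file.
manifest = """\
containers:
- name: datanode
  image: apache/ozone
  args: ["ozone", "datanode"]
"""
updated = set_image(manifest, "myregistry:9000/name/ozone")
```

After the call, `updated` is identical to `manifest` except that the `image:` line points at the image passed in, which mirrors the value given with `-Ddocker.image`.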

cd kubernetes/examples/ozone
kubectl apply -f .

Production

You can use the source of our development images as an example:

- [Base image](https://github.com/apache/hadoop/blob/docker-hadoop-runner-jdk11/Dockerfile)
- [Docker image](https://github.com/apache/hadoop/blob/trunk/hadoop-ozone/dist/src/main/docker/Dockerfile)

Most of the elements are optional helper functions, but to use the provided example Kubernetes resources you may need the following scripts:

- The two Python scripts convert environment variables to real Hadoop XML config files.
- The start.sh script executes the Python scripts (and other initialization) based on environment variables.

Containers

Ozone related container images and source locations: