blob: 4daba9c0dbedbc72937c91390481ceb39c9d9148 [file] [log] [blame]
////
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
////
:description: Apache Hop provides a Docker image for long (Hop Server) and short-lived (hop-run) containers. An additional image is available for Hop Web. Both images are available on Docker Hub.
[[DockerContainer-DockerContainer]]
= Docker container
== Introduction
This is the documentation of the official Apache Hop docker container published on:
https://hub.docker.com/r/apache/hop
It's a **Hop Docker image** supporting both **short-lived** and **long-lived** setups.
A short-lived setup executes a pipeline or workflow and stops right after.
A long-lived setup starts a Hop server and waits for work.
== Operating system
The docker container runs a minimal Linux system called https://hub.docker.com/_/alpine[Alpine].
OpenJDK version 8 is then used to execute Apache Hop.
The Linux user used to execute in the container is `hop` and the group is `hop` as well.
== Container Folder Structure
|===
|Folder | Description
|```/opt/hop```
| The installation location of the hop client package.
|```/files```
| This volume has read-write permissions for Linux user `hop`.
You can use it for example to mount a folder that contains the **hop and project config** as well as the **workflows and pipelines**.
|```/home/hop```
| The initial working directory location of the docker container.
|===
== Environment Variables
You can provide values for the following environment variables:
|===
|Environment Variable|Default |Description
|```HOP_LOG_LEVEL```
|`Basic`
| The log level.
Use one of: `None`, `Error`, `Minimal`, `Basic`, `Detailed`, `Debug` or `Rowlevel`.
|```HOP_LOG_PATH```
|`/opt/hop/hop.err.log`
| The file path to the Hop log file.
|```HOP_PROJECT_NAME```
|
| Name of the Hop project to create in the container.
You also need to specify the ```HOP_PROJECT_FOLDER``` variable.
If you do not set this variable, no project or environment will be created.
|```HOP_PROJECT_FOLDER```
`HOP_PROJECT_DIRECTORY` (Deprecated)
|
| Path to the home of the Hop project.
|```HOP_PROJECT_CONFIG_FILE_NAME```
|`project-config.json`
| Name of the project config file.
|```HOP_ENVIRONMENT_NAME```
|
| The name of the Hop environment to create in the container.
If you do not set this variable, no environment will be created.
|```HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS```
|
| This is a comma separated list of paths to environment config files (including filename and file extension).
|```HOP_OPTIONS```
|`-XX:+AggressiveHeap`
| Any JRE options you want to set.
The `-XX:+AggressiveHeap` option tells the container to use all memory assigned to it.
|```HOP_SHARED_JDBC_FOLDER```
|
| Path to a folder where your JDBC drivers are located.
|```HOP_CUSTOM_ENTRYPOINT_EXTENSION_SHELL_FILE_PATH```
|
| Optional path to a custom entrypoint extension script file.
You can use this for example to fetch Hop project files from S3 or gitlab.
|```HOP_SYSTEM_PROPERTIES```
|
| The system properties that should be set during execution.
You can specify the properties as a comma separated list, e.g. `PROP1=xxx,PROP2=yyy`
|===
Here are the variables for **short-lived** containers, to execute a workflow or pipeline:
|===
|Environment Variable | Required | Description
|```HOP_FILE_PATH```
| Yes
| The path to the Hop workflow or pipeline to execute.
If you configured a project you can use a relative path to the project home folder.
|```HOP_RUN_CONFIG```
| Yes
| The name of the Hop run configuration to use to execute your pipeline or workflow.
|```HOP_RUN_PARAMETERS```
| No
| Optionally, you can specify the parameters that should be passed to the pipeline or workflow you are executing.
You can specify them as a comma separated list, e.g. ```PARAM_1=aaa,PARAM_2=bbb```.
|===
Below are the variables you can use for a **long-lived** container, running Hop Server:
|===
|Environment Variable |Default value| Description
|```HOP_SERVER_HOSTNAME```
| `0.0.0.0` (listen to anything)
| The IP address the server will listen to.
|```HOP_SERVER_PORT```
| `8080`
| The port the server will listen to.
|```HOP_SERVER_USER```
|`cluster`
| The username to log into the Hop server.
|```HOP_SERVER_PASS```
| `cluster`
|The password to log into the Hop server
|```HOP_SERVER_METADATA_FOLDER```
|(optional)
| You can point to a folder containing metadata JSON files which are then available to the server.
|```HOP_SERVER_MAX_LOG_LINES```
|`0` (keep all logging in memory)
|The maximum number of log lines kept in memory by the server.
|```HOP_SERVER_MAX_LOG_TIMEOUT```
|`0` (never clean up log lines)
|The time (in minutes) it takes for a log line to be cleaned up in memory.
|```HOP_SERVER_MAX_OBJECT_TIMEOUT```
|`1440` (a day)
|The time (in minutes) it takes for a pipeline or workflow execution to be removed from the server status.
|```HOP_SERVER_KEYSTORE```
| (optional)
|The path to the Java keystore file you want to use to run the Hop server with SSL enabled to support https.
|```HOP_SERVER_KEYSTORE_PASSWORD```
|(optional)
|The password of the Java keystore file you want to use to run the Hop server with SSL enabled to support https
|```HOP_SERVER_KEY_PASSWORD```
|(optional)
|The password of the key if you want to use to run the Hop server with SSL enabled.
If both passwords are the same you can omit setting this variable.
|===
== Updating the Hop docker container image
Make sure to get the latest updates for the Hop image by pulling them:
[source,bash]
----
docker pull apache/hop:<tag>
----
If you do not specify a value for `:<tag>` the value `latest` will be taken.
Latest will contain the last officially released version of Apache Hop.
You can also specify `Development` as a tag.
That image will contain the last built Development snapshot of Apache Hop. rxq7777
== How to run the Container
The most common use case will be that you run a **short-lived container** to just complete one Hop workflow or pipeline.
Below is an example of the commands to run a **workflow**:
[source,bash]
----
docker run -it --rm \
--env HOP_LOG_LEVEL=Basic \
--env HOP_FILE_PATH='${PROJECT_HOME}/pipelines-and-workflows/main.hwf' \
--env HOP_PROJECT_FOLDER=/files/project \
--env HOP_PROJECT_NAME=project-a \
--env HOP_ENVIRONMENT_NAME=project-a-test \
--env HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS=/files/config/project-a-test.json \
--env HOP_RUN_CONFIG=classic \
--env HOP_RUN_PARAMETERS=PARAM_LOG_MESSAGE=Hello,PARAM_WAIT_FOR_X_MINUTES=1 \
-v /path/to/local/dir:/files \
--name my-simple-hop-container \
apache/hop:<tag>
----
If you need a **long-lived container**, this option is also available.
Run this command to start a Hop Server in a docker container:
[source,bash]
----
docker pull docker.io/apache/hop:<tag>
docker run -it --rm \
--env HOP_SERVER_USER=admin \
--env HOP_SERVER_PASS=admin \
--env HOP_SERVER_PORT=8181 \
--env HOP_SERVER_HOSTNAME=localhost \
-p 8181:8181 \
--name my-hop-server-container \
apache/hop:<tag>
----
Hop Server is designed to receive all variables and metadata from executing clients.
This means it needs little to no configuration to run.
You can then access the hop-server UI from your host at `http://localhost:8181`
== Custom Entrypoint Extension Shell Script
To make the Hop Docker image even more flexible, we added a ```HOP_CUSTOM_ENTRYPOINT_EXTENSION_SHELL_FILE_PATH``` variable that accepts a path to a custom shell script (that you provide).This shell script will run when you start the container before your Hop project is registered with the container's Hop config and before your Hop workflow or pipeline gets kicked off.
This feature might come in handy when you want to run some custom logic upfront, e.g. source Hop project files from S3 or clone them from GitHub.
The custom shell file can be provided in several ways (this is not a full list):
- via the mount point (```/files```)
- You create your own Dockerfile, define this image as the base and then use the ```COPY``` instruction to copy your custom shell file in your Docker image.
For the last scenario mentioned, it could be something like this:
We create a simple **bash script** called ```clone-git-repo.sh``` in a sub-folder called ```resources```:
[source,shell]
----
#!/bin/bash
cd /home/hop
git clone ${GIT_REPO_URI}
chown -R hop:hop /home/hop/${GIT_REPO_NAME}
----
We also make it parameter-driven, so it any other team can use it.We create our custom Dockerfile like so:
[source,dockerfile]
----
FROM apache/hop:1.1.0-SNAPSHOT
ENV GIT_REPO_URI=https://...
# example value: https://github.com/diethardsteiner/apache-hop-minimal-project.git
ENV GIT_REPO_NAME=repo-name
# example value: apache-hop-minimal-project
USER root
RUN apk update \
&& apk add --no-cache git
# copy custom entrypoint extension shell script
COPY --chown=hop:hop ./resources/clone-git-repo.sh /home/hop/clone-git-repo.sh
USER hop
----
Note that apart from defining the new environment variables (that go in line with the parameters we defined in the ```clone-git-repo.sh``` earlier on ), we also ```COPY``` the ```clone-git-repo.sh``` file to user hop's home folder.
Next let's build a small script which builds our custom image and then tests it by spinning up a container and running a workflow:
[source,shell]
----
#!/bin/zsh
DOCKER_IMG_CHECK=$(docker images | grep ds/custom-hop)
if [ ! -z "${DOCKER_IMG_CHECK}" ]; then
echo "removing existing ds/custom-hop image"
docker rmi ds/custom-hop:latest
fi
docker build . -f custom.Dockerfile -t ds/custom-hop:latest
echo " ==== TESTING ====="
HOP_DOCKER_IMAGE=ds/custom-hop:latest
PROJECT_DEPLOYMENT_DIR=/home/hop/apache-hop-minimal-project
docker run -it --rm \
--env HOP_LOG_LEVEL=Basic \
--env HOP_FILE_PATH='${PROJECT_HOME}/main.hwf' \
--env HOP_PROJECT_FOLDER=${PROJECT_DEPLOYMENT_DIR} \
--env HOP_PROJECT_NAME=apache-hop-minimum-project \
--env HOP_ENVIRONMENT_NAME=dev \
--env HOP_ENVIRONMENT_CONFIG_FILE_NAME_PATHS=${PROJECT_DEPLOYMENT_DIR}/dev-config.json \
--env HOP_RUN_CONFIG=local \
--env HOP_CUSTOM_ENTRYPOINT_EXTENSION_SHELL_FILE_PATH=/home/hop/clone-git-repo.sh \
--env GIT_REPO_URI=https://github.com/diethardsteiner/apache-hop-minimal-project.git \
--env GIT_REPO_NAME=apache-hop-minimal-project \
--name my-simple-hop-container \
${HOP_DOCKER_IMAGE}
----