The integration tests are built and run as part of Druid's Maven script. Maven itself is used by hand, and as part of the GHA build process. Running integration tests in Maven is a multi-part process.

* Build the product `distribution`.
* Build the test image. The image build appears in the root `pom.xml` file after the `distribution` project, which builds the Druid tarball.
* Run the tests.

Travis orchestrates the above process to run the ITs in parallel. When you run tests locally, you do the above steps one by one. You can, of course, reuse the same distribution for multiple image builds, and the same image for multiple test runs.
Use the following command to run the ITs, assuming `DRUID_DEV` points to your Druid development directory:

    cd $DRUID_DEV
    mvn clean package -P dist,test-image,skip-static-checks \
        -Dmaven.javadoc.skip=true -DskipUTs=true

The various pieces are:

* `clean`: Remove any existing artifacts, and any existing Docker image.
* `install`: Build the Druid code and write it to the local Maven repo.
* `-P dist`: Create the Druid distribution tarball by pulling jars from the local Maven repo.
* `-P test-image`: Build the Docker images by grabbing the Druid tarball and pulling additional dependencies into the local repo, then stage them for Docker.

Once you've done the above once, you can do just the specific part you want to repeat during development. See below for details.

See quickstart for how to run the two steps separately.
Each pass through Maven runs a single test category. Running a test category has three parts, spelled out in Maven: start the cluster, run the tests, and clean up the cluster.
Again, see quickstart for how to run the three steps separately, and how to run the tests in an IDE.
To do the task via Maven:

    cd $DRUID_DEV
    mvn verify -P docker-tests,skip-static-checks,IT-<category> \
        -Dmaven.javadoc.skip=true -DskipUTs=true

The various pieces are:

* `verify`: Run the steps up to the one that checks the output of the ITs. Because of the extra cluster step in an IT, the build does not fail if an IT fails. Instead, it continues on to clean up the cluster, and only after that does it check test success in the `verify` step.
* `<category>`: The name of the test category as listed in tests.

The revised integration tests use the [Maven failsafe plugin](https://maven.apache.org/surefire/maven-failsafe-plugin/), which shares code with the Maven Surefire plugin used to run unit tests. Failsafe handles the special steps unique to integration tests.

Since we use JUnit categories, we must use the `surefire-junit47` provider. When the provider is omitted, the TestNG provider seems to be selected instead. Using just the `surefire-junit4` provider causes Surefire to ignore categories.

One shared item is the `skipTests` flag. Via a bit of Maven config creativity we define extra properties to control the two kinds of tests:

* `-DskipUTs=true` to skip Surefire (unit) tests.
* `-P docker-tests` to enable Failsafe (integration) tests.
* `-DskipTests=true` to skip all tests.

The key modules in the above flow include:

* `distribution`: Builds the Druid tarball which the ITs exercise.

The IT process resides in the `integration-tests-ex` folder and consists of three Maven modules:

* `druid-it-tools`: Testing extensions added to the Docker image.
* `druid-it-image`: Builds the Druid test Docker image.
* `druid-integration-test-cases`: The code for all the ITs, along with the supporting framework.

The annoying “druid” prefix occurs to make it easier to separate Apache Druid modules when users extend Druid with extra user-specific modules.

It turns out that, for obscure reasons, we must use a “flat” module structure under the root Druid `pom.xml` even though the modules themselves are in folders within `integration-tests-ex`. That is, we cannot create an `integration-tests-ex` Maven module to hold the IT modules. The reason has to do with the fact that Maven has no good way to obtain the directory that contains the root `pom.xml` file. Yet, we need this directory to find configuration files for some of the static checking tools. Though there is a directory plugin that looks like it should work, we then run into another issue: if we invoke a goal directly from the `mvn` command line, Maven happily ignores the `validate` and `initialize` phases where we'd set the directory path. By using a two-level module structure, we can punt and just always assume that `${project.parent.basedir}` points to the root Druid directory. More than you wanted to know, but now you know why there is no `integration-tests-ex` module as there should be.

As a result, you may have to hunt in your IDE to find the non-project files in the `integration-tests-ex` directory. Look in the root `druid` project, in the `integration-tests-ex` folder.

Because of this limitation, all the test code is in one Maven module. When we tried to create a separate module per category, we ended up with a great deal of redundancy since we could not have a common parent module. Putting all the code in one module means we can only run one test category per Maven run, which is actually fine because that's how Travis runs tests anyway.

`org.apache.druid.testsEx` Package

The `org.apache.druid.testsEx` package is temporary: it holds code from the `integration-tests` `org.apache.druid.testing` package, adapted to work in the revised environment. Some classes have the same name in both places. The goal is to merge the `testsEx` package back into `testing` at some future point when the tests are all upgraded.

The revised ITs themselves are also in this package. Over time, as we replace the old ITs, the tests can migrate to the original package names.

Integration test artifacts are built only if you specifically request them using a profile.

* `-P test-image` builds the test Docker image.
* `-P docker-tests` enables the integration tests.
* `-P IT-<category>` selects the category to run.

The profiles allow you to build the test image once during debugging, and reuse it across multiple test runs. (See Debugging.)
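
As a rough sketch of that reuse, the script below builds the image once and then runs several categories, one Maven invocation per category. The Maven arguments are copied from the two commands shown earlier in this document. The category names passed to `run_its` and the `DRY_RUN` switch are placeholders invented for illustration, not part of the Druid build. By default the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch: build the test image once, then reuse it for several test runs.
# ASSUMPTIONS: the category names passed to run_its and the DRY_RUN switch
# are hypothetical; the Maven arguments are copied from this document.

DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print a command, and execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then
    "$@"
  fi
}

# Build the distribution tarball and the test Docker image (done once).
build_image() {
  run mvn clean package -P dist,test-image,skip-static-checks \
    -Dmaven.javadoc.skip=true -DskipUTs=true
}

# Run one test category per Maven invocation, stopping at the first failure.
run_its() {
  local category
  for category in "$@"; do
    run mvn verify -P "docker-tests,skip-static-checks,IT-${category}" \
      -Dmaven.javadoc.skip=true -DskipUTs=true || return 1
  done
}
```

For example, after `cd $DRUID_DEV`, running `build_image && run_its CategoryA CategoryB` prints one build command followed by one `mvn verify` command per category. Setting `DRY_RUN=0` executes them instead.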

The Docker image includes three third-party dependencies not included in the Druid build.

We use dependency rules in the `test-image/pom.xml` file to cause Maven to download these dependencies into the Maven cache, then we use the `maven-dependency-plugin` to copy those dependencies into a Docker directory, and we use Docker to copy the files into the image. This approach avoids the need to pull the dependency from a remote repository into the image directly, and thus both speeds up the build and is kinder to the upstream repositories.

If you add additional dependencies, please follow the above process. See the `pom.xml` files for examples.
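
To make the staging step concrete, here is a minimal sketch that uses the command-line form of the `maven-dependency-plugin` `copy` goal. This is an analogue for illustration only: the project itself configures the plugin in `pom.xml`, and the artifact coordinates and the `target/docker` staging directory below are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: stage a third-party jar for the Docker build via the Maven cache.
# ASSUMPTIONS: the artifact coordinates and the staging directory are
# hypothetical, and this is the command-line form of maven-dependency-plugin,
# not the pom.xml configuration the project uses.

# Print the Maven command that resolves an artifact (downloading it into the
# local Maven cache if needed) and copies the jar into a staging directory.
# $1: coordinates as groupId:artifactId:version   $2: staging directory
stage_dependency_cmd() {
  local artifact="$1" stage_dir="$2"
  echo "mvn dependency:copy -Dartifact=${artifact} -DoutputDirectory=${stage_dir}"
}

# Example with made-up coordinates; prints the command without running it.
stage_dependency_cmd "org.example:example-client:1.0.0" "target/docker"
```

A `COPY` instruction in the `Dockerfile` can then move the staged jar into the image, so the image build itself never contacts a remote repository.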

The build environment uses environment variables to pass information to Maven. Maven communicates with Docker and Docker Compose via environment variables set in the `exec-maven-plugin` of various `pom.xml` files. The environment variables then flow into either the Docker build script (`Dockerfile`) or the various Docker Compose scripts (`docker-compose.yaml`). It can be tedious to follow this flow. A quick outline:

* Values are passed to Maven using the `-D<var>=<value>` syntax.
* Maven, via `exec-maven-plugin`, sets environment variables, typically from Maven's own variables for things like versions.
* `exec-maven-plugin` invokes a script to do the needed shell fiddling. The environment variables are visible to the script and implicitly passed to whatever the script calls.
* The `Dockerfile` typically passes the build arguments to the image as environment variables of the same name.

If you find you need a new parameter in either the Docker build or the Docker Compose configuration:

* Add it to the relevant `exec-maven-plugin` sections.

The easiest way to test is to insert (or enable, or view) a command that prints the environment in the image:

    env

The output will typically go into the Docker output or the Docker Compose logs.
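
The flow from environment variable to build argument can be sketched as a small helper that a build script might use. This is an illustration under stated assumptions, not the project's actual script: the variable name `MY_PARAM`, the image tag, and the build context directory are all hypothetical.

```shell
#!/usr/bin/env bash
# Sketch of the flow: environment variable -> script -> Docker build argument.
# ASSUMPTIONS: the variable name, image tag, and build context passed to
# docker_build_cmd are hypothetical stand-ins for whatever your build defines.

# Print the docker build command that forwards an exported environment
# variable as a build argument of the same name.
# Fails if the variable is unset or empty.
# $1: variable name   $2: image tag   $3: Docker build context directory
docker_build_cmd() {
  local name="$1" tag="$2" context="$3" value
  # printenv only sees exported variables, which matches how values set by
  # the calling process reach a script.
  value="$(printenv "$name" || true)"
  if [ -z "$value" ]; then
    echo "error: $name is not set" >&2
    return 1
  fi
  echo "docker build --build-arg ${name}=${value} -t ${tag} ${context}"
}
```

For example, after `export MY_PARAM=1.2.3`, calling `docker_build_cmd MY_PARAM example/test-image:latest .` prints a `docker build` command carrying `--build-arg MY_PARAM=1.2.3`. Inside the `Dockerfile`, an `ARG MY_PARAM` line followed by `ENV MY_PARAM=$MY_PARAM` would then make the value visible to `env` in the image.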

Since Druid's `pom.xml` file is quite large, Maven can be a bit slow when all you want to do is to build the Docker image. To speed things up a bit, you can build just the Docker image. See the Quickstart for how to run tests this way. Using this trick, creating an image, or launching a cluster, is quite fast. See the Docker section for details.