The integration tests are built and run as part of Druid's Maven script. Maven is used both by hand and as part of the GHA build process. Running integration tests in Maven is a multi-part process.
The IT modules appear in the root `pom.xml` file after the `distribution` project, which builds the Druid tarball. Travis orchestrates the above process to run the ITs in parallel. When you run tests locally, you do the above steps one by one. You can, of course, reuse the same distribution for multiple image builds, and the same image for multiple test runs.
Use the following command to run the ITs, assuming DRUID_DEV points to your Druid development directory:
```bash
cd $DRUID_DEV
mvn clean package -P dist,test-image,skip-static-checks \
    -Dmaven.javadoc.skip=true -DskipUTs=true
```
The various pieces are:
* `clean`: Remove any existing artifacts, and any existing Docker image.
* `install`: Build the Druid code and write it to the local Maven repo.
* `-P dist`: Create the Druid distribution tarball by pulling jars from the local Maven repo.
* `-P test-image`: Build the Docker images by grabbing the Druid tarball and pulling additional dependencies into the local repo, then staging them for Docker.

Once you've done the above once, you can do just the specific part you want to repeat during development. See below for details.
See the quickstart for how to run the two steps separately.
Each pass through Maven runs a single test category. Running a test category has three parts, spelled out in Maven:

* Launch the test cluster for the category.
* Run the tests in the category.
* Shut down the test cluster.
Again, see the quickstart for how to run the three steps separately, and how to run the tests in an IDE.
To do the task via Maven:
```bash
cd $DRUID_DEV
mvn verify -P docker-tests,skip-static-checks,IT-<category> \
    -Dmaven.javadoc.skip=true -DskipUTs=true
```
The various pieces are:
* `verify`: Run the steps up to the one that checks the output of the ITs. Because of the extra cluster step in an IT, the build does not fail if an IT fails. Instead, it continues on to clean up the cluster, and only after that does it check test success in the `verify` step.
* `<category>`: The name of the test category as listed in tests (see the example below).

The revised integration tests use the [Maven failsafe plugin](https://maven.apache.org/surefire/maven-failsafe-plugin/), which shares code with the Maven Surefire plugin used to run unit tests. Failsafe handles the special steps unique to integration tests.
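For example, to run just one category (the name `HighAvailability` below is only an illustration; substitute any category defined in the test code):

```bash
# Example run of a single test category; replace HighAvailability with your category.
cd $DRUID_DEV
mvn verify -P docker-tests,skip-static-checks,IT-HighAvailability \
    -Dmaven.javadoc.skip=true -DskipUTs=true
```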
Since we use JUnit categories, we must use the `surefire-junit47` provider. If the provider is omitted, Maven seems to fall back to the TestNG provider. Using just the `surefire-junit4` provider causes Surefire to ignore categories.
One shared item is the `skipTests` flag. Via a bit of Maven config creativity, we define extra properties to control the two kinds of tests:

* `-DskipUTs=true` to skip Surefire (unit) tests.
* `-P docker-tests` to enable Failsafe (integration) tests.
* `-DskipTests=true` to skip all tests.
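As a quick sketch of how these flags combine (the commands simply restate the flags above; add profiles such as `skip-static-checks` as needed):

```bash
# Run one IT category while skipping the unit tests.
mvn verify -P docker-tests,IT-<category> -DskipUTs=true

# Build everything while skipping both unit and integration tests.
mvn install -DskipTests=true
```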
The key modules in the above flow include:

* `distribution`: Builds the Druid tarball which the ITs exercise.

The IT process resides in the `integration-tests-ex` folder and consists of three Maven modules:

* `druid-it-tools`: Testing extensions added to the Docker image.
* `druid-it-image`: Builds the Druid test Docker image.
* `druid-integration-test-cases`: The code for all the ITs, along with the supporting framework.

The annoying “druid” prefix makes it easier to separate Apache Druid modules from user-specific modules when users extend Druid.
It turns out that, for obscure reasons, we must use a “flat” module structure under the root Druid `pom.xml`, even though the modules themselves are in folders within `integration-tests-ex`. That is, we cannot create an `integration-tests-ex` Maven module to hold the IT modules. The reason is that Maven has no good way to obtain the directory that contains the root `pom.xml` file, yet we need this directory to find configuration files for some of the static checking tools. Though there is a directory plugin that looks like it should work, we then run into another issue: if we invoke a goal directly from the `mvn` command line, Maven happily ignores the `validate` and `initialize` phases where we'd set the directory path. By using a two-level module structure, we can punt and just always assume that `${project.parent.basedir}` points to the root Druid directory. More than you wanted to know, but now you know why there is no `integration-tests-ex` module as there should be.
As a result, you may have to hunt in your IDE to find the non-project files in the integration-tests-ex directory. Look in the root druid project, in the integration-tests-ex folder.
Because of this limitation, all the test code is in one Maven module. When we tried to create a separate module per category, we ended up with a great deal of redundancy since we could not have a common parent module. Putting all the code in one module means we can only run one test category per Maven run, which is actually fine because that's how Travis runs tests anyway.
## `org.apache.druid.testsEx` Package

The `org.apache.druid.testsEx` package is temporary: it holds code from the `integration-tests` `org.apache.druid.testing` package, adapted to work in the revised environment. Some classes have the same name in both places. The goal is to merge the `testsEx` package back into `testing` at some future point when the tests are all upgraded.
The revised ITs themselves are also in this package. Over time, as we replace the old ITs, the tests can migrate to the original package names.
Integration test artifacts are built only if you specifically request them using a profile.
* `-P test-image` builds the test Docker image.
* `-P docker-tests` enables the integration tests.
* `-P IT-<category>` selects the category to run.

The profiles allow you to build the test image once during debugging, and reuse it across multiple test runs. (See Debugging.)
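A sketch of that workflow (the exact goals and any module-selection flags are assumptions here; the Quickstart has the authoritative commands):

```bash
# Build (or rebuild) the test image once; it reuses the distribution already
# in the local Maven repo.
mvn install -P test-image,skip-static-checks \
    -Dmaven.javadoc.skip=true -DskipUTs=true

# Then run one or more categories against that image, as many times as needed.
mvn verify -P docker-tests,skip-static-checks,IT-<category> \
    -Dmaven.javadoc.skip=true -DskipUTs=true
```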
The Docker image includes three third-party dependencies not included in the Druid build:
We use dependency rules in the `test-image/pom.xml` file to cause Maven to download these dependencies into the Maven cache, then we use the `maven-dependency-plugin` to copy them into a Docker directory, and we use Docker to copy the files into the image. This approach avoids the need to pull the dependencies from a remote repository into the image directly, which both speeds up the build and is kinder to the upstream repositories.
If you add additional dependencies, please follow the above process. See the pom.xml files for examples.
The build environment uses environment variables to pass information to Maven. Maven communicates with Docker and Docker Compose via environment variables set in the `exec-maven-plugin` of various `pom.xml` files. The environment variables then flow into either the Docker build script (`Dockerfile`) or the various Docker Compose scripts (`docker-compose.yaml`). It can be tedious to follow this flow. A quick outline:
* Values can be set on the Maven command line using the `-D<var>=<value>` syntax.
* The `exec-maven-plugin` sets environment variables, typically from Maven's own variables, for things like versions.
* The `exec-maven-plugin` invokes a script to do the needed shell fiddling. The environment variables are visible to the script and are implicitly passed to whatever the script calls.
* The build script passes selected variables on to Docker as build arguments.
* The `Dockerfile` typically passes the build arguments to the image as environment variables of the same name.

If you find you need a new parameter in either the Docker build or the Docker Compose configuration, thread it through this same chain:
start with the relevant `exec-maven-plugin` sections, then pass the value along to the scripts and on to the `Dockerfile` or Compose files as needed.

The easiest way to test is to insert (or enable, or view) the environment in the image:
```bash
env
```
The output will typically go into the Docker output or the Docker Compose logs.
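You can also inspect a running test container directly. A small sketch (the container name is a placeholder; actual names depend on the Docker Compose configuration in use):

```bash
# List the running test containers, then dump one container's environment,
# filtering for anything that mentions "druid" (adjust the pattern as needed).
docker ps
docker exec <container-name> env | grep -i druid
```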
Since Druid's `pom.xml` file is quite large, Maven can be a bit slow when all you want to do is build the Docker image. To speed things up, you can build just the Docker image. See the Quickstart for how to run tests this way. Using this trick, creating an image or launching a cluster is quite fast. See the Docker section for details.
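A minimal sketch of that shortcut, assuming the image module lives in an `image` folder under `integration-tests-ex` (treat the path as an assumption; the Quickstart has the authoritative command):

```bash
# Build only the test image module, skipping unit tests and javadoc.
# Requires that the full build described above has been run at least once.
cd $DRUID_DEV/integration-tests-ex/image
mvn install -P test-image,skip-static-checks \
    -Dmaven.javadoc.skip=true -DskipUTs=true
```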