integration-tests-ex/docs/debugging.md - druid - Git at Google

 <!--
   ~ Licensed to the Apache Software Foundation (ASF) under one
   ~ or more contributor license agreements.  See the NOTICE file
   ~ distributed with this work for additional information
   ~ regarding copyright ownership.  The ASF licenses this file
   ~ to you under the Apache License, Version 2.0 (the
   ~ "License"); you may not use this file except in compliance
   ~ with the License.  You may obtain a copy of the License at
   ~
   ~   http://www.apache.org/licenses/LICENSE-2.0
   ~
   ~ Unless required by applicable law or agreed to in writing,
   ~ software distributed under the License is distributed on an
   ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
   ~ KIND, either express or implied.  See the License for the
   ~ specific language governing permissions and limitations
   ~ under the License.
   -->

 # Debugging the Druid Image and Integration Tests

 The integration setup has as a primary goal the ability to quickly debug
 the Druid image and any individual tests. A first step is to move the
 image build into a separate project. A second step is to ensure each
 test can run in JUnit in an IDE against a cluster you start by hand.

 This section discusses how to use the various debugging features.

 See:

 * [Docker Configuration](docker.md) for information on debugging
   docker builds.

 ## General Debug Process

 Ease of debugging is a key goal of the revised structure.

 * Rebuild the Docker image only when the Druid code changes.
   Do a normal distribution build, then build a new image.
 * Reuse the same image over and over if you only change tests
   (such as when adding a new test.)
 * Reuse the same `shared` directory when the test does not
   make permanent changes.
 * Change Druid configuration by changing the Docker compose
   files, no need to rebuild the image.
 * Work primarily in the IDE when debugging tests.
 * To add more logging, change the `log4j2.xml` file in the shared
   directory to increase the logging level.
 * Remote debug Druid services if needed.

 ## Exploring the Test Cluster

 When run in Docker Compose, the endpoints known to Druid nodes differ from
 those needed by a client sitting outside the cluster. We could provide an
 explicit mapping. Better is to use the
 [Router](https://druid.apache.org/docs/latest/design/router.html#router-as-management-proxy)
 to proxy requests. Fortunately, the Druid Console already does this.

 ## Docker Build Output

 Modern Docker seems to hide the output of commands, which is a hassle to debug
 a build. Oddly, the details appear for a failed build, but not for success.
 Use the followig to see at least some output:

 ```bash
 export DOCKER_BUILDKIT=0
 ```

 Once the base container is built, you can run it, log in and poke around. First
 identify the name. See the last line of the container build:

 ```text
 Successfully tagged org.apache.druid/test:<version>
 ```

 Or ask Docker:

 ```bash
 docker images
 ```

 ## Debug the Docker Image

 You can log into the Docker image and poke around to see what's what:


 ```bash
 docker run --rm -it --entrypoint bash org.apache.druid/test:<version>
 ```

 Quite a few environment variables are provided by Docker and the setup scripts
 to see them, within the container, use:

 ```bash
 env
 ```

 ## Debug an Integration Test

 To debug an integration test, you need a Docker image with the latest Druid.
 To get that, you need a full Druid build. So, we break the debugging process
 down into steps that depend on the state of your code. Assume `DRUID_DEV`
 points to your Druid development area.

 ### On Each Druid Build

 If you need to rebuild Druid (because you fixed something), do:

 * Do a distribution build of Druid:
 * Build the test image.

 See [quickstart](quickstart.md) for the commands.

 ### Start the Test Cluster

 * Pick a test "group" to use.
 * Start a test cluster configured for this test.
 * Run a test from the command line:

 Again, see [quickstart](quickstart.md) for the commands.

 ### Debug the Test

 To run from your IDE, find the test to run and run it as a JUnit test (with the
 cluster up.)

 Depending on the test, you may be able to run the test over and over against the
 same cluster. (In fact, you should try to design your tests so that this is true:
 clean up after each run.)

 The tests are just plain old JUnit tests that happen to reach out to the
 test cluster and/or Docker to do their work. You can set breakpoints and debug
 in the usual way.

 Each test will first verify that the cluster is fully up before it starts, so
 you can launch your debug session immediately after starting the cluster: the tests
 will wait as needed.

 ### Stop the Test Cluster

 When done, stop the cluster: [quickstart](quickstart.md) again for details.

 ## Typical Issues

 For the most part, you can stop and restart the Druid services as often
 as you like and Druid will just work. There are a few combinations that
 can lead to trouble, however.

 * Services won't start: When doing a new build, stop the existing cluster
   before doing the build. The build removes and rebuilds the shared
   directory: services can't survive that.
 * Metastore failure: The metastore container will recreate the DB on
   each restart. This will fail if your shared directory already contains
   a DB. Do a `rm -r target/<category>/db` before restarting the DB container.
 * Coordinator fails with DB errors. The Coordinator will create the Druid
   tables when it starts. This means the DB has to be created. If the DB
   is removed after the Coordinator starts (to fix the above issue, say)
   then you have to restart the Coordinator so it can create the needed
   tables.
	<!--
	~ Licensed to the Apache Software Foundation (ASF) under one
	~ or more contributor license agreements. See the NOTICE file
	~ distributed with this work for additional information
	~ regarding copyright ownership. The ASF licenses this file
	~ to you under the Apache License, Version 2.0 (the
	~ "License"); you may not use this file except in compliance
	~ with the License. You may obtain a copy of the License at
	~
	~ http://www.apache.org/licenses/LICENSE-2.0
	~
	~ Unless required by applicable law or agreed to in writing,
	~ software distributed under the License is distributed on an
	~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	~ KIND, either express or implied. See the License for the
	~ specific language governing permissions and limitations
	~ under the License.
	-->

	# Debugging the Druid Image and Integration Tests

	The integration setup has as a primary goal the ability to quickly debug
	the Druid image and any individual tests. A first step is to move the
	image build into a separate project. A second step is to ensure each
	test can run in JUnit in an IDE against a cluster you start by hand.

	This section discusses how to use the various debugging features.

	See:

	* [Docker Configuration](docker.md) for information on debugging
	docker builds.

	## General Debug Process

	Ease of debugging is a key goal of the revised structure.

	* Rebuild the Docker image only when the Druid code changes.
	Do a normal distribution build, then build a new image.
	* Reuse the same image over and over if you only change tests
	(such as when adding a new test.)
	* Reuse the same `shared` directory when the test does not
	make permanent changes.
	* Change Druid configuration by changing the Docker compose
	files, no need to rebuild the image.
	* Work primarily in the IDE when debugging tests.
	* To add more logging, change the `log4j2.xml` file in the shared
	directory to increase the logging level.
	* Remote debug Druid services if needed.

	## Exploring the Test Cluster

	When run in Docker Compose, the endpoints known to Druid nodes differ from
	those needed by a client sitting outside the cluster. We could provide an
	explicit mapping. Better is to use the
	[Router](https://druid.apache.org/docs/latest/design/router.html#router-as-management-proxy)
	to proxy requests. Fortunately, the Druid Console already does this.

	## Docker Build Output

	Modern Docker seems to hide the output of commands, which is a hassle to debug
	a build. Oddly, the details appear for a failed build, but not for success.
	Use the followig to see at least some output:

	```bash
	export DOCKER_BUILDKIT=0
	```

	Once the base container is built, you can run it, log in and poke around. First
	identify the name. See the last line of the container build:

	```text
	Successfully tagged org.apache.druid/test:<version>
	```

	Or ask Docker:

	```bash
	docker images
	```

	## Debug the Docker Image

	You can log into the Docker image and poke around to see what's what:


	```bash
	docker run --rm -it --entrypoint bash org.apache.druid/test:<version>
	```

	Quite a few environment variables are provided by Docker and the setup scripts
	to see them, within the container, use:

	```bash
	env
	```

	## Debug an Integration Test

	To debug an integration test, you need a Docker image with the latest Druid.
	To get that, you need a full Druid build. So, we break the debugging process
	down into steps that depend on the state of your code. Assume `DRUID_DEV`
	points to your Druid development area.

	### On Each Druid Build

	If you need to rebuild Druid (because you fixed something), do:

	* Do a distribution build of Druid:
	* Build the test image.

	See [quickstart](quickstart.md) for the commands.

	### Start the Test Cluster

	* Pick a test "group" to use.
	* Start a test cluster configured for this test.
	* Run a test from the command line:

	Again, see [quickstart](quickstart.md) for the commands.

	### Debug the Test

	To run from your IDE, find the test to run and run it as a JUnit test (with the
	cluster up.)

	Depending on the test, you may be able to run the test over and over against the
	same cluster. (In fact, you should try to design your tests so that this is true:
	clean up after each run.)

	The tests are just plain old JUnit tests that happen to reach out to the
	test cluster and/or Docker to do their work. You can set breakpoints and debug
	in the usual way.

	Each test will first verify that the cluster is fully up before it starts, so
	you can launch your debug session immediately after starting the cluster: the tests
	will wait as needed.

	### Stop the Test Cluster

	When done, stop the cluster: [quickstart](quickstart.md) again for details.

	## Typical Issues

	For the most part, you can stop and restart the Druid services as often
	as you like and Druid will just work. There are a few combinations that
	can lead to trouble, however.

	* Services won't start: When doing a new build, stop the existing cluster
	before doing the build. The build removes and rebuilds the shared
	directory: services can't survive that.
	* Metastore failure: The metastore container will recreate the DB on
	each restart. This will fail if your shared directory already contains
	a DB. Do a `rm -r target/<category>/db` before restarting the DB container.
	* Coordinator fails with DB errors. The Coordinator will create the Druid
	tables when it starts. This means the DB has to be created. If the DB
	is removed after the Coordinator starts (to fix the above issue, say)
	then you have to restart the Coordinator so it can create the needed
	tables.