| automatic single-machine configuration as described in the <a href="/docs/latest/operations/single-server">quickstart</a>.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="install-docker">Install Docker<a href="#install-docker" class="hash-link" aria-label="Direct link to Install Docker" title="Direct link to Install Docker"></a></h2><p>This tutorial requires <a href="https://docs.docker.com/install/" target="_blank" rel="noopener noreferrer">Docker</a> to be installed on the tutorial machine.</p><p>Once the Docker install is complete, please proceed to the next steps in the tutorial.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="build-the-hadoop-docker-image">Build the Hadoop docker image<a href="#build-the-hadoop-docker-image" class="hash-link" aria-label="Direct link to Build the Hadoop docker image" title="Direct link to Build the Hadoop docker image"></a></h2><p>For this tutorial, we've provided a Dockerfile for a Hadoop 2.8.5 cluster, which we'll use to run the batch indexing task.</p><p>This Dockerfile and related files are located at <code>quickstart/tutorial/hadoop/docker</code>.</p><p>From the apache-druid-27.0.0 package root, run the following commands to build a Docker image named "druid-hadoop-demo" with version tag "2.8.5":</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token builtin class-name" style="color:rgb(255, 203, 107)">cd</span><span class="token plain"> quickstart/tutorial/hadoop/docker</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token function" style="color:rgb(130, 170, 255)">docker</span><span class="token plain"> build -t druid-hadoop-demo:2.8.5 </span><span 
class="token builtin class-name" style="color:rgb(255, 203, 107)">.</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><p>This will start building the Hadoop image. Once the image build is done, you should see the message <code>Successfully tagged druid-hadoop-demo:2.8.5</code> printed to the console.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="setup-the-hadoop-docker-cluster">Setup the Hadoop docker cluster<a href="#setup-the-hadoop-docker-cluster" class="hash-link" aria-label="Direct link to Setup the Hadoop docker cluster" title="Direct link to Setup the Hadoop docker cluster"></a></h2><h3 class="anchor anchorWithStickyNavbar_LWe7" id="create-temporary-shared-directory">Create temporary shared directory<a href="#create-temporary-shared-directory" class="hash-link" aria-label="Direct link to Create temporary shared directory" title="Direct link to Create temporary shared directory"></a></h3><p>We'll need a shared folder between the host and the Hadoop container for transferring some files.</p><p>Let's create some folders under <code>/tmp</code>. We will use these later when starting the Hadoop container:</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span 
class="token-line" style="color:#bfc7d5"><span class="token function" style="color:rgb(130, 170, 255)">mkdir</span><span class="token plain"> -p /tmp/shared</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token function" style="color:rgb(130, 170, 255)">mkdir</span><span class="token plain"> -p /tmp/shared/hadoop_xml</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h3 class="anchor anchorWithStickyNavbar_LWe7" id="configure-etchosts">Configure /etc/hosts<a href="#configure-etchosts" class="hash-link" aria-label="Direct link to Configure /etc/hosts" title="Direct link to Configure /etc/hosts"></a></h3><p>On the host machine, add the following entry to <code>/etc/hosts</code>:</p><div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">127.0.0.1 druid-hadoop-demo</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 
6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h3 class="anchor anchorWithStickyNavbar_LWe7" id="start-the-hadoop-container">Start the Hadoop container<a href="#start-the-hadoop-container" class="hash-link" aria-label="Direct link to Start the Hadoop container" title="Direct link to Start the Hadoop container"></a></h3><p>Once the <code>/tmp/shared</code> folder has been created and the <code>/etc/hosts</code> entry has been added, run the following command to start the Hadoop container.</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token function" style="color:rgb(130, 170, 255)">docker</span><span class="token plain"> run -it -h druid-hadoop-demo --name druid-hadoop-demo -p </span><span class="token number" style="color:rgb(247, 140, 108)">2049</span><span class="token plain">:2049 -p </span><span class="token number" style="color:rgb(247, 140, 108)">2122</span><span class="token plain">:2122 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8020</span><span class="token plain">:8020 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8021</span><span class="token plain">:8021 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8030</span><span class="token plain">:8030 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8031</span><span class="token plain">:8031 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8032</span><span 
class="token plain">:8032 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8033</span><span class="token plain">:8033 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8040</span><span class="token plain">:8040 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8042</span><span class="token plain">:8042 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8088</span><span class="token plain">:8088 -p </span><span class="token number" style="color:rgb(247, 140, 108)">8443</span><span class="token plain">:8443 -p </span><span class="token number" style="color:rgb(247, 140, 108)">9000</span><span class="token plain">:9000 -p </span><span class="token number" style="color:rgb(247, 140, 108)">10020</span><span class="token plain">:10020 -p </span><span class="token number" style="color:rgb(247, 140, 108)">19888</span><span class="token plain">:19888 -p </span><span class="token number" style="color:rgb(247, 140, 108)">34455</span><span class="token plain">:34455 -p </span><span class="token number" style="color:rgb(247, 140, 108)">49707</span><span class="token plain">:49707 -p </span><span class="token number" style="color:rgb(247, 140, 108)">50010</span><span class="token plain">:50010 -p </span><span class="token number" style="color:rgb(247, 140, 108)">50020</span><span class="token plain">:50020 -p </span><span class="token number" style="color:rgb(247, 140, 108)">50030</span><span class="token plain">:50030 -p </span><span class="token number" style="color:rgb(247, 140, 108)">50060</span><span class="token plain">:50060 -p </span><span class="token number" style="color:rgb(247, 140, 108)">50070</span><span class="token plain">:50070 -p </span><span class="token number" style="color:rgb(247, 140, 108)">50075</span><span class="token plain">:50075 -p </span><span class="token number" style="color:rgb(247, 140, 108)">50090</span><span class="token plain">:50090 -p </span><span class="token 
number" style="color:rgb(247, 140, 108)">51111</span><span class="token plain">:51111 -v /tmp/shared:/shared druid-hadoop-demo:2.8.5 /etc/bootstrap.sh -bash</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><p>Once the container is started, your terminal will attach to a bash shell running inside the container:</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">Starting sshd: </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain"> OK </span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token number" style="color:rgb(247, 140, 108)">18</span><span class="token plain">/07/26 </span><span class="token number" style="color:rgb(247, 140, 108)">17</span><span class="token plain">:27:15 WARN util.NativeCodeLoader: Unable to load native-hadoop library </span><span class="token keyword" style="font-style:italic">for</span><span class="token plain"> your platform</span><span class="token punctuation" 
style="color:rgb(199, 146, 234)">..</span><span class="token plain">. using builtin-java classes where applicable</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">Starting namenodes on </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token plain">druid-hadoop-demo</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">druid-hadoop-demo: starting namenode, logging to /usr/local/hadoop/logs/hadoop-root-namenode-druid-hadoop-demo.out</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">localhost: starting datanode, logging to /usr/local/hadoop/logs/hadoop-root-datanode-druid-hadoop-demo.out</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">Starting secondary namenodes </span><span class="token punctuation" style="color:rgb(199, 146, 234)">[</span><span class="token number" style="color:rgb(247, 140, 108)">0.0</span><span class="token plain">.0.0</span><span class="token punctuation" style="color:rgb(199, 146, 234)">]</span><span class="token plain"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token number" style="color:rgb(247, 140, 108)">0.0</span><span class="token plain">.0.0: starting secondarynamenode, logging to /usr/local/hadoop/logs/hadoop-root-secondarynamenode-druid-hadoop-demo.out</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token number" style="color:rgb(247, 140, 108)">18</span><span class="token plain">/07/26 </span><span class="token number" style="color:rgb(247, 140, 108)">17</span><span class="token plain">:27:31 WARN util.NativeCodeLoader: Unable to load native-hadoop library </span><span class="token keyword" 
style="font-style:italic">for</span><span class="token plain"> your platform</span><span class="token punctuation" style="color:rgb(199, 146, 234)">..</span><span class="token plain">. using builtin-java classes where applicable</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">starting </span><span class="token function" style="color:rgb(130, 170, 255)">yarn</span><span class="token plain"> daemons</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">starting resourcemanager, logging to /usr/local/hadoop/logs/yarn--resourcemanager-druid-hadoop-demo.out</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">localhost: starting nodemanager, logging to /usr/local/hadoop/logs/yarn-root-nodemanager-druid-hadoop-demo.out</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">starting historyserver, logging to /usr/local/hadoop/logs/mapred--historyserver-druid-hadoop-demo.out</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">bash-4.1</span><span class="token comment" style="color:rgb(105, 112, 152);font-style:italic">#</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><p>The <code>Unable to load native-hadoop library for your platform... 
using builtin-java classes where applicable</code> warning messages can be safely ignored.</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="accessing-the-hadoop-container-shell">Accessing the Hadoop container shell<a href="#accessing-the-hadoop-container-shell" class="hash-link" aria-label="Direct link to Accessing the Hadoop container shell" title="Direct link to Accessing the Hadoop container shell"></a></h4><p>To open another shell to the Hadoop container, run the following command:</p><div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">docker exec -it druid-hadoop-demo bash</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h3 class="anchor anchorWithStickyNavbar_LWe7" id="copy-input-data-to-the-hadoop-container">Copy input data to the Hadoop container<a href="#copy-input-data-to-the-hadoop-container" class="hash-link" aria-label="Direct link to Copy input data to the Hadoop container" title="Direct link to Copy input data to the Hadoop container"></a></h3><p>From the apache-druid-27.0.0 package root on the host, copy the <code>quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz</code> sample data to the shared 
folder:</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token function" style="color:rgb(130, 170, 255)">cp</span><span class="token plain"> quickstart/tutorial/wikiticker-2015-09-12-sampled.json.gz /tmp/shared/wikiticker-2015-09-12-sampled.json.gz</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h3 class="anchor anchorWithStickyNavbar_LWe7" id="setup-hdfs-directories">Setup HDFS directories<a href="#setup-hdfs-directories" class="hash-link" aria-label="Direct link to Setup HDFS directories" title="Direct link to Setup HDFS directories"></a></h3><p>In the Hadoop container's shell, run the following commands to set up the HDFS directories needed by this tutorial and copy the input data to HDFS.</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token builtin class-name" style="color:rgb(255, 203, 107)">cd</span><span class="token 
plain"> /usr/local/hadoop/bin</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -mkdir /druid</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -mkdir /druid/segments</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -mkdir /quickstart</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -chmod </span><span class="token number" style="color:rgb(247, 140, 108)">777</span><span class="token plain"> /druid</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -chmod </span><span class="token number" style="color:rgb(247, 140, 108)">777</span><span class="token plain"> /druid/segments</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -chmod </span><span class="token number" style="color:rgb(247, 140, 108)">777</span><span class="token plain"> /quickstart</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -chmod -R </span><span class="token number" style="color:rgb(247, 140, 108)">777</span><span class="token plain"> /tmp</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -chmod -R </span><span class="token number" style="color:rgb(247, 140, 108)">777</span><span class="token plain"> /user</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">./hdfs dfs -put /shared/wikiticker-2015-09-12-sampled.json.gz /quickstart/wikiticker-2015-09-12-sampled.json.gz</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" 
d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><p>If you encounter namenode errors when running these commands, the Hadoop container may not have finished initializing. If so, wait a couple of minutes and retry the commands.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="configure-druid-to-use-hadoop">Configure Druid to use Hadoop<a href="#configure-druid-to-use-hadoop" class="hash-link" aria-label="Direct link to Configure Druid to use Hadoop" title="Direct link to Configure Druid to use Hadoop"></a></h2><p>Some additional steps are needed to configure the Druid cluster for Hadoop batch indexing.</p><h3 class="anchor anchorWithStickyNavbar_LWe7" id="copy-hadoop-configuration-to-druid-classpath">Copy Hadoop configuration to Druid classpath<a href="#copy-hadoop-configuration-to-druid-classpath" class="hash-link" aria-label="Direct link to Copy Hadoop configuration to Druid classpath" title="Direct link to Copy Hadoop configuration to Druid classpath"></a></h3><p>From the Hadoop container's shell, run the following command to copy the Hadoop .xml configuration files to the shared folder:</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token function" style="color:rgb(130, 170, 255)">cp</span><span class="token plain"> /usr/local/hadoop/etc/hadoop/*.xml /shared/hadoop_xml</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to 
clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><p>From the host machine, run the following, where {PATH_TO_DRUID} is replaced by the path to the Druid package.</p><div class="language-bash codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-bash codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token function" style="color:rgb(130, 170, 255)">mkdir</span><span class="token plain"> -p </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain">PATH_TO_DRUID</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token plain">/conf/druid/single-server/micro-quickstart/_common/hadoop-xml</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"></span><span class="token function" style="color:rgb(130, 170, 255)">cp</span><span class="token plain"> /tmp/shared/hadoop_xml/*.xml </span><span class="token punctuation" style="color:rgb(199, 146, 234)">{</span><span class="token plain">PATH_TO_DRUID</span><span class="token punctuation" style="color:rgb(199, 146, 234)">}</span><span class="token plain">/conf/druid/single-server/micro-quickstart/_common/hadoop-xml/</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" 
class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h3 class="anchor anchorWithStickyNavbar_LWe7" id="update-druid-segment-and-log-storage">Update Druid segment and log storage<a href="#update-druid-segment-and-log-storage" class="hash-link" aria-label="Direct link to Update Druid segment and log storage" title="Direct link to Update Druid segment and log storage"></a></h3><p>In your favorite text editor, open <code>conf/druid/auto/_common/common.runtime.properties</code>, and make the following edits:</p><h4 class="anchor anchorWithStickyNavbar_LWe7" id="disable-local-deep-storage-and-enable-hdfs-deep-storage">Disable local deep storage and enable HDFS deep storage<a href="#disable-local-deep-storage-and-enable-hdfs-deep-storage" class="hash-link" aria-label="Direct link to Disable local deep storage and enable HDFS deep storage" title="Direct link to Disable local deep storage and enable HDFS deep storage"></a></h4><div class="codeBlockContainer_Ckt0 theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">#</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Deep storage</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">#</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" 
style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># For local disk (only viable in a cluster if this is a network mount):</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">#druid.storage.type=local</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">#druid.storage.storageDirectory=var/druid/segments</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># For HDFS:</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">druid.storage.type=hdfs</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">druid.storage.storageDirectory=/druid/segments</span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h4 class="anchor anchorWithStickyNavbar_LWe7" id="disable-local-log-storage-and-enable-hdfs-log-storage">Disable local log storage and enable HDFS log storage<a href="#disable-local-log-storage-and-enable-hdfs-log-storage" class="hash-link" aria-label="Direct link to Disable local log storage and enable HDFS log storage" title="Direct link to Disable local log storage and enable HDFS log storage"></a></h4><div class="codeBlockContainer_Ckt0 
theme-code-block" style="--prism-color:#bfc7d5;--prism-background-color:#292d3e"><div class="codeBlockContent_biex"><pre tabindex="0" class="prism-code language-text codeBlock_bY9V thin-scrollbar"><code class="codeBlockLines_e6Vv"><span class="token-line" style="color:#bfc7d5"><span class="token plain">#</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># Indexing service logs</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">#</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># For local disk (only viable in a cluster if this is a network mount):</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">#druid.indexer.logs.type=file</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">#druid.indexer.logs.directory=var/druid/indexing-logs</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain"># For HDFS:</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">druid.indexer.logs.type=hdfs</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain">druid.indexer.logs.directory=/druid/indexing-logs</span><br></span><span class="token-line" style="color:#bfc7d5"><span class="token plain" style="display:inline-block"></span><br></span></code></pre><div class="buttonGroup__atx"><button type="button" aria-label="Copy code to clipboard" title="Copy" class="clean-btn"><span class="copyButtonIcons_eSgA" aria-hidden="true"><svg viewBox="0 0 24 24" class="copyButtonIcon_y97N"><path fill="currentColor" d="M19,21H8V7H19M19,5H8A2,2 0 0,0 6,7V21A2,2 0 0,0 
8,23H19A2,2 0 0,0 21,21V7A2,2 0 0,0 19,5M16,1H4A2,2 0 0,0 2,3V17H4V3H16V1Z"></path></svg><svg viewBox="0 0 24 24" class="copyButtonSuccessIcon_LjdS"><path fill="currentColor" d="M21,7L9,19L3.5,13.5L4.91,12.09L9,16.17L19.59,5.59L21,7Z"></path></svg></span></button></div></div></div><h3 class="anchor anchorWithStickyNavbar_LWe7" id="restart-druid-cluster">Restart Druid cluster<a href="#restart-druid-cluster" class="hash-link" aria-label="Direct link to Restart Druid cluster" title="Direct link to Restart Druid cluster"></a></h3><p>Once the Hadoop .xml files have been copied to the Druid cluster and the segment/log storage configuration has been updated to use HDFS, the Druid cluster needs to be restarted for the new configurations to take effect.</p><p>If the cluster is still running, press CTRL-C to terminate the <code>bin/start-druid</code> script, then re-run it to bring the Druid services back up.</p><h2 class="anchor anchorWithStickyNavbar_LWe7" id="load-batch-data">Load batch data<a href="#load-batch-data" class="hash-link" aria-label="Direct link to Load batch data" title="Direct link to Load batch data"></a></h2><p>We've included a sample of Wikipedia edits from September 12, 2015 to get you started.</p><p>To load this data into Druid, you can submit an <em>ingestion task</em> pointing to the file. We've included |