| --- | 
 | id: tutorial-kerberos-hadoop | 
 | title: "Configuring Apache Druid to use Kerberized Apache Hadoop as deep storage" | 
 | sidebar_label: "Kerberized HDFS deep storage" | 
 | --- | 
 |  | 
 | <!-- | 
 |   ~ Licensed to the Apache Software Foundation (ASF) under one | 
 |   ~ or more contributor license agreements.  See the NOTICE file | 
 |   ~ distributed with this work for additional information | 
 |   ~ regarding copyright ownership.  The ASF licenses this file | 
 |   ~ to you under the Apache License, Version 2.0 (the | 
 |   ~ "License"); you may not use this file except in compliance | 
 |   ~ with the License.  You may obtain a copy of the License at | 
 |   ~ | 
 |   ~   http://www.apache.org/licenses/LICENSE-2.0 | 
 |   ~ | 
 |   ~ Unless required by applicable law or agreed to in writing, | 
 |   ~ software distributed under the License is distributed on an | 
 |   ~ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY | 
 |   ~ KIND, either express or implied.  See the License for the | 
 |   ~ specific language governing permissions and limitations | 
 |   ~ under the License. | 
 |   --> | 
 |  | 
 |  | 
 | ## Hadoop Setup | 
 |  | 
 | Following are the configurations files required to be copied over to Druid conf folders: | 
 |  | 
 | 1. For HDFS as a deep storage, hdfs-site.xml, core-site.xml | 
 | 2. For ingestion, mapred-site.xml, yarn-site.xml | 
 |  | 
 | ### HDFS Folders and permissions | 
 |  | 
 | 1. Choose any folder name for the druid deep storage, for example 'druid' | 
 | 2. Create the folder in hdfs under the required parent folder. For example, | 
 | `hdfs dfs -mkdir /druid` | 
 | OR | 
 | `hdfs dfs -mkdir /apps/druid` | 
 |  | 
 | 3. Give druid processes appropriate permissions for the druid processes to access this folder. This would ensure that druid is able to create necessary folders like data and indexing_log in HDFS. | 
 | For example, if druid processes run as user 'root', then | 
 |  | 
 |     `hdfs dfs -chown root:root /apps/druid` | 
 |  | 
 |     OR | 
 |  | 
 |     `hdfs dfs -chmod 777 /apps/druid` | 
 |  | 
 | Druid creates necessary sub-folders to store data and index under this newly created folder. | 
 |  | 
 | ## Druid Setup | 
 |  | 
 | Edit common.runtime.properties at conf/druid/_common/common.runtime.properties to include the HDFS properties. Folders used for the location are same as the ones used for example above. | 
 |  | 
 | ### common.runtime.properties | 
 |  | 
 | ```properties | 
 | # Deep storage | 
 | # | 
 | # For HDFS: | 
 | druid.storage.type=hdfs | 
 | druid.storage.storageDirectory=/druid/segments | 
 | # OR | 
 | # druid.storage.storageDirectory=/apps/druid/segments | 
 |  | 
 | # | 
 | # Indexing service logs | 
 | # | 
 |  | 
 | # For HDFS: | 
 | druid.indexer.logs.type=hdfs | 
 | druid.indexer.logs.directory=/druid/indexing-logs | 
 | # OR | 
 | # druid.storage.storageDirectory=/apps/druid/indexing-logs | 
 | ``` | 
 |  | 
 | Note: Comment out Local storage and S3 Storage parameters in the file | 
 |  | 
 | Also include hdfs-storage core extension to `conf/druid/_common/common.runtime.properties` | 
 |  | 
 | ```properties | 
 | # | 
 | # Extensions | 
 | # | 
 |  | 
 | druid.extensions.directory=dist/druid/extensions | 
 | druid.extensions.hadoopDependenciesDir=dist/druid/hadoop-dependencies | 
 | druid.extensions.loadList=["mysql-metadata-storage", "druid-hdfs-storage", "druid-kerberos"] | 
 | ``` | 
 |  | 
 | ### Hadoop Jars | 
 |  | 
 | Ensure that Druid has necessary jars to support the Hadoop version. | 
 |  | 
 | Find the hadoop version using command, `hadoop version` | 
 |  | 
 | In case there is other software used with hadoop, like `WanDisco`, ensure that | 
 | 1. the necessary libraries are available | 
 | 2. add the requisite extensions to `druid.extensions.loadlist` in `conf/druid/_common/common.runtime.properties` | 
 |  | 
 | ### Kerberos setup | 
 |  | 
 | Create a headless keytab which would have access to the druid data and index. | 
 |  | 
 | Edit conf/druid/_common/common.runtime.properties and add the following properties: | 
 |  | 
 | ```properties | 
 | druid.hadoop.security.kerberos.principal | 
 | druid.hadoop.security.kerberos.keytab | 
 | ``` | 
 |  | 
 | For example | 
 |  | 
 | ```properties | 
 | druid.hadoop.security.kerberos.principal=hdfs-test@EXAMPLE.IO | 
 | druid.hadoop.security.kerberos.keytab=/etc/security/keytabs/hdfs.headless.keytab | 
 | ``` | 
 |  | 
 | ### Restart Druid Services | 
 |  | 
 | With the above changes, restart Druid. This would ensure that Druid works with Kerberized Hadoop |