Following are the configuration changes required in the Druid conf folders:
Choose a folder name for the Druid deep storage, for example 'druid'.
Create the folder in HDFS under the required parent folder. For example:
hdfs dfs -mkdir /druid
OR
hdfs dfs -mkdir /apps/druid
Give the Druid processes the permissions needed to access this folder, so that Druid can create the necessary sub-folders, such as data and indexing_log, in HDFS. For example, if the Druid processes run as user 'root', then:
hdfs dfs -chown root:root /apps/druid
OR
hdfs dfs -chmod 777 /apps/druid
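The folder-creation and permission steps above can be sketched as one shell session (the /apps/druid path and the 'root' user follow the examples above; adjust both for your environment):

```shell
# Create the deep-storage folder in HDFS (-p creates parent folders as needed).
hdfs dfs -mkdir -p /apps/druid

# Either give ownership to the user running the Druid processes...
hdfs dfs -chown root:root /apps/druid
# ...or open up the permissions (simpler, but less secure).
hdfs dfs -chmod 777 /apps/druid

# Verify the folder and its permissions.
hdfs dfs -ls /apps
```

These commands require a running HDFS cluster and a client configured to reach it.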
Druid creates the necessary sub-folders under this newly created folder to store data and indexes.
Edit conf/druid/_common/common.runtime.properties to include the HDFS properties. The folder locations below are the same as the ones used in the examples above.
#
# Deep storage
#
# For HDFS:
druid.storage.type=hdfs
druid.storage.storageDirectory=/druid/segments
# OR
# druid.storage.storageDirectory=/apps/druid/segments

#
# Indexing service logs
#
# For HDFS:
druid.indexer.logs.type=hdfs
druid.indexer.logs.directory=/druid/indexing-logs
# OR
# druid.indexer.logs.directory=/apps/druid/indexing-logs
Note: Comment out the Local storage and S3 storage parameters in the file.
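As a quick sanity check after editing the file, a small shell function can confirm that the HDFS keys are active (uncommented) and that local storage is no longer enabled. This is a sketch; the helper name check_hdfs_props is ours, not part of Druid:

```shell
#!/bin/sh
# Sketch: verify that a common.runtime.properties file has the HDFS
# deep-storage settings described above, and that local storage is not
# still active. Returns 0 on success, non-zero otherwise.
check_hdfs_props() {
  f="$1"
  grep -q '^druid\.storage\.type=hdfs' "$f" &&
  grep -q '^druid\.storage\.storageDirectory=' "$f" &&
  grep -q '^druid\.indexer\.logs\.type=hdfs' "$f" &&
  grep -q '^druid\.indexer\.logs\.directory=' "$f" &&
  ! grep -q '^druid\.storage\.type=local' "$f"
}

# Example:
# check_hdfs_props conf/druid/_common/common.runtime.properties && echo "OK"
```

The anchored `^` patterns deliberately skip commented-out lines, so a `# druid.storage.type=local` left in the file does not trigger a false alarm.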
Also include the druid-hdfs-storage core extension in conf/druid/_common/common.runtime.properties:
#
# Extensions
#
druid.extensions.directory=dist/druid/extensions
druid.extensions.hadoopDependenciesDir=dist/druid/hadoop-dependencies
druid.extensions.loadList=["mysql-metadata-storage", "druid-hdfs-storage", "druid-kerberos"]
Ensure that Druid has the necessary jars to support the Hadoop version in use. Find the Hadoop version using the command: hadoop version
If other software is used alongside Hadoop, such as WANdisco, ensure that druid.extensions.loadList in conf/druid/_common/common.runtime.properties is updated accordingly.
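Druid ships a pull-deps tool that can fetch Hadoop client jars matching the cluster's version. A sketch (the Main class lives in org.apache.druid.cli in recent releases and io.druid.cli in older ones; the 2.8.5 version string is only an example, substitute the output of hadoop version):

```shell
# Check the Hadoop version on the cluster.
hadoop version

# From the Druid install directory, fetch matching Hadoop client jars
# into the hadoop-dependencies/ folder (coordinates are an example).
java -cp "lib/*" org.apache.druid.cli.Main tools pull-deps \
  -h "org.apache.hadoop:hadoop-client:2.8.5"
```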
Create a headless keytab that has access to the Druid data and index folders.
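One way to create and inspect such a keytab, sketched with MIT Kerberos tooling (the principal and keytab path follow the example below; your KDC setup may differ):

```shell
# On the KDC host, export the principal's keys to a keytab file.
kadmin.local -q "xst -k /etc/security/keytabs/hdfs.headless.keytab hdfs-test@EXAMPLE.IO"

# Copy the keytab to the Druid hosts, then verify its contents.
klist -kt /etc/security/keytabs/hdfs.headless.keytab
```

Make sure the keytab file is readable by the user running the Druid processes, and by no one else.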
Edit conf/druid/_common/common.runtime.properties and add the following properties:
druid.hadoop.security.kerberos.principal
druid.hadoop.security.kerberos.keytab
For example:
druid.hadoop.security.kerberos.principal=hdfs-test@EXAMPLE.IO
druid.hadoop.security.kerberos.keytab=/etc/security/keytabs/hdfs.headless.keytab
With the above changes in place, restart Druid. This ensures that Druid works with Kerberized Hadoop.
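Before restarting, it can be worth confirming that the keytab actually grants HDFS access (a sketch, using the example principal and deep-storage folder from above):

```shell
# Authenticate as the Druid principal using the headless keytab...
kinit -kt /etc/security/keytabs/hdfs.headless.keytab hdfs-test@EXAMPLE.IO

# ...and confirm that it can list the deep-storage folder.
hdfs dfs -ls /druid
```

If the listing fails with a permission error, revisit the chown/chmod step above; if it fails with an authentication error, revisit the keytab and principal.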