This document discusses the design, implementation and use of Slider to deploy secure applications on a secure Hadoop cluster.
This document does not cover Kerberos, how to secure a Hadoop cluster, Kerberos command line tools or how Hadoop uses delegation tokens to delegate permissions round a cluster. These are assumed, though some links to useful pages are listed at the bottom.
Slider runs in secure clusters, but with restrictions.
The directory `~/.slider/clusters/$name/data` must be writable by HBase. The user must have run `kinit` or an equivalent command to authenticate with Kerberos and gain a (time-bounded) TGT. The Slider Client will talk to HDFS and YARN, authenticating itself with the TGT and addressing the YARN and HDFS principals which it has been configured to expect.
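As a concrete sketch of that prerequisite (the principal name and realm here are placeholders):

```
# Obtain a time-bounded TGT for the user's own principal (example name)
kinit user@LOCAL
# Verify that the TGT was granted and check its expiry time
klist
```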
This can be done as described in [Client Configuration](client-configuration.html), on the command line as:

```
-D yarn.resourcemanager.principal=yarn/master@LOCAL -D dfs.namenode.kerberos.principal=hdfs/master@LOCAL
```
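Putting this together, a secure-cluster invocation might look like the following sketch (the `list` subcommand, the realm `LOCAL` and the host `master` are illustrative values, not prescribed ones):

```
slider list \
  -D yarn.resourcemanager.principal=yarn/master@LOCAL \
  -D dfs.namenode.kerberos.principal=hdfs/master@LOCAL
```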
The Slider Client will create the cluster data directory in HDFS with `rwx` permissions for the user, `r-x` for the group, and `---` for others (these can be made configurable as part of the cluster options).
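With those defaults the directory should list as `rwxr-x---` (octal 750); one way to inspect it (path illustrative):

```
# List the instance directories; expect rwx for the owner, r-x for the group,
# and no access for others
hdfs dfs -ls ~/.slider/clusters
```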
It will then deploy the AM, which will (somehow? for how long?) retain the access rights of the user that created the cluster.
The Application Master will read in the JSON cluster specification file, and instantiate the relevant number of components.
When the AM is deployed in a secure cluster, it automatically uses Kerberos-authorized RPC channels. The client must acquire a token to talk to the AM. This token is provided by the YARN Resource Manager when the client application wishes to talk with the Slider AM, and it is only issued after the caller authenticates itself as the user that has access rights to the cluster.
To let the client freeze a Slider application instance when it is unable to acquire a token to authenticate with the AM, use the `--force` option.
Slider can be placed into secure mode by setting the Hadoop security options:
This can be done in `slider-client.xml`:

```xml
<property>
  <name>hadoop.security.authorization</name>
  <value>true</value>
</property>

<property>
  <name>hadoop.security.authentication</name>
  <value>kerberos</value>
</property>
```
Or it can be done on the command line:

```
-D hadoop.security.authorization=true -D hadoop.security.authentication=kerberos
```
The Java Kerberos library needs to know the Kerberos controller and realm to use. This should happen automatically if this is set up as the default Kerberos binding (on a Unix system this is done in `/etc/krb5.conf`).
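A minimal `/etc/krb5.conf` sketch, using the realm `LOCAL` and the KDC host `hadoop-kdc` as placeholder values:

```
[libdefaults]
  default_realm = LOCAL

[realms]
  LOCAL = {
    kdc = hadoop-kdc
    admin_server = hadoop-kdc
  }
```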
If it is not set up, a stack trace with Kerberos classes at the top and the message `java.lang.IllegalArgumentException: Can't get Kerberos realm` will be printed, and the client will then fail.
The realm and controller can be defined in the Java system properties `java.security.krb5.realm` and `java.security.krb5.kdc`. These can be fixed in the JVM options, as described in the [Client Configuration](client-configuration.html) documentation.
They can also be set on the Slider command line itself, using the `-S` parameter:

```
-S java.security.krb5.realm=MINICLUSTER -S java.security.krb5.kdc=hadoop-kdc
```
When trying to talk to a secure cluster, you may see the message:

```
No valid credentials provided (Mechanism level: Illegal key size)
```
This means that the JRE does not have the extended cryptography package (the JCE Unlimited Strength Jurisdiction Policy Files) needed to work with the keys that Kerberos requires. This must be downloaded from Oracle (or the supplier of your JVM) and installed according to its accompanying instructions.
On OS X, Kerberos tickets can be inspected with the Ticket Viewer application at `/System/Library/CoreServices/Ticket\ Viewer.app`.