This document covers how the client application is itself configured.
The client application can be configured
slider-client.xml
in the configuration directoryconf/
dirSLIDER_CONF_DIR
is set, in the file $SLIDER_CONF_DIR/slider-client.xml
log4j.properties
file in the same configuration directory.SLIDER_JVM_OPTS
The options defined in a Slider cluster configuration are only used by the client when creating a cluster -not for the actual client itself.
The Slider client needs to be configured to talk to a Hadoop filesystem and a YARN resource manager (“the RM”). In a secure cluster it needs to be told the Kerberos identity, the principal of both the HDFS namenode and the YARN RM -and it may also need some JVM options set in order for Java's Kerberos module to correctly identify itself to these services.
It cannot rely on local HADOOP_PREFIX/conf/hadoop-site.xml
and $YARN_PREFIX/conf/yarn-site.xml
files -because it is designed to work on client machines that may not have Hadoop and YARN installed.
Instead all client-side (non-JVM) options can be predefined in the configuration file slider-client.xml
.
Core JVM options can be set in the environment variable SLIDER_JVM_OPTS
; if unset the bin/slider
script will use the default values that were current when that version of Slider was released. These values may change across versions, and may in fact be.
At the time of writing, the default values were:
"-Djava.net.preferIPv4Stack=true -Djava.awt.headless=true -Xmx256m -Dslider.confdir=${confdir}"
To allow some Java system properties to be set without editing this environment variable, such system properties may be set on the Slider command line through the -S
parameter. For example, the following two operations are equivalent in terms of setting the system property java.security.krb5.realm
to the value LOCAL
.
export SLIDER_JVM_OPTS="-Djava.security.krb5.realm=LOCAL"
and
slider -S java.security.krb5.realm=LOCAL
Note that the first declaration invalidates all default JVM options; if any of those were desired, they should be included in the new definition.
Multiple system property declarations are allowed on the command line -including duplicate declarations. In such a case the order of assignment is undefined.
For any system property that the user expects to have to issue on every command -including any kerberos-related properties, adding them to the JVM options environment variable guarantees that they are always set.
-D
parameterThe slider client is configured via Hadoop-style configuration options. To be precise, all standard Hadoop-common, hadoop-hdfs client and hadoop-yar client-side options control how Slider communicates with the Hadoop YARN cluster.
There are extra options specific to Slider itself, options which are again set as Hadoop configuration parameters.
All Hadoop and Slider options can be set on the command line using the -D
parameter followed by the appropriate key=value
argument
For example, here is a definition of the default Hadoop filesystem:
-D fs.defaultFS=hdfs://namenode:9000
Multiple definitions are of course allowed on the command line
-D fs.defaultFS=hdfs://namenode:9000 -D dfs.namenode.kerberos.principal=hdfs/namenode@LOCAL
Slider-specific options can be made the same way
-D slider.kerberos.principal=
If duplicate declarations are made the order of assignment is undefined.
Some Hadoop and Slider options are so common that they have specific shortcut commands to aid their use
-m
, --manager
: sets the YARN resource manager. Equivalent to setting the yarn.resourcemanager.address
option
--fs
, --filesystem
: defines the filesystem. Equivalent to setting the fs.defaultFS
option
If these shortcuts are used and the options are also defined via -D
declarations, the order of assignment is undefined.
slider-client.xml
file.In the Slider installation, alongside the bin/slider
script is a configuration directory conf
. This contains the files:
log4j.properties
slider-client.xml
The log4j.properties
file is not covered here -it is a standard Log4J file. At the time of writing, this log configuration file is used on both the client and the server.
The slider-client.xml
file is a hadoop-formatted XML options file, which is read by the Slider client -but not by they Slider Application Master.
Here is an example file:
<property> <name>yarn.resourcemanager.address</name> <value>namenode:8033</value> </property> <property> <name>fs.defaultFS</name> <value>hdfs://namenode:9000</value> </property> <property> <name>ipc.client.fallback-to-simple-auth-allowed</name> <value>false</value> </property>
This defines both the filesystem and the YARN RM, and so obviates the need to declare either on the command line.
If an option is defined in the slider-client.xml
file and on the command line -be it by a -D key=value
declaration or a --manager
or --filesystem
definition. (this holds even if the value is declared with <final>true</final>
).
The environment variable SLIDER_CONF_DIR
can be used to declare an alternate configuration directory. If set, the directory it identifies will be used as the source of the log4j.properties
and slider-client.xml
files.
As well as standard YARN and Hadoop configuration options, Slider supports a limited number of slider-specific configuration parameters.
<property> <name>slider.zookeeper.quorum</name> <value>localhost:2181,zookeeper2:4545</value> </property> <property> <name>slider.yarn.queue</name> <value>default</value> </property> <property> <name>slider.security.enabled</name> <value>false</value> </property> <property> <name>slider.yarn.queue</name> <value>default</value> </property> <property> <name>slider.yarn.queue.priority</name> <value>1</value> </property> <property> <name>slider.yarn.restart.limit</name> <value>5</value> <description>How many times to start/restart the Slider AM</description> </property> <property> <name>slider.cluster.directory.permissions</name> <value>750</value> </property> <property> <name>slider.data.directory.permissions</name> <value>750</value> </property>
slider.zookeeper.quorum
- the zookeeper quorum.This defines the zookeeper quorum for this YARN cluster.
It is used to locate the service registry, enable running instances to publish information about their application, and for clients to query this.
It is also used as the default zookeeper binding for any application that uses zookeeper in its configuration -the value set when the application is defined will be copied into the instance definition file.
"slider.registry.path"
- the zookeeper path for the service registryThis declares the the zookeeper path for the service registry.
slider.security.enabled
- enable security.This turns security on; consult Security for more information.
slider.yarn.restart.limit
- set limit on Application Master RestartsThis limits how many times YARN should start a failed application master.
A short restart limit is useful when initially creating a cluster, as it ensures that YARN does not repeatedly try to restart a failing application.
In production, however, a large number prevents YARN from halting a Slider application merely because failures in the underlying YARN cluster have triggered restarts.
Important: The cluster-wide limit of yarn.resourcemanager.am.max-attempts
places an upper limit on the number of retries that any application can request. If the application fails after less restarts than requested, check this cluster setting.
slider.yarn.queue
- the name of the YARN queue for the cluster.This identifies the queue submit the application creation request to, which can define the priority, resource limits and other values of an application. All containers created in the Slider cluster will share this same queue.
Default value: default
.
slider.yarn.queue.priority
- the name of the YARN queue for the cluster.This identifies the priority within the queue. The lower the value, the higher the priority
Default value: 1
.
bin/slider thaw cl1 -D slider.yarn.queue.priority=5
slider.cluster.directory.permissions
An octal-format (chmod
-style) permissions mask for the directory that contains the cluster specification ${user.home}/.slider/clusters/${clustername}
<property> <name>slider.cluster.directory.permissions</name> <value>750</value> </property>
slider.data.directory.permissions
An octal-format (chmod
-style) permissions mask for the directory that contains the application data ${user.home}/.slider/clusters/${clustername}/database
<property> <name>slider.data.directory.permissions</name> <value>750</value> </property>
If the slider packages are set to log at debug level in the log4j configuration file, details on properties will be part of the copious output.
Apart from the filesystem bindings, Client configuration options are not passed down to the XML site specification of the created cluster.
The sole options passed down are the HDFS bindings: fs.defaultFS
, which is passed down both as that property and as fs.default.name
, and, in a secure cluster, the security flag (slider.security.enabled
) and the HDFS Kerberos principal.