In this document, a full path to a value is represented as a path such as options/zookeeper.port; an assignment as options/zookeeper.port=2181.
A wildcard indicates all entries matching a path: options/zookeeper.* or /roles/*/yarn.memory.
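The path and wildcard notation can be illustrated with a short sketch. This is not part of Slider: the helper names flatten and match are hypothetical, the cluster dictionary is a cut-down example, and it is assumed that a wildcard matches within a single path segment only.

```python
from fnmatch import fnmatchcase


def flatten(doc, prefix=""):
    """Yield (path, value) pairs for every leaf value in a nested dict."""
    for key, value in doc.items():
        path = f"{prefix}/{key}"
        if isinstance(value, dict):
            yield from flatten(value, path)
        else:
            yield path, value


def match(doc, pattern):
    """Return {path: value} for every leaf whose path matches the pattern.

    Assumption: '*' only matches inside one path segment, so the pattern
    and the path must have the same number of segments.
    """
    want = pattern.strip("/").split("/")
    found = {}
    for path, value in flatten(doc):
        have = path.strip("/").split("/")
        if len(have) == len(want) and all(
            fnmatchcase(h, w) for h, w in zip(have, want)
        ):
            found[path] = value
    return found


# Hypothetical, cut-down cluster description used only for illustration.
cluster = {
    "options": {
        "zookeeper.port": "2181",
        "zookeeper.hosts": "sandbox",
        "site.fs.defaultFS": "hdfs://sandbox:8020",
    },
    "roles": {
        "worker": {"yarn.memory": "768"},
        "slider": {"yarn.memory": "256"},
    },
}

print(match(cluster, "options/zookeeper.*"))
print(match(cluster, "/roles/*/yarn.memory"))
```

Keys that themselves contain a "/" would need extra handling; the sketch ignores that case.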
The specification of an application instance is defined in an application instance directory, ${user.home}/.slidera/clusters/${clustername}/cluster.json.
The cluster description is hierarchical, with standardized sections.
Different sections have one of three roles.
Storage and specification of internal properties used to define a cluster. These properties should not be modified by users; doing so is likely to render the cluster undeployable.
Storage and specification of the components deployed by Slider. These sections define options for the deployed application, the size of the deployed application, attributes of the deployed roles, and customizable aspects of the Slider application master.
This information defines the desired state of a cluster.
Users may edit these sections, either via the CLI or by directly editing the cluster.json file of a frozen cluster.
This information describes the actual state of a cluster.
Using a common format for both the specification and description of a cluster may be confusing, but it is designed to unify the logic needed to parse and process cluster descriptions. There is only one JSON file to parse; different sections are simply relevant at different times.
A Slider-deployed application consists of the single Slider application master and one or more roles: specific components in the actual application.
The /roles section contains a listing for each role, declaring the number of instances of each role desired, possibly along with some details defining the actual execution of the application.
The /statistics/roles/ section returns statistics on each role, while /instances has a per-role entry listing the YARN containers hosting instances.
The AM/application provider may generate information for use by client applications.
There are three ways to provide this
The root contains a limited number of key-value pairs:
version: string; required. The version of the JSON file, as an x.y.z version string.
name: string; required. The cluster name.
type: string; required. Reference to the provider type; this triggers a Hadoop configuration property lookup to find the implementation classes.
valid: boolean; required. Flag to indicate whether or not a specification is considered valid. If false, the rest of the document is in an unknown state.
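The required root fields lend themselves to a simple check when a cluster.json file is loaded. The following is a minimal sketch, not Slider's own validation: the function name check_root is hypothetical, and it is assumed that "x.y.z" means three dot-separated decimal numbers.

```python
import re

# Required root keys and the Python type each is expected to decode to.
REQUIRED = {"version": str, "name": str, "type": str, "valid": bool}


def check_root(doc):
    """Return a list of problems found in the root key-value pairs."""
    problems = []
    for key, expected in REQUIRED.items():
        if key not in doc:
            problems.append(f"missing required field: {key}")
        elif not isinstance(doc[key], expected):
            problems.append(f"{key} should be of type {expected.__name__}")
    version = doc.get("version")
    # Assumption: x.y.z is read as three dot-separated integers.
    if isinstance(version, str) and not re.fullmatch(r"\d+\.\d+\.\d+", version):
        problems.append("version is not an x.y.z string")
    return problems


# Hypothetical root values, for illustration only.
print(check_root({"version": "1.0.0", "name": "test", "type": "hbase", "valid": True}))
```

A document whose valid flag is false would pass this check; deciding what to do with the rest of such a document is left to the caller.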
/slider-internal: internal configuration
Stores internal configuration options. These parameters are not defined in this document.
/diagnostics: diagnostics sections
Persisted list of information about Slider.
Static information about the file history:
"diagnostics" : {
  "create.hadoop.deployed.info" : "(detached from release-2.3.0) @dfe46336fbc6a044bc124392ec06b85",
  "create.application.build.info" : "Slider Core-0.13.0-SNAPSHOT Built against commit# 1a94ee4aa1 on Java 1.7.0_45 by stevel",
  "create.hadoop.build.info" : "2.3.0",
  "create.time.millis" : "1393512091276"
},
This information is not intended to provide anything other than diagnostics to an application; the values and their meaning are not defined. All applications MUST be able to process an empty or absent /diagnostics section.
A persisted list of options used by Slider and its providers to build up the AM and the configurations of the deployed service components:
"options": {
  "slider.am.monitoring.enabled": "false",
  "slider.cluster.application.image.path": "hdfs://sandbox:8020/hbase.tar.gz",
  "slider.container.failure.threshold": "5",
  "slider.container.failure.shortlife": "60",
  "zookeeper.port": "2181",
  "zookeeper.path": "/yarnapps_slider_stevel_test_cluster_lifecycle",
  "zookeeper.hosts": "sandbox",
  "site.hbase.master.startup.retainassign": "true",
  "site.fs.defaultFS": "hdfs://sandbox:8020",
  "site.fs.default.name": "hdfs://sandbox:8020",
  "env.MALLOC_ARENA_MAX": "4",
  "site.hbase.master.info.port": "0",
  "site.hbase.regionserver.info.port": "0"
},
Many of the properties are automatically set by Slider when a cluster is constructed. They may be edited afterwards.
All option values MUST be strings.
slider.
All options that begin with slider. are intended for use by Slider and providers to configure the Slider application master itself, and the application. For example, slider.container.failure.threshold defines the number of times a container must fail before the role (and hence the cluster) is considered to have failed. As another example, the ZooKeeper bindings such as zookeeper.hosts are read by the HBase and Ambari providers, and used to modify the applications' site configurations with application-specific properties.
site.
These are properties that are expected to be propagated to an application's site configuration, if such a configuration is created. For HBase, the site file is hbase-site.xml; for Accumulo it is accumulo-site.xml.
The destination property is derived by removing the site. prefix and setting the shortened key with the defined value.
env.
These are options to configure environment variables in the roles. When a container is started, all env. options have the prefix removed, and are then set as environment variables in the target context.
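The prefix handling for site. and env. options amounts to selecting the keys with a given prefix and stripping it. A minimal sketch follows; the helper name with_prefix is hypothetical and this is not taken from the Slider source.

```python
def with_prefix(options, prefix):
    """Return the options that start with prefix, with the prefix removed."""
    return {
        key[len(prefix):]: value
        for key, value in options.items()
        if key.startswith(prefix)
    }


# A few entries in the style of the options example, for illustration.
options = {
    "site.fs.defaultFS": "hdfs://sandbox:8020",
    "site.hbase.master.info.port": "0",
    "env.MALLOC_ARENA_MAX": "4",
    "zookeeper.port": "2181",
}

site_properties = with_prefix(options, "site.")  # candidates for the site file
environment = with_prefix(options, "env.")       # candidates for the container environment
print(site_properties)
print(environment)
```

Options without either prefix, such as zookeeper.port here, are left for Slider and the provider to interpret.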
The /roles/$ROLENAME/ clauses each provide options for a specific role.
This includes:
role.instances: defines the number of instances of a role to create.
env.: environment variables for launching the container.
yarn.: properties to configure YARN requests.
jvm.heapsize: an option supported by some providers to fix the heap size of a component.
"worker": {
  "yarn.memory": "768",
  "env.MALLOC_ARENA_MAX": "4",
  "role.instances": "0",
  "role.name": "worker",
  "role.failed.starting.instances": "0",
  "jvm.heapsize": "512M",
  "yarn.vcores": "1"
},
The role slider represents the Slider Application Master itself.
"slider": {
  "yarn.memory": "256",
  "env.MALLOC_ARENA_MAX": "4",
  "role.instances": "1",
  "role.name": "slider",
  "jvm.heapsize": "256M",
  "yarn.vcores": "1"
},
Providers may support a fixed number of roles, or they may support a dynamic number of roles defined at run-time, potentially from other data sources.
/options and role options are merged.
The options declared for a specific role are merged with the cluster-wide options to define the final options for a role. This is implemented in a simple override model: role-specific options can override any site-wide options.
The options in /options are used to create the initial option map for each role.
The role's own options are then applied to that map, overriding any values inherited from the /options section.
The slider role is used in the CLI to define the attributes of the AM.
Options set on a role do not affect any site-wide options: they are specific to the individual role being created.
As such, overwriting a site. option may have no effect, or it may change the value of a site configuration document in that specific role instance.
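The override model can be sketched in a few lines. This is an illustration under stated assumptions, not Slider's implementation: the function name role_options is hypothetical, and the cluster description is assumed to have been parsed into a dictionary with options and roles sections.

```python
def role_options(cluster, role):
    """Return the final options for a role.

    The cluster-wide /options form the initial map; the role's own
    options are then applied on top and win on any conflict.
    """
    merged = dict(cluster.get("options", {}))  # copy, so /options is untouched
    merged.update(cluster.get("roles", {}).get(role, {}))
    return merged


# Hypothetical values, for illustration only.
cluster = {
    "options": {"env.MALLOC_ARENA_MAX": "4", "zookeeper.port": "2181"},
    "roles": {
        "worker": {"env.MALLOC_ARENA_MAX": "8", "yarn.memory": "768"},
    },
}

print(role_options(cluster, "worker"))
print(cluster["options"])  # the cluster-wide options are not modified
```

Copying the cluster-wide map before applying the role's options is what keeps a role-level override from leaking back into /options.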
role.instances: number; required. The number of instances of that role desired in the application.
yarn.vcores: number. The number of YARN "virtual cores" to request for each role instance. The larger the number, the more CPU allocation, and potentially the longer time to satisfy the request and so instantiate the node. If the value "-1" is used (for any role but slider), the maximum value available to the application is requested.
yarn.memory: number. The amount of RAM, in megabytes, to request for each role instance. The larger the number, the more memory allocation, and potentially the longer time to satisfy the request and so instantiate the node. If the value "-1" is used (for any role but slider), the maximum value available to the application is requested.
env.: environment variables. String environment variables to use when setting up the container.
jvm.heapsize: the amount of memory for a provider to allocate for a process's JVM. Example: "512M". This option MAY be implemented by a provider.
These are the parts of the document that provide dynamic run-time information about an application. They are provided by the Slider Application Master when a request for the cluster status is issued.
/info
Dynamic set of string key-value pairs containing information about the running application -as provided by th
The values in this section are not normatively defined.
Here are some standard values
slider.am.restart.supported: whether the AM supports service restart without killing all the containers hosting the role instances:
"slider.am.restart.supported" : "false",
timestamps of the cluster going live, and when the status query was made
"live.time" : "27 Feb 2014 14:41:56 GMT", "live.time.millis" : "1393512116881", "status.time" : "27 Feb 2014 14:42:08 GMT", "status.time.millis" : "1393512128726",
yarn data provided to the AM
"yarn.vcores" : "32", "yarn.memory" : "2048",
information about the application and Hadoop versions in use. Here the application was built using Hadoop 2.3.0, but is running against the version of Hadoop built for HDP-2.
"status.application.build.info" : "Slider Core-0.13.0-SNAPSHOT Built against commit# 1a94ee4aa1 on Java 1.7.0_45 by stevel", "status.hadoop.build.info" : "2.3.0", "status.hadoop.deployed.info" : "bigwheel-m16-2.2.0 @704f1e463ebc4fb89353011407e965"
As with the /diagnostics section, this area is primarily intended for debugging.
/instances: instance list
Information about the live containers in a cluster:
"instances": {
  "slider": [ "container_1393511571284_0002_01_000001" ],
  "master": [ "container_1393511571284_0002_01_000003" ],
  "worker": [
    "container_1393511571284_0002_01_000002",
    "container_1393511571284_0002_01_000004"
  ]
},
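A client can derive the number of live containers per role directly from the instance list. The sketch below assumes the status document has been parsed into a dictionary; the function name instance_counts is hypothetical.

```python
def instance_counts(status):
    """Return {role: number of containers} from the /instances section."""
    return {
        role: len(containers)
        for role, containers in status.get("instances", {}).items()
    }


# The instance list from the example above, as a parsed dictionary.
status = {
    "instances": {
        "slider": ["container_1393511571284_0002_01_000001"],
        "master": ["container_1393511571284_0002_01_000003"],
        "worker": [
            "container_1393511571284_0002_01_000002",
            "container_1393511571284_0002_01_000004",
        ],
    }
}

print(instance_counts(status))
```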
/status: detailed dynamic state
This provides more detail on the application, including live and failed instances.
/status/live: live role instances by container
"cluster": {
  "live": {
    "worker": {
      "container_1394032374441_0001_01_000003": {
        "name": "container_1394032374441_0001_01_000003",
        "role": "worker",
        "roleId": 1,
        "createTime": 1394032384451,
        "startTime": 1394032384503,
        "released": false,
        "host": "192.168.1.88",
        "state": 3,
        "exitCode": 0,
        "command": "hbase-0.98.0/bin/hbase --config $PROPAGATED_CONFDIR regionserver start 1><LOG_DIR>/region-server.txt 2>&1 ; ",
        "diagnostics": "",
        "environment": [
          "HADOOP_USER_NAME=\"slider\"",
          "HBASE_LOG_DIR=\"/tmp/slider-slider\"",
          "HBASE_HEAPSIZE=\"256\"",
          "MALLOC_ARENA_MAX=\"4\"",
          "PROPAGATED_CONFDIR=\"$PWD/propagatedconf\""
        ]
      }
    }
  },
  "failed": {}
}
All live instances MUST be described in /status/live.
Failed clusters MAY be listed in the /status/failed section; specifically, a limited set of recently failed clusters SHOULD be provided.
Future versions of this document may introduce more sections under /status.
/status/rolestatus: role status information
This lists the current status of the roles: how many are running vs requested, and how many are being released.
"rolestatus": {
  "worker": {
    "role.instances": "2",
    "role.requested.instances": "0",
    "role.failed.starting.instances": "0",
    "role.actual.instances": "2",
    "role.releasing.instances": "0",
    "role.failed.instances": "1"
  },
  "slider": {
    "role.instances": "1",
    "role.requested.instances": "0",
    "role.name": "slider",
    "role.actual.instances": "1",
    "role.releasing.instances": "0",
    "role.failed.instances": "0"
  },
  "master": {
    "role.instances": "1",
    "role.requested.instances": "1",
    "role.name": "master",
    "role.failed.starting.instances": "0",
    "role.actual.instances": "0",
    "role.releasing.instances": "0",
    "role.failed.instances": "0"
  }
}
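One use of the role status is to find roles that are not yet at their desired size. The sketch below compares role.instances with role.actual.instances; the function name shortfall is hypothetical, and treating a missing value as "0" is an assumption made for the example rather than something this document defines.

```python
def shortfall(rolestatus):
    """Return {role: missing instance count} for roles below their desired size."""
    missing = {}
    for role, status in rolestatus.items():
        # Values are strings in the status document, so convert before comparing.
        desired = int(status.get("role.instances", "0"))
        actual = int(status.get("role.actual.instances", "0"))
        if actual < desired:
            missing[role] = desired - actual
    return missing


# The desired and actual counts from the example above.
rolestatus = {
    "worker": {"role.instances": "2", "role.actual.instances": "2"},
    "slider": {"role.instances": "1", "role.actual.instances": "1"},
    "master": {"role.instances": "1", "role.actual.instances": "0"},
}

print(shortfall(rolestatus))
```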
/status/provider: provider-specific information
Providers MAY publish information to the /status/provider section.
/statistics: aggregate statistics
Statistics on the cluster and each role in the cluster.
Better to have a specific /statistics/cluster element, and to move the roles' statistics under /statistics/roles:
"statistics": {
  "cluster": {
    "containers.unknown.completed": 0,
    "containers.start.completed": 3,
    "containers.live": 1,
    "containers.start.failed": 0,
    "containers.failed": 0,
    "containers.completed": 0,
    "containers.surplus": 0
  },
  "roles": {
    "worker": {
      "containers.start.completed": 0,
      "containers.live": 2,
      "containers.start.failed": 0,
      "containers.active.requests": 0,
      "containers.failed": 0,
      "containers.completed": 0,
      "containers.desired": 2,
      "containers.requested": 0
    },
    "master": {
      "containers.start.completed": 0,
      "containers.live": 1,
      "containers.start.failed": 0,
      "containers.active.requests": 0,
      "containers.failed": 0,
      "containers.completed": 0,
      "containers.desired": 1,
      "containers.requested": 0
    }
  }
},
/statistics/cluster provides aggregate statistics for the entire cluster.
Under /statistics/roles MUST come an entry for each role in the cluster.
All simple values in the statistics section are integers.
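The integer rule is easy to check mechanically. The following is a minimal sketch, assuming the statistics section has been parsed into a dictionary with cluster and roles entries; the function name non_integer_statistics is hypothetical.

```python
def non_integer_statistics(statistics):
    """Return the paths of statistics values that are not integers."""
    sections = {"cluster": statistics.get("cluster", {})}
    for role, values in statistics.get("roles", {}).items():
        sections[f"roles/{role}"] = values

    bad = []
    for name, values in sections.items():
        for key, value in values.items():
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                bad.append(f"/statistics/{name}/{key}")
    return bad


# Hypothetical statistics with one deliberately wrong value.
statistics = {
    "cluster": {"containers.live": 1, "containers.failed": 0},
    "roles": {
        "worker": {"containers.live": "2", "containers.desired": 2},
    },
}

print(non_integer_statistics(statistics))
```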
/clientProperties
The /clientProperties section contains key-value pairs of type string; the expectation is that this is where providers can insert specific single attributes for client applications.
These values can be converted to application-specific files on the client, either in code (as done today in the Slider CLI) or via template expansion (beyond the scope of this document).
/clientfiles
This section lists all files that an application instance MAY generate for clients, along with a description.
"/clientfiles/hbase-site.xml": "site information for HBase",
"/clientfiles/log4.properties": "log4.property file"
Client configuration file retrieval is by other means; this status operation merely lists the files that are available.