branch-2.0.4-alpha/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-site/src/site/apt/CapacityScheduler.apt.vm - hadoop - Git at Google

 ~~ Licensed under the Apache License, Version 2.0 (the "License");
 ~~ you may not use this file except in compliance with the License.
 ~~ You may obtain a copy of the License at
 ~~
 ~~   http://www.apache.org/licenses/LICENSE-2.0
 ~~
 ~~ Unless required by applicable law or agreed to in writing, software
 ~~ distributed under the License is distributed on an "AS IS" BASIS,
 ~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 ~~ See the License for the specific language governing permissions and
 ~~ limitations under the License. See accompanying LICENSE file.

   ---
   Hadoop Map Reduce Next Generation-${project.version} - Capacity Scheduler
   ---
   ---
   ${maven.build.timestamp}

 Hadoop MapReduce Next Generation - Capacity Scheduler

   \[ {{{./index.html}Go Back}} \]

 %{toc|section=1|fromDepth=0}

 * {Purpose}

   This document describes the <<<CapacityScheduler>>>, a pluggable scheduler
   for Hadoop which allows for multiple-tenants to securely share a large cluster
   such that their applications are allocated resources in a timely manner under
   constraints of allocated capacities.

 * {Overview}

   The <<<CapacityScheduler>>> is designed to run Hadoop applications as a
   shared, multi-tenant cluster in an operator-friendly manner while maximizing
   the throughput and the utilization of the cluster.

   Traditionally each organization has it own private set of compute resources
   that have sufficient capacity to meet the organization's SLA under peak or
   near peak conditions. This generally leads to poor average utilization and
   overhead of managing multiple independent clusters, one per each organization.
   Sharing clusters between organizations is a cost-effective manner of running
   large Hadoop installations since this allows them to reap benefits of
   economies of scale without creating private clusters. However, organizations
   are concerned about sharing a cluster because they are worried about others
   using the resources that are critical for their SLAs.

   The <<<CapacityScheduler>>> is designed to allow sharing a large cluster while
   giving each organization capacity guarantees. The central idea is
   that the available resources in the Hadoop cluster are shared among multiple
   organizations who collectively fund the cluster based on their computing
   needs. There is an added benefit that an organization can access
   any excess capacity not being used by others. This provides elasticity for
   the organizations in a cost-effective manner.

   Sharing clusters across organizations necessitates strong support for
   multi-tenancy since each organization must be guaranteed capacity and
   safe-guards to ensure the shared cluster is impervious to single rouge
   application or user or sets thereof. The <<<CapacityScheduler>>> provides a
   stringent set of limits to ensure that a single application or user or queue
   cannot consume disproportionate amount of resources in the cluster. Also, the
   <<<CapacityScheduler>>> provides limits on initialized/pending applications
   from a single user and queue to ensure fairness and stability of the cluster.

   The primary abstraction provided by the <<<CapacityScheduler>>> is the concept
   of <queues>. These queues are typically setup by administrators to reflect the
   economics of the shared cluster.

   To provide further control and predictability on sharing of resources, the
   <<<CapacityScheduler>>> supports <hierarchical queues> to ensure
   resources are shared among the sub-queues of an organization before other
   queues are allowed to use free resources, there-by providing <affinity>
   for sharing free resources among applications of a given organization.

 * {Features}

   The <<<CapacityScheduler>>> supports the following features:

   * Hierarchical Queues - Hierarchy of queues is supported to ensure resources
     are shared among the sub-queues of an organization before other
     queues are allowed to use free resources, there-by providing more control
     and predictability.

   * Capacity Guarantees - Queues are allocated a fraction of the capacity of the
     grid in the sense that a certain capacity of resources will be at their
     disposal. All applications submitted to a queue will have access to the
     capacity allocated to the queue. Adminstrators can configure soft limits and
     optional hard limits on the capacity allocated to each queue.

   * Security - Each queue has strict ACLs which controls which users can submit
     applications to individual queues. Also, there are safe-guards to ensure
     that users cannot view and/or modify applications from other users.
     Also, per-queue and system administrator roles are supported.

   * Elasticity - Free resources can be allocated to any queue beyond it's
     capacity. When there is demand for these resources from queues running below
     capacity at a future point in time, as tasks scheduled on these resources
     complete, they will be assigned to applications on queues running below the
     capacity (pre-emption is not supported). This ensures that resources are available
     in a predictable and elastic manner to queues, thus preventing artifical silos
     of resources in the cluster which helps utilization.

   * Multi-tenancy - Comprehensive set of limits are provided to prevent a
     single application, user and queue from monopolizing resources of the queue
     or the cluster as a whole to ensure that the cluster isn't overwhelmed.

   * Operability

     * Runtime Configuration - The queue definitions and properties such as
       capacity, ACLs can be changed, at runtime, by administrators in a secure
       manner to minimize disruption to users. Also, a console is provided for
       users and administrators to view current allocation of resources to
       various queues in the system. Administrators can <add additional queues>
       at runtime, but queues cannot be <deleted> at runtime.

     * Drain applications - Administrators can <stop> queues
       at runtime to ensure that while existing applications run to completion,
       no new applications can be submitted. If a queue is in <<<STOPPED>>>
       state, new applications cannot be submitted to <itself> or
       <any of its child queueus>. Existing applications continue to completion,
       thus the queue can be <drained> gracefully.  Administrators can also
       <start> the stopped queues.

   * Resource-based Scheduling - Support for resource-intensive applications,
     where-in a application can optionally specify higher resource-requirements
     than the default, there-by accomodating applications with differing resource
     requirements. Currently, <memory> is the the resource requirement supported.

   []

 * {Configuration}

   * Setting up <<<ResourceManager>>> to use <<<CapacityScheduler>>>

     To configure the <<<ResourceManager>>> to use the <<<CapacityScheduler>>>, set
     the following property in the <<conf/yarn-site.xml>>:

 *--------------------------------------+--------------------------------------+
 || Property                            || Value                                |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.resourcemanager.scheduler.class>>> | |
 | | <<<org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler>>> |
 *--------------------------------------+--------------------------------------+

   * Setting up <queues>

     <<conf/capacity-scheduler.xml>> is the configuration file for the
     <<<CapacityScheduler>>>.

     The <<<CapacityScheduler>>> has a pre-defined queue called <root>. All
     queueus in the system are children of the root queue.

     Further queues can be setup by configuring
     <<<yarn.scheduler.capacity.root.queues>>> with a list of comma-separated
     child queues.

     The configuration for <<<CapacityScheduler>>> uses a concept called
     <queue path> to configure the hierarchy of queues. The <queue path> is the
     full path of the queue's hierarchy, starting at <root>, with . (dot) as the
     delimiter.

     A given queue's children can be defined with the configuration knob:
     <<<yarn.scheduler.capacity.<queue-path>.queues>>>. Children do not
     inherit properties directly from the parent unless otherwise noted.

     Here is an example with three top-level child-queues <<<a>>>, <<<b>>> and
     <<<c>>> and some sub-queues for <<<a>>> and <<<b>>>:

 ----
 <property>
   <name>yarn.scheduler.capacity.root.queues</name>
   <value>a,b,c</value>
   <description>The queues at the this level (root is the root queue).
   </description>
 </property>

 <property>
   <name>yarn.scheduler.capacity.root.a.queues</name>
   <value>a1,a2</value>
   <description>The queues at the this level (root is the root queue).
   </description>
 </property>

 <property>
   <name>yarn.scheduler.capacity.root.b.queues</name>
   <value>b1,b2,b3</value>
   <description>The queues at the this level (root is the root queue).
   </description>
 </property>
 ----

   * Queue Properties

     * Resource Allocation

 *--------------------------------------+--------------------------------------+
 || Property                            || Description                         |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.<queue-path>.capacity>>> | |
 | | Queue <capacity> in percentage (%) as a float (e.g. 12.5).|
 | | The sum of capacities for all queues, at each level, must be equal |
 | | to 100. |
 | | Applications in the queue may consume more resources than the queue's |
 | | capacity if there are free resources, providing elasticity. |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.<queue-path>.maximum-capacity>>> |   |
 | | Maximum queue capacity in percentage (%) as a float. |
 | | This limits the <elasticity> for applications in the queue. |
 | | Defaults to -1 which disables it. |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent>>> |   |
 | | Each queue enforces a limit on the percentage of resources allocated to a |
 | | user at any given time, if there is demand for resources. The user limit |
 | | can vary between a minimum and maximum value. The the former |
 | | (the minimum value) is set to this property value and the latter |
 | | (the maximum value) depends on the number of users who have submitted |
 | | applications. For e.g., suppose the value of this property is 25. |
 | | If two users have submitted applications to a queue, no single user can |
 | | use more than 50% of the queue resources. If a third user submits an |
 | | application, no single user can use more than 33% of the queue resources. |
 | | With 4 or more users, no user can use more than 25% of the queues |
 | | resources. A value of 100 implies no user limits are imposed. The default |
 | | is 100.|
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.<queue-path>.user-limit-factor>>> |   |
 | | The multiple of the queue capacity which can be configured to allow a |
 | | single user to acquire more resources. By default this is set to 1 which |
 | | ensures that a single user can never take more than the queue's configured |
 | | capacity irrespective of how idle th cluster is. Value is specified as |
 | | a float.|
 *--------------------------------------+--------------------------------------+

     * Running and Pending Application Limits


     The <<<CapacityScheduler>>> supports the following parameters to control
     the running and pending applications:


 *--------------------------------------+--------------------------------------+
 || Property                            || Description                         |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.maximum-applications>>> /  |
 | <<<yarn.scheduler.capacity.<queue-path>.maximum-applications>>>  | |
 | | Maximum number of applications in the system which can be concurrently |
 | | active both running and pending. Limits on each queue are directly |
 | | proportional to their queue capacities and user limits. This is a
 | | hard limit and any applications submitted when this limit is reached will |
 | | be rejected. Default is 10000. This can be set for all queues with |
 | | <<<yarn.scheduler.capacity.maximum-applications>>> and can also be overridden on a  |
 | | per queue basis by setting <<<yarn.scheduler.capacity.<queue-path>.maximum-applications>>>. |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.maximum-am-resource-percent>>> / |
 | <<<yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent>>> | |
 | | Maximum percent of resources in the cluster which can be used to run |
 | | application masters - controls number of concurrent active applications. Limits on each |
 | | queue are directly proportional to their queue capacities and user limits. |
 | | Specified as a float - ie 0.5 = 50%. Default is 10%. This can be set for all queues with |
 | | <<<yarn.scheduler.capacity.maximum-am-resource-percent>>> and can also be overridden on a  |
 | | per queue basis by setting <<<yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent>>> |
 *--------------------------------------+--------------------------------------+

     * Queue Administration & Permissions

     The <<<CapacityScheduler>>> supports the following parameters to
     the administer the queues:


 *--------------------------------------+--------------------------------------+
 || Property                            || Description                         |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.<queue-path>.state>>> | |
 | | The <state> of the queue. Can be one of <<<RUNNING>>> or <<<STOPPED>>>. |
 | | If a queue is in <<<STOPPED>>> state, new applications cannot be |
 | | submitted to <itself> or <any of its child queues>. |
 | | Thus, if the <root> queue is <<<STOPPED>>> no applications can be |
 | | submitted to the entire cluster. |
 | | Existing applications continue to completion, thus the queue can be
 | | <drained> gracefully. |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications>>> | |
 | | The <ACL> which controls who can <submit> applications to the given queue. |
 | | If the given user/group has necessary ACLs on the given queue or |
 | | <one of the parent queues in the hierarchy> they can submit applications. |
 | | <ACLs> for this property <are> inherited from the parent queue |
 | | if not specified. |
 *--------------------------------------+--------------------------------------+
 | <<<yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue>>> | |
 | | The <ACL> which controls who can <administer> applications on the given queue. |
 | | If the given user/group has necessary ACLs on the given queue or |
 | | <one of the parent queues in the hierarchy> they can administer applications. |
 | | <ACLs> for this property <are> inherited from the parent queue |
 | | if not specified. |
 *--------------------------------------+--------------------------------------+

     <Note:> An <ACL> is of the form <user1>, <user2><space><group1>, <group2>.
     The special value of <<*>> implies <anyone>. The special value of <space>
     implies <no one>. The default is <<*>> for the root queue if not specified.

     * Reviewing the configuration of the CapacityScheduler

       Once the installation and configuration is completed, you can review it
       after starting the YARN cluster from the web-ui.

       * Start the YARN cluster in the normal manner.

       * Open the <<<ResourceManager>>> web UI.

       * The </scheduler> web-page should show the resource usages of individual
         queues.

       []

 * {Changing Queue Configuration}

   Changing queue properties and adding new queues is very simple. You need to
   edit <<conf/capacity-scheduler.xml>> and run <yarn rmadmin -refreshQueues>.

 ----
 $ vi $HADOOP_CONF_DIR/capacity-scheduler.xml
 $ $HADOOP_YARN_HOME/bin/yarn rmadmin -refreshQueues
 ----

   <Note:> Queues cannot be <deleted>, only addition of new queues is supported -
   the updated queue configuration should be a valid one i.e. queue-capacity at
   each <level> should be equal to 100%.
	~~ Licensed under the Apache License, Version 2.0 (the "License");
	~~ you may not use this file except in compliance with the License.
	~~ You may obtain a copy of the License at
	~~
	~~ http://www.apache.org/licenses/LICENSE-2.0
	~~
	~~ Unless required by applicable law or agreed to in writing, software
	~~ distributed under the License is distributed on an "AS IS" BASIS,
	~~ WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
	~~ See the License for the specific language governing permissions and
	~~ limitations under the License. See accompanying LICENSE file.

	---
	Hadoop Map Reduce Next Generation-${project.version} - Capacity Scheduler
	---
	---
	${maven.build.timestamp}

	Hadoop MapReduce Next Generation - Capacity Scheduler

	\[ {{{./index.html}Go Back}} \]

	%{toc\|section=1\|fromDepth=0}

	* {Purpose}

	This document describes the <<<CapacityScheduler>>>, a pluggable scheduler
	for Hadoop which allows for multiple-tenants to securely share a large cluster
	such that their applications are allocated resources in a timely manner under
	constraints of allocated capacities.

	* {Overview}

	The <<<CapacityScheduler>>> is designed to run Hadoop applications as a
	shared, multi-tenant cluster in an operator-friendly manner while maximizing
	the throughput and the utilization of the cluster.

	Traditionally each organization has it own private set of compute resources
	that have sufficient capacity to meet the organization's SLA under peak or
	near peak conditions. This generally leads to poor average utilization and
	overhead of managing multiple independent clusters, one per each organization.
	Sharing clusters between organizations is a cost-effective manner of running
	large Hadoop installations since this allows them to reap benefits of
	economies of scale without creating private clusters. However, organizations
	are concerned about sharing a cluster because they are worried about others
	using the resources that are critical for their SLAs.

	The <<<CapacityScheduler>>> is designed to allow sharing a large cluster while
	giving each organization capacity guarantees. The central idea is
	that the available resources in the Hadoop cluster are shared among multiple
	organizations who collectively fund the cluster based on their computing
	needs. There is an added benefit that an organization can access
	any excess capacity not being used by others. This provides elasticity for
	the organizations in a cost-effective manner.

	Sharing clusters across organizations necessitates strong support for
	multi-tenancy since each organization must be guaranteed capacity and
	safe-guards to ensure the shared cluster is impervious to single rouge
	application or user or sets thereof. The <<<CapacityScheduler>>> provides a
	stringent set of limits to ensure that a single application or user or queue
	cannot consume disproportionate amount of resources in the cluster. Also, the
	<<<CapacityScheduler>>> provides limits on initialized/pending applications
	from a single user and queue to ensure fairness and stability of the cluster.

	The primary abstraction provided by the <<<CapacityScheduler>>> is the concept
	of <queues>. These queues are typically setup by administrators to reflect the
	economics of the shared cluster.

	To provide further control and predictability on sharing of resources, the
	<<<CapacityScheduler>>> supports <hierarchical queues> to ensure
	resources are shared among the sub-queues of an organization before other
	queues are allowed to use free resources, there-by providing <affinity>
	for sharing free resources among applications of a given organization.

	* {Features}

	The <<<CapacityScheduler>>> supports the following features:

	* Hierarchical Queues - Hierarchy of queues is supported to ensure resources
	are shared among the sub-queues of an organization before other
	queues are allowed to use free resources, there-by providing more control
	and predictability.

	* Capacity Guarantees - Queues are allocated a fraction of the capacity of the
	grid in the sense that a certain capacity of resources will be at their
	disposal. All applications submitted to a queue will have access to the
	capacity allocated to the queue. Adminstrators can configure soft limits and
	optional hard limits on the capacity allocated to each queue.

	* Security - Each queue has strict ACLs which controls which users can submit
	applications to individual queues. Also, there are safe-guards to ensure
	that users cannot view and/or modify applications from other users.
	Also, per-queue and system administrator roles are supported.

	* Elasticity - Free resources can be allocated to any queue beyond it's
	capacity. When there is demand for these resources from queues running below
	capacity at a future point in time, as tasks scheduled on these resources
	complete, they will be assigned to applications on queues running below the
	capacity (pre-emption is not supported). This ensures that resources are available
	in a predictable and elastic manner to queues, thus preventing artifical silos
	of resources in the cluster which helps utilization.

	* Multi-tenancy - Comprehensive set of limits are provided to prevent a
	single application, user and queue from monopolizing resources of the queue
	or the cluster as a whole to ensure that the cluster isn't overwhelmed.

	* Operability

	* Runtime Configuration - The queue definitions and properties such as
	capacity, ACLs can be changed, at runtime, by administrators in a secure
	manner to minimize disruption to users. Also, a console is provided for
	users and administrators to view current allocation of resources to
	various queues in the system. Administrators can <add additional queues>
	at runtime, but queues cannot be <deleted> at runtime.

	* Drain applications - Administrators can <stop> queues
	at runtime to ensure that while existing applications run to completion,
	no new applications can be submitted. If a queue is in <<<STOPPED>>>
	state, new applications cannot be submitted to <itself> or
	<any of its child queueus>. Existing applications continue to completion,
	thus the queue can be <drained> gracefully. Administrators can also
	<start> the stopped queues.

	* Resource-based Scheduling - Support for resource-intensive applications,
	where-in a application can optionally specify higher resource-requirements
	than the default, there-by accomodating applications with differing resource
	requirements. Currently, <memory> is the the resource requirement supported.

	[]

	* {Configuration}

	* Setting up <<<ResourceManager>>> to use <<<CapacityScheduler>>>

	To configure the <<<ResourceManager>>> to use the <<<CapacityScheduler>>>, set
	the following property in the <<conf/yarn-site.xml>>:

	*--------------------------------------+--------------------------------------+
	\|\| Property \|\| Value \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.resourcemanager.scheduler.class>>> \| \|
	\| \| <<<org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.CapacityScheduler>>> \|
	*--------------------------------------+--------------------------------------+

	* Setting up <queues>

	<<conf/capacity-scheduler.xml>> is the configuration file for the
	<<<CapacityScheduler>>>.

	The <<<CapacityScheduler>>> has a pre-defined queue called <root>. All
	queueus in the system are children of the root queue.

	Further queues can be setup by configuring
	<<<yarn.scheduler.capacity.root.queues>>> with a list of comma-separated
	child queues.

	The configuration for <<<CapacityScheduler>>> uses a concept called
	<queue path> to configure the hierarchy of queues. The <queue path> is the
	full path of the queue's hierarchy, starting at <root>, with . (dot) as the
	delimiter.

	A given queue's children can be defined with the configuration knob:
	<<<yarn.scheduler.capacity.<queue-path>.queues>>>. Children do not
	inherit properties directly from the parent unless otherwise noted.

	Here is an example with three top-level child-queues <<<a>>>, <<<b>>> and
	<<<c>>> and some sub-queues for <<<a>>> and <<<b>>>:

	----
	<property>
	<name>yarn.scheduler.capacity.root.queues</name>
	<value>a,b,c</value>
	<description>The queues at the this level (root is the root queue).
	</description>
	</property>

	<property>
	<name>yarn.scheduler.capacity.root.a.queues</name>
	<value>a1,a2</value>
	<description>The queues at the this level (root is the root queue).
	</description>
	</property>

	<property>
	<name>yarn.scheduler.capacity.root.b.queues</name>
	<value>b1,b2,b3</value>
	<description>The queues at the this level (root is the root queue).
	</description>
	</property>
	----

	* Queue Properties

	* Resource Allocation

	*--------------------------------------+--------------------------------------+
	\|\| Property \|\| Description \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.<queue-path>.capacity>>> \| \|
	\| \| Queue <capacity> in percentage (%) as a float (e.g. 12.5).\|
	\| \| The sum of capacities for all queues, at each level, must be equal \|
	\| \| to 100. \|
	\| \| Applications in the queue may consume more resources than the queue's \|
	\| \| capacity if there are free resources, providing elasticity. \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.<queue-path>.maximum-capacity>>> \| \|
	\| \| Maximum queue capacity in percentage (%) as a float. \|
	\| \| This limits the <elasticity> for applications in the queue. \|
	\| \| Defaults to -1 which disables it. \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.<queue-path>.minimum-user-limit-percent>>> \| \|
	\| \| Each queue enforces a limit on the percentage of resources allocated to a \|
	\| \| user at any given time, if there is demand for resources. The user limit \|
	\| \| can vary between a minimum and maximum value. The the former \|
	\| \| (the minimum value) is set to this property value and the latter \|
	\| \| (the maximum value) depends on the number of users who have submitted \|
	\| \| applications. For e.g., suppose the value of this property is 25. \|
	\| \| If two users have submitted applications to a queue, no single user can \|
	\| \| use more than 50% of the queue resources. If a third user submits an \|
	\| \| application, no single user can use more than 33% of the queue resources. \|
	\| \| With 4 or more users, no user can use more than 25% of the queues \|
	\| \| resources. A value of 100 implies no user limits are imposed. The default \|
	\| \| is 100.\|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.<queue-path>.user-limit-factor>>> \| \|
	\| \| The multiple of the queue capacity which can be configured to allow a \|
	\| \| single user to acquire more resources. By default this is set to 1 which \|
	\| \| ensures that a single user can never take more than the queue's configured \|
	\| \| capacity irrespective of how idle th cluster is. Value is specified as \|
	\| \| a float.\|
	*--------------------------------------+--------------------------------------+

	* Running and Pending Application Limits


	The <<<CapacityScheduler>>> supports the following parameters to control
	the running and pending applications:


	*--------------------------------------+--------------------------------------+
	\|\| Property \|\| Description \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.maximum-applications>>> / \|
	\| <<<yarn.scheduler.capacity.<queue-path>.maximum-applications>>> \| \|
	\| \| Maximum number of applications in the system which can be concurrently \|
	\| \| active both running and pending. Limits on each queue are directly \|
	\| \| proportional to their queue capacities and user limits. This is a
	\| \| hard limit and any applications submitted when this limit is reached will \|
	\| \| be rejected. Default is 10000. This can be set for all queues with \|
	\| \| <<<yarn.scheduler.capacity.maximum-applications>>> and can also be overridden on a \|
	\| \| per queue basis by setting <<<yarn.scheduler.capacity.<queue-path>.maximum-applications>>>. \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.maximum-am-resource-percent>>> / \|
	\| <<<yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent>>> \| \|
	\| \| Maximum percent of resources in the cluster which can be used to run \|
	\| \| application masters - controls number of concurrent active applications. Limits on each \|
	\| \| queue are directly proportional to their queue capacities and user limits. \|
	\| \| Specified as a float - ie 0.5 = 50%. Default is 10%. This can be set for all queues with \|
	\| \| <<<yarn.scheduler.capacity.maximum-am-resource-percent>>> and can also be overridden on a \|
	\| \| per queue basis by setting <<<yarn.scheduler.capacity.<queue-path>.maximum-am-resource-percent>>> \|
	*--------------------------------------+--------------------------------------+

	* Queue Administration & Permissions

	The <<<CapacityScheduler>>> supports the following parameters to
	the administer the queues:


	*--------------------------------------+--------------------------------------+
	\|\| Property \|\| Description \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.<queue-path>.state>>> \| \|
	\| \| The <state> of the queue. Can be one of <<<RUNNING>>> or <<<STOPPED>>>. \|
	\| \| If a queue is in <<<STOPPED>>> state, new applications cannot be \|
	\| \| submitted to <itself> or <any of its child queues>. \|
	\| \| Thus, if the <root> queue is <<<STOPPED>>> no applications can be \|
	\| \| submitted to the entire cluster. \|
	\| \| Existing applications continue to completion, thus the queue can be
	\| \| <drained> gracefully. \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.root.<queue-path>.acl_submit_applications>>> \| \|
	\| \| The <ACL> which controls who can <submit> applications to the given queue. \|
	\| \| If the given user/group has necessary ACLs on the given queue or \|
	\| \| <one of the parent queues in the hierarchy> they can submit applications. \|
	\| \| <ACLs> for this property <are> inherited from the parent queue \|
	\| \| if not specified. \|
	*--------------------------------------+--------------------------------------+
	\| <<<yarn.scheduler.capacity.root.<queue-path>.acl_administer_queue>>> \| \|
	\| \| The <ACL> which controls who can <administer> applications on the given queue. \|
	\| \| If the given user/group has necessary ACLs on the given queue or \|
	\| \| <one of the parent queues in the hierarchy> they can administer applications. \|
	\| \| <ACLs> for this property <are> inherited from the parent queue \|
	\| \| if not specified. \|
	*--------------------------------------+--------------------------------------+

	<Note:> An <ACL> is of the form <user1>, <user2><space><group1>, <group2>.
	The special value of <<*>> implies <anyone>. The special value of <space>
	implies <no one>. The default is <<*>> for the root queue if not specified.

	* Reviewing the configuration of the CapacityScheduler

	Once the installation and configuration is completed, you can review it
	after starting the YARN cluster from the web-ui.

	* Start the YARN cluster in the normal manner.

	* Open the <<<ResourceManager>>> web UI.

	* The </scheduler> web-page should show the resource usages of individual
	queues.

	[]

	* {Changing Queue Configuration}

	Changing queue properties and adding new queues is very simple. You need to
	edit <<conf/capacity-scheduler.xml>> and run <yarn rmadmin -refreshQueues>.

	----
	$ vi $HADOOP_CONF_DIR/capacity-scheduler.xml
	$ $HADOOP_YARN_HOME/bin/yarn rmadmin -refreshQueues
	----

	<Note:> Queues cannot be <deleted>, only addition of new queues is supported -
	the updated queue configuration should be a valid one i.e. queue-capacity at
	each <level> should be equal to 100%.