| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> |
| <concept rev="1.3.0" id="admission_control"> |
| |
| <title>Configuring Admission Control</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Impala"/> |
| <data name="Category" value="Querying"/> |
| <data name="Category" value="Admission Control"/> |
| <data name="Category" value="Resource Management"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <p> |
| Impala includes features that balance and maximize resources in your |
| <keyword keyref="hadoop_distro"/> cluster. This topic describes how you can improve |
| efficiency of your a <keyword keyref="hadoop_distro"/> cluster using those features. |
| </p> |
| |
| <p> |
| The configuration options for admission control range from the simple (a single resource |
| pool with a single set of options) to the complex (multiple resource pools with different |
| options, each pool handling queries for a different set of users and groups). |
| </p> |
| |
| </conbody> |
| |
| <concept id="concept_bz4_vxz_jgb"> |
| |
| <title>Configuring Admission Control in Command Line Interface</title> |
| |
| <conbody> |
| |
| <p> |
| To configure admission control, use a combination of startup options for the Impala |
| daemon and edit or create the configuration files |
| <filepath>fair-scheduler.xml</filepath> and <filepath>llama-site.xml</filepath>. |
| </p> |
| |
| <p> |
| For a straightforward configuration using a single resource pool named |
| <codeph>default</codeph>, you can specify configuration options on the command line and |
| skip the <filepath>fair-scheduler.xml</filepath> and <filepath>llama-site.xml</filepath> |
| configuration files. |
| </p> |
| |
| <p> |
| For an advanced configuration with multiple resource pools using different settings: |
| <ol> |
| <li> |
| Set up the <filepath>fair-scheduler.xml</filepath> and |
| <filepath>llama-site.xml</filepath> configuration files manually. |
| </li> |
| |
| <li> |
| Provide the paths to each one using the <cmdname>impalad</cmdname> command-line |
| options, <codeph>‑‑fair_scheduler_allocation_path</codeph> and |
| <codeph>‑‑llama_site_path</codeph> respectively. |
| </li> |
| </ol> |
| </p> |
| |
| <p> |
| The Impala admission control feature uses the Fair Scheduler configuration settings to |
| determine how to map users and groups to different resource pools. For example, you |
| might set up different resource pools with separate memory limits, and maximum number of |
| concurrent and queued queries, for different categories of users within your |
| organization. For details about all the Fair Scheduler configuration settings, see the |
| <xref |
| href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html#Configuration" |
| scope="external" format="html">Apache |
| wiki</xref>. |
| </p> |
| |
| <p> |
| The Impala admission control feature uses a small subset of possible settings from the |
| <filepath>llama-site.xml</filepath> configuration file: |
| </p> |
| |
| <codeblock>llama.am.throttling.maximum.placed.reservations.<varname>queue_name</varname> |
| llama.am.throttling.maximum.queued.reservations.<varname>queue_name</varname> |
| <ph rev="2.5.0 IMPALA-2538">impala.admission-control.pool-default-query-options.<varname>queue_name</varname> |
| impala.admission-control.pool-queue-timeout-ms.<varname>queue_name</varname></ph> |
| </codeblock> |
| |
| <p rev="2.5.0 IMPALA-2538"> |
| The <codeph>impala.admission-control.pool-queue-timeout-ms</codeph> setting specifies |
| the timeout value for this pool in milliseconds. |
| </p> |
| |
| <p rev="2.5.0 IMPALA-2538" |
| > |
| The<codeph>impala.admission-control.pool-default-query-options</codeph> settings |
| designates the default query options for all queries that run in this pool. Its argument |
| value is a comma-delimited string of 'key=value' pairs, <codeph>'key1=val1,key2=val2, |
| ...'</codeph>. For example, this is where you might set a default memory limit for all |
| queries in the pool, using an argument such as <codeph>MEM_LIMIT=5G</codeph>. |
| </p> |
| |
| <p rev="2.5.0 IMPALA-2538"> |
| The <codeph>impala.admission-control.*</codeph> configuration settings are available in |
| <keyword keyref="impala25_full"/> and higher. |
| </p> |
| |
| </conbody> |
| |
| <concept id="concept_cz4_vxz_jgb"> |
| |
| <title>Example of Admission Control Configuration</title> |
| |
| <conbody> |
| |
| <p> |
| Here are sample <filepath>fair-scheduler.xml</filepath> and |
| <filepath>llama-site.xml</filepath> files that define resource pools |
| <codeph>root.default</codeph>, <codeph>root.development</codeph>, and |
| <codeph>root.production</codeph>. These files define resource pools for Impala |
| admission control and are separate from the similar |
| <codeph>fair-scheduler.xml</codeph>that defines resource pools for YARN. |
| </p> |
| |
| <p> |
| <b>fair-scheduler.xml:</b> |
| </p> |
| |
| <p> |
| Although Impala does not use the <codeph>vcores</codeph> value, you must still specify |
| it to satisfy YARN requirements for the file contents. |
| </p> |
| |
| <p> |
| Each <codeph><aclSubmitApps></codeph> tag (other than the one for |
| <codeph>root</codeph>) contains a comma-separated list of users, then a space, then a |
| comma-separated list of groups; these are the users and groups allowed to submit |
| Impala statements to the corresponding resource pool. |
| </p> |
| |
| <p> |
| If you leave the <codeph><aclSubmitApps></codeph> element empty for a pool, |
| nobody can submit directly to that pool; child pools can specify their own |
| <codeph><aclSubmitApps></codeph> values to authorize users and groups to submit |
| to those pools. |
| </p> |
| |
| <codeblock><allocations> |
| |
| <queue name="root"> |
| <aclSubmitApps> </aclSubmitApps> |
| <queue name="default"> |
| <maxResources>50000 mb, 0 vcores</maxResources> |
| <aclSubmitApps>*</aclSubmitApps> |
| </queue> |
| <queue name="development"> |
| <maxResources>200000 mb, 0 vcores</maxResources> |
| <aclSubmitApps>user1,user2 dev,ops,admin</aclSubmitApps> |
| </queue> |
| <queue name="production"> |
| <maxResources>1000000 mb, 0 vcores</maxResources> |
| <aclSubmitApps> ops,admin</aclSubmitApps> |
| </queue> |
| </queue> |
| <queuePlacementPolicy> |
| <rule name="specified" create="false"/> |
| <rule name="default" /> |
| </queuePlacementPolicy> |
| </allocations> |
| |
| </codeblock> |
| |
| <p> |
| <b>llama-site.xml:</b> |
| </p> |
| |
| <codeblock rev="2.5.0 IMPALA-2538"> |
| <?xml version="1.0" encoding="UTF-8"?> |
| <configuration> |
| <property> |
| <name>llama.am.throttling.maximum.placed.reservations.root.default</name> |
| <value>10</value> |
| </property> |
| <property> |
| <name>llama.am.throttling.maximum.queued.reservations.root.default</name> |
| <value>50</value> |
| </property> |
| <property> |
| <name>impala.admission-control.pool-default-query-options.root.default</name> |
| <value>mem_limit=128m,query_timeout_s=20,max_io_buffers=10</value> |
| </property> |
| <property> |
| <name>impala.admission-control.<b>pool-queue-timeout-ms</b>.root.default</name> |
| <value>30000</value> |
| </property> |
| <property> |
| <name>impala.admission-control.<b>max-query-mem-limit</b>.root.default.regularPool</name> |
| <value>1610612736</value><!--1.5GB--> |
| </property> |
| <property> |
| <name>impala.admission-control.<b>min-query-mem-limit</b>.root.default.regularPool</name> |
| <value>52428800</value><!--50MB--> |
| </property> |
| <property> |
| <name>impala.admission-control.<b>clamp-mem-limit-query-option</b>.root.default.regularPool</name> |
| <value>true</value> |
| </property> |
| </codeblock> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |
| |
| <concept id="concept_zy4_vxz_jgb"> |
| |
| <title>Configuring Cluster-wide Admission Control</title> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Configuring"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <note type="important"> |
| These settings only apply if you enable admission control but leave dynamic resource |
| pools disabled. In <keyword |
| keyref="impala25_full"/> and higher, we recommend |
| that you set up dynamic resource pools and customize the settings for each pool as |
| described in <xref href="#concept_bz4_vxz_jgb" format="dita"/>. |
| </note> |
| |
| <p> |
| The following Impala configuration options let you adjust the settings of the admission |
| control feature. When supplying the options on the <cmdname>impalad</cmdname> command |
| line, prepend the option name with <codeph>--</codeph>. |
| </p> |
| |
| <dl> |
| <dlentry> |
| |
| <dt> |
| <codeph>queue_wait_timeout_ms</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Maximum amount of time (in milliseconds) that a request waits to be |
| admitted before timing out. |
| <p> |
| <b>Type:</b> <codeph>int64</codeph> |
| </p> |
| |
| <p> |
| <b>Default:</b> <codeph>60000</codeph> |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry> |
| |
| <dt> |
| <codeph>default_pool_max_requests</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Maximum number of concurrent outstanding requests allowed to run |
| before incoming requests are queued. Because this limit applies cluster-wide, but |
| each Impala node makes independent decisions to run queries immediately or queue |
| them, it is a soft limit; the overall number of concurrent queries might be slightly |
| higher during times of heavy load. A negative value indicates no limit. Ignored if |
| <codeph>fair_scheduler_config_path</codeph> and <codeph>llama_site_path</codeph> are |
| set. |
| <p> |
| <b>Type:</b> <codeph>int64</codeph> |
| </p> |
| |
| <p> |
| <b>Default:</b> <ph rev="2.5.0">-1, meaning unlimited (prior to |
| <keyword |
| keyref="impala25_full"/> the default was 200)</ph> |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry> |
| |
| <dt> |
| <codeph>default_pool_max_queued</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Maximum number of requests allowed to be queued before rejecting |
| requests. Because this limit applies cluster-wide, but each Impala node makes |
| independent decisions to run queries immediately or queue them, it is a soft limit; |
| the overall number of queued queries might be slightly higher during times of heavy |
| load. A negative value or 0 indicates requests are always rejected once the maximum |
| concurrent requests are executing. Ignored if |
| <codeph>fair_scheduler_config_path</codeph> and <codeph>llama_site_path</codeph> are |
| set. |
| <p> |
| <b>Type:</b> <codeph>int64</codeph> |
| </p> |
| |
| <p> |
| <b>Default:</b> <ph rev="2.5.0">unlimited</ph> |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry> |
| |
| <dt> |
| <codeph>default_pool_mem_limit</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Maximum amount of memory (across the entire cluster) that all |
| outstanding requests in this pool can use before new requests to this pool are |
| queued. Specified in bytes, megabytes, or gigabytes by a number followed by the |
| suffix <codeph>b</codeph> (optional), <codeph>m</codeph>, or <codeph>g</codeph>, |
| either uppercase or lowercase. You can specify floating-point values for megabytes |
| and gigabytes, to represent fractional numbers such as <codeph>1.5</codeph>. You can |
| also specify it as a percentage of the physical memory by specifying the suffix |
| <codeph>%</codeph>. 0 or no setting indicates no limit. Defaults to bytes if no unit |
| is given. Because this limit applies cluster-wide, but each Impala node makes |
| independent decisions to run queries immediately or queue them, it is a soft limit; |
| the overall memory used by concurrent queries might be slightly higher during times |
| of heavy load. Ignored if <codeph>fair_scheduler_config_path</codeph> and |
| <codeph>llama_site_path</codeph> are set. |
| <note |
| conref="../shared/impala_common.xml#common/admission_compute_stats"/> |
| |
| <p conref="../shared/impala_common.xml#common/type_string"/> |
| |
| <p> |
| <b>Default:</b> <codeph>""</codeph> (empty string, meaning unlimited) |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry> |
| |
| <dt> |
| <codeph>disable_pool_max_requests</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Disables all per-pool limits on the maximum number of running |
| requests. |
| <p> |
| <b>Type:</b> Boolean |
| </p> |
| |
| <p> |
| <b>Default:</b> <codeph>false</codeph> |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry> |
| |
| <dt> |
| <codeph>disable_pool_mem_limits</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Disables all per-pool mem limits. |
| <p> |
| <b>Type:</b> Boolean |
| </p> |
| |
| <p> |
| <b>Default:</b> <codeph>false</codeph> |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry> |
| |
| <dt> |
| <codeph>fair_scheduler_allocation_path</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Path to the fair scheduler allocation file |
| (<codeph>fair-scheduler.xml</codeph>). |
| <p |
| conref="../shared/impala_common.xml#common/type_string"/> |
| |
| <p> |
| <b>Default:</b> <codeph>""</codeph> (empty string) |
| </p> |
| |
| <p> |
| <b>Usage notes:</b> Admission control only uses a small subset of the settings |
| that can go in this file, as described below. For details about all the Fair |
| Scheduler configuration settings, see the |
| <xref |
| href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html#Configuration" |
| scope="external" format="html">Apache |
| wiki</xref>. |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry> |
| |
| <dt> |
| <codeph>llama_site_path</codeph> |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Path to the configuration file used by admission control |
| (<codeph>llama-site.xml</codeph>). If set, |
| <codeph>fair_scheduler_allocation_path</codeph> must also be set. |
| <p conref="../shared/impala_common.xml#common/type_string"/> |
| |
| <p> |
| <b>Default:</b> <codeph>""</codeph> (empty string) |
| </p> |
| |
| <p> |
| <b>Usage notes:</b> Admission control only uses a few of the settings that can go |
| in this file, as described below. |
| </p> |
| </dd> |
| |
| </dlentry> |
| </dl> |
| |
| </conbody> |
| |
| </concept> |
| |
| </concept> |