<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="copyright" content="(C) Copyright 2023" />
<meta name="DC.rights.owner" content="(C) Copyright 2023" />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="MEM_LIMIT Query Option" />
<meta name="DC.Relation" scheme="URI" content="../topics/impala_query_options.html" />
<meta name="prodname" content="Impala" />
<meta name="version" content="Impala 3.4.x" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="mem_limit" />
<link rel="stylesheet" type="text/css" href="../commonltr.css" />
<title>MEM_LIMIT Query Option</title>
</head>
<body id="mem_limit">
<h1 class="title topictitle1" id="ariaid-title1">MEM_LIMIT Query Option</h1>
<div class="body conbody">
<p class="p">
The <code class="ph codeph">MEM_LIMIT</code> query option defines the maximum amount of memory a query can allocate on
each node. The total memory that a query can use across the cluster is therefore at most
<code class="ph codeph">MEM_LIMIT</code> times the number of nodes.
</p>
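<p class="p">
As a rough worked example of this arithmetic, assume a hypothetical cluster in which 10 nodes
take part in the query. The node count and the 2 GB value are illustrative only, not
recommendations.
</p>
<pre class="pre codeblock"><code>
-- Per-node cap for queries in this session (illustrative value).
SET MEM_LIMIT=2gb;
-- Upper bound across the hypothetical cluster: 2 GB per node x 10 nodes = 20 GB.
</code></pre>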
<p class="p">
There are two levels of memory limit for Impala. The
<code class="ph codeph">‑‑mem_limit</code> startup option sets an overall limit for the
<span class="keyword cmdname">impalad</span> process (which handles multiple queries concurrently). That
process memory limit can be expressed either as a percentage of RAM available to the
process such as <code class="ph codeph">‑‑mem_limit=70%</code> or as a fixed amount of
memory, such as <code class="ph codeph">100gb</code>. The memory available to the process is based on
the host's physical memory and, since Impala 3.2, memory limits from Linux Control Groups.
For example, if an <span class="keyword cmdname">impalad</span> process is running in a Docker container on a host
with 100 GB of memory, the memory available is 100 GB or the Docker container's memory
limit, whichever is less.
</p>
<p class="p">
The <code class="ph codeph">MEM_LIMIT</code> query option, which you set through
<span class="keyword cmdname">impala-shell</span> or the <code class="ph codeph">SET</code> statement in a JDBC or ODBC
application, applies to each individual query. The <code class="ph codeph">MEM_LIMIT</code> query option
is usually expressed as a fixed size such as <code class="ph codeph">10gb</code>, and must always be
less than the <span class="keyword cmdname">impalad</span> memory limit.
</p>
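<p class="p">
A minimal sketch of setting and then clearing the per-query limit with the
<code class="ph codeph">SET</code> statement. The <code class="ph codeph">10gb</code> value is only an
example, and setting the option back to 0 relies on the documented default of 0 (unlimited).
</p>
<pre class="pre codeblock"><code>
-- Limit each subsequent query in this session to 10 GB per node.
SET MEM_LIMIT=10gb;
-- Return to the default of 0, which means no per-query limit.
SET MEM_LIMIT=0;
</code></pre>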
<p class="p">
If query processing approaches the specified memory limit on any node, either the
per-query limit or the impalad limit, then the SQL operations will start to reduce
their memory consumption, for example by writing the temporary data to disk (known as spilling to disk).
The result is a query that completes successfully, rather than failing with an out-of-memory error.
The tradeoff is decreased performance due to the extra disk I/O to write the temporary data and
read it back in. The slowdown could potentially be significant. Thus, while this feature improves
reliability, you should optimize your queries, system parameters, and hardware configuration to
make this spilling a rare occurrence.
</p>
<p class="p">
<strong class="ph b">Type:</strong> numeric
</p>
<p class="p">
<strong class="ph b">Units:</strong> A numeric argument represents memory size in bytes; you can also use a
suffix of <code class="ph codeph">m</code> or <code class="ph codeph">mb</code> for megabytes, or more commonly
<code class="ph codeph">g</code> or <code class="ph codeph">gb</code> for gigabytes. If you specify a value in an
unrecognized format, subsequent queries fail with an error.
</p>
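<p class="p">
The following sketch groups settings that should be equivalent, assuming the suffixes denote
binary multiples (1 MB = 1024 * 1024 bytes, 1 GB = 1024 * 1024 * 1024 bytes). That
interpretation is an assumption that this topic does not state explicitly. Under it, a round
decimal value such as 3000000000 is slightly less than <code class="ph codeph">3g</code>.
</p>
<pre class="pre codeblock"><code>
-- Three ways to request 3 GB, assuming 1 GB = 1024 * 1024 * 1024 bytes.
SET MEM_LIMIT=3221225472;
SET MEM_LIMIT=3g;
SET MEM_LIMIT=3gb;
-- Three ways to request 3 MB, assuming 1 MB = 1024 * 1024 bytes.
SET MEM_LIMIT=3145728;
SET MEM_LIMIT=3m;
SET MEM_LIMIT=3mb;
</code></pre>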
<p class="p">
<strong class="ph b">Default:</strong> 0 (unlimited)
</p>
<p class="p">
<strong class="ph b">Usage notes:</strong>
</p>
<p class="p">
The <code class="ph codeph">MEM_LIMIT</code> setting is primarily useful for production workloads.
Impala's Admission Controller can be configured to automatically assign memory limits to
queries and limit memory consumption of resource pools. See <a class="xref" href="impala_admission.html#admission_concurrency">Concurrent Queries and Admission Control</a>
and <a class="xref" href="impala_admission.html#admission_memory">Memory Limits and Admission Control</a> for more information on configuring
the resource usage through admission control.
</p>
<p class="p">
Use the output of the <code class="ph codeph">SUMMARY</code> command in <span class="keyword cmdname">impala-shell</span>
to get a report of memory used for each phase of your most heavyweight queries on each
node, and then set a <code class="ph codeph">MEM_LIMIT</code> somewhat higher than that. See
<a class="xref" href="impala_explain_plan.html#perf_summary">Using the SUMMARY Report for Performance Tuning</a> for usage information about the
<code class="ph codeph">SUMMARY</code> command.
</p>
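<p class="p">
The tuning workflow described above might look like the following sketch in
<span class="keyword cmdname">impala-shell</span>. The query and the <code class="ph codeph">100mb</code>
value are placeholders. Choose the actual limit from the <code class="ph codeph">Peak Mem</code> figures
that <code class="ph codeph">SUMMARY</code> reports for your own workload. The comment lines are
annotations for the reader.
</p>
<pre class="pre codeblock"><code>
-- Step 1: run the heavyweight query with no per-query limit.
SET MEM_LIMIT=0;
SELECT COUNT(DISTINCT c_name) FROM customer;
-- Step 2: review the Peak Mem column for each operator.
SUMMARY;
-- Step 3: set a limit somewhat higher than the observed usage (placeholder value).
SET MEM_LIMIT=100mb;
</code></pre>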
<p class="p">
<strong class="ph b">Examples:</strong>
</p>
<p class="p">
The following examples show how to set the <code class="ph codeph">MEM_LIMIT</code> query option using a
fixed number of bytes, or suffixes representing gigabytes or megabytes.
</p>
<pre class="pre codeblock"><code>
[localhost:21000] &gt; set mem_limit=3000000000;
MEM_LIMIT set to 3000000000
[localhost:21000] &gt; select 5;
Query: select 5
+---+
| 5 |
+---+
| 5 |
+---+
[localhost:21000] &gt; set mem_limit=3g;
MEM_LIMIT set to 3g
[localhost:21000] &gt; select 5;
Query: select 5
+---+
| 5 |
+---+
| 5 |
+---+
[localhost:21000] &gt; set mem_limit=3gb;
MEM_LIMIT set to 3gb
[localhost:21000] &gt; select 5;
+---+
| 5 |
+---+
| 5 |
+---+
[localhost:21000] &gt; set mem_limit=3m;
MEM_LIMIT set to 3m
[localhost:21000] &gt; select 5;
+---+
| 5 |
+---+
| 5 |
+---+
[localhost:21000] &gt; set mem_limit=3mb;
MEM_LIMIT set to 3mb
[localhost:21000] &gt; select 5;
+---+
| 5 |
+---+
| 5 |
+---+
</code></pre>
<p class="p">
The following examples show how unrecognized <code class="ph codeph">MEM_LIMIT</code> values lead to
errors for subsequent queries.
</p>
<pre class="pre codeblock"><code>
[localhost:21000] &gt; set mem_limit=3pb;
MEM_LIMIT set to 3pb
[localhost:21000] &gt; select 5;
ERROR: Failed to parse query memory limit from '3pb'.
[localhost:21000] &gt; set mem_limit=xyz;
MEM_LIMIT set to xyz
[localhost:21000] &gt; select 5;
Query: select 5
ERROR: Failed to parse query memory limit from 'xyz'.
</code></pre>
<p class="p">
The following example shows a query being rejected when the
<code class="ph codeph">MEM_LIMIT</code> value is too low for the memory that the query needs on any host involved in the Impala query.
First it runs a successful query and checks the largest amount of memory used on any node
for any stage of the query. Then it sets an artificially low <code class="ph codeph">MEM_LIMIT</code>
setting so that the same query cannot run.
<pre class="pre codeblock"><code>
[localhost:21000] &gt; select count(*) from customer;
Query: select count(*) from customer
+----------+
| count(*) |
+----------+
| 150000 |
+----------+
[localhost:21000] &gt; select count(distinct c_name) from customer;
Query: select count(distinct c_name) from customer
+------------------------+
| count(distinct c_name) |
+------------------------+
| 150000 |
+------------------------+
[localhost:21000] &gt; summary;
+--------------+--------+--------+----------+----------+---------+------------+----------+---------------+---------------+
| Operator | #Hosts | #Inst | Avg Time | Max Time | #Rows | Est. #Rows | Peak Mem | Est. Peak Mem | Detail |
+--------------+--------+--------+----------+----------+---------+------------+----------+---------------+---------------+
| 06:AGGREGATE | 1 | 1 | 230.00ms | 230.00ms | 1 | 1 | 16.00 KB | -1 B | FINALIZE |
| 05:EXCHANGE | 1 | 1 | 43.44us | 43.44us | 1 | 1 | 0 B | -1 B | UNPARTITIONED |
| 02:AGGREGATE | 1 | 1 | 227.14ms | 227.14ms | 1 | 1 | 12.00 KB | 10.00 MB | |
| 04:AGGREGATE | 1 | 1 | 126.27ms | 126.27ms | 150.00K | 150.00K | 15.17 MB | 10.00 MB | |
| 03:EXCHANGE | 1 | 1 | 44.07ms | 44.07ms | 150.00K | 150.00K | 0 B | 0 B | HASH(c_name) |
<strong class="ph b">| 01:AGGREGATE | 1 | 1 | 361.94ms | 361.94ms | 150.00K | 150.00K | 23.04 MB | 10.00 MB | |</strong>
| 00:SCAN HDFS | 1 | 1 | 43.64ms | 43.64ms | 150.00K | 150.00K | 24.19 MB | 64.00 MB | tpch.customer |
+--------------+--------+--------+----------+----------+---------+------------+----------+---------------+---------------+
[localhost:21000] &gt; set mem_limit=15mb;
MEM_LIMIT set to 15mb
[localhost:21000] &gt; select count(distinct c_name) from customer;
Query: select count(distinct c_name) from customer
ERROR:
Rejected query from pool default-pool: minimum memory reservation is greater than memory available to the query
for buffer reservations. Memory reservation needed given the current plan: 38.00 MB. Adjust either the mem_limit
or the pool config (max-query-mem-limit, min-query-mem-limit) for the query to allow the query memory limit to be
at least 70.00 MB. Note that changing the mem_limit may also change the plan. See the query profile for more
information about the per-node memory requirements.</code></pre>
</div>
<div class="related-links">
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_query_options.html">Query Options for the SET Statement</a></div>
</div>
</div><div class="topic concept nested1" aria-labelledby="ariaid-title2" id="mem_limit_executors">
<h2 class="title topictitle2" id="ariaid-title2">MEM_LIMIT_EXECUTORS Query Option</h2>
<div class="body conbody">
<div class="note note"><span class="notetitle">Note:</span> This is an advanced query option. Setting this query option is not recommended
unless specifically advised.</div>
<p class="p">The existing <code class="ph codeph">MEM_LIMIT</code> query option applies to all Impala coordinators and
executors. As a result, the same amount of memory is reserved on every node, even though
coordinators typically only coordinate the query and do not necessarily need all of the
estimated memory. Memory reserved on coordinators in this way is unavailable to other
queries.</p>
<p class="p">The <code class="ph codeph">MEM_LIMIT_EXECUTORS</code> query option functions similarly to the
<code class="ph codeph">MEM_LIMIT</code> option but sets the query memory limit only on executors. It
addresses this limitation of <code class="ph codeph">MEM_LIMIT</code> and is recommended in
scenarios where the query needs much more memory on executors than on
coordinators.</p>
<p class="p">Note that the <code class="ph codeph">MEM_LIMIT_EXECUTORS</code> option cannot be combined with
<code class="ph codeph">MEM_LIMIT</code>. If you set both, only <code class="ph codeph">MEM_LIMIT</code> applies.</p>
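<p class="p">A minimal sketch, assuming that <code class="ph codeph">MEM_LIMIT_EXECUTORS</code> accepts the same
size formats as <code class="ph codeph">MEM_LIMIT</code> and that <code class="ph codeph">MEM_LIMIT</code> has not
been set in the session. The <code class="ph codeph">10gb</code> value is illustrative.</p>
<pre class="pre codeblock"><code>
-- Assumes MEM_LIMIT is unset in this session; otherwise only MEM_LIMIT applies.
SET MEM_LIMIT_EXECUTORS=10gb;
</code></pre>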
</div>
</div>
<div class="topic concept nested1" aria-labelledby="ariaid-title3" id="mem_limit_coordinators">
<h2 class="title topictitle2" id="ariaid-title3">MEM_LIMIT_COORDINATORS Query Option</h2>
<div class="body conbody">
<div class="note note"><span class="notetitle">Note:</span> This is an advanced query option. Setting this query option is not recommended
unless specifically advised.</div>
<p class="p">The existing <code class="ph codeph">MEM_LIMIT</code> query option applies to all Impala coordinators and
executors. As a result, the same amount of memory is reserved on every node, even though
coordinators typically only coordinate the query and do not necessarily need all of the
estimated memory. Memory reserved on coordinators in this way is unavailable to other
queries.</p>
<p class="p">The <code class="ph codeph">MEM_LIMIT_COORDINATORS</code> query option functions similarly to the
<code class="ph codeph">MEM_LIMIT</code> option but sets the query memory limit only on coordinators. It
addresses this limitation of <code class="ph codeph">MEM_LIMIT</code> and is recommended in
scenarios where the query needs more or less memory on coordinators than the planner
estimates.</p>
<p class="p">Note that the <code class="ph codeph">MEM_LIMIT_COORDINATORS</code> option cannot be combined with
<code class="ph codeph">MEM_LIMIT</code>. If you set both, only <code class="ph codeph">MEM_LIMIT</code> applies.</p>
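<p class="p">A minimal sketch, assuming that <code class="ph codeph">MEM_LIMIT_COORDINATORS</code> accepts the same
size formats as <code class="ph codeph">MEM_LIMIT</code> and that <code class="ph codeph">MEM_LIMIT</code> has not
been set in the session. The <code class="ph codeph">1gb</code> value is illustrative.</p>
<pre class="pre codeblock"><code>
-- Assumes MEM_LIMIT is unset in this session; otherwise only MEM_LIMIT applies.
SET MEM_LIMIT_COORDINATORS=1gb;
</code></pre>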
</div>
</div>
</body>
</html>