| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE html |
| PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> |
| |
| <meta name="copyright" content="(C) Copyright 2023" /> |
| <meta name="DC.rights.owner" content="(C) Copyright 2023" /> |
| <meta name="DC.Type" content="concept" /> |
| <meta name="DC.Title" content="EXPLAIN_LEVEL Query Option" /> |
| <meta name="DC.Relation" scheme="URI" content="../topics/impala_query_options.html" /> |
| <meta name="DC.Relation" scheme="URI" content="../topics/impala_max_num_runtime_filters.html" /> |
| <meta name="prodname" content="Impala" /> |
| <meta name="version" content="Impala 3.4.x" /> |
| <meta name="DC.Format" content="XHTML" /> |
| <meta name="DC.Identifier" content="explain_level" /> |
| <link rel="stylesheet" type="text/css" href="../commonltr.css" /> |
| <title>EXPLAIN_LEVEL Query Option</title> |
| </head> |
| <body id="explain_level"> |
| |
| |
| <h1 class="title topictitle1" id="ariaid-title1">EXPLAIN_LEVEL Query Option</h1> |
| |
| |
| |
| |
| <div class="body conbody"> |
| |
| <p class="p"> Controls the amount of detail provided in the output of the |
| <code class="ph codeph">EXPLAIN</code> statement. The basic output can help you |
| identify high-level performance issues such as scanning a higher volume of |
| data or more partitions than you expect. The higher levels of detail show |
| how intermediate results flow between nodes and how different SQL |
| operations such as <code class="ph codeph">ORDER BY</code>, <code class="ph codeph">GROUP BY</code>, |
| joins, and <code class="ph codeph">WHERE</code> clauses are implemented within a |
| distributed query. </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Type:</strong> <code class="ph codeph">STRING</code> or <code class="ph codeph">INT</code> |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Default:</strong> <code class="ph codeph">1</code> |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Arguments:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| The allowed range of numeric values for this option is 0 to 3: |
| </p> |
| |
| |
| <ul class="ul"> |
| <li class="li"> |
| <code class="ph codeph">0</code> or <code class="ph codeph">MINIMAL</code>: A barebones list, one line per operation. Primarily useful |
| for checking the join order in very long queries where the regular <code class="ph codeph">EXPLAIN</code> output is too |
| long to read easily. |
| </li> |
| |
| |
| <li class="li"> |
| <code class="ph codeph">1</code> or <code class="ph codeph">STANDARD</code>: The default level of detail, showing the logical way that |
| work is split up for the distributed query. |
| </li> |
| |
| |
| <li class="li"> |
<code class="ph codeph">2</code> or <code class="ph codeph">EXTENDED</code>: Includes additional |
| detail about how the query planner uses statistics in its |
| decision-making process. Use this detail to understand how a query could be tuned by |
| gathering statistics, using query hints, adding or removing predicates, |
| and so on. In <span class="keyword">Impala 3.2</span> and higher, the output |
| header also includes the analyzed query with its cast information, and the |
| Predicate section shows the implicit cast information.</li> |
| |
| |
| <li class="li"> |
| <code class="ph codeph">3</code> or <code class="ph codeph">VERBOSE</code>: The maximum level of detail, showing how work is split up |
| within each node into <span class="q">"query fragments"</span> that are connected in a pipeline. This extra detail is |
| primarily useful for low-level performance testing and tuning within Impala itself, rather than for |
| rewriting the SQL code at the user level. |
| </li> |
| |
| </ul> |
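
<p class="p">
As a brief sketch of the two argument forms listed above, either of the following statements should
select the same level of detail, because <code class="ph codeph">2</code> and <code class="ph codeph">EXTENDED</code>
name the same level. The lowercase spelling of the mnemonic is an assumption about case-insensitive matching:
</p>

<pre class="pre codeblock"><code>[localhost:21000] > set explain_level=2;
[localhost:21000] > set explain_level=extended;
</code></pre>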
| |
| |
| <div class="note note"><span class="notetitle">Note:</span> |
| Prior to Impala 1.3, the allowed argument range for <code class="ph codeph">EXPLAIN_LEVEL</code> was 0 to 1: level 0 had |
| the mnemonic <code class="ph codeph">NORMAL</code>, and level 1 was <code class="ph codeph">VERBOSE</code>. In Impala 1.3 and higher, |
| <code class="ph codeph">NORMAL</code> is not a valid mnemonic value, and <code class="ph codeph">VERBOSE</code> still applies to the |
| highest level of detail but now corresponds to level 3. You might need to adjust the values if you have any |
| older <code class="ph codeph">impala-shell</code> script files that set the <code class="ph codeph">EXPLAIN_LEVEL</code> query option. |
| </div> |
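
<p class="p">
One way to update such a script is sketched here, assuming the older script used the numeric value
<code class="ph codeph">1</code> to request the most detailed output. The script fragment and its comments are
hypothetical. The mapping for the old <code class="ph codeph">NORMAL</code> level is not covered, because this topic does not state it:
</p>

<pre class="pre codeblock"><code>-- Hypothetical fragment of an impala-shell script file.
-- Before Impala 1.3, level 1 was the most detailed level (VERBOSE):
--   set explain_level=1;
-- In Impala 1.3 and higher, the most detailed level is 3, also named VERBOSE:
set explain_level=verbose;
</code></pre>

<p class="p">
Using the mnemonic rather than the number keeps the intent of the script readable if the numeric range changes again.
</p>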
| |
| |
| <p class="p"> |
| The extended information from level 2 or 3 is especially useful during performance tuning, when you need to |
| confirm that the work for the query is distributed the way you expect. This check matters most for |
| resource-intensive operations such as join queries against large tables, queries against tables with large |
| numbers of partitions, and insert operations for Parquet tables. The extended information also helps you |
| check estimated resource usage when you use the admission control or resource management features explained |
| in <a class="xref" href="impala_resource_management.html#resource_management">Resource Management</a>. See |
| <a class="xref" href="impala_explain.html#explain">EXPLAIN Statement</a> for the syntax of the <code class="ph codeph">EXPLAIN</code> statement, and |
| <a class="xref" href="impala_explain_plan.html#perf_explain">Using the EXPLAIN Plan for Performance Tuning</a> for details about how to use the extended information. |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Usage notes:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| Read the <code class="ph codeph">EXPLAIN</code> output from bottom to top. The lowest lines represent the |
| initial work of the query (scanning data files), the lines in the middle represent calculations done on each |
| node and how intermediate results are transmitted from one node to another, and the topmost lines represent |
| the final results being sent back to the coordinator node. |
| </p> |
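
<p class="p">
As a rough illustration of this reading order, the following sketch repeats the plan for the
<code class="ph codeph">count(*)</code> query from the examples later in this topic, without the table border.
The numbered notes on the right are added here to suggest one reading. They are not part of the output that Impala produces:
</p>

<pre class="pre codeblock"><code>03:AGGREGATE [MERGE FINALIZE]            (4) read last: the partial counts are combined
|  output: sum(count(*))
|
02:EXCHANGE [PARTITION=UNPARTITIONED]    (3) intermediate results are transmitted between nodes
|
01:AGGREGATE                             (2) each node computes a count for the data it scanned
|  output: count(*)
|
00:SCAN HDFS [explain_plan.t1]           (1) read first: the data files are scanned
   partitions=1/1 size=0B
</code></pre>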
| |
| |
| <p class="p"> |
| The numbers in the left column are generated internally during the initial planning phase and do not |
| represent the actual order of operations, so it is not significant if they appear out of order in the |
| <code class="ph codeph">EXPLAIN</code> output. |
| </p> |
| |
| |
| <p class="p"> |
| At all <code class="ph codeph">EXPLAIN</code> levels, the plan contains a warning if any tables in the query are missing |
| statistics. Use the <code class="ph codeph">COMPUTE STATS</code> statement to gather statistics for each table and suppress |
| this warning. See <a class="xref" href="impala_perf_stats.html#perf_stats">Table and Column Statistics</a> for details about how the statistics help |
| query performance. |
| </p> |
| |
| |
| <p class="p"> |
| The <code class="ph codeph">PROFILE</code> command in <span class="keyword cmdname">impala-shell</span> always starts with an explain plan |
| showing full detail, the same as with <code class="ph codeph">EXPLAIN_LEVEL=3</code>. <span class="ph">After the explain |
| plan comes the executive summary, the same output as produced by the <code class="ph codeph">SUMMARY</code> command in |
| <span class="keyword cmdname">impala-shell</span>.</span> |
| </p> |
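
<p class="p">
A minimal <span class="keyword cmdname">impala-shell</span> sketch of that sequence follows, assuming a table such as
the <code class="ph codeph">t1</code> table created in the examples in this topic. The output is omitted here because it
varies by cluster and release:
</p>

<pre class="pre codeblock"><code>[localhost:21000] > select count(*) from t1;
[localhost:21000] > summary;
[localhost:21000] > profile;
</code></pre>

<p class="p">
Both commands report on the most recent query run in the session, so run them immediately after the query you want to examine.
</p>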
| |
| |
| <p class="p"> |
| <strong class="ph b">Examples:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| These examples use a trivial, empty table to illustrate how the essential aspects of query planning are shown |
| in <code class="ph codeph">EXPLAIN</code> output: |
| </p> |
| |
| |
| <pre class="pre codeblock"><code>[localhost:21000] > create table t1 (x int, s string); |
| [localhost:21000] > set explain_level=1; |
| [localhost:21000] > explain select count(*) from t1; |
| +------------------------------------------------------------------------+ |
| | Explain String | |
| +------------------------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=10.00MB VCores=1 | |
| | WARNING: The following tables are missing relevant table and/or column | |
| | statistics. | |
| | explain_plan.t1 | |
| | | |
| | 03:AGGREGATE [MERGE FINALIZE] | |
| | | output: sum(count(*)) | |
| | | | |
| | 02:EXCHANGE [PARTITION=UNPARTITIONED] | |
| | | | |
| | 01:AGGREGATE | |
| | | output: count(*) | |
| | | | |
| | 00:SCAN HDFS [explain_plan.t1] | |
| | partitions=1/1 size=0B | |
| +------------------------------------------------------------------------+ |
| [localhost:21000] > explain select * from t1; |
| +------------------------------------------------------------------------+ |
| | Explain String | |
| +------------------------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=-9223372036854775808B VCores=0 | |
| | WARNING: The following tables are missing relevant table and/or column | |
| | statistics. | |
| | explain_plan.t1 | |
| | | |
| | 01:EXCHANGE [PARTITION=UNPARTITIONED] | |
| | | | |
| | 00:SCAN HDFS [explain_plan.t1] | |
| | partitions=1/1 size=0B | |
| +------------------------------------------------------------------------+ |
| [localhost:21000] > set explain_level=2; |
| [localhost:21000] > explain select * from t1; |
| +------------------------------------------------------------------------+ |
| | Explain String | |
| +------------------------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=-9223372036854775808B VCores=0 | |
| | WARNING: The following tables are missing relevant table and/or column | |
| | statistics. | |
| | explain_plan.t1 | |
| | | |
| | 01:EXCHANGE [PARTITION=UNPARTITIONED] | |
| | | hosts=0 per-host-mem=unavailable | |
| | | tuple-ids=0 row-size=19B cardinality=unavailable | |
| | | | |
| | 00:SCAN HDFS [explain_plan.t1, PARTITION=RANDOM] | |
| | partitions=1/1 size=0B | |
| | table stats: unavailable | |
| | column stats: unavailable | |
| | hosts=0 per-host-mem=0B | |
| | tuple-ids=0 row-size=19B cardinality=unavailable | |
| +------------------------------------------------------------------------+ |
| [localhost:21000] > set explain_level=3; |
| [localhost:21000] > explain select * from t1; |
| +------------------------------------------------------------------------+ |
| | Explain String | |
| +------------------------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=-9223372036854775808B VCores=0 | |
| <strong class="ph b">| WARNING: The following tables are missing relevant table and/or column |</strong> |
| <strong class="ph b">| statistics. |</strong> |
| <strong class="ph b">| explain_plan.t1 |</strong> |
| | | |
| | F01:PLAN FRAGMENT [PARTITION=UNPARTITIONED] | |
| | 01:EXCHANGE [PARTITION=UNPARTITIONED] | |
| | hosts=0 per-host-mem=unavailable | |
| | tuple-ids=0 row-size=19B cardinality=unavailable | |
| | | |
| | F00:PLAN FRAGMENT [PARTITION=RANDOM] | |
| | DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=01, PARTITION=UNPARTITIONED] | |
| | 00:SCAN HDFS [explain_plan.t1, PARTITION=RANDOM] | |
| | partitions=1/1 size=0B | |
| <strong class="ph b">| table stats: unavailable |</strong> |
| <strong class="ph b">| column stats: unavailable |</strong> |
| | hosts=0 per-host-mem=0B | |
| | tuple-ids=0 row-size=19B cardinality=unavailable | |
| +------------------------------------------------------------------------+ |
| </code></pre> |
| |
| <p class="p"> |
| As the warning message indicates, Impala needs table and column statistics to plan queries efficiently, |
| and you need them to understand the performance characteristics of a query. To gather them, run the |
<code class="ph codeph">COMPUTE STATS</code> statement for the table. With <code class="ph codeph">EXPLAIN_LEVEL</code> still set to 3, |
| the plan then reports the statistics instead of the warning: |
| </p> |
| |
| |
| <pre class="pre codeblock"><code>[localhost:21000] > compute stats t1; |
| +-----------------------------------------+ |
| | summary | |
| +-----------------------------------------+ |
| | Updated 1 partition(s) and 2 column(s). | |
| +-----------------------------------------+ |
| [localhost:21000] > explain select * from t1; |
| +------------------------------------------------------------------------+ |
| | Explain String | |
| +------------------------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=-9223372036854775808B VCores=0 | |
| | | |
| | F01:PLAN FRAGMENT [PARTITION=UNPARTITIONED] | |
| | 01:EXCHANGE [PARTITION=UNPARTITIONED] | |
| | hosts=0 per-host-mem=unavailable | |
| | tuple-ids=0 row-size=20B cardinality=0 | |
| | | |
| | F00:PLAN FRAGMENT [PARTITION=RANDOM] | |
| | DATASTREAM SINK [FRAGMENT=F01, EXCHANGE=01, PARTITION=UNPARTITIONED] | |
| | 00:SCAN HDFS [explain_plan.t1, PARTITION=RANDOM] | |
| | partitions=1/1 size=0B | |
| <strong class="ph b">| table stats: 0 rows total |</strong> |
| <strong class="ph b">| column stats: all |</strong> |
| | hosts=0 per-host-mem=0B | |
| | tuple-ids=0 row-size=20B cardinality=0 | |
| +------------------------------------------------------------------------+ |
| </code></pre> |
| |
| <p class="p"> |
| Joins and other complicated, multi-part queries are the ones where you most commonly need to examine the |
| <code class="ph codeph">EXPLAIN</code> output and customize the amount of detail in the output. This example shows the |
| default <code class="ph codeph">EXPLAIN</code> output for a three-way join query, then the equivalent output with a |
| <code class="ph codeph">[SHUFFLE]</code> hint to change the join mechanism between the first two tables from a broadcast |
| join to a shuffle join. |
| </p> |
| |
| |
| <pre class="pre codeblock"><code>[localhost:21000] > set explain_level=1; |
| [localhost:21000] > explain select one.*, two.*, three.* from t1 one, t1 two, t1 three where one.x = two.x and two.x = three.x; |
| +---------------------------------------------------------+ |
| | Explain String | |
| +---------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=4.00GB VCores=3 | |
| | | |
| | 07:EXCHANGE [PARTITION=UNPARTITIONED] | |
| | | | |
| <strong class="ph b">| 04:HASH JOIN [INNER JOIN, BROADCAST] |</strong> |
| | | hash predicates: two.x = three.x | |
| | | | |
| <strong class="ph b">| |--06:EXCHANGE [BROADCAST] |</strong> |
| | | | | |
| | | 02:SCAN HDFS [explain_plan.t1 three] | |
| | | partitions=1/1 size=0B | |
| | | | |
| <strong class="ph b">| 03:HASH JOIN [INNER JOIN, BROADCAST] |</strong> |
| | | hash predicates: one.x = two.x | |
| | | | |
| <strong class="ph b">| |--05:EXCHANGE [BROADCAST] |</strong> |
| | | | | |
| | | 01:SCAN HDFS [explain_plan.t1 two] | |
| | | partitions=1/1 size=0B | |
| | | | |
| | 00:SCAN HDFS [explain_plan.t1 one] | |
| | partitions=1/1 size=0B | |
| +---------------------------------------------------------+ |
| [localhost:21000] > explain select one.*, two.*, three.* |
| > from t1 one join [shuffle] t1 two join t1 three |
| > where one.x = two.x and two.x = three.x; |
| +---------------------------------------------------------+ |
| | Explain String | |
| +---------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=4.00GB VCores=3 | |
| | | |
| | 08:EXCHANGE [PARTITION=UNPARTITIONED] | |
| | | | |
| <strong class="ph b">| 04:HASH JOIN [INNER JOIN, BROADCAST] |</strong> |
| | | hash predicates: two.x = three.x | |
| | | | |
| <strong class="ph b">| |--07:EXCHANGE [BROADCAST] |</strong> |
| | | | | |
| | | 02:SCAN HDFS [explain_plan.t1 three] | |
| | | partitions=1/1 size=0B | |
| | | | |
| <strong class="ph b">| 03:HASH JOIN [INNER JOIN, PARTITIONED] |</strong> |
| | | hash predicates: one.x = two.x | |
| | | | |
| <strong class="ph b">| |--06:EXCHANGE [PARTITION=HASH(two.x)] |</strong> |
| | | | | |
| | | 01:SCAN HDFS [explain_plan.t1 two] | |
| | | partitions=1/1 size=0B | |
| | | | |
| <strong class="ph b">| 05:EXCHANGE [PARTITION=HASH(one.x)] |</strong> |
| | | | |
| | 00:SCAN HDFS [explain_plan.t1 one] | |
| | partitions=1/1 size=0B | |
| +---------------------------------------------------------+ |
| </code></pre> |
| |
| <p class="p"> |
| For a join involving many different tables, the default <code class="ph codeph">EXPLAIN</code> output might stretch over |
| several pages, and the only details you care about might be the join order and the mechanism (broadcast or |
| shuffle) for joining each pair of tables. In that case, you might set <code class="ph codeph">EXPLAIN_LEVEL</code> to its |
| lowest value of 0, to focus on just the join order and join mechanism for each stage. The following example |
| shows how the rows from the first and second joined tables are hashed and divided among the nodes of the |
| cluster for further filtering; then the entire contents of the third table are broadcast to all nodes for the |
| final stage of join processing. |
| </p> |
| |
| |
| <pre class="pre codeblock"><code>[localhost:21000] > set explain_level=0; |
| [localhost:21000] > explain select one.*, two.*, three.* |
| > from t1 one join [shuffle] t1 two join t1 three |
| > where one.x = two.x and two.x = three.x; |
| +---------------------------------------------------------+ |
| | Explain String | |
| +---------------------------------------------------------+ |
| | Estimated Per-Host Requirements: Memory=4.00GB VCores=3 | |
| | | |
| | 08:EXCHANGE [PARTITION=UNPARTITIONED] | |
| <strong class="ph b">| 04:HASH JOIN [INNER JOIN, BROADCAST] |</strong> |
| <strong class="ph b">| |--07:EXCHANGE [BROADCAST] |</strong> |
| | | 02:SCAN HDFS [explain_plan.t1 three] | |
| <strong class="ph b">| 03:HASH JOIN [INNER JOIN, PARTITIONED] |</strong> |
| <strong class="ph b">| |--06:EXCHANGE [PARTITION=HASH(two.x)] |</strong> |
| | | 01:SCAN HDFS [explain_plan.t1 two] | |
| <strong class="ph b">| 05:EXCHANGE [PARTITION=HASH(one.x)] |</strong> |
| | 00:SCAN HDFS [explain_plan.t1 one] | |
| +---------------------------------------------------------+ |
| </code></pre> |
| |
| |
| |
| </div> |
| |
| <div class="related-links"> |
| <ul class="ullinks"> |
| <li class="link ulchildlink"><strong><a href="../topics/impala_max_num_runtime_filters.html">MAX_NUM_RUNTIME_FILTERS Query Option (Impala 2.5 or higher only)</a></strong><br /> |
| </li> |
| </ul> |
| |
| <div class="familylinks"> |
| <div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_query_options.html">Query Options for the SET Statement</a></div> |
| </div> |
| </div></body> |
| </html> |