blob: 1caadcc29141637e521334e6e0cc224adfdc2670 [file] [log] [blame] [view]
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
# EXPLAIN
The `EXPLAIN` command shows the logical and physical execution plan for the specified SQL statement.
## Syntax
<pre>
EXPLAIN [ANALYZE] [VERBOSE] [FORMAT format] statement
</pre>
## `EXPLAIN`
Shows the execution plan of a statement.
If you need more detailed output, use `EXPLAIN VERBOSE`.
Note that `EXPLAIN VERBOSE` only supports the `indent` format.
The optional `[FORMAT format]` clause controls how the plan is displayed as
explained below. If this clause is not specified, the plan is displayed using
the format from the [configuration value] `datafusion.explain.format`.
### `tree` format (default)
The `tree` format is modeled after [DuckDB plans] and is designed to be easier
to see the high level structure of the plan
[duckdb plans]: https://duckdb.org/docs/stable/guides/meta/explain.html
```sql
> EXPLAIN FORMAT TREE SELECT SUM(x) FROM t GROUP BY b;
+---------------+-------------------------------+
| plan_type | plan |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
| | │ ProjectionExec │ |
| | │ -------------------- │ |
| | │ sum(t.x): sum(t.x)@1 │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ AggregateExec │ |
| | │ -------------------- │ |
| | │ aggr: sum(t.x) │ |
| | │ group_by: b@0 as b │ |
| | │ │ |
| | │ mode: │ |
| | │ FinalPartitioned │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ CoalesceBatchesExec │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ RepartitionExec │ |
| | │ -------------------- │ |
| | │ input_partition_count: │ |
| | │ 16 │ |
| | │ │ |
| | │ partitioning_scheme: │ |
| | │ Hash([b@0], 16) │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ RepartitionExec │ |
| | │ -------------------- │ |
| | │ input_partition_count: │ |
| | │ 1 │ |
| | │ │ |
| | │ partitioning_scheme: │ |
| | │ RoundRobinBatch(16) │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ AggregateExec │ |
| | │ -------------------- │ |
| | │ aggr: sum(t.x) │ |
| | │ group_by: b@1 as b │ |
| | │ mode: Partial │ |
| | └─────────────┬─────────────┘ |
| | ┌─────────────┴─────────────┐ |
| | │ DataSourceExec │ |
| | │ -------------------- │ |
| | │ bytes: 224 │ |
| | │ format: memory │ |
| | │ rows: 1 │ |
| | └───────────────────────────┘ |
| | |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.016 seconds.
```
### `indent` format
The `indent` format shows both the logical and physical plan, with one line for
each operator in the plan. Child plans are indented to show the hierarchy.
See [Reading Explain Plans](../explain-usage.md) for more information on how to interpret these plans.
```sql
> CREATE TABLE t(x int, b int) AS VALUES (1, 2), (2, 3);
0 row(s) fetched.
Elapsed 0.004 seconds.
> EXPLAIN FORMAT INDENT SELECT SUM(x) FROM t GROUP BY b;
+---------------+-------------------------------------------------------------------------------+
| plan_type | plan |
+---------------+-------------------------------------------------------------------------------+
| logical_plan | Projection: sum(t.x) |
| | Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]] |
| | TableScan: t projection=[x, b] |
| physical_plan | ProjectionExec: expr=[sum(t.x)@1 as sum(t.x)] |
| | AggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[sum(t.x)] |
| | CoalesceBatchesExec: target_batch_size=8192 |
| | RepartitionExec: partitioning=Hash([b@0], 16), input_partitions=16 |
| | RepartitionExec: partitioning=RoundRobinBatch(16), input_partitions=1 |
| | AggregateExec: mode=Partial, gby=[b@1 as b], aggr=[sum(t.x)] |
| | DataSourceExec: partitions=1, partition_sizes=[1] |
| | |
+---------------+-------------------------------------------------------------------------------+
2 row(s) fetched.
Elapsed 0.004 seconds.
```
### `pgjson` format
The `pgjson` format is modeled after [Postgres JSON] format.
You can use this format to visualize the plan in existing plan visualization
tools, such as [dalibo]
[postgres json]: https://www.postgresql.org/docs/current/sql-explain.html
[dalibo]: https://explain.dalibo.com/
```sql
> EXPLAIN FORMAT PGJSON SELECT SUM(x) FROM t GROUP BY b;
+--------------+----------------------------------------------------+
| plan_type | plan |
+--------------+----------------------------------------------------+
| logical_plan | [ |
| | { |
| | "Plan": { |
| | "Expressions": [ |
| | "sum(t.x)" |
| | ], |
| | "Node Type": "Projection", |
| | "Output": [ |
| | "sum(t.x)" |
| | ], |
| | "Plans": [ |
| | { |
| | "Aggregates": "sum(CAST(t.x AS Int64))", |
| | "Group By": "t.b", |
| | "Node Type": "Aggregate", |
| | "Output": [ |
| | "b", |
| | "sum(t.x)" |
| | ], |
| | "Plans": [ |
| | { |
| | "Node Type": "TableScan", |
| | "Output": [ |
| | "x", |
| | "b" |
| | ], |
| | "Plans": [], |
| | "Relation Name": "t" |
| | } |
| | ] |
| | } |
| | ] |
| | } |
| | } |
| | ] |
+--------------+----------------------------------------------------+
1 row(s) fetched.
Elapsed 0.008 seconds.
```
### `graphviz` format
The `graphviz` format uses the [DOT language] that can be used with [Graphviz] to
generate a visual representation of the plan.
[dot language]: https://graphviz.org/doc/info/lang.html
[graphviz]: https://graphviz.org/
```sql
> EXPLAIN FORMAT GRAPHVIZ SELECT SUM(x) FROM t GROUP BY b;
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| plan_type | plan |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
| logical_plan | |
| | // Begin DataFusion GraphViz Plan, |
| | // display it online here: https://dreampuf.github.io/GraphvizOnline |
| | |
| | digraph { |
| | subgraph cluster_1 |
| | { |
| | graph[label="LogicalPlan"] |
| | 2[shape=box label="Projection: sum(t.x)"] |
| | 3[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]"] |
| | 2 -> 3 [arrowhead=none, arrowtail=normal, dir=back] |
| | 4[shape=box label="TableScan: t projection=[x, b]"] |
| | 3 -> 4 [arrowhead=none, arrowtail=normal, dir=back] |
| | } |
| | subgraph cluster_5 |
| | { |
| | graph[label="Detailed LogicalPlan"] |
| | 6[shape=box label="Projection: sum(t.x)\nSchema: [sum(t.x):Int64;N]"] |
| | 7[shape=box label="Aggregate: groupBy=[[t.b]], aggr=[[sum(CAST(t.x AS Int64))]]\nSchema: [b:Int32;N, sum(t.x):Int64;N]"] |
| | 6 -> 7 [arrowhead=none, arrowtail=normal, dir=back] |
| | 8[shape=box label="TableScan: t projection=[x, b]\nSchema: [x:Int32;N, b:Int32;N]"] |
| | 7 -> 8 [arrowhead=none, arrowtail=normal, dir=back] |
| | } |
| | } |
| | // End DataFusion GraphViz Plan |
| | |
+--------------+------------------------------------------------------------------------------------------------------------------------------+
1 row(s) fetched.
Elapsed 0.010 seconds.
```
## `EXPLAIN ANALYZE`
Shows the execution plan and metrics of a statement. Note that `EXPLAIN ANALYZE`
only supports the `indent` format.
```sql
EXPLAIN ANALYZE SELECT SUM(x) FROM table GROUP BY b;
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| plan_type | plan |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
| Plan with Metrics | CoalescePartitionsExec, metrics=[] |
| | ProjectionExec: expr=[SUM(table.x)@1 as SUM(x)], metrics=[] |
| | HashAggregateExec: mode=FinalPartitioned, gby=[b@0 as b], aggr=[SUM(x)], metrics=[outputRows=2] |
| | CoalesceBatchesExec: target_batch_size=4096, metrics=[] |
| | RepartitionExec: partitioning=Hash([Column { name: "b", index: 0 }], 16), metrics=[sendTime=839560, fetchTime=122528525, repartitionTime=5327877] |
| | HashAggregateExec: mode=Partial, gby=[b@1 as b], aggr=[SUM(x)], metrics=[outputRows=2] |
| | RepartitionExec: partitioning=RoundRobinBatch(16), metrics=[fetchTime=5660489, repartitionTime=0, sendTime=8012] |
| | DataSourceExec: file_groups={1 group: [[/tmp/table.csv]]}, has_header=false, metrics=[] |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------------+
```
By default `EXPLAIN ANALYZE` shows the aggregated metrics from all partitions for each operator. If you need to display per-partition metrics, use `EXPLAIN ANALYZE VERBOSE`.
You can also set `datafusion.explain.analyze_level` from the [configuration value] to control the detail level for the metrics displayed.
[configuration value]: ../configs.md