Query analysis helps users understand query execution mechanisms and identify performance bottlenecks, facilitating optimization and improving efficiency. This directly enhances user experience and resource utilization. IoTDB provides two query analysis statements: EXPLAIN and EXPLAIN ANALYZE.
- **EXPLAIN**: Displays the query execution plan, detailing how IoTDB retrieves and processes data.
- **EXPLAIN ANALYZE**: Executes the query and provides detailed performance metrics, such as execution time and resource consumption. Unlike other diagnostic tools, it requires no deployment and focuses on single-query analysis for precise troubleshooting.

## Performance Analysis Methods Comparison
| Method | Installation Difficulty | Business Impact | Functional Scope |
|---|---|---|---|
| EXPLAIN ANALYZE | Low. No additional components required; built-in SQL statement in IoTDB. | Low. Impacts only the analyzed query, with no effect on other workloads. | Supports cluster systems. Enables tracing for a single SQL query. |
| Monitoring Dashboard | Medium. Requires installation of IoTDB monitoring dashboard tool (TimechoDB) and enabling monitoring services. | Medium. Metrics collection introduces additional overhead. | Supports cluster systems. Analyzes overall database query load and latency. |
| Arthas Sampling | Medium. Requires Java Arthas installation (may be restricted in internal networks; sometimes requires application restart). | High. CPU sampling may degrade the response speed of online services. | Does not support cluster systems. Analyzes overall database query load and latency. |
The EXPLAIN command allows users to view the execution plan of an SQL query. It presents the plan as a series of operators, illustrating how IoTDB processes the query. The syntax is as follows, where <SELECT_STATEMENT> represents the target query:
```sql
EXPLAIN <SELECT_STATEMENT>
```
The result of EXPLAIN includes information such as data access strategies, whether filtering conditions are pushed down, and the distribution of the query plan across different nodes. This provides users with a means to visualize the internal execution logic of the query.
```sql
-- Create database
CREATE DATABASE test;

-- Create table
USE test;
CREATE TABLE t1 (device_id STRING ID, type STRING ATTRIBUTE, speed FLOAT);

-- Insert data
INSERT INTO t1(device_id, type, speed) VALUES('car_1', 'Model Y', 120.0);
INSERT INTO t1(device_id, type, speed) VALUES('car_2', 'Model 3', 100.0);

-- Execute EXPLAIN
EXPLAIN SELECT * FROM t1;
```
The result shows that IoTDB retrieves data from different data partitions through two TableScan nodes and aggregates the data using a Collect operator before returning it:
```
+-----------------------------------------------------------------------------------------------+
|                                                                              distribution plan|
+-----------------------------------------------------------------------------------------------+
|                        ┌─────────────────────────────────────────────┐                        |
|                        │OutputNode-4                                 │                        |
|                        │OutputColumns-[time, device_id, type, speed] │                        |
|                        │OutputSymbols: [time, device_id, type, speed]│                        |
|                        └─────────────────────────────────────────────┘                        |
|                                               │                                               |
|                                               │                                               |
|                        ┌─────────────────────────────────────────────┐                        |
|                        │Collect-21                                   │                        |
|                        │OutputSymbols: [time, device_id, type, speed]│                        |
|                        └─────────────────────────────────────────────┘                        |
|                       ┌───────────────────────┴───────────────────────┐                       |
|                       │                                               │                       |
|┌─────────────────────────────────────────────┐                  ┌───────────┐                 |
|│TableScan-19                                 │                  │Exchange-28│                 |
|│QualifiedTableName: test.t1                  │                  └───────────┘                 |
|│OutputSymbols: [time, device_id, type, speed]│                        │                       |
|│DeviceNumber: 1                              │                        │                       |
|│ScanOrder: ASC                               │ ┌─────────────────────────────────────────────┐|
|│PushDownOffset: 0                            │ │TableScan-20                                 │|
|│PushDownLimit: 0                             │ │QualifiedTableName: test.t1                  │|
|│PushDownLimitToEachDevice: false             │ │OutputSymbols: [time, device_id, type, speed]│|
|│RegionId: 2                                  │ │DeviceNumber: 1                              │|
|└─────────────────────────────────────────────┘ │ScanOrder: ASC                               │|
|                                                │PushDownOffset: 0                            │|
|                                                │PushDownLimit: 0                             │|
|                                                │PushDownLimitToEachDevice: false             │|
|                                                │RegionId: 1                                  │|
|                                                └─────────────────────────────────────────────┘|
+-----------------------------------------------------------------------------------------------+
```
The EXPLAIN ANALYZE statement provides detailed performance metrics by executing the query and analyzing its runtime behavior. The syntax is as follows:
```sql
EXPLAIN ANALYZE [VERBOSE] <SELECT_STATEMENT>
```
- **SELECT_STATEMENT**: The query statement to be analyzed.
- **VERBOSE** (optional): Prints detailed analysis results. Without this option, some metrics are omitted.

EXPLAIN ANALYZE is a performance analysis SQL statement built into the IoTDB query engine. Unlike EXPLAIN, it runs the query and collects execution metrics, enabling users to trace performance bottlenecks, observe resource usage, and conduct precise performance tuning.
The output includes detailed statistics such as query planning time, execution time, data partitioning, and resource consumption:
QueryStatistics contains high-level statistics about the query execution, including the time spent in each planning stage (for example, FetchPartitionCost and FetchSchemaCost) and the number of query shards.

A FragmentInstance is the wrapper for a query shard in IoTDB. Each query shard reports its execution information in the result set, including FragmentStatistics and operator information. FragmentStatistics provides detailed metrics about the shard's execution, such as its total wall time and the sequential and unsequential files it accessed.
Bloom filters help determine if a sequence exists in a TsFile. They are stored at the end of each TsFile.
TimeSeriesMetadata contains indexing information for sequences in a TsFile. Each TsFile has one metadata entry per sequence.
These counts reflect the data remaining after push-down operations such as LIMIT are applied. Chunks are the fundamental units of data storage in TsFiles.
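The relationship between the chunk-load counters and the physical file layout can be sketched as follows. This is a minimal illustration in Python; the function name and inputs are made up for this sketch and are not an IoTDB API.

```python
# Sketch: expected total chunk loads for a scan, assuming every chunk of
# every scanned column (including the time column) in every TsFile is
# loaded exactly once, either from cache or from disk.
def expected_chunk_loads(tsfile_count: int,
                         sub_sensor_count: int,  # including the time column
                         avg_chunks_per_file: int) -> int:
    # This product should equal loadChunkFromCacheCount + loadChunkFromDiskCount.
    return tsfile_count * sub_sensor_count * avg_chunks_per_file

# e.g. 2 TsFiles, 4 columns (time + 3 measurements), 5 chunks per file on average
print(expected_chunk_loads(2, 4, 5))  # -> 40
```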
The sum (loadChunkFromCacheCount + loadChunkFromDiskCount) equals: TsFile count * sub-sensor count (including the time column) * average chunk count per TsFile.

## Query Timeout Scenario with EXPLAIN ANALYZE
Since EXPLAIN ANALYZE runs as a special query type, it cannot return results if execution times out. To aid troubleshooting, IoTDB automatically enables a timed logging mechanism that periodically records partial results to a dedicated log file (logs/log_explain_analyze.log). This mechanism requires no user configuration.
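When inspecting that log after a timeout, you typically only need its most recent entries. A minimal sketch, assuming only that the log is a plain text file (the function name is illustrative, not part of IoTDB):

```python
from pathlib import Path

# Sketch: read the most recent lines from the dedicated EXPLAIN ANALYZE log
# (logs/log_explain_analyze.log by default; the path is a parameter here).
def tail_explain_log(path: str, n: int = 20) -> list[str]:
    p = Path(path)
    if not p.exists():
        return []  # log not created yet, or wrong path
    return p.read_text(encoding="utf-8").splitlines()[-n:]

for line in tail_explain_log("logs/log_explain_analyze.log"):
    print(line)
```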
The following example demonstrates how to use EXPLAIN ANALYZE:
```sql
-- Create Database
CREATE DATABASE test;

-- Create Table
USE test;
CREATE TABLE t1 (device_id STRING ID, type STRING ATTRIBUTE, speed FLOAT);

-- Insert Data
INSERT INTO t1(device_id, type, speed) VALUES('car_1', 'Model Y', 120.0);
INSERT INTO t1(device_id, type, speed) VALUES('car_2', 'Model 3', 100.0);

-- Execute EXPLAIN ANALYZE
EXPLAIN ANALYZE VERBOSE SELECT * FROM t1;
```
Output (Simplified):
```
+-----------------------------------------------------------------------------------------------+
|                                                                                Explain Analyze|
+-----------------------------------------------------------------------------------------------+
|Analyze Cost: 38.860 ms |
|Fetch Partition Cost: 9.888 ms |
|Fetch Schema Cost: 54.046 ms |
|Logical Plan Cost: 10.102 ms |
|Logical Optimization Cost: 17.396 ms |
|Distribution Plan Cost: 2.508 ms |
|Dispatch Cost: 22.126 ms |
|Fragment Instances Count: 2 |
| |
|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.2.0][IP: 0.0.0.0][DataRegion: 2][State: FINISHED]|
|   Total Wall Time: 18 ms |
|   Cost of initDataQuerySource: 6.153 ms |
|   Seq File(unclosed): 1, Seq File(closed): 0 |
|   UnSeq File(unclosed): 0, UnSeq File(closed): 0 |
|   ready queued time: 0.164 ms, blocked queued time: 0.342 ms |
|   Query Statistics: |
|     loadBloomFilterFromCacheCount: 0 |
|     loadBloomFilterFromDiskCount: 0 |
|     loadBloomFilterActualIOSize: 0 |
|     loadBloomFilterTime: 0.000 |
|     loadTimeSeriesMetadataAlignedMemSeqCount: 1 |
|     loadTimeSeriesMetadataAlignedMemSeqTime: 0.246 |
|     loadTimeSeriesMetadataFromCacheCount: 0 |
|     loadTimeSeriesMetadataFromDiskCount: 0 |
|     loadTimeSeriesMetadataActualIOSize: 0 |
|     constructAlignedChunkReadersMemCount: 1 |
|     constructAlignedChunkReadersMemTime: 0.294 |
|     loadChunkFromCacheCount: 0 |
|     loadChunkFromDiskCount: 0 |
|     loadChunkActualIOSize: 0 |
|     pageReadersDecodeAlignedMemCount: 1 |
|     pageReadersDecodeAlignedMemTime: 0.047 |
|   [PlanNodeId 43]: IdentitySinkNode(IdentitySinkOperator) |
|     CPU Time: 5.523 ms |
|     output: 2 rows |
|     HasNext() Called Count: 6 |
|     Next() Called Count: 5 |
|     Estimated Memory Size: : 327680 |
|   [PlanNodeId 31]: CollectNode(CollectOperator) |
|     CPU Time: 5.512 ms |
|     output: 2 rows |
|     HasNext() Called Count: 6 |
|     Next() Called Count: 5 |
|     Estimated Memory Size: : 327680 |
|   [PlanNodeId 29]: TableScanNode(TableScanOperator) |
|     CPU Time: 5.439 ms |
|     output: 1 rows |
|     HasNext() Called Count: 3 |
|     Next() Called Count: 2 |
|     Estimated Memory Size: : 327680 |
|     DeviceNumber: 1 |
|     CurrentDeviceIndex: 0 |
|   [PlanNodeId 40]: ExchangeNode(ExchangeOperator) |
|     CPU Time: 0.053 ms |
|     output: 1 rows |
|     HasNext() Called Count: 2 |
|     Next() Called Count: 1 |
|     Estimated Memory Size: : 131072 |
| |
|FRAGMENT-INSTANCE[Id: 20241127_090849_00009_1.3.0][IP: 0.0.0.0][DataRegion: 1][State: FINISHED]|
|   Total Wall Time: 13 ms |
|   Cost of initDataQuerySource: 5.725 ms |
|   Seq File(unclosed): 1, Seq File(closed): 0 |
|   UnSeq File(unclosed): 0, UnSeq File(closed): 0 |
|   ready queued time: 0.118 ms, blocked queued time: 5.844 ms |
|   Query Statistics: |
|     loadBloomFilterFromCacheCount: 0 |
|     loadBloomFilterFromDiskCount: 0 |
|     loadBloomFilterActualIOSize: 0 |
|     loadBloomFilterTime: 0.000 |
|     loadTimeSeriesMetadataAlignedMemSeqCount: 1 |
|     loadTimeSeriesMetadataAlignedMemSeqTime: 0.004 |
|     loadTimeSeriesMetadataFromCacheCount: 0 |
|     loadTimeSeriesMetadataFromDiskCount: 0 |
|     loadTimeSeriesMetadataActualIOSize: 0 |
|     constructAlignedChunkReadersMemCount: 1 |
|     constructAlignedChunkReadersMemTime: 0.001 |
|     loadChunkFromCacheCount: 0 |
|     loadChunkFromDiskCount: 0 |
|     loadChunkActualIOSize: 0 |
|     pageReadersDecodeAlignedMemCount: 1 |
|     pageReadersDecodeAlignedMemTime: 0.007 |
|   [PlanNodeId 42]: IdentitySinkNode(IdentitySinkOperator) |
|     CPU Time: 0.270 ms |
|     output: 1 rows |
|     HasNext() Called Count: 3 |
|     Next() Called Count: 2 |
|     Estimated Memory Size: : 327680 |
|   [PlanNodeId 30]: TableScanNode(TableScanOperator) |
|     CPU Time: 0.250 ms |
|     output: 1 rows |
|     HasNext() Called Count: 3 |
|     Next() Called Count: 2 |
|     Estimated Memory Size: : 327680 |
|     DeviceNumber: 1 |
|     CurrentDeviceIndex: 0 |
+-----------------------------------------------------------------------------------------------+
```
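When comparing runs, it can help to pull the planning-stage costs out of this text output programmatically. A minimal sketch in Python; the regular expression and function are conveniences written for this document, not an IoTDB API:

```python
import re

# Sketch: extract "<name> Cost: <value> ms" lines from EXPLAIN ANALYZE
# text output into a dictionary of millisecond values.
COST_LINE = re.compile(r"([A-Za-z ]+ Cost):\s*([\d.]+)\s*ms")

def parse_costs(output: str) -> dict[str, float]:
    return {name.strip(): float(ms) for name, ms in COST_LINE.findall(output)}

sample = """
|Analyze Cost: 38.860 ms |
|Fetch Partition Cost: 9.888 ms |
|Fetch Schema Cost: 54.046 ms |
"""
print(parse_costs(sample))
# -> {'Analyze Cost': 38.86, 'Fetch Partition Cost': 9.888, 'Fetch Schema Cost': 54.046}
```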
## Key Scenarios
EXPLAIN ANALYZE introduces minimal overhead, as it runs in a separate thread to collect query statistics. These metrics are generated by the system regardless of whether EXPLAIN ANALYZE is executed; the command simply retrieves them for user inspection.
Additionally, EXPLAIN ANALYZE iterates through the result set without producing output. Therefore, the reported execution time closely reflects the actual query execution time, with negligible deviation.
The following metrics are crucial for evaluating I/O performance during query execution:
- loadBloomFilterActualIOSize
- loadBloomFilterTime
- loadTimeSeriesMetadataAlignedDisk[Seq/Unseq]Time
- loadTimeSeriesMetadataActualIOSize
- alignedTimeSeriesMetadataModificationTime
- constructAlignedChunkReadersDiskTime
- loadChunkActualIOSize

These metrics were detailed in previous sections. While TimeSeriesMetadata loading is tracked separately for sequential and unsequential files, chunk reading is not currently differentiated. However, the proportion of sequential versus unsequential data can be inferred from the TimeSeriesMetadata statistics.
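That inference can be sketched as a simple ratio. The inputs here are plain numbers taken from the seq/unseq TimeSeriesMetadata counters; the function name is illustrative and not an IoTDB API:

```python
# Sketch: estimate the sequential vs. unsequential data proportion from the
# TimeSeriesMetadata load counters, since chunk reads are not split by type.
def seq_ratio(seq_metadata_count: int, unseq_metadata_count: int) -> float:
    total = seq_metadata_count + unseq_metadata_count
    # With no metadata loaded at all, treat the data as fully sequential.
    return seq_metadata_count / total if total else 1.0

# e.g. 3 sequential vs. 1 unsequential TimeSeriesMetadata loads
print(seq_ratio(3, 1))  # -> 0.75
```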
Unsequential (out-of-order) data can negatively affect query performance.
Currently, there is no direct method to measure the performance impact of unsequential data. The only approach is to compare query performance before and after merging the unsequential data. However, because the merged data must still be read, decompressed, and decoded, the execution time may not decrease significantly even after merging.
If EXPLAIN ANALYZE results are missing from the log_explain_analyze.log file after a query timeout, the issue may stem from an incomplete system upgrade. Specifically, if the lib package was updated without updating the conf/logback-datanode.xml file, the logging configuration may be outdated.
Solution:
1. Replace the conf/logback-datanode.xml file with the updated version.
2. Re-run the EXPLAIN ANALYZE statement with the VERBOSE option to confirm proper logging.