SQL Cache is the query result caching mechanism provided by Doris. It caches query results keyed by metadata such as the SQL text and data version, so that subsequent identical queries are returned directly from the cache. This significantly reduces the overhead of repeated computation.
Before reading this guide, confirm that you understand or have prepared the following:
explain plan (2.1.3+ recommended)now() and random()) affect the cache hit rateBefore deciding whether to enable SQL Cache, check item by item that your queries meet the following conditions:
random())SQL Cache is suitable for query scenarios where the data is updated infrequently. It avoids repeated computation by caching query results.
SQL Cache uniquely identifies a piece of cached data based on the combination of the following factors:
| Key factor | Description |
|---|---|
| SQL text | An exactly identical SQL string |
| View definition | The DDL definition of any views involved |
| Table and partition versions | Whether the data has been changed |
| User variables and their values | The current values of variables referenced in the SQL |
| Nondeterministic functions and their results | The computed results of functions such as now() and random() |
| Row policy definition | Row Policy configuration |
| Data masking definition | Data Masking configuration |
If any factor changes (such as SQL rewrites, different query fields or conditions, or version changes caused by data updates), the cache will not be hit. For multi-table Join queries, an update to any one of the tables changes the partition ID or version number, which makes the cache unable to hit.
Definition: A nondeterministic function is a function whose result has no fixed relationship with its input parameters.
| Function | Behavior | Can it use the cache |
|---|---|---|
now() | Returns the current second-level time, changes once per second | Reusable within the same second |
date(now()) | Converts second-level time to day-level granularity | Reusable within the same day (recommended) |
random() | Returns a different result on every call | Almost never hits the cache |
Optimization tip: Convert fine-grained time to coarse-grained time. For example, use select * from tbl where dt = date(now()) instead of select * from tbl where dt = now(), so that all queries within the same day can hit the cache. Avoid using strongly nondeterministic functions such as random() in queries.
SQL Cache caches results strictly by the queried fields. It does not support satisfying a “more metrics” query with a “fewer metrics” cache.
| Scenario | Behavior | Impact |
|---|---|---|
| The result of 2 metrics has been cached, and a new query requests 3 metrics | The cache cannot be reused, and the query must be re-executed | The hit rate drops, and the new result is written to a new cache entry |
Optimization tip: Keep the metric fields of business-side reports and query templates stable. If you need to add metrics, modify the query template instead of expanding it ad hoc, so that the cache is not made unusable.
The processing flow after FE receives a query request:
Special optimization: If the SQL optimization phase determines that the result contains only 0 or 1 row, FE saves the result directly in its own memory to accelerate subsequent identical queries.
Purpose: Enable SQL Cache at the session or global level (disabled by default).
-- Turn on SQL Cache in the current session; it is off by default set enable_sql_cache=true; -- Turn off SQL Cache in the current session set enable_sql_cache=false; -- Turn on SQL Cache globally; it is off by default set global enable_sql_cache=true; -- Turn off SQL Cache globally set global enable_sql_cache=false;
Notes: Session-level configuration only takes effect for the current session. Global configuration takes effect for all newly created sessions.
explain plan (applicable to Doris 2.1.3+)Purpose: Determine whether the cache is hit using the query plan. Command: Run explain plan <your SQL>. Notes: When a LogicalSqlCache or PhysicalSqlCache node appears in the query plan tree, the query has hit the SQL Cache.
> explain plan select * from t2; +------------------------------------------------------------------------------------------------------------+ | Explain String(Nereids Planner) | +------------------------------------------------------------------------------------------------------------+ | ========== PARSED PLAN (time: 28ms) ========== | | LogicalSqlCache[1] ( queryId=711dea740e4746e6-8bc11afe08f6542c ) | | +--PhysicalResultSink[39] ( outputExprs=[id#0, name#1] ) | | +--PhysicalOlapScan[t2]@0 ( stats=12 ) | | | | ========== ANALYZED PLAN ========== | | LogicalSqlCache[1] ( queryId=711dea740e4746e6-8bc11afe08f6542c ) | | +--PhysicalResultSink[39] ( outputExprs=[id#0, name#1] ) | | +--PhysicalOlapScan[t2]@0 ( stats=12 ) | | | | ========== REWRITTEN PLAN ========== | | LogicalSqlCache[1] ( queryId=711dea740e4746e6-8bc11afe08f6542c ) | | +--PhysicalResultSink[39] ( outputExprs=[id#0, name#1] ) | | +--PhysicalOlapScan[t2]@0 ( stats=12 ) | | | | ========== OPTIMIZED PLAN ========== | | PhysicalSqlCache[3] ( queryId=711dea740e4746e6-8bc11afe08f6542c, backend=192.168.126.3:9051, rowCount=12 ) | | +--PhysicalResultSink[39] ( outputExprs=[id#0, name#1] ) | | +--PhysicalOlapScan[t2]@0 ( stats=12 ) | +------------------------------------------------------------------------------------------------------------+
Purpose: Confirm cache hits in versions that do not support explain plan. Command: Enable Profile and view the Execution Summary. Notes: If the Is Cached: field shows Yes, the query has hit the SQL Cache.
Execution Summary: - Parse SQL Time: 18ms - Nereids Analysis Time: N/A - Nereids Rewrite Time: N/A - Nereids Optimize Time: N/A - Nereids Translate Time: N/A - Workload Group: normal - Analysis Time: N/A - Wait and Fetch Result Time: N/A - Fetch Result Time: 0ms - Write Result Time: 0ms - Doris Version: 915138e801 - Is Nereids: Yes - Is Cached: Yes - Total Instances Num: 0 - Instances Num Per BE: - Parallel Fragment Exec Instance Num: 1 - Trace ID: - Transaction Commit Time: N/A - Nereids Distribute Time: N/A
Endpoint: http://${FE_IP}:${FE_HTTP_PORT}/metrics Notes: Metric statistics are monotonically increasing. After an FE restart, the count starts again from 0.
# 1 SQL has been written to the cache doris_fe_cache_added{type="sql"} 1 # The SQL Cache has been hit 2 times doris_fe_cache_hit{type="sql"} 2
Endpoint: http://${BE_IP}:${BE_HTTP_PORT}/metrics Notes: Different caches may be stored on different BEs. You must collect metrics from all BEs to get the complete picture.
# There are currently 1205 caches in the memory of this BE doris_be_query_cache_sql_total_count 1205 # All caches currently take up about 44KB of BE memory doris_be_query_cache_memory_total_byte 44101
The cache metadata in FE uses weak references: when FE is short on memory, the least recently used metadata is automatically released. The following parameters are also supported for limiting memory usage:
| Parameter | Default value | Description |
|---|---|---|
sql_cache_manage_num | 100 | Upper bound on the number of metadata entries; the least recently used entry is automatically released when exceeded |
expire_sql_cache_in_fe_second | 300 | Metadata expiration time (seconds); automatically released if not accessed within this time |
cache_result_max_row_count | 3000 | Upper bound on the number of result rows; if exceeded, no SQL Cache is created |
cache_result_max_data_size | 31457280 (30MB) | Upper bound on the result size (in bytes); if exceeded, no SQL Cache is created |
Configuration commands (take effect in real time, must be configured on every FE; for persistence, write them into fe.conf):
-- Store at most 100 cache metadata entries ADMIN SET FRONTEND CONFIG ('sql_cache_manage_num'='100'); -- Automatically release a cache metadata entry that has not been accessed for 300 seconds ADMIN SET FRONTEND CONFIG ('expire_sql_cache_in_fe_second'='300'); -- Do not create SQL Cache when the result has more than 3000 rows by default ADMIN SET FRONTEND CONFIG ('cache_result_max_row_count'='3000'); -- Do not create SQL Cache when the result is larger than 30MB by default ADMIN SET FRONTEND CONFIG ('cache_result_max_data_size'='31457280');
| Parameter | Default value (example) | Description |
|---|---|---|
query_cache_max_size_mb | 256 | The stable upper limit of memory used by the cache |
query_cache_elasticity_size_mb | 128 | Elastic expansion space; eviction is triggered when usage exceeds max + elasticity, until usage drops below max |
Configuration file: be.conf (BE must be restarted after modification).
-- When the cache memory usage exceeds query_cache_max_size_mb + query_cache_elasticity_size_mb, -- release the least recently used cache entries until the memory usage drops below query_cache_max_size_mb. query_cache_max_size_mb = 256 query_cache_elasticity_size_mb = 128
The following table summarizes common causes of cache misses or invalidation, along with the corresponding investigation directions:
| No. | Cause of invalidation | Typical operation | Investigation tip |
|---|---|---|---|
| 1 | Table or view structure change | drop table, replace table, alter table, alter view | Check recent DDL history of tables and views |
| 2 | Table data change | insert, delete, update, truncate | Check ingestion and change logs to confirm whether the data version has changed |
| 3 | User permissions revoked | revoke | Check permission changes for the query account |
| 4 | Use of nondeterministic functions | select random(), select now(), etc. | Switch to coarse-grained functions or constant parameters |
| 5 | Variable value change | select * from tbl where dt = @dt_var | Check whether the session variable values are consistent |
| 6 | Row Policy / Data Masking change | Row policy or masking policy was adjusted | Check the recent change records of the policy |
| 7 | Result row count exceeds the limit | Exceeds cache_result_max_row_count (3000 rows by default) | Adjust the threshold or narrow the result set |
| 8 | Result size exceeds the limit | Exceeds cache_result_max_data_size (30MB by default) | Adjust the threshold or reduce the returned fields |
Q1: Is SQL Cache enabled by default? A: It is disabled by default. Enable it at the session level with set enable_sql_cache=true, or globally with set global enable_sql_cache=true.
Q2: Which table types does SQL Cache support? A: It currently supports OlapTable internal tables and Hive external tables.
Q3: Can SQL Cache be used for real-time updated data? A: Yes, but every data update changes the partition version, which invalidates the cache. The hit rate is low. SQL Cache is more suitable for T+1 offline analytics scenarios.
Q4: How is the cache invalidated for multi-table Join queries? A: As soon as any one of the tables in the Join has a data change, the partition ID or version number changes, and the entire query cache cannot be hit.
Q: Two metrics were cached previously. Can the cache be reused for a query that now requests three metrics? A: No. SQL Cache caches results strictly by the queried fields. The cache of fewer metrics cannot satisfy a query request for more metrics, and the query must be re-executed.
Q5: Does the now() function completely invalidate the cache? A: No, it does not completely invalidate the cache. now() returns a second-level time, and identical queries within the same second can reuse the cache. To extend the cache hit window, use date(now()) to convert it to day-level granularity.
Q6: Where is the cached data stored? A: The vast majority of results are stored in BE memory (in a HashMap structure). When the result contains only 0 or 1 row, FE stores it directly in its own memory.
Q7: Is the cache still available after an FE restart? A: The metadata in FE is lost, and the monitoring metrics are also reset to 0. Results in BE memory are also invalidated after a restart.
| Dimension | SQL Cache | Partition Cache (where applicable) |
|---|---|---|
| Cache granularity | The result set of an entire SQL | Caches intermediate results by partition |
| Applicable update pattern | T+1, low-frequency updates | Some partitions are updated frequently while others remain stable |
| Hit condition | The SQL text and all dependent metadata are unchanged | The versions of the involved partitions are unchanged |
| Invalidation granularity | A change in any dependency invalidates the entire cache | Only partitions that miss the cache need to be recomputed |
Note: The Partition Cache column in this table is for comparison reference only. Actual availability depends on the features of the current Doris version.