blob: 08c0cc602ea4274299de3d3a925d8bd3b1f5db87 [file] [log] [blame]
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<meta name="copyright" content="(C) Copyright 2025" />
<meta name="DC.rights.owner" content="(C) Copyright 2025" />
<meta name="DC.Type" content="concept" />
<meta name="DC.Title" content="DROP STATS Statement" />
<meta name="DC.Relation" scheme="URI" content="../topics/impala_langref_sql.html" />
<meta name="prodname" content="Impala" />
<meta name="prodname" content="Impala" />
<meta name="version" content="Impala 3.4.x" />
<meta name="version" content="Impala 3.4.x" />
<meta name="DC.Format" content="XHTML" />
<meta name="DC.Identifier" content="drop_stats" />
<link rel="stylesheet" type="text/css" href="../commonltr.css" />
<title>DROP STATS Statement</title>
</head>
<body id="drop_stats">
<h1 class="title topictitle1" id="ariaid-title1">DROP STATS Statement</h1>
<div class="body conbody">
<p class="p">
Removes the specified statistics from a table or partition. The statistics were originally created by the
<code class="ph codeph">COMPUTE STATS</code> or <code class="ph codeph">COMPUTE INCREMENTAL STATS</code> statement.
</p>
<p class="p">
<strong class="ph b">Syntax:</strong>
</p>
<pre class="pre codeblock"><code>DROP STATS [<var class="keyword varname">database_name</var>.]<var class="keyword varname">table_name</var>
DROP INCREMENTAL STATS [<var class="keyword varname">database_name</var>.]<var class="keyword varname">table_name</var> PARTITION (<var class="keyword varname">partition_spec</var>)
<var class="keyword varname">partition_spec</var> ::= <var class="keyword varname">partition_col</var>=<var class="keyword varname">constant_value</var>
</code></pre>
<p class="p">
The <code class="ph codeph">PARTITION</code> clause is only allowed in combination with the
<code class="ph codeph">INCREMENTAL</code> clause. It is optional for <code class="ph codeph">COMPUTE INCREMENTAL
STATS</code>, and required for <code class="ph codeph">DROP INCREMENTAL STATS</code>. Whenever you
specify partitions through the <code class="ph codeph">PARTITION
(<var class="keyword varname">partition_spec</var>)</code> clause in a <code class="ph codeph">COMPUTE INCREMENTAL
STATS</code> or <code class="ph codeph">DROP INCREMENTAL STATS</code> statement, you must include
all the partitioning columns in the specification, and specify constant values for all
the partition key columns.
</p>
<p class="p">
<code class="ph codeph">DROP STATS</code> removes all statistics from the table, whether created by <code class="ph codeph">COMPUTE
STATS</code> or <code class="ph codeph">COMPUTE INCREMENTAL STATS</code>.
</p>
<p class="p">
<code class="ph codeph">DROP INCREMENTAL STATS</code> only affects incremental statistics for a single partition, specified
through the <code class="ph codeph">PARTITION</code> clause. The incremental stats are marked as outdated, so that they are
recomputed by the next <code class="ph codeph">COMPUTE INCREMENTAL STATS</code> statement.
</p>
<p class="p">
<strong class="ph b">Usage notes:</strong>
</p>
<p class="p">
You typically use this statement when the statistics for a table or a partition have become stale due to data
files being added to or removed from the associated HDFS data directories, whether by manual HDFS operations
or <code class="ph codeph">INSERT</code>, <code class="ph codeph">INSERT OVERWRITE</code>, or <code class="ph codeph">LOAD DATA</code> statements, or
adding or dropping partitions.
</p>
<p class="p">
When a table or partition has no associated statistics, Impala treats it as essentially zero-sized when
constructing the execution plan for a query. In particular, the statistics influence the order in which
tables are joined in a join query. To ensure proper query planning and good query performance and
scalability, make sure to run <code class="ph codeph">COMPUTE STATS</code> or <code class="ph codeph">COMPUTE INCREMENTAL STATS</code> on
the table or partition after removing any stale statistics.
</p>
<p class="p">
Dropping the statistics is not required for an unpartitioned table or a partitioned table covered by the
original type of statistics. A subsequent <code class="ph codeph">COMPUTE STATS</code> statement replaces any existing
statistics with new ones, for all partitions, regardless of whether the old ones were outdated. Therefore,
this statement was rarely used before the introduction of incremental statistics.
</p>
<p class="p">
Dropping the statistics is required for a partitioned table containing incremental statistics, to make a
subsequent <code class="ph codeph">COMPUTE INCREMENTAL STATS</code> statement rescan an existing partition. See
<a class="xref" href="impala_perf_stats.html#perf_stats">Table and Column Statistics</a> for information about incremental statistics, a new feature
available in Impala 2.1.0 and higher.
</p>
<p class="p">
<strong class="ph b">Statement type:</strong> DDL
</p>
<p class="p">
<strong class="ph b">Cancellation:</strong> Cannot be cancelled.
</p>
<p class="p">
<strong class="ph b">HDFS permissions:</strong>
</p>
<p class="p">
The user ID that the <span class="keyword cmdname">impalad</span> daemon runs under,
typically the <code class="ph codeph">impala</code> user, does not need any
particular HDFS permissions to perform this statement.
All read and write operations are on the metastore database,
not HDFS files and directories.
</p>
<p class="p">
<strong class="ph b">Examples:</strong>
</p>
<p class="p">
The following example shows a partitioned table that has associated statistics produced by the
<code class="ph codeph">COMPUTE INCREMENTAL STATS</code> statement, and how the situation evolves as statistics are dropped
from specific partitions, then the entire table.
</p>
<p class="p">
Initially, all table and column statistics are filled in.
</p>
<pre class="pre codeblock"><code>show table stats item_partitioned;
+-------------+-------+--------+----------+--------------+---------+-----------------
| i_category | #Rows | #Files | Size | Bytes Cached | Format | Incremental stats
+-------------+-------+--------+----------+--------------+---------+-----------------
| Books | 1733 | 1 | 223.74KB | NOT CACHED | PARQUET | true
| Children | 1786 | 1 | 230.05KB | NOT CACHED | PARQUET | true
| Electronics | 1812 | 1 | 232.67KB | NOT CACHED | PARQUET | true
| Home | 1807 | 1 | 232.56KB | NOT CACHED | PARQUET | true
| Jewelry | 1740 | 1 | 223.72KB | NOT CACHED | PARQUET | true
| Men | 1811 | 1 | 231.25KB | NOT CACHED | PARQUET | true
| Music | 1860 | 1 | 237.90KB | NOT CACHED | PARQUET | true
| Shoes | 1835 | 1 | 234.90KB | NOT CACHED | PARQUET | true
| Sports | 1783 | 1 | 227.97KB | NOT CACHED | PARQUET | true
| Women | 1790 | 1 | 226.27KB | NOT CACHED | PARQUET | true
| Total | 17957 | 10 | 2.25MB | 0B | |
+-------------+-------+--------+----------+--------------+---------+-----------------
show column stats item_partitioned;
+------------------+-----------+------------------+--------+----------+--------------
| Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size
+------------------+-----------+------------------+--------+----------+--------------
| i_item_sk | INT | 19443 | -1 | 4 | 4
| i_item_id | STRING | 9025 | -1 | 16 | 16
| i_rec_start_date | TIMESTAMP | 4 | -1 | 16 | 16
| i_rec_end_date | TIMESTAMP | 3 | -1 | 16 | 16
| i_item_desc | STRING | 13330 | -1 | 200 | 100.302803039
| i_current_price | FLOAT | 2807 | -1 | 4 | 4
| i_wholesale_cost | FLOAT | 2105 | -1 | 4 | 4
| i_brand_id | INT | 965 | -1 | 4 | 4
| i_brand | STRING | 725 | -1 | 22 | 16.1776008605
| i_class_id | INT | 16 | -1 | 4 | 4
| i_class | STRING | 101 | -1 | 15 | 7.76749992370
| i_category_id | INT | 10 | -1 | 4 | 4
| i_manufact_id | INT | 1857 | -1 | 4 | 4
| i_manufact | STRING | 1028 | -1 | 15 | 11.3295001983
| i_size | STRING | 8 | -1 | 11 | 4.33459997177
| i_formulation | STRING | 12884 | -1 | 20 | 19.9799995422
| i_color | STRING | 92 | -1 | 10 | 5.38089990615
| i_units | STRING | 22 | -1 | 7 | 4.18690013885
| i_container | STRING | 2 | -1 | 7 | 6.99259996414
| i_manager_id | INT | 105 | -1 | 4 | 4
| i_product_name | STRING | 19094 | -1 | 25 | 18.0233001708
| i_category | STRING | 10 | 0 | -1 | -1
+------------------+-----------+------------------+--------+----------+--------------
</code></pre>
<p class="p">
To remove statistics for particular partitions, use the <code class="ph codeph">DROP INCREMENTAL STATS</code> statement.
After removing statistics for two partitions, the table-level statistics reflect that change in the
<code class="ph codeph">#Rows</code> and <code class="ph codeph">Incremental stats</code> fields. The counts, maximums, and averages of
the column-level statistics are unaffected.
</p>
<div class="note note"><span class="notetitle">Note:</span>
(It is possible that the row count might be preserved in future after a <code class="ph codeph">DROP INCREMENTAL
STATS</code> statement. Check the resolution of the issue
<a class="xref" href="https://issues.apache.org/jira/browse/IMPALA-1615" target="_blank">IMPALA-1615</a>.)
</div>
<pre class="pre codeblock"><code>drop incremental stats item_partitioned partition (i_category='Sports');
drop incremental stats item_partitioned partition (i_category='Electronics');
show table stats item_partitioned
+-------------+-------+--------+----------+--------------+---------+------------------
| i_category | #Rows | #Files | Size | Bytes Cached | Format | Incremental stats
+-------------+-------+--------+----------+--------------+---------+-----------------
| Books | 1733 | 1 | 223.74KB | NOT CACHED | PARQUET | true
| Children | 1786 | 1 | 230.05KB | NOT CACHED | PARQUET | true
| Electronics | -1 | 1 | 232.67KB | NOT CACHED | PARQUET | false
| Home | 1807 | 1 | 232.56KB | NOT CACHED | PARQUET | true
| Jewelry | 1740 | 1 | 223.72KB | NOT CACHED | PARQUET | true
| Men | 1811 | 1 | 231.25KB | NOT CACHED | PARQUET | true
| Music | 1860 | 1 | 237.90KB | NOT CACHED | PARQUET | true
| Shoes | 1835 | 1 | 234.90KB | NOT CACHED | PARQUET | true
| Sports | -1 | 1 | 227.97KB | NOT CACHED | PARQUET | false
| Women | 1790 | 1 | 226.27KB | NOT CACHED | PARQUET | true
| Total | 17957 | 10 | 2.25MB | 0B | |
+-------------+-------+--------+----------+--------------+---------+-----------------
show column stats item_partitioned
+------------------+-----------+------------------+--------+----------+--------------
| Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size
+------------------+-----------+------------------+--------+----------+--------------
| i_item_sk | INT | 19443 | -1 | 4 | 4
| i_item_id | STRING | 9025 | -1 | 16 | 16
| i_rec_start_date | TIMESTAMP | 4 | -1 | 16 | 16
| i_rec_end_date | TIMESTAMP | 3 | -1 | 16 | 16
| i_item_desc | STRING | 13330 | -1 | 200 | 100.302803039
| i_current_price | FLOAT | 2807 | -1 | 4 | 4
| i_wholesale_cost | FLOAT | 2105 | -1 | 4 | 4
| i_brand_id | INT | 965 | -1 | 4 | 4
| i_brand | STRING | 725 | -1 | 22 | 16.1776008605
| i_class_id | INT | 16 | -1 | 4 | 4
| i_class | STRING | 101 | -1 | 15 | 7.76749992370
| i_category_id | INT | 10 | -1 | 4 | 4
| i_manufact_id | INT | 1857 | -1 | 4 | 4
| i_manufact | STRING | 1028 | -1 | 15 | 11.3295001983
| i_size | STRING | 8 | -1 | 11 | 4.33459997177
| i_formulation | STRING | 12884 | -1 | 20 | 19.9799995422
| i_color | STRING | 92 | -1 | 10 | 5.38089990615
| i_units | STRING | 22 | -1 | 7 | 4.18690013885
| i_container | STRING | 2 | -1 | 7 | 6.99259996414
| i_manager_id | INT | 105 | -1 | 4 | 4
| i_product_name | STRING | 19094 | -1 | 25 | 18.0233001708
| i_category | STRING | 10 | 0 | -1 | -1
+------------------+-----------+------------------+--------+----------+--------------
</code></pre>
<p class="p">
To remove all statistics from the table, whether produced by <code class="ph codeph">COMPUTE STATS</code> or
<code class="ph codeph">COMPUTE INCREMENTAL STATS</code>, use the <code class="ph codeph">DROP STATS</code> statement without the
<code class="ph codeph">INCREMENTAL</code> clause). Now, both table-level and column-level statistics are reset.
</p>
<pre class="pre codeblock"><code>drop stats item_partitioned;
show table stats item_partitioned
+-------------+-------+--------+----------+--------------+---------+------------------
| i_category | #Rows | #Files | Size | Bytes Cached | Format | Incremental stats
+-------------+-------+--------+----------+--------------+---------+------------------
| Books | -1 | 1 | 223.74KB | NOT CACHED | PARQUET | false
| Children | -1 | 1 | 230.05KB | NOT CACHED | PARQUET | false
| Electronics | -1 | 1 | 232.67KB | NOT CACHED | PARQUET | false
| Home | -1 | 1 | 232.56KB | NOT CACHED | PARQUET | false
| Jewelry | -1 | 1 | 223.72KB | NOT CACHED | PARQUET | false
| Men | -1 | 1 | 231.25KB | NOT CACHED | PARQUET | false
| Music | -1 | 1 | 237.90KB | NOT CACHED | PARQUET | false
| Shoes | -1 | 1 | 234.90KB | NOT CACHED | PARQUET | false
| Sports | -1 | 1 | 227.97KB | NOT CACHED | PARQUET | false
| Women | -1 | 1 | 226.27KB | NOT CACHED | PARQUET | false
| Total | -1 | 10 | 2.25MB | 0B | |
+-------------+-------+--------+----------+--------------+---------+------------------
show column stats item_partitioned
+------------------+-----------+------------------+--------+----------+----------+
| Column | Type | #Distinct Values | #Nulls | Max Size | Avg Size |
+------------------+-----------+------------------+--------+----------+----------+
| i_item_sk | INT | -1 | -1 | 4 | 4 |
| i_item_id | STRING | -1 | -1 | -1 | -1 |
| i_rec_start_date | TIMESTAMP | -1 | -1 | 16 | 16 |
| i_rec_end_date | TIMESTAMP | -1 | -1 | 16 | 16 |
| i_item_desc | STRING | -1 | -1 | -1 | -1 |
| i_current_price | FLOAT | -1 | -1 | 4 | 4 |
| i_wholesale_cost | FLOAT | -1 | -1 | 4 | 4 |
| i_brand_id | INT | -1 | -1 | 4 | 4 |
| i_brand | STRING | -1 | -1 | -1 | -1 |
| i_class_id | INT | -1 | -1 | 4 | 4 |
| i_class | STRING | -1 | -1 | -1 | -1 |
| i_category_id | INT | -1 | -1 | 4 | 4 |
| i_manufact_id | INT | -1 | -1 | 4 | 4 |
| i_manufact | STRING | -1 | -1 | -1 | -1 |
| i_size | STRING | -1 | -1 | -1 | -1 |
| i_formulation | STRING | -1 | -1 | -1 | -1 |
| i_color | STRING | -1 | -1 | -1 | -1 |
| i_units | STRING | -1 | -1 | -1 | -1 |
| i_container | STRING | -1 | -1 | -1 | -1 |
| i_manager_id | INT | -1 | -1 | 4 | 4 |
| i_product_name | STRING | -1 | -1 | -1 | -1 |
| i_category | STRING | 10 | 0 | -1 | -1 |
+------------------+-----------+------------------+--------+----------+----------+
</code></pre>
<p class="p">
<strong class="ph b">Related information:</strong>
</p>
<p class="p">
<a class="xref" href="impala_compute_stats.html#compute_stats">COMPUTE STATS Statement</a>, <a class="xref" href="impala_show.html#show_table_stats">SHOW TABLE STATS Statement</a>,
<a class="xref" href="impala_show.html#show_column_stats">SHOW COLUMN STATS Statement</a>, <a class="xref" href="impala_perf_stats.html#perf_stats">Table and Column Statistics</a>
</p>
</div>
<div class="related-links">
<div class="familylinks">
<div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref_sql.html">Impala SQL Statements</a></div>
</div>
</div></body>
</html>