| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE html |
| PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> |
| |
| <meta name="copyright" content="(C) Copyright 2025" /> |
| <meta name="DC.rights.owner" content="(C) Copyright 2025" /> |
| <meta name="DC.Type" content="concept" /> |
| <meta name="DC.Title" content="REFRESH Statement" /> |
| <meta name="DC.Relation" scheme="URI" content="../topics/impala_langref_sql.html" /> |
| <meta name="prodname" content="Impala" /> |
| <meta name="prodname" content="Impala" /> |
| <meta name="version" content="Impala 3.4.x" /> |
| <meta name="version" content="Impala 3.4.x" /> |
| <meta name="DC.Format" content="XHTML" /> |
| <meta name="DC.Identifier" content="refresh" /> |
| <link rel="stylesheet" type="text/css" href="../commonltr.css" /> |
| <title>REFRESH Statement</title> |
| </head> |
| <body id="refresh"> |
| |
| |
| <h1 class="title topictitle1" id="ariaid-title1">REFRESH Statement</h1> |
| |
| |
| |
| |
| |
| |
| <div class="body conbody"> |
| |
| <p class="p"> |
| The <code class="ph codeph">REFRESH</code> statement reloads the metadata for the table from the |
| metastore database and does an incremental reload of the file and block metadata from the |
| HDFS NameNode. <code class="ph codeph">REFRESH</code> is used to avoid inconsistencies between Impala |
| and external metadata sources, namely Hive Metastore (HMS) and NameNodes. |
| </p> |
| |
| |
| <p class="p"> The <code class="ph codeph">REFRESH</code> statement is only required if you load data |
| from outside of Impala. Updated metadata, as a result of running |
| <code class="ph codeph">REFRESH</code>, is broadcast to all Impala coordinators. </p> |
| |
| |
| <p class="p"> |
| See <a class="xref" href="impala_hadoop.html#intro_metastore">Overview of Impala Metadata and the Metastore</a> for the information about the way |
| Impala uses metadata and how it shares the same metastore database as Hive. |
| </p> |
| |
| |
| <p class="p"> |
| Once issued, the <code class="ph codeph">REFRESH</code> statement cannot be cancelled. |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Syntax:</strong> |
| </p> |
| |
| |
| <pre class="pre codeblock"><code>REFRESH [<var class="keyword varname">db_name</var>.]<var class="keyword varname">table_name</var> [PARTITION (<var class="keyword varname">key_col1</var>=<var class="keyword varname">val1</var> [, <var class="keyword varname">key_col2</var>=<var class="keyword varname">val2</var>...])]</code></pre> |
| |
| <p class="p"> |
| <strong class="ph b">Usage notes:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| The table name is a required parameter, and the table must already exist and be known to |
| Impala. |
| </p> |
| |
| |
| <p class="p"> |
| Only the metadata for the specified table is reloaded. |
| </p> |
| |
| |
| <p class="p"> |
| Use the <code class="ph codeph">REFRESH</code> statement to load the latest metastore metadata for a |
| particular table after one of the following scenarios happens outside of Impala: |
| </p> |
| |
| |
| <ul class="ul"> |
| <li class="li"> Deleting, adding, or modifying files. <p class="p"> For example, after loading |
| new data files into the HDFS data directory for the table, appending |
| to an existing HDFS file, inserting data from Hive via |
| <code class="ph codeph">INSERT</code> or <code class="ph codeph">LOAD DATA</code>. </p> |
| |
| </li> |
| |
| |
| <li class="li"> |
| Deleting, adding, or modifying partitions. |
| <p class="p"> |
| For example, after issuing <code class="ph codeph">ALTER TABLE</code> or other table-modifying SQL |
| statement in Hive |
| </p> |
| |
| </li> |
| |
| </ul> |
| |
| |
| <div class="note note"><span class="notetitle">Note:</span> |
| <p class="p"> |
| In <span class="keyword">Impala 2.3</span> and higher, the <code class="ph codeph">ALTER TABLE |
| <var class="keyword varname">table_name</var> RECOVER PARTITIONS</code> statement is a faster |
| alternative to <code class="ph codeph">REFRESH</code> when you are only adding new partition |
| directories through Hive or manual HDFS operations. See |
| <a class="xref" href="impala_alter_table.html#alter_table">ALTER TABLE Statement</a> for details. |
| </p> |
| |
| </div> |
| |
| |
| <div class="p"> |
| <code class="ph codeph">INVALIDATE METADATA</code> and <code class="ph codeph">REFRESH</code> are counterparts: |
| <ul class="ul"> |
| <li class="li"> |
| <code class="ph codeph">INVALIDATE METADATA</code> is an asynchronous operations that simply |
| discards the loaded metadata from the catalog and coordinator caches. After that |
| operation, the catalog and all the Impala coordinators only know about the existence |
| of databases and tables and nothing more. Metadata loading for tables is triggered |
| by any subsequent queries. |
| </li> |
| |
| |
| <li class="li"> |
| <code class="ph codeph">REFRESH</code> reloads the metadata synchronously. |
| <code class="ph codeph">REFRESH</code> is more lightweight than doing a full metadata load after a |
| table has been invalidated. <code class="ph codeph">REFRESH</code> cannot detect changes in block |
| locations triggered by operations like HDFS balancer, hence causing remote reads |
| during query execution with negative performance implications. |
| </li> |
| |
| </ul> |
| |
| </div> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Refreshing a single partition:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| In <span class="keyword">Impala 2.7</span> and higher, the <code class="ph codeph">REFRESH</code> statement |
| can apply to a single partition at a time, rather than the whole table. Include the |
| optional <code class="ph codeph">PARTITION (<var class="keyword varname">partition_spec</var>)</code> clause and specify |
| values for each of the partition key columns. |
| </p> |
| |
| |
| <div class="p"> |
| The following rules apply: |
| <ul class="ul"> |
| <li class="li"> |
| The <code class="ph codeph">PARTITION</code> clause of the <code class="ph codeph">REFRESH</code> statement must |
| include all the partition key columns. |
| </li> |
| |
| |
| <li class="li"> |
| The order of the partition key columns does not have to match the column order in the |
| table. |
| </li> |
| |
| |
| <li class="li"> |
| Specifying a nonexistent partition does not cause an error. |
| </li> |
| |
| |
| <li class="li"> |
| The partition can be one that Impala created and is already aware of, or a new |
| partition created through Hive. |
| </li> |
| |
| </ul> |
| |
| </div> |
| |
| |
| <p class="p"> |
| The following examples demonstrates the above rules. |
| </p> |
| |
| |
| <pre class="pre codeblock"><code> |
| -- Partition doesn't exist. |
| refresh p2 partition (y=0, z=3); |
| refresh p2 partition (y=0, z=-1) |
| |
| -- Key columns specified in a different order than the table definition. |
| refresh p2 partition (z=1, y=0) |
| |
| -- Incomplete partition spec causes an error. |
| refresh p2 partition (y=0) |
| ERROR: AnalysisException: Items in partition spec must exactly match the partition columns in the table definition: default.p2 (1 vs 2) |
| |
| </code></pre> |
| |
| <p class="p"> |
| For examples of using <code class="ph codeph">REFRESH</code> and <code class="ph codeph">INVALIDATE METADATA</code> |
| with a combination of Impala and Hive operations, see |
| <a class="xref" href="impala_tutorial.html#tutorial_impala_hive">Switching Back and Forth Between Impala and Hive</a>. |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Related impala-shell options:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| Due to the expense of reloading the metadata for all tables, the |
| <span class="keyword cmdname">impala-shell</span> <code class="ph codeph">-r</code> option is not recommended. |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">HDFS permissions:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| All HDFS and Ranger permissions and privilege requirements are the same whether you |
| refresh the entire table or a single partition. |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">HDFS considerations:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| The <code class="ph codeph">REFRESH</code> statement checks HDFS permissions of the underlying data |
| files and directories, caching this information so that a statement can be cancelled |
| immediately if for example the <code class="ph codeph">impala</code> user does not have permission to |
| write to the data directory for the table. Impala reports any lack of write permissions as |
| an <code class="ph codeph">INFO</code> message in the log file. |
| </p> |
| |
| |
| <p class="p"> |
| If you change HDFS permissions to make data readable or writeable by the Impala user, |
| issue another <code class="ph codeph">REFRESH</code> to make Impala aware of the change. |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Kudu considerations:</strong> |
| </p> |
| |
| |
| <p class="p">By default, much of the metadata |
| for Kudu tables is handled by the underlying storage layer. Kudu tables |
| have less reliance on the Metastore database, and require less metadata |
| caching on the Impala side. For example, information about partitions in |
| Kudu tables is managed by Kudu, and Impala does not cache any block |
| locality metadata for Kudu tables. If the Kudu service is not integrated |
| with the Hive Metastore, Impala will manage Kudu table metadata in the |
| Hive Metastore.</p> |
| |
| |
| <p class="p"> |
| The <code class="ph codeph">REFRESH</code> and <code class="ph codeph">INVALIDATE METADATA</code> statements are |
| needed less frequently for Kudu tables than for HDFS-backed tables. Neither statement is |
| needed when data is added to, removed, or updated in a Kudu table, even if the changes |
| are made directly to Kudu through a client program using the Kudu API. Run |
| <code class="ph codeph">REFRESH <var class="keyword varname">table_name</var></code> or <code class="ph codeph">INVALIDATE METADATA |
| <var class="keyword varname">table_name</var></code> for a Kudu table only after making a change to |
| the Kudu table schema, such as adding or dropping a column. |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Related information:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| <a class="xref" href="impala_hadoop.html#intro_metastore">Overview of Impala Metadata and the Metastore</a>, |
| <a class="xref" href="impala_invalidate_metadata.html#invalidate_metadata">INVALIDATE METADATA Statement</a> |
| </p> |
| |
| |
| </div> |
| |
| |
| <div class="related-links"> |
| <div class="familylinks"> |
| <div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_langref_sql.html">Impala SQL Statements</a></div> |
| </div> |
| </div></body> |
| </html> |