| <?xml version="1.0" encoding="UTF-8"?> |
| <!DOCTYPE html |
| PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> |
| <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en"> |
| <head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> |
| |
| <meta name="copyright" content="(C) Copyright 2023" /> |
| <meta name="DC.rights.owner" content="(C) Copyright 2023" /> |
| <meta name="DC.Type" content="concept" /> |
| <meta name="DC.Title" content="COMPRESSION_CODEC Query Option (Impala 2.0 or higher only)" /> |
| <meta name="DC.Relation" scheme="URI" content="../topics/impala_set.html" /> |
| <meta name="prodname" content="Impala" /> |
| <meta name="version" content="Impala 3.4.x" /> |
| <meta name="DC.Format" content="XHTML" /> |
| <meta name="DC.Identifier" content="compression_codec" /> |
| <link rel="stylesheet" type="text/css" href="../commonltr.css" /> |
| <title>COMPRESSION_CODEC Query Option (Impala 2.0 or higher only)</title> |
| </head> |
| <body id="compression_codec"> |
| |
| |
| <h1 class="title topictitle1" id="ariaid-title1">COMPRESSION_CODEC Query Option (<span class="keyword">Impala 2.0</span> or higher only)</h1> |
| |
| |
| |
| |
| <div class="body conbody"> |
| |
| |
| |
| |
| |
| <p class="p"> |
| |
| When Impala writes Parquet data files using the <code class="ph codeph">INSERT</code> statement, the underlying compression |
| is controlled by the <code class="ph codeph">COMPRESSION_CODEC</code> query option. |
| </p> |
| |
| |
| <div class="note note"><span class="notetitle">Note:</span> |
| Prior to Impala 2.0, this option was named <code class="ph codeph">PARQUET_COMPRESSION_CODEC</code>. In Impala 2.0 and |
| later, the <code class="ph codeph">PARQUET_COMPRESSION_CODEC</code> name is not recognized. Use the more general name |
| <code class="ph codeph">COMPRESSION_CODEC</code> for new code. |
| </div> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Syntax:</strong> |
| </p> |
| |
| |
| <pre class="pre codeblock"><code>SET COMPRESSION_CODEC=<var class="keyword varname">codec_name</var>; -- Supported for all codecs. |
| SET COMPRESSION_CODEC=<var class="keyword varname">codec_name</var>:<var class="keyword varname">compression_level</var>; -- Only supported for ZSTD. |
| </code></pre> |
| |
| <p class="p"> |
| The allowed values for this query option are <code class="ph codeph">SNAPPY</code> (the default), <code class="ph codeph">GZIP</code>, |
| <code class="ph codeph">ZSTD</code>, <code class="ph codeph">LZ4</code>, and <code class="ph codeph">NONE</code>. |
| </p> |
| |
| |
| <p class="p"> |
| <code class="ph codeph">ZSTD</code> also supports setting a compression level. Lower levels compress faster but |
| produce larger files; higher levels compress more slowly but achieve a better compression ratio. |
| <code class="ph codeph">ZSTD</code> supports compression levels from 1 through 22. If no level is specified in the |
| <code class="ph codeph">COMPRESSION_CODEC</code> value, the default level 3 is used. |
| </p> |
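| |
| <p class="p"> |
| As a minimal sketch of the two ends of the level range described above, the following statements select the |
| fastest and the most aggressive <code class="ph codeph">ZSTD</code> settings. The destination table names are |
| hypothetical and are assumed to be existing Parquet tables with the same columns as <code class="ph codeph">t1</code>. |
| </p> |
| |
| <pre class="pre codeblock"><code>-- Level 1: fastest compression, typically the largest ZSTD output. |
| set compression_codec=zstd:1; |
| insert into parquet_table_zstd_fast select * from t1; |
| |
| -- Level 22: slowest compression, typically the smallest ZSTD output. |
| set compression_codec=zstd:22; |
| insert into parquet_table_zstd_max select * from t1; |
| </code></pre> |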
| |
| |
| <div class="note note"><span class="notetitle">Note:</span> |
| A Parquet file created with <code class="ph codeph">COMPRESSION_CODEC=NONE</code> is still typically smaller than the |
| original data, due to encoding schemes such as run-length encoding and dictionary encoding that are applied |
| separately from compression. |
| </div> |
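| |
| <p class="p"> |
| One way to observe this effect, sketched here under the assumption that <code class="ph codeph">t1</code> is a |
| text-format table and that the hypothetical Parquet table already exists, is to write the data without compression |
| and compare the <code class="ph codeph">Size</code> column reported by <code class="ph codeph">SHOW TABLE STATS</code> |
| for the two tables. |
| </p> |
| |
| <pre class="pre codeblock"><code>-- Write the data as Parquet with encoding only, no compression codec. |
| set compression_codec=none; |
| insert into parquet_table_no_compression select * from t1; |
| |
| -- Compare the on-disk sizes of the source table and the Parquet copy. |
| show table stats t1; |
| show table stats parquet_table_no_compression; |
| </code></pre> |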
| |
| |
| |
| |
| <p class="p"> |
| The option value is not case-sensitive. |
| </p> |
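| |
| <p class="p"> |
| For illustration, each of the following statements selects the same codec: |
| </p> |
| |
| <pre class="pre codeblock"><code>-- Upper case, lower case, and mixed case are equivalent. |
| set compression_codec=GZIP; |
| set compression_codec=gzip; |
| set compression_codec=Gzip; |
| </code></pre> |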
| |
| |
| <p class="p"> |
| If the option is set to an unrecognized value, all subsequent queries fail because of the invalid option |
| setting, not just queries involving Parquet tables. (The value <code class="ph codeph">BZIP2</code> is also recognized, but |
| is not compatible with Parquet tables.) |
| </p> |
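| |
| <p class="p"> |
| A minimal sketch of recovering from an invalid setting: assigning a recognized value again, such as the default |
| <code class="ph codeph">SNAPPY</code>, lets queries in the session proceed. The value <code class="ph codeph">foo</code> |
| is just a placeholder for any unrecognized codec name. |
| </p> |
| |
| <pre class="pre codeblock"><code>-- An unrecognized value causes later queries to fail, |
| -- including queries that do not write Parquet data. |
| set compression_codec=foo; |
| |
| -- Set a recognized value again before running further queries. |
| set compression_codec=snappy; |
| select * from t1 limit 5; |
| </code></pre> |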
| |
| |
| <p class="p"> |
| <strong class="ph b">Type:</strong> <code class="ph codeph">STRING</code> |
| </p> |
| |
| |
| <p class="p"> |
| <strong class="ph b">Default:</strong> <code class="ph codeph">SNAPPY</code> |
| </p> |
| |
| |
| |
| <p class="p"> |
| <strong class="ph b">Examples:</strong> |
| </p> |
| |
| |
| <pre class="pre codeblock"><code> |
| set compression_codec=lz4; |
| insert into parquet_table_lz4_compressed select * from t1; |
| |
| set compression_codec=zstd; -- Default compression level 3. |
| insert into parquet_table_zstd_default_compressed select * from t1; |
| |
| set compression_codec=zstd:12; -- Compression level 12. |
| insert into parquet_table_zstd_highly_compressed select * from t1; |
| |
| set compression_codec=gzip; |
| insert into parquet_table_highly_compressed select * from t1; |
| |
| set compression_codec=snappy; |
| insert into parquet_table_compression_plus_fast_queries select * from t1; |
| |
| set compression_codec=none; |
| insert into parquet_table_no_compression select * from t1; |
| |
| set compression_codec=foo; |
| select * from t1 limit 5; |
| ERROR: Invalid compression codec: foo |
| </code></pre> |
| |
| <p class="p"> |
| <strong class="ph b">Related information:</strong> |
| </p> |
| |
| |
| <p class="p"> |
| For information about how compressing Parquet data files affects query performance, see |
| <a class="xref" href="impala_parquet.html#parquet_compression">Compressions for Parquet Data Files</a>. |
| </p> |
| |
| </div> |
| |
| <div class="related-links"> |
| <div class="familylinks"> |
| <div class="parentlink"><strong>Parent topic:</strong> <a class="link" href="../topics/impala_set.html">SET Statement</a></div> |
| </div> |
| </div></body> |
| </html> |