<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd">
<concept rev="ver" id="incompatible_changes">
<title><ph audience="standalone">Incompatible Changes and Limitations in
Apache Impala</ph><ph audience="integrated">Apache Impala Incompatible
Changes and Limitations</ph></title>
<prolog>
<metadata>
<data name="Category" value="Impala"/>
<data name="Category" value="Release Notes"/>
<data name="Category" value="Incompatible Changes"/>
<data name="Category" value="Limitations"/>
<data name="Category" value="Upgrading"/>
<data name="Category" value="Troubleshooting"/>
<data name="Category" value="Administrators"/>
<data name="Category" value="Developers"/>
<data name="Category" value="Data Analysts"/>
</metadata>
</prolog>
<conbody>
<p> The Impala version covered by this documentation library contains the
following incompatible changes. These are things such as file format
changes, removed features, or changes to implementation, default
configuration, dependencies, or prerequisites that could cause issues
during or after an Impala upgrade. </p>
<p> Even newly added SQL statements or clauses can introduce incompatibilities if
you have databases, tables, or columns whose names conflict with the new
keywords. <ph audience="PDF">See <xref
href="impala_reserved_words.xml#reserved_words"/> for the set of
reserved words for the current release, and the quoting techniques to
avoid name conflicts.</ph>
</p>
<p outputclass="toc inpage"/>
</conbody>
<concept id="incompatible_changes_330x">
<title>Incompatible Changes Introduced in Impala 3.3.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_33">changelog for <keyword keyref="impala33"
/></xref>. <ul>
<li>Default file format changed to Parquet<p>When you create a table
without specifying a file format, the table data is now stored as
Parquet.</p><p>For backward compatibility, you can use the
<codeph>DEFAULT_FILE_FORMAT</codeph> query option to set the default
file format back to the previous default, text, or to another format.</p></li>
</ul></p>
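<p> As a minimal sketch of the backward-compatibility option described
above, a session could restore the earlier text default, or a statement
could name the format explicitly with <codeph>STORED AS</codeph>. The
table names are hypothetical. </p>
<codeblock>-- Restore the previous default for the current session.
SET DEFAULT_FILE_FORMAT=TEXT;
CREATE TABLE legacy_events (id BIGINT, payload STRING);

-- Alternatively, state the format explicitly so that the table
-- does not depend on the session default.
CREATE TABLE legacy_events_explicit (id BIGINT, payload STRING)
  STORED AS TEXTFILE;
</codeblock>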
</conbody>
</concept>
<concept rev="3.1.0" id="incompatible_changes_320x">
<title>Incompatible Changes Introduced in Impala 3.2.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_32">changelog for <keyword keyref="impala32"
/></xref>. </p>
<ul>
<li>Port change for the <codeph>SHUTDOWN</codeph> command<p>The
<codeph>SHUTDOWN</codeph> command for shutting down a remote
server used the backend port in Impala 3.1. Starting in Impala 3.2,
the command uses the KRPC port, for example,
<codeph>:shutdown('host100:27000')</codeph>.</p></li>
</ul>
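<p> The difference can be sketched as follows, as issued from
<cmdname>impala-shell</cmdname>. The host name is hypothetical, and the
port numbers assume the default settings (22000 for the backend port and
27000 for the KRPC port); substitute the values configured on your
cluster. </p>
<codeblock>-- Impala 3.1: a remote shutdown addressed the backend port.
:shutdown('host100:22000');

-- Impala 3.2 and higher: address the KRPC port instead.
:shutdown('host100:27000');
</codeblock>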
</conbody>
</concept>
<concept rev="3.1.0" id="incompatible_changes_310x">
<title>Incompatible Changes Introduced in Impala 3.1.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_31">changelog for <keyword keyref="impala31"
/></xref>. </p>
</conbody>
</concept>
<concept rev="3.0.0" id="incompatible_changes_300x">
<title>Incompatible Changes Introduced in Impala 3.0.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_300">changelog for <keyword keyref="impala30"
/></xref>. </p>
</conbody>
</concept>
<concept rev="2.12.0" id="incompatible_changes_212x">
<title>Incompatible Changes Introduced in Impala 2.12.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_212">changelog for <keyword keyref="impala212"
/></xref>. </p>
</conbody>
</concept>
<concept rev="2.11.0" id="incompatible_changes_211x">
<title>Incompatible Changes Introduced in Impala 2.11.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_211">changelog for <keyword keyref="impala211"
/></xref>. </p>
</conbody>
</concept>
<concept rev="2.10.0" id="incompatible_changes_210x">
<title>Incompatible Changes Introduced in Impala 2.10.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_210">changelog for <keyword keyref="impala210"
/></xref>. </p>
</conbody>
</concept>
<concept rev="2.9.0" id="incompatible_changes_29x">
<title>Incompatible Changes Introduced in Impala 2.9.x</title>
<conbody>
<p> For the full list of issues closed in this release, including any that
introduce behavior changes or incompatibilities, see the <xref
keyref="changelog_29">changelog for <keyword keyref="impala29"
/></xref>. </p>
<!-- Prose explanations of specific incompatible changes - TBD.
<ul>
<li>
</li>
</ul>
-->
</conbody>
</concept>
<concept rev="2.8.0" id="incompatible_changes_28x">
<title>Incompatible Changes Introduced in Impala 2.8.x</title>
<conbody>
<ul>
<li>
<p rev="IMPALA-4160"> Llama support is removed completely from Impala.
Related flags (<codeph>--enable_rm</codeph>) and query options (such
as <codeph>V_CPU_CORES</codeph>) remain but do not have any effect. </p>
<p rev="IMPALA-4160"> If <codeph>--enable_rm</codeph> is passed to
Impala, a warning is printed to the log on startup. </p>
</li>
<li>
<p rev="kudu"> The syntax related to Kudu tables includes a number of
new reserved words, such as <codeph>COMPRESSION</codeph>,
<codeph>DEFAULT</codeph>, and <codeph>ENCODING</codeph>, that
might conflict with names of existing tables, columns, or other
identifiers from older Impala versions. See <xref
href="impala_reserved_words.xml#reserved_words"/> for the full
list of reserved words. </p>
</li>
<li>
<p rev="kudu"> The DDL syntax for Kudu tables, particularly in the
<codeph>CREATE TABLE</codeph> statement, is different from the
special <codeph>impala_next</codeph> fork that was previously used
for accessing Kudu tables from Impala: </p>
<ul rev="kudu">
<li>
<p> The <codeph>DISTRIBUTE BY</codeph> clause is now
<codeph>PARTITION BY</codeph>. </p>
</li>
<li>
<p> The <codeph>INTO <varname>N</varname> BUCKETS</codeph> clause
is now <codeph>PARTITIONS <varname>N</varname></codeph>. </p>
</li>
<li>
<p> The <codeph>SPLIT ROWS</codeph> clause is replaced by
different syntax for specifying the ranges covered by each
partition. </p>
</li>
</ul>
</li>
<li>
<p> The <codeph>DESCRIBE</codeph> output for Kudu tables includes
several extra columns. </p>
</li>
<li>
<p rev="kudu IMPALA-4527"> Non-primary-key columns can contain
<codeph>NULL</codeph> values by default. The <codeph>SHOW CREATE
TABLE</codeph> output for these columns displays the
<codeph>NULL</codeph> attribute. There was a period during early
experimental versions of Impala + Kudu where non-primary-key columns
had the <codeph>NOT NULL</codeph> attribute by default. </p>
</li>
<li>
<p rev="kudu IMPALA-3710"> The <codeph>IGNORE</codeph> keyword that
was present in early experimental versions of Impala + Kudu is no
longer present. The behavior of the <codeph>IGNORE</codeph> keyword
is now the default: DML statements continue with warnings, instead
of failing with errors, if they encounter conditions such as
<q>primary key already exists</q> for an <codeph>INSERT</codeph>
statement or <q>primary key already deleted</q> for a
<codeph>DELETE</codeph> statement. </p>
</li>
<li>
<p rev="IMPALA-4589"> The replication factor for Kudu tables must be
an odd number. </p>
</li>
<li>
<p rev="IMPALA-4432"> A UDF compiled into an LLVM IR bitcode module
(<codeph>.bc</codeph>) might encounter a runtime error when native
code generation is turned off by setting the query option
<codeph>DISABLE_CODEGEN=1</codeph>. This issue also applies when
running a built-in or native UDF with more than 20 arguments. See
<xref keyref="IMPALA-4432">IMPALA-4432</xref> for details. As a
workaround, either turn native code generation back on with the
query option <codeph>DISABLE_CODEGEN=0</codeph>, or use the regular
UDF compilation path that does not produce an IR module. </p>
</li>
</ul>
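<p> The renamed Kudu clauses can be illustrated with a sketch such as the
following. The table and column names are hypothetical, and the statement
assumes the released Kudu syntax, in which the partitioning clause is
written <codeph>PARTITION BY</codeph>. </p>
<codeblock>-- Form used with the impala_next fork (no longer accepted):
--   DISTRIBUTE BY HASH (id) INTO 4 BUCKETS
CREATE TABLE kudu_metrics (
  id BIGINT,
  name STRING,       -- non-primary-key column, nullable by default
  PRIMARY KEY (id)
)
PARTITION BY HASH (id) PARTITIONS 4
STORED AS KUDU;
</codeblock>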
</conbody>
</concept>
<concept rev="2.7.0" id="incompatible_changes_27x">
<title>Incompatible Changes Introduced in Impala 2.7.x</title>
<conbody>
<ul>
<li>
<p rev="IMPALA-1731 IMPALA-3868"> Bug fixes related to parsing of
floating-point values (IMPALA-1731 and IMPALA-3868) can change the
results of casting strings that represent invalid floating-point
values. For example, string values beginning or ending
with <codeph>inf</codeph>, such as <codeph>1.23inf</codeph> or
<codeph>infinite</codeph>, are now converted to
<codeph>NULL</codeph> when interpreted as floating-point values.
Formerly, they were interpreted as the special <q>infinity</q> value
when converting from string to floating-point. Similarly, now only
the string <codeph>NaN</codeph> (case-sensitive) is interpreted as
the special <q>not a number</q> value. String values containing
multiple dots, such as <codeph>3..141</codeph> or
<codeph>3.1.4.1</codeph>, are now interpreted as
<codeph>NULL</codeph> rather than being converted to valid
floating-point values. </p>
</li>
</ul>
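<p> The cases described above can be checked with a query along the
following lines. The comments restate the behavior described in this
section; they are not captured output. </p>
<codeblock>SELECT
  CAST('1.23inf'  AS DOUBLE) AS ends_with_inf,    -- NULL in 2.7 and higher
  CAST('infinite' AS DOUBLE) AS starts_with_inf,  -- NULL in 2.7 and higher
  CAST('NaN'      AS DOUBLE) AS exact_nan,        -- special not-a-number value
  CAST('3..141'   AS DOUBLE) AS two_dots,         -- NULL in 2.7 and higher
  CAST('3.1.4.1'  AS DOUBLE) AS many_dots;        -- NULL in 2.7 and higher
</codeblock>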
</conbody>
</concept>
<concept rev="2.6.0" id="incompatible_changes_26x">
<title>Incompatible Changes Introduced in Impala 2.6.x</title>
<conbody>
<ul>
<li>
<p rev=""> The default for the <codeph>RUNTIME_FILTER_MODE</codeph>
query option is changed to <codeph>GLOBAL</codeph> (the highest
setting). </p>
</li>
<li rev="IMPALA-3007">
<p> The <codeph>RUNTIME_BLOOM_FILTER_SIZE</codeph> setting is now only
used as a fallback if statistics are not available; otherwise,
Impala uses the statistics to estimate the appropriate size to use
for each filter. </p>
</li>
<li>
<p rev="IMPALA-3199"> Admission control and dynamic resource pools are
enabled by default. When upgrading from an earlier release, you must
turn on these settings yourself if they are not already enabled. See
<xref href="impala_admission.xml#admission_control"/> for details
about admission control. </p>
</li>
<li>
<p> Impala reserves some new keywords, in preparation for support for
Kudu syntax: <codeph>buckets</codeph>, <codeph>delete</codeph>,
<codeph>distribute</codeph>, <codeph>hash</codeph>,
<codeph>ignore</codeph>, <codeph>split</codeph>, and
<codeph>update</codeph>. </p>
</li>
<li>
<p rev="IMPALA-3554"> For Kerberized clusters, the Catalog service now
uses the Kerberos principal instead of the operating system user that
runs the <cmdname>catalogd</cmdname> daemon. This eliminates the
requirement to configure a
<codeph>hadoop.user.group.static.mapping.overrides</codeph>
setting to put the OS user into the Sentry administrative group, on
clusters where the principal and the OS user name for this user are
different. </p>
</li>
<li>
<p> The mechanism for interpreting <codeph>DECIMAL</codeph> literals
is improved, no longer going through an intermediate conversion step
to <codeph>DOUBLE</codeph>: </p>
<ul>
<li>
<p rev="IMPALA-3163"> Casting a <codeph>DECIMAL</codeph> value to
<codeph>TIMESTAMP</codeph> produces a more precise value for the
<codeph>TIMESTAMP</codeph> than formerly. </p>
</li>
<li>
<p rev="IMPALA-3439"> Certain function calls involving
<codeph>DECIMAL</codeph> literals now succeed, when formerly
they failed due to lack of a function signature with a
<codeph>DOUBLE</codeph> argument. </p>
</li>
</ul>
</li>
<li>
<p rev="IMPALA-3155"> Improved type accuracy for <codeph>CASE</codeph>
return values. If all <codeph>WHEN</codeph> clauses of the
<codeph>CASE</codeph> expression are of <codeph>CHAR</codeph>
type, the final result is also <codeph>CHAR</codeph> instead of
being converted to <codeph>STRING</codeph>. </p>
</li>
<li>
<p conref="../shared/impala_common.xml#common/IMPALA-3662"/>
</li>
<li rev="IMPALA-3452">
<p> The <codeph>S3_SKIP_INSERT_STAGING</codeph> query option, which is
enabled by default, increases the speed of <codeph>INSERT</codeph>
operations for S3 tables. The speedup applies to regular
<codeph>INSERT</codeph>, but not <codeph>INSERT
OVERWRITE</codeph>. The tradeoff is the possibility of inconsistent
output files left behind if a node fails during
<codeph>INSERT</codeph> execution. See <xref
href="impala_s3_skip_insert_staging.xml#s3_skip_insert_staging"/>
for details. </p>
</li>
</ul>
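<p> A session might respond to some of these changes as sketched below.
The database, table, and column names are hypothetical, and the
<codeph>LOCAL</codeph> value assumes that this was the runtime filter
mode in effect before the upgrade. </p>
<codeblock>-- Go back to a less aggressive runtime filter mode, if needed.
SET RUNTIME_FILTER_MODE=LOCAL;

-- Quote identifiers that collide with the newly reserved words.
SELECT `hash`, `split` FROM legacy_db.`update`;

-- Turn off the faster S3 INSERT path if consistent output after a
-- node failure matters more than speed.
SET S3_SKIP_INSERT_STAGING=false;
</codeblock>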
<p> Certain features are turned off by default, to avoid regressions or
unexpected behavior following an upgrade. Consider turning on these
features after suitable testing: </p>
<ul>
<li>
<p rev="IMPALA-2660"> Impala now recognizes the
<codeph>auth_to_local</codeph> setting, specified through the HDFS
configuration setting
<codeph>hadoop.security.auth_to_local</codeph>. This feature is
disabled by default; to enable it, specify
<codeph>--load_auth_to_local_rules=true</codeph> in the
<cmdname>impalad</cmdname> configuration settings. </p>
</li>
<li>
<p rev="IMPALA-2069"> A new query option,
<codeph>PARQUET_ANNOTATE_STRINGS_UTF8</codeph>, makes Impala
include the <codeph>UTF-8</codeph> annotation metadata for
<codeph>STRING</codeph>, <codeph>CHAR</codeph>, and
<codeph>VARCHAR</codeph> columns in Parquet files created by
<codeph>INSERT</codeph> or <codeph>CREATE TABLE AS SELECT</codeph>
statements. </p>
</li>
<li>
<p rev="IMPALA-2835"> A new query option,
<codeph>PARQUET_FALLBACK_SCHEMA_RESOLUTION</codeph>, lets Impala
locate columns within Parquet files based on column name rather than
ordinal position. This enhancement improves interoperability with
applications that write Parquet files with a different order or
subset of columns than are used in the Impala table. </p>
</li>
</ul>
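<p> The two query options above can be tried for a single session before
you change any cluster-wide defaults. This is a sketch only; the
<codeph>name</codeph> value assumes that name-based resolution is the
setting you want to test. </p>
<codeblock>-- Write the UTF-8 annotation for string columns in new Parquet files.
SET PARQUET_ANNOTATE_STRINGS_UTF8=true;

-- Resolve Parquet columns by name instead of by ordinal position.
SET PARQUET_FALLBACK_SCHEMA_RESOLUTION=name;
</codeblock>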
</conbody>
</concept>
<concept rev="2.5.x" id="incompatible_changes_25x">
<title>Incompatible Changes Introduced in Impala 2.5.x</title>
<conbody>
<ul>
<li rev="IMPALA-3044">
<p> The admission control default limit for concurrent queries (the
<uicontrol>max requests</uicontrol> setting) is now unlimited
instead of 200. </p>
</li>
<li>
<p rev="IMPALA-2749"> Multiplying a mixture of
<codeph>DECIMAL</codeph> and <codeph>FLOAT</codeph> or
<codeph>DOUBLE</codeph> values now returns <codeph>DOUBLE</codeph>
rather than <codeph>DECIMAL</codeph>. This change avoids some cases
where an intermediate value would underflow or overflow and become
<codeph>NULL</codeph> unexpectedly. The results of multiplying
<codeph>DECIMAL</codeph> and <codeph>FLOAT</codeph> or
<codeph>DOUBLE</codeph> might now be slightly less precise than
before. Previously, the intermediate types and thus the final result
depended on the exact order of the values of different types being
multiplied, which made the final result values difficult to reason
about. </p>
</li>
<li rev="IMPALA-2204">
<p> Previously, the <codeph>_</codeph> and <codeph>%</codeph> wildcard
characters for the <codeph>LIKE</codeph> operator would not match
characters on the second or subsequent lines of multi-line string
values. The fix for issue <xref keyref="IMPALA-2204"
>IMPALA-2204</xref> causes the wildcard matching to apply to the
entire string for values containing embedded <codeph>\n</codeph>
characters. This could cause different results than in previous
Impala releases for identical queries on identical data. </p>
</li>
<li rev="IMPALA-1748">
<p> Formerly, all Impala UDFs and UDAs required running the
<codeph>CREATE FUNCTION</codeph> statements to re-create them
after each <cmdname>catalogd</cmdname> restart. In <keyword
keyref="impala25_full"/> and higher, functions written in C++ are
persisted across restarts, and the requirement to re-create
functions only applies to functions written in Java. Adapt any
function-reloading logic that you have added to your Impala
environment. </p>
</li>
<li>
<p rev="IMPALA-1651">
<codeph>CREATE TABLE LIKE</codeph> no longer inherits HDFS caching
settings from the source table. </p>
</li>
<li>
<p rev="IMPALA-2070"> The <codeph>SHOW DATABASES</codeph> statement
now returns two columns rather than one. The second column includes
the associated comment string, if any, for each database. Adjust any
application code that examines the list of databases and assumes the
result set contains only a single column. </p>
</li>
<li>
<p> The output of the <codeph>SHOW FUNCTIONS</codeph> statement
includes two new columns, showing the kind of the function (for
example, <codeph>BUILTIN</codeph>) and whether or not the function
persists across catalog server restarts. For example, the
<codeph>SHOW FUNCTIONS</codeph> output for the
<codeph>_impala_builtins</codeph> database starts with: </p>
<codeblock>
+--------------+-------------------------------------------------+-------------+---------------+
| return type | signature | binary type | is persistent |
+--------------+-------------------------------------------------+-------------+---------------+
| BIGINT | abs(BIGINT) | BUILTIN | true |
| DECIMAL(*,*) | abs(DECIMAL(*,*)) | BUILTIN | true |
| DOUBLE | abs(DOUBLE) | BUILTIN | true |
...
</codeblock>
</li>
</ul>
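<p> Several of the changes listed above can be observed with short
queries such as the following. The literal values are arbitrary, and the
comments restate the behavior described in this section rather than
captured output. </p>
<codeblock>-- DECIMAL multiplied by DOUBLE now yields DOUBLE.
SELECT typeof(CAST(1.5 AS DECIMAL(5,2)) * CAST(2 AS DOUBLE));

-- LIKE wildcards now apply across embedded newline characters.
SELECT 'first line\nsecond line' LIKE '%second%';

-- The result set now has two columns: the database name and its comment.
SHOW DATABASES;
</codeblock>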
</conbody>
</concept>
<concept rev="2.4.x" id="incompatible_changes_24x">
<title>Incompatible Changes Introduced in Impala 2.4.x</title>
<conbody>
<p> Other than support for DSSD storage, the Impala feature set for
<keyword keyref="impala24"/> is the same as for <keyword
keyref="impala23"/>. Therefore, there are no incompatible changes for
Impala introduced in <keyword keyref="impala24"/>. </p>
</conbody>
</concept>
<!-- All 2.3.x subsections go under here -->
<!-- Actually for 2.3 and higher, let's get away from doing a separate subhead for each maintenance release,
because in the normal course of events there will be nothing to add here until the next full release. If something new
needs to get noted, just add a new bullet with wording to indicate which x.y.z release it applies to. -->
<concept rev="2.3.x" id="incompatible_changes_23x">
<title>Incompatible Changes Introduced in Impala 2.3.x</title>
<conbody>
<note conref="../shared/impala_common.xml#common/impala_llama_obsolete"/>
<ul>
<li rev="IMPALA-2005" audience="hidden">
<p> If a <codeph>CREATE TABLE AS SELECT</codeph> operation fails while
data is being inserted, the table is automatically removed.
Previously, the table was left behind with no data. </p>
</li>
<li rev="IMPALA-2130">
<p> If Impala encounters a Parquet file that is invalid because of an
incorrect magic number, the query skips the file. This change is
caused by the fix for issue <xref keyref="IMPALA-2130"
>IMPALA-2130</xref>. Previously, Impala would attempt to read the
file despite the possibility that the file was corrupted. </p>
</li>
<li rev="IMPALA-2233">
<p> Previously, calls to overloaded built-in functions could treat
parameters as <codeph>DOUBLE</codeph> or <codeph>FLOAT</codeph> when
no overload had a signature that matched the exact argument types.
Now Impala prefers the function signature with
<codeph>DECIMAL</codeph> parameters in this case. This change
avoids a possible loss of precision in function calls such as
<codeph>greatest(0, 99999.8888)</codeph>; now both parameters are
treated as <codeph>DECIMAL</codeph> rather than
<codeph>DOUBLE</codeph>, avoiding any loss of precision in the
fractional value. This could cause slightly different results than
in previous Impala releases for certain function calls. </p>
</li>
<li rev="IMPALA-1675">
<p> Formerly, adding a large interval value to, or subtracting one from, a
<codeph>TIMESTAMP</codeph> could produce a nonsensical result. Now
when the result goes outside the range of <codeph>TIMESTAMP</codeph>
values, Impala returns <codeph>NULL</codeph>. </p>
</li>
<li rev="IMPALA-2251 IMPALA-2257">
<p> Formerly, it was possible to create a table with
identical row and column delimiters. This could happen
unintentionally, when specifying one of the delimiters and using the
default value for the other. Now an attempt to use identical
delimiters still succeeds, but displays a warning message. </p>
</li>
<li rev="">
<p> Formerly, Impala could include snippets of table data in log files
by default, for example when reporting conversion errors for data
values. Now any such log messages are only produced at higher
logging levels that you would enable only during debugging. </p>
</li>
<!-- placeholder -->
</ul>
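<p> The function-resolution and <codeph>TIMESTAMP</codeph> range changes
above can be checked with queries such as these. The timestamp literal is
an arbitrary value near the top of the supported range, and the comments
restate the behavior described in this section. </p>
<codeblock>-- Both arguments are now treated as DECIMAL rather than DOUBLE.
SELECT greatest(0, 99999.8888), typeof(greatest(0, 99999.8888));

-- A result outside the TIMESTAMP range is now NULL.
SELECT CAST('9999-12-31 00:00:00' AS TIMESTAMP) + INTERVAL 10 YEARS;
</codeblock>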
</conbody>
</concept>
<!-- All 2.2.x subsections go under here -->
<concept rev="2.2.x" id="incompatible_changes_22x">
<title>Incompatible Changes Introduced in Impala 2.2.x</title>
<conbody>
<section id="files_220">
<title> Changes to File Handling </title>
<p conref="../shared/impala_common.xml#common/ignore_file_extensions"/>
<p> The log rotation feature in Impala 2.2.0 and higher means that older
log files are now removed by default. The default is to preserve the
latest 10 log files for each severity level, for each Impala-related
daemon. If you have set up your own log rotation processes that expect
older files to be present, either adjust your procedures or change the
Impala <codeph>-max_log_files</codeph> setting. <ph audience="PDF">See
<xref href="impala_logging.xml#logs_rotate"/> for details.</ph>
</p>
</section>
<section id="prereqs_210">
<title> Changes to Prerequisites </title>
<p conref="../shared/impala_common.xml#common/cpu_prereq"/>
</section>
</conbody>
</concept>
<!-- All 2.1.x subsections go under here -->
<concept rev="2.1.x" id="incompatible_changes_21x">
<title>Incompatible Changes Introduced in Impala 2.1.x</title>
<conbody>
<section id="prereqs_210">
<title> Changes to Prerequisites </title>
<p rev=""> Currently, Impala 2.1.x does not function on CPUs without the
SSE4.1 instruction set. This minimum CPU requirement is higher than in
previous versions, which relied on the older SSSE3 instruction set.
Check the CPU level of the hosts in your cluster before upgrading to
<keyword keyref="impala21_full"/>. </p>
</section>
<section id="output_format_210">
<title> Changes to Output Format </title>
<p> The <q>small query</q> optimization feature introduces some new
information in the <codeph>EXPLAIN</codeph> plan, which you might need
to account for if you parse the text of the plan output. </p>
</section>
<section id="reserved_words_210">
<title> New Reserved Words </title>
<p> New SQL syntax introduces additional reserved words:
<codeph>FOR</codeph>, <codeph>GRANT</codeph>,
<codeph>REVOKE</codeph>, <codeph>ROLE</codeph>,
<codeph>ROLES</codeph>, <codeph>INCREMENTAL</codeph>. <ph
audience="PDF">As always, see <xref
href="impala_reserved_words.xml#reserved_words"/> for the set of
reserved words for the current release, and the quoting techniques
to avoid name conflicts.</ph>
</p>
</section>
</conbody>
</concept>
<!-- All 2.0.x subsections go under here -->
<concept rev="2.0.5" id="incompatible_changes_205">
<title>Incompatible Changes Introduced in Impala 2.0.5</title>
<conbody>
<p> No incompatible changes. </p>
</conbody>
</concept>
<concept rev="2.0.4" id="incompatible_changes_204">
<title>Incompatible Changes Introduced in Impala 2.0.4</title>
<conbody>
<p> No incompatible changes. </p>
</conbody>
</concept>
<concept rev="2.0.3" id="incompatible_changes_203">
<title>Incompatible Changes Introduced in Impala 2.0.3</title>
<conbody> </conbody>
</concept>
<concept rev="2.0.2" id="incompatible_changes_202">
<title>Incompatible Changes Introduced in Impala 2.0.2</title>
<conbody>
<p> No incompatible changes. </p>
</conbody>
</concept>
<concept rev="2.0.1" id="incompatible_changes_201">
<title>Incompatible Changes Introduced in Impala 2.0.1</title>
<conbody>
<ul>
<li>
<p
conref="../shared/impala_common.xml#common/insert_hidden_work_directory"
/>
</li>
<li>
<p> The <codeph>abs()</codeph> function now takes a broader range of
numeric types as arguments, and the return type is the same as the
argument type. </p>
</li>
<li>
<p> Shorthand notation for character classes in regular expressions,
such as <codeph>\d</codeph> for digit, is now available again in
regular expression operators and functions such as
<codeph>regexp_extract()</codeph> and
<codeph>regexp_replace()</codeph>. Some other differences in
regular expression behavior remain between Impala 1.x and Impala 2.x
releases. See <xref
href="impala_incompatible_changes.xml#incompatible_changes_200"/>
for details. </p>
</li>
</ul>
</conbody>
</concept>
<concept rev="2.0.0" id="incompatible_changes_200">
<title>Incompatible Changes Introduced in Impala 2.0.0</title>
<conbody>
<section id="prereqs_200">
<title> Changes to Prerequisites </title>
<p rev=""> Currently, Impala 2.0.x does not function on CPUs without the
SSE4.1 instruction set. This minimum CPU requirement is higher than in
previous versions, which relied on the older SSSE3 instruction set.
Check the CPU level of the hosts in your cluster before upgrading to
<keyword keyref="impala20_full"/>. </p>
</section>
<section id="queries_200">
<title> Changes to Query Syntax </title>
<p> The new syntax where query hints are allowed in comments causes some
changes in the way comments are parsed in the
<cmdname>impala-shell</cmdname> interpreter. Previously, you could
end a <codeph>--</codeph> comment line with a semicolon and
<cmdname>impala-shell</cmdname> would treat that as a no-op
statement. Now, a comment line ending with a semicolon is passed as an
empty statement to the Impala daemon, where it is flagged as an error. </p>
<p> Impala 2.0 and later uses a different support library for regular
expression parsing than in earlier Impala versions. Now, Impala uses
the <xref href="https://code.google.com/p/re2/" scope="external"
format="html">Google RE2 library</xref> rather than Boost for
evaluating regular expressions. This implementation change causes some
differences in the allowed regular expression syntax, and in the way
certain regex operators are interpreted. The following are some of the
major differences (not necessarily a complete list): </p>
<ul>
<li>
<p>
<codeph>.*?</codeph> notation for non-greedy matches is now
supported, where it was not in earlier Impala releases. </p>
</li>
<li>
<p> By default, <codeph>^</codeph> and <codeph>$</codeph> now match
only begin/end of buffer, not begin/end of each line. This
behavior can be overridden in the regex itself using the
<codeph>m</codeph> flag. </p>
</li>
<li>
<p> By default, <codeph>.</codeph> does not match newline. This
behavior can be overridden in the regex itself using the
<codeph>s</codeph> flag. </p>
</li>
<li>
<p>
<codeph>\Z</codeph> is not supported. </p>
</li>
<li>
<p>
<codeph>\&lt;</codeph> and <codeph>\&gt;</codeph> for start of word
and end of word are not supported. </p>
</li>
<li>
<p> Lookahead and lookbehind are not supported. </p>
</li>
<li>
<p> Shorthand notation for character classes, such as
<codeph>\d</codeph> for digit, is not recognized. (This
restriction is lifted in Impala 2.0.1, which restores the
shorthand notation.) </p>
</li>
</ul>
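<p> Some of these differences can be explored with queries such as the
following sketch. The input strings are arbitrary, and the
<codeph>(?m)</codeph> and <codeph>(?s)</codeph> prefixes assume the
inline flag notation of the RE2 library. </p>
<codeblock>-- Non-greedy matching is now supported.
SELECT regexp_extract('abcabc', 'a.*?c', 0);

-- By default ^ and $ match only at the start and end of the whole value;
-- the m flag makes them match at the start and end of each line.
SELECT regexp_like('line1\nline2', '^line2$');
SELECT regexp_like('line1\nline2', '(?m)^line2$');

-- By default . does not match a newline; the s flag changes that.
SELECT regexp_like('a\nb', 'a.b');
SELECT regexp_like('a\nb', '(?s)a.b');
</codeblock>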
</section>
<section id="output_format_210">
<title> Changes to Output Format </title>
<p conref="../shared/impala_common.xml#common/user_kerberized"/>
<p> The changed format for the user name in secure environments is also
reflected where the user name is displayed in the output of the
<codeph>PROFILE</codeph> command. </p>
<p> In the output from <codeph>SHOW FUNCTIONS</codeph>, <codeph>SHOW
AGGREGATE FUNCTIONS</codeph>, and <codeph>SHOW ANALYTIC
FUNCTIONS</codeph>, arguments and return types of arbitrary
<codeph>DECIMAL</codeph> scale and precision are represented as
<codeph>DECIMAL(*,*)</codeph>. Formerly, these items were displayed
as <codeph>DECIMAL(-1,-1)</codeph>. </p>
</section>
<section id="query_options_200">
<title> Changes to Query Options </title>
<p> The <codeph>PARQUET_COMPRESSION_CODEC</codeph> query option has been
replaced by the <codeph>COMPRESSION_CODEC</codeph> query option. <ph
audience="PDF">See <xref
href="impala_compression_codec.xml#compression_codec"/> for
details.</ph>
</p>
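<p> Scripts that set the old option need a one-line change, sketched
here with <codeph>snappy</codeph> as an arbitrary example value: </p>
<codeblock>-- Formerly: SET PARQUET_COMPRESSION_CODEC=snappy;
SET COMPRESSION_CODEC=snappy;
</codeblock>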
</section>
<section id="config_options_200">
<title> Changes to Configuration Options </title>
<p> The meaning of the <codeph>--idle_query_timeout</codeph>
configuration option is changed, to accommodate the new
<codeph>QUERY_TIMEOUT_S</codeph> query option. Rather than setting
an absolute timeout period that applies to all queries, it now sets a
maximum timeout period, which can be adjusted downward for individual
queries by specifying a value for the <codeph>QUERY_TIMEOUT_S</codeph>
query option. In sessions where no <codeph>QUERY_TIMEOUT_S</codeph>
query option is specified, the <codeph>--idle_query_timeout</codeph>
timeout period applies the same as in earlier versions. </p>
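<p> For example, on a hypothetical cluster whose daemons are started with
<codeph>--idle_query_timeout=600</codeph>, an individual session could
choose a shorter period. The values are illustrative only. </p>
<codeblock>-- Lower the idle timeout to 60 seconds for queries in this session.
-- The startup value (600 seconds in this sketch) remains the maximum.
SET QUERY_TIMEOUT_S=60;
</codeblock>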
<p> The <codeph>--strict_unicode</codeph> option of
<cmdname>impala-shell</cmdname> was removed. To avoid problems with
Unicode values in <cmdname>impala-shell</cmdname>, define the
following locale setting before running
<cmdname>impala-shell</cmdname>: </p>
<codeblock>export LC_CTYPE=en_US.UTF-8
</codeblock>
</section>
<section id="reserved_words_210">
<title> New Reserved Words </title>
<p> Some new SQL syntax requires the addition of new reserved words:
<codeph>ANTI</codeph>, <codeph>ANALYTIC</codeph>,
<codeph>OVER</codeph>, <codeph>PRECEDING</codeph>,
<codeph>UNBOUNDED</codeph>, <codeph>FOLLOWING</codeph>,
<codeph>CURRENT</codeph>, <codeph>ROWS</codeph>,
<codeph>RANGE</codeph>, <codeph>CHAR</codeph>,
<codeph>VARCHAR</codeph>. <ph audience="PDF">As always, see <xref
href="impala_reserved_words.xml#reserved_words"/> for the set of
reserved words for the current release, and the quoting techniques
to avoid name conflicts.</ph>
</p>
</section>
<section id="output_files_200">
<title> Changes to Data Files </title>
<p id="parquet_block_size"> The default Parquet block size for Impala is
changed from 1 GB to 256 MB. This change could have implications for
the sizes of Parquet files produced by <codeph>INSERT</codeph> and
<codeph>CREATE TABLE AS SELECT</codeph> statements. </p>
<p> Although older Impala releases typically produced files that were
smaller than the old default size of 1 GB, now the file size matches
more closely whatever value is specified for the
<codeph>PARQUET_FILE_SIZE</codeph> query option. Thus, if you use a
non-default value for this setting, the output files could be larger
than before. They still might be somewhat smaller than the specified
value, because Impala makes conservative estimates about the space
needed to represent each column as it encodes the data. </p>
<p> When you do not specify an explicit value for the
<codeph>PARQUET_FILE_SIZE</codeph> query option, Impala tries to
keep the file size within the 256 MB default size, but Impala might
adjust the file size to be somewhat larger if needed to accommodate
the layout for <term>wide</term> tables, that is, tables with hundreds
or thousands of columns. </p>
<p> This change is unlikely to affect memory usage while writing Parquet
files, because Impala does not pre-allocate the memory needed to hold
the entire Parquet block. </p>
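<p> If existing jobs depend on a particular output file size, the size
can be stated explicitly instead of relying on the default. In this
sketch the table names and the 128 MB value are hypothetical. </p>
<codeblock>-- Request Parquet files of roughly 128 MB for this session.
SET PARQUET_FILE_SIZE=128m;
INSERT OVERWRITE parquet_sales SELECT * FROM staging_sales;
</codeblock>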
</section>
</conbody>
</concept>
<concept rev="1.4.4" id="incompatible_changes_144">
<title>Incompatible Changes Introduced in Impala 1.4.4</title>
<conbody>
<p> No incompatible changes. </p>
</conbody>
</concept>
<concept rev="1.4.3" id="incompatible_changes_143">
<title>Incompatible Changes Introduced in Impala 1.4.3</title>
<conbody>
<p> No incompatible changes. The TLS/SSL security fix does not require any
change in the way you interact with Impala. </p>
</conbody>
</concept>
<concept rev="1.4.2" id="incompatible_changes_142">
<title>Incompatible Changes Introduced in Impala 1.4.2</title>
<conbody>
<p> None. Impala 1.4.2 is purely a bug-fix release. It does not include
any incompatible changes. </p>
</conbody>
</concept>
<concept rev="1.4.1" id="incompatible_changes_141">
<title>Incompatible Changes Introduced in Impala 1.4.1</title>
<conbody>
<p> None. Impala 1.4.1 is purely a bug-fix release. It does not include
any incompatible changes. </p>
</conbody>
</concept>
<concept rev="1.4.0" id="incompatible_changes_140">
<title>Incompatible Changes Introduced in Impala 1.4.0</title>
<prolog>
<metadata>
<data name="Category" value="Deprecated Features"/>
</metadata>
</prolog>
<conbody>
<ul>
<li>
<p> There is a slight change to required security privileges in the
Sentry framework. To create a new object, you now need the
<codeph>ALL</codeph> privilege on the parent object. For example,
creating a new table, view, or function requires the
<codeph>ALL</codeph> privilege on the database containing the new
object. See <xref href="impala_authorization.xml"/> for a full list
of operations and associated privileges. </p>
</li>
<li>
<p> Because <codeph>ORDER BY</codeph> queries can now process
unlimited amounts of data without a <codeph>LIMIT</codeph> clause, the
query options <codeph>DEFAULT_ORDER_BY_LIMIT</codeph> and
<codeph>ABORT_ON_DEFAULT_LIMIT_EXCEEDED</codeph> are now
deprecated and have no effect. <ph audience="PDF">See <xref
href="impala_order_by.xml#order_by"/> for details about
improvements to the <codeph>ORDER BY</codeph> clause.</ph>
</p>
</li>
<li>
<p> There are some changes to the list of reserved words. <ph
audience="PDF">See <xref
href="impala_reserved_words.xml#reserved_words"/> for the most
current list.</ph> The following keywords are new: </p>
<ul>
<li>
<codeph>API_VERSION</codeph>
</li>
<li>
<codeph>BINARY</codeph>
</li>
<li>
<codeph>CACHED</codeph>
</li>
<li>
<codeph>CLASS</codeph>
</li>
<li>
<codeph>PARTITIONS</codeph>
</li>
<li>
<codeph>PRODUCED</codeph>
</li>
<li>
<codeph>UNCACHED</codeph>
</li>
</ul>
<p> The following were formerly reserved keywords, but are no longer
reserved: </p>
<ul>
<li>
<codeph>COUNT</codeph>
</li>
<li>
<codeph>GROUP_CONCAT</codeph>
</li>
<li>
<codeph>NDV</codeph>
</li>
<li>
<codeph>SUM</codeph>
</li>
</ul>
</li>
<li>
<p> The fix for issue <xref keyref="IMPALA-973">IMPALA-973</xref>
changes the behavior of the <codeph>INVALIDATE METADATA</codeph>
statement regarding nonexistent tables. In Impala 1.4.0 and higher,
the statement returns an error if the specified table is not in the
metastore database at all. It completes successfully if the
specified table is in the metastore database but not yet recognized
by Impala, for example if the table was created through Hive.
Formerly, you could issue this statement for a completely
nonexistent table, with no error. </p>
</li>
</ul>
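<p> A minimal sketch of the <codeph>INVALIDATE METADATA</codeph> behavior
in Impala 1.4.0 and higher. Both table names are hypothetical: </p>
<codeblock>-- Assumes new_hive_table was created through Hive and is not yet
-- recognized by Impala. The statement completes successfully.
invalidate metadata new_hive_table;
-- Assumes no_such_table is not in the metastore database at all.
-- In Impala 1.4.0 and higher, this statement returns an error.
invalidate metadata no_such_table;</codeblock>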
</conbody>
</concept>
<concept rev="1.3.3" id="incompatible_changes_133">
<title>Incompatible Changes Introduced in Impala 1.3.3</title>
<conbody>
<p> No incompatible changes. The TLS/SSL security fix does not require any
change in the way you interact with Impala. </p>
</conbody>
</concept>
<concept rev="1.3.2" id="incompatible_changes_132">
<title>Incompatible Changes Introduced in Impala 1.3.2</title>
<conbody>
<p> With the fix for IMPALA-1019, you can use HDFS caching for files that
are accessed by Impala. </p>
</conbody>
</concept>
<concept rev="1.3.1" id="incompatible_changes_131">
<title>Incompatible Changes Introduced in Impala 1.3.1</title>
<conbody>
<ul>
<li>
<p conref="../shared/impala_common.xml#common/regexp_matching"/>
</li>
<li>
<p> The result set for the <codeph>SHOW FUNCTIONS</codeph> statement
includes a new first column, with the data type of the return value.
<ph audience="PDF">See <xref href="impala_show.xml#show"/> for
examples.</ph>
</p>
</li>
</ul>
</conbody>
</concept>
<concept rev="1.3.0" id="incompatible_changes_130">
<title>Incompatible Changes Introduced in Impala 1.3.0</title>
<conbody>
<ul>
<li>
<p> The <codeph>EXPLAIN_LEVEL</codeph> query option now accepts
numeric options from 0 (most concise) to 3 (most verbose), rather
than only 0 or 1. If you formerly used <codeph>SET
EXPLAIN_LEVEL=1</codeph> to get detailed explain plans, switch to
<codeph>SET EXPLAIN_LEVEL=3</codeph>. If you used the mnemonic
keyword (<codeph>SET EXPLAIN_LEVEL=verbose</codeph>), you do not
need to change your code because now level 3 corresponds to
<codeph>verbose</codeph>. <ph audience="PDF">See <xref
href="impala_explain_level.xml#explain_level"/> for details
about the allowed explain levels, and <xref
href="impala_explain_plan.xml#explain_plan"/> for usage
information.</ph>
</p>
</li>
<li>
<p> The keyword <codeph>DECIMAL</codeph> is now a reserved word. If
you have any databases, tables, columns, or other objects already
named <codeph>DECIMAL</codeph>, quote any references to them using
backticks (<codeph>``</codeph>) to avoid name conflicts with the
keyword. <note> Although the <codeph>DECIMAL</codeph> keyword is a
reserved word, currently Impala does not support
<codeph>DECIMAL</codeph> as a data type for columns. </note>
</p>
</li>
<li>
<p> The query option formerly named <codeph>YARN_POOL</codeph> is now
named <codeph>REQUEST_POOL</codeph> to reflect its broader use with
the Impala admission control feature. <ph audience="PDF">See <xref
href="impala_request_pool.xml#request_pool"/> for information
about the option, and <xref
href="impala_admission.xml#admission_control"/> for details
about its use with the admission control feature.</ph>
</p>
</li>
<li>
<p> There are some changes to the list of reserved words. <ph
audience="PDF">See <xref
href="impala_reserved_words.xml#reserved_words"/> for the most
current list.</ph>
</p>
<ul>
<li>
<p> The names of aggregate functions are no longer reserved words,
so you can have databases, tables, columns, or other objects
named <codeph>AVG</codeph>, <codeph>MIN</codeph>, and so on
without any name conflicts. </p>
</li>
<li>
<p> The internal function names <codeph>DISTINCTPC</codeph> and
<codeph>DISTINCTPCSA</codeph> are no longer reserved words,
although <codeph>DISTINCT</codeph> is still a reserved word.
</p>
</li>
<li>
<p> The keywords <codeph>CLOSE_FN</codeph> and
<codeph>PREPARE_FN</codeph> are now reserved words. <ph
audience="PDF">See <xref
href="impala_create_function.xml#create_function"/> for
their role in the <codeph>CREATE FUNCTION</codeph> statement,
and <xref href="impala_udf.xml#udf_threads"/> for usage
information.</ph>
</p>
</li>
</ul>
</li>
<li>
<p> The HDFS property
<codeph>dfs.client.file-block-storage-locations.timeout</codeph>
was renamed to
<codeph>dfs.client.file-block-storage-locations.timeout.millis</codeph>,
to emphasize that the unit of measure is milliseconds, not seconds.
Impala requires a timeout of at least 10 seconds, making the minimum
value for this setting 10000. If you are not using cluster
management software, you might need to edit the
<filepath>hdfs-site.xml</filepath> file in the Impala
configuration directory for the new name and minimum value. </p>
</li>
</ul>
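<p> The query option and reserved word changes in this release might
translate into script updates along the lines of the following sketch.
The table name <codeph>t1</codeph>, its column named
<codeph>decimal</codeph>, and the pool name
<codeph>example_pool</codeph> are hypothetical: </p>
<codeblock>-- Formerly, SET EXPLAIN_LEVEL=1 produced the detailed plan.
set EXPLAIN_LEVEL=3;
explain select count(*) from t1;
-- The YARN_POOL query option is now named REQUEST_POOL.
set REQUEST_POOL=example_pool;
-- Quote an existing column named DECIMAL with backticks.
select `decimal` from t1;</codeblock>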
</conbody>
</concept>
<concept rev="1.2.4" id="incompatible_changes_124">
<title>Incompatible Changes Introduced in Impala 1.2.4</title>
<conbody>
<p> There are no incompatible changes introduced in Impala 1.2.4. </p>
<p> Previously, after creating a table in Hive, you had to issue the
<codeph>INVALIDATE METADATA</codeph> statement with no table name, a
potentially expensive operation on clusters with many databases, tables,
and partitions. Starting in Impala 1.2.4, you can issue the statement
<codeph>INVALIDATE METADATA <varname>table_name</varname></codeph> for
a table newly created through Hive. Loading the metadata for only this
one table is faster and involves less network overhead. Therefore, you
might revisit your setup DDL scripts to add the table name to
<codeph>INVALIDATE METADATA</codeph> statements, in cases where you
create and populate the tables through Hive before querying them through
Impala. </p>
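<p> For instance, a setup script could narrow the statement as in the
following sketch. The table name <codeph>new_table</codeph> is
hypothetical and is assumed to have been created and populated through
Hive: </p>
<codeblock>-- Before Impala 1.2.4: invalidates metadata for all tables.
invalidate metadata;
-- Impala 1.2.4 and higher: affects only the newly created table.
invalidate metadata new_table;
select count(*) from new_table;</codeblock>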
</conbody>
</concept>
<concept rev="1.2.3" id="incompatible_changes_123">
<title>Incompatible Changes Introduced in Impala 1.2.3</title>
<conbody>
<p> Because the feature set of Impala 1.2.3 is identical to Impala 1.2.2,
there are no new incompatible changes. See <xref
href="impala_incompatible_changes.xml#incompatible_changes_122"/> if
you are upgrading from Impala 1.2.1 or 1.1.x. </p>
</conbody>
</concept>
<concept rev="1.2.2" id="incompatible_changes_122">
<title>Incompatible Changes Introduced in Impala 1.2.2</title>
<conbody>
<p> The following changes to SQL syntax and semantics in Impala 1.2.2
could require updates to your SQL code, or schema objects such as tables
or views: </p>
<ul>
<li>
<p> With the addition of the <codeph>CROSS JOIN</codeph> keyword, you
might need to rewrite any queries that refer to a table named
<codeph>CROSS</codeph> or use the name <codeph>CROSS</codeph> as a
table alias: </p>
<codeblock>-- Formerly, 'cross' in this query was an alias for t1
-- and it was a normal join query.
-- In 1.2.2 and higher, CROSS JOIN is a keyword, so 'cross'
-- is not interpreted as a table alias, and the query
-- uses the special CROSS JOIN processing rather than a
-- regular join.
select * from t1 cross join t2...
-- Now if CROSS is used in another context, such as a table or column name,
-- use backticks to escape it.
create table `cross` (x int);
select * from `cross`;</codeblock>
</li>
<li>
<p> Formerly, a <codeph>DROP DATABASE</codeph> statement in Impala
would not remove the top-level HDFS directory for that database. The
<codeph>DROP DATABASE</codeph> statement has been enhanced to remove that
directory. (You still need to drop all the tables inside the
database first; this change only applies to the top-level directory
for the entire database.) </p>
</li>
<li> The keyword <codeph>PARQUET</codeph> is introduced as a synonym for
<codeph>PARQUETFILE</codeph> in the <codeph>CREATE TABLE</codeph>
and <codeph>ALTER TABLE</codeph> statements, because that is the
common name for the file format (as opposed to SequenceFile and
RCFile, where the <q>File</q> suffix is part of the name).
Documentation examples have been changed to prefer the new shorter
keyword. The <codeph>PARQUETFILE</codeph> keyword is still available
for backward compatibility with older Impala versions. </li>
<li> New overloads are available for several operators and built-in
functions, allowing you to insert their result values into smaller
numeric columns such as <codeph>INT</codeph>,
<codeph>SMALLINT</codeph>, <codeph>TINYINT</codeph>, and
<codeph>FLOAT</codeph> without using a <codeph>CAST()</codeph> call.
If you remove the <codeph>CAST()</codeph> calls from
<codeph>INSERT</codeph> statements, those statements might not work
with earlier versions of Impala. </li>
</ul>
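<p> A brief sketch of the <codeph>PARQUET</codeph> synonym, using
hypothetical table names. In Impala 1.2.2 and higher, both
<codeph>CREATE TABLE</codeph> forms are expected to produce a table with
the same file format: </p>
<codeblock>-- New shorter keyword, available in Impala 1.2.2 and higher.
create table parquet_new (x int) stored as parquet;
-- Older keyword, still accepted for backward compatibility.
create table parquet_old (x int) stored as parquetfile;
-- The synonym also applies to ALTER TABLE.
alter table parquet_old set fileformat parquet;</codeblock>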
<p> Because many users are likely to upgrade straight from Impala 1.x to
Impala 1.2.2, also read <xref
href="impala_incompatible_changes.xml#incompatible_changes_121"/> for
things to note about upgrading to Impala 1.2.x in general. </p>
</conbody>
</concept>
<concept rev="1.2.1" id="incompatible_changes_121">
<title>Incompatible Changes Introduced in Impala 1.2.1</title>
<conbody>
<p> The following changes to SQL syntax and semantics in Impala 1.2.1
could require updates to your SQL code, or schema objects such as tables
or views: </p>
<ul>
<li>
<p conref="../shared/impala_common.xml#common/null_sorting_change"/>
<p audience="PDF"> See <xref href="impala_literals.xml#null"/> for
more information. </p>
</li>
</ul>
<p> The new <cmdname>catalogd</cmdname> service might require changes to
any user-written scripts that stop, start, or restart Impala services,
install or upgrade Impala packages, or issue <codeph>REFRESH</codeph> or
<codeph>INVALIDATE METADATA</codeph> statements: </p>
<ul conref="../shared/impala_common.xml#common/catalogd_xrefs">
<li/>
</ul>
</conbody>
</concept>
<concept rev="1.2" id="incompatible_changes_120">
<title>Incompatible Changes Introduced in Impala 1.2.0 (Beta)</title>
<conbody>
<p> There are no incompatible changes to SQL syntax in Impala 1.2.0
(beta). </p>
<p> The new <cmdname>catalogd</cmdname> service might require changes to
any user-written scripts that stop, start, or restart Impala services,
install or upgrade Impala packages, or issue <codeph>REFRESH</codeph> or
<codeph>INVALIDATE METADATA</codeph> statements: </p>
<ul conref="../shared/impala_common.xml#common/catalogd_xrefs">
<li/>
</ul>
<p> The new resource management feature interacts with both YARN and Llama
services. <ph audience="PDF">See <xref
href="impala_resource_management.xml#resource_management"/> for
usage information for Impala resource management.</ph>
</p>
</conbody>
</concept>
<concept id="incompatible_changes_111">
<title>Incompatible Changes Introduced in Impala 1.1.1</title>
<conbody>
<p> There are no incompatible changes in Impala 1.1.1. </p>
<!-- These couple of paragraphs were originally intended to be conref'ed from the Parquet section of Installing/Using. -->
<!-- But conbodydiv tag too restrictive, can't have just paragraphs and codeblocks inside. -->
<!-- So I will physically copy the info for the time being. -->
<!-- Also copying it under the Upgrading topic. -->
<!-- <conbodydiv conref="impala_parquet.xml#upgrade_parquet_metadata"/> -->
<p> Previously, it was not possible to create Parquet data through Impala
and reuse that table within Hive. Now that Parquet support is available
for Hive 0.10, reusing existing Impala Parquet data files in Hive requires
updating the table metadata. Use the following command if you are
already running Impala 1.1.1: </p>
<codeblock>ALTER TABLE <varname>table_name</varname> SET FILEFORMAT PARQUETFILE;
</codeblock>
<p> If you are running a version of Impala older than 1.1.1, do the
metadata update through Hive: </p>
<codeblock>ALTER TABLE <varname>table_name</varname> SET SERDE 'parquet.hive.serde.ParquetHiveSerDe';
ALTER TABLE <varname>table_name</varname> SET FILEFORMAT
INPUTFORMAT "parquet.hive.DeprecatedParquetInputFormat"
OUTPUTFORMAT "parquet.hive.DeprecatedParquetOutputFormat";
</codeblock>
<p> Impala 1.1.1 and higher can reuse Parquet data files created by Hive,
without any action required. </p>
<p> As usual, make sure to upgrade the Impala LZO package to the latest
level at the same time as you upgrade the Impala server. </p>
</conbody>
</concept>
<concept id="incompatible_changes_11">
<title>Incompatible Change Introduced in Impala 1.1</title>
<conbody>
<ul>
<li>
<p> The <codeph>REFRESH</codeph> statement now requires a table name;
in Impala 1.0, the table name was optional. This syntax change is
part of the internal rework to make <codeph>REFRESH</codeph> a true
Impala SQL statement so that it can be called through the JDBC and
ODBC APIs. <codeph>REFRESH</codeph> now reloads the metadata
immediately, rather than marking it for update the next time any
affected table is accessed. The previous behavior, where omitting
the table name caused a refresh of the entire Impala metadata
catalog, is available through the new <codeph>INVALIDATE
METADATA</codeph> statement. <codeph>INVALIDATE METADATA</codeph>
can be specified with a table name to affect a single table, or
without a table name to affect the entire metadata catalog; the
relevant metadata is reloaded the next time it is requested during
the processing for a SQL statement. See <xref
href="impala_refresh.xml#refresh"/> and <xref
href="impala_invalidate_metadata.xml#invalidate_metadata"/> for
the latest details about these statements. </p>
</li>
</ul>
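<p> The following sketch contrasts the two statements as of Impala 1.1.
The table name <codeph>t1</codeph> is hypothetical: </p>
<codeblock>-- Impala 1.1 and higher: REFRESH requires a table name and
-- reloads the metadata for that table immediately.
refresh t1;
-- Replacement for the Impala 1.0 REFRESH with no table name.
-- Affects the entire metadata catalog.
invalidate metadata;
-- Limit the operation to a single table; its metadata is reloaded
-- the next time a SQL statement requests it.
invalidate metadata t1;</codeblock>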
</conbody>
</concept>
<concept id="incompatible_changes_10">
<title>Incompatible Changes Introduced in Impala 1.0</title>
<conbody>
<ul>
<li> If you use LZO-compressed text files, when you upgrade Impala to
version 1.0, also update the Impala LZO package to the latest level.
See <xref href="impala_txtfile.xml#lzo"/> for details. </li>
</ul>
</conbody>
</concept>
</concept>