| <?xml version="1.0" encoding="UTF-8"?> |
| <!-- |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| --> |
| <!DOCTYPE concept PUBLIC "-//OASIS//DTD DITA Concept//EN" "concept.dtd"> |
| <concept id="conversion_functions"> |
| |
| <title>Impala Type Conversion Functions</title> |
| |
| <titlealts audience="PDF"> |
| |
| <navtitle>Type Conversion Functions</navtitle> |
| |
| </titlealts> |
| |
| <prolog> |
| <metadata> |
| <data name="Category" value="Impala"/> |
| <data name="Category" value="Impala Functions"/> |
| <data name="Category" value="Impala Data Types"/> |
| <data name="Category" value="SQL"/> |
| <data name="Category" value="Data Analysts"/> |
| <data name="Category" value="Developers"/> |
| <data name="Category" value="Querying"/> |
| </metadata> |
| </prolog> |
| |
| <conbody> |
| |
| <p> |
| Conversion functions are typically used in combination with other functions to explicitly |
| pass the expected data types. Impala has strict rules regarding data types for function |
| parameters. For example, Impala does not automatically convert a <codeph>DOUBLE</codeph> |
| value to <codeph>FLOAT</codeph>, a <codeph>BIGINT</codeph> value to <codeph>INT</codeph>, |
| or other conversion where precision could be lost or overflow could occur. Also, for |
| reporting or dealing with loosely defined schemas in big data contexts, you might need to |
| convert values to or from the <codeph>STRING</codeph> type. |
| </p> |
| |
| <p> |
| <b>Function reference:</b> |
| </p> |
| |
| <p> |
| Impala supports the following type conversion functions: |
| </p> |
| |
| <ul> |
| <li> |
| <xref href="#conversion_functions/cast">CAST</xref> |
| </li> |
| |
| <li> |
| <xref href="#conversion_functions/typeof">TYPEOF</xref> |
| </li> |
| </ul> |
| |
| <dl> |
| <dlentry id="cast"> |
| |
| <dt> |
| CAST(expression AS type) |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Returns <varname>expression</varname> converted to the |
| <varname>type</varname> data type. |
| <p> |
| If the <varname>expression</varname> value is of a type that cannot be converted to |
| the target <varname>type</varname>: |
| <ul> |
| <li> |
| Of <codeph>DECIMAL</codeph>, <codeph>DATE</codeph>, and |
| <codeph>BOOLEAN</codeph>, the function returns an error. |
| </li> |
| |
| <li> |
| Of all other types, the function returns <codeph>NULL</codeph>. |
| </li> |
| </ul> |
| </p> |
| |
| <p> |
| <b>Usage notes:</b> |
| </p> |
| |
| <p> |
| Use <codeph>CAST</codeph> when passing a column value or literal to a function that |
| expects a parameter with a different type. Frequently used in SQL operations such as |
| <codeph>CREATE TABLE AS SELECT</codeph> and <codeph>INSERT ... VALUES</codeph> to |
| ensure that values from various sources are of the appropriate type for the |
| destination columns. |
| </p> |
| |
| <p> |
| Where practical, do a one-time <codeph>CAST()</codeph> operation during the |
| ingestion process to make each column into the appropriate type, rather than using |
| many <codeph>CAST()</codeph> operations in each query; doing type conversions for |
| each row during each query can be expensive for tables with millions or billions of |
| rows. |
| </p> |
| |
| <p |
| conref="../shared/impala_common.xml#common/timezone_conversion_caveat"/> |
| |
| <p conref="../shared/impala_common.xml#common/related_info"/> |
| |
| <p> |
| For details of casts from each kind of data type, see the description of the |
| appropriate type: <xref |
| href="impala_tinyint.xml#tinyint"/>, |
| <xref |
| href="impala_smallint.xml#smallint"/>, |
| <xref |
| href="impala_int.xml#int"/>, |
| <xref href="impala_bigint.xml#bigint" |
| />, |
| <xref href="impala_float.xml#float"/>, |
| <xref |
| href="impala_double.xml#double"/>, |
| <xref |
| href="impala_decimal.xml#decimal">DECIMAL Data Type</xref>, |
| <xref |
| href="impala_string.xml#string"/>, |
| <xref |
| href="impala_char.xml#char">CHAR Data Type</xref>, |
| <xref |
| href="impala_varchar.xml#varchar">VARCHAR Data Type</xref>, |
| <xref |
| href="impala_timestamp.xml#timestamp"/>, |
| <xref |
| href="impala_boolean.xml#boolean"/> |
| </p> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry id="cast_format"> |
| |
| <dt> |
| CAST(expression AS type FORMAT pattern) |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Returns <varname>expression</varname> converted to the |
| <varname>type</varname> data type based on the <varname>pattern</varname> format |
| string. This signature of <codeph>CAST()</codeph> with the <codeph>FORMAT</codeph> |
| clause is supported only for casts between <codeph>STRING</codeph> / |
| <codeph>CHAR</codeph> / <codeph>VARCHAR</codeph> types and <codeph>TIMESTAMP</codeph> |
| / <codeph>DATE</codeph> types. |
| <note> |
| The patterns allowed in the <codeph>FORMAT</codeph> clause support ISO SQL:2016 |
| standard patterns. Those patterns are not the same format patterns used with the |
| other Impala conversion functions, e.g. <codeph>TO_TIMESTAMP()</codeph> and |
| <codeph>FROM_TIMESTAMP()</codeph>. |
| </note> |
| |
| <p> |
| The following rules apply to <varname>pattern</varname>. Any exceptions to these |
| rules are noted in the Details column of the table below. |
| <ul> |
| <li> |
| <varname>pattern</varname> is a case-insensitive <codeph>STRING</codeph>. |
| </li> |
| |
| <li> |
| If <varname>pattern</varname> is <codeph>NULL</codeph>, an empty string, or a |
| number, an error returns. |
| </li> |
| |
| <li> |
| A fewer digits in <varname>expression</varname> than specified in the |
| <varname>pattern</varname> is accepted if a separator is correctly specified in |
| the <varname>pattern</varname>. For example, <codeph>CAST('5-01-2017' AS DATE |
| FORMAT 'MM-dd-yyyy')</codeph> returns <codeph>DATE'2017-05-01'</codeph>. |
| </li> |
| |
| <li> |
| If fewer number of digits are in <varname>expression</varname> than specified in |
| the <varname>pattern</varname>, the current date is used to complete the year |
| pattern. For example, <codeph>CAST('19/05' AS DATE FORMAT 'YYYY/MM')</codeph> |
| will return <codeph>DATE'2019-05-01'</codeph> when executed on August 8, 2019. |
| </li> |
| </ul> |
| </p> |
| |
| <p> |
| The following format patterns are supported in the <codeph>FORMAT</codeph> clause. |
| <table frame="all" rowsep="1" |
| colsep="1" id="table_gbt_5ym_r3b"> |
| <tgroup cols="3" align="left"> |
| <colspec colname="c1" colnum="1" colwidth="1*"/> |
| <colspec colname="c2" colnum="2" colwidth="1.75*"/> |
| <colspec colname="newCol3" colnum="3" colwidth="6.6*"/> |
| <thead> |
| <row> |
| <entry> |
| Pattern |
| </entry> |
| <entry> |
| Description |
| </entry> |
| <entry> |
| Details |
| </entry> |
| </row> |
| </thead> |
| <tbody> |
| <row> |
| <entry> |
| <codeph>YYYY</codeph> |
| </entry> |
| <entry> |
| 4-digit year. |
| </entry> |
| <entry/> |
| </row> |
| <row> |
| <entry> |
| <codeph>YYY</codeph> |
| </entry> |
| <entry> |
| Last 3 digits of a year. |
| </entry> |
| <entry/> |
| </row> |
| <row> |
| <entry> |
| <codeph>YY</codeph> |
| </entry> |
| <entry> |
| Last 2 digits of a year. |
| </entry> |
| <entry/> |
| </row> |
| <row> |
| <entry> |
| <codeph>Y</codeph> |
| </entry> |
| <entry> |
| Last digit of a year |
| </entry> |
| <entry/> |
| </row> |
| <row> |
| <entry> |
| <codeph>RRRR</codeph> |
| </entry> |
| <entry> |
| 4-digit round year |
| </entry> |
| <entry> |
| <p> |
| If 1,3 or 4-digit year values are provided in |
| <varname>expression</varname>, treated as <codeph>YYYY</codeph>. |
| </p> |
| |
| |
| |
| <p> |
| If 2-digit years are provided in <varname>expression</varname>, treated |
| as <codeph>RR</codeph>. |
| </p> |
| |
| |
| |
| <p> |
| For datetime to string conversions, treated as <codeph>YYYY</codeph>. |
| </p> |
| |
| |
| |
| <p> |
| If <codeph>YYYY</codeph>, <codeph>YYY</codeph>, <codeph>YY</codeph>, |
| <codeph>Y</codeph>, or <codeph>RR</codeph> is given in the same |
| <varname>pattern</varname> for a string to datetime conversion, an error |
| returns. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>RR</codeph> |
| </entry> |
| <entry> |
| 2-digit round year. |
| </entry> |
| <entry> |
| <p> |
| <ul> |
| <li> |
| For datetime to string conversion, same as <codeph>YY</codeph>. |
| </li> |
| |
| <li> |
| For string to datetime conversions, the first 2 digits of the year |
| in the return value depends on the specified two-digit year and the |
| last two digits of the current year as follows: |
| <ul> |
| <li> |
| <p> |
| If the specified 2-digit year is 00 to 49: |
| </p> |
| <ul> |
| <li> |
| <p> |
| If the last 2 digits of the current year are 00 to 49, the |
| returned year has the same first 2 digits as the current |
| year. |
| </p> |
| </li> |
| |
| <li> |
| <p> |
| If the last 2 digits of the current year are 50 to 99, the |
| first 2 digits of the returned year are 1 greater than the |
| first 2 digits of the current year. |
| </p> |
| </li> |
| </ul> |
| </li> |
| |
| <li> |
| <p> |
| If the specified 2-digit year is 50 to 99: |
| </p> |
| <ul> |
| <li> |
| <p> |
| If the last 2 digits of the current year are 00 to 49, the |
| first 2 digits of the returned year are 1 less than the |
| first 2 digits of the current year. |
| </p> |
| </li> |
| |
| <li> |
| <p> |
| If the last 2 digits of the current year are 50 to 99, the |
| returned year has the same first 2 digits as the current |
| year. |
| </p> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </li> |
| </ul> |
| </p> |
| |
| |
| |
| <p> |
| If <codeph>YYYY</codeph>, <codeph>YYY</codeph>, <codeph>YY</codeph>, |
| <codeph>Y</codeph>, or <codeph>RR</codeph> is given in the same |
| <varname>pattern</varname> for a string to datetime conversion, an error |
| returns. |
| </p> |
| |
| |
| |
| <p> |
| If 1-digit year values are provided in <varname>expression</varname>, it |
| is treated as <codeph>YYYY</codeph>. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>MM</codeph> |
| </entry> |
| <entry> |
| Month |
| </entry> |
| <entry> |
| <p> |
| In datetime to string conversions, 1-digit month is prefixed with a |
| zero. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>DD</codeph> |
| </entry> |
| <entry> |
| Day of month (1-31) |
| </entry> |
| <entry> |
| <p> |
| In datetime to string conversions, one digit day is prefixed with a |
| zero. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>DDD</codeph> |
| </entry> |
| <entry> |
| Day of year (1-366) |
| </entry> |
| <entry> |
| <p> |
| In string to datetime conversions, providing <codeph>MM</codeph> and |
| <codeph>DD</codeph> along with <codeph>DDD</codeph> results an error, |
| e.g. <codeph>YYYY-MM-DDD</codeph>. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>HH</codeph> |
| |
| <p> |
| <codeph>HH12</codeph> |
| </p> |
| </entry> |
| <entry> |
| Hour of day (1-12) |
| </entry> |
| <entry> |
| <p> |
| In datetime to string conversions, 1-digit hours are prefixed with a |
| zero. |
| </p> |
| |
| |
| |
| <p> |
| If provided hour in <varname>expression</varname> is not between 1 and |
| 12, returns an error. |
| </p> |
| |
| |
| |
| <p> |
| If no AM/PM is provided in <varname>expression</varname>, the default is |
| <codeph>AM</codeph>. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>HH24</codeph> |
| </entry> |
| <entry> |
| Hour of day (0-23) |
| </entry> |
| <entry> |
| <p> |
| In string to datetime conversions, if <codeph>HH12</codeph>, |
| <codeph>AM</codeph>, <codeph>PM</codeph> are given in the same |
| <varname>pattern</varname>, an error returns. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>MI</codeph> |
| </entry> |
| <entry> |
| Minute of hour (0-59) |
| </entry> |
| <entry> |
| <p> |
| In datetime to string conversions, 1-digit minutes are prefixed with a |
| zero. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>SS</codeph> |
| </entry> |
| <entry> |
| Second of minute (0-59) |
| </entry> |
| <entry> |
| <p> |
| In datetime to string conversions, 1-digit seconds are prefixed with a |
| zero. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>SSSSS</codeph> |
| </entry> |
| <entry> |
| Second of Day (0-86399) |
| </entry> |
| <entry> |
| In string to timestamp conversions, if <codeph>SS</codeph>, |
| <codeph>HH</codeph>, <codeph>HH12</codeph>, <codeph>HH24</codeph>, |
| <codeph>MI</codeph>, <codeph>AM</codeph>/<codeph>PM</codeph> are given in |
| the same <varname>pattern</varname>, an error returns. |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>FF</codeph> |
| |
| <p> |
| <codeph>FF1</codeph>, ..., <codeph>FF9</codeph> |
| </p> |
| </entry> |
| <entry> |
| Fractional second |
| </entry> |
| <entry> |
| <p> |
| A number, 1 to 9, can be used to indicate the number of digits. |
| </p> |
| |
| |
| |
| <p> |
| <codeph>FF</codeph> specifies a 9 digits fractional second. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>AM</codeph> |
| |
| <p> |
| <codeph>PM</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>A.M.</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>P.M.</codeph> |
| </p> |
| </entry> |
| <entry> |
| Meridiem indicator |
| </entry> |
| <entry> |
| <p> |
| For datetime to string conversions, <codeph>AM</codeph> and |
| <codeph>PM</codeph> are treated as synonyms. For example, casting |
| <codeph>'2019-01-01 11:00 am'</codeph> to <codeph>TIMESTAMP</codeph> |
| with the <codeph>'YYYY-MM-DD HH12:MI PM'</codeph> pattern returns |
| <codeph>01-JAN-19 11.00.00.000000 AM</codeph>. |
| </p> |
| |
| |
| |
| <p> |
| For string to datetime conversion, <codeph>HH24</codeph> in the same |
| <varname>pattern</varname> returns an error. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>TZH</codeph> |
| </entry> |
| <entry> |
| Timezone offset hour |
| </entry> |
| <entry> |
| <p> |
| An optional sign, + or -, and 2 digits for the value of signed numbers |
| are allowed for the source <varname>expression</varname>, e.g. |
| <codeph>“+10”</codeph>, <codeph>“-05”</codeph>, |
| <codeph>"04"</codeph>. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>TZM</codeph> |
| </entry> |
| <entry> |
| Timezone offset minute |
| </entry> |
| <entry> |
| <p> |
| Unsigned numbers are allowed for the source |
| <varname>expression</varname>. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>-</codeph> |
| |
| <p> |
| <codeph>.</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>/</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>,</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>'</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>;</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>:</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <space> |
| </p> |
| </entry> |
| <entry> |
| Separator |
| </entry> |
| <entry> |
| <p> |
| For string to datetime conversions, any separator character in the |
| <varname>pattern</varname> string would match any separator character in |
| the input <varname>expression</varname>. |
| </p> |
| |
| |
| |
| <p> |
| For example, <codeph>CAST(“20191010” AS DATE FORMAT |
| “YYYY-MM-DD”)</codeph> returns an error, but <codeph>CAST("2019-.;10 |
| 10" AS DATE FORMAT "YYYY-MM-DD")</codeph> succeeds. |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>T</codeph> |
| </entry> |
| <entry> |
| Separates the date from the time. |
| </entry> |
| <entry> |
| This pattern is used for accepting ISO 8601 datetime formats. |
| |
| <p> |
| Example: <codeph>YYYY-MM-DD<b>T</b>HH24:MI:SS.FF9Z</codeph> |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>Z</codeph> |
| </entry> |
| <entry> |
| Indicates the zero hour offset from UTC. |
| </entry> |
| <entry> |
| This pattern is used for accepting ISO 8601 datetime formats. |
| </entry> |
| </row> |
| </tbody> |
| </tgroup> |
| </table> |
| </p> |
| |
| <p> |
| <b>Examples:</b> |
| </p> |
| <table frame="all" rowsep="1" colsep="1" id="table_qqg_txn_r3b"> |
| <tgroup cols="2" align="left"> |
| <colspec colname="c1" colnum="1" colwidth="1*"/> |
| <colspec colname="c2" colnum="2" colwidth="1*"/> |
| <thead> |
| <row> |
| <entry> |
| Input |
| </entry> |
| <entry> |
| Output |
| </entry> |
| </row> |
| </thead> |
| <tbody> |
| <row> |
| <entry> |
| <codeph>CAST("02-11-2014" AS DATE FORMAT 'dd-mm-yyyy')</codeph> |
| </entry> |
| <entry> |
| 2014-11-02 |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>CAST("365 2014" AS DATE FORMAT 'DDD-YYYY')</codeph> |
| </entry> |
| <entry> |
| 2014-12-31 |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>CAST("5-01-26" AS DATE FORMAT 'YY-MM-DD')</codeph> |
| </entry> |
| <entry> |
| Executed at 2019-01-01 11:11:11: |
| |
| <p> |
| 2015-01-26 |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>CAST('2018-11-10T15:11:04Z' AS TIMESTAMP</codeph> |
| |
| <p> |
| <codeph>FORMAT 'YYYY-MM-DDTHH24:MI:SSZ')</codeph> |
| </p> |
| </entry> |
| <entry> |
| 2018-11-10 15:11:04 |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>CAST("95-01-28" AS DATE FORMAT 'YYY-MM-DD')</codeph> |
| </entry> |
| <entry> |
| Executed at 2019-01-01 11:11:11: |
| |
| <p> |
| 2095-01-28 |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>CAST("49-01-15" AS DATE FORMAT 'RR-MM-DD')</codeph> |
| </entry> |
| <entry> |
| Round year when last 2 digits of current year is greater than 49. |
| |
| <p> |
| Executed at 2099-01-01 11:11:11: |
| </p> |
| |
| |
| |
| <p> |
| 2149-01-15 |
| </p> |
| </entry> |
| </row> |
| <row> |
| <entry> |
| <codeph>CAST('2019.10.10 13:30:40.123456 +01:30' </codeph> |
| |
| <p> |
| <codeph>AS TIMESTAMP</codeph> |
| </p> |
| |
| |
| |
| <p> |
| <codeph>FORMAT 'YYYY-MM-DD HH24:MI:SS.FF9 TZH:TZM')</codeph> |
| </p> |
| </entry> |
| <entry> |
| 2019-10-10 13:30:40.123456000 |
| </entry> |
| </row> |
| </tbody> |
| </tgroup> |
| </table> |
| </dd> |
| |
| </dlentry> |
| |
| <dlentry rev="2.3.0" id="typeof"> |
| |
| <dt> |
| TYPEOF(expression) |
| </dt> |
| |
| <dd> |
| <b>Purpose:</b> Returns the name of the data type corresponding to |
| <varname>expression</varname>. For types with extra attributes, such as length for |
| <codeph>CHAR</codeph> and <codeph>VARCHAR</codeph>, or precision and scale for |
| <codeph>DECIMAL</codeph>, includes the full specification of the type. |
| <p> |
| <b>Return type:</b> <codeph>STRING</codeph> |
| </p> |
| |
| <p> |
| <b>Usage notes:</b> Typically used in interactive exploration of a schema, or in |
| application code that programmatically generates schema definitions such as |
| <codeph>CREATE TABLE</codeph> statements, for example, to get the type of an |
| expression such as <codeph>col1 / col2</codeph> or <codeph>CONCAT(col1, col2, |
| col3)</codeph>. This function is especially useful for arithmetic expressions |
| involving <codeph>DECIMAL</codeph> types because the precision and scale of the |
| result is can be different than that of the operands. |
| </p> |
| |
| <p |
| conref="../shared/impala_common.xml#common/example_blurb" |
| /> |
| |
| <p> |
| <codeph>TYPEOF(2)</codeph> returns <codeph>TINYINT</codeph>. |
| </p> |
| |
| <p> |
| <codeph>TYPEOF(NOW())</codeph> returns <codeph>TIMESTAMP</codeph>. |
| </p> |
| </dd> |
| |
| </dlentry> |
| </dl> |
| |
| </conbody> |
| |
| </concept> |