absReturns the absolute value of a number.
abs(numeric_expression)
acosReturns the arc cosine or inverse cosine of a number.
acos(numeric_expression)
acoshReturns the area hyperbolic cosine or inverse hyperbolic cosine of a number.
acosh(numeric_expression)
asinReturns the arc sine or inverse sine of a number.
asin(numeric_expression)
asinhReturns the area hyperbolic sine or inverse hyperbolic sine of a number.
asinh(numeric_expression)
atanReturns the arc tangent or inverse tangent of a number.
atan(numeric_expression)
atan2Returns the arc tangent or inverse tangent of expression_y / expression_x.
atan2(expression_y, expression_x)
atanhReturns the area hyperbolic tangent or inverse hyperbolic tangent of a number.
atanh(numeric_expression)
cbrtReturns the cube root of a number.
cbrt(numeric_expression)
ceilReturns the nearest integer greater than or equal to a number.
ceil(numeric_expression)
cosReturns the cosine of a number.
cos(numeric_expression)
coshReturns the hyperbolic cosine of a number.
cosh(numeric_expression)
cotReturns the cotangent of a number.
cot(numeric_expression)
degreesConverts radians to degrees.
degrees(numeric_expression)
expReturns the base-e exponential of a number.
exp(numeric_expression)
factorialFactorial. Returns 1 if value is less than 2.
factorial(numeric_expression)
floorReturns the nearest integer less than or equal to a number.
floor(numeric_expression)
gcdReturns the greatest common divisor of expression_x and expression_y. Returns 0 if both inputs are zero.
gcd(expression_x, expression_y)
isnanReturns true if a given number is +NaN or -NaN otherwise returns false.
isnan(numeric_expression)
iszeroReturns true if a given number is +0.0 or -0.0 otherwise returns false.
iszero(numeric_expression)
lcmReturns the least common multiple of expression_x and expression_y. Returns 0 if either input is zero.
lcm(expression_x, expression_y)
lnReturns the natural logarithm of a number.
ln(numeric_expression)
logReturns the base-x logarithm of a number. Can either provide a specified base, or if omitted then takes the base-10 of a number.
log(base, numeric_expression) log(numeric_expression)
log10Returns the base-10 logarithm of a number.
log10(numeric_expression)
log2Returns the base-2 logarithm of a number.
log2(numeric_expression)
nanvlReturns the first argument if it's not NaN. Returns the second argument otherwise.
nanvl(expression_x, expression_y)
piReturns an approximate value of π.
pi()
powAlias of power.
powerReturns a base expression raised to the power of an exponent.
power(base, exponent)
radiansConverts degrees to radians.
radians(numeric_expression)
randomReturns a random float value in the range [0, 1). The random seed is unique to each row.
random()
roundRounds a number to the nearest integer.
round(numeric_expression[, decimal_places])
signumReturns the sign of a number. Negative numbers return -1. Zero and positive numbers return 1.
signum(numeric_expression)
sinReturns the sine of a number.
sin(numeric_expression)
sinhReturns the hyperbolic sine of a number.
sinh(numeric_expression)
sqrtReturns the square root of a number.
sqrt(numeric_expression)
tanReturns the tangent of a number.
tan(numeric_expression)
tanhReturns the hyperbolic tangent of a number.
tanh(numeric_expression)
truncTruncates a number to a whole number or truncated to the specified decimal places.
trunc(numeric_expression[, decimal_places])
decimal_places is a positive integer, truncates digits to the right of the decimal point. If decimal_places is a negative integer, replaces digits to the left of the decimal point with 0.coalesceReturns the first of its arguments that is not null. Returns null if all arguments are null. This function is often used to substitute a default value for null values.
coalesce(expression1[, ..., expression_n])
> select coalesce(null, null, 'datafusion'); +----------------------------------------+ | coalesce(NULL,NULL,Utf8("datafusion")) | +----------------------------------------+ | datafusion | +----------------------------------------+
greatestReturns the greatest value in a list of expressions. Returns null if all expressions are null.
greatest(expression1[, ..., expression_n])
> select greatest(4, 7, 5); +---------------------------+ | greatest(4,7,5) | +---------------------------+ | 7 | +---------------------------+
ifnullAlias of nvl.
nullifReturns null if expression1 equals expression2; otherwise it returns expression1. This can be used to perform the inverse operation of coalesce.
nullif(expression1, expression2)
> select nullif('datafusion', 'data'); +-----------------------------------------+ | nullif(Utf8("datafusion"),Utf8("data")) | +-----------------------------------------+ | datafusion | +-----------------------------------------+ > select nullif('datafusion', 'datafusion'); +-----------------------------------------------+ | nullif(Utf8("datafusion"),Utf8("datafusion")) | +-----------------------------------------------+ | | +-----------------------------------------------+
nvlReturns expression2 if expression1 is NULL otherwise it returns expression1.
nvl(expression1, expression2)
> select nvl(null, 'a'); +---------------------+ | nvl(NULL,Utf8("a")) | +---------------------+ | a | +---------------------+\ > select nvl('b', 'a'); +--------------------------+ | nvl(Utf8("b"),Utf8("a")) | +--------------------------+ | b | +--------------------------+
nvl2Returns expression2 if expression1 is not NULL; otherwise it returns expression3.
nvl2(expression1, expression2, expression3)
> select nvl2(null, 'a', 'b'); +--------------------------------+ | nvl2(NULL,Utf8("a"),Utf8("b")) | +--------------------------------+ | b | +--------------------------------+ > select nvl2('data', 'a', 'b'); +----------------------------------------+ | nvl2(Utf8("data"),Utf8("a"),Utf8("b")) | +----------------------------------------+ | a | +----------------------------------------+
asciiReturns the Unicode character code of the first character in a string.
ascii(str)
> select ascii('abc'); +--------------------+ | ascii(Utf8("abc")) | +--------------------+ | 97 | +--------------------+ > select ascii('🚀'); +-------------------+ | ascii(Utf8("🚀")) | +-------------------+ | 128640 | +-------------------+
Related functions:
bit_lengthReturns the bit length of a string.
bit_length(str)
> select bit_length('datafusion'); +--------------------------------+ | bit_length(Utf8("datafusion")) | +--------------------------------+ | 80 | +--------------------------------+
Related functions:
btrimTrims the specified trim string from the start and end of a string. If no trim string is provided, all whitespace is removed from the start and end of the input string.
btrim(str[, trim_str])
> select btrim('__datafusion____', '_'); +-------------------------------------------+ | btrim(Utf8("__datafusion____"),Utf8("_")) | +-------------------------------------------+ | datafusion | +-------------------------------------------+
trim(BOTH trim_str FROM str)
trim(trim_str FROM str)
Related functions:
char_lengthAlias of character_length.
character_lengthReturns the number of characters in a string.
character_length(str)
> select character_length('Ångström'); +------------------------------------+ | character_length(Utf8("Ångström")) | +------------------------------------+ | 8 | +------------------------------------+
Related functions:
chrReturns the character with the specified ASCII or Unicode code value.
chr(expression)
> select chr(128640); +--------------------+ | chr(Int64(128640)) | +--------------------+ | 🚀 | +--------------------+
Related functions:
concatConcatenates multiple strings together.
concat(str[, ..., str_n])
> select concat('data', 'f', 'us', 'ion'); +-------------------------------------------------------+ | concat(Utf8("data"),Utf8("f"),Utf8("us"),Utf8("ion")) | +-------------------------------------------------------+ | datafusion | +-------------------------------------------------------+
Related functions:
concat_wsConcatenates multiple strings together with a specified separator.
concat_ws(separator, str[, ..., str_n])
> select concat_ws('_', 'data', 'fusion'); +--------------------------------------------------+ | concat_ws(Utf8("_"),Utf8("data"),Utf8("fusion")) | +--------------------------------------------------+ | data_fusion | +--------------------------------------------------+
Related functions:
containsReturn true if search_str is found within string (case-sensitive).
contains(str, search_str)
> select contains('the quick brown fox', 'row'); +---------------------------------------------------+ | contains(Utf8("the quick brown fox"),Utf8("row")) | +---------------------------------------------------+ | true | +---------------------------------------------------+
ends_withTests if a string ends with a substring.
ends_with(str, substr)
> select ends_with('datafusion', 'soin'); +--------------------------------------------+ | ends_with(Utf8("datafusion"),Utf8("soin")) | +--------------------------------------------+ | false | +--------------------------------------------+ > select ends_with('datafusion', 'sion'); +--------------------------------------------+ | ends_with(Utf8("datafusion"),Utf8("sion")) | +--------------------------------------------+ | true | +--------------------------------------------+
find_in_setReturns a value in the range of 1 to N if the string str is in the string list strlist consisting of N substrings.
find_in_set(str, strlist)
> select find_in_set('b', 'a,b,c,d'); +----------------------------------------+ | find_in_set(Utf8("b"),Utf8("a,b,c,d")) | +----------------------------------------+ | 2 | +----------------------------------------+
initcapCapitalizes the first character in each word in the input string. Words are delimited by non-alphanumeric characters.
initcap(str)
> select initcap('apache datafusion'); +------------------------------------+ | initcap(Utf8("apache datafusion")) | +------------------------------------+ | Apache Datafusion | +------------------------------------+
Related functions:
instrAlias of strpos.
leftReturns a specified number of characters from the left side of a string.
left(str, n)
> select left('datafusion', 4); +-----------------------------------+ | left(Utf8("datafusion"),Int64(4)) | +-----------------------------------+ | data | +-----------------------------------+
Related functions:
lengthAlias of character_length.
levenshteinReturns the Levenshtein distance between the two given strings.
levenshtein(str1, str2)
> select levenshtein('kitten', 'sitting'); +---------------------------------------------+ | levenshtein(Utf8("kitten"),Utf8("sitting")) | +---------------------------------------------+ | 3 | +---------------------------------------------+
lowerConverts a string to lower-case.
lower(str)
> select lower('Ångström'); +-------------------------+ | lower(Utf8("Ångström")) | +-------------------------+ | ångström | +-------------------------+
Related functions:
lpadPads the left side of a string with another string to a specified string length.
lpad(str, n[, padding_str])
> select lpad('Dolly', 10, 'hello'); +---------------------------------------------+ | lpad(Utf8("Dolly"),Int64(10),Utf8("hello")) | +---------------------------------------------+ | helloDolly | +---------------------------------------------+
Related functions:
ltrimTrims the specified trim string from the beginning of a string. If no trim string is provided, all whitespace is removed from the start of the input string.
ltrim(str[, trim_str])
> select ltrim(' datafusion '); +-------------------------------+ | ltrim(Utf8(" datafusion ")) | +-------------------------------+ | datafusion | +-------------------------------+ > select ltrim('___datafusion___', '_'); +-------------------------------------------+ | ltrim(Utf8("___datafusion___"),Utf8("_")) | +-------------------------------------------+ | datafusion___ | +-------------------------------------------+
trim(LEADING trim_str FROM str)
Related functions:
octet_lengthReturns the length of a string in bytes.
octet_length(str)
> select octet_length('Ångström'); +--------------------------------+ | octet_length(Utf8("Ångström")) | +--------------------------------+ | 10 | +--------------------------------+
Related functions:
positionAlias of strpos.
repeatReturns a string with an input string repeated a specified number.
repeat(str, n)
> select repeat('data', 3); +-------------------------------+ | repeat(Utf8("data"),Int64(3)) | +-------------------------------+ | datadatadata | +-------------------------------+
replaceReplaces all occurrences of a specified substring in a string with a new substring.
replace(str, substr, replacement)
> select replace('ABabbaBA', 'ab', 'cd'); +-------------------------------------------------+ | replace(Utf8("ABabbaBA"),Utf8("ab"),Utf8("cd")) | +-------------------------------------------------+ | ABcdbaBA | +-------------------------------------------------+
reverseReverses the character order of a string.
reverse(str)
> select reverse('datafusion'); +-----------------------------+ | reverse(Utf8("datafusion")) | +-----------------------------+ | noisufatad | +-----------------------------+
rightReturns a specified number of characters from the right side of a string.
right(str, n)
> select right('datafusion', 6); +------------------------------------+ | right(Utf8("datafusion"),Int64(6)) | +------------------------------------+ | fusion | +------------------------------------+
Related functions:
rpadPads the right side of a string with another string to a specified string length.
rpad(str, n[, padding_str])
> select rpad('datafusion', 20, '_-'); +-----------------------------------------------+ | rpad(Utf8("datafusion"),Int64(20),Utf8("_-")) | +-----------------------------------------------+ | datafusion_-_-_-_-_- | +-----------------------------------------------+
Related functions:
rtrimTrims the specified trim string from the end of a string. If no trim string is provided, all whitespace is removed from the end of the input string.
rtrim(str[, trim_str])
> select rtrim(' datafusion '); +-------------------------------+ | rtrim(Utf8(" datafusion ")) | +-------------------------------+ | datafusion | +-------------------------------+ > select rtrim('___datafusion___', '_'); +-------------------------------------------+ | rtrim(Utf8("___datafusion___"),Utf8("_")) | +-------------------------------------------+ | ___datafusion | +-------------------------------------------+
trim(TRAILING trim_str FROM str)
Related functions:
split_partSplits a string based on a specified delimiter and returns the substring in the specified position.
split_part(str, delimiter, pos)
> select split_part('1.2.3.4.5', '.', 3); +--------------------------------------------------+ | split_part(Utf8("1.2.3.4.5"),Utf8("."),Int64(3)) | +--------------------------------------------------+ | 3 | +--------------------------------------------------+
starts_withTests if a string starts with a substring.
starts_with(str, substr)
> select starts_with('datafusion','data'); +----------------------------------------------+ | starts_with(Utf8("datafusion"),Utf8("data")) | +----------------------------------------------+ | true | +----------------------------------------------+
strposReturns the starting position of a specified substring in a string. Positions begin at 1. If the substring does not exist in the string, the function returns 0.
strpos(str, substr)
> select strpos('datafusion', 'fus'); +----------------------------------------+ | strpos(Utf8("datafusion"),Utf8("fus")) | +----------------------------------------+ | 5 | +----------------------------------------+
position(substr in origstr)
substrExtracts a substring of a specified number of characters from a specific starting position in a string.
substr(str, start_pos[, length])
> select substr('datafusion', 5, 3); +----------------------------------------------+ | substr(Utf8("datafusion"),Int64(5),Int64(3)) | +----------------------------------------------+ | fus | +----------------------------------------------+
substring(str from start_pos for length)
substr_indexReturns the substring from str before count occurrences of the delimiter delim. If count is positive, everything to the left of the final delimiter (counting from the left) is returned. If count is negative, everything to the right of the final delimiter (counting from the right) is returned.
substr_index(str, delim, count)
> select substr_index('www.apache.org', '.', 1); +---------------------------------------------------------+ | substr_index(Utf8("www.apache.org"),Utf8("."),Int64(1)) | +---------------------------------------------------------+ | www | +---------------------------------------------------------+ > select substr_index('www.apache.org', '.', -1); +----------------------------------------------------------+ | substr_index(Utf8("www.apache.org"),Utf8("."),Int64(-1)) | +----------------------------------------------------------+ | org | +----------------------------------------------------------+
substringAlias of substr.
substring_indexAlias of substr_index.
to_hexConverts an integer to a hexadecimal string.
to_hex(int)
> select to_hex(12345689); +-------------------------+ | to_hex(Int64(12345689)) | +-------------------------+ | bc6159 | +-------------------------+
translateTranslates characters in a string to specified translation characters.
translate(str, chars, translation)
> select translate('twice', 'wic', 'her'); +--------------------------------------------------+ | translate(Utf8("twice"),Utf8("wic"),Utf8("her")) | +--------------------------------------------------+ | there | +--------------------------------------------------+
trimAlias of btrim.
upperConverts a string to upper-case.
upper(str)
> select upper('dataFusion'); +---------------------------+ | upper(Utf8("dataFusion")) | +---------------------------+ | DATAFUSION | +---------------------------+
Related functions:
uuidReturns UUID v4 string value which is unique per row.
uuid()
> select uuid(); +--------------------------------------+ | uuid() | +--------------------------------------+ | 6ec17ef8-1934-41cc-8d59-d0c8f9eea1f0 | +--------------------------------------+
decodeDecode binary data from textual representation in string.
decode(expression, format)
Related functions:
encodeEncode binary data into a textual representation.
encode(expression, format)
base64, hexRelated functions:
Apache DataFusion uses a PCRE-like regular expression syntax (minus support for several features including look-around and backreferences). The following regular expression functions are supported:
regexp_countReturns the number of matches that a regular expression has in a string.
regexp_count(str, regexp[, start, flags])
> select regexp_count('abcAbAbc', 'abc', 2, 'i'); +---------------------------------------------------------------+ | regexp_count(Utf8("abcAbAbc"),Utf8("abc"),Int64(2),Utf8("i")) | +---------------------------------------------------------------+ | 1 | +---------------------------------------------------------------+
regexp_likeReturns true if a regular expression has at least one match in a string, false otherwise.
regexp_like(str, regexp[, flags])
select regexp_like('Köln', '[a-zA-Z]ö[a-zA-Z]{2}'); +--------------------------------------------------------+ | regexp_like(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) | +--------------------------------------------------------+ | true | +--------------------------------------------------------+ SELECT regexp_like('aBc', '(b|d)', 'i'); +--------------------------------------------------+ | regexp_like(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) | +--------------------------------------------------+ | true | +--------------------------------------------------+
Additional examples can be found here
regexp_matchReturns the first regular expression matches in a string.
regexp_match(str, regexp[, flags])
> select regexp_match('Köln', '[a-zA-Z]ö[a-zA-Z]{2}'); +---------------------------------------------------------+ | regexp_match(Utf8("Köln"),Utf8("[a-zA-Z]ö[a-zA-Z]{2}")) | +---------------------------------------------------------+ | [Köln] | +---------------------------------------------------------+ SELECT regexp_match('aBc', '(b|d)', 'i'); +---------------------------------------------------+ | regexp_match(Utf8("aBc"),Utf8("(b|d)"),Utf8("i")) | +---------------------------------------------------+ | [B] | +---------------------------------------------------+
Additional examples can be found here
regexp_replaceReplaces substrings in a string that match a regular expression.
regexp_replace(str, regexp, replacement[, flags])
> select regexp_replace('foobarbaz', 'b(..)', 'X\\1Y', 'g'); +------------------------------------------------------------------------+ | regexp_replace(Utf8("foobarbaz"),Utf8("b(..)"),Utf8("X\1Y"),Utf8("g")) | +------------------------------------------------------------------------+ | fooXarYXazY | +------------------------------------------------------------------------+ SELECT regexp_replace('aBc', '(b|d)', 'Ab\\1a', 'i'); +-------------------------------------------------------------------+ | regexp_replace(Utf8("aBc"),Utf8("(b|d)"),Utf8("Ab\1a"),Utf8("i")) | +-------------------------------------------------------------------+ | aAbBac | +-------------------------------------------------------------------+
Additional examples can be found here
current_dateReturns the current UTC date.
The current_date() return value is determined at query time and will return the same date, no matter when in the query plan the function executes.
current_date()
current_timeReturns the current UTC time.
The current_time() return value is determined at query time and will return the same time, no matter when in the query plan the function executes.
current_time()
current_timestampAlias of now.
date_binCalculates time intervals and returns the start of the interval nearest to the specified timestamp. Use date_bin to downsample time series data by grouping rows into time-based “bins” or “windows” and applying an aggregate or selector function to each window.
For example, if you “bin” or “window” data into 15 minute intervals, an input timestamp of 2023-01-01T18:18:18Z will be updated to the start time of the 15 minute bin it is in: 2023-01-01T18:15:00Z.
date_bin(interval, expression, origin-timestamp)
The following intervals are supported:
-- Bin the timestamp into 1 day intervals > SELECT date_bin(interval '1 day', time) as bin FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z') t(time); +---------------------+ | bin | +---------------------+ | 2023-01-01T00:00:00 | | 2023-01-03T00:00:00 | +---------------------+ 2 row(s) fetched. -- Bin the timestamp into 1 day intervals starting at 3AM on 2023-01-01 > SELECT date_bin(interval '1 day', time, '2023-01-01T03:00:00') as bin FROM VALUES ('2023-01-01T18:18:18Z'), ('2023-01-03T19:00:03Z') t(time); +---------------------+ | bin | +---------------------+ | 2023-01-01T03:00:00 | | 2023-01-03T03:00:00 | +---------------------+ 2 row(s) fetched.
date_formatAlias of to_char.
date_partReturns the specified part of the date as an integer.
date_part(part, expression)
part: Part of the date to return. The following date parts are supported:
expression: Time expression to operate on. Can be a constant, column, or function.
extract(field FROM source)
date_truncTruncates a timestamp value to a specified precision.
date_trunc(precision, expression)
precision: Time precision to truncate to. The following precisions are supported:
expression: Time expression to operate on. Can be a constant, column, or function.
datepartAlias of date_part.
datetruncAlias of date_trunc.
from_unixtimeConverts an integer to RFC3339 timestamp format (YYYY-MM-DDT00:00:00.000000000Z). Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z) return the corresponding timestamp.
from_unixtime(expression[, timezone])
> select from_unixtime(1599572549, 'America/New_York'); +-----------------------------------------------------------+ | from_unixtime(Int64(1599572549),Utf8("America/New_York")) | +-----------------------------------------------------------+ | 2020-09-08T09:42:29-04:00 | +-----------------------------------------------------------+
make_dateMake a date from year/month/day component parts.
make_date(year, month, day)
> select make_date(2023, 1, 31); +-------------------------------------------+ | make_date(Int64(2023),Int64(1),Int64(31)) | +-------------------------------------------+ | 2023-01-31 | +-------------------------------------------+ > select make_date('2023', '01', '31'); +-----------------------------------------------+ | make_date(Utf8("2023"),Utf8("01"),Utf8("31")) | +-----------------------------------------------+ | 2023-01-31 | +-----------------------------------------------+
Additional examples can be found here
nowReturns the current UTC timestamp.
The now() return value is determined at query time and will return the same timestamp, no matter when in the query plan the function executes.
now()
to_charReturns a string representation of a date, time, timestamp or duration based on a Chrono format. Unlike the PostgreSQL equivalent of this function numerical formatting is not supported.
to_char(expression, format)
> select to_char('2023-03-01'::date, '%d-%m-%Y'); +----------------------------------------------+ | to_char(Utf8("2023-03-01"),Utf8("%d-%m-%Y")) | +----------------------------------------------+ | 01-03-2023 | +----------------------------------------------+
Additional examples can be found here
to_dateConverts a value to a date (YYYY-MM-DD). Supports strings, integer and double types as input. Strings are parsed as YYYY-MM-DD (e.g. ‘2023-07-20’) if no Chrono formats are provided. Integers and doubles are interpreted as days since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding date.
Note: to_date returns Date32, which represents its values as the number of days since unix epoch(1970-01-01) stored as signed 32 bit value. The largest supported date value is 9999-12-31.
to_date('2017-05-31', '%Y-%m-%d')
> select to_date('2023-01-31'); +-----------------------------+ | to_date(Utf8("2023-01-31")) | +-----------------------------+ | 2023-01-31 | +-----------------------------+ > select to_date('2023/01/31', '%Y-%m-%d', '%Y/%m/%d'); +---------------------------------------------------------------+ | to_date(Utf8("2023/01/31"),Utf8("%Y-%m-%d"),Utf8("%Y/%m/%d")) | +---------------------------------------------------------------+ | 2023-01-31 | +---------------------------------------------------------------+
Additional examples can be found here
to_local_timeConverts a timestamp with a timezone to a timestamp without a timezone (with no offset or timezone information). This function handles daylight saving time changes.
to_local_time(expression)
> SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp); +---------------------------------------------+ | to_local_time(Utf8("2024-04-01T00:00:20Z")) | +---------------------------------------------+ | 2024-04-01T00:00:20 | +---------------------------------------------+ > SELECT to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels'); +---------------------------------------------+ | to_local_time(Utf8("2024-04-01T00:00:20Z")) | +---------------------------------------------+ | 2024-04-01T00:00:20 | +---------------------------------------------+ > SELECT time, arrow_typeof(time) as type, to_local_time(time) as to_local_time, arrow_typeof(to_local_time(time)) as to_local_time_type FROM ( SELECT '2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels' AS time ); +---------------------------+------------------------------------------------+---------------------+-----------------------------+ | time | type | to_local_time | to_local_time_type | +---------------------------+------------------------------------------------+---------------------+-----------------------------+ | 2024-04-01T00:00:20+02:00 | Timestamp(Nanosecond, Some("Europe/Brussels")) | 2024-04-01T00:00:20 | Timestamp(Nanosecond, None) | +---------------------------+------------------------------------------------+---------------------+-----------------------------+ # combine `to_local_time()` with `date_bin()` to bin on boundaries in the timezone rather # than UTC boundaries > SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AS date_bin; +---------------------+ | date_bin | +---------------------+ | 2024-04-01T00:00:00 | +---------------------+ > SELECT date_bin(interval '1 day', to_local_time('2024-04-01T00:00:20Z'::timestamp AT TIME ZONE 'Europe/Brussels')) AT TIME ZONE 'Europe/Brussels' AS date_bin_with_timezone; +---------------------------+ | date_bin_with_timezone | +---------------------------+ | 2024-04-01T00:00:00+02:00 | +---------------------------+
to_timestampConverts a value to a timestamp (YYYY-MM-DDT00:00:00Z). Supports strings, integer, unsigned integer, and double types as input. Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’) if no [Chrono formats] are provided. Integers, unsigned integers, and doubles are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
Note: to_timestamp returns Timestamp(Nanosecond). The supported range for integer input is between -9223372037 and 9223372036. Supported range for string input is between 1677-09-21T00:12:44.0 and 2262-04-11T23:47:16.0. Please use to_timestamp_seconds for the input outside of supported bounds.
to_timestamp(expression[, ..., format_n])
> select to_timestamp('2023-01-31T09:26:56.123456789-05:00'); +-----------------------------------------------------------+ | to_timestamp(Utf8("2023-01-31T09:26:56.123456789-05:00")) | +-----------------------------------------------------------+ | 2023-01-31T14:26:56.123456789 | +-----------------------------------------------------------+ > select to_timestamp('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y'); +--------------------------------------------------------------------------------------------------------+ | to_timestamp(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) | +--------------------------------------------------------------------------------------------------------+ | 2023-05-17T03:59:00.123456789 | +--------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_microsConverts a value to a timestamp (YYYY-MM-DDT00:00:00.000000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’) if no Chrono formats are provided. Integers and unsigned integers are interpreted as microseconds since the unix epoch (1970-01-01T00:00:00Z) Returns the corresponding timestamp.
to_timestamp_micros(expression[, ..., format_n])
> select to_timestamp_micros('2023-01-31T09:26:56.123456789-05:00'); +------------------------------------------------------------------+ | to_timestamp_micros(Utf8("2023-01-31T09:26:56.123456789-05:00")) | +------------------------------------------------------------------+ | 2023-01-31T14:26:56.123456 | +------------------------------------------------------------------+ > select to_timestamp_micros('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y'); +---------------------------------------------------------------------------------------------------------------+ | to_timestamp_micros(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) | +---------------------------------------------------------------------------------------------------------------+ | 2023-05-17T03:59:00.123456 | +---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_millisConverts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’) if no Chrono formats are provided. Integers and unsigned integers are interpreted as milliseconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
to_timestamp_millis(expression[, ..., format_n])
> select to_timestamp_millis('2023-01-31T09:26:56.123456789-05:00'); +------------------------------------------------------------------+ | to_timestamp_millis(Utf8("2023-01-31T09:26:56.123456789-05:00")) | +------------------------------------------------------------------+ | 2023-01-31T14:26:56.123 | +------------------------------------------------------------------+ > select to_timestamp_millis('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y'); +---------------------------------------------------------------------------------------------------------------+ | to_timestamp_millis(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) | +---------------------------------------------------------------------------------------------------------------+ | 2023-05-17T03:59:00.123 | +---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_nanosConverts a value to a timestamp (YYYY-MM-DDT00:00:00.000000000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’) if no Chrono formats are provided. Integers and unsigned integers are interpreted as nanoseconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
to_timestamp_nanos(expression[, ..., format_n])
> select to_timestamp_nanos('2023-01-31T09:26:56.123456789-05:00'); +-----------------------------------------------------------------+ | to_timestamp_nanos(Utf8("2023-01-31T09:26:56.123456789-05:00")) | +-----------------------------------------------------------------+ | 2023-01-31T14:26:56.123456789 | +-----------------------------------------------------------------+ > select to_timestamp_nanos('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y'); +--------------------------------------------------------------------------------------------------------------+ | to_timestamp_nanos(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) | +--------------------------------------------------------------------------------------------------------------+ | 2023-05-17T03:59:00.123456789 | +---------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_timestamp_secondsConverts a value to a timestamp (YYYY-MM-DDT00:00:00.000Z). Supports strings, integer, and unsigned integer types as input. Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’) if no Chrono formats are provided. Integers and unsigned integers are interpreted as seconds since the unix epoch (1970-01-01T00:00:00Z). Returns the corresponding timestamp.
to_timestamp_seconds(expression[, ..., format_n])
> select to_timestamp_seconds('2023-01-31T09:26:56.123456789-05:00'); +-------------------------------------------------------------------+ | to_timestamp_seconds(Utf8("2023-01-31T09:26:56.123456789-05:00")) | +-------------------------------------------------------------------+ | 2023-01-31T14:26:56 | +-------------------------------------------------------------------+ > select to_timestamp_seconds('03:59:00.123456789 05-17-2023', '%c', '%+', '%H:%M:%S%.f %m-%d-%Y'); +----------------------------------------------------------------------------------------------------------------+ | to_timestamp_seconds(Utf8("03:59:00.123456789 05-17-2023"),Utf8("%c"),Utf8("%+"),Utf8("%H:%M:%S%.f %m-%d-%Y")) | +----------------------------------------------------------------------------------------------------------------+ | 2023-05-17T03:59:00 | +----------------------------------------------------------------------------------------------------------------+
Additional examples can be found here
to_unixtimeConverts a value to seconds since the unix epoch (1970-01-01T00:00:00Z). Supports strings, dates, timestamps and double types as input. Strings are parsed as RFC3339 (e.g. ‘2023-07-20T05:44:00’) if no Chrono formats are provided.
to_unixtime(expression[, ..., format_n])
> select to_unixtime('2020-09-08T12:00:00+00:00'); +------------------------------------------------+ | to_unixtime(Utf8("2020-09-08T12:00:00+00:00")) | +------------------------------------------------+ | 1599566400 | +------------------------------------------------+ > select to_unixtime('01-14-2023 01:01:30+05:30', '%q', '%d-%m-%Y %H/%M/%S', '%+', '%m-%d-%Y %H:%M:%S%#z'); +-----------------------------------------------------------------------------------------------------------------------------+ | to_unixtime(Utf8("01-14-2023 01:01:30+05:30"),Utf8("%q"),Utf8("%d-%m-%Y %H/%M/%S"),Utf8("%+"),Utf8("%m-%d-%Y %H:%M:%S%#z")) | +-----------------------------------------------------------------------------------------------------------------------------+ | 1673638290 | +-----------------------------------------------------------------------------------------------------------------------------+
todayAlias of current_date.
array_any_valueExtracts the element with the index n from the array.
array_element(array, index)
> select array_element([1, 2, 3, 4], 3); +-----------------------------------------+ | array_element(List([1,2,3,4]),Int64(3)) | +-----------------------------------------+ | 3 | +-----------------------------------------+
array_appendAppends an element to the end of an array.
array_append(array, element)
> select array_append([1, 2, 3], 4); +--------------------------------------+ | array_append(List([1,2,3]),Int64(4)) | +--------------------------------------+ | [1, 2, 3, 4] | +--------------------------------------+
array_catAlias of array_concat.
array_concatAppends an element to the end of an array.
array_append(array, element)
> select array_append([1, 2, 3], 4); +--------------------------------------+ | array_append(List([1,2,3]),Int64(4)) | +--------------------------------------+ | [1, 2, 3, 4] | +--------------------------------------+
array_containsAlias of array_has.
array_dimsReturns an array of the array's dimensions.
array_dims(array)
> select array_dims([[1, 2, 3], [4, 5, 6]]); +---------------------------------+ | array_dims(List([1,2,3,4,5,6])) | +---------------------------------+ | [2, 3] | +---------------------------------+
array_distanceReturns the Euclidean distance between two input arrays of equal length.
array_distance(array1, array2)
> select array_distance([1, 2], [1, 4]); +------------------------------------+ | array_distance(List([1,2], [1,4])) | +------------------------------------+ | 2.0 | +------------------------------------+
array_distinctReturns distinct values from the array after removing duplicates.
array_distinct(array)
> select array_distinct([1, 3, 2, 3, 1, 2, 4]); +---------------------------------+ | array_distinct(List([1,2,3,4])) | +---------------------------------+ | [1, 2, 3, 4] | +---------------------------------+
array_elementExtracts the element with the index n from the array.
array_element(array, index)
> select array_element([1, 2, 3, 4], 3); +-----------------------------------------+ | array_element(List([1,2,3,4]),Int64(3)) | +-----------------------------------------+ | 3 | +-----------------------------------------+
array_emptyAlias of empty.
array_exceptReturns an array of the elements that appear in the first array but not in the second.
array_except(array1, array2)
> select array_except([1, 2, 3, 4], [5, 6, 3, 4]); +----------------------------------------------------+ | array_except([1, 2, 3, 4], [5, 6, 3, 4]); | +----------------------------------------------------+ | [1, 2] | +----------------------------------------------------+ > select array_except([1, 2, 3, 4], [3, 4, 5, 6]); +----------------------------------------------------+ | array_except([1, 2, 3, 4], [3, 4, 5, 6]); | +----------------------------------------------------+ | [1, 2] | +----------------------------------------------------+
array_extractAlias of array_element.
array_hasReturns true if the array contains the element.
array_has(array, element)
> select array_has([1, 2, 3], 2); +-----------------------------+ | array_has(List([1,2,3]), 2) | +-----------------------------+ | true | +-----------------------------+
array_has_allReturns true if the array contains the element.
array_has(array, element)
> select array_has([1, 2, 3], 2); +-----------------------------+ | array_has(List([1,2,3]), 2) | +-----------------------------+ | true | +-----------------------------+
array_has_anyReturns true if the array contains the element.
array_has(array, element)
> select array_has([1, 2, 3], 2); +-----------------------------+ | array_has(List([1,2,3]), 2) | +-----------------------------+ | true | +-----------------------------+
array_indexofAlias of array_position.
array_intersectReturns distinct values from the array after removing duplicates.
array_distinct(array)
> select array_distinct([1, 3, 2, 3, 1, 2, 4]); +---------------------------------+ | array_distinct(List([1,2,3,4])) | +---------------------------------+ | [1, 2, 3, 4] | +---------------------------------+
array_joinAlias of array_to_string.
array_lengthReturns the length of the array dimension.
array_length(array, dimension)
> select array_length([1, 2, 3, 4, 5], 1); +-------------------------------------------+ | array_length(List([1,2,3,4,5]), 1) | +-------------------------------------------+ | 5 | +-------------------------------------------+
array_ndimsReturns an array of the array's dimensions.
array_dims(array)
> select array_dims([[1, 2, 3], [4, 5, 6]]); +---------------------------------+ | array_dims(List([1,2,3,4,5,6])) | +---------------------------------+ | [2, 3] | +---------------------------------+
array_pop_backExtracts the element with the index n from the array.
array_element(array, index)
> select array_element([1, 2, 3, 4], 3); +-----------------------------------------+ | array_element(List([1,2,3,4]),Int64(3)) | +-----------------------------------------+ | 3 | +-----------------------------------------+
array_pop_frontExtracts the element with the index n from the array.
array_element(array, index)
> select array_element([1, 2, 3, 4], 3); +-----------------------------------------+ | array_element(List([1,2,3,4]),Int64(3)) | +-----------------------------------------+ | 3 | +-----------------------------------------+
array_positionReturns the position of the first occurrence of the specified element in the array.
array_position(array, element) array_position(array, element, index)
> select array_position([1, 2, 2, 3, 1, 4], 2); +----------------------------------------------+ | array_position(List([1,2,2,3,1,4]),Int64(2)) | +----------------------------------------------+ | 2 | +----------------------------------------------+ > select array_position([1, 2, 2, 3, 1, 4], 2, 3); +----------------------------------------------------+ | array_position(List([1,2,2,3,1,4]),Int64(2), Int64(3)) | +----------------------------------------------------+ | 3 | +----------------------------------------------------+
array_positionsReturns the position of the first occurrence of the specified element in the array.
array_position(array, element) array_position(array, element, index)
> select array_position([1, 2, 2, 3, 1, 4], 2); +----------------------------------------------+ | array_position(List([1,2,2,3,1,4]),Int64(2)) | +----------------------------------------------+ | 2 | +----------------------------------------------+ > select array_position([1, 2, 2, 3, 1, 4], 2, 3); +----------------------------------------------------+ | array_position(List([1,2,2,3,1,4]),Int64(2), Int64(3)) | +----------------------------------------------------+ | 3 | +----------------------------------------------------+
array_prependPrepends an element to the beginning of an array.
array_prepend(element, array)
> select array_prepend(1, [2, 3, 4]); +---------------------------------------+ | array_prepend(Int64(1),List([2,3,4])) | +---------------------------------------+ | [1, 2, 3, 4] | +---------------------------------------+
array_push_backAlias of array_append.
array_push_frontAlias of array_prepend.
array_removeRemoves the first element from the array equal to the given value.
array_remove(array, element)
> select array_remove([1, 2, 2, 3, 2, 1, 4], 2); +----------------------------------------------+ | array_remove(List([1,2,2,3,2,1,4]),Int64(2)) | +----------------------------------------------+ | [1, 2, 3, 2, 1, 4] | +----------------------------------------------+
array_remove_allRemoves the first element from the array equal to the given value.
array_remove(array, element)
> select array_remove([1, 2, 2, 3, 2, 1, 4], 2); +----------------------------------------------+ | array_remove(List([1,2,2,3,2,1,4]),Int64(2)) | +----------------------------------------------+ | [1, 2, 3, 2, 1, 4] | +----------------------------------------------+
array_remove_nRemoves the first element from the array equal to the given value.
array_remove(array, element)
> select array_remove([1, 2, 2, 3, 2, 1, 4], 2); +----------------------------------------------+ | array_remove(List([1,2,2,3,2,1,4]),Int64(2)) | +----------------------------------------------+ | [1, 2, 3, 2, 1, 4] | +----------------------------------------------+
array_repeatReturns an array containing element count times.
array_repeat(element, count)
> select array_repeat(1, 3); +---------------------------------+ | array_repeat(Int64(1),Int64(3)) | +---------------------------------+ | [1, 1, 1] | +---------------------------------+ > select array_repeat([1, 2], 2); +------------------------------------+ | array_repeat(List([1,2]),Int64(2)) | +------------------------------------+ | [[1, 2], [1, 2]] | +------------------------------------+
array_replaceReplaces the first max occurrences of the specified element with another specified element.
array_replace_n(array, from, to, max)
> select array_replace_n([1, 2, 2, 3, 2, 1, 4], 2, 5, 2); +-------------------------------------------------------------------+ | array_replace_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(5),Int64(2)) | +-------------------------------------------------------------------+ | [1, 5, 5, 3, 2, 1, 4] | +-------------------------------------------------------------------+
array_replace_allReplaces the first max occurrences of the specified element with another specified element.
array_replace_n(array, from, to, max)
> select array_replace_n([1, 2, 2, 3, 2, 1, 4], 2, 5, 2); +-------------------------------------------------------------------+ | array_replace_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(5),Int64(2)) | +-------------------------------------------------------------------+ | [1, 5, 5, 3, 2, 1, 4] | +-------------------------------------------------------------------+
array_replace_nReplaces the first max occurrences of the specified element with another specified element.
array_replace_n(array, from, to, max)
> select array_replace_n([1, 2, 2, 3, 2, 1, 4], 2, 5, 2); +-------------------------------------------------------------------+ | array_replace_n(List([1,2,2,3,2,1,4]),Int64(2),Int64(5),Int64(2)) | +-------------------------------------------------------------------+ | [1, 5, 5, 3, 2, 1, 4] | +-------------------------------------------------------------------+
array_resizeResizes the list to contain size elements. Initializes new elements with value or empty if value is not set.
array_resize(array, size, value)
> select array_resize([1, 2, 3], 5, 0); +-------------------------------------+ | array_resize(List([1,2,3],5,0)) | +-------------------------------------+ | [1, 2, 3, 0, 0] | +-------------------------------------+
array_reverseReturns the array with the order of the elements reversed.
array_reverse(array)
> select array_reverse([1, 2, 3, 4]); +------------------------------------------------------------+ | array_reverse(List([1, 2, 3, 4])) | +------------------------------------------------------------+ | [4, 3, 2, 1] | +------------------------------------------------------------+
array_sliceExtracts the element with the index n from the array.
array_element(array, index)
> select array_element([1, 2, 3, 4], 3); +-----------------------------------------+ | array_element(List([1,2,3,4]),Int64(3)) | +-----------------------------------------+ | 3 | +-----------------------------------------+
array_sortSort array.
array_sort(array, desc, nulls_first)
ASC or DESC).NULLS FIRST or NULLS LAST).> select array_sort([3, 1, 2]); +-----------------------------+ | array_sort(List([3,1,2])) | +-----------------------------+ | [1, 2, 3] | +-----------------------------+
array_to_stringConverts each element to its text representation.
array_to_string(array, delimiter[, null_string])
> select array_to_string([[1, 2, 3, 4], [5, 6, 7, 8]], ','); +----------------------------------------------------+ | array_to_string(List([1,2,3,4,5,6,7,8]),Utf8(",")) | +----------------------------------------------------+ | 1,2,3,4,5,6,7,8 | +----------------------------------------------------+
array_unionReturns distinct values from the array after removing duplicates.
array_distinct(array)
> select array_distinct([1, 3, 2, 3, 1, 2, 4]); +---------------------------------+ | array_distinct(List([1,2,3,4])) | +---------------------------------+ | [1, 2, 3, 4] | +---------------------------------+
cardinalityReturns the total number of elements in the array.
cardinality(array)
> select cardinality([[1, 2, 3, 4], [5, 6, 7, 8]]); +--------------------------------------+ | cardinality(List([1,2,3,4,5,6,7,8])) | +--------------------------------------+ | 8 | +--------------------------------------+
emptyReturns 1 for an empty array or 0 for a non-empty array.
empty(array)
> select empty([1]); +------------------+ | empty(List([1])) | +------------------+ | 0 | +------------------+
flattenConverts an array of arrays to a flat array.
The flattened array contains all the elements from all source arrays.
flatten(array)
> select flatten([[1, 2], [3, 4]]); +------------------------------+ | flatten(List([1,2], [3,4])) | +------------------------------+ | [1, 2, 3, 4] | +------------------------------+
generate_seriesSimilar to the range function, but it includes the upper bound.
generate_series(start, stop, step)
> select generate_series(1,3); +------------------------------------+ | generate_series(Int64(1),Int64(3)) | +------------------------------------+ | [1, 2, 3] | +------------------------------------+
list_any_valueAlias of array_any_value.
list_appendAlias of array_append.
list_catAlias of array_concat.
list_concatAlias of array_concat.
list_containsAlias of array_has.
list_dimsAlias of array_dims.
list_distanceAlias of array_distance.
list_distinctAlias of array_distinct.
list_elementAlias of array_element.
list_emptyAlias of empty.
list_exceptAlias of array_except.
list_extractAlias of array_element.
list_hasAlias of array_has.
list_has_allAlias of array_has_all.
list_has_anyAlias of array_has_any.
list_indexofAlias of array_position.
list_intersectAlias of array_intersect.
list_joinAlias of array_to_string.
list_lengthAlias of array_length.
list_ndimsAlias of array_ndims.
list_pop_backAlias of array_pop_back.
list_pop_frontAlias of array_pop_front.
list_positionAlias of array_position.
list_positionsAlias of array_positions.
list_prependAlias of array_prepend.
list_push_backAlias of array_append.
list_push_frontAlias of array_prepend.
list_removeAlias of array_remove.
list_remove_allAlias of array_remove_all.
list_remove_nAlias of array_remove_n.
list_repeatAlias of array_repeat.
list_replaceAlias of array_replace.
list_replace_allAlias of array_replace_all.
list_replace_nAlias of array_replace_n.
list_resizeAlias of array_resize.
list_reverseAlias of array_reverse.
list_sliceAlias of array_slice.
list_sortAlias of array_sort.
list_to_stringAlias of array_to_string.
list_unionAlias of array_union.
make_arrayReturns an array using the specified input expressions.
make_array(expression1[, ..., expression_n])
> select make_array(1, 2, 3, 4, 5); +----------------------------------------------------------+ | make_array(Int64(1),Int64(2),Int64(3),Int64(4),Int64(5)) | +----------------------------------------------------------+ | [1, 2, 3, 4, 5] | +----------------------------------------------------------+
make_listAlias of make_array.
rangeReturns an Arrow array between start and stop with step. The range start..end contains all values with start <= x < end. It is empty if start >= end. Step cannot be 0.
range(start, stop, step)
> select range(2, 10, 3); +-----------------------------------+ | range(Int64(2),Int64(10),Int64(3))| +-----------------------------------+ | [2, 5, 8] | +-----------------------------------+ > select range(DATE '1992-09-01', DATE '1993-03-01', INTERVAL '1' MONTH); +--------------------------------------------------------------+ | range(DATE '1992-09-01', DATE '1993-03-01', INTERVAL '1' MONTH) | +--------------------------------------------------------------+ | [1992-09-01, 1992-10-01, 1992-11-01, 1992-12-01, 1993-01-01, 1993-02-01] | +--------------------------------------------------------------+
string_to_arraySplits a string into an array of substrings based on a delimiter. Any substrings matching the optional null_str argument are replaced with NULL.
string_to_array(str, delimiter[, null_str])
NULL.> select string_to_array('abc##def', '##'); +-----------------------------------+ | string_to_array(Utf8('abc##def')) | +-----------------------------------+ | ['abc', 'def'] | +-----------------------------------+ > select string_to_array('abc def', ' ', 'def'); +---------------------------------------------+ | string_to_array(Utf8('abc def'), Utf8(' '), Utf8('def')) | +---------------------------------------------+ | ['abc', NULL] | +---------------------------------------------+
string_to_listAlias of string_to_array.
named_structReturns an Arrow struct using the specified name and input expressions pairs.
named_struct(expression1_name, expression1_input[, ..., expression_n_name, expression_n_input])
For example, this query converts two columns a and b to a single column with a struct type of fields field_a and field_b:
> select * from t; +---+---+ | a | b | +---+---+ | 1 | 2 | | 3 | 4 | +---+---+ > select named_struct('field_a', a, 'field_b', b) from t; +-------------------------------------------------------+ | named_struct(Utf8("field_a"),t.a,Utf8("field_b"),t.b) | +-------------------------------------------------------+ | {field_a: 1, field_b: 2} | | {field_a: 3, field_b: 4} | +-------------------------------------------------------+
rowAlias of struct.
structReturns an Arrow struct using the specified input expressions optionally named. Fields in the returned struct use the optional name or the cN naming convention. For example: c0, c1, c2, etc.
struct(expression1[, ..., expression_n])
For example, this query converts two columns a and b to a single column with a struct type of fields field_a and c1:
> select * from t; +---+---+ | a | b | +---+---+ | 1 | 2 | | 3 | 4 | +---+---+ -- use default names `c0`, `c1` > select struct(a, b) from t; +-----------------+ | struct(t.a,t.b) | +-----------------+ | {c0: 1, c1: 2} | | {c0: 3, c1: 4} | +-----------------+ -- name the first field `field_a` select struct(a as field_a, b) from t; +--------------------------------------------------+ | named_struct(Utf8("field_a"),t.a,Utf8("c1"),t.b) | +--------------------------------------------------+ | {field_a: 1, c1: 2} | | {field_a: 3, c1: 4} | +--------------------------------------------------+
element_atAlias of map_extract.
mapReturns an Arrow map with the specified key-value pairs.
The make_map function creates a map from two lists: one for keys and one for values. Each key must be unique and non-null.
map(key, value) map(key: value) make_map(['key1', 'key2'], ['value1', 'value2'])
map: Expression to be used for key. Can be a constant, column, function, or any combination of arithmetic or string operators. For make_map: The list of keys to be used in the map. Each key must be unique and non-null.map: Expression to be used for value. Can be a constant, column, function, or any combination of arithmetic or string operators. For make_map: The list of values to be mapped to the corresponding keys.-- Using map function SELECT MAP('type', 'test'); ---- {type: test} SELECT MAP(['POST', 'HEAD', 'PATCH'], [41, 33, null]); ---- {POST: 41, HEAD: 33, PATCH: } SELECT MAP([[1,2], [3,4]], ['a', 'b']); ---- {[1, 2]: a, [3, 4]: b} SELECT MAP { 'a': 1, 'b': 2 }; ---- {a: 1, b: 2} -- Using make_map function SELECT MAKE_MAP(['POST', 'HEAD'], [41, 33]); ---- {POST: 41, HEAD: 33} SELECT MAKE_MAP(['key1', 'key2'], ['value1', null]); ---- {key1: value1, key2: }
map_extractReturns a list containing the value for the given key or an empty list if the key is not present in the map.
map_extract(map, key)
SELECT map_extract(MAP {'a': 1, 'b': NULL, 'c': 3}, 'a'); ---- [1] SELECT map_extract(MAP {1: 'one', 2: 'two'}, 2); ---- ['two'] SELECT map_extract(MAP {'x': 10, 'y': NULL, 'z': 30}, 'y'); ---- []
map_keysReturns a list of all keys in the map.
map_keys(map)
SELECT map_keys(MAP {'a': 1, 'b': NULL, 'c': 3}); ---- [a, b, c] SELECT map_keys(map([100, 5], [42, 43])); ---- [100, 5]
map_valuesReturns a list of all values in the map.
map_values(map)
SELECT map_values(MAP {'a': 1, 'b': NULL, 'c': 3}); ---- [1, , 3] SELECT map_values(map([100, 5], [42, 43])); ---- [42, 43]
digestComputes the binary hash of an expression using the specified algorithm.
digest(expression, algorithm)
> select digest('foo', 'sha256'); +------------------------------------------+ | digest(Utf8("foo"), Utf8("sha256")) | +------------------------------------------+ | <binary_hash_result> | +------------------------------------------+
md5Computes an MD5 128-bit checksum for a string expression.
md5(expression)
> select md5('foo'); +-------------------------------------+ | md5(Utf8("foo")) | +-------------------------------------+ | <md5_checksum_result> | +-------------------------------------+
sha224Computes the SHA-224 hash of a binary string.
sha224(expression)
> select sha224('foo'); +------------------------------------------+ | sha224(Utf8("foo")) | +------------------------------------------+ | <sha224_hash_result> | +------------------------------------------+
sha256Computes the SHA-256 hash of a binary string.
sha256(expression)
> select sha256('foo'); +--------------------------------------+ | sha256(Utf8("foo")) | +--------------------------------------+ | <sha256_hash_result> | +--------------------------------------+
sha384Computes the SHA-384 hash of a binary string.
sha384(expression)
> select sha384('foo'); +-----------------------------------------+ | sha384(Utf8("foo")) | +-----------------------------------------+ | <sha384_hash_result> | +-----------------------------------------+
sha512Computes the SHA-512 hash of a binary string.
sha512(expression)
> select sha512('foo'); +-------------------------------------------+ | sha512(Utf8("foo")) | +-------------------------------------------+ | <sha512_hash_result> | +-------------------------------------------+
arrow_castCasts a value to a specific Arrow data type.
arrow_cast(expression, datatype)
arrow_typeof]> select arrow_cast(-5, 'Int8') as a, arrow_cast('foo', 'Dictionary(Int32, Utf8)') as b, arrow_cast('bar', 'LargeUtf8') as c, arrow_cast('2023-01-02T12:53:02', 'Timestamp(Microsecond, Some("+08:00"))') as d ; +----+-----+-----+---------------------------+ | a | b | c | d | +----+-----+-----+---------------------------+ | -5 | foo | bar | 2023-01-02T12:53:02+08:00 | +----+-----+-----+---------------------------+
arrow_typeofReturns the name of the underlying Arrow data type of the expression.
arrow_typeof(expression)
> select arrow_typeof('foo'), arrow_typeof(1); +---------------------------+------------------------+ | arrow_typeof(Utf8("foo")) | arrow_typeof(Int64(1)) | +---------------------------+------------------------+ | Utf8 | Int64 | +---------------------------+------------------------+
get_fieldReturns a field within a map or a struct with the given key. Note: most users invoke get_field indirectly via field access syntax such as my_struct_col['field_name'] which results in a call to get_field(my_struct_col, 'field_name').
get_field(expression1, expression2)
> create table t (idx varchar, v varchar) as values ('data','fusion'), ('apache', 'arrow'); > select struct(idx, v) from t as c; +-------------------------+ | struct(c.idx,c.v) | +-------------------------+ | {c0: data, c1: fusion} | | {c0: apache, c1: arrow} | +-------------------------+ > select get_field((select struct(idx, v) from t), 'c0'); +-----------------------+ | struct(t.idx,t.v)[c0] | +-----------------------+ | data | | apache | +-----------------------+ > select get_field((select struct(idx, v) from t), 'c1'); +-----------------------+ | struct(t.idx,t.v)[c1] | +-----------------------+ | fusion | | arrow | +-----------------------+
versionReturns the version of DataFusion.
version()
> select version(); +--------------------------------------------+ | version() | +--------------------------------------------+ | Apache DataFusion 42.0.0, aarch64 on macos | +--------------------------------------------+