| <table class="table"> |
| <thead> |
| <tr> |
| <th style="width:25%">Function</th> |
| <th>Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>any(expr)</td> |
| <td>Returns true if at least one value of `expr` is true.</td> |
| </tr> |
| <tr> |
| <td>any_value(expr[, isIgnoreNull])</td> |
| <td>Returns some value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>approx_count_distinct(expr[, relativeSD])</td> |
| <td>Returns the estimated cardinality by HyperLogLog++. |
| `relativeSD` defines the maximum relative standard deviation allowed.</td> |
| </tr> |
| <tr> |
| <td>approx_percentile(col, percentage [, accuracy])</td> |
| <td>Returns the approximate `percentile` of the numeric or |
| ANSI interval column `col`, which is the smallest value in the ordered `col` values (sorted |
| from least to greatest) such that no more than `percentage` of `col` values is less than |
| or equal to that value. The value of `percentage` must be between 0.0 and 1.0. |
| The `accuracy` parameter (default: 10000) is a positive numeric literal that controls |
| approximation accuracy at the cost of memory. A higher value of `accuracy` yields better |
| accuracy; `1.0/accuracy` is the relative error of the approximation. |
| When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0. |
| In this case, returns the approximate percentile array of column `col` at the given |
| percentage array.</td> |
| </tr> |
| <tr> |
| <td>array_agg(expr)</td> |
| <td>Collects and returns a list of non-unique elements.</td> |
| </tr> |
| <tr> |
| <td>avg(expr)</td> |
| <td>Returns the mean calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>bit_and(expr)</td> |
| <td>Returns the bitwise AND of all non-null input values, or null if none.</td> |
| </tr> |
| <tr> |
| <td>bit_or(expr)</td> |
| <td>Returns the bitwise OR of all non-null input values, or null if none.</td> |
| </tr> |
| <tr> |
| <td>bit_xor(expr)</td> |
| <td>Returns the bitwise XOR of all non-null input values, or null if none.</td> |
| </tr> |
| <tr> |
| <td>bitmap_construct_agg(child)</td> |
| <td>Returns a bitmap with the positions of the bits set from all the values from |
| the child expression. The child expression will most likely be bitmap_bit_position().</td> |
| </tr> |
| <tr> |
| <td>bitmap_or_agg(child)</td> |
| <td>Returns a bitmap that is the bitwise OR of all of the bitmaps from the child |
| expression. The input should be bitmaps created from bitmap_construct_agg().</td> |
| </tr> |
| <tr> |
| <td>bool_and(expr)</td> |
| <td>Returns true if all values of `expr` are true.</td> |
| </tr> |
| <tr> |
| <td>bool_or(expr)</td> |
| <td>Returns true if at least one value of `expr` is true.</td> |
| </tr> |
| <tr> |
| <td>collect_list(expr)</td> |
| <td>Collects and returns a list of non-unique elements.</td> |
| </tr> |
| <tr> |
| <td>collect_set(expr)</td> |
| <td>Collects and returns a set of unique elements.</td> |
| </tr> |
| <tr> |
| <td>corr(expr1, expr2)</td> |
| <td>Returns the Pearson correlation coefficient between a set of number pairs.</td> |
| </tr> |
| <tr> |
| <td>count(*)</td> |
| <td>Returns the total number of retrieved rows, including rows containing null.</td> |
| </tr> |
| <tr> |
| <td>count(expr[, expr...])</td> |
| <td>Returns the number of rows for which the supplied expression(s) are all non-null.</td> |
| </tr> |
| <tr> |
| <td>count(DISTINCT expr[, expr...])</td> |
| <td>Returns the number of rows for which the supplied expression(s) are unique and non-null.</td> |
| </tr> |
| <tr> |
| <td>count_if(expr)</td> |
| <td>Returns the number of `TRUE` values for the expression.</td> |
| </tr> |
| <tr> |
| <td>count_min_sketch(col, eps, confidence, seed)</td> |
| <td>Returns a count-min sketch of a column with the given eps, |
| confidence and seed. The result is an array of bytes, which can be deserialized to a |
| `CountMinSketch` before usage. Count-min sketch is a probabilistic data structure used for |
| frequency estimation using sub-linear space.</td> |
| </tr> |
| <tr> |
| <td>covar_pop(expr1, expr2)</td> |
| <td>Returns the population covariance of a set of number pairs.</td> |
| </tr> |
| <tr> |
| <td>covar_samp(expr1, expr2)</td> |
| <td>Returns the sample covariance of a set of number pairs.</td> |
| </tr> |
| <tr> |
| <td>every(expr)</td> |
| <td>Returns true if all values of `expr` are true.</td> |
| </tr> |
| <tr> |
| <td>first(expr[, isIgnoreNull])</td> |
| <td>Returns the first value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>first_value(expr[, isIgnoreNull])</td> |
| <td>Returns the first value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>grouping(col)</td> |
| <td>Indicates whether a specified column in a GROUP BY is aggregated or |
| not. Returns 1 for aggregated or 0 for not aggregated in the result set.</td> |
| </tr> |
| <tr> |
| <td>grouping_id([col1[, col2 ..]])</td> |
| <td>Returns the level of grouping, equal to |
| `(grouping(c1) << (n-1)) + (grouping(c2) << (n-2)) + ... + grouping(cn)`</td> |
| </tr> |
| <tr> |
| <td>histogram_numeric(expr, nb)</td> |
| <td>Computes a histogram on numeric `expr` using `nb` bins. |
| The return value is an array of (x,y) pairs representing the centers of the |
| histogram's bins. As the value of `nb` is increased, the histogram approximation |
| gets finer-grained, but may yield artifacts around outliers. In practice, 20-40 |
| histogram bins appear to work well, with more bins being required for skewed or |
| smaller datasets. Note that this function creates a histogram with non-uniform |
| bin widths. It offers no guarantees in terms of the mean-squared-error of the |
| histogram, but in practice is comparable to the histograms produced by the R/S-Plus |
| statistical computing packages. Note: the output type of the `x` field in the return value is |
| propagated from the input value consumed in the aggregate function.</td> |
| </tr> |
| <tr> |
| <td>hll_sketch_agg(expr, lgConfigK)</td> |
| <td>Returns the HllSketch's updatable binary representation. |
| `lgConfigK` (optional) is the log-base-2 of K, where K is the number of buckets or |
| slots for the HllSketch.</td> |
| </tr> |
| <tr> |
| <td>hll_union_agg(expr, allowDifferentLgConfigK)</td> |
| <td>Returns the updatable binary representation of the union of the input HllSketches, |
| from which the number of unique values can be estimated. |
| `allowDifferentLgConfigK` (optional) allows sketches with different lgConfigK values |
| to be unioned (defaults to false).</td> |
| </tr> |
| <tr> |
| <td>kurtosis(expr)</td> |
| <td>Returns the kurtosis value calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>last(expr[, isIgnoreNull])</td> |
| <td>Returns the last value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>last_value(expr[, isIgnoreNull])</td> |
| <td>Returns the last value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>max(expr)</td> |
| <td>Returns the maximum value of `expr`.</td> |
| </tr> |
| <tr> |
| <td>max_by(x, y)</td> |
| <td>Returns the value of `x` associated with the maximum value of `y`.</td> |
| </tr> |
| <tr> |
| <td>mean(expr)</td> |
| <td>Returns the mean calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>median(col)</td> |
| <td>Returns the median of numeric or ANSI interval column `col`.</td> |
| </tr> |
| <tr> |
| <td>min(expr)</td> |
| <td>Returns the minimum value of `expr`.</td> |
| </tr> |
| <tr> |
| <td>min_by(x, y)</td> |
| <td>Returns the value of `x` associated with the minimum value of `y`.</td> |
| </tr> |
| <tr> |
| <td>mode(col)</td> |
| <td>Returns the most frequent value for the values within `col`. NULL values are ignored. If all the values are NULL, or there are 0 rows, returns NULL.</td> |
| </tr> |
| <tr> |
| <td>percentile(col, percentage [, frequency])</td> |
| <td>Returns the exact percentile value of numeric |
| or ANSI interval column `col` at the given percentage. The value of `percentage` must be |
| between 0.0 and 1.0. The value of `frequency` must be a positive integer.</td> |
| </tr> |
| <tr> |
| <td>percentile(col, array(percentage1 [, percentage2]...) [, frequency])</td> |
| <td>Returns the exact |
| percentile value array of numeric column `col` at the given percentage(s). Each value |
| of the percentage array must be between 0.0 and 1.0. The value of `frequency` must be |
| a positive integer.</td> |
| </tr> |
| <tr> |
| <td>percentile_approx(col, percentage [, accuracy])</td> |
| <td>Returns the approximate `percentile` of the numeric or |
| ANSI interval column `col`, which is the smallest value in the ordered `col` values (sorted |
| from least to greatest) such that no more than `percentage` of `col` values is less than |
| or equal to that value. The value of `percentage` must be between 0.0 and 1.0. |
| The `accuracy` parameter (default: 10000) is a positive numeric literal that controls |
| approximation accuracy at the cost of memory. A higher value of `accuracy` yields better |
| accuracy; `1.0/accuracy` is the relative error of the approximation. |
| When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0. |
| In this case, returns the approximate percentile array of column `col` at the given |
| percentage array.</td> |
| </tr> |
| <tr> |
| <td>regr_avgx(y, x)</td> |
| <td>Returns the average of the independent variable for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_avgy(y, x)</td> |
| <td>Returns the average of the dependent variable for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_count(y, x)</td> |
| <td>Returns the number of non-null number pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_intercept(y, x)</td> |
| <td>Returns the intercept of the univariate linear regression line for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_r2(y, x)</td> |
| <td>Returns the coefficient of determination for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_slope(y, x)</td> |
| <td>Returns the slope of the linear regression line for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_sxx(y, x)</td> |
| <td>Returns REGR_COUNT(y, x) * VAR_POP(x) for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_sxy(y, x)</td> |
| <td>Returns REGR_COUNT(y, x) * COVAR_POP(y, x) for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>regr_syy(y, x)</td> |
| <td>Returns REGR_COUNT(y, x) * VAR_POP(y) for non-null pairs in a group, where `y` is the dependent variable and `x` is the independent variable.</td> |
| </tr> |
| <tr> |
| <td>skewness(expr)</td> |
| <td>Returns the skewness value calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>some(expr)</td> |
| <td>Returns true if at least one value of `expr` is true.</td> |
| </tr> |
| <tr> |
| <td>std(expr)</td> |
| <td>Returns the sample standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>stddev(expr)</td> |
| <td>Returns the sample standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>stddev_pop(expr)</td> |
| <td>Returns the population standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>stddev_samp(expr)</td> |
| <td>Returns the sample standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>sum(expr)</td> |
| <td>Returns the sum calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>try_avg(expr)</td> |
| <td>Returns the mean calculated from values of a group and the result is null on overflow.</td> |
| </tr> |
| <tr> |
| <td>try_sum(expr)</td> |
| <td>Returns the sum calculated from values of a group and the result is null on overflow.</td> |
| </tr> |
| <tr> |
| <td>var_pop(expr)</td> |
| <td>Returns the population variance calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>var_samp(expr)</td> |
| <td>Returns the sample variance calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>variance(expr)</td> |
| <td>Returns the sample variance calculated from values of a group.</td> |
| </tr> |
| </tbody> |
| </table> |
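
Two of the formulas in the table above can be checked by hand: the bit layout behind `grouping_id` and the `regr_sxx` / `regr_sxy` / `regr_syy` identities. The following is a minimal sketch in plain Python, not the engine's implementation. The helper names `grouping_id` and `regr_sums` are hypothetical, and returning `None` for an empty group is an assumption made for illustration.

```python
from statistics import fmean, pvariance


def grouping_id(flags):
    """Combine per-column grouping() flags (1 = aggregated, 0 = not) into one integer.

    flags[0] belongs to the first grouping column, so it lands in the most
    significant bit: (grouping(c1) << (n-1)) + ... + grouping(cn).
    """
    n = len(flags)
    return sum(flag << (n - 1 - i) for i, flag in enumerate(flags))


def regr_sums(pairs):
    """Compute count, sxx, sxy and syy over (y, x) pairs.

    Pairs with a None on either side are skipped, mirroring the
    "non-null pairs" wording in the table.
    """
    clean = [(y, x) for y, x in pairs if y is not None and x is not None]
    n = len(clean)
    if n == 0:
        # Assumption for this sketch: no usable pairs means no result.
        return {"count": 0, "sxx": None, "sxy": None, "syy": None}
    ys = [y for y, _ in clean]
    xs = [x for _, x in clean]
    mean_y, mean_x = fmean(ys), fmean(xs)
    # Population covariance of the non-null pairs.
    covar_pop = sum((y - mean_y) * (x - mean_x) for y, x in clean) / n
    return {
        "count": n,
        "sxx": n * pvariance(xs),  # REGR_COUNT(y, x) * VAR_POP(x)
        "sxy": n * covar_pop,      # REGR_COUNT(y, x) * COVAR_POP(y, x)
        "syy": n * pvariance(ys),  # REGR_COUNT(y, x) * VAR_POP(y)
    }
```

For example, `grouping_id([1, 0])` places the first column's flag in the high bit, and `regr_sums([(1, 1), (2, 2), (3, 3), (None, 4)])` ignores the last pair because its `y` is null.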