| <table class="table"> |
| <thead> |
| <tr> |
| <th style="width:25%">Function</th> |
| <th>Description</th> |
| </tr> |
| </thead> |
| <tbody> |
| <tr> |
| <td>any(expr)</td> |
| <td>Returns true if at least one value of `expr` is true.</td> |
| </tr> |
| <tr> |
| <td>approx_count_distinct(expr[, relativeSD])</td> |
| <td>Returns the estimated number of distinct values of `expr`, computed with HyperLogLog++. |
| `relativeSD` defines the maximum relative standard deviation allowed.</td> |
| </tr> |
| <tr> |
| <td>approx_percentile(col, percentage [, accuracy])</td> |
| <td>Returns the approximate percentile value of numeric |
| column `col` at the given percentage. The value of percentage must be between 0.0 |
| and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal which |
| controls approximation accuracy at the cost of memory. A higher value of `accuracy` yields |
| better accuracy; `1.0/accuracy` is the relative error of the approximation. |
| When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0. |
| In this case, returns the approximate percentile array of column `col` at the given |
| percentage array.</td> |
| </tr> |
| <tr> |
| <td>avg(expr)</td> |
| <td>Returns the mean calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>bit_or(expr)</td> |
| <td>Returns the bitwise OR of all non-null input values, or null if none.</td> |
| </tr> |
| <tr> |
| <td>bit_xor(expr)</td> |
| <td>Returns the bitwise XOR of all non-null input values, or null if none.</td> |
| </tr> |
| <tr> |
| <td>bool_and(expr)</td> |
| <td>Returns true if all values of `expr` are true.</td> |
| </tr> |
| <tr> |
| <td>bool_or(expr)</td> |
| <td>Returns true if at least one value of `expr` is true.</td> |
| </tr> |
| <tr> |
| <td>collect_list(expr)</td> |
| <td>Collects and returns a list of elements, keeping duplicates.</td> |
| </tr> |
| <tr> |
| <td>collect_set(expr)</td> |
| <td>Collects and returns a set of unique elements.</td> |
| </tr> |
| <tr> |
| <td>corr(expr1, expr2)</td> |
| <td>Returns the Pearson correlation coefficient between a set of number pairs.</td> |
| </tr> |
| <tr> |
| <td>count(*)</td> |
| <td>Returns the total number of retrieved rows, including rows containing null.</td> |
| </tr> |
| <tr> |
| <td>count(expr[, expr...])</td> |
| <td>Returns the number of rows for which the supplied expression(s) are all non-null.</td> |
| </tr> |
| <tr> |
| <td>count(DISTINCT expr[, expr...])</td> |
| <td>Returns the number of rows for which the supplied expression(s) are unique and non-null.</td> |
| </tr> |
| <tr> |
| <td>count_if(expr)</td> |
| <td>Returns the number of `TRUE` values for the expression.</td> |
| </tr> |
| <tr> |
| <td>count_min_sketch(col, eps, confidence, seed)</td> |
| <td>Returns a count-min sketch of a column with the given `eps`, |
| confidence and seed. The result is an array of bytes, which can be deserialized to a |
| `CountMinSketch` before usage. A count-min sketch is a probabilistic data structure used for |
| frequency estimation in sub-linear space.</td> |
| </tr> |
| <tr> |
| <td>covar_pop(expr1, expr2)</td> |
| <td>Returns the population covariance of a set of number pairs.</td> |
| </tr> |
| <tr> |
| <td>covar_samp(expr1, expr2)</td> |
| <td>Returns the sample covariance of a set of number pairs.</td> |
| </tr> |
| <tr> |
| <td>every(expr)</td> |
| <td>Returns true if all values of `expr` are true.</td> |
| </tr> |
| <tr> |
| <td>first(expr[, isIgnoreNull])</td> |
| <td>Returns the first value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>first_value(expr[, isIgnoreNull])</td> |
| <td>Returns the first value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>kurtosis(expr)</td> |
| <td>Returns the kurtosis value calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>last(expr[, isIgnoreNull])</td> |
| <td>Returns the last value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>last_value(expr[, isIgnoreNull])</td> |
| <td>Returns the last value of `expr` for a group of rows. |
| If `isIgnoreNull` is true, returns only non-null values.</td> |
| </tr> |
| <tr> |
| <td>max(expr)</td> |
| <td>Returns the maximum value of `expr`.</td> |
| </tr> |
| <tr> |
| <td>max_by(x, y)</td> |
| <td>Returns the value of `x` associated with the maximum value of `y`.</td> |
| </tr> |
| <tr> |
| <td>mean(expr)</td> |
| <td>Returns the mean calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>min(expr)</td> |
| <td>Returns the minimum value of `expr`.</td> |
| </tr> |
| <tr> |
| <td>min_by(x, y)</td> |
| <td>Returns the value of `x` associated with the minimum value of `y`.</td> |
| </tr> |
| <tr> |
| <td>percentile(col, percentage [, frequency])</td> |
| <td>Returns the exact percentile value of numeric column |
| `col` at the given percentage. The value of percentage must be between 0.0 and 1.0. The |
| value of frequency should be a positive integer.</td> |
| </tr> |
| <tr> |
| <td>percentile(col, array(percentage1 [, percentage2]...) [, frequency])</td> |
| <td>Returns the exact |
| percentile value array of numeric column `col` at the given percentage(s). Each value |
| of the percentage array must be between 0.0 and 1.0. The value of frequency should be |
| a positive integer.</td> |
| </tr> |
| <tr> |
| <td>percentile_approx(col, percentage [, accuracy])</td> |
| <td>Returns the approximate percentile value of numeric |
| column `col` at the given percentage. The value of percentage must be between 0.0 |
| and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal which |
| controls approximation accuracy at the cost of memory. A higher value of `accuracy` yields |
| better accuracy; `1.0/accuracy` is the relative error of the approximation. |
| When `percentage` is an array, each value of the percentage array must be between 0.0 and 1.0. |
| In this case, returns the approximate percentile array of column `col` at the given |
| percentage array.</td> |
| </tr> |
| <tr> |
| <td>skewness(expr)</td> |
| <td>Returns the skewness value calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>some(expr)</td> |
| <td>Returns true if at least one value of `expr` is true.</td> |
| </tr> |
| <tr> |
| <td>std(expr)</td> |
| <td>Returns the sample standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>stddev(expr)</td> |
| <td>Returns the sample standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>stddev_pop(expr)</td> |
| <td>Returns the population standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>stddev_samp(expr)</td> |
| <td>Returns the sample standard deviation calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>sum(expr)</td> |
| <td>Returns the sum calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>var_pop(expr)</td> |
| <td>Returns the population variance calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>var_samp(expr)</td> |
| <td>Returns the sample variance calculated from values of a group.</td> |
| </tr> |
| <tr> |
| <td>variance(expr)</td> |
| <td>Returns the sample variance calculated from values of a group.</td> |
| </tr> |
| </tbody> |
| </table> |
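
Several rows in the table differ only in how they treat nulls, duplicates, or the divisor (sample versus population). The sketch below mirrors those distinctions in plain Python using the standard `statistics` module. It is an illustration only, not how the SQL engine computes these aggregates: the `values` list is hypothetical, `None` stands in for SQL null, and it assumes the statistical aggregates skip nulls in the same way the `count(expr)` row describes.

```python
import statistics

# Hypothetical column values; None stands in for SQL NULL.
values = [2, 4, 4, None, 6]

# Assumption: aggregates other than count(*) skip null inputs.
non_null = [v for v in values if v is not None]

count_star = len(values)             # count(*): every row, nulls included
count_expr = len(non_null)           # count(expr): rows where expr is non-null
count_distinct = len(set(non_null))  # count(DISTINCT expr): unique non-null values

var_samp = statistics.variance(non_null)   # sample variance, divides by n - 1
var_pop = statistics.pvariance(non_null)   # population variance, divides by n
stddev_samp = statistics.stdev(non_null)   # square root of the sample variance
stddev_pop = statistics.pstdev(non_null)   # square root of the population variance

print("count(*)            =", count_star)
print("count(expr)         =", count_expr)
print("count(DISTINCT expr)=", count_distinct)
print("var_samp, var_pop   =", var_samp, var_pop)
print("stddev_samp, stddev_pop =", stddev_samp, stddev_pop)
```

In this analogy `variance`, `std` and `stddev` correspond to the sample versions, matching the descriptions in the table, while the `_pop` variants use the population divisor.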