<table class="table">
<thead>
<tr>
<th style="width:25%">Function</th>
<th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>any(expr)</td>
<td>Returns true if at least one value of `expr` is true.</td>
</tr>
<tr>
<td>approx_count_distinct(expr[, relativeSD])</td>
<td>Returns the estimated number of distinct values, computed with HyperLogLog++.
`relativeSD` defines the maximum relative standard deviation allowed.</td>
</tr>
<tr>
<td>approx_percentile(col, percentage [, accuracy])</td>
<td>Returns the approximate percentile value of numeric
column `col` at the given percentage. The value of `percentage` must be between 0.0
and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal that
controls approximation accuracy at the cost of memory. A higher value of `accuracy` yields
better accuracy; `1.0/accuracy` is the relative error of the approximation.
When `percentage` is an array, each value in the array must be between 0.0 and 1.0.
In this case, returns an array of the approximate percentiles of column `col` at the given
percentages.</td>
</tr>
<tr>
<td>avg(expr)</td>
<td>Returns the mean calculated from values of a group.</td>
</tr>
<tr>
<td>bit_or(expr)</td>
<td>Returns the bitwise OR of all non-null input values, or null if none.</td>
</tr>
<tr>
<td>bit_xor(expr)</td>
<td>Returns the bitwise XOR of all non-null input values, or null if none.</td>
</tr>
<tr>
<td>bool_and(expr)</td>
<td>Returns true if all values of `expr` are true.</td>
</tr>
<tr>
<td>bool_or(expr)</td>
<td>Returns true if at least one value of `expr` is true.</td>
</tr>
<tr>
<td>collect_list(expr)</td>
<td>Collects and returns a list of elements, keeping duplicates.</td>
</tr>
<tr>
<td>collect_set(expr)</td>
<td>Collects and returns a set of unique elements.</td>
</tr>
<tr>
<td>corr(expr1, expr2)</td>
<td>Returns the Pearson correlation coefficient for a set of number pairs.</td>
</tr>
<tr>
<td>count(*)</td>
<td>Returns the total number of retrieved rows, including rows containing null.</td>
</tr>
<tr>
<td>count(expr[, expr...])</td>
<td>Returns the number of rows for which the supplied expression(s) are all non-null.</td>
</tr>
<tr>
<td>count(DISTINCT expr[, expr...])</td>
<td>Returns the number of rows for which the supplied expression(s) are unique and non-null.</td>
</tr>
<tr>
<td>count_if(expr)</td>
<td>Returns the number of `TRUE` values for the expression.</td>
</tr>
<tr>
<td>count_min_sketch(col, eps, confidence, seed)</td>
<td>Returns a count-min sketch of a column with the given `eps`,
`confidence` and `seed`. The result is an array of bytes, which can be deserialized to a
`CountMinSketch` before usage. A count-min sketch is a probabilistic data structure used for
frequency estimation in sub-linear space.</td>
</tr>
<tr>
<td>covar_pop(expr1, expr2)</td>
<td>Returns the population covariance of a set of number pairs.</td>
</tr>
<tr>
<td>covar_samp(expr1, expr2)</td>
<td>Returns the sample covariance of a set of number pairs.</td>
</tr>
<tr>
<td>every(expr)</td>
<td>Returns true if all values of `expr` are true.</td>
</tr>
<tr>
<td>first(expr[, isIgnoreNull])</td>
<td>Returns the first value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values.</td>
</tr>
<tr>
<td>first_value(expr[, isIgnoreNull])</td>
<td>Returns the first value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values.</td>
</tr>
<tr>
<td>kurtosis(expr)</td>
<td>Returns the kurtosis value calculated from values of a group.</td>
</tr>
<tr>
<td>last(expr[, isIgnoreNull])</td>
<td>Returns the last value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values.</td>
</tr>
<tr>
<td>last_value(expr[, isIgnoreNull])</td>
<td>Returns the last value of `expr` for a group of rows.
If `isIgnoreNull` is true, returns only non-null values.</td>
</tr>
<tr>
<td>max(expr)</td>
<td>Returns the maximum value of `expr`.</td>
</tr>
<tr>
<td>max_by(x, y)</td>
<td>Returns the value of `x` associated with the maximum value of `y`.</td>
</tr>
<tr>
<td>mean(expr)</td>
<td>Returns the mean calculated from values of a group.</td>
</tr>
<tr>
<td>min(expr)</td>
<td>Returns the minimum value of `expr`.</td>
</tr>
<tr>
<td>min_by(x, y)</td>
<td>Returns the value of `x` associated with the minimum value of `y`.</td>
</tr>
<tr>
<td>percentile(col, percentage [, frequency])</td>
<td>Returns the exact percentile value of numeric column
`col` at the given percentage. The value of `percentage` must be between 0.0 and 1.0. The
value of `frequency` should be a positive integer.</td>
</tr>
<tr>
<td>percentile(col, array(percentage1 [, percentage2]...) [, frequency])</td>
<td>Returns an array of the exact
percentile values of numeric column `col` at the given percentage(s). Each value
in the percentage array must be between 0.0 and 1.0. The value of `frequency` should be
a positive integer.</td>
</tr>
<tr>
<td>percentile_approx(col, percentage [, accuracy])</td>
<td>Returns the approximate percentile value of numeric
column `col` at the given percentage. The value of `percentage` must be between 0.0
and 1.0. The `accuracy` parameter (default: 10000) is a positive numeric literal that
controls approximation accuracy at the cost of memory. A higher value of `accuracy` yields
better accuracy; `1.0/accuracy` is the relative error of the approximation.
When `percentage` is an array, each value in the array must be between 0.0 and 1.0.
In this case, returns an array of the approximate percentiles of column `col` at the given
percentages.</td>
</tr>
<tr>
<td>skewness(expr)</td>
<td>Returns the skewness value calculated from values of a group.</td>
</tr>
<tr>
<td>some(expr)</td>
<td>Returns true if at least one value of `expr` is true.</td>
</tr>
<tr>
<td>std(expr)</td>
<td>Returns the sample standard deviation calculated from values of a group.</td>
</tr>
<tr>
<td>stddev(expr)</td>
<td>Returns the sample standard deviation calculated from values of a group.</td>
</tr>
<tr>
<td>stddev_pop(expr)</td>
<td>Returns the population standard deviation calculated from values of a group.</td>
</tr>
<tr>
<td>stddev_samp(expr)</td>
<td>Returns the sample standard deviation calculated from values of a group.</td>
</tr>
<tr>
<td>sum(expr)</td>
<td>Returns the sum calculated from values of a group.</td>
</tr>
<tr>
<td>var_pop(expr)</td>
<td>Returns the population variance calculated from values of a group.</td>
</tr>
<tr>
<td>var_samp(expr)</td>
<td>Returns the sample variance calculated from values of a group.</td>
</tr>
<tr>
<td>variance(expr)</td>
<td>Returns the sample variance calculated from values of a group.</td>
</tr>
</tbody>
</table>
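
Several rows in the table differ only in how they treat nulls or in which denominator they use. The following is a minimal sketch in plain Python, not Spark code, that mimics the `count` variants and the sample versus population statistics on one group. The `values` list is hypothetical, `None` stands in for SQL null, and the sketch assumes that nulls are skipped before the variance and standard deviation are computed.

```python
import statistics

# Hypothetical values of one column within a single group.
# None stands in for SQL NULL (an assumption of this sketch).
values = [2, 4, 4, None, 6]

# Assumption: nulls are dropped before the statistics below are computed.
non_null = [v for v in values if v is not None]

# The three count variants from the table.
count_star = len(values)             # count(*): every row, nulls included
count_expr = len(non_null)           # count(expr): rows where expr is non-null
count_distinct = len(set(non_null))  # count(DISTINCT expr): unique non-null values

# Sample statistics divide by (n - 1); population statistics divide by n.
var_samp = statistics.variance(non_null)    # like var_samp / variance
var_pop = statistics.pvariance(non_null)    # like var_pop
stddev_samp = statistics.stdev(non_null)    # like stddev_samp / stddev / std
stddev_pop = statistics.pstdev(non_null)    # like stddev_pop

print("counts:", count_star, count_expr, count_distinct)
print("variance (sample, population):", var_samp, var_pop)
print("stddev (sample, population):", stddev_samp, stddev_pop)
```

On this input the sample variance is larger than the population variance because the same sum of squared deviations is divided by 3 rather than 4.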