blob: af13d71287e54818339d25e58c4a7f34978c0079 [file] [log] [blame] [view]
---
{
"title": "PERCENTILE",
"language": "en"
}
---
## Description
Calculates the exact percentile, suitable for small datasets. First sorts the specified column in descending order, then takes the exact p-th percentile. The value of `p` is between `0` and `1`. If `p` does not point to an exact position, it returns the [linear interpolation](https://en.wikipedia.org/wiki/Linear_interpolation) of the adjacent values at position `p`. Note that this is not the average of the two numbers. Special cases:
## Syntax
```sql
PERCENTILE(<col>, <p>)
```
## Parameters
| Parameter | Description |
| -- | -- |
| `<col>` | The column to calculate the exact percentile for. Supported types: Double, Float, LargeInt, BigInt, Int, SmallInt, TinyInt. |
| `<p>` | The exact percentile to be calculated, a constant value. Supported type: Double. Range: `[0.0, 1.0]`. The second parameter must be a constant. |
## Return Value
Return the exact percentile of the specified column, with a return type of Double.
If there is no valid data in the group, returns NULL.
## Examples
```sql
-- Setup
CREATE TABLE sales_data
(
product_id INT,
sale_price DECIMAL(10, 2)
) DUPLICATE KEY(`product_id`)
DISTRIBUTED BY HASH(`product_id`) BUCKETS AUTO
PROPERTIES (
"replication_allocation" = "tag.location.default: 1"
);
INSERT INTO sales_data VALUES
(1, 10.00),
(1, 15.00),
(1, 20.00),
(1, 25.00),
(1, 30.00),
(1, 35.00),
(1, 40.00),
(1, 45.00),
(1, 50.00),
(1, 100.00);
```
```sql
SELECT
percentile(sale_price, 0.5) as median_price, -- Median
percentile(sale_price, 0.75) as p75_price, -- 75th percentile
percentile(sale_price, 0.90) as p90_price, -- 90th percentile
percentile(sale_price, 0.95) as p95_price, -- 95th percentile
percentile(null, 0.99) as p99_null -- Null value at 99th percentile
FROM sales_data;
```
Calculates sale prices at different percentiles.
```text
+--------------+-----------+-------------------+-------------------+----------+
| median_price | p75_price | p90_price | p95_price | p99_null |
+--------------+-----------+-------------------+-------------------+----------+
| 32.5 | 43.75 | 54.99999999999998 | 77.49999999999994 | NULL |
+--------------+-----------+-------------------+-------------------+----------+
```
```sql
select percentile(if(sale_price>90,sale_price,NULL), 0.5) from sales_data;
```
Only non-NULL input values are considered in the calculation.
```text
+----------------------------------------------------+
| percentile(if(sale_price>90,sale_price,NULL), 0.5) |
+----------------------------------------------------+
| 100 |
+----------------------------------------------------+
```
```sql
select percentile(sale_price, NULL) from sales_data;
```
If all input values are NULL, returns NULL.
```text
+------------------------------+
| percentile(sale_price, NULL) |
+------------------------------+
| NULL |
+------------------------------+
```