The QUANTILE_UNION function is used to merge multiple intermediate results of quantile calculations. This function is usually used together with QUANTILE_STATE, especially suitable for scenarios requiring multi-stage quantile calculation.
QUANTILE_UNION(<query_state>)
| Parameter | Description |
|---|---|
<query_state> | The data to be aggregated, type QuantileState supported. |
Returns an aggregation state for further quantile calculation, type QuantileState. Returns NULL if there is no valid data in the group.
-- setup CREATE TABLE response_times ( request_id INT, response_time DOUBLE, region STRING ) DUPLICATE KEY(request_id) DISTRIBUTED BY HASH(request_id) BUCKETS AUTO PROPERTIES ( "replication_allocation" = "tag.location.default: 1" ); INSERT INTO response_times VALUES (1, 10.5, 'east'), (2, 15.2, 'east'), (3, 20.1, 'west'), (4, 25.8, 'east'), (5, 30.3, 'west'), (6, 35.7, 'east'), (7, 40.2, 'west'), (8, 45.9, 'east'), (9, 50.4, 'west'), (10, 100.6, 'east');
SELECT region, QUANTILE_PERCENT( QUANTILE_UNION( TO_QUANTILE_STATE(response_time, 2048) ), 0.5 ) AS median_response_time FROM response_times GROUP BY region;
Calculate the 50th percentile of response times by region.
+--------+----------------------+ | region | median_response_time | +--------+----------------------+ | west | 35.25 | | east | 30.75 | +--------+----------------------+
SELECT QUANTILE_UNION(TO_QUANTILE_STATE(response_time, 2048)) FROM response_times where response_time is null;
Returns NULL if there is no valid data in the group.
+--------------------------------------------------------+ | QUANTILE_UNION(TO_QUANTILE_STATE(response_time, 2048)) | +--------------------------------------------------------+ | NULL | +--------------------------------------------------------+