blob: 0da3707f478ca8fbc6959f511547fa358025098d [file] [view]
---
{
"title": "BITMAP_UNION",
"language": "en",
"description": "Calculate the union of input Bitmaps and return a new bitmap"
}
---
## Description
Calculate the union of input Bitmaps and return a new bitmap
## Syntax
```sql
BITMAP_UNION(<expr>)
```
## Parameters
| Parameter | Description |
| -- | -- |
| `<expr>` | Supported data types of BITMAP |
## Return Value
The data type of the return value is BITMAP.
## Example
```sql
select dt,page,bitmap_to_string(user_id) from pv_bitmap;
```
```text
+------+------+---------------------------+
| dt | page | bitmap_to_string(user_id) |
+------+------+---------------------------+
| 1 | 100 | 100,200,300 |
| 2 | 200 | 300 |
+------+------+---------------------------+
```
Calculate the deduplication value of user_id:
```
select bitmap_count(bitmap_union(user_id)) from pv_bitmap;
```
```text
+-------------------------------------+
| bitmap_count(bitmap_union(user_id)) |
+-------------------------------------+
| 3 |
+-------------------------------------+
```
### Create table
The aggregation model needs to be used when creating the table. The data type is bitmap and the aggregation function is bitmap_union.
```
CREATE TABLE `pv_bitmap` (
`dt` int (11) NULL COMMENT" ",
`page` varchar (10) NULL COMMENT" ",
`user_id` bitmap BITMAP_UNION NULL COMMENT" "
) ENGINE = OLAP
AGGREGATE KEY (`dt`,` page`)
COMMENT "OLAP"
DISTRIBUTED BY HASH (`dt`) BUCKETS 2;
```
Note: When the amount of data is large, it is best to create a corresponding rollup table for high-frequency bitmap_union queries
```
ALTER TABLE pv_bitmap ADD ROLLUP pv (page, user_id);
```
### Data Load
`TO_BITMAP (expr)`: Convert 0 ~ 18446744073709551615 unsigned bigint to bitmap
`BITMAP_EMPTY ()`: Generate empty bitmap columns, used for insert or import to fill the default value
`BITMAP_HASH (expr)` or `BITMAP_HASH64 (expr)`: Convert any type of column to a bitmap by hashing
#### Stream Load
```
cat data | curl --location-trusted -u user: passwd -T--H "columns: dt, page, user_id, user_id = to_bitmap (user_id)" http: // host: 8410 / api / test / testDb / _stream_load
```
```
cat data | curl --location-trusted -u user: passwd -T--H "columns: dt, page, user_id, user_id = bitmap_hash (user_id)" http: // host: 8410 / api / test / testDb / _stream_load
```
```
cat data | curl --location-trusted -u user: passwd -T--H "columns: dt, page, user_id, user_id = bitmap_empty ()" http: // host: 8410 / api / test / testDb / _stream_load
```
#### Insert Into
id2's column type is bitmap
```
insert into bitmap_table1 select id, id2 from bitmap_table2;
```
id2's column type is bitmap
```
INSERT INTO bitmap_table1 (id, id2) VALUES (1001, to_bitmap (1000)), (1001, to_bitmap (2000));
```
id2's column type is bitmap
```
insert into bitmap_table1 select id, bitmap_union (id2) from bitmap_table2 group by id;
```
id2's column type is int
```
insert into bitmap_table1 select id, to_bitmap (id2) from table;
```
id2's column type is String
```
insert into bitmap_table1 select id, bitmap_hash (id_string) from table;
```