date_bin 函数date_bin 是一个标量函数,用于将时间戳规整到指定的时间区间起点,并结合 GROUP BY 子句实现降采样。
在示例数据页面中,包含了用于构建表结构和插入数据的SQL语句,下载并在IoTDB CLI中执行这些语句,即可将数据导入IoTDB,您可以使用这些数据来测试和执行示例中的SQL语句,并获得相应的结果。
示例 1:获取设备** 100 **某个时间范围的每小时平均温度
SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp FROM table1 WHERE (time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00) AND device_id = '100' GROUP BY 1;
结果:
+-----------------------------+--------+ | hour_time|avg_temp| +-----------------------------+--------+ |2024-11-29T11:00:00.000+08:00| null| |2024-11-29T18:00:00.000+08:00| 90.0| |2024-11-28T08:00:00.000+08:00| 85.0| |2024-11-28T09:00:00.000+08:00| null| |2024-11-28T10:00:00.000+08:00| 85.0| |2024-11-28T11:00:00.000+08:00| 88.0| +-----------------------------+--------+
示例 2:获取每个设备某个时间范围的每小时平均温度
SELECT date_bin(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp FROM table1 WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 GROUP BY 1, device_id;
结果:
+-----------------------------+---------+--------+ | hour_time|device_id|avg_temp| +-----------------------------+---------+--------+ |2024-11-29T11:00:00.000+08:00| 100| null| |2024-11-29T18:00:00.000+08:00| 100| 90.0| |2024-11-28T08:00:00.000+08:00| 100| 85.0| |2024-11-28T09:00:00.000+08:00| 100| null| |2024-11-28T10:00:00.000+08:00| 100| 85.0| |2024-11-28T11:00:00.000+08:00| 100| 88.0| |2024-11-29T10:00:00.000+08:00| 101| 85.0| |2024-11-27T16:00:00.000+08:00| 101| 85.0| +-----------------------------+---------+--------+
示例 3:获取所有设备某个时间范围的每小时平均温度
SELECT date_bin(1h, time) AS hour_time, avg(temperature) AS avg_temp FROM table1 WHERE time >= 2024-11-27 00:00:00 AND time <= 2024-11-30 00:00:00 group by 1;
结果:
+-----------------------------+--------+ | hour_time|avg_temp| +-----------------------------+--------+ |2024-11-29T10:00:00.000+08:00| 85.0| |2024-11-27T16:00:00.000+08:00| 85.0| |2024-11-29T11:00:00.000+08:00| null| |2024-11-29T18:00:00.000+08:00| 90.0| |2024-11-28T08:00:00.000+08:00| 85.0| |2024-11-28T09:00:00.000+08:00| null| |2024-11-28T10:00:00.000+08:00| 85.0| |2024-11-28T11:00:00.000+08:00| 88.0| +-----------------------------+--------+
date_bin_gapfill 函数date_bin_gapfill 是 date_bin 的扩展,能够填充缺失的时间区间,从而返回完整的时间序列。
date_bin_gapfill会返回空结果集date_bin_gapfill 必须与 GROUP BY 子句搭配使用,如果用在其他子句中,不会报错,但不会执行 gapfill 功能,效果与使用 date_bin 相同。GROUP BY 子句中只能使用一个 date_bin_gapfill。如果出现多个 date_bin_gapfill,会报错:multiple date_bin_gapfill calls not alloweddate_bin_gapfill 的执行顺序:GAPFILL 功能发生在 HAVING 子句执行之后,FILL 子句执行之前。date_bin_gapfill 时,****WHERE 子句中的时间过滤条件必须是以下形式之一:time >= XXX AND time <= XXXtime > XXX AND time < XXXtime BETWEEN XXX AND XXXdate_bin_gapfill 时,如果出现其他时间过滤条件,会报错。时间过滤条件与其他值过滤条件只能通过 AND 连接。示例 1:填充缺失时间区间
SELECT date_bin_gapfill(1h, time) AS hour_time, avg(temperature) AS avg_temp FROM table1 WHERE (time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00) AND device_id = '100' GROUP BY 1;
结果:
+-----------------------------+--------+ | hour_time|avg_temp| +-----------------------------+--------+ |2024-11-28T07:00:00.000+08:00| null| |2024-11-28T08:00:00.000+08:00| 85.0| |2024-11-28T09:00:00.000+08:00| null| |2024-11-28T10:00:00.000+08:00| 85.0| |2024-11-28T11:00:00.000+08:00| 88.0| |2024-11-28T12:00:00.000+08:00| null| |2024-11-28T13:00:00.000+08:00| null| |2024-11-28T14:00:00.000+08:00| null| |2024-11-28T15:00:00.000+08:00| null| |2024-11-28T16:00:00.000+08:00| null| +-----------------------------+--------+
示例 2:结合设备分组填充缺失时间区间
SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp FROM table1 WHERE time >= 2024-11-28 07:00:00 AND time <= 2024-11-28 16:00:00 GROUP BY 1, device_id;
结果:
+-----------------------------+---------+--------+ | hour_time|device_id|avg_temp| +-----------------------------+---------+--------+ |2024-11-28T07:00:00.000+08:00| 100| null| |2024-11-28T08:00:00.000+08:00| 100| 85.0| |2024-11-28T09:00:00.000+08:00| 100| null| |2024-11-28T10:00:00.000+08:00| 100| 85.0| |2024-11-28T11:00:00.000+08:00| 100| 88.0| |2024-11-28T12:00:00.000+08:00| 100| null| |2024-11-28T13:00:00.000+08:00| 100| null| |2024-11-28T14:00:00.000+08:00| 100| null| |2024-11-28T15:00:00.000+08:00| 100| null| |2024-11-28T16:00:00.000+08:00| 100| null| +-----------------------------+---------+--------+
示例 3:查询范围内没有数据返回空结果集
SELECT date_bin_gapfill(1h, time) AS hour_time, device_id, avg(temperature) AS avg_temp FROM table1 WHERE time >= 2024-11-27 09:00:00 AND time <= 2024-11-27 14:00:00 GROUP BY 1, device_id;
结果:
+---------+---------+--------+ |hour_time|device_id|avg_temp| +---------+---------+--------+ +---------+---------+--------+
DIFF 函数用于计算当前行与上一行的差值。对于第一行,由于没有前一行数据,因此永远返回 NULL。
DIFF(numberic[, boolean]) -> Double
第一个参数:数值类型
INT32、INT64、FLOAT、DOUBLE)第二个参数:布尔类型(可选)
true 或 false)。true。true:忽略 NULL 值,向前找到第一个非 NULL 值进行差值计算。如果前面没有非 NULL 值,则返回 NULL。false:不忽略 NULL 值,如果前一行为 NULL,则差值结果为 NULL。'ignoreNull'='true' 或 'ignoreNull'='false',但在表模型中,只需指定为 true 或 false。'ignoreNull'='true' 或 'ignoreNull'='false',表模型会将其视为对两个字符串常量进行等号比较,返回布尔值,但结果总是 false,等价于指定第二个参数为 false。示例 1:忽略 NULL 值
SELECT time, DIFF(temperature) AS diff_temp FROM table1 WHERE device_id = '100';
结果:
+-----------------------------+---------+ | time|diff_temp| +-----------------------------+---------+ |2024-11-29T11:00:00.000+08:00| null| |2024-11-29T18:30:00.000+08:00| null| |2024-11-28T08:00:00.000+08:00| -5.0| |2024-11-28T09:00:00.000+08:00| null| |2024-11-28T10:00:00.000+08:00| 0.0| |2024-11-28T11:00:00.000+08:00| 3.0| |2024-11-26T13:37:00.000+08:00| 2.0| |2024-11-26T13:38:00.000+08:00| 0.0| +-----------------------------+---------+
示例 2:不忽略 NULL 值
SELECT time, DIFF(temperature, false) AS diff_temp FROM table1 WHERE device_id = '100';
结果:
+-----------------------------+---------+ | time|diff_temp| +-----------------------------+---------+ |2024-11-29T11:00:00.000+08:00| null| |2024-11-29T18:30:00.000+08:00| null| |2024-11-28T08:00:00.000+08:00| -5.0| |2024-11-28T09:00:00.000+08:00| null| |2024-11-28T10:00:00.000+08:00| null| |2024-11-28T11:00:00.000+08:00| 3.0| |2024-11-26T13:37:00.000+08:00| 2.0| |2024-11-26T13:38:00.000+08:00| 0.0| +-----------------------------+---------+
示例 3:完整示例
SELECT time, temperature, DIFF(temperature) AS diff_temp_1, DIFF(temperature, false) AS diff_temp_2 FROM table1 WHERE device_id = '100';
结果:
+-----------------------------+-----------+-----------+-----------+ | time|temperature|diff_temp_1|diff_temp_2| +-----------------------------+-----------+-----------+-----------+ |2024-11-29T11:00:00.000+08:00| null| null| null| |2024-11-29T18:30:00.000+08:00| 90.0| null| null| |2024-11-28T08:00:00.000+08:00| 85.0| -5.0| -5.0| |2024-11-28T09:00:00.000+08:00| null| null| null| |2024-11-28T10:00:00.000+08:00| 85.0| 0.0| null| |2024-11-28T11:00:00.000+08:00| 88.0| 3.0| 3.0| |2024-11-26T13:37:00.000+08:00| 90.0| 2.0| 2.0| |2024-11-26T13:38:00.000+08:00| 90.0| 0.0| 0.0| +-----------------------------+-----------+-----------+-----------+
原始示例数据如下:
IoTDB> SELECT * FROM bid; +-----------------------------+--------+-----+ | time|stock_id|price| +-----------------------------+--------+-----+ |2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:06:00.000+08:00| TESL|200.0| |2021-01-01T09:07:00.000+08:00| AAPL|103.0| |2021-01-01T09:07:00.000+08:00| TESL|202.0| |2021-01-01T09:09:00.000+08:00| AAPL|102.0| |2021-01-01T09:15:00.000+08:00| TESL|195.0| +-----------------------------+--------+-----+ -- 创建语句 CREATE TABLE bid(time TIMESTAMP TIME, stock_id STRING TAG, price FLOAT FIELD); -- 插入数据 INSERT INTO bid(time, stock_id, price) VALUES('2021-01-01T09:05:00','AAPL',100.0),('2021-01-01T09:06:00','TESL',200.0),('2021-01-01T09:07:00','AAPL',103.0),('2021-01-01T09:07:00','TESL',202.0),('2021-01-01T09:09:00','AAPL',102.0),('2021-01-01T09:15:00','TESL',195.0);
HOP 函数用于按时间分段分窗分析,识别每一行数据所属的时间窗口。该函数通过指定固定窗口大小(size)和窗口滑动步长(SLIDE),将数据按时间戳分配到所有与其时间戳重叠的窗口中。若窗口之间存在重叠(步长 < 窗口大小),数据会自动复制到多个窗口。
HOP(data, timecol, size, slide[, origin])
| 参数名 | 参数类型 | 参数属性 | 描述 |
|---|---|---|---|
| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 |
| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 |
| SIZE | 标量参数 | 长整数类型 | 窗口大小 |
| SLIDE | 标量参数 | 长整数类型 | 窗口滑动步长 |
| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 |
HOP 函数的返回结果列包含:
IoTDB> SELECT * FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m); +-----------------------------+-----------------------------+-----------------------------+--------+-----+ | window_start| window_end| time|stock_id|price| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ -- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM HOP(DATA => bid,TIMECOL => 'time',SLIDE => 5m,SIZE => 10m) GROUP BY window_start, window_end, stock_id; +-----------------------------+-----------------------------+--------+------------------+ | window_start| window_end|stock_id| avg| +-----------------------------+-----------------------------+--------+------------------+ |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 201.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| |2021-01-01T09:15:00.000+08:00|2021-01-01T09:25:00.000+08:00| TESL| 195.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:15:00.000+08:00| AAPL|101.66666666666667| +-----------------------------+-----------------------------+--------+------------------+
SESSION 函数用于按会话间隔对数据进行分窗。系统逐行检查与前一行的时间间隔,小于阈值(GAP)则归入当前窗口,超过则归入下一个窗口。
SESSION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], timecol, gap)
| 参数名 | 参数类型 | 参数属性 | 描述 |
|---|---|---|---|
| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 |
| TIMECOL | 标量参数 | 字符串类型默认值:‘time’ | 时间列名 |
| | GAP | 标量参数 | 长整数类型 | 会话间隔阈值 |
SESSION 函数的返回结果列包含:
IoTDB> SELECT * FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m); +-----------------------------+-----------------------------+-----------------------------+--------+-----+ | window_start| window_end| time|stock_id|price| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ |2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| |2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| |2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ -- 结合 GROUP BY 语句,等效于树模型的 GROUP BY SESSION IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM SESSION(DATA => bid PARTITION BY stock_id ORDER BY time,TIMECOL => 'time',GAP => 2m) GROUP BY window_start, window_end, stock_id; +-----------------------------+-----------------------------+--------+------------------+ | window_start| window_end|stock_id| avg| +-----------------------------+-----------------------------+--------+------------------+ |2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL| 201.0| |2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL| 195.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|101.66666666666667| +-----------------------------+-----------------------------+--------+------------------+
VARIATION 函数用于按数据差值分窗,将第一条数据作为首个窗口的基准值,每个数据点会与基准值进行差值运算,如果差值小于给定的阈值(delta)则加入当前窗口;如果超过阈值,则分为下一个窗口,将该值作为下一个窗口的基准值。
VARIATION(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], col, delta)
| 参数名 | 参数类型 | 参数属性 | 描述 |
|---|---|---|---|
| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 |
| COL | 标量参数 | 字符串类型 | 标识对哪一列计算差值 |
| DELTA | 标量参数 | 浮点数类型 | 差值阈值 |
VARIATION 函数的返回结果列包含:
IoTDB> SELECT * FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price',DELTA => 2.0); +------------+-----------------------------+--------+-----+ |window_index| time|stock_id|price| +------------+-----------------------------+--------+-----+ | 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| | 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| | 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| | 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| | 1|2021-01-01T09:07:00.000+08:00| AAPL|103.0| | 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| +------------+-----------------------------+--------+-----+ -- 结合 GROUP BY 语句,等效于树模型的 GROUP BY VARIATION IoTDB> SELECT first(time) as window_start, last(time) as window_end, stock_id, avg(price) as avg FROM VARIATION(DATA => bid PARTITION BY stock_id ORDER BY time,COL => 'price', DELTA => 2.0) GROUP BY window_index, stock_id; +-----------------------------+-----------------------------+--------+-----+ | window_start| window_end|stock_id| avg| +-----------------------------+-----------------------------+--------+-----+ |2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| |2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:07:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.5| +-----------------------------+-----------------------------+--------+-----+
CAPACITY 函数用于按数据点数(行数)分窗,每个窗口最多有 SIZE 行数据。
CAPACITY(data [PARTITION BY(pkeys, ...)] [ORDER BY(okeys, ...)], size)
| 参数名 | 参数类型 | 参数属性 | 描述 |
|---|---|---|---|
| DATA | 表参数 | SET SEMANTICPASS THROUGH | 输入表通过 pkeys、okeys 指定分区和排序 |
| SIZE | 标量参数 | 长整数类型 | 窗口大小 |
CAPACITY 函数的返回结果列包含:
IoTDB> SELECT * FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2); +------------+-----------------------------+--------+-----+ |window_index| time|stock_id|price| +------------+-----------------------------+--------+-----+ | 0|2021-01-01T09:06:00.000+08:00| TESL|200.0| | 0|2021-01-01T09:07:00.000+08:00| TESL|202.0| | 1|2021-01-01T09:15:00.000+08:00| TESL|195.0| | 0|2021-01-01T09:05:00.000+08:00| AAPL|100.0| | 0|2021-01-01T09:07:00.000+08:00| AAPL|103.0| | 1|2021-01-01T09:09:00.000+08:00| AAPL|102.0| +------------+-----------------------------+--------+-----+ -- 结合 GROUP BY 语句,等效于树模型的 GROUP BY COUNT IoTDB> SELECT first(time) as start_time, last(time) as end_time, stock_id, avg(price) as avg FROM CAPACITY(DATA => bid PARTITION BY stock_id ORDER BY time, SIZE => 2) GROUP BY window_index, stock_id; +-----------------------------+-----------------------------+--------+-----+ | start_time| end_time|stock_id| avg| +-----------------------------+-----------------------------+--------+-----+ |2021-01-01T09:06:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|201.0| |2021-01-01T09:15:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:05:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|101.5| |2021-01-01T09:09:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| +-----------------------------+-----------------------------+--------+-----+
TUMBLE 函数用于通过时间属性字段为每行数据分配一个窗口,滚动窗口的大小固定且不重复。
TUMBLE(data, timecol, size[, origin])
| 参数名 | 参数类型 | 参数属性 | 描述 |
|---|---|---|---|
| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 |
| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 |
| SIZE | 标量参数 | 长整数类型 | 窗口大小,需为正数 |
| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 |
TUBMLE 函数的返回结果列包含:
IoTDB> SELECT * FROM TUMBLE( DATA => bid, TIMECOL => 'time', SIZE => 10m); +-----------------------------+-----------------------------+-----------------------------+--------+-----+ | window_start| window_end| time|stock_id|price| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ -- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM TUMBLE(DATA => bid, TIMECOL => 'time', SIZE => 10m) GROUP BY window_start, window_end, stock_id; +-----------------------------+-----------------------------+--------+------------------+ | window_start| window_end|stock_id| avg| +-----------------------------+-----------------------------+--------+------------------+ |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| +-----------------------------+-----------------------------+--------+------------------+
Cumulate 函数用于从初始的窗口开始,创建相同窗口开始但窗口结束步长不同的窗口,直到达到最大的窗口大小。每个窗口包含其区间内的元素。例如:1小时步长,24小时大小的累计窗口,每天可以获得如下这些窗口:[00:00, 01:00),[00:00, 02:00),[00:00, 03:00), …, [00:00, 24:00)
CUMULATE(data, timecol, size, step[, origin])
| 参数名 | 参数类型 | 参数属性 | 描述 |
|---|---|---|---|
| DATA | 表参数 | ROW SEMANTICPASS THROUGH | 输入表 |
| TIMECOL | 标量参数 | 字符串类型默认值:time | 时间列 |
| SIZE | 标量参数 | 长整数类型 | 窗口大小,SIZE必须是STEP的整数倍,需为正数 |
| STEP | 标量参数 | 长整数类型 | 窗口步长,需为正数 |
| ORIGIN | 标量参数 | 时间戳类型默认值:Unix 纪元时间 | 第一个窗口起始时间 |
注意:size 如果不是 step 的整数倍,则会报错
Cumulative table function requires size must be an integral multiple of step
CUMULATE函数的返回结果列包含:
IoTDB> SELECT * FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m,SIZE => 10m); +-----------------------------+-----------------------------+-----------------------------+--------+-----+ | window_start| window_end| time|stock_id|price| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ |2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:06:00.000+08:00| TESL|200.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| TESL|202.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00|2021-01-01T09:15:00.000+08:00| TESL|195.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:05:00.000+08:00| AAPL|100.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:07:00.000+08:00| AAPL|103.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00|2021-01-01T09:09:00.000+08:00| AAPL|102.0| +-----------------------------+-----------------------------+-----------------------------+--------+-----+ -- 结合 GROUP BY 语句,等效于树模型的 GROUP BY TIME IoTDB> SELECT window_start, window_end, stock_id, avg(price) as avg FROM CUMULATE(DATA => bid,TIMECOL => 'time',STEP => 2m, SIZE => 10m) GROUP BY window_start, window_end, stock_id; +-----------------------------+-----------------------------+--------+------------------+ | window_start| window_end|stock_id| avg| +-----------------------------+-----------------------------+--------+------------------+ |2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| TESL| 201.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| TESL| 201.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:16:00.000+08:00| TESL| 195.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:18:00.000+08:00| TESL| 195.0| |2021-01-01T09:10:00.000+08:00|2021-01-01T09:20:00.000+08:00| TESL| 195.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:06:00.000+08:00| AAPL| 100.0| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:08:00.000+08:00| AAPL| 101.5| |2021-01-01T09:00:00.000+08:00|2021-01-01T09:10:00.000+08:00| AAPL|101.66666666666667| +-----------------------------+-----------------------------+--------+------------------+
IoTDB 支持的窗口函数(Window Function) 是一种基于与当前行相关的特定行集合(称为“窗口”) 对每一行进行计算的特殊函数。它将分组操作(PARTITION BY)、排序(ORDER BY)与可定义的计算范围(窗口框架 FRAME)结合,在不折叠原始数据行的前提下实现复杂的跨行计算。常用于数据分析场景,比如排名、累计和、移动平均等操作。
注意:该功能从 V 2.0.5 版本开始提供。
例如,某场景下需要查询不同设备的功耗累加值,即可通过窗口函数来实现。
-- 原始数据 +-----------------------------+------+-----+ | time|device| flow| +-----------------------------+------+-----+ |1970-01-01T08:00:00.000+08:00| d0| 3| |1970-01-01T08:00:00.001+08:00| d0| 5| |1970-01-01T08:00:00.002+08:00| d0| 3| |1970-01-01T08:00:00.003+08:00| d0| 1| |1970-01-01T08:00:00.004+08:00| d1| 2| |1970-01-01T08:00:00.005+08:00| d1| 4| +-----------------------------+------+-----+ -- 创建表并插入数据 CREATE TABLE device_flow(device String tag, flow INT32 FIELD); insert into device_flow(time, device ,flow ) values ('1970-01-01T08:00:00.000+08:00','d0',3),('1970-01-01T08:00:01.000+08:00','d0',5),('1970-01-01T08:00:02.000+08:00','d0',3),('1970-01-01T08:00:03.000+08:00','d0',1),('1970-01-01T08:00:04.000+08:00','d1',2),('1970-01-01T08:00:05.000+08:00','d1',4); --执行窗口函数查询 SELECT *, sum(flow) OVER(PARTITION BY device ORDER BY flow) as sum FROM device_flow;
经过分组、排序、计算(步骤拆解如下图所示),
即可得到期望结果:
+-----------------------------+------+----+----+ | time|device|flow| sum| +-----------------------------+------+----+----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| |1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| |1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| |1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| |1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| |1970-01-01T08:00:01.000+08:00| d0| 5|12.0| +-----------------------------+------+----+----+
windowDefinition : name=identifier AS '(' windowSpecification ')' ; windowSpecification : (existingWindowName=identifier)? (PARTITION BY partition+=expression (',' partition+=expression)*)? (ORDER BY sortItem (',' sortItem)*)? windowFrame? ; windowFrame : frameExtent ; frameExtent : frameType=RANGE start=frameBound | frameType=ROWS start=frameBound | frameType=GROUPS start=frameBound | frameType=RANGE BETWEEN start=frameBound AND end=frameBound | frameType=ROWS BETWEEN start=frameBound AND end=frameBound | frameType=GROUPS BETWEEN start=frameBound AND end=frameBound ; frameBound : UNBOUNDED boundType=PRECEDING #unboundedFrame | UNBOUNDED boundType=FOLLOWING #unboundedFrame | CURRENT ROW #currentRowBound | expression boundType=(PRECEDING | FOLLOWING) #boundedFrame ;
PARTITION BY 用于将数据分为多个独立、不相关的「组」,窗口函数只能访问并操作其所属分组内的数据,无法访问其它分组。该子句是可选的;如果未显式指定,则默认将所有数据分到同一组。值得注意的是,与 GROUP BY 通过聚合函数将一组数据规约成一行不同,PARTITION BY 的窗口函数并不会影响组内的行数。
查询语句:
IoTDB> SELECT *, count(flow) OVER (PARTITION BY device) as count FROM device_flow;
拆解步骤:
查询结果:
+-----------------------------+------+----+-----+ | time|device|flow|count| +-----------------------------+------+----+-----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 2| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:00.000+08:00| d0| 3| 4| |1970-01-01T08:00:01.000+08:00| d0| 5| 4| |1970-01-01T08:00:02.000+08:00| d0| 3| 4| |1970-01-01T08:00:03.000+08:00| d0| 1| 4| +-----------------------------+------+----+-----+
ORDER BY 用于对 partition 内的数据进行排序。排序后,相等的行被称为 peers。peers 会影响窗口函数的行为,例如不同 rank function 对 peers 的处理不同;不同 frame 的划分方式对于 peers 的处理也不同。该子句是可选的。
查询语句:
IoTDB> SELECT *, rank() OVER (PARTITION BY device ORDER BY flow) as rank FROM device_flow;
拆解步骤:
查询结果:
+-----------------------------+------+----+----+ | time|device|flow|rank| +-----------------------------+------+----+----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 2| |1970-01-01T08:00:02.000+08:00| d0| 3| 2| |1970-01-01T08:00:01.000+08:00| d0| 5| 4| +-----------------------------+------+----+----+
对于 partition 中的每一行,窗口函数都会在相应的一组行上求值,这些行称为 Frame(即 Window Function 在每一行上的输入域)。Frame 可以手动指定,指定时涉及两个属性,具体说明如下。
其中,CURRENT ROW、PRECEDING N 和 FOLLOWING N 的含义随着 frame 种类的不同而不同,如下表所示:
ROWS | GROUPS | RANGE | |
|---|---|---|---|
CURRENT ROW | 当前行 | 由于 peer group 包含多行,因此这个选项根据作用于 frame_start 和 frame_end 而不同:* frame_start:peer group 的第一行;* frame_end:peer group 的最后一行。 | 和 GROUPS 相同,根据作用于 frame_start 和 frame_end 而不同:* frame_start:peer group 的第一行;* frame_end:peer group 的最后一行。 |
offset PRECEDING | 前 offset 行 | 前 offset 个 peer group; | 前面与当前行的值之差小于等于 offset 就分为一个 frame |
offset FOLLOWING | 后 offset 行 | 后 offset 个 peer group。 | 后面与当前行的值之差小于等于 offset 就分为一个 frame |
语法格式如下:
-- 同时指定 frame_start 和 frame_end { RANGE | ROWS | GROUPS } BETWEEN frame_start AND frame_end -- 仅指定 frame_start,frame_end 为 CURRENT ROW { RANGE | ROWS | GROUPS } frame_start
若未手动指定 Frame,Frame 的默认划分规则如下:
需要注意的是,当 Frame 类型为 GROUPS 或 RANGE 时,需要指定 ORDER BY,区别在于 GROUPS 中的 ORDER BY 可以涉及多个字段,而 RANGE 需要计算,所以只能指定一个字段。
查询语句:
IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ROWS 1 PRECEDING) as count FROM device_flow;
拆解步骤:
查询结果:
+-----------------------------+------+----+-----+ | time|device|flow|count| +-----------------------------+------+----+-----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:00.000+08:00| d0| 3| 1| |1970-01-01T08:00:01.000+08:00| d0| 5| 2| |1970-01-01T08:00:02.000+08:00| d0| 3| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 2| +-----------------------------+------+----+-----+
查询语句:
IoTDB> SELECT *, count(flow) OVER(PARTITION BY device ORDER BY flow GROUPS BETWEEN 1 PRECEDING AND CURRENT ROW) as count FROM device_flow;
拆解步骤:
查询结果:
+-----------------------------+------+----+-----+ | time|device|flow|count| +-----------------------------+------+----+-----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 3| |1970-01-01T08:00:02.000+08:00| d0| 3| 3| |1970-01-01T08:00:01.000+08:00| d0| 5| 3| +-----------------------------+------+----+-----+
查询语句:
IoTDB> SELECT *,count(flow) OVER(PARTITION BY device ORDER BY flow RANGE BETWEEN 2 PRECEDING AND CURRENT ROW) as count FROM device_flow;
拆解步骤:
查询结果:
+-----------------------------+------+----+-----+ | time|device|flow|count| +-----------------------------+------+----+-----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 3| |1970-01-01T08:00:02.000+08:00| d0| 3| 3| |1970-01-01T08:00:01.000+08:00| d0| 5| 3| +-----------------------------+------+----+-----+
所有内置聚合函数,如 sum()、avg()、min()、max() 都能当作 Window Function 使用。
注意:与 GROUP BY 不同,Window Function 中每一行都有相应的输出
示例:
IoTDB> SELECT *, sum(flow) OVER (PARTITION BY device ORDER BY flow) as sum FROM device_flow; +-----------------------------+------+----+----+ | time|device|flow| sum| +-----------------------------+------+----+----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 2.0| |1970-01-01T08:00:05.000+08:00| d1| 4| 6.0| |1970-01-01T08:00:03.000+08:00| d0| 1| 1.0| |1970-01-01T08:00:00.000+08:00| d0| 3| 7.0| |1970-01-01T08:00:02.000+08:00| d0| 3| 7.0| |1970-01-01T08:00:01.000+08:00| d0| 5|12.0| +-----------------------------+------+----+----+
first_valuefirst_value(value) [IGNORE NULLS]IoTDB> SELECT *, first_value(flow) OVER w as first_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); +-----------------------------+------+----+-----------+ | time|device|flow|first_value| +-----------------------------+------+----+-----------+ |1970-01-01T08:00:04.000+08:00| d1| 2| 2| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 1| |1970-01-01T08:00:02.000+08:00| d0| 3| 3| |1970-01-01T08:00:01.000+08:00| d0| 5| 3| +-----------------------------+------+----+-----------+
last_valuelast_value(value) [IGNORE NULLS]IoTDB> SELECT *, last_value(flow) OVER w as last_value FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); +-----------------------------+------+----+----------+ | time|device|flow|last_value| +-----------------------------+------+----+----------+ |1970-01-01T08:00:04.000+08:00| d1| 2| 4| |1970-01-01T08:00:05.000+08:00| d1| 4| 4| |1970-01-01T08:00:03.000+08:00| d0| 1| 3| |1970-01-01T08:00:00.000+08:00| d0| 3| 3| |1970-01-01T08:00:02.000+08:00| d0| 3| 5| |1970-01-01T08:00:01.000+08:00| d0| 5| 5| +-----------------------------+------+----+----------+
nth_valuenth_value(value, n) [IGNORE NULLS]IoTDB> SELECT *, nth_value(flow, 2) OVER w as nth_values FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING); +-----------------------------+------+----+----------+ | time|device|flow|nth_values| +-----------------------------+------+----+----------+ |1970-01-01T08:00:04.000+08:00| d1| 2| 4| |1970-01-01T08:00:05.000+08:00| d1| 4| 4| |1970-01-01T08:00:03.000+08:00| d0| 1| 3| |1970-01-01T08:00:00.000+08:00| d0| 3| 3| |1970-01-01T08:00:02.000+08:00| d0| 3| 3| |1970-01-01T08:00:01.000+08:00| d0| 5| 5| +-----------------------------+------+----+----------+
lead(value[, offset[, default]]) [IGNORE NULLS]IoTDB> SELECT *, lead(flow) OVER w as lead FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY time); +-----------------------------+------+----+----+ | time|device|flow|lead| +-----------------------------+------+----+----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 4| |1970-01-01T08:00:05.000+08:00| d1| 4|null| |1970-01-01T08:00:00.000+08:00| d0| 3| 5| |1970-01-01T08:00:01.000+08:00| d0| 5| 3| |1970-01-01T08:00:02.000+08:00| d0| 3| 1| |1970-01-01T08:00:03.000+08:00| d0| 1|null| +-----------------------------+------+----+----+
lag(value[, offset[, default]]) [IGNORE NULLS]IoTDB> SELECT *, lag(flow) OVER w as lag FROM device_flow WINDOW w AS(PARTITION BY device ORDER BY device); +-----------------------------+------+----+----+ | time|device|flow| lag| +-----------------------------+------+----+----+ |1970-01-01T08:00:04.000+08:00| d1| 2|null| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:00.000+08:00| d0| 3|null| |1970-01-01T08:00:01.000+08:00| d0| 5| 3| |1970-01-01T08:00:02.000+08:00| d0| 3| 5| |1970-01-01T08:00:03.000+08:00| d0| 1| 3| +-----------------------------+------+----+----+
rank()IoTDB> SELECT *, rank() OVER w as rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); +-----------------------------+------+----+----+ | time|device|flow|rank| +-----------------------------+------+----+----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 2| |1970-01-01T08:00:02.000+08:00| d0| 3| 2| |1970-01-01T08:00:01.000+08:00| d0| 5| 4| +-----------------------------+------+----+----+
dense_rank()IoTDB> SELECT *, dense_rank() OVER w as dense_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); +-----------------------------+------+----+----------+ | time|device|flow|dense_rank| +-----------------------------+------+----+----------+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 2| |1970-01-01T08:00:02.000+08:00| d0| 3| 2| |1970-01-01T08:00:01.000+08:00| d0| 5| 3| +-----------------------------+------+----+----------+
row_number()IoTDB> SELECT *, row_number() OVER w as row_number FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); +-----------------------------+------+----+----------+ | time|device|flow|row_number| +-----------------------------+------+----+----------+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 2| |1970-01-01T08:00:02.000+08:00| d0| 3| 3| |1970-01-01T08:00:01.000+08:00| d0| 5| 4| +-----------------------------+------+----+----------+
percent_rank()IoTDB> SELECT *, percent_rank() OVER w as percent_rank FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); +-----------------------------+------+----+------------------+ | time|device|flow| percent_rank| +-----------------------------+------+----+------------------+ |1970-01-01T08:00:04.000+08:00| d1| 2| 0.0| |1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| |1970-01-01T08:00:03.000+08:00| d0| 1| 0.0| |1970-01-01T08:00:00.000+08:00| d0| 3|0.3333333333333333| |1970-01-01T08:00:02.000+08:00| d0| 3|0.3333333333333333| |1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| +-----------------------------+------+----+------------------+
IoTDB> SELECT *, cume_dist() OVER w as cume_dist FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); +-----------------------------+------+----+---------+ | time|device|flow|cume_dist| +-----------------------------+------+----+---------+ |1970-01-01T08:00:04.000+08:00| d1| 2| 0.5| |1970-01-01T08:00:05.000+08:00| d1| 4| 1.0| |1970-01-01T08:00:03.000+08:00| d0| 1| 0.25| |1970-01-01T08:00:00.000+08:00| d0| 3| 0.75| |1970-01-01T08:00:02.000+08:00| d0| 3| 0.75| |1970-01-01T08:00:01.000+08:00| d0| 5| 1.0| +-----------------------------+------+----+---------+
IoTDB> SELECT *, ntile(2) OVER w as ntile FROM device_flow WINDOW w AS (PARTITION BY device ORDER BY flow); +-----------------------------+------+----+-----+ | time|device|flow|ntile| +-----------------------------+------+----+-----+ |1970-01-01T08:00:04.000+08:00| d1| 2| 1| |1970-01-01T08:00:05.000+08:00| d1| 4| 2| |1970-01-01T08:00:03.000+08:00| d0| 1| 1| |1970-01-01T08:00:00.000+08:00| d0| 3| 1| |1970-01-01T08:00:02.000+08:00| d0| 3| 2| |1970-01-01T08:00:01.000+08:00| d0| 5| 2| +-----------------------------+------+----+-----+
对于每个设备的每一行,与前一行求差值:
SELECT *, measurement - lag(measurement) OVER (PARTITION BY device ORDER BY time) FROM data WHERE timeCondition;
对于每个设备的每一行,与后一行求差值:
SELECT *, measurement - lead(measurement) OVER (PARTITION BY device ORDER BY time) FROM data WHERE timeCondition;
对于单个设备的每一行,与前一行求差值(后一行同理):
SELECT *, measurement - lag(measurement) OVER (ORDER BY time) FROM data where device='d1' WHERE timeCondition;
利用 rank 获取序号,然后在外部的查询中保留想要的顺序。
(注意, window function 的执行顺序在 HAVING 子句之后,所以这里需要子查询)
SELECT * FROM( SELECT *, rank() OVER (PARTITION BY device ORDER BY time DESC) FROM data WHERE timeCondition ) WHERE rank <= 3;
除了按照时间排序之外,还可以按照测点的值进行排序:
SELECT * FROM( SELECT *, rank() OVER (PARTITION BY device ORDER BY measurement DESC) FROM data WHERE timeCondition ) WHERE rank <= 3;
这个 sql 用来去除输入序列中连续相同值,可以用 lead + 子查询实现:
SELECT time, device, measurement FROM( SELECT time, device, measurement, LEAD(measurement) OVER (PARTITION BY device ORDER BY time) AS next FROM data WHERE timeCondition ) WHERE measurement != next OR next IS NULL;