| <!-- |
| |
| Licensed to the Apache Software Foundation (ASF) under one |
| or more contributor license agreements. See the NOTICE file |
| distributed with this work for additional information |
| regarding copyright ownership. The ASF licenses this file |
| to you under the Apache License, Version 2.0 (the |
| "License"); you may not use this file except in compliance |
| with the License. You may obtain a copy of the License at |
| |
| http://www.apache.org/licenses/LICENSE-2.0 |
| |
| Unless required by applicable law or agreed to in writing, |
| software distributed under the License is distributed on an |
| "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY |
| KIND, either express or implied. See the License for the |
| specific language governing permissions and limitations |
| under the License. |
| |
| --> |
| |
| # 选择表达式 |
| |
| ## 语法定义 |
| |
| 选择表达式(`selectExpr`)是 SELECT 子句的组成单元,每个 `selectExpr` 对应查询结果集中的一列,其语法定义如下: |
| |
| ```sql |
| selectClause |
| : SELECT resultColumn (',' resultColumn)* |
| ; |
| |
| resultColumn |
| : selectExpr (AS alias)? |
| ; |
| |
| selectExpr |
| : '(' selectExpr ')' |
| | '-' selectExpr |
| | '!' selectExpr |
| | selectExpr ('*' | '/' | '%') selectExpr |
| | selectExpr ('+' | '-') selectExpr |
| | selectExpr ('>' | '>=' | '<' | '<=' | '==' | '!=') selectExpr |
| | selectExpr (AND | OR) selectExpr |
| | functionName '(' selectExpr (',' selectExpr)* functionAttribute* ')' |
| | timeSeriesSuffixPath |
| | number |
| ; |
| ``` |
| |
| 由该语法定义可知,`selectExpr` 可以包含: |
| |
| - 时间序列路径后缀 |
| - 函数 |
| - 内置聚合函数,详见 [聚合查询](./Aggregate-Query.md) 。 |
| - 时间序列生成函数 |
| - 用户自定义函数,详见 [UDF](../Process-Data/UDF-User-Defined-Function.md) 。 |
| - 表达式 |
| - 算数运算表达式 |
| - 逻辑运算表达式 |
| - 时间序列查询嵌套表达式 |
| - 聚合查询嵌套表达式 |
| - 数字常量(仅用于表达式) |
| |
| ## 算数运算查询 |
| |
| > 请注意,对齐时间序列和非对齐时间序列在当前版本中没有任何区别,均支持算数运算查询。 |
| |
| ### 运算符 |
| |
| #### 一元算数运算符 |
| |
| 支持的运算符:`+`, `-` |
| |
| 输入数据类型要求:`INT32`, `INT64`, `FLOAT`, `DOUBLE` |
| |
| 输出数据类型:与输入数据类型一致 |
| |
| #### 二元算数运算符 |
| |
| 支持的运算符:`+`, `-`, `*`, `/`, `%` |
| |
| 输入数据类型要求:`INT32`, `INT64`, `FLOAT`和`DOUBLE` |
| |
| 输出数据类型:`DOUBLE` |
| |
| 注意:当某个时间戳下左操作数和右操作数都不为空(`null`)时,二元运算操作才会有输出结果 |
| |
| ### 使用示例 |
| |
| 例如: |
| |
| ```sql |
| select s1, - s1, s2, + s2, s1 + s2, s1 - s2, s1 * s2, s1 / s2, s1 % s2 from root.sg.d1 |
| ``` |
| |
| 结果: |
| |
| ``` |
| +-----------------------------+-------------+--------------+-------------+-------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+ |
| | Time|root.sg.d1.s1|-root.sg.d1.s1|root.sg.d1.s2|root.sg.d1.s2|root.sg.d1.s1 + root.sg.d1.s2|root.sg.d1.s1 - root.sg.d1.s2|root.sg.d1.s1 * root.sg.d1.s2|root.sg.d1.s1 / root.sg.d1.s2|root.sg.d1.s1 % root.sg.d1.s2| |
| +-----------------------------+-------------+--------------+-------------+-------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 1.0| -1.0| 1.0| 1.0| 2.0| 0.0| 1.0| 1.0| 0.0| |
| |1970-01-01T08:00:00.002+08:00| 2.0| -2.0| 2.0| 2.0| 4.0| 0.0| 4.0| 1.0| 0.0| |
| |1970-01-01T08:00:00.003+08:00| 3.0| -3.0| 3.0| 3.0| 6.0| 0.0| 9.0| 1.0| 0.0| |
| |1970-01-01T08:00:00.004+08:00| 4.0| -4.0| 4.0| 4.0| 8.0| 0.0| 16.0| 1.0| 0.0| |
| |1970-01-01T08:00:00.005+08:00| 5.0| -5.0| 5.0| 5.0| 10.0| 0.0| 25.0| 1.0| 0.0| |
| +-----------------------------+-------------+--------------+-------------+-------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+-----------------------------+ |
| Total line number = 5 |
| It costs 0.014s |
| ``` |
| |
| ## 逻辑运算查询 |
| ### 运算符 |
| #### 一元逻辑运算符 |
| 支持运算符 `!` |
| |
| 输入数据类型:`BOOLEAN` |
| |
| 输出数据类型:`BOOLEAN` |
| |
| 注意:`!`的优先级很高,记得使用括号调整优先级 |
| |
| #### 二元比较运算符 |
| |
| 支持运算符 `>`, `>=`, `<`, `<=`, `==`, `!=` |
| |
| 输入数据类型: `INT32`, `INT64`, `FLOAT`, `DOUBLE` |
| |
| 注意:会将所有数据转换为`DOUBLE`类型后进行比较。`==`和`!=`可以直接比较两个`BOOLEAN` |
| |
| 返回类型:`BOOLEAN` |
| |
| #### 二元逻辑运算符 |
| |
| 支持运算符 AND:`and`,`&`, `&&`; OR:`or`,`|`,`||` |
| |
| 输入数据类型:`BOOLEAN` |
| |
| 返回类型 `BOOLEAN` |
| |
| 注意:当某个时间戳下左操作数和右操作数都为`BOOLEAN`类型时,二元逻辑操作才会有输出结果 |
| |
| #### IN 运算符 |
| |
| 支持运算符 `IN` |
| |
| 输入数据类型:`All Types` |
| |
| 返回类型 `BOOLEAN` |
| |
| 注意: 请确保集合中的值串可以被转为操作数的类型 |
| > 例如: |
| > |
| >`s1 in (1, 2, 3, 'test')`,`s1`的数据类型是`INT32` |
| > |
| > 我们将会抛出异常,因为`'test'`不能被转为`INT32`类型 |
| |
| #### 字符串匹配运算符 |
| |
| 支持运算符 `LIKE`, `REGEXP` |
| |
| 输入数据类型:`TEXT` |
| |
| 返回类型:`BOOLEAN` |
| |
| ### 使用示例 |
| 输入1: |
| ```sql |
| select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; |
| ``` |
| 输出1: |
| ``` |
| IoTDB> select a, b, a > 10, a <= b, !(a <= b), a > 10 && a > b from root.test; |
| +-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+ |
| | Time|root.test.a|root.test.b|root.test.a > 10|root.test.a <= root.test.b|!root.test.a <= root.test.b|(root.test.a > 10) & (root.test.a > root.test.b)| |
| +-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 23| 10.0| true| false| true| true| |
| |1970-01-01T08:00:00.002+08:00| 33| 21.0| true| false| true| true| |
| |1970-01-01T08:00:00.004+08:00| 13| 15.0| true| true| false| false| |
| |1970-01-01T08:00:00.005+08:00| 26| 0.0| true| false| true| true| |
| |1970-01-01T08:00:00.008+08:00| 1| 22.0| false| true| false| false| |
| |1970-01-01T08:00:00.010+08:00| 23| 12.0| true| false| true| true| |
| +-----------------------------+-----------+-----------+----------------+--------------------------+---------------------------+------------------------------------------------+ |
| ``` |
| |
| 输入2: |
| ```sql |
| select a, b, a in (1, 2), b like '1%', b regexp '[0-2]' from root.test; |
| ``` |
| |
| 输出2: |
| ``` |
| +-----------------------------+-----------+-----------+--------------------+-------------------------+--------------------------+ |
| | Time|root.test.a|root.test.b|root.test.a IN (1,2)|root.test.b LIKE '^1.*?$'|root.test.b REGEXP '[0-2]'| |
| +-----------------------------+-----------+-----------+--------------------+-------------------------+--------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 1| 111test111| true| true| true| |
| |1970-01-01T08:00:00.003+08:00| 3| 333test333| false| false| false| |
| +-----------------------------+-----------+-----------+--------------------+-------------------------+--------------------------+ |
| ``` |
| |
| ## 运算符优先级 |
| |
| |优先级 |运算符 |含义 | |
| |-------|-------|-----------| |
| |1 |`-` |单目运算符负号 | |
| |1 |`+` |单目运算符正号 | |
| |1 |`!` |单目运算符取非 | |
| |2 |`*` |双目运算符乘 | |
| |2 |`/` |双目运算符除 | |
| |2 |`%` |双目运算符取余| |
| |3 |`+` |双目运算符加| |
| |3 |`-` |双目运算符减| |
| |4 |`>` |双目比较运算符大于| |
| |4 |`>=` |双目比较运算符大于等于| |
| |4 |`<` |双目比较运算符小于| |
| |4 |`<=` |双目比较运算符小于等于| |
| |4 |`==` |双目比较运算符等于| |
| |4 |`!=`/`<>` |双目比较运算符不等于| |
| |5 |`REGEXP` |`REGEXP`运算符| |
| |5 |`LIKE` |`LIKE`运算符| |
| |6 |`IN` |`IN`运算符| |
| |7 |`and`/`&`/`&&`|双目逻辑运算符与| |
| |8 |`or`/ | / |||双目逻辑运算符或| |
| <!--- |即管道符 转义不能用在``里, 表格内不允许使用管道符 --> |
| |
| ## 内置时间序列生成函数 |
| |
| 时间序列生成函数可接受若干原始时间序列作为输入,产生一列时间序列输出。与聚合函数不同的是,时间序列生成函数的结果集带有时间戳列。 |
| |
| 所有的时间序列生成函数都可以接受 * 作为输入,都可以与原始查询混合进行。 |
| |
| > 请注意,目前对齐时间序列(Aligned Timeseries)尚不支持内置函数查询。使用内置函数时,如果自变量包含对齐时间序列,会提示错误。 |
| |
| ### 数学函数 |
| |
| 目前 IoTDB 支持下列数学函数,这些数学函数的行为与这些函数在 Java Math 标准库中对应实现的行为一致。 |
| |
| | 函数名 | 输入序列类型 | 输出序列类型 | Java 标准库中的对应实现 | |
| | ------- | ------------------------------ | ------------------------ | ------------------------------------------------------------ | |
| | SIN | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#sin(double) | |
| | COS | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#cos(double) | |
| | TAN | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#tan(double) | |
| | ASIN | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#asin(double) | |
| | ACOS | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#acos(double) | |
| | ATAN | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#atan(double) | |
| | SINH | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#sinh(double) | |
| | COSH | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#cosh(double) | |
| | TANH | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#tanh(double) | |
| | DEGREES | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#toDegrees(double) | |
| | RADIANS | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#toRadians(double) | |
| | ABS | INT32 / INT64 / FLOAT / DOUBLE | 与输入序列的实际类型一致 | Math#abs(int) / Math#abs(long) /Math#abs(float) /Math#abs(double) | |
| | SIGN | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#signum(double) | |
| | CEIL | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#ceil(double) | |
| | FLOOR | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#floor(double) | |
| | ROUND | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#rint(double) | |
| | EXP | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#exp(double) | |
| | LN | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#log(double) | |
| | LOG10 | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#log10(double) | |
| | SQRT | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | Math#sqrt(double) | |
| |
| 例如: |
| |
| ``` sql |
| select s1, sin(s1), cos(s1), tan(s1) from root.sg1.d1 limit 5 offset 1000; |
| ``` |
| |
| 结果: |
| |
| ``` |
| +-----------------------------+-------------------+-------------------+--------------------+-------------------+ |
| | Time| root.sg1.d1.s1|sin(root.sg1.d1.s1)| cos(root.sg1.d1.s1)|tan(root.sg1.d1.s1)| |
| +-----------------------------+-------------------+-------------------+--------------------+-------------------+ |
| |2020-12-10T17:11:49.037+08:00|7360723084922759782| 0.8133527237573284| 0.5817708713544664| 1.3980636773094157| |
| |2020-12-10T17:11:49.038+08:00|4377791063319964531|-0.8938962705202537| 0.4482738644511651| -1.994085181866842| |
| |2020-12-10T17:11:49.039+08:00|7972485567734642915| 0.9627757585308978|-0.27030138509681073|-3.5618602479083545| |
| |2020-12-10T17:11:49.040+08:00|2508858212791964081|-0.6073417341629443| -0.7944406950452296| 0.7644897069734913| |
| |2020-12-10T17:11:49.041+08:00|2817297431185141819|-0.8419358900502509| -0.5395775727782725| 1.5603611649667768| |
| +-----------------------------+-------------------+-------------------+--------------------+-------------------+ |
| Total line number = 5 |
| It costs 0.008s |
| ``` |
| |
| ### 字符串函数 |
| |
| 目前 IoTDB 支持下列字符串处理函数: |
| |
| | 函数名 | 输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能描述 | |
| | --------------- | ------------ | ------------------------------------ | ------------ | ----------------------------------------- | |
| | STRING_CONTAINS | TEXT | `s`: 待搜寻的字符串 | BOOLEAN | 判断字符串中是否存在`s` | |
| | STRING_MATCHES | TEXT | `regex`: Java 标准库风格的正则表达式 | BOOLEAN | 判断字符串是否能够被正则表达式`regex`匹配 | |
| |
| 例如: |
| |
| ``` sql |
| select s1, string_contains(s1, 's'='warn'), string_matches(s1, 'regex'='[^\\s]+37229') from root.sg1.d4; |
| ``` |
| |
| 结果: |
| |
| ``` |
| +-----------------------------+--------------+-------------------------------------------+------------------------------------------------------+ |
| | Time|root.sg1.d4.s1|string_contains(root.sg1.d4.s1, "s"="warn")|string_matches(root.sg1.d4.s1, "regex"="[^\\s]+37229")| |
| +-----------------------------+--------------+-------------------------------------------+------------------------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| warn:-8721| true| false| |
| |1970-01-01T08:00:00.002+08:00| error:-37229| false| true| |
| |1970-01-01T08:00:00.003+08:00| warn:1731| true| false| |
| +-----------------------------+--------------+-------------------------------------------+------------------------------------------------------+ |
| Total line number = 3 |
| It costs 0.007s |
| ``` |
| |
| ### 选择函数 |
| |
| 目前 IoTDB 支持如下选择函数: |
| |
| | 函数名 | 输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能描述 | |
| | -------- | ------------------------------------- | ------------------------------------------------- | ------------------------ | ------------------------------------------------------------ | |
| | TOP_K | INT32 / INT64 / FLOAT / DOUBLE / TEXT | `k`: 最多选择的数据点数,必须大于 0 小于等于 1000 | 与输入序列的实际类型一致 | 返回某时间序列中值最大的`k`个数据点。若多于`k`个数据点的值并列最大,则返回时间戳最小的数据点。 | |
| | BOTTOM_K | INT32 / INT64 / FLOAT / DOUBLE / TEXT | `k`: 最多选择的数据点数,必须大于 0 小于等于 1000 | 与输入序列的实际类型一致 | 返回某时间序列中值最小的`k`个数据点。若多于`k`个数据点的值并列最小,则返回时间戳最小的数据点。 | |
| |
| 例如: |
| |
| ``` sql |
| select s1, top_k(s1, 'k'='2'), bottom_k(s1, 'k'='2') from root.sg1.d2 where time > 2020-12-10T20:36:15.530+08:00; |
| ``` |
| |
| 结果: |
| |
| ``` |
| +-----------------------------+--------------------+------------------------------+---------------------------------+ |
| | Time| root.sg1.d2.s1|top_k(root.sg1.d2.s1, "k"="2")|bottom_k(root.sg1.d2.s1, "k"="2")| |
| +-----------------------------+--------------------+------------------------------+---------------------------------+ |
| |2020-12-10T20:36:15.531+08:00| 1531604122307244742| 1531604122307244742| null| |
| |2020-12-10T20:36:15.532+08:00|-7426070874923281101| null| null| |
| |2020-12-10T20:36:15.533+08:00|-7162825364312197604| -7162825364312197604| null| |
| |2020-12-10T20:36:15.534+08:00|-8581625725655917595| null| -8581625725655917595| |
| |2020-12-10T20:36:15.535+08:00|-7667364751255535391| null| -7667364751255535391| |
| +-----------------------------+--------------------+------------------------------+---------------------------------+ |
| Total line number = 5 |
| It costs 0.006s |
| ``` |
| |
| ### 趋势计算函数 |
| |
| 目前 IoTDB 支持如下趋势计算函数: |
| |
| | 函数名 | 输入序列类型 | 输出序列类型 | 功能描述 | |
| | ----------------------- | ----------------------------------------------- | ------------------------ | ------------------------------------------------------------ | |
| | TIME_DIFFERENCE | INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN / TEXT | INT64 | 统计序列中某数据点的时间戳与前一数据点时间戳的差。范围内第一个数据点没有对应的结果输出。 | |
| | DIFFERENCE | INT32 / INT64 / FLOAT / DOUBLE | 与输入序列的实际类型一致 | 统计序列中某数据点的值与前一数据点的值的差。范围内第一个数据点没有对应的结果输出。 | |
| | NON_NEGATIVE_DIFFERENCE | INT32 / INT64 / FLOAT / DOUBLE | 与输入序列的实际类型一致 | 统计序列中某数据点的值与前一数据点的值的差的绝对值。范围内第一个数据点没有对应的结果输出。 | |
| | DERIVATIVE | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | 统计序列中某数据点相对于前一数据点的变化率,数量上等同于 DIFFERENCE / TIME_DIFFERENCE。范围内第一个数据点没有对应的结果输出。 | |
| | NON_NEGATIVE_DERIVATIVE | INT32 / INT64 / FLOAT / DOUBLE | DOUBLE | 统计序列中某数据点相对于前一数据点的变化率的绝对值,数量上等同于 NON_NEGATIVE_DIFFERENCE / TIME_DIFFERENCE。范围内第一个数据点没有对应的结果输出。 | |
| |
| 例如: |
| |
| ``` sql |
| select s1, time_difference(s1), difference(s1), non_negative_difference(s1), derivative(s1), non_negative_derivative(s1) from root.sg1.d1 limit 5 offset 1000; |
| ``` |
| |
| 结果: |
| |
| ``` |
| +-----------------------------+-------------------+-------------------------------+--------------------------+---------------------------------------+--------------------------+---------------------------------------+ |
| | Time| root.sg1.d1.s1|time_difference(root.sg1.d1.s1)|difference(root.sg1.d1.s1)|non_negative_difference(root.sg1.d1.s1)|derivative(root.sg1.d1.s1)|non_negative_derivative(root.sg1.d1.s1)| |
| +-----------------------------+-------------------+-------------------------------+--------------------------+---------------------------------------+--------------------------+---------------------------------------+ |
| |2020-12-10T17:11:49.037+08:00|7360723084922759782| 1| -8431715764844238876| 8431715764844238876| -8.4317157648442388E18| 8.4317157648442388E18| |
| |2020-12-10T17:11:49.038+08:00|4377791063319964531| 1| -2982932021602795251| 2982932021602795251| -2.982932021602795E18| 2.982932021602795E18| |
| |2020-12-10T17:11:49.039+08:00|7972485567734642915| 1| 3594694504414678384| 3594694504414678384| 3.5946945044146785E18| 3.5946945044146785E18| |
| |2020-12-10T17:11:49.040+08:00|2508858212791964081| 1| -5463627354942678834| 5463627354942678834| -5.463627354942679E18| 5.463627354942679E18| |
| |2020-12-10T17:11:49.041+08:00|2817297431185141819| 1| 308439218393177738| 308439218393177738| 3.0843921839317773E17| 3.0843921839317773E17| |
| +-----------------------------+-------------------+-------------------------------+--------------------------+---------------------------------------+--------------------------+---------------------------------------+ |
| Total line number = 5 |
| It costs 0.014s |
| ``` |
| |
| ### 常序列生成函数 |
| |
| 常序列生成函数用于生成所有数据点的值都相同的时间序列。 |
| |
| 常序列生成函数接受一个或者多个时间序列输入,其输出的数据点的时间戳集合是这些输入序列时间戳集合的并集。 |
| |
| 目前 IoTDB 支持如下常序列生成函数: |
| |
| | 函数名 | 必要的属性参数 | 输出序列类型 | 功能描述 | |
| | ------ | ------------------------------------------------------------ | -------------------------- | ------------------------------------------------------------ | |
| | CONST | `value`: 输出的数据点的值 <br />`type`: 输出的数据点的类型,只能是 INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN / TEXT | 由输入属性参数 `type` 决定 | 根据输入属性 `value` 和 `type` 输出用户指定的常序列。 | |
| | PI | 无 | DOUBLE | 常序列的值:`π` 的 `double` 值,圆的周长与其直径的比值,即圆周率,等于 *Java标准库* 中的`Math.PI`。 | |
| | E | 无 | DOUBLE | 常序列的值:`e` 的 `double` 值,自然对数的底,它等于 *Java 标准库* 中的 `Math.E`。 | |
| |
| 例如: |
| |
| ``` sql |
| select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; |
| ``` |
| |
| 结果: |
| |
| ``` |
| select s1, s2, const(s1, 'value'='1024', 'type'='INT64'), pi(s2), e(s1, s2) from root.sg1.d1; |
| +-----------------------------+--------------+--------------+-----------------------------------------------------+------------------+---------------------------------+ |
| | Time|root.sg1.d1.s1|root.sg1.d1.s2|const(root.sg1.d1.s1, "value"="1024", "type"="INT64")|pi(root.sg1.d1.s2)|e(root.sg1.d1.s1, root.sg1.d1.s2)| |
| +-----------------------------+--------------+--------------+-----------------------------------------------------+------------------+---------------------------------+ |
| |1970-01-01T08:00:00.000+08:00| 0.0| 0.0| 1024| 3.141592653589793| 2.718281828459045| |
| |1970-01-01T08:00:00.001+08:00| 1.0| null| 1024| null| 2.718281828459045| |
| |1970-01-01T08:00:00.002+08:00| 2.0| null| 1024| null| 2.718281828459045| |
| |1970-01-01T08:00:00.003+08:00| null| 3.0| null| 3.141592653589793| 2.718281828459045| |
| |1970-01-01T08:00:00.004+08:00| null| 4.0| null| 3.141592653589793| 2.718281828459045| |
| +-----------------------------+--------------+--------------+-----------------------------------------------------+------------------+---------------------------------+ |
| Total line number = 5 |
| It costs 0.005s |
| ``` |
| |
| ### 数据类型转换函数 |
| |
| 当前IoTDB支持6种数据类型,其中包括INT32、INT64、FLOAT、DOUBLE、BOOLEAN以及TEXT。当我们对数据进行查询或者计算时可能需要进行数据类型的转换, 比如说将TEXT转换为INT32,或者提高数据精度,比如说将FLOAT转换为DOUBLE。所以,IoTDB支持使用cast函数对数据类型进行转换。 |
| |
| | 函数名 | 必要的属性参数 | 输出序列类型 | 功能类型 | |
| | ------ | ------------------------------------------------------------ | ------------------------ | ---------------------------------- | |
| | CAST | `type`:输出的数据点的类型,只能是 INT32 / INT64 / FLOAT / DOUBLE / BOOLEAN / TEXT | 由输入属性参数`type`决定 | 将数据转换为`type`参数指定的类型。 | |
| |
| ##### 类型转换说明 |
| |
| 1.当INT32、INT64类型的值不为0时,FLOAT与DOUBLE类型的值不为0.0时,TEXT类型不为空字符串或者"false"时,转换为BOOLEAN类型时值为true,否则为false。 |
| |
| ``` |
| IoTDB> show timeseries root.sg.d1.*; |
| +-------------+-----+-------------+--------+--------+-----------+----+----------+ |
| | timeseries|alias|database|dataType|encoding|compression|tags|attributes| |
| +-------------+-----+-------------+--------+--------+-----------+----+----------+ |
| |root.sg.d1.s3| null| root.sg| FLOAT| RLE| SNAPPY|null| null| |
| |root.sg.d1.s4| null| root.sg| DOUBLE| RLE| SNAPPY|null| null| |
| |root.sg.d1.s5| null| root.sg| TEXT| PLAIN| SNAPPY|null| null| |
| |root.sg.d1.s6| null| root.sg| BOOLEAN| RLE| SNAPPY|null| null| |
| |root.sg.d1.s1| null| root.sg| INT32| RLE| SNAPPY|null| null| |
| |root.sg.d1.s2| null| root.sg| INT64| RLE| SNAPPY|null| null| |
| +-------------+-----+-------------+--------+--------+-----------+----+----------+ |
| Total line number = 6 |
| It costs 0.006s |
| IoTDB> select * from root.sg.d1; |
| +-----------------------------+-------------+-------------+-------------+-------------+-------------+-------------+ |
| | Time|root.sg.d1.s3|root.sg.d1.s4|root.sg.d1.s5|root.sg.d1.s6|root.sg.d1.s1|root.sg.d1.s2| |
| +-----------------------------+-------------+-------------+-------------+-------------+-------------+-------------+ |
| |1970-01-01T08:00:00.001+08:00| 1.1| 1.1| test| false| 1| 1| |
| |1970-01-01T08:00:00.002+08:00| -2.2| -2.2| false| true| -2| -2| |
| |1970-01-01T08:00:00.003+08:00| 0.0| 0.0| true| true| 0| 0| |
| +-----------------------------+-------------+-------------+-------------+-------------+-------------+-------------+ |
| Total line number = 3 |
| It costs 0.009s |
| IoTDB> select cast(s1, 'type'='BOOLEAN'), cast(s2, 'type'='BOOLEAN'), cast(s3, 'type'='BOOLEAN'), cast(s4, 'type'='BOOLEAN'), cast(s5, 'type'='BOOLEAN') from root.sg.d1; |
| +-----------------------------+-------------------------------------+-------------------------------------+-------------------------------------+-------------------------------------+-------------------------------------+ |
| | Time|cast(root.sg.d1.s1, "type"="BOOLEAN")|cast(root.sg.d1.s2, "type"="BOOLEAN")|cast(root.sg.d1.s3, "type"="BOOLEAN")|cast(root.sg.d1.s4, "type"="BOOLEAN")|cast(root.sg.d1.s5, "type"="BOOLEAN")| |
| +-----------------------------+-------------------------------------+-------------------------------------+-------------------------------------+-------------------------------------+-------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| true| true| true| true| true| |
| |1970-01-01T08:00:00.002+08:00| true| true| true| true| false| |
| |1970-01-01T08:00:00.003+08:00| false| false| false| false| true| |
| +-----------------------------+-------------------------------------+-------------------------------------+-------------------------------------+-------------------------------------+-------------------------------------+ |
| Total line number = 3 |
| It costs 0.012s |
| ``` |
| |
| 2.当BOOLEAN类型的值为true时,转换为INT32与INT64类型的值为1,转换为FLOAT或者DOUBLE类型时值为1.0,转换为TEXT类型时值为"true"。当BOOLEAN类型的值为false时,转换为INT32与INT64类型的值为0,转换为FLOAT或者DOUBLE类型时值为0.0,转换为TEXT类型时值为"false"。 |
| |
| ``` |
| IoTDB> select cast(s6, 'type'='INT32'), cast(s6, 'type'='INT64'), cast(s6, 'type'='FLOAT'), cast(s6, 'type'='DOUBLE'), cast(s6, 'type'='TEXT') from root.sg.d1; |
| +-----------------------------+-----------------------------------+-----------------------------------+-----------------------------------+------------------------------------+----------------------------------+ |
| | Time|cast(root.sg.d1.s6, "type"="INT32")|cast(root.sg.d1.s6, "type"="INT64")|cast(root.sg.d1.s6, "type"="FLOAT")|cast(root.sg.d1.s6, "type"="DOUBLE")|cast(root.sg.d1.s6, "type"="TEXT")| |
| +-----------------------------+-----------------------------------+-----------------------------------+-----------------------------------+------------------------------------+----------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 0| 0| 0.0| 0.0| false| |
| |1970-01-01T08:00:00.002+08:00| 1| 1| 1.0| 1.0| true| |
| |1970-01-01T08:00:00.003+08:00| 1| 1| 1.0| 1.0| true| |
| +-----------------------------+-----------------------------------+-----------------------------------+-----------------------------------+------------------------------------+----------------------------------+ |
| Total line number = 3 |
| It costs 0.016s |
| ``` |
| |
| 3.当TEXT类型转换为INT32、INT64、FLOAT类型时,会先将TEXT类型的数据转换为DOUBLE类型,然后再转换为对应的类型,此时可能会存在损失精度的问题。如果无法转换的话则直接跳过。 |
| |
| ``` |
| IoTDB> select cast(s5, 'type'='INT32'), cast(s5, 'type'='INT64'), cast(s5, 'type'='FLOAT') from root.sg.d1; |
| +----+-----------------------------------+-----------------------------------+-----------------------------------+ |
| |Time|cast(root.sg.d1.s5, "type"="INT32")|cast(root.sg.d1.s5, "type"="INT64")|cast(root.sg.d1.s5, "type"="FLOAT")| |
| +----+-----------------------------------+-----------------------------------+-----------------------------------+ |
| +----+-----------------------------------+-----------------------------------+-----------------------------------+ |
| Empty set. |
| It costs 0.009s |
| ``` |
| |
| |
| |
| ##### 演示 |
| 测试数据: |
| ``` |
| IoTDB> select text from root.test; |
| +-----------------------------+--------------+ |
| | Time|root.test.text| |
| +-----------------------------+--------------+ |
| |1970-01-01T08:00:00.001+08:00| 1.1| |
| |1970-01-01T08:00:00.002+08:00| 1| |
| |1970-01-01T08:00:00.003+08:00| hello world| |
| |1970-01-01T08:00:00.004+08:00| false| |
| +-----------------------------+--------------+ |
| ``` |
| SQL语句: |
| ```sql |
| select cast(text, 'type'='BOOLEAN'), cast(text, 'type'='INT32'), cast(text, 'type'='INT64'), cast(text, 'type'='FLOAT'), cast(text, 'type'='DOUBLE') from root.test; |
| ``` |
| 结果: |
| ``` |
| +-----------------------------+--------------------------------------+------------------------------------+------------------------------------+------------------------------------+-------------------------------------+ |
| | Time|cast(root.test.text, "type"="BOOLEAN")|cast(root.test.text, "type"="INT32")|cast(root.test.text, "type"="INT64")|cast(root.test.text, "type"="FLOAT")|cast(root.test.text, "type"="DOUBLE")| |
| +-----------------------------+--------------------------------------+------------------------------------+------------------------------------+------------------------------------+-------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| true| 1| 1| 1.1| 1.1| |
| |1970-01-01T08:00:00.002+08:00| true| 1| 1| 1.0| 1.0| |
| |1970-01-01T08:00:00.003+08:00| true| null| null| null| null| |
| |1970-01-01T08:00:00.004+08:00| false| null| null| null| null| |
| +-----------------------------+--------------------------------------+------------------------------------+------------------------------------+------------------------------------+-------------------------------------+ |
| Total line number = 4 |
| It costs 0.078s |
| ``` |
| |
| ### 条件函数 |
| 条件函数针对每个数据点进行条件判断,返回布尔值 |
| |
| 目前IoTDB支持以下条件函数 |
| |
| | 函数名 | 可接收的输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能类型 | |
| |----------|--------------------------------|---------------------------------------|------------|--------------------------------------------------| |
| | ON_OFF | INT32 / INT64 / FLOAT / DOUBLE | `threshold`:DOUBLE类型 | BOOLEAN 类型 | 返回`ts_value >= threshold`的bool值 | |
| | IN_RANGE | INT32 / INT64 / FLOAT / DOUBLE | `lower`:DOUBLE类型<br/>`upper`:DOUBLE类型 | BOOLEAN类型 | 返回`ts_value >= lower && ts_value <= upper`的bool值 | | |
| |
| 测试数据: |
| ``` |
| IoTDB> select ts from root.test; |
| +-----------------------------+------------+ |
| | Time|root.test.ts| |
| +-----------------------------+------------+ |
| |1970-01-01T08:00:00.001+08:00| 1| |
| |1970-01-01T08:00:00.002+08:00| 2| |
| |1970-01-01T08:00:00.003+08:00| 3| |
| |1970-01-01T08:00:00.004+08:00| 4| |
| +-----------------------------+------------+ |
| ``` |
| |
| ##### 测试1 |
| |
| SQL语句: |
| ```sql |
| select ts, on_off(ts, 'threshold'='2') from root.test; |
| ``` |
| |
| 输出: |
| ``` |
| IoTDB> select ts, on_off(ts, 'threshold'='2') from root.test; |
| +-----------------------------+------------+-------------------------------------+ |
| | Time|root.test.ts|on_off(root.test.ts, "threshold"="2")| |
| +-----------------------------+------------+-------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 1| false| |
| |1970-01-01T08:00:00.002+08:00| 2| true| |
| |1970-01-01T08:00:00.003+08:00| 3| true| |
| |1970-01-01T08:00:00.004+08:00| 4| true| |
| +-----------------------------+------------+-------------------------------------+ |
| ``` |
| |
| ##### 测试2 |
| Sql语句: |
| ```sql |
| select ts, in_range(ts, 'lower'='2', 'upper'='3.1') from root.test; |
| ``` |
| |
| 输出: |
| ``` |
| IoTDB> select ts, in_range(ts, 'lower'='2', 'upper'='3.1') from root.test; |
| +-----------------------------+------------+--------------------------------------------------+ |
| | Time|root.test.ts|in_range(root.test.ts, "lower"="2", "upper"="3.1")| |
| +-----------------------------+------------+--------------------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 1| false| |
| |1970-01-01T08:00:00.002+08:00| 2| true| |
| |1970-01-01T08:00:00.003+08:00| 3| true| |
| |1970-01-01T08:00:00.004+08:00| 4| false| |
| +-----------------------------+------------+--------------------------------------------------+ |
| ``` |
| |
| ### 连续满足区间函数 |
| 连续满足条件区间函数用来查询所有满足指定条件的连续区间。 |
| 按返回值可分为两类: |
| 1. 返回满足条件连续区间的起始时间戳和时间跨度(时间跨度为0表示此处只有起始时间这一个数据点满足条件) |
| 2. 返回满足条件连续区间的起始时间戳和后面连续满足条件的点的个数(个数为1表示此处只有起始时间这一个数据点满足条件) |
| |
| | 函数名 | 输入序列类型 | 属性参数 | 输出序列类型 | 功能描述 | |
| |-------------------|--------------------------------------|------------------------------------------------|-------|------------------------------------------------------------------| |
| | ZERO_DURATION | INT32/ INT64/ FLOAT/ DOUBLE/ BOOLEAN | `min`:可选,默认值0</br>`max`:可选,默认值`Long.MAX_VALUE` | Long | 返回时间序列连续为0(false)的开始时间与持续时间,持续时间t(单位ms)满足`t >= min && t <= max` | |
| | NON_ZERO_DURATION | INT32/ INT64/ FLOAT/ DOUBLE/ BOOLEAN | `min`:可选,默认值0</br>`max`:可选,默认值`Long.MAX_VALUE` | Long | 返回时间序列连续不为0(false)的开始时间与持续时间,持续时间t(单位ms)满足`t >= min && t <= max` | | |
| | ZERO_COUNT | INT32/ INT64/ FLOAT/ DOUBLE/ BOOLEAN | `min`:可选,默认值1</br>`max`:可选,默认值`Long.MAX_VALUE` | Long | 返回时间序列连续为0(false)的开始时间与其后数据点的个数,数据点个数n满足`n >= min && n <= max` | | |
| | NON_ZERO_COUNT | INT32/ INT64/ FLOAT/ DOUBLE/ BOOLEAN | `min`:可选,默认值1</br>`max`:可选,默认值`Long.MAX_VALUE` | Long | 返回时间序列连续不为0(false)的开始时间与其后数据点的个数,数据点个数n满足`n >= min && n <= max` | | |
| |
| ##### 演示 |
| 测试数据: |
| ``` |
| IoTDB> select s1,s2,s3,s4,s5 from root.sg.d2; |
| +-----------------------------+-------------+-------------+-------------+-------------+-------------+ |
| | Time|root.sg.d2.s1|root.sg.d2.s2|root.sg.d2.s3|root.sg.d2.s4|root.sg.d2.s5| |
| +-----------------------------+-------------+-------------+-------------+-------------+-------------+ |
| |1970-01-01T08:00:00.000+08:00| 0| 0| 0.0| 0.0| false| |
| |1970-01-01T08:00:00.001+08:00| 1| 1| 1.0| 1.0| true| |
| |1970-01-01T08:00:00.002+08:00| 1| 1| 1.0| 1.0| true| |
| |1970-01-01T08:00:00.003+08:00| 0| 0| 0.0| 0.0| false| |
| |1970-01-01T08:00:00.004+08:00| 1| 1| 1.0| 1.0| true| |
| |1970-01-01T08:00:00.005+08:00| 0| 0| 0.0| 0.0| false| |
| |1970-01-01T08:00:00.006+08:00| 0| 0| 0.0| 0.0| false| |
| |1970-01-01T08:00:00.007+08:00| 1| 1| 1.0| 1.0| true| |
| +-----------------------------+-------------+-------------+-------------+-------------+-------------+ |
| ``` |
| sql: |
| ```sql |
| select s1, zero_count(s1), non_zero_count(s2), zero_duration(s3), non_zero_duration(s4) from root.sg.d2; |
| ``` |
| 结果: |
| ``` |
| +-----------------------------+-------------+-------------------------+-----------------------------+----------------------------+--------------------------------+ |
| | Time|root.sg.d2.s1|zero_count(root.sg.d2.s1)|non_zero_count(root.sg.d2.s2)|zero_duration(root.sg.d2.s3)|non_zero_duration(root.sg.d2.s4)| |
| +-----------------------------+-------------+-------------------------+-----------------------------+----------------------------+--------------------------------+ |
| |1970-01-01T08:00:00.000+08:00| 0| 1| null| 0| null| |
| |1970-01-01T08:00:00.001+08:00| 1| null| 2| null| 1| |
| |1970-01-01T08:00:00.002+08:00| 1| null| null| null| null| |
| |1970-01-01T08:00:00.003+08:00| 0| 1| null| 0| null| |
| |1970-01-01T08:00:00.004+08:00| 1| null| 1| null| 0| |
| |1970-01-01T08:00:00.005+08:00| 0| 2| null| 1| null| |
| |1970-01-01T08:00:00.006+08:00| 0| null| null| null| null| |
| |1970-01-01T08:00:00.007+08:00| 1| null| 1| null| 0| |
| +-----------------------------+-------------+-------------------------+-----------------------------+----------------------------+--------------------------------+ |
| ``` |
| |
| ### 等数量分桶降采样函数 |
| 本函数对输入序列进行等数量分桶采样,即根据用户给定的降采样比例和降采样方法将输入序列按固定点数等分为若干桶。在每个桶内通过给定的采样方法进行采样。 |
| - `proportion`:降采样比例,取值区间为`(0, 1]`。 |
| #### 等数量分桶随机采样 |
| 对等数量分桶后,桶内进行随机采样。 |
| |
| | 函数名 | 可接收的输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能类型 | |
| |----------|--------------------------------|---------------------------------------|------------|--------------------------------------------------| |
| | EQUAL_SIZE_BUCKET_RANDOM_SAMPLE | INT32 / INT64 / FLOAT / DOUBLE | `proportion`取值范围为`(0, 1]`,默认为`0.1` | INT32 / INT64 / FLOAT / DOUBLE | 返回符合采样比例的等分桶随机采样 | |
| |
| ##### 演示 |
| 测试数据:`root.ln.wf01.wt01.temperature`从`0.0-99.0`共`100`条有序数据。 |
| ``` |
| IoTDB> select temperature from root.ln.wf01.wt01; |
| +-----------------------------+-----------------------------+ |
| | Time|root.ln.wf01.wt01.temperature| |
| +-----------------------------+-----------------------------+ |
| |1970-01-01T08:00:00.000+08:00| 0.0| |
| |1970-01-01T08:00:00.001+08:00| 1.0| |
| |1970-01-01T08:00:00.002+08:00| 2.0| |
| |1970-01-01T08:00:00.003+08:00| 3.0| |
| |1970-01-01T08:00:00.004+08:00| 4.0| |
| |1970-01-01T08:00:00.005+08:00| 5.0| |
| |1970-01-01T08:00:00.006+08:00| 6.0| |
| |1970-01-01T08:00:00.007+08:00| 7.0| |
| |1970-01-01T08:00:00.008+08:00| 8.0| |
| |1970-01-01T08:00:00.009+08:00| 9.0| |
| |1970-01-01T08:00:00.010+08:00| 10.0| |
| |1970-01-01T08:00:00.011+08:00| 11.0| |
| |1970-01-01T08:00:00.012+08:00| 12.0| |
| |.............................|.............................| |
| |1970-01-01T08:00:00.089+08:00| 89.0| |
| |1970-01-01T08:00:00.090+08:00| 90.0| |
| |1970-01-01T08:00:00.091+08:00| 91.0| |
| |1970-01-01T08:00:00.092+08:00| 92.0| |
| |1970-01-01T08:00:00.093+08:00| 93.0| |
| |1970-01-01T08:00:00.094+08:00| 94.0| |
| |1970-01-01T08:00:00.095+08:00| 95.0| |
| |1970-01-01T08:00:00.096+08:00| 96.0| |
| |1970-01-01T08:00:00.097+08:00| 97.0| |
| |1970-01-01T08:00:00.098+08:00| 98.0| |
| |1970-01-01T08:00:00.099+08:00| 99.0| |
| +-----------------------------+-----------------------------+ |
| ``` |
| sql: |
| ```sql |
| select equal_size_bucket_random_sample(temperature,'proportion'='0.1') as random_sample from root.ln.wf01.wt01; |
| ``` |
| 结果: |
| ``` |
| +-----------------------------+-------------+ |
| | Time|random_sample| |
| +-----------------------------+-------------+ |
| |1970-01-01T08:00:00.007+08:00| 7.0| |
| |1970-01-01T08:00:00.014+08:00| 14.0| |
| |1970-01-01T08:00:00.020+08:00| 20.0| |
| |1970-01-01T08:00:00.035+08:00| 35.0| |
| |1970-01-01T08:00:00.047+08:00| 47.0| |
| |1970-01-01T08:00:00.059+08:00| 59.0| |
| |1970-01-01T08:00:00.063+08:00| 63.0| |
| |1970-01-01T08:00:00.079+08:00| 79.0| |
| |1970-01-01T08:00:00.086+08:00| 86.0| |
| |1970-01-01T08:00:00.096+08:00| 96.0| |
| +-----------------------------+-------------+ |
| Total line number = 10 |
| It costs 0.024s |
| ``` |
| |
| #### 等数量分桶聚合采样 |
| |
| 采用聚合采样法对输入序列进行采样,用户需要另外提供一个聚合函数参数即 |
| - `type`:聚合类型,取值为`avg`或`max`或`min`或`sum`或`extreme`或`variance`。在缺省情况下,采用`avg`。其中`extreme`表示等分桶中,绝对值最大的值。`variance`表示采样等分桶中的方差。 |
| |
| 每个桶采样输出的时间戳为这个桶第一个点的时间戳 |
| |
| |
| | 函数名 | 可接收的输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能类型 | |
| |----------|--------------------------------|---------------------------------------|------------|--------------------------------------------------| |
| | EQUAL_SIZE_BUCKET_AGG_SAMPLE | INT32 / INT64 / FLOAT / DOUBLE | `proportion`取值范围为`(0, 1]`,默认为`0.1`</br>`type`:取值类型有`avg`, `max`, `min`, `sum`, `extreme`, `variance`, 默认为`avg` | INT32 / INT64 / FLOAT / DOUBLE | 返回符合采样比例的等分桶聚合采样 | |
| |
| ##### 演示 |
| 测试数据:`root.ln.wf01.wt01.temperature`从`0.0-99.0`共`100`条有序数据,同等分桶随机采样的测试数据。 |
| |
| sql: |
| ```sql |
| select equal_size_bucket_agg_sample(temperature, 'type'='avg','proportion'='0.1') as agg_avg, equal_size_bucket_agg_sample(temperature, 'type'='max','proportion'='0.1') as agg_max, equal_size_bucket_agg_sample(temperature,'type'='min','proportion'='0.1') as agg_min, equal_size_bucket_agg_sample(temperature, 'type'='sum','proportion'='0.1') as agg_sum, equal_size_bucket_agg_sample(temperature, 'type'='extreme','proportion'='0.1') as agg_extreme, equal_size_bucket_agg_sample(temperature, 'type'='variance','proportion'='0.1') as agg_variance from root.ln.wf01.wt01; |
| ``` |
| 结果: |
| ``` |
| +-----------------------------+-----------------+-------+-------+-------+-----------+------------+ |
| | Time| agg_avg|agg_max|agg_min|agg_sum|agg_extreme|agg_variance| |
| +-----------------------------+-----------------+-------+-------+-------+-----------+------------+ |
| |1970-01-01T08:00:00.000+08:00| 4.5| 9.0| 0.0| 45.0| 9.0| 8.25| |
| |1970-01-01T08:00:00.010+08:00| 14.5| 19.0| 10.0| 145.0| 19.0| 8.25| |
| |1970-01-01T08:00:00.020+08:00| 24.5| 29.0| 20.0| 245.0| 29.0| 8.25| |
| |1970-01-01T08:00:00.030+08:00| 34.5| 39.0| 30.0| 345.0| 39.0| 8.25| |
| |1970-01-01T08:00:00.040+08:00| 44.5| 49.0| 40.0| 445.0| 49.0| 8.25| |
| |1970-01-01T08:00:00.050+08:00| 54.5| 59.0| 50.0| 545.0| 59.0| 8.25| |
| |1970-01-01T08:00:00.060+08:00| 64.5| 69.0| 60.0| 645.0| 69.0| 8.25| |
| |1970-01-01T08:00:00.070+08:00|74.50000000000001| 79.0| 70.0| 745.0| 79.0| 8.25| |
| |1970-01-01T08:00:00.080+08:00| 84.5| 89.0| 80.0| 845.0| 89.0| 8.25| |
| |1970-01-01T08:00:00.090+08:00| 94.5| 99.0| 90.0| 945.0| 99.0| 8.25| |
| +-----------------------------+-----------------+-------+-------+-------+-----------+------------+ |
| Total line number = 10 |
| It costs 0.044s |
| ``` |
| #### 等数量分桶M4采样 |
| |
| 采用M4采样法对输入序列进行采样。即对于每个桶采样首、尾、最小和最大值。 |
| |
| | 函数名 | 可接收的输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能类型 | |
| |----------|--------------------------------|---------------------------------------|------------|--------------------------------------------------| |
| | EQUAL_SIZE_BUCKET_M4_SAMPLE | INT32 / INT64 / FLOAT / DOUBLE | `proportion`取值范围为`(0, 1]`,默认为`0.1`| INT32 / INT64 / FLOAT / DOUBLE | 返回符合采样比例的等分桶M4采样 | |
| |
| ##### 演示 |
| 测试数据:`root.ln.wf01.wt01.temperature`从`0.0-99.0`共`100`条有序数据,同等分桶随机采样的测试数据。 |
| |
| sql: |
| ```sql |
| select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01; |
| ``` |
| 结果: |
| ``` |
| +-----------------------------+---------+ |
| | Time|M4_sample| |
| +-----------------------------+---------+ |
| |1970-01-01T08:00:00.000+08:00| 0.0| |
| |1970-01-01T08:00:00.001+08:00| 1.0| |
| |1970-01-01T08:00:00.038+08:00| 38.0| |
| |1970-01-01T08:00:00.039+08:00| 39.0| |
| |1970-01-01T08:00:00.040+08:00| 40.0| |
| |1970-01-01T08:00:00.041+08:00| 41.0| |
| |1970-01-01T08:00:00.078+08:00| 78.0| |
| |1970-01-01T08:00:00.079+08:00| 79.0| |
| |1970-01-01T08:00:00.080+08:00| 80.0| |
| |1970-01-01T08:00:00.081+08:00| 81.0| |
| |1970-01-01T08:00:00.098+08:00| 98.0| |
| |1970-01-01T08:00:00.099+08:00| 99.0| |
| +-----------------------------+---------+ |
| Total line number = 12 |
| It costs 0.065s |
| ``` |
| |
| #### 等数量分桶离群值采样 |
| 本函数对输入序列进行等数量分桶离群值采样,即根据用户给定的降采样比例和桶内采样个数将输入序列按固定点数等分为若干桶,在每个桶内通过给定的离群值采样方法进行采样。 |
| |
| | 函数名 | 可接收的输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能类型 | |
| |----------|--------------------------------|---------------------------------------|------------|--------------------------------------------------| |
| | EQUAL_SIZE_BUCKET_OUTLIER_SAMPLE | INT32 / INT64 / FLOAT / DOUBLE | `proportion`取值范围为`(0, 1]`,默认为`0.1`</br>`type`取值为`avg`或`stendis`或`cos`或`prenextdis`,默认为`avg`</br>`number`取值应大于0,默认`3`| INT32 / INT64 / FLOAT / DOUBLE | 返回符合采样比例和桶内采样个数的等分桶离群值采样 | |
| |
| 参数说明 |
| - `proportion`: 采样比例 |
| - `number`: 每个桶内的采样个数,默认`3` |
| - `type`: 离群值采样方法,取值为 |
| - `avg`: 取桶内数据点的平均值,并根据采样比例,找到距离均值最远的`top number`个 |
| - `stendis`: 取桶内每一个数据点距离桶的首末数据点连成直线的垂直距离,并根据采样比例,找到距离最大的`top number`个 |
| - `cos`: 设桶内一个数据点为b,b左边的数据点为a,b右边的数据点为c,则取ab与bc向量的夹角的余弦值,值越小,说明形成的角度越大,越可能是异常值。找到cos值最小的`top number`个 |
| - `prenextdis`: 设桶内一个数据点为b,b左边的数据点为a,b右边的数据点为c,则取ab与bc的长度之和作为衡量标准,和越大越可能是异常值,找到最大的`top number`个 |
| |
| ##### 演示 |
| 测试数据:`root.ln.wf01.wt01.temperature`从`0.0-99.0`共`100`条数据,其中为了加入离群值,我们使得个位数为5的值自增100。 |
| ``` |
| IoTDB> select temperature from root.ln.wf01.wt01; |
| +-----------------------------+-----------------------------+ |
| | Time|root.ln.wf01.wt01.temperature| |
| +-----------------------------+-----------------------------+ |
| |1970-01-01T08:00:00.000+08:00| 0.0| |
| |1970-01-01T08:00:00.001+08:00| 1.0| |
| |1970-01-01T08:00:00.002+08:00| 2.0| |
| |1970-01-01T08:00:00.003+08:00| 3.0| |
| |1970-01-01T08:00:00.004+08:00| 4.0| |
| |1970-01-01T08:00:00.005+08:00| 105.0| |
| |1970-01-01T08:00:00.006+08:00| 6.0| |
| |1970-01-01T08:00:00.007+08:00| 7.0| |
| |1970-01-01T08:00:00.008+08:00| 8.0| |
| |1970-01-01T08:00:00.009+08:00| 9.0| |
| |1970-01-01T08:00:00.010+08:00| 10.0| |
| |1970-01-01T08:00:00.011+08:00| 11.0| |
| |1970-01-01T08:00:00.012+08:00| 12.0| |
| |1970-01-01T08:00:00.013+08:00| 13.0| |
| |1970-01-01T08:00:00.014+08:00| 14.0| |
| |1970-01-01T08:00:00.015+08:00| 115.0| |
| |1970-01-01T08:00:00.016+08:00| 16.0| |
| |.............................|.............................| |
| |1970-01-01T08:00:00.092+08:00| 92.0| |
| |1970-01-01T08:00:00.093+08:00| 93.0| |
| |1970-01-01T08:00:00.094+08:00| 94.0| |
| |1970-01-01T08:00:00.095+08:00| 195.0| |
| |1970-01-01T08:00:00.096+08:00| 96.0| |
| |1970-01-01T08:00:00.097+08:00| 97.0| |
| |1970-01-01T08:00:00.098+08:00| 98.0| |
| |1970-01-01T08:00:00.099+08:00| 99.0| |
| +-----------------------------+-----------------------------+ |
| ``` |
| sql: |
| ```sql |
| select equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='avg', 'number'='2') as outlier_avg_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='stendis', 'number'='2') as outlier_stendis_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='cos', 'number'='2') as outlier_cos_sample, equal_size_bucket_outlier_sample(temperature, 'proportion'='0.1', 'type'='prenextdis', 'number'='2') as outlier_prenextdis_sample from root.ln.wf01.wt01; |
| ``` |
| 结果: |
| ``` |
| +-----------------------------+------------------+----------------------+------------------+-------------------------+ |
| | Time|outlier_avg_sample|outlier_stendis_sample|outlier_cos_sample|outlier_prenextdis_sample| |
| +-----------------------------+------------------+----------------------+------------------+-------------------------+ |
| |1970-01-01T08:00:00.005+08:00| 105.0| 105.0| 105.0| 105.0| |
| |1970-01-01T08:00:00.015+08:00| 115.0| 115.0| 115.0| 115.0| |
| |1970-01-01T08:00:00.025+08:00| 125.0| 125.0| 125.0| 125.0| |
| |1970-01-01T08:00:00.035+08:00| 135.0| 135.0| 135.0| 135.0| |
| |1970-01-01T08:00:00.045+08:00| 145.0| 145.0| 145.0| 145.0| |
| |1970-01-01T08:00:00.055+08:00| 155.0| 155.0| 155.0| 155.0| |
| |1970-01-01T08:00:00.065+08:00| 165.0| 165.0| 165.0| 165.0| |
| |1970-01-01T08:00:00.075+08:00| 175.0| 175.0| 175.0| 175.0| |
| |1970-01-01T08:00:00.085+08:00| 185.0| 185.0| 185.0| 185.0| |
| |1970-01-01T08:00:00.095+08:00| 195.0| 195.0| 195.0| 195.0| |
| +-----------------------------+------------------+----------------------+------------------+-------------------------+ |
| Total line number = 10 |
| It costs 0.041s |
| ``` |
| |
| ### M4函数 |
| |
| M4用于在窗口内采样第一个点(`first`)、最后一个点(`last`)、最小值点(`bottom`)、最大值点(`top`): |
| |
| - 第一个点是拥有这个窗口内最小时间戳的点; |
| - 最后一个点是拥有这个窗口内最大时间戳的点; |
| - 最小值点是拥有这个窗口内最小值的点(如果有多个这样的点,M4只返回其中一个); |
| - 最大值点是拥有这个窗口内最大值的点(如果有多个这样的点,M4只返回其中一个)。 |
| |
| <img src="https://user-images.githubusercontent.com/33376433/198178733-a0919d17-0663-4672-9c4f-1efad6f463c2.png" alt="image" style="zoom:50%;" /> |
| |
| | 函数名 | 可接收的输入序列类型 | 属性参数 | 输出序列类型 | 功能类型 | |
| | ------ | ------------------------------ | ------------------------------------------------------------ | ------------------------------ | ------------------------------------------------------------ | |
| | M4 | INT32 / INT64 / FLOAT / DOUBLE | 包含固定点数的窗口和滑动时间窗口使用不同的属性参数。包含固定点数的窗口使用属性`windowSize`和`slidingStep`。滑动时间窗口使用属性`timeInterval`、`slidingStep`、`displayWindowBegin`和`displayWindowEnd`。更多细节见下文。 | INT32 / INT64 / FLOAT / DOUBLE | 返回每个窗口内的第一个点(`first`)、最后一个点(`last`)、最小值点(`bottom`)、最大值点(`top`)。在一个窗口内的聚合点输出之前,M4会将它们按照时间戳递增排序并且去重。 | |
| |
| #### 属性参数 |
| |
| **(1) 包含固定点数的窗口(SlidingSizeWindowAccessStrategy)使用的属性参数:** |
| |
| + `windowSize`: 一个窗口内的点数。Int数据类型。必需的属性参数。 |
| + `slidingStep`: 按照设定的点数来滑动窗口。Int数据类型。可选的属性参数;如果没有设置,默认取值和`windowSize`一样。 |
| |
| <img src="https://user-images.githubusercontent.com/33376433/198181449-00d563c8-7bce-4ecd-a031-ec120ca42c3f.png" alt="image" style="zoom: 50%;" /> |
| |
| *(图片来源: https://iotdb.apache.org/UserGuide/Master/Process-Data/UDF-User-Defined-Function.html#udtf-user-defined-timeseries-generating-function)* |
| |
| **(2) 滑动时间窗口(SlidingTimeWindowAccessStrategy)使用的属性参数:** |
| |
| + `timeInterval`: 一个窗口的时间长度。Long数据类型。必需的属性参数。 |
| + `slidingStep`: 按照设定的时长来滑动窗口。Long数据类型。可选的属性参数;如果没有设置,默认取值和`timeInterval`一样。 |
| + `displayWindowBegin`: 窗口滑动的起始时间戳位置(包含在内)。Long数据类型。可选的属性参数;如果没有设置,默认取值为Long.MIN_VALUE,意为使用输入的时间序列的第一个点的时间戳作为窗口滑动的起始时间戳位置。 |
| + `displayWindowEnd`: 结束时间限制(不包含在内;本质上和`WHERE time < displayWindowEnd`起的效果是一样的)。Long数据类型。可选的属性参数;如果没有设置,默认取值为Long.MAX_VALUE,意为除了输入的时间序列自身数据读取完毕之外没有增加额外的结束时间过滤条件限制。 |
| |
| <img src="https://user-images.githubusercontent.com/33376433/198183015-93b56644-3330-4acf-ae9e-d718a02b5f4c.png" alt="groupBy window" style="zoom: 67%;" /> |
| |
| *(图片来源: https://iotdb.apache.org/UserGuide/Master/Query-Data/Aggregate-Query.html#downsampling-aggregate-query)* |
| |
| #### 演示 |
| |
| 输入的时间序列: |
| |
| ```sql |
| +-----------------------------+------------------+ |
| | Time|root.vehicle.d1.s1| |
| +-----------------------------+------------------+ |
| |1970-01-01T08:00:00.001+08:00| 5.0| |
| |1970-01-01T08:00:00.002+08:00| 15.0| |
| |1970-01-01T08:00:00.005+08:00| 10.0| |
| |1970-01-01T08:00:00.008+08:00| 8.0| |
| |1970-01-01T08:00:00.010+08:00| 30.0| |
| |1970-01-01T08:00:00.020+08:00| 20.0| |
| |1970-01-01T08:00:00.025+08:00| 8.0| |
| |1970-01-01T08:00:00.027+08:00| 20.0| |
| |1970-01-01T08:00:00.030+08:00| 40.0| |
| |1970-01-01T08:00:00.033+08:00| 9.0| |
| |1970-01-01T08:00:00.035+08:00| 10.0| |
| |1970-01-01T08:00:00.040+08:00| 20.0| |
| |1970-01-01T08:00:00.045+08:00| 30.0| |
| |1970-01-01T08:00:00.052+08:00| 8.0| |
| |1970-01-01T08:00:00.054+08:00| 18.0| |
| +-----------------------------+------------------+ |
| ``` |
| |
| 查询语句1: |
| |
| ```sql |
| select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1 |
| ``` |
| |
| 输出结果1: |
| |
| ```sql |
| +-----------------------------+-----------------------------------------------------------------------------------------------+ |
| | Time|M4(root.vehicle.d1.s1, "timeInterval"="25", "displayWindowBegin"="0", "displayWindowEnd"="100")| |
| +-----------------------------+-----------------------------------------------------------------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 5.0| |
| |1970-01-01T08:00:00.010+08:00| 30.0| |
| |1970-01-01T08:00:00.020+08:00| 20.0| |
| |1970-01-01T08:00:00.025+08:00| 8.0| |
| |1970-01-01T08:00:00.030+08:00| 40.0| |
| |1970-01-01T08:00:00.045+08:00| 30.0| |
| |1970-01-01T08:00:00.052+08:00| 8.0| |
| |1970-01-01T08:00:00.054+08:00| 18.0| |
| +-----------------------------+-----------------------------------------------------------------------------------------------+ |
| Total line number = 8 |
| ``` |
| |
| 查询语句2: |
| |
| ```sql |
| select M4(s1,'windowSize'='10') from root.vehicle.d1 |
| ``` |
| |
| 输出结果2: |
| |
| ```sql |
| +-----------------------------+-----------------------------------------+ |
| | Time|M4(root.vehicle.d1.s1, "windowSize"="10")| |
| +-----------------------------+-----------------------------------------+ |
| |1970-01-01T08:00:00.001+08:00| 5.0| |
| |1970-01-01T08:00:00.030+08:00| 40.0| |
| |1970-01-01T08:00:00.033+08:00| 9.0| |
| |1970-01-01T08:00:00.035+08:00| 10.0| |
| |1970-01-01T08:00:00.045+08:00| 30.0| |
| |1970-01-01T08:00:00.052+08:00| 8.0| |
| |1970-01-01T08:00:00.054+08:00| 18.0| |
| +-----------------------------+-----------------------------------------+ |
| Total line number = 7 |
| ``` |
| |
| |
| |
| #### 推荐的使用场景 |
| |
| **(1) 使用场景:保留极端点的降采样** |
| |
| 由于M4为每个窗口聚合其第一个点(`first`)、最后一个点(`last`)、最小值点(`bottom`)、最大值点(`top`),因此M4通常保留了极值点,因此比其他下采样方法(如分段聚合近似 (PAA))能更好地保留模式。如果你想对时间序列进行下采样并且希望保留极值点,你可以试试 M4。 |
| |
| **(2) 使用场景:基于数据缩约的大规模时间序列的零误差双色折线图可视化** |
| |
| 参考论文: ["M4: A Visualization-Oriented Time Series Data Aggregation"](http://www.vldb.org/pvldb/vol7/p797-jugel.pdf). |
| |
| 假设屏幕画布的像素宽乘高是`w*h`,假设时间序列root.vehicle.d1.s1要可视化的时间范围是`[tqs,tqe)`(在这个使用场景里面,需要请你自行将tqe自适应调整使得(tqe-tqs)是w的整数倍),那么落在第i个时间跨度`Ii=[tqs+(tqe-tqs)/w*(i-1),tqs+(tqe-tqs)/w*i)` 内的点将会被画在第i个像素列中,i=1,2,...,w。 |
| |
| 于是从可视化驱动的角度出发,使用查询语句:`"select M4(s1,'timeInterval'='(tqe-tqs)/w','displayWindowBegin'='tqs','displayWindowEnd'='tqe') from root.vehicle.d1"`,来采集每个时间跨度内的第一个点(`first`)、最后一个点(`last`)、最小值点(`bottom`)、最大值点(`top`)。最终结果点数不会超过`4*w`个,使用这些聚合点画出来的折线图与使用原始数据画出来的图在像素级别上是完全一致的。 |
| |
| |
| |
| #### 和其它SQL的功能比较 |
| |
| | SQL | 是否支持M4聚合 | 滑动窗口类型 | 示例 | 相关文档 | |
| | ------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ | |
| | 1. 带有Group By子句的内置聚合函数 | 不支持,缺少`BOTTOM_TIME`和`TOP_TIME`,即缺少最小值点和最大值点的时间戳。 | Time Window | `select count(status), max_value(temperature) from root.ln.wf01.wt01 group by ([2017-11-01 00:00:00, 2017-11-07 23:00:00), 3h, 1d)` | https://iotdb.apache.org/UserGuide/Master/Query-Data/Aggregate-Query.html#built-in-aggregate-functions <br />https://iotdb.apache.org/UserGuide/Master/Query-Data/Aggregate-Query.html#downsampling-aggregate-query | |
| | 2. EQUAL_SIZE_BUCKET_M4_SAMPLE (内置UDF) | 支持* | Size Window. `windowSize = 4*(int)(1/proportion)` | `select equal_size_bucket_m4_sample(temperature, 'proportion'='0.1') as M4_sample from root.ln.wf01.wt01` | https://iotdb.apache.org/UserGuide/Master/Query-Data/Select-Expression.html#time-series-generating-functions | |
| | **3. M4 (内置UDF)** | 支持* | Size Window, Time Window | (1) Size Window: `select M4(s1,'windowSize'='10') from root.vehicle.d1` <br />(2) Time Window: `select M4(s1,'timeInterval'='25','displayWindowBegin'='0','displayWindowEnd'='100') from root.vehicle.d1` | 本文档 | |
| | 4. 扩展带有Group By子句的内置聚合函数来支持M4聚合 | 未实施 | 未实施 | 未实施 | 未实施 | |
| |
| 进一步比较`EQUAL_SIZE_BUCKET_M4_SAMPLE`和`M4`: |
| |
| **(1) 不同的M4聚合函数定义:** |
| |
| 在每个窗口内,`EQUAL_SIZE_BUCKET_M4_SAMPLE`从排除了第一个点和最后一个点之后剩余的点中提取最小值点和最大值点。 |
| |
| 而`M4`则是从窗口内所有点中(包括第一个点和最后一个点)提取最小值点和最大值点,这个定义与元数据中保存的`max_value`和`min_value`的语义更加一致。 |
| |
| 值得注意的是,在一个窗口内的聚合点输出之前,`EQUAL_SIZE_BUCKET_M4_SAMPLE`和`M4`都会将它们按照时间戳递增排序并且去重。 |
| |
| **(2) 不同的滑动窗口:** |
| |
| `EQUAL_SIZE_BUCKET_M4_SAMPLE`使用SlidingSizeWindowAccessStrategy,并且通过采样比例(`proportion`)来间接控制窗口点数(`windowSize`),转换公式是`windowSize = 4*(int)(1/proportion)`。 |
| |
| `M4`支持两种滑动窗口:SlidingSizeWindowAccessStrategy和SlidingTimeWindowAccessStrategy,并且`M4`通过相应的参数直接控制窗口的点数或者时长。 |
| |
| |
| |
| ### JEXL自定义函数 |
| |
| Java Expression Language (JEXL) 是一个表达式语言引擎。我们使用JEXL来扩展UDF,在命令行中,通过简易的lambda表达式来实现UDF。lambda表达式中支持的运算符详见链接 [JEXL中lambda表达式支持的运算符](https://commons.apache.org/proper/commons-jexl/apidocs/org/apache/commons/jexl3/package-summary.html#customization) 。 |
| |
| |
| | 函数名 | 可接收的输入序列类型 | 必要的属性参数 | 输出序列类型 | 功能类型 | |
| |----------|--------------------------------|---------------------------------------|------------|--------------------------------------------------| |
| | JEXL | INT32 / INT64 / FLOAT / DOUBLE / TEXT / BOOLEAN | `expr`是一个支持标准的一元或多元参数的lambda表达式,符合`x -> {...}`或`(x, y, z) -> {...}`的格式,例如`x -> {x * 2}`, `(x, y, z) -> {x + y * z}`| INT32 / INT64 / FLOAT / DOUBLE / TEXT / BOOLEAN | 返回将输入的时间序列通过lambda表达式变换的序列 | |
| |
| #### 演示 |
| 测试数据:`root.ln.wf01.wt01.temperature`, `root.ln.wf01.wt01.st`, `root.ln.wf01.wt01.str`共`11`条数据。 |
| ``` |
| IoTDB> select * from root.ln.wf01.wt01; |
| +-----------------------------+---------------------+--------------------+-----------------------------+ |
| | Time|root.ln.wf01.wt01.str|root.ln.wf01.wt01.st|root.ln.wf01.wt01.temperature| |
| +-----------------------------+---------------------+--------------------+-----------------------------+ |
| |1970-01-01T08:00:00.000+08:00| str| 10.0| 0.0| |
| |1970-01-01T08:00:00.001+08:00| str| 20.0| 1.0| |
| |1970-01-01T08:00:00.002+08:00| str| 30.0| 2.0| |
| |1970-01-01T08:00:00.003+08:00| str| 40.0| 3.0| |
| |1970-01-01T08:00:00.004+08:00| str| 50.0| 4.0| |
| |1970-01-01T08:00:00.005+08:00| str| 60.0| 5.0| |
| |1970-01-01T08:00:00.006+08:00| str| 70.0| 6.0| |
| |1970-01-01T08:00:00.007+08:00| str| 80.0| 7.0| |
| |1970-01-01T08:00:00.008+08:00| str| 90.0| 8.0| |
| |1970-01-01T08:00:00.009+08:00| str| 100.0| 9.0| |
| |1970-01-01T08:00:00.010+08:00| str| 110.0| 10.0| |
| +-----------------------------+---------------------+--------------------+-----------------------------+ |
| ``` |
| sql: |
| ```sql |
| select jexl(temperature, 'expr'='x -> {x + x}') as jexl1, jexl(temperature, 'expr'='x -> {x * 3}') as jexl2, jexl(temperature, 'expr'='x -> {x * x}') as jexl3, jexl(temperature, 'expr'='x -> {multiply(x, 100)}') as jexl4, jexl(temperature, st, 'expr'='(x, y) -> {x + y}') as jexl5, jexl(temperature, st, str, 'expr'='(x, y, z) -> {x + y + z}') as jexl6 from root.ln.wf01.wt01;``` |
| ``` |
| 结果: |
| ``` |
| +-----------------------------+-----+-----+-----+------+-----+--------+ |
| | Time|jexl1|jexl2|jexl3| jexl4|jexl5| jexl6| |
| +-----------------------------+-----+-----+-----+------+-----+--------+ |
| |1970-01-01T08:00:00.000+08:00| 0.0| 0.0| 0.0| 0.0| 10.0| 10.0str| |
| |1970-01-01T08:00:00.001+08:00| 2.0| 3.0| 1.0| 100.0| 21.0| 21.0str| |
| |1970-01-01T08:00:00.002+08:00| 4.0| 6.0| 4.0| 200.0| 32.0| 32.0str| |
| |1970-01-01T08:00:00.003+08:00| 6.0| 9.0| 9.0| 300.0| 43.0| 43.0str| |
| |1970-01-01T08:00:00.004+08:00| 8.0| 12.0| 16.0| 400.0| 54.0| 54.0str| |
| |1970-01-01T08:00:00.005+08:00| 10.0| 15.0| 25.0| 500.0| 65.0| 65.0str| |
| |1970-01-01T08:00:00.006+08:00| 12.0| 18.0| 36.0| 600.0| 76.0| 76.0str| |
| |1970-01-01T08:00:00.007+08:00| 14.0| 21.0| 49.0| 700.0| 87.0| 87.0str| |
| |1970-01-01T08:00:00.008+08:00| 16.0| 24.0| 64.0| 800.0| 98.0| 98.0str| |
| |1970-01-01T08:00:00.009+08:00| 18.0| 27.0| 81.0| 900.0|109.0|109.0str| |
| |1970-01-01T08:00:00.010+08:00| 20.0| 30.0|100.0|1000.0|120.0|120.0str| |
| +-----------------------------+-----+-----+-----+------+-----+--------+ |
| Total line number = 11 |
| It costs 0.118s |
| ``` |
| |
| ### 自定义时间序列生成函数 |
| |
| 请参考 [UDF](../Process-Data/UDF-User-Defined-Function.md)。 |
| |
| 已知的自定义时间序列生成函数库实现: |
| |
| + [IoTDB-Quality](https://thulab.github.io/iotdb-quality),一个关于数据质量的 UDF 库实现,包括数据画像、数据质量评估与修复等一系列函数。 |
| |
| ## 嵌套表达式 |
| |
| IoTDB支持嵌套表达式,由于聚合查询和时间序列查询不能在一条查询语句中同时出现,我们将支持的嵌套表达式分为时间序列查询嵌套表达式和聚合查询嵌套表达式两类。 |
| |
| 下面是嵌套表达式统一的 `SELECT` 子句语法定义: |
| |
| ```sql |
| selectClause |
| : SELECT resultColumn (',' resultColumn)* |
| ; |
| |
| resultColumn |
| : expression (AS ID)? |
| ; |
| |
| expression |
| : '(' expression ')' |
| | '-' expression |
| | expression ('*' | '/' | '%') expression |
| | expression ('+' | '-') expression |
| | functionName '(' expression (',' expression)* functionAttribute* ')' |
| | timeSeriesSuffixPath |
| | number |
| ; |
| ``` |
| |
| ### 时间序列查询嵌套表达式 |
| |
| IoTDB 支持在 `SELECT` 子句中计算由**数字常量,时间序列、时间序列生成函数(包括用户自定义函数)和算数运算表达式**组成的任意嵌套表达式。 |
| |
| #### 示例 |
| |
| 输入1: |
| |
| ```sql |
| select a, |
| b, |
| ((a + 1) * 2 - 1) % 2 + 1.5, |
| sin(a + sin(a + sin(b))), |
| -(a + b) * (sin(a + b) * sin(a + b) + cos(a + b) * cos(a + b)) + 1 |
| from root.sg1; |
| ``` |
| |
| 结果集1: |
| |
| ``` |
| +-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| | Time|root.sg1.a|root.sg1.b|((((root.sg1.a + 1) * 2) - 1) % 2) + 1.5|sin(root.sg1.a + sin(root.sg1.a + sin(root.sg1.b)))|(-root.sg1.a + root.sg1.b * ((sin(root.sg1.a + root.sg1.b) * sin(root.sg1.a + root.sg1.b)) + (cos(root.sg1.a + root.sg1.b) * cos(root.sg1.a + root.sg1.b)))) + 1| |
| +-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| |1970-01-01T08:00:00.010+08:00| 1| 1| 2.5| 0.9238430524420609| -1.0| |
| |1970-01-01T08:00:00.020+08:00| 2| 2| 2.5| 0.7903505371876317| -3.0| |
| |1970-01-01T08:00:00.030+08:00| 3| 3| 2.5| 0.14065207680386618| -5.0| |
| |1970-01-01T08:00:00.040+08:00| 4| null| 2.5| null| null| |
| |1970-01-01T08:00:00.050+08:00| null| 5| null| null| null| |
| |1970-01-01T08:00:00.060+08:00| 6| 6| 2.5| -0.7288037411970916| -11.0| |
| +-----------------------------+----------+----------+----------------------------------------+---------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------+ |
| Total line number = 6 |
| It costs 0.048s |
| ``` |
| |
| 输入2: |
| |
| ```sql |
| select (a + b) * 2 + sin(a) from root.sg |
| ``` |
| |
| 结果集2: |
| |
| ``` |
| +-----------------------------+----------------------------------------------+ |
| | Time|((root.sg.a + root.sg.b) * 2) + sin(root.sg.a)| |
| +-----------------------------+----------------------------------------------+ |
| |1970-01-01T08:00:00.010+08:00| 59.45597888911063| |
| |1970-01-01T08:00:00.020+08:00| 100.91294525072763| |
| |1970-01-01T08:00:00.030+08:00| 139.01196837590714| |
| |1970-01-01T08:00:00.040+08:00| 180.74511316047935| |
| |1970-01-01T08:00:00.050+08:00| 219.73762514629607| |
| |1970-01-01T08:00:00.060+08:00| 259.6951893788978| |
| |1970-01-01T08:00:00.070+08:00| 300.7738906815579| |
| |1970-01-01T08:00:00.090+08:00| 39.45597888911063| |
| |1970-01-01T08:00:00.100+08:00| 39.45597888911063| |
| +-----------------------------+----------------------------------------------+ |
| Total line number = 9 |
| It costs 0.011s |
| ``` |
| |
| 输入3: |
| |
| ```sql |
| select (a + *) / 2 from root.sg1 |
| ``` |
| |
| 结果集3: |
| |
| ``` |
| +-----------------------------+-----------------------------+-----------------------------+ |
| | Time|(root.sg1.a + root.sg1.a) / 2|(root.sg1.a + root.sg1.b) / 2| |
| +-----------------------------+-----------------------------+-----------------------------+ |
| |1970-01-01T08:00:00.010+08:00| 1.0| 1.0| |
| |1970-01-01T08:00:00.020+08:00| 2.0| 2.0| |
| |1970-01-01T08:00:00.030+08:00| 3.0| 3.0| |
| |1970-01-01T08:00:00.040+08:00| 4.0| null| |
| |1970-01-01T08:00:00.060+08:00| 6.0| 6.0| |
| +-----------------------------+-----------------------------+-----------------------------+ |
| Total line number = 5 |
| It costs 0.011s |
| ``` |
| |
| 输入4: |
| |
| ```sql |
| select (a + b) * 3 from root.sg, root.ln |
| ``` |
| |
| 结果集4: |
| |
| ``` |
| +-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ |
| | Time|(root.sg.a + root.sg.b) * 3|(root.sg.a + root.ln.b) * 3|(root.ln.a + root.sg.b) * 3|(root.ln.a + root.ln.b) * 3| |
| +-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ |
| |1970-01-01T08:00:00.010+08:00| 90.0| 270.0| 360.0| 540.0| |
| |1970-01-01T08:00:00.020+08:00| 150.0| 330.0| 690.0| 870.0| |
| |1970-01-01T08:00:00.030+08:00| 210.0| 450.0| 570.0| 810.0| |
| |1970-01-01T08:00:00.040+08:00| 270.0| 240.0| 690.0| 660.0| |
| |1970-01-01T08:00:00.050+08:00| 330.0| null| null| null| |
| |1970-01-01T08:00:00.060+08:00| 390.0| null| null| null| |
| |1970-01-01T08:00:00.070+08:00| 450.0| null| null| null| |
| |1970-01-01T08:00:00.090+08:00| 60.0| null| null| null| |
| |1970-01-01T08:00:00.100+08:00| 60.0| null| null| null| |
| +-----------------------------+---------------------------+---------------------------+---------------------------+---------------------------+ |
| Total line number = 9 |
| It costs 0.014s |
| ``` |
| |
| #### 说明 |
| |
| - 当某个时间戳下左操作数和右操作数都不为空(`null`)时,表达式才会有结果,否则表达式值为`null`,且默认不出现在结果集中。 |
| - 从结果集1中可知,时间戳40下,时间序列`root.sg.a`的值为4,`root.sg.b`的值为`null`。所以在时间戳40下,表达式`(a + b) * 2 + sin(a)`的值为`null`,从结果集2可以看出,时间戳40没有出现。 |
| - 如果表达式中某个操作数对应多条时间序列(如通配符`*`),那么每条时间序列对应的结果都会出现在结果集中(按照笛卡尔积形式)。请参考示例中输入3、4和对应结果集3、4。 |
| |
| #### 注意 |
| |
| > 目前对齐时间序列(Aligned Timeseries)尚不支持时间序列查询嵌套表达式。使用时间序列查询嵌套表达式时,如果操作数包含对齐时间序列,会提示错误。 |
| |
| ### 聚合查询嵌套表达式 |
| |
| IoTDB 支持在 `SELECT` 子句中计算由**数字常量,聚合查询和算数运算表达式**组成的任意嵌套表达式。 |
| |
| #### 示例 |
| |
| 不指定 `GROUP BY` 的聚合查询。 |
| |
| 输入1: |
| |
| ```sql |
| select avg(temperature), |
| sin(avg(temperature)), |
| avg(temperature) + 1, |
| -sum(hardware), |
| avg(temperature) + sum(hardware) |
| from root.ln.wf01.wt01; |
| ``` |
| |
| 结果集1: |
| |
| ``` |
| +----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ |
| |avg(root.ln.wf01.wt01.temperature)|sin(avg(root.ln.wf01.wt01.temperature))|avg(root.ln.wf01.wt01.temperature) + 1|-sum(root.ln.wf01.wt01.hardware)|avg(root.ln.wf01.wt01.temperature) + sum(root.ln.wf01.wt01.hardware)| |
| +----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ |
| | 15.927999999999999| -0.21826546964855045| 16.927999999999997| -7426.0| 7441.928| |
| +----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+--------------------------------------------------------------------+ |
| Total line number = 1 |
| It costs 0.009s |
| ``` |
| |
| 输入2: |
| |
| ```sql |
| select avg(*), |
| (avg(*) + 1) * 3 / 2 -1 |
| from root.sg1 |
| ``` |
| |
| 结果集2: |
| |
| ``` |
| +---------------+---------------+-------------------------------------+-------------------------------------+ |
| |avg(root.sg1.a)|avg(root.sg1.b)|(avg(root.sg1.a) + 1) * 3 / 2 - 1 |(avg(root.sg1.b) + 1) * 3 / 2 - 1 | |
| +---------------+---------------+-------------------------------------+-------------------------------------+ |
| | 3.2| 3.4| 5.300000000000001| 5.6000000000000005| |
| +---------------+---------------+-------------------------------------+-------------------------------------+ |
| Total line number = 1 |
| It costs 0.007s |
| ``` |
| |
| 指定 `GROUP BY` 的聚合查询。 |
| |
| 输入3: |
| |
| ```sql |
| select avg(temperature), |
| sin(avg(temperature)), |
| avg(temperature) + 1, |
| -sum(hardware), |
| avg(temperature) + sum(hardware) as custom_sum |
| from root.ln.wf01.wt01 |
| GROUP BY([10, 90), 10ms); |
| ``` |
| |
| 结果集3: |
| |
| ``` |
| +-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ |
| | Time|avg(root.ln.wf01.wt01.temperature)|sin(avg(root.ln.wf01.wt01.temperature))|avg(root.ln.wf01.wt01.temperature) + 1|-sum(root.ln.wf01.wt01.hardware)|custom_sum| |
| +-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ |
| |1970-01-01T08:00:00.010+08:00| 13.987499999999999| 0.9888207947857667| 14.987499999999999| -3211.0| 3224.9875| |
| |1970-01-01T08:00:00.020+08:00| 29.6| -0.9701057337071853| 30.6| -3720.0| 3749.6| |
| |1970-01-01T08:00:00.030+08:00| null| null| null| null| null| |
| |1970-01-01T08:00:00.040+08:00| null| null| null| null| null| |
| |1970-01-01T08:00:00.050+08:00| null| null| null| null| null| |
| |1970-01-01T08:00:00.060+08:00| null| null| null| null| null| |
| |1970-01-01T08:00:00.070+08:00| null| null| null| null| null| |
| |1970-01-01T08:00:00.080+08:00| null| null| null| null| null| |
| +-----------------------------+----------------------------------+---------------------------------------+--------------------------------------+--------------------------------+----------+ |
| Total line number = 8 |
| It costs 0.012s |
| ``` |
| |
| #### 说明 |
| |
| - 当某个时间戳下左操作数和右操作数都不为空(`null`)时,表达式才会有结果,否则表达式值为`null`,且默认不出现在结果集中。但在使用`GROUP BY`子句的聚合查询嵌套表达式中,我们希望保留每个时间窗口的值,所以表达式值为`null`的窗口也包含在结果集中。请参考输入3及结果集3。 |
| - 如果表达式中某个操作数对应多条时间序列(如通配符`*`),那么每条时间序列对应的结果都会出现在结果集中(按照笛卡尔积形式)。请参考输入2及结果集2。 |
| |
| #### 注意 |
| |
| > 目前此功能尚不支持填充算子(`FILL`)和按层级聚合(`GROUP BY LEVEL`)查询, 在后续版本会支持。 |
| > |
| > 聚合计算目前只能当做最底层表达式输入,暂不支持聚合函数内部出现表达式。 |
| > |
| > 即不支持以下查询 |
| > |
| > ```SQL |
| > SELECT avg(s1 + 1) FROM root.sg.d1; -- 聚合函数内部有表达式 |
| > SELECT avg(s1) + avg(s2) FROM root.sg.* GROUP BY LEVEL=1; -- 按层级聚合 |
| > SELECT avg(s1) + avg(s2) FROM root.sg.d1 GROUP BY([0, 10000), 1s) FILL(previous); -- 空值填充 |
| > ``` |
| |
| ## 使用别名 |
| |
| 由于 IoTDB 独特的数据模型,在每个传感器前都附带有设备等诸多额外信息。有时,我们只针对某个具体设备查询,而这些前缀信息频繁显示造成了冗余,影响了结果集的显示与分析。这时我们可以使用 IoTDB 提供的 AS 函数,将查询中出现的时间序列给定一个别名。 |
| |
| 例如: |
| |
| ```sql |
| select s1 as temperature, s2 as speed from root.ln.wf01.wt01; |
| ``` |
| |
| 则结果集将显示为: |
| |
| | Time | temperature | speed | |
| | ---- | ----------- | ----- | |
| | ... | ... | ... | |