
 <!--

     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at

         http://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.

 -->

 ## Series Discovery

 ### ConsecutiveSequences

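 #### Overview

 This function finds locally longest consecutive subsequences in multidimensional, strictly equispaced data.

 Strictly equispaced data is data whose timestamps are separated by a strictly constant interval. Missing data is allowed (both missing rows and missing values), but redundant data and timestamp drift are not.

 A consecutive subsequence is a subsequence whose points are spaced exactly at the standard time interval, with no missing data. A consecutive subsequence is locally longest if it is not a proper subsequence of any other consecutive subsequence.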


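 **Name:** CONSECUTIVESEQUENCES

 **Input Series:** Supports multiple input series of any data type, provided they are strictly equispaced.

 **Parameters:**

 + `gap`: The standard time interval, a positive number with a unit. Five units are currently supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). If omitted, the function estimates the standard time interval as the mode of the observed time intervals.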
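
The mode-based default can be illustrated with a minimal Python sketch. This is not the IoTDB implementation: the helper name `estimate_gap` and the use of millisecond integer timestamps are assumptions made for illustration, and the real function may sample or break ties differently.

```python
from collections import Counter

def estimate_gap(timestamps_ms):
    """Return the most frequent difference between adjacent timestamps (ms).

    Hypothetical helper: a sketch of "estimate the interval by the mode",
    not the library's actual code.
    """
    diffs = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    if not diffs:
        raise ValueError("need at least two timestamps to estimate a gap")
    # most_common(1) returns [(value, count)]; ties go to the value seen first.
    return Counter(diffs).most_common(1)[0][0]

MINUTE = 60_000
# Row timestamps of the example input in this section, as offsets from 00:00.
example = [m * MINUTE for m in (0, 5, 10, 20, 25, 30, 35, 40, 45, 50)]
print(estimate_gap(example))  # 300000, i.e. 5 minutes
```

The single 10-minute jump between 00:10 and 00:20 is outvoted by the eight 5-minute steps, which is why the mode is robust to occasional missing rows.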

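 **Output Series:** Outputs a single series of type INT32. Each data point in the output corresponds to one locally longest consecutive subsequence: its timestamp is the start time of the subsequence, and its value is the number of data points the subsequence contains.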

 **Note:** For input that does not meet these requirements, the function makes no guarantee about its output.
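
The definitions above can be sketched in Python as a single pass over the rows. This is an illustrative sketch, not the IoTDB source: the function name, the `(timestamp_ms, values)` row layout with `None` for a missing value, and the choice to report an isolated complete row as a run of length 1 are all assumptions.

```python
def consecutive_sequences(rows, gap_ms):
    """Find maximal runs of complete rows spaced exactly gap_ms apart.

    rows: list of (timestamp_ms, values) sorted by time, where values is a
    tuple and None marks a missing value (assumed layout).
    Returns a list of (start_timestamp_ms, point_count).
    """
    runs = []
    start = last = None
    count = 0
    for ts, values in rows:
        complete = all(v is not None for v in values)
        if complete and count > 0 and ts == last + gap_ms:
            # The row extends the current run.
            last = ts
            count += 1
            continue
        # The current run (if any) ends here; it cannot be extended further.
        if count > 0:
            runs.append((start, count))
        if complete:
            start = last = ts
            count = 1
        else:
            start = last = None
            count = 0
    if count > 0:
        runs.append((start, count))
    return runs

MINUTE = 60_000
# The example input of this section: s2 is missing at 00:40, no row at 00:15.
rows = [(m * MINUTE, (1.0, None if m == 40 else 1.0))
        for m in (0, 5, 10, 20, 25, 30, 35, 40, 45, 50)]
print(consecutive_sequences(rows, 5 * MINUTE))
# [(0, 3), (1200000, 4), (2700000, 2)]
```

A run is closed as soon as a row is missing, incomplete, or off the expected timestamp, so every reported run is locally longest by construction.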

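 #### Examples

 ##### Manually Specify the Standard Time Interval

 The standard time interval can be specified manually with the `gap` parameter. Note that an incorrect value leads to seriously wrong output.

 Input series: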

 ```
 +-----------------------------+---------------+---------------+
 |                         Time|root.test.d1.s1|root.test.d1.s2|
 +-----------------------------+---------------+---------------+
 |2020-01-01T00:00:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:05:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:10:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:20:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:25:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:30:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:35:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:40:00.000+08:00|            1.0|           null|
 |2020-01-01T00:45:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:50:00.000+08:00|            1.0|            1.0|
 +-----------------------------+---------------+---------------+
 ```

 SQL for query:

 ```sql
 select consecutivesequences(s1,s2,'gap'='5m') from root.test.d1
 ```

 Output series:

 ```
 +-----------------------------+------------------------------------------------------------------+
 |                         Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2, "gap"="5m")|
 +-----------------------------+------------------------------------------------------------------+
 |2020-01-01T00:00:00.000+08:00|                                                                 3|
 |2020-01-01T00:20:00.000+08:00|                                                                 4|
 |2020-01-01T00:45:00.000+08:00|                                                                 2|
 +-----------------------------+------------------------------------------------------------------+
 ```

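 ##### Automatically Estimate the Standard Time Interval

 When `gap` is omitted, the function estimates the standard time interval as the mode of the observed intervals and produces the same result. This usage is therefore recommended.

 The input series is the same as above. The SQL for query is: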

 ```sql
 select consecutivesequences(s1,s2) from root.test.d1
 ```

 Output series:

 ```
 +-----------------------------+------------------------------------------------------+
 |                         Time|consecutivesequences(root.test.d1.s1, root.test.d1.s2)|
 +-----------------------------+------------------------------------------------------+
 |2020-01-01T00:00:00.000+08:00|                                                     3|
 |2020-01-01T00:20:00.000+08:00|                                                     4|
 |2020-01-01T00:45:00.000+08:00|                                                     2|
 +-----------------------------+------------------------------------------------------+
 ```

 ### ConsecutiveWindows

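 #### Overview

 This function finds consecutive windows of a specified length in multidimensional, strictly equispaced data.

 Strictly equispaced data is data whose timestamps are separated by a strictly constant interval. Missing data is allowed (both missing rows and missing values), but redundant data and timestamp drift are not.

 A consecutive window is a subsequence whose points are spaced exactly at the standard time interval, with no missing data.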


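 **Name:** CONSECUTIVEWINDOWS

 **Input Series:** Supports multiple input series of any data type, provided they are strictly equispaced.

 **Parameters:**

 + `gap`: The standard time interval, a positive number with a unit. Five units are currently supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). If omitted, the function estimates the standard time interval as the mode of the observed time intervals.
 + `length`: The window length, a positive number with a unit. Five units are currently supported: 'ms' (millisecond), 's' (second), 'm' (minute), 'h' (hour), and 'd' (day). This parameter is required.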
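
As a rough illustration of how the two unit-bearing parameters relate, the Python sketch below converts a value such as `'10m'` to milliseconds and derives the number of points per window. The helper names are hypothetical, the sketch accepts only whole numbers for simplicity, and the relation `length / gap + 1` is inferred from the example in this section (a `length` of 10m at 5-minute spacing yields windows of 3 points), not taken from the IoTDB source.

```python
import re

# Milliseconds per supported unit.
UNIT_MS = {"ms": 1, "s": 1_000, "m": 60_000, "h": 3_600_000, "d": 86_400_000}

def parse_duration_ms(text):
    """Convert a string like '5m' or '250ms' to milliseconds (hypothetical helper)."""
    match = re.fullmatch(r"\s*(\d+)\s*(ms|s|m|h|d)\s*", text)
    if match is None:
        raise ValueError(f"unsupported duration: {text!r}")
    value = int(match.group(1)) * UNIT_MS[match.group(2)]
    if value <= 0:
        raise ValueError("duration must be positive")
    return value

def points_per_window(length, gap):
    """Number of points in a window spanning `length` at spacing `gap` (assumed relation)."""
    return parse_duration_ms(length) // parse_duration_ms(gap) + 1

print(parse_duration_ms("5m"))         # 300000
print(points_per_window("10m", "5m"))  # 3
```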

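 **Output Series:** Outputs a single series of type INT32. Each data point in the output corresponds to one consecutive window of the specified length: its timestamp is the start time of the window, and its value is the number of data points the window contains.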

 **Note:** For input that does not meet these requirements, the function makes no guarantee about its output.
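
The window search described above can be sketched in Python as a sliding check over runs of complete rows. This is an assumption-laden illustration rather than the IoTDB implementation: the function name, the `(timestamp_ms, values)` row layout with `None` for a missing value, and the points-per-window formula `length / gap + 1` (inferred from the example in this section) are not confirmed by the source.

```python
def consecutive_windows(rows, gap_ms, length_ms):
    """Report every window of complete rows that spans length_ms at spacing gap_ms.

    rows: list of (timestamp_ms, values) sorted by time, None = missing value.
    Returns a list of (window_start_ms, point_count).
    """
    points = length_ms // gap_ms + 1  # assumed relation between length and gap
    windows = []
    run = []  # timestamps of the current run of consecutive complete rows
    for ts, values in rows:
        if any(v is None for v in values):
            run = []  # an incomplete row breaks the run
            continue
        if run and ts != run[-1] + gap_ms:
            run = []  # a missing row breaks the run
        run.append(ts)
        if len(run) >= points:
            # The last `points` rows form one full window.
            windows.append((run[-points], points))
    return windows

MINUTE = 60_000
# The example input of this section: s2 is missing at 00:40, no row at 00:15.
rows = [(m * MINUTE, (1.0, None if m == 40 else 1.0))
        for m in (0, 5, 10, 20, 25, 30, 35, 40, 45, 50)]
print(consecutive_windows(rows, 5 * MINUTE, 10 * MINUTE))
# [(0, 3), (1200000, 3), (1500000, 3)]
```

Unlike the locally longest subsequences of ConsecutiveSequences, windows may overlap: the four-point run starting at 00:20 yields two windows, at 00:20 and 00:25.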

 #### Examples

 Input series:

 ```
 +-----------------------------+---------------+---------------+
 |                         Time|root.test.d1.s1|root.test.d1.s2|
 +-----------------------------+---------------+---------------+
 |2020-01-01T00:00:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:05:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:10:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:20:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:25:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:30:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:35:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:40:00.000+08:00|            1.0|           null|
 |2020-01-01T00:45:00.000+08:00|            1.0|            1.0|
 |2020-01-01T00:50:00.000+08:00|            1.0|            1.0|
 +-----------------------------+---------------+---------------+
 ```

 SQL for query:

 ```sql
 select consecutivewindows(s1,s2,'length'='10m') from root.test.d1
 ```

 Output series:

 ```
 +-----------------------------+--------------------------------------------------------------------+
 |                         Time|consecutivewindows(root.test.d1.s1, root.test.d1.s2, "length"="10m")|
 +-----------------------------+--------------------------------------------------------------------+
 |2020-01-01T00:00:00.000+08:00|                                                                   3|
 |2020-01-01T00:20:00.000+08:00|                                                                   3|
 |2020-01-01T00:25:00.000+08:00|                                                                   3|
 +-----------------------------+--------------------------------------------------------------------+
 ```