src/UserGuide/Master/Table/SQL-Manual/Row-Pattern-Recognition_timecho.md - iotdb-docs - Git at Google

 <!--

     Licensed to the Apache Software Foundation (ASF) under one
     or more contributor license agreements.  See the NOTICE file
     distributed with this work for additional information
     regarding copyright ownership.  The ASF licenses this file
     to you under the Apache License, Version 2.0 (the
     "License"); you may not use this file except in compliance
     with the License.  You may obtain a copy of the License at

         http://www.apache.org/licenses/LICENSE-2.0

     Unless required by applicable law or agreed to in writing,
     software distributed under the License is distributed on an
     "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
     KIND, either express or implied.  See the License for the
     specific language governing permissions and limitations
     under the License.

 -->

 # Pattern Query

 ## 1. Syntax Definition

 ```SQL
 MATCH_RECOGNIZE (
   [ PARTITION BY column [, ...] ]
   [ ORDER BY column [, ...] ]
   [ MEASURES measure_definition [, ...] ]
   [ ROWS PER MATCH ]
   [ AFTER MATCH skip_to ]
   PATTERN ( row_pattern )
   [ SUBSET subset_definition [, ...] ]
   DEFINE variable_definition [, ...]
 )
 ```

 **Note:**

 * PARTITION BY: Optional. Used to group the input table, and each group can perform pattern matching independently. If this clause is not specified, the entire input table will be processed as a single unit.
 * ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching.
 * MEASURES: Optional. Used to specify which information to extract from the matched segment of data.
 * ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching.
 * AFTER MATCH SKIP: Optional. Used to specify which row to resume from for the next pattern match after identifying a non-empty match.
 * PATTERN: Used to define the row pattern to be matched.
 * SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set.
 * DEFINE: Used to define the basic pattern variables for the row pattern.

 For more detailed introductions to the features, please refer to:[Pattern Query](../User-Manual/Timeseries-Featured-Analysis_timecho.md#_1-pattern-query)

 ## 2. Usage Examples

 Using [Sample Data](../Reference/Sample-Data.md) as the source data

 1. Time Segment Query

 Segment the data in table1 by time intervals less than or equal to 24 hours, and query the total number of data entries in each segment, as well as the start and end times.

 Query SQL

 SQL

 ```SQL
 SELECT start_time, end_time, cnt
 FROM table1
 MATCH_RECOGNIZE (
     ORDER BY time
     MEASURES
         RPR_FIRST(A.time) AS start_time,
         RPR_LAST(time) AS end_time,
         COUNT() AS cnt
     PATTERN (A B*)
     DEFINE  B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000
 ) AS m
 ```

 Query Results

 SQL

 ```SQL
 +-----------------------------+-----------------------------+---+
 |                   start_time|                     end_time|cnt|
 +-----------------------------+-----------------------------+---+
 |2024-11-26T13:37:00.000+08:00|2024-11-26T13:38:00.000+08:00|  2|
 |2024-11-27T16:38:00.000+08:00|2024-11-30T14:30:00.000+08:00| 16|
 +-----------------------------+-----------------------------+---+
 Total line number = 2
 ```

 2. Difference Segment Query

 Segment the data in table2 by humidity value differences less than 0.1, and query the total number of data entries in each segment, as well as the start and end times.

 * Query SQL

 SQL

 ```SQL
 SELECT start_time, end_time, cnt
 FROM table2
 MATCH_RECOGNIZE (
     ORDER BY time
     MEASURES
         RPR_FIRST(A.time) AS start_time,
         RPR_LAST(time) AS end_time,
         COUNT() AS cnt
     PATTERN (A B*)
     DEFINE  B AS (B.humidity - PREV(B.humidity )) <=0.1
 ) AS m;
 ```

 * Query Results

 SQL

 ```SQL
 +-----------------------------+-----------------------------+---+
 |                   start_time|                     end_time|cnt|
 +-----------------------------+-----------------------------+---+
 |2024-11-26T13:37:00.000+08:00|2024-11-27T00:00:00.000+08:00|  2|
 |2024-11-28T08:00:00.000+08:00|2024-11-29T00:00:00.000+08:00|  2|
 |2024-11-29T11:00:00.000+08:00|2024-11-30T00:00:00.000+08:00|  2|
 +-----------------------------+-----------------------------+---+
 Total line number = 3
 ```

 3. Event Statistics Query

 Group the data in table1 by device ID, and count the start and end times and maximum humidity value where the humidity in the Shanghai area is greater than 35.

 * Query SQL

 SQL

 ```SQL
 SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity
 FROM table1
 MATCH_RECOGNIZE (
   PARTITION BY device_id
   ORDER BY time
   MEASURES
     MATCH_NUMBER() AS match,
     RPR_FIRST(A.time) AS event_start,
     RPR_LAST(A.time) AS event_end,
     MAX(A.humidity) AS max_humidity
   ONE ROW PER MATCH
   PATTERN (A+)
   DEFINE
     A AS A.region= 'Shanghai' AND A.humidity> 35
 ) AS m
 ```

 * Query Results

 SQL

 ```SQL
 +---------+-----+-----------------------------+-----------------------------+------------+
 |device_id|match|                  event_start|                    event_end|max_humidity|
 +---------+-----+-----------------------------+-----------------------------+------------+
 |      100|    1|2024-11-28T09:00:00.000+08:00|2024-11-29T18:30:00.000+08:00|        45.1|
 |      101|    1|2024-11-30T09:30:00.000+08:00|2024-11-30T09:30:00.000+08:00|        35.2|
 +---------+-----+-----------------------------+-----------------------------+------------+
 Total line number = 2
 ```
	<!--

	Licensed to the Apache Software Foundation (ASF) under one
	or more contributor license agreements. See the NOTICE file
	distributed with this work for additional information
	regarding copyright ownership. The ASF licenses this file
	to you under the Apache License, Version 2.0 (the
	"License"); you may not use this file except in compliance
	with the License. You may obtain a copy of the License at

	http://www.apache.org/licenses/LICENSE-2.0

	Unless required by applicable law or agreed to in writing,
	software distributed under the License is distributed on an
	"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
	KIND, either express or implied. See the License for the
	specific language governing permissions and limitations
	under the License.

	-->

	# Pattern Query

	## 1. Syntax Definition

	```SQL
	MATCH_RECOGNIZE (
	[ PARTITION BY column [, ...] ]
	[ ORDER BY column [, ...] ]
	[ MEASURES measure_definition [, ...] ]
	[ ROWS PER MATCH ]
	[ AFTER MATCH skip_to ]
	PATTERN ( row_pattern )
	[ SUBSET subset_definition [, ...] ]
	DEFINE variable_definition [, ...]
	)
	```

	Note:

	* PARTITION BY: Optional. Used to group the input table, and each group can perform pattern matching independently. If this clause is not specified, the entire input table will be processed as a single unit.
	* ORDER BY: Optional. Used to ensure that input data is processed in a specific order during matching.
	* MEASURES: Optional. Used to specify which information to extract from the matched segment of data.
	* ROWS PER MATCH: Optional. Used to specify the output method of the result set after successful pattern matching.
	* AFTER MATCH SKIP: Optional. Used to specify which row to resume from for the next pattern match after identifying a non-empty match.
	* PATTERN: Used to define the row pattern to be matched.
	* SUBSET: Optional. Used to merge rows matched by multiple basic pattern variables into a single logical set.
	* DEFINE: Used to define the basic pattern variables for the row pattern.

	For more detailed introductions to the features, please refer to:[Pattern Query](../User-Manual/Timeseries-Featured-Analysis_timecho.md#_1-pattern-query)

	## 2. Usage Examples

	Using [Sample Data](../Reference/Sample-Data.md) as the source data

	1. Time Segment Query

	Segment the data in table1 by time intervals less than or equal to 24 hours, and query the total number of data entries in each segment, as well as the start and end times.

	Query SQL

	SQL

	```SQL
	SELECT start_time, end_time, cnt
	FROM table1
	MATCH_RECOGNIZE (
	ORDER BY time
	MEASURES
	RPR_FIRST(A.time) AS start_time,
	RPR_LAST(time) AS end_time,
	COUNT() AS cnt
	PATTERN (A B*)
	DEFINE B AS (cast(B.time as INT64) - cast(PREV(B.time) as INT64)) <= 86400000
	) AS m
	```

	Query Results

	SQL

	```SQL
	+-----------------------------+-----------------------------+---+
	\| start_time\| end_time\|cnt\|
	+-----------------------------+-----------------------------+---+
	\|2024-11-26T13:37:00.000+08:00\|2024-11-26T13:38:00.000+08:00\| 2\|
	\|2024-11-27T16:38:00.000+08:00\|2024-11-30T14:30:00.000+08:00\| 16\|
	+-----------------------------+-----------------------------+---+
	Total line number = 2
	```

	2. Difference Segment Query

	Segment the data in table2 by humidity value differences less than 0.1, and query the total number of data entries in each segment, as well as the start and end times.

	* Query SQL

	SQL

	```SQL
	SELECT start_time, end_time, cnt
	FROM table2
	MATCH_RECOGNIZE (
	ORDER BY time
	MEASURES
	RPR_FIRST(A.time) AS start_time,
	RPR_LAST(time) AS end_time,
	COUNT() AS cnt
	PATTERN (A B*)
	DEFINE B AS (B.humidity - PREV(B.humidity )) <=0.1
	) AS m;
	```

	* Query Results

	SQL

	```SQL
	+-----------------------------+-----------------------------+---+
	\| start_time\| end_time\|cnt\|
	+-----------------------------+-----------------------------+---+
	\|2024-11-26T13:37:00.000+08:00\|2024-11-27T00:00:00.000+08:00\| 2\|
	\|2024-11-28T08:00:00.000+08:00\|2024-11-29T00:00:00.000+08:00\| 2\|
	\|2024-11-29T11:00:00.000+08:00\|2024-11-30T00:00:00.000+08:00\| 2\|
	+-----------------------------+-----------------------------+---+
	Total line number = 3
	```

	3. Event Statistics Query

	Group the data in table1 by device ID, and count the start and end times and maximum humidity value where the humidity in the Shanghai area is greater than 35.

	* Query SQL

	SQL

	```SQL
	SELECT m.device_id, m.match, m.event_start, m.event_end, m.max_humidity
	FROM table1
	MATCH_RECOGNIZE (
	PARTITION BY device_id
	ORDER BY time
	MEASURES
	MATCH_NUMBER() AS match,
	RPR_FIRST(A.time) AS event_start,
	RPR_LAST(A.time) AS event_end,
	MAX(A.humidity) AS max_humidity
	ONE ROW PER MATCH
	PATTERN (A+)
	DEFINE
	A AS A.region= 'Shanghai' AND A.humidity> 35
	) AS m
	```

	* Query Results

	SQL

	```SQL
	+---------+-----+-----------------------------+-----------------------------+------------+
	\|device_id\|match\| event_start\| event_end\|max_humidity\|
	+---------+-----+-----------------------------+-----------------------------+------------+
	\| 100\| 1\|2024-11-28T09:00:00.000+08:00\|2024-11-29T18:30:00.000+08:00\| 45.1\|
	\| 101\| 1\|2024-11-30T09:30:00.000+08:00\|2024-11-30T09:30:00.000+08:00\| 35.2\|
	+---------+-----+-----------------------------+-----------------------------+------------+
	Total line number = 2
	```