The SELECT INTO statement copies data from a query result set into target time series.
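The application scenarios are as follows:
- ETL the original data and write the results to new time series.
- Persistently store query results, which acts like a materialized view.
- Rewrite non-aligned time series as aligned time series.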
The following is the syntax definition of the SELECT INTO statement:

    selectIntoStatement
        : SELECT
            resultColumn [, resultColumn] ...
            INTO intoItem [, intoItem] ...
            FROM prefixPath [, prefixPath] ...
            [WHERE whereCondition]
            [GROUP BY groupByTimeClause, groupByLevelClause]
            [FILL {PREVIOUS | LINEAR | constant}]
            [LIMIT rowLimit OFFSET rowOffset]
            [ALIGN BY DEVICE]
        ;

    intoItem
        : [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')'
        ;
INTO Clause

The INTO clause consists of several intoItem.
Each intoItem consists of a target device and a list of target measurements (similar to the INTO clause in an INSERT statement).
Each target measurement and device form a target time series, and an intoItem contains a series of time series. For example: root.sg_copy.d1(s1, s2) specifies two target time series root.sg_copy.d1.s1 and root.sg_copy.d1.s2.
The target time series specified by the INTO clause must correspond one-to-one with the columns of the query result set. The specific rules are as follows:
- Align by time (default): the number of target time series contained in all intoItem must be consistent with the number of columns in the query result set (except the time column), and they correspond one-to-one in left-to-right header order.
- Align by device (using ALIGN BY DEVICE): the number of target devices specified in all intoItem must be the same as the number of devices queried (i.e., the number of devices matched by the path pattern in the FROM clause), and they correspond one-to-one in the output order of the devices in the result set.

For example:
    IoTDB> select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2;
    +--------------+-------------------+--------+
    | source column|  target timeseries| written|
    +--------------+-------------------+--------+
    | root.sg.d1.s1| root.sg_copy.d1.t1|    8000|
    +--------------+-------------------+--------+
    | root.sg.d2.s1| root.sg_copy.d2.t1|   10000|
    +--------------+-------------------+--------+
    | root.sg.d1.s2| root.sg_copy.d2.t2|   12000|
    +--------------+-------------------+--------+
    | root.sg.d2.s2| root.sg_copy.d1.t2|   10000|
    +--------------+-------------------+--------+
    Total line number = 4
    It costs 0.725s
This statement writes the query results of the four time series under the root.sg database to the four specified time series under the root.sg_copy database. Note that root.sg_copy.d2(t1, t2) can also be written as root.sg_copy.d2(t1), root.sg_copy.d2(t2).
We can see that the writing of the INTO clause is very flexible as long as the combined target time series is not repeated and corresponds to the query result column one-to-one.
In the result set displayed by the CLI, the meaning of each column is as follows:
- The source column column represents the column name of the query result.
- target timeseries represents the target time series that the corresponding column is written to.
- written indicates the amount of data expected to be written.
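The one-to-one rule for time-aligned queries can be sketched in a few lines of Python. This is only an illustrative model, not IoTDB's implementation: the helper names `expand_into_items` and `pair_align_by_time` are hypothetical, and the source column order is copied from the result table of the first example rather than computed.

```python
from typing import List, Tuple

# Hypothetical model of an intoItem: (target device, target measurements).
IntoItem = Tuple[str, List[str]]


def expand_into_items(into_items: List[IntoItem]) -> List[str]:
    """Flatten intoItems into target time series, left to right."""
    return [
        f"{device}.{measurement}"
        for device, measurements in into_items
        for measurement in measurements
    ]


def pair_align_by_time(
    source_columns: List[str], into_items: List[IntoItem]
) -> List[Tuple[str, str]]:
    """Pair each result column with its target series, enforcing the rules."""
    targets = expand_into_items(into_items)
    if len(targets) != len(source_columns):
        raise ValueError("target series count must equal result column count")
    if len(set(targets)) != len(targets):
        raise ValueError("target time series must not be repeated")
    return list(zip(source_columns, targets))


# Column order as shown in the CLI output of the first example.
source_columns = ["root.sg.d1.s1", "root.sg.d2.s1", "root.sg.d1.s2", "root.sg.d2.s2"]
into_items = [
    ("root.sg_copy.d1", ["t1"]),
    ("root.sg_copy.d2", ["t1", "t2"]),
    ("root.sg_copy.d1", ["t2"]),
]
pairs = pair_align_by_time(source_columns, into_items)
for source, target in pairs:
    print(f"{source} -> {target}")
```

Splitting root.sg_copy.d2(t1, t2) into two separate intoItem entries yields the same flattened list, which is why both spellings are accepted.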
    IoTDB> select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms);
    +--------------------------------------+-------------------------+--------+
    |                         source column|        target timeseries| written|
    +--------------------------------------+-------------------------+--------+
    |  count(root.sg.d1.s1 + root.sg.d1.s2)| root.agg.count.s1_add_s2|      10|
    +--------------------------------------+-------------------------+--------+
    |             last_value(root.sg.d1.s2)|   root.agg.last_value.s2|      10|
    +--------------------------------------+-------------------------+--------+
    Total line number = 2
    It costs 0.375s
This statement stores the results of an aggregated query into the specified time series.
    IoTDB> select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device;
    +--------------+--------------+-------------------+--------+
    | source device| source column|  target timeseries| written|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d1|            s1| root.sg_copy.d1.t1|    8000|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d1|            s2| root.sg_copy.d1.t2|   11000|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d2|            s1| root.sg_copy.d2.t1|   12000|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d2|            s2| root.sg_copy.d2.t2|    9000|
    +--------------+--------------+-------------------+--------+
    Total line number = 4
    It costs 0.625s
This statement also writes the query results of the four time series under the root.sg database to the four specified time series under the root.sg_copy database. However, in ALIGN BY DEVICE, the number of intoItem must be the same as the number of queried devices, and each queried device corresponds to one intoItem.
When aligning the query by device, the result set displayed by the CLI has one more column, the source device column, indicating the queried device.
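For device-aligned queries, the pairing works per device rather than per column. The sketch below models this in Python with a hypothetical helper, `pair_align_by_device`. It assumes that each intoItem lists one target measurement per result column, which is what the example above shows but is an inference rather than a stated rule.

```python
from typing import List, Tuple

# Hypothetical model of an intoItem: (target device, target measurements).
IntoItem = Tuple[str, List[str]]


def pair_align_by_device(
    devices: List[str], columns: List[str], into_items: List[IntoItem]
) -> List[Tuple[str, str, str]]:
    """Return (source device, source column, target time series) rows."""
    if len(into_items) != len(devices):
        raise ValueError("need exactly one intoItem per queried device")
    rows = []
    for device, (target_device, measurements) in zip(devices, into_items):
        # Assumption: the measurement list covers every result column.
        if len(measurements) != len(columns):
            raise ValueError(f"{target_device}: expected {len(columns)} measurements")
        for column, measurement in zip(columns, measurements):
            rows.append((device, column, f"{target_device}.{measurement}"))
    return rows


rows = pair_align_by_device(
    devices=["root.sg.d1", "root.sg.d2"],
    columns=["s1", "s2"],
    into_items=[
        ("root.sg_copy.d1", ["t1", "t2"]),
        ("root.sg_copy.d2", ["t1", "t2"]),
    ],
)
for row in rows:
    print(" | ".join(row))
```

The rows come out in the same device-major order as the source device and source column columns of the CLI output.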
    IoTDB> select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device;
    +--------------+--------------+------------------------+--------+
    | source device| source column|       target timeseries| written|
    +--------------+--------------+------------------------+--------+
    |    root.sg.d1|       s1 + s2| root.expr.add.d1s1_d1s2|   10000|
    +--------------+--------------+------------------------+--------+
    |    root.sg.d2|       s1 + s2| root.expr.add.d2s1_d2s2|   10000|
    +--------------+--------------+------------------------+--------+
    Total line number = 2
    It costs 0.532s
This statement stores the result of evaluating an expression into the specified time series.
In particular, we can use variable placeholders to describe the correspondence between the target and query time series, which simplifies the statement. The following two variable placeholders are currently supported:
- :: copies the suffix (or measurement) of the queried device. From this level to the last level of the device (or for the measurement), the node names (or measurement) of the target are the same as those of the queried device.
- ${i} indicates that the node name at the current level of the target series is the same as the i-th level node name of the query series. For example, for the path root.sg1.d1.s1, ${1} means sg1, ${2} means d1, and ${3} means s1.

When using variable placeholders, there must be no ambiguity in the correspondence between intoItem and the columns of the query result set. The specific cases are classified as follows:
Align by time (default)

Note: The variable placeholder can only describe the correspondence between time series. If the query includes aggregation or expression calculation, the columns in the query result cannot correspond to a time series, so neither the target device nor the measurement can use variable placeholders.

Case (align by time): the target device does not use variable placeholders and the target measurement list does.

Limitations:
- In each intoItem, the length of the measurement list must be 1. (If the length could be greater than 1, as in root.sg1.d1(::, s1), it would not be possible to determine which columns match ::.)
- The number of intoItem is 1, or the same as the number of columns in the query result set. (When each target measurement list has length 1: if there is only one intoItem, all query series are written to the same device; if the number of intoItem equals the number of query series, each query series is assigned its own target device; if the number of intoItem is greater than one but less than the number of query series, no one-to-one correspondence with the query series is possible.)

Matching method: Each query time series specifies the target device, and the target measurement is generated from the variable placeholder.
Example:
select s1, s2 into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) from root.sg.d1, root.sg.d2;
This statement is equivalent to:
select s1, s2 into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) from root.sg.d1, root.sg.d2;
As you can see, the statement is not very simplified in this case.
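The equivalence between the two statements above can be checked with a small Python model of measurement placeholders. The function names are hypothetical, paths are split naively on "." (so quoted node names containing dots are not handled), and the source column order is assumed to be the time-aligned order shown in the first example of this section.

```python
import re
from typing import List, Tuple

PLACEHOLDER = re.compile(r"\$\{(\d+)\}")


def resolve_measurement(template: str, source_path: str) -> str:
    """Resolve :: or ${i} in a target measurement against one source series."""
    nodes = source_path.split(".")  # nodes[0] is "root", so ${1} is the next level
    if template == "::":
        return nodes[-1]  # copy the source measurement unchanged
    return PLACEHOLDER.sub(lambda m: nodes[int(m.group(1))], template)


def match_fixed_devices(
    source_columns: List[str], into_items: List[Tuple[str, str]]
) -> List[str]:
    """Each intoItem is (target device, single measurement template)."""
    if len(into_items) == 1:
        # One intoItem: every source series goes to the same target device.
        into_items = into_items * len(source_columns)
    elif len(into_items) != len(source_columns):
        raise ValueError("intoItem count must be 1 or equal to the column count")
    return [
        f"{device}.{resolve_measurement(template, source)}"
        for source, (device, template) in zip(source_columns, into_items)
    ]


# Assumed column order for: select s1, s2 ... from root.sg.d1, root.sg.d2
source_columns = ["root.sg.d1.s1", "root.sg.d2.s1", "root.sg.d1.s2", "root.sg.d2.s2"]
targets = match_fixed_devices(
    source_columns,
    [
        ("root.sg_copy.d1", "::"),
        ("root.sg_copy.d2", "s1"),
        ("root.sg_copy.d1", "${3}"),
        ("root.sg_copy.d2", "::"),
    ],
)
print(targets)
```

Under these assumptions the resolved targets are the four series named explicitly in the equivalent statement.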
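Case (align by time): the target device uses variable placeholders and the target measurement list does not.

Limitations: The number of target measurements across all intoItem must be the same as the number of columns in the query result set.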
Matching method: The target measurement is specified for each query time series, and the target device is generated according to the target device placeholder of the intoItem where the corresponding target measurement is located.
Example:
select d1.s1, d1.s2, d2.s3, d3.s4 into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) from root.sg;
Case (align by time): both the target device and the target measurement list use variable placeholders.

Limitations: There is only one intoItem, and the length of its measurement list is 1.
Matching method: Each query time series can get a target time series according to the variable placeholder.
Example:
select * into root.sg_bk.::(::) from root.sg.**;
This statement writes the query results of all time series under root.sg to root.sg_bk; the device name suffix and the measurement remain unchanged.
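A rough Python model of how a single intoItem such as root.sg_bk.::(::) could map each source series to a target series is given below. It rests on an interpretation, not on a documented algorithm: `::` at level k of the target device is assumed to copy the source device's nodes from level k onward and to be the last node of the device template. The function name `resolve_target` is hypothetical, and paths are split naively on ".".

```python
import re

PLACEHOLDER = re.compile(r"\$\{(\d+)\}")


def resolve_target(device_template: str, measurement_template: str, source_path: str) -> str:
    """Build the target series for one source series from placeholder templates."""
    source_nodes = source_path.split(".")
    source_device, source_measurement = source_nodes[:-1], source_nodes[-1]

    def fill(node: str) -> str:
        # Replace every ${i} with the i-th level node of the source path.
        return PLACEHOLDER.sub(lambda m: source_nodes[int(m.group(1))], node)

    target_nodes = []
    for level, node in enumerate(device_template.split(".")):
        if node == "::":
            # Assumption: copy the source device suffix starting at this level.
            target_nodes.extend(source_device[level:])
            break
        target_nodes.append(fill(node))

    if measurement_template == "::":
        target_nodes.append(source_measurement)
    else:
        target_nodes.append(fill(measurement_template))
    return ".".join(target_nodes)


for source in ["root.sg.d1.s1", "root.sg.d2.s3", "root.sg.a.b.s2"]:
    print(source, "->", resolve_target("root.sg_bk.::", "::", source))
```

With this reading, only the second-level node changes from sg to sg_bk, which matches the description that the device name suffix and the measurement stay the same.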
Align by device (using ALIGN BY DEVICE)

Note: The variable placeholder can only describe the correspondence between time series. If the query includes aggregation or expression calculation, the columns in the query result cannot correspond to a specific measurement, so the target measurement cannot use variable placeholders.

Case (align by device): the target device does not use variable placeholders and the target measurement list does.

Limitations: In each intoItem, if the measurement list uses variable placeholders, the length of the list must be 1.
Matching method: Each query time series specifies the target device, and the target measurement is generated from the variable placeholder.
Example:
select s1, s2, s3, s4 into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) from root.sg.d1, root.sg.d2, root.sg.d3 align by device;
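Case (align by device): the target device uses variable placeholders and the target measurement list does not.

Limitations: There is only one intoItem. (If there were multiple intoItem with placeholders, it would be impossible to know which source devices each intoItem should match.)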
Matching method: Each query device obtains a target device according to the variable placeholder, and the target measurement written in each column of the result set under each device is specified by the target measurement list.
Example:
select avg(s1), sum(s2) + sum(s3), count(s4) into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) from root.** align by device;
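Case (align by device): both the target device and the target measurement list use variable placeholders.

Limitations: There is only one intoItem, and the length of the target measurement list is 1.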
Matching method: Each query time series can get a target time series according to the variable placeholder.
Example:
select * into ::(backup_${4}) from root.sg.** align by device;
This statement writes the query result of each time series under root.sg back to the device it came from, adding the prefix backup_ to the measurement name.
We can use the ALIGNED keyword to specify the target device for writing to be aligned, and each intoItem can be set independently.
Example:
select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device;
This statement specifies that root.sg_copy.d1 is an unaligned device and root.sg_copy.d2 is an aligned device.
The following query clauses are not supported:
- SLIMIT, SOFFSET: The query columns are uncertain, so they are not supported.
- LAST, GROUP BY TAGS, DISABLE ALIGN: The table structure is inconsistent with the writing structure, so they are not supported.

ETL the original data and write a new time series.
    IoTDB> SELECT preprocess_udf(s1, s2) INTO ::(preprocessed_s1, preprocessed_s2) FROM root.sg.* ALIGN BY DEVICE;
    +--------------+-------------------+---------------------------+--------+
    | source device|      source column|          target timeseries| written|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d1| preprocess_udf(s1)| root.sg.d1.preprocessed_s1|    8000|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d1| preprocess_udf(s2)| root.sg.d1.preprocessed_s2|   10000|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d2| preprocess_udf(s1)| root.sg.d2.preprocessed_s1|   11000|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d2| preprocess_udf(s2)| root.sg.d2.preprocessed_s2|    9000|
    +--------------+-------------------+---------------------------+--------+
Persistently store the query results, which acts like a materialized view.
    IoTDB> SELECT count(s1), last_value(s1) INTO root.sg.agg_${2}(count_s1, last_value_s1) FROM root.sg1.d1 GROUP BY ([0, 10000), 10ms);
    +--------------------------+-----------------------------+--------+
    |             source column|            target timeseries| written|
    +--------------------------+-----------------------------+--------+
    |      count(root.sg.d1.s1)|      root.sg.agg_d1.count_s1|    1000|
    +--------------------------+-----------------------------+--------+
    | last_value(root.sg.d1.s2)| root.sg.agg_d1.last_value_s2|    1000|
    +--------------------------+-----------------------------+--------+
    Total line number = 2
    It costs 0.115s
Rewrite non-aligned time series into another aligned time series.
Note: It is recommended to use the LIMIT & OFFSET clause or the WHERE clause (time filter) to batch data to prevent excessive data volume in a single operation.
    IoTDB> SELECT s1, s2 INTO ALIGNED root.sg1.aligned_d(s1, s2) FROM root.sg1.non_aligned_d WHERE time >= 0 and time < 10000;
    +--------------------------+----------------------+--------+
    |             source column|     target timeseries| written|
    +--------------------------+----------------------+--------+
    | root.sg1.non_aligned_d.s1| root.sg1.aligned_d.s1|   10000|
    +--------------------------+----------------------+--------+
    | root.sg1.non_aligned_d.s2| root.sg1.aligned_d.s2|   10000|
    +--------------------------+----------------------+--------+
    Total line number = 2
    It costs 0.375s
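One way to follow the batching recommendation is to generate one statement per time window and run them in sequence. The Python sketch below only builds the statement strings; the helper name `batched_select_into`, the overall range of 30000, and the window size are illustrative assumptions, and sending the statements to the server through a client session is left out.

```python
from typing import Iterator


def batched_select_into(
    select_clause: str,
    into_clause: str,
    from_clause: str,
    start: int,
    end: int,
    step: int,
) -> Iterator[str]:
    """Yield one SELECT INTO statement per half-open time window [lo, hi)."""
    if step <= 0:
        raise ValueError("step must be positive")
    lo = start
    while lo < end:
        hi = min(lo + step, end)  # the last window may be shorter than step
        yield (
            f"SELECT {select_clause} INTO {into_clause} FROM {from_clause} "
            f"WHERE time >= {lo} and time < {hi};"
        )
        lo = hi


statements = list(
    batched_select_into(
        select_clause="s1, s2",
        into_clause="ALIGNED root.sg1.aligned_d(s1, s2)",
        from_clause="root.sg1.non_aligned_d",
        start=0,
        end=30000,   # illustrative upper bound
        step=10000,  # illustrative window size
    )
)
for statement in statements:
    print(statement)
```

Because the windows are half-open and contiguous, each timestamp in the range falls into exactly one statement.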
The user must have the following permissions to execute a query write-back statement:
- READ_TIMESERIES permissions for the source series in the select clause.
- INSERT_TIMESERIES permissions for the target series in the into clause.

For more user permissions related content, please refer to Account Management Statements.
select_into_insert_tablet_plan_row_limit: The maximum number of rows that can be processed in one insert-tablet-plan when executing SELECT INTO statements. The default is 10000.