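The SELECT INTO statement copies data from a query result set into target time series.
Typical application scenarios are ETL of the original data into new time series, persistent storage of query results (similar to a materialized view), and rewriting non-aligned time series as aligned time series.
The following is the syntax definition of the SELECT INTO statement: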
    selectIntoStatement
        : SELECT
            resultColumn [, resultColumn] ...
            INTO intoItem [, intoItem] ...
            FROM prefixPath [, prefixPath] ...
            [WHERE whereCondition]
            [GROUP BY groupByTimeClause, groupByLevelClause]
            [FILL {PREVIOUS | LINEAR | constant}]
            [LIMIT rowLimit OFFSET rowOffset]
            [ALIGN BY DEVICE]
        ;

    intoItem
        : [ALIGNED] intoDevicePath '(' intoMeasurementName [',' intoMeasurementName]* ')'
        ;
INTO Clause
The INTO clause consists of several intoItem.
Each intoItem consists of a target device and a list of target measurements (similar to the INTO clause in an INSERT statement).
Each target measurement and device form a target time series, so an intoItem specifies one or more target time series. For example, root.sg_copy.d1(s1, s2) specifies two target time series: root.sg_copy.d1.s1 and root.sg_copy.d1.s2.
The target time series specified by the INTO clause must correspond one-to-one with the columns of the query result set. The specific rules are as follows:
- Aligned by time (default): the number of target time series in all intoItem must be consistent with the number of columns in the query result set (except the time column), and they correspond one-to-one in left-to-right header order.
- Aligned by device (using ALIGN BY DEVICE): the number of target devices specified in all intoItem must be the same as the number of devices queried (i.e., the number of devices matched by the path pattern in the FROM clause), and they correspond one-to-one in the output order of the devices in the result set.
For example:
    IoTDB> select s1, s2 into root.sg_copy.d1(t1), root.sg_copy.d2(t1, t2), root.sg_copy.d1(t2) from root.sg.d1, root.sg.d2;
    +--------------+-------------------+--------+
    | source column|  target timeseries| written|
    +--------------+-------------------+--------+
    | root.sg.d1.s1| root.sg_copy.d1.t1|    8000|
    +--------------+-------------------+--------+
    | root.sg.d2.s1| root.sg_copy.d2.t1|   10000|
    +--------------+-------------------+--------+
    | root.sg.d1.s2| root.sg_copy.d2.t2|   12000|
    +--------------+-------------------+--------+
    | root.sg.d2.s2| root.sg_copy.d1.t2|   10000|
    +--------------+-------------------+--------+
    Total line number = 4
    It costs 0.725s
This statement writes the query results of the four time series under the root.sg database to the four specified time series under the root.sg_copy database. Note that root.sg_copy.d2(t1, t2) can also be written as root.sg_copy.d2(t1), root.sg_copy.d2(t2).
The INTO clause can be written flexibly, as long as the combined target time series contain no duplicates and correspond one-to-one with the query result columns.
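The one-to-one rule for the default (time-aligned) case can be modeled in a few lines of Python. This is only an illustrative sketch, not IoTDB code: the helper names expand_into_items and map_columns_to_targets are hypothetical, and the column order is taken from the CLI output above.

```python
from typing import Dict, List, Tuple

# (target device, list of target measurements); a hypothetical model of intoItem
IntoItem = Tuple[str, List[str]]


def expand_into_items(into_items: List[IntoItem]) -> List[str]:
    """Flatten the intoItems, left to right, into full target time series paths."""
    return [f"{device}.{m}" for device, measurements in into_items for m in measurements]


def map_columns_to_targets(columns: List[str], into_items: List[IntoItem]) -> Dict[str, str]:
    """Pair each result column (time column excluded) with its target time series."""
    targets = expand_into_items(into_items)
    if len(targets) != len(columns):
        raise ValueError("number of target time series must equal number of result columns")
    if len(set(targets)) != len(targets):
        raise ValueError("target time series must not be repeated")
    return dict(zip(columns, targets))


# Result columns in the order shown by the CLI for:
#   select s1, s2 ... from root.sg.d1, root.sg.d2
columns = ["root.sg.d1.s1", "root.sg.d2.s1", "root.sg.d1.s2", "root.sg.d2.s2"]
into_items = [
    ("root.sg_copy.d1", ["t1"]),
    ("root.sg_copy.d2", ["t1", "t2"]),
    ("root.sg_copy.d1", ["t2"]),
]
mapping = map_columns_to_targets(columns, into_items)
for source, target in mapping.items():
    print(source, "->", target)
```

Splitting root.sg_copy.d2(t1, t2) into two intoItems yields the same flattened list, which is why both spellings are interchangeable.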
In the result set displayed by the CLI, the meaning of each column is as follows:
- The source column column represents the column name of the query result.
- target timeseries represents the target time series that the corresponding column is written to.
- written indicates the amount of data expected to be written.
    IoTDB> select count(s1 + s2), last_value(s2) into root.agg.count(s1_add_s2), root.agg.last_value(s2) from root.sg.d1 group by ([0, 100), 10ms);
    +-------------------------------------+-------------------------+--------+
    |                        source column|        target timeseries| written|
    +-------------------------------------+-------------------------+--------+
    | count(root.sg.d1.s1 + root.sg.d1.s2)| root.agg.count.s1_add_s2|      10|
    +-------------------------------------+-------------------------+--------+
    |            last_value(root.sg.d1.s2)|   root.agg.last_value.s2|      10|
    +-------------------------------------+-------------------------+--------+
    Total line number = 2
    It costs 0.375s
This statement stores the results of an aggregated query into the specified time series.
    IoTDB> select s1, s2 into root.sg_copy.d1(t1, t2), root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device;
    +--------------+--------------+-------------------+--------+
    | source device| source column|  target timeseries| written|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d1|            s1| root.sg_copy.d1.t1|    8000|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d1|            s2| root.sg_copy.d1.t2|   11000|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d2|            s1| root.sg_copy.d2.t1|   12000|
    +--------------+--------------+-------------------+--------+
    |    root.sg.d2|            s2| root.sg_copy.d2.t2|    9000|
    +--------------+--------------+-------------------+--------+
    Total line number = 4
    It costs 0.625s
This statement also writes the query results of the four time series under the root.sg database to the four specified time series under the root.sg_copy database. However, with ALIGN BY DEVICE, the number of intoItem must be the same as the number of queried devices, and each queried device corresponds to one intoItem.
When the query is aligned by device, the result set displayed by the CLI has one more column, source device, which indicates the queried device.
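For comparison with the time-aligned case, the device-aligned pairing can be sketched as follows. The function map_by_device is a hypothetical helper, not part of IoTDB. It assumes that the i-th queried device is paired with the i-th intoItem, and that each intoItem lists one target measurement per selected column, as in the example above.

```python
from typing import List, Tuple

# (target device, list of target measurements); a hypothetical model of intoItem
IntoItem = Tuple[str, List[str]]


def map_by_device(
    devices: List[str], selected: List[str], into_items: List[IntoItem]
) -> List[Tuple[str, str, str]]:
    """Return (source device, source column, target time series) rows."""
    if len(into_items) != len(devices):
        raise ValueError("number of intoItems must equal number of queried devices")
    rows = []
    for device, (target_device, measurements) in zip(devices, into_items):
        # Assumption: one target measurement per selected column of this device.
        if len(measurements) != len(selected):
            raise ValueError("each intoItem needs one target measurement per selected column")
        for column, measurement in zip(selected, measurements):
            rows.append((device, column, f"{target_device}.{measurement}"))
    return rows


rows = map_by_device(
    devices=["root.sg.d1", "root.sg.d2"],
    selected=["s1", "s2"],
    into_items=[("root.sg_copy.d1", ["t1", "t2"]), ("root.sg_copy.d2", ["t1", "t2"])],
)
for row in rows:
    print(row)
```

The rows come out in the same order as the source device, source column, and target timeseries columns of the CLI result.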
    IoTDB> select s1 + s2 into root.expr.add(d1s1_d1s2), root.expr.add(d2s1_d2s2) from root.sg.d1, root.sg.d2 align by device;
    +--------------+--------------+------------------------+--------+
    | source device| source column|       target timeseries| written|
    +--------------+--------------+------------------------+--------+
    |    root.sg.d1|       s1 + s2| root.expr.add.d1s1_d1s2|   10000|
    +--------------+--------------+------------------------+--------+
    |    root.sg.d2|       s1 + s2| root.expr.add.d2s1_d2s2|   10000|
    +--------------+--------------+------------------------+--------+
    Total line number = 2
    It costs 0.532s
This statement stores the result of evaluating an expression into the specified time series.
In particular, we can use variable placeholders to describe the correspondence between the target and query time series, which simplifies the statement. The following two variable placeholders are currently supported:
- :: copies the suffix (or measurement) of the query device: from this level to the last level of the device (or the measurement), the node names (or measurement) of the target device are the same as those of the queried device.
- ${i} indicates that the node name at the current level of the target time series is the same as the i-th level node name of the query time series. For example, for the path root.sg1.d1.s1, ${1} means sg1, ${2} means d1, and ${3} means s1.
When using variable placeholders, there must be no ambiguity in the correspondence between intoItem and the columns of the query result set. The specific cases are classified as follows:
Note: The variable placeholder can only describe the correspondence between time series. If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a time series, so neither the target device nor the measurement can use variable placeholders.
Limitations:
- In each intoItem, the length of the target measurement list must be 1. (With a longer list such as root.sg1.d1(::, s1), it is not possible to determine which columns match ::.)
- The number of intoItem is 1, or the same as the number of columns in the query result set. (If there is only one intoItem, all the query time series are written to the same device; if the number of intoItem equals the number of query time series, each query time series specifies a target device; if the number of intoItem is greater than one and less than the number of query time series, a one-to-one correspondence with the query time series is impossible.)
Matching method: Each query time series specifies the target device, and the target measurement is generated from the variable placeholder.
Example:
select s1, s2 into root.sg_copy.d1(::), root.sg_copy.d2(s1), root.sg_copy.d1(${3}), root.sg_copy.d2(::) from root.sg.d1, root.sg.d2;
This statement is equivalent to:
select s1, s2 into root.sg_copy.d1(s1), root.sg_copy.d2(s1), root.sg_copy.d1(s2), root.sg_copy.d2(s2) from root.sg.d1, root.sg.d2;
As you can see, the statement is not very simplified in this case.
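To make the equivalence concrete, here is a minimal Python model of how a measurement placeholder could be resolved against a source path. The function resolve_measurement is hypothetical. It splits paths on "." (so it ignores quoted node names that contain dots) and follows the ${i} numbering described above, where ${1} is the first node after root.

```python
import re

PLACEHOLDER = re.compile(r"\$\{(\d+)\}")


def resolve_measurement(template: str, source_path: str) -> str:
    """Resolve '::' or '${i}' in a target measurement against one source time series."""
    nodes = source_path.split(".")  # ["root", "sg", "d1", "s1"]; nodes[i] is the i-th level
    if template == "::":
        return nodes[-1]  # copy the source measurement unchanged
    return PLACEHOLDER.sub(lambda m: nodes[int(m.group(1))], template)


# Result columns of: select s1, s2 ... from root.sg.d1, root.sg.d2 (aligned by time)
columns = ["root.sg.d1.s1", "root.sg.d2.s1", "root.sg.d1.s2", "root.sg.d2.s2"]
# One (target device, measurement template) pair per column, in INTO order
into_items = [
    ("root.sg_copy.d1", "::"),
    ("root.sg_copy.d2", "s1"),
    ("root.sg_copy.d1", "${3}"),
    ("root.sg_copy.d2", "::"),
]
targets = [
    f"{device}.{resolve_measurement(template, column)}"
    for column, (device, template) in zip(columns, into_items)
]
print(targets)
```

The resolved targets are the four time series named in the equivalent statement.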
Limitations: The number of target measurements in all intoItem is the same as the number of columns in the query result set.
Matching method: The target measurement is specified for each query time series, and the target device is generated from the target device placeholder of the intoItem that contains the corresponding target measurement.
Example:
select d1.s1, d1.s2, d2.s3, d3.s4 into ::(s1_1, s2_2), root.sg.d2_2(s3_3), root.${2}_copy.::(s4) from root.sg;
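A device placeholder can be modeled the same way. The sketch below is one reading of the two placeholder definitions, not IoTDB's implementation, and resolve_device is a hypothetical helper. It assumes :: copies the source device's nodes from the level where it appears to the last device level, that ${i} is the i-th node of the source path, and that paths contain no quoted dots.

```python
import re
from typing import List

PLACEHOLDER = re.compile(r"\$\{(\d+)\}")


def resolve_device(template: str, source_path: str) -> str:
    """Resolve '::' and '${i}' in a target device path against one source time series."""
    source_nodes = source_path.split(".")  # e.g. ["root", "sg", "d3", "s4"]
    device_nodes = source_nodes[:-1]       # source device without the measurement
    resolved: List[str] = []
    for level, node in enumerate(template.split(".")):
        if node == "::":
            # Copy the source device's nodes from this level to its last level.
            resolved.extend(device_nodes[level:])
            break
        resolved.append(PLACEHOLDER.sub(lambda m: source_nodes[int(m.group(1))], node))
    return ".".join(resolved)


columns = ["root.sg.d1.s1", "root.sg.d1.s2", "root.sg.d2.s3", "root.sg.d3.s4"]
# (target device template, target measurement) per column, flattened from the INTO clause
pairs = [
    ("::", "s1_1"),
    ("::", "s2_2"),
    ("root.sg.d2_2", "s3_3"),
    ("root.${2}_copy.::", "s4"),
]
targets = [f"{resolve_device(device, column)}.{m}" for column, (device, m) in zip(columns, pairs)]
print(targets)
```

Under this reading, the statement writes to root.sg.d1.s1_1, root.sg.d1.s2_2, root.sg.d2_2.s3_3, and root.d3_copy.d3.s4.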
Limitations: There is only one intoItem, and the length of its target measurement list is 1.
Matching method: Each query time series can get a target time series according to the variable placeholder.
Example:
select * into root.sg_bk.::(::) from root.sg.**;
This statement writes the query results of all time series under root.sg to root.sg_bk; the device name suffix and the measurement remain unchanged.
Note: The variable placeholder can only describe the correspondence between time series. If the query includes aggregation and expression calculation, the columns in the query result cannot correspond to a specific physical quantity, so the target measurement cannot use variable placeholders.
Limitations: In each intoItem, if the target measurement list uses variable placeholders, the length of the list must be 1.
Matching method: Each query time series specifies the target device, and the target measurement is generated from the variable placeholder.
Example:
select s1, s2, s3, s4 into root.backup_sg.d1(s1, s2, s3, s4), root.backup_sg.d2(::), root.sg.d3(backup_${4}) from root.sg.d1, root.sg.d2, root.sg.d3 align by device;
Limitations: There is only one intoItem. (If there were multiple intoItem with placeholders, it would be impossible to know which source devices each intoItem needs to match.)
Matching method: Each query device obtains a target device according to the variable placeholder, and the target measurement written in each column of the result set under each device is specified by the target measurement list.
Example:
select avg(s1), sum(s2) + sum(s3), count(s4) into root.agg_${2}.::(avg_s1, sum_s2_add_s3, count_s4) from root.** align by device;
Limitations: There is only one intoItem, and the length of the target measurement list is 1.
Matching method: Each query time series can get a target time series according to the variable placeholder.
Example:
select * into ::(backup_${4}) from root.sg.** align by device;
This statement writes the query result of each time series in root.sg to the same device, and adds backup_ before the measurement.
We can use the ALIGNED keyword to specify that the target device is written as an aligned device; this can be set independently for each intoItem.
Example:
select s1, s2 into root.sg_copy.d1(t1, t2), aligned root.sg_copy.d2(t1, t2) from root.sg.d1, root.sg.d2 align by device;
This statement specifies that root.sg_copy.d1 is an unaligned device and root.sg_copy.d2 is an aligned device.
- SLIMIT, SOFFSET: The query columns are uncertain, so they are not supported.
- LAST, GROUP BY TAGS, DISABLE ALIGN: The table structure is inconsistent with the writing structure, so they are not supported.
ETL the original data and write the result to a new time series.
    IoTDB> SELECT preprocess_udf(s1), preprocess_udf(s2) INTO ::(preprocessed_s1, preprocessed_s2) FROM root.sg.* ALIGN BY DEVICE;
    +--------------+-------------------+---------------------------+--------+
    | source device|      source column|          target timeseries| written|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d1| preprocess_udf(s1)| root.sg.d1.preprocessed_s1|    8000|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d1| preprocess_udf(s2)| root.sg.d1.preprocessed_s2|   10000|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d2| preprocess_udf(s1)| root.sg.d2.preprocessed_s1|   11000|
    +--------------+-------------------+---------------------------+--------+
    |    root.sg.d2| preprocess_udf(s2)| root.sg.d2.preprocessed_s2|    9000|
    +--------------+-------------------+---------------------------+--------+
Persistently store the query results, which acts like a materialized view.
    IoTDB> SELECT count(s1), last_value(s2) INTO root.sg.agg_${2}(count_s1, last_value_s2) FROM root.sg.d1 GROUP BY ([0, 10000), 10ms);
    +--------------------------+-----------------------------+--------+
    |             source column|            target timeseries| written|
    +--------------------------+-----------------------------+--------+
    |      count(root.sg.d1.s1)|      root.sg.agg_d1.count_s1|    1000|
    +--------------------------+-----------------------------+--------+
    | last_value(root.sg.d1.s2)| root.sg.agg_d1.last_value_s2|    1000|
    +--------------------------+-----------------------------+--------+
    Total line number = 2
    It costs 0.115s
Rewrite non-aligned time series into another aligned time series.
Note: It is recommended to use the LIMIT & OFFSET clause or the WHERE clause (time filter) to batch the data, to prevent excessive data volume in a single operation.
    IoTDB> SELECT s1, s2 INTO ALIGNED root.sg1.aligned_d(s1, s2) FROM root.sg1.non_aligned_d WHERE time >= 0 and time < 10000;
    +--------------------------+----------------------+--------+
    |             source column|     target timeseries| written|
    +--------------------------+----------------------+--------+
    | root.sg1.non_aligned_d.s1| root.sg1.aligned_d.s1|   10000|
    +--------------------------+----------------------+--------+
    | root.sg1.non_aligned_d.s2| root.sg1.aligned_d.s2|   10000|
    +--------------------------+----------------------+--------+
    Total line number = 2
    It costs 0.375s
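One way to follow the batching advice is to generate one statement per time window from a client script. The sketch below only builds the statement text. The helper batched_statements and the window size are illustrative assumptions, and sending the statements to the server is left to whichever client is in use.

```python
from typing import List


def batched_statements(
    source_device: str,
    target_device: str,
    measurements: List[str],
    start: int,
    end: int,
    step: int,
) -> List[str]:
    """Build one SELECT INTO statement per half-open time window [lo, hi)."""
    if step <= 0:
        raise ValueError("step must be positive")
    cols = ", ".join(measurements)
    statements = []
    for lo in range(start, end, step):
        hi = min(lo + step, end)  # the last window may be shorter
        statements.append(
            f"SELECT {cols} INTO ALIGNED {target_device}({cols}) "
            f"FROM {source_device} WHERE time >= {lo} and time < {hi};"
        )
    return statements


statements = batched_statements(
    "root.sg1.non_aligned_d", "root.sg1.aligned_d", ["s1", "s2"], 0, 10000, 5000
)
for statement in statements:
    print(statement)
```

Because the windows are half-open and contiguous, each timestamp in [start, end) is covered by exactly one statement.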
The user must have the following permissions to execute a query write-back statement:
- READ_TIMESERIES permissions for the source series in the select clause.
- INSERT_TIMESERIES permissions for the target series in the into clause.
For more user permissions related content, please refer to Account Management Statements.
select_into_insert_tablet_plan_row_limit: The maximum number of rows that can be processed in one insert-tablet-plan when executing select-into statements. The default is 10000.