Data Import and Export Scripts

IoTDB provides data import and export scripts (tools/export-data and tools/import-data, supported in versions 1.3.2 and above; for earlier versions, the tools/export-csv and tools/import-csv scripts can be used instead; see the reference documentation for their usage). The scripts exchange data between IoTDB and external files and are suitable for batch operations on a single file or a whole directory.

Supported Data Formats

  • CSV : Plain-text format for storing tabular data; files must be constructed according to the CSV format specified below.
  • SQL : Files containing custom SQL statements.

export-data Script (Data Export)

Command

# Unix/OS X
>tools/export-data.sh  -h <ip> -p <port> -u <username> -pw <password> -t <directory> [-tf <time-format> -datatype <true/false> -q <query command> -s <source sql file> -tfn <target file name> -lpf <int> -type <export type> -aligned <true/false>]

# Windows
>tools\export-data.bat -h <ip> -p <port> -u <username> -pw <password> -t <directory> [-tf <time-format> -datatype <true/false> -q <query command> -s <source sql file> -tfn <target file name> -lpf <int> -type <export type> -aligned <true/false>]

Parameter Introduction:

| Parameter | Definition | Required | Default |
| --- | --- | --- | --- |
| -h | Database IP address | No | 127.0.0.1 |
| -p | Database port | No | 6667 |
| -u | Database connection username | No | root |
| -pw | Database connection password | No | root |
| -t | Output path for the exported CSV or SQL files (in V1.3.2 this parameter is -td) | Yes | |
| -datatype | Whether to print the data type after each time series in the CSV file header; options are true or false | No | true |
| -q | Query statement to execute, given directly on the command line (currently only some statements are supported; see "SQL Statement Support Rules" below for details). Note: exactly one of -q and -s must be provided; if both are given, -q takes effect. | No | |
| -s | SQL file containing one or more SQL statements. Multiple statements are separated by newlines, and each statement produces one or more output CSV or SQL files. Note: exactly one of -q and -s must be provided; if both are given, -q takes effect. | No | |
| -type | Type of the exported files; options are csv or sql | No | csv |
| -tf | Time format, which must comply with the ISO 8601 standard or be timestamp. Note: only effective when -type is csv | No | yyyy-MM-dd HH:mm:ss.SSSz |
| -lpf | Maximum number of lines per exported dump file (in V1.3.2 this parameter is -linesPerFile) | No | 10000 |
| -timeout | Timeout for session queries, in milliseconds | No | -1 |
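
As a quick illustration of these options, the following sketch exports raw numeric timestamps instead of formatted dates and raises the query timeout to 60 seconds. The paths and query are illustrative, and passing the literal value timestamp to -tf is assumed from the parameter description above.

# Unix/OS X
>tools/export-data.sh -t ./data/ -q "select * from root.stock.**" -tf timestamp -timeout 60000
# Windows
>tools/export-data.bat -t ./data/ -q "select * from root.stock.**" -tf timestamp -timeout 60000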

SQL Statement Support Rules:

  1. Only query statements are supported; non-query statements (such as metadata management and system management statements) are not. Unsupported SQL is skipped automatically and an error message is printed.
  2. In the current version, only raw data can be exported: queries containing group by clauses, aggregate functions, UDFs, or arithmetic operators cannot be exported as SQL. When exporting raw data from multiple devices, use the align by device clause (a command sketch using align by device follows the table below). Detailed examples are as follows:
| Query Type | Supported for Export | Example |
| --- | --- | --- |
| Raw data query, single device | Supported | select * from root.s_0.d_0 |
| Raw data query, multiple devices (align by device) | Supported | select * from root.** align by device |
| Raw data query, multiple devices (without align by device) | Unsupported | select * from root.**; select * from root.s_0.* |
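
Because multi-device raw-data queries must use align by device, an export covering several devices would look like the following sketch (the output path is illustrative):

# Unix/OS X
>tools/export-data.sh -t ./data/ -q "select * from root.** align by device"
# Windows
>tools/export-data.bat -t ./data/ -q "select * from root.** align by device"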

Running Examples

  • Export all data returned by a single SQL statement to a CSV file.
  # Unix/OS X
  >tools/export-data.sh -t ./data/ -q 'select * from root.stock.**'
  # Windows
  >tools/export-data.bat -t ./data/ -q 'select * from root.stock.**'
  • Export Results

    Time,root.stock.Legacy.0700HK.L1_BidPrice,root.stock.Legacy.0700HK.Type,root.stock.Legacy.0700HK.L1_BidSize,root.stock.Legacy.0700HK.Domain,root.stock.Legacy.0700HK.L1_BuyNo,root.stock.Legacy.0700HK.L1_AskPrice
    2024-07-29T18:37:18.700+08:00,0.9666617,3.0,0.021367407654674264,-6.0,false,0.8926191
    2024-07-29T18:37:19.701+08:00,0.3057328,3.0,0.9965377284981661,-5.0,false,0.15167356
    
  • Export the data returned by every SQL statement in an SQL file to CSV files.

    # Unix/OS X
    >tools/export-data.sh -t ./data/ -s export.sql
    # Windows
    >tools/export-data.bat -t ./data/ -s export.sql
    
  • Contents of export.sql File (Pointed to by -s Parameter)

    select * from root.stock.** limit 100
    select * from root.db.** limit 100
    
  • Export Result File 1

    Time,root.stock.Legacy.0700HK.L1_BidPrice,root.stock.Legacy.0700HK.Type,root.stock.Legacy.0700HK.L1_BidSize,root.stock.Legacy.0700HK.Domain,root.stock.Legacy.0700HK.L1_BuyNo,root.stock.Legacy.0700HK.L1_AskPrice
    2024-07-29T18:37:18.700+08:00,0.9666617,3.0,0.021367407654674264,-6.0,false,0.8926191
    2024-07-29T18:37:19.701+08:00,0.3057328,3.0,0.9965377284981661,-5.0,false,0.15167356
    
  • Export Result File 2

    Time,root.db.Random.RandomBoolean
    2024-07-22T17:16:05.820+08:00,true
    2024-07-22T17:16:02.597+08:00,false
    
  • Export Data Selected by an SQL File as Aligned SQL INSERT Statements

    # Unix/OS X
    >tools/export-data.sh -h 127.0.0.1 -p 6667 -u root -pw root -t ./data/ -s export.sql -type sql -aligned true
    # Windows
    >tools/export-data.bat -h 127.0.0.1 -p 6667 -u root -pw root -t ./data/ -s export.sql -type sql -aligned true
    
  • Export Results

    INSERT INTO root.stock.Legacy.0700HK(TIMESTAMP,L1_BidPrice,Type,L1_BidSize,Domain,L1_BuyNo,L1_AskPrice) ALIGNED VALUES (1722249629831,0.62308747,2.0,0.012206747854849653,-6.0,false,0.14164352);
    INSERT INTO root.stock.Legacy.0700HK(TIMESTAMP,L1_BidPrice,Type,L1_BidSize,Domain,L1_BuyNo,L1_AskPrice) ALIGNED VALUES (1722249630834,0.7520042,3.0,0.22760657101910464,-5.0,true,0.089064896);
    INSERT INTO root.stock.Legacy.0700HK(TIMESTAMP,L1_BidPrice,Type,L1_BidSize,Domain,L1_BuyNo,L1_AskPrice) ALIGNED VALUES (1722249631835,0.3981064,3.0,0.6254559288663467,-6.0,false,0.9767922);
    
  • Export Data Returned by a Single SQL Statement to a CSV File with a Specified Time Format and Data Types

    # Unix/OS X
    >tools/export-data.sh -t ./data/ -tf 'yyyy-MM-dd HH:mm:ss' -datatype true -q "select * from root.stock.**" -type csv
    # Windows
    >tools/export-data.bat -t ./data/ -tf 'yyyy-MM-dd HH:mm:ss' -datatype true -q "select * from root.stock.**" -type csv
    
  • Export Results

    Time,root.stock.Legacy.0700HK.L1_BidPrice(DOUBLE),root.stock.Legacy.0700HK.Type(DOUBLE),root.stock.Legacy.0700HK.L1_BidSize(DOUBLE),root.stock.Legacy.0700HK.Domain(DOUBLE),root.stock.Legacy.0700HK.L1_BuyNo(BOOLEAN),root.stock.Legacy.0700HK.L1_AskPrice(DOUBLE)
    2024-07-30 10:33:55,0.44574088,3.0,0.21476832811611501,-4.0,true,0.5951748
    2024-07-30 10:33:56,0.6880933,3.0,0.6289119476165305,-5.0,false,0.114634395
    
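  • Split the exported CSV into files of at most 5,000 lines each using -lpf (a sketch; the line limit and path are illustrative, and the number of output files depends on the result size).

    # Unix/OS X
    >tools/export-data.sh -t ./data/ -q "select * from root.stock.**" -lpf 5000
    # Windows
    >tools/export-data.bat -t ./data/ -q "select * from root.stock.**" -lpf 5000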

import-data Script (Data Import)

Import File Examples

CSV File Example

Note the following before importing CSV data:

  1. If a text field contains special characters such as a comma (,), escape them with a backslash (\).
  2. Supported time formats include yyyy-MM-dd'T'HH:mm:ss, yyyy-MM-dd HH:mm:ss, and yyyy-MM-dd'T'HH:mm:ss.SSSZ.
  3. The time column, Time, must always be the first column.

Example 1: Time Aligned, No Data Types in Header

Time,root.test.t1.str,root.test.t2.str,root.test.t2.var
1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
1970-01-01T08:00:00.002+08:00,"123",,

Example 2: Time Aligned, Data Types in Header (text values may be enclosed in double quotes or left unquoted)

Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.var(INT32)
1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
1970-01-01T08:00:00.002+08:00,123,hello world,123
1970-01-01T08:00:00.003+08:00,"123",,
1970-01-01T08:00:00.004+08:00,123,,12

Example 3: Device Aligned, No Data Types in Header

Time,Device,str,var
1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100

Example 4: Device Aligned, Data Types in Header (text values may be enclosed in double quotes or left unquoted)

Time,Device,str(TEXT),var(INT32)
1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100
1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123

SQL File Example

Unsupported SQL, illegal SQL, and SQL statements that fail to execute are written to the failed file in the failed directory (by default, the source file name with the suffix .failed).

INSERT INTO root.stock.Legacy.0700HK(TIMESTAMP,L1_BidPrice,Type,L1_BidSize,Domain,L1_BuyNo,L1_AskPrice) VALUES (1721728578812,0.21911979,4.0,0.7129878488375604,-5.0,false,0.65362453);
INSERT INTO root.stock.Legacy.0700HK(TIMESTAMP,L1_BidPrice,Type,L1_BidSize,Domain,L1_BuyNo,L1_AskPrice) VALUES (1721728579812,0.35814416,3.0,0.04674720094979623,-5.0,false,0.9365247);
INSERT INTO root.stock.Legacy.0700HK(TIMESTAMP,L1_BidPrice,Type,L1_BidSize,Domain,L1_BuyNo,L1_AskPrice) VALUES (1721728580813,0.20012152,3.0,0.9910098187911393,-4.0,true,0.70040536);
INSERT INTO root.stock.Legacy.0700HK(TIMESTAMP,L1_BidPrice,Type,L1_BidSize,Domain,L1_BuyNo,L1_AskPrice) VALUES (1721728581814,0.034122765,4.0,0.9313345284181858,-4.0,true,0.9945297);

Command

# Unix/OS X
>tools/import-data.sh -h <ip> -p <port> -u <username> -pw <password> -s <xxx.csv/sql> [-fd <./failedDirectory> -aligned <true/false> -batch <int> -tp <ms/ns/us> -typeInfer <boolean=text,float=double...> -lpf <int>]

# Windows
>tools\import-data.bat -h <ip> -p <port> -u <username> -pw <password> -s <xxx.csv/sql> [-fd <./failedDirectory> -aligned <true/false> -batch <int> -tp <ms/ns/us> -typeInfer <boolean=text,float=double...> -lpf <int>]

Although IoTDB has the ability to infer types, it is still recommended to create metadata before importing data to avoid unnecessary type conversion errors. For example:

CREATE DATABASE root.fit.d1;
CREATE DATABASE root.fit.d2;
CREATE DATABASE root.fit.p;
CREATE TIMESERIES root.fit.d1.s1 WITH DATATYPE=INT32,ENCODING=RLE;
CREATE TIMESERIES root.fit.d1.s2 WITH DATATYPE=TEXT,ENCODING=PLAIN;
CREATE TIMESERIES root.fit.d2.s1 WITH DATATYPE=INT32,ENCODING=RLE;
CREATE TIMESERIES root.fit.d2.s3 WITH DATATYPE=INT32,ENCODING=RLE;
CREATE TIMESERIES root.fit.p.s1 WITH DATATYPE=INT32,ENCODING=RLE;
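
For illustration only, a CSV file matching the metadata above could look like the following sketch (time-aligned header with data types; the timestamps and values are made up):

Time,root.fit.d1.s1(INT32),root.fit.d1.s2(TEXT),root.fit.d2.s1(INT32),root.fit.d2.s3(INT32),root.fit.p.s1(INT32)
1970-01-01T08:00:00.001+08:00,100,"hello",200,300,400
1970-01-01T08:00:00.002+08:00,500,"world",600,700,800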

Parameter Introduction:

| Parameter | Definition | Required | Default |
| --- | --- | --- | --- |
| -h | Database IP address | No | 127.0.0.1 |
| -p | Database port | No | 6667 |
| -u | Database connection username | No | root |
| -pw | Database connection password | No | root |
| -s | Data to import; may be a single file or a folder. If a folder is specified, all files in it with the .csv or .sql suffix are imported in bulk. (In V1.3.2 this parameter is -f) | Yes | |
| -fd | Directory in which to store the failed files. If not specified, failed files are saved in the directory of the source data. Note: unsupported SQL, illegal SQL, and SQL that fails to execute are written to the failed file in this directory. | No | Source file name with the suffix ".failed" |
| -aligned | Whether to use the aligned interface; options are true or false. Only effective when the imported file is a CSV file. | No | false |
| -batch | Number of data points inserted per batch (minimum 1, maximum Integer.MAX_VALUE). If the program reports org.apache.thrift.transport.TTransportException: Frame size larger than protect max size, adjust this parameter appropriately. | No | 100000 |
| -tp | Time precision; options are ms (milliseconds), ns (nanoseconds), and us (microseconds). | No | ms |
| -lpf | Number of lines written to each failed-import file (in V1.3.2 this parameter is -linesPerFailedFile). | No | 10000 |
| -typeInfer | Type inference rules, in the form srcTsDataType1=dstTsDataType1,srcTsDataType2=dstTsDataType2,... . srcTsDataType can be boolean, int, long, float, double, or NaN; dstTsDataType can be boolean, int, long, float, double, or text. When srcTsDataType is boolean, dstTsDataType can only be boolean or text; when srcTsDataType is NaN, dstTsDataType can only be float, double, or text; when srcTsDataType is numeric, dstTsDataType must have higher precision than srcTsDataType. Example: -typeInfer boolean=text,float=double | No | |

Running Examples

  • Import the dump0_0.sql data in the current data directory into the local IoTDB database.
# Unix/OS X
>tools/import-data.sh -s ./data/dump0_0.sql
# Windows
>tools/import-data.bat -s ./data/dump0_0.sql
  • Import all data in the current data directory in an aligned manner into the local IoTDB database.
# Unix/OS X
>tools/import-data.sh -s ./data/ -fd ./failed/ -aligned true
# Windows
>tools/import-data.bat -s ./data/ -fd ./failed/ -aligned true
  • Import the dump0_0.csv data in the current data directory into the local IoTDB database.
# Unix/OS X
>tools/import-data.sh -s ./data/dump0_0.csv -fd ./failed/
# Windows
>tools/import-data.bat -s ./data/dump0_0.csv -fd ./failed/
  • Import the dump0_0.csv data in the current data directory in an aligned manner, batch import 100,000 records into the IoTDB database on the host with IP 192.168.100.1, record failures in the current failed directory, and limit each file to 1,000 records.
# Unix/OS X
>tools/import-data.sh -h 192.168.100.1 -p 6667 -u root -pw root -s ./data/dump0_0.csv -fd ./failed/ -aligned true -batch 100000 -tp ms -typeInfer boolean=text,float=double -lpf 1000
# Windows
>tools/import-data.bat -h 192.168.100.1 -p 6667 -u root -pw root -s ./data/dump0_0.csv -fd ./failed/ -aligned true -batch 100000 -tp ms -typeInfer boolean=text,float=double -lpf 1000
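  • If the import fails with org.apache.thrift.transport.TTransportException: Frame size larger than protect max size, one workaround is to retry with a smaller batch size; the value below is illustrative.
# Unix/OS X
>tools/import-data.sh -s ./data/dump0_0.csv -fd ./failed/ -batch 10000
# Windows
>tools/import-data.bat -s ./data/dump0_0.csv -fd ./failed/ -batch 10000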