The CSV tool can help you import data in CSV format to IoTDB or export data from IoTDB to a CSV file.
To export data from IoTDB, run the export script:

```shell
# Unix/OS X
> tools/export-csv.sh -h <ip> -p <port> -u <username> -pw <password> -td <directory> [-tf <time-format> -datatype <true/false> -q <query command> -s <sql file> -linesPerFile <int>]

# Windows
> tools\export-csv.bat -h <ip> -p <port> -u <username> -pw <password> -td <directory> [-tf <time-format> -datatype <true/false> -q <query command> -s <sql file> -linesPerFile <int>]
```
Description:
- `-datatype`: whether to print the data type of each time series in the CSV header. If true (the default), the header looks like `Time, root.sg1.d1.s1(INT32), root.sg1.d1.s2(INT64)`; if false, only the time series names are printed, e.g. `Time, root.sg1.d1.s1, root.sg1.d1.s2`.
- `-q <query command>`: the query to execute, e.g. `select * from root.** limit 100` or `select * from root.** limit 100 align by device`.
- `-s <sql file>`: a SQL file containing one or more statements, one statement per line; each statement is executed and its result exported.
- `-td <directory>`: the directory to which the exported CSV files are written.
- `-tf <time-format>`: the time format of the Time column; it should follow the ISO 8601 standard, or use `-tf timestamp` to export raw timestamps. Examples: `-tf yyyy-MM-dd\ HH:mm:ss` or `-tf timestamp`.
- `-linesPerFile <int>`: the maximum number of lines in each exported file; `10000` is the default. Example: `-linesPerFile 1`.
- `-t <timeout>`: the timeout of the session query, in milliseconds.

If you use neither `-s` nor `-q`, you will be prompted to enter queries after the export script starts; the results of different queries are saved to different CSV files.
Examples:

```shell
# Unix/OS X
> tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./
# Or
> tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss
# Or
> tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -q "select * from root.**"
# Or
> tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -s sql.txt
# Or
> tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt
# Or
> tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10
# Or
> tools/export-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10 -t 10000

# Windows
> tools\export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./
# Or
> tools\export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss
# Or
> tools\export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -q "select * from root.**"
# Or
> tools\export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -s sql.txt
# Or
> tools\export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt
# Or
> tools\export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10
# Or
> tools\export-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -td ./ -tf yyyy-MM-dd\ HH:mm:ss -s sql.txt -linesPerFile 10 -t 10000
```
A sample SQL file for the `-s` option (for example, `sql.txt`) looks like this:

```sql
select * from root.**;
select * from root.** align by device;
```
The result of `select * from root.**`:

```
Time,root.ln.wf04.wt04.status(BOOLEAN),root.ln.wf03.wt03.hardware(TEXT),root.ln.wf02.wt02.status(BOOLEAN),root.ln.wf02.wt02.hardware(TEXT),root.ln.wf01.wt01.hardware(TEXT),root.ln.wf01.wt01.status(BOOLEAN)
1970-01-01T08:00:00.001+08:00,true,"v1",true,"v1",v1,true
1970-01-01T08:00:00.002+08:00,true,"v1",,,,true
```
The result of `select * from root.** align by device`:

```
Time,Device,hardware(TEXT),status(BOOLEAN)
1970-01-01T08:00:00.001+08:00,root.ln.wf01.wt01,"v1",true
1970-01-01T08:00:00.002+08:00,root.ln.wf01.wt01,,true
1970-01-01T08:00:00.001+08:00,root.ln.wf02.wt02,"v1",true
1970-01-01T08:00:00.001+08:00,root.ln.wf03.wt03,"v1",
1970-01-01T08:00:00.002+08:00,root.ln.wf03.wt03,"v1",
1970-01-01T08:00:00.001+08:00,root.ln.wf04.wt04,,true
1970-01-01T08:00:00.002+08:00,root.ln.wf04.wt04,,true
```
Boolean values are written as `true` and `false` without double quotes, while text values are enclosed in double quotes.
Note that if fields exported by the export tool contain the following special characters:

1. `,`: the field will be escaped by `\`.

To import CSV data, you can optionally create the target metadata (database and time series) in advance, for example:

```sql
CREATE DATABASE root.fit.d1;
CREATE DATABASE root.fit.d2;
CREATE DATABASE root.fit.p;
CREATE TIMESERIES root.fit.d1.s1 WITH DATATYPE=INT32,ENCODING=RLE;
CREATE TIMESERIES root.fit.d1.s2 WITH DATATYPE=TEXT,ENCODING=PLAIN;
CREATE TIMESERIES root.fit.d2.s1 WITH DATATYPE=INT32,ENCODING=RLE;
CREATE TIMESERIES root.fit.d2.s3 WITH DATATYPE=INT32,ENCODING=RLE;
CREATE TIMESERIES root.fit.p.s1 WITH DATATYPE=INT32,ENCODING=RLE;
```
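If you prefer to create this metadata from a script rather than an interactive session, one option is the IoTDB CLI. A minimal sketch, assuming the CLI lives at `sbin/start-cli.sh` in your installation and that its `-e` option (execute a statement and exit) is available in your version:

```shell
# Create part of the metadata above non-interactively via the IoTDB CLI.
# The sbin/start-cli.sh path and the -e option are assumptions about your installation.
> sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -e "CREATE DATABASE root.fit.d1"
> sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -e "CREATE TIMESERIES root.fit.d1.s1 WITH DATATYPE=INT32,ENCODING=RLE"
```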
IoTDB supports type inference, so it is not strictly necessary to create metadata before importing data. However, we still recommend creating metadata before importing with the CSV tool, as this avoids unnecessary type conversion errors.
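Before running an import, you can check which time series already exist under the target prefix. `SHOW TIMESERIES` with a path pattern is standard IoTDB SQL; the exact output columns depend on your version:

```sql
-- List the time series under root.fit to confirm the metadata is in place.
SHOW TIMESERIES root.fit.**;
```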
Sample CSV files to be imported are shown below. First, data aligned by time, with headers that do not include data types:
```
Time,root.test.t1.str,root.test.t2.str,root.test.t2.int
1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
1970-01-01T08:00:00.002+08:00,"123",,
```
Data aligned by time, with headers that include data types (text values may be written with or without double quotes):
```
Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.int(INT32)
1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
1970-01-01T08:00:00.002+08:00,123,hello world,123
1970-01-01T08:00:00.003+08:00,"123",,
1970-01-01T08:00:00.004+08:00,123,,12
```
Data aligned by device, with headers that do not include data types:
```
Time,Device,str,int
1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
1970-01-01T08:00:00.002+08:00,root.test.t1,"123",
1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100
```
Data aligned by device, with headers that include data types (text values may be written with or without double quotes):
```
Time,Device,str(TEXT),int(INT32)
1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world",
1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123
1970-01-01T08:00:00.003+08:00,root.test.t1,,123
```
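To try the import commands shown below against real input, you can save one of the samples above to a file first. A minimal sketch for a Unix shell, using the file name `example-filename.csv` that appears in the examples that follow (the quoted heredoc keeps the escaped comma literal):

```shell
# Write the time-aligned sample (headers with data types) to a CSV file.
# The quoted 'EOF' prevents the shell from interpreting the backslash in \,.
> cat > example-filename.csv <<'EOF'
Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.int(INT32)
1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100
1970-01-01T08:00:00.002+08:00,123,hello world,123
EOF
```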
The CSV files can then be imported with the import script:

```shell
# Unix/OS X
> tools/import-csv.sh -h <ip> -p <port> -u <username> -pw <password> -f <xxx.csv> [-fd <./failedDirectory>] [-aligned <true>] [-tp <ms/ns/us>] [-typeInfer <boolean=text,float=double...>]

# Windows
> tools\import-csv.bat -h <ip> -p <port> -u <username> -pw <password> -f <xxx.csv> [-fd <./failedDirectory>] [-aligned <true>] [-tp <ms/ns/us>] [-typeInfer <boolean=text,float=double...>]
```
Description:
- `-f`: the CSV file to import. Example: `-f filename.csv`.
- `-fd`: the directory in which to save the lines that fail to import; the failed files end with the suffix `.failed`. Example: `-fd ./failed/`.
- `-aligned`: whether to import the data as aligned time series; `false` is the default. Example: `-aligned true`.
- `-batch`: the number of points written per batch. If the program throws `org.apache.thrift.transport.TTransportException: Frame size larger than protect max size`, you can lower this parameter as appropriate. Example: `-batch 100000`; `100000` is the default value.
- `-tp <time-precision>`: the time precision: `ms` (millisecond), `ns` (nanosecond), or `us` (microsecond); `ms` is the default.
- `-typeInfer <srcTsDataType1=dstTsDataType1,srcTsDataType2=dstTsDataType2,...>`: rules for type inference. `srcTsDataType` can be `boolean`, `int`, `long`, `float`, `double`, or `NaN`; `dstTsDataType` can be `boolean`, `int`, `long`, `float`, `double`, or `text`. When `srcTsDataType` is `boolean`, `dstTsDataType` must be `boolean` or `text`. When `srcTsDataType` is `NaN`, `dstTsDataType` must be `float`, `double`, or `text`. When `srcTsDataType` is a numeric type, the precision of `dstTsDataType` must be greater than that of `srcTsDataType`. Example: `-typeInfer boolean=text,float=double`.
- `-linesPerFailedFile <int>`: the maximum number of lines in each failed file; `10000` is the default. Example: `-linesPerFailedFile 1`.

Examples:

```shell
# Unix/OS X
> tools/import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed
# or
> tools/import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed -tp ns
# or
> tools/import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed -tp ns -typeInfer boolean=text,float=double
# or
> tools/import-csv.sh -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd ./failed -tp ns -typeInfer boolean=text,float=double -linesPerFailedFile 10

# Windows
> tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv
# or
> tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed
# or
> tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed -tp ns
# or
> tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed -tp ns -typeInfer boolean=text,float=double
# or
> tools\import-csv.bat -h 127.0.0.1 -p 6667 -u root -pw root -f example-filename.csv -fd .\failed -tp ns -typeInfer boolean=text,float=double -linesPerFailedFile 10
```
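After an import finishes, it is worth confirming that the rows actually reached IoTDB. A minimal sketch, assuming the imported file contains the `root.test` series from the samples above and that the CLI at `sbin/start-cli.sh` with its `-e` option is available in your installation:

```shell
# Query the imported series to spot-check the result of the import.
> sbin/start-cli.sh -h 127.0.0.1 -p 6667 -u root -pw root -e "select * from root.test.**"
```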
Note the following points before importing:

1. `,`: fields containing `,` should be escaped by `\`.
2. Supported time formats include `yyyy-MM-dd'T'HH:mm:ss`, `yyyy-MM-dd HH:mm:ss`, and `yyyy-MM-dd'T'HH:mm:ss.SSSZ`.
3. The `Time` column must be the first column.
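As an illustration of these rules, the following hypothetical file keeps `Time` as the first column, escapes the comma inside a text value with a backslash, and uses the `yyyy-MM-dd'T'HH:mm:ss` time format (the `root.test` paths simply reuse the samples above):

```
Time,root.test.t2.str(TEXT),root.test.t2.int(INT32)
2023-01-01T00:00:00,"value\, with comma",1
2023-01-01T00:00:01,"plain value",2
```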