IoTDB currently supports importing data in CSV, SQL, and TsFile (IoTDB's underlying open-time series file format) into the database. The specific functionalities are as follows:
# Unix/OS X >tools/import-data.sh -h <ip> -p <port> -u <username> -pw <password> -s <xxx.csv/sql> [-fd <./failedDirectory> -aligned <true/false> -batch <int> -tp <ms/ns/us> -typeInfer <boolean=text,float=double...> -lpf <int>] # Windows >tools\import-data.bat -h <ip> -p <port> -u <username> -pw <password> -s <xxx.csv/sql> [-fd <./failedDirectory> -aligned <true/false> -batch <int> -tp <ms/ns/us> -typeInfer <boolean=text,float=double...> -lpf <int>]
Parameter | Definition | Required | Default |
---|---|---|---|
-h | Hostname | No | 127.0.0.1 |
-p | Port | No | 6667 |
-u | Username | No | root |
-pw | Password | No | root |
-s | Specify the data to be imported, here you can specify files or folders. If a folder is specified, all files with suffixes of csv or sql in the folder will be batch imported (In V1.3.2, the parameter is -f ) | Yes | |
-fd | Specify the directory for storing failed SQL files. If this parameter is not specified, failed files will be saved in the source data directory. Note: For unsupported SQL, illegal SQL, and failed SQL, they will be put into the failed directory under the failed file (default is the file name with .failed suffix) | No | The source filename with .failed suffix |
-aligned | Specify whether to use the aligned interface, options are true or false. Note: This parameter is only effective when importing csv files. | No | false |
-batch | Used to specify the number of data points per batch (minimum value is 1, maximum value is Integer.MAX_VALUE). If the program reports the error org.apache.thrift.transport.TTransportException: Frame size larger than protect max size , you can appropriately reduce this parameter. | No | 100000 |
-tp | Specify the time precision, options include ms (milliseconds), ns (nanoseconds), us (microseconds) | No | ms |
-lpf | Specify the number of data lines written per failed file (In V1.3.2, the parameter is -linesPerFailedFile ) | No | 10000 |
-typeInfer | Used to specify type inference rules, such as <srcTsDataType1=dstTsDataType1,srcTsDataType2=dstTsDataType2,...>. Note: Used to specify type inference rules. srcTsDataType includes boolean , int , long , float , double , NaN . dstTsDataType includes boolean , int , long , float , double , text . When srcTsDataType is boolean , dstTsDataType can only be boolean or text . When srcTsDataType is NaN , dstTsDataType can only be float , double , or text . When srcTsDataType is a numerical type, the precision of dstTsDataType needs to be higher than srcTsDataType . For example: -typeInfer boolean=text,float=double | No |
dump0_0.sql
data in the current data
directory to the local IoTDB database.# Unix/OS X >tools/import-data.sh -s ./data/dump0_0.sql # Windows >tools/import-data.bat -s ./data/dump0_0.sql
data
directory in an aligned manner to the local IoTDB database.# Unix/OS X >tools/import-data.sh -s ./data/ -fd ./failed/ -aligned true # Windows >tools/import-data.bat -s ./data/ -fd ./failed/ -aligned true
dump0_0.csv
data in the current data
directory to the local IoTDB database.# Unix/OS X >tools/import-data.sh -s ./data/dump0_0.csv -fd ./failed/ # Windows >tools/import-data.bat -s ./data/dump0_0.csv -fd ./failed/
dump0_0.csv
data in the current data
directory in an aligned manner, batch import 100000 lines to the IoTDB database on the host with IP 192.168.100.1
, record failures in the current failed
directory, with a maximum of 1000 lines per file.# Unix/OS X >tools/import-data.sh -h 192.168.100.1 -p 6667 -u root -pw root -s ./data/dump0_0.csv -fd ./failed/ -aligned true -batch 100000 -tp ms -typeInfer boolean=text,float=double -lpf 1000 # Windows >tools/import-data.bat -h 192.168.100.1 -p 6667 -u root -pw root -s ./data/dump0_0.csv -fd ./failed/ -aligned true -batch 100000 -tp ms -typeInfer boolean=text,float=double -lpf 1000
-- Headers without data types Time,root.test.t1.str,root.test.t2.str,root.test.t2.var 1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100 1970-01-01T08:00:00.002+08:00,"123",, -- Headers with data types (Text-type data supports both quoted and unquoted formats) Time,root.test.t1.str(TEXT),root.test.t2.str(TEXT),root.test.t2.var(INT32) 1970-01-01T08:00:00.001+08:00,"123hello world","123\,abc",100 1970-01-01T08:00:00.002+08:00,123,hello world,123 1970-01-01T08:00:00.003+08:00,"123",, 1970-01-01T08:00:00.004+08:00,123,,12
-- Headers without data types Time,Device,str,var 1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world", 1970-01-01T08:00:00.002+08:00,root.test.t1,"123", 1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100 -- Headers with data types (Text-type data supports both quoted and unquoted formats) Time,Device,str(TEXT),var(INT32) 1970-01-01T08:00:00.001+08:00,root.test.t1,"123hello world", 1970-01-01T08:00:00.002+08:00,root.test.t1,"123", 1970-01-01T08:00:00.001+08:00,root.test.t2,"123\,abc",100 1970-01-01T08:00:00.002+08:00,root.test.t1,hello world,123
# Unix/OS X >tools/load-tsfile.sh -h <ip> -p <port> -u <username> -pw <password> -s <source> -os <on_success> [-sd <success_dir>] -of <on_fail> [-fd <fail_dir>] [-tn <thread_num>] # Windows >tools\load-tsfile.bat -h <ip> -p <port> -u <username> -pw <password> -s <source> -os <on_success> [-sd <success_dir>] -of <on_fail> [-fd <fail_dir>] [-tn <thread_num>]
Parameter | Description | Required | Default |
---|---|---|---|
-h | Hostname | No | root |
-p | Port | No | root |
-u | Username | No | 127.0.0.1 |
-pw | Password | No | 6667 |
-s | The local directory path of the script file (folder) to be loaded | Yes | |
-os | none: Do not delete mv: Move successful files to the target folder cp: Hard link (copy) successful files to the target folder delete: Delete | Yes | |
-sd | When --on_success is mv or cp, the target folder for mv or cp. The file name of the file becomes the folder flattened and then concatenated with the original file name. | When --on_success is mv or cp, it is required to fill in Yes | ${EXEC_DIR}/success |
-of | none: Skip mv: Move failed files to the target folder cp: Hard link (copy) failed files to the target folder delete: Delete | Yes | |
-fd | When --on_fail is specified as mv or cp, the target folder for mv or cp. The file name of the file becomes the folder flattened and then concatenated with the original file name. | When --on_fail is specified as mv or cp, it is required to fill in | ${EXEC_DIR}/fail |
-tn | Maximum number of parallel threads | Yes | 8 |
# Unix/OS X > tools/load-tsfile.sh -h 127.0.0.1 -p 6667 -u root -pw root -s /path/sql -os delete -of delete -tn 8 > tools/load-tsfile.sh -h 127.0.0.1 -p 6667 -u root -pw root -s /path/sql -os mv -of cp -sd /path/success/dir -fd /path/failure/dir -tn 8 # Windows > tools/load_data.bat -h 127.0.0.1 -p 6667 -u root -pw root -s /path/sql -os mv -of cp -sd /path/success/dir -fd /path/failure/dir -tn 8 > tools/load_data.bat -h 127.0.0.1 -p 6667 -u root -pw root -s /path/sql -os delete -of delete -tn 8
The TsFile Active Listening & Loading Feature can actively monitor TsFile file changes in the specified target path (configured by the user) and automatically synchronize TsFile files from the target path to the specified reception path (configured by the user). Through this feature, IoTDB can automatically detect and load these files without the need for any additional manual loading operations. This automated process not only simplifies the user's operational steps but also reduces potential errors that may occur during the operation, effectively reducing the complexity for users during the usage process.
You can enable the TsFile Active Listening & Loading Feature by finding the following parameters in the configuration file template iotdb-system.properties.template
and adding them to the IoTDB configuration file iotdb-system.properties
. The complete configuration is as follows:
Configuration Parameter | Description | Value Range | Required | Default Value | Loading Method |
---|---|---|---|---|---|
load_active_listening_enable | Whether to enable the DataNode's active listening and loading of tsfile functionality (default is enabled). | Boolean: true,false | Optional | true | Hot Loading |
load_active_listening_dirs | The directories to be listened to (automatically includes subdirectories of the directory), if there are multiple, separate with “,”. The default directory is ext/load/pending (supports hot loading). | String: one or more file directories | Optional | ext/load/pending | Hot Loading |
load_active_listening_fail_dir | The directory to which files are transferred after the execution of loading tsfile files fails, only one directory can be configured. | String: one file directory | Optional | ext/load/failed | Hot Loading |
load_active_listening_max_thread_num | The maximum number of threads to perform loading tsfile tasks simultaneously. The default value when the parameter is commented out is max(1, CPU core count / 2). When the user sets a value not in the range [1, CPU core count / 2], it will be set to the default value (1, CPU core count / 2). | Long: [1, Long.MAX_VALUE] | Optional | max(1, CPU core count / 2) | Effective after restart |
load_active_listening_check_interval_seconds | Active listening polling interval in seconds. The function of actively listening to tsfile is achieved by polling the folder. This configuration specifies the time interval between two checks of load_active_listening_dirs, and the next check will be executed after load_active_listening_check_interval_seconds seconds of each check. When the user sets the polling interval to less than 1, it will be set to the default value of 5 seconds. | Long: [1, Long.MAX_VALUE] | Optional | 5 | Effective after restart |
FUNCTION moveFilesToListeningDirectory(sourceDirectory, listeningDirectory) // Move mods files modsFiles = searchFiles(sourceDirectory, "*mods*") IF modsFiles IS NOT EMPTY FOR EACH file IN modsFiles MOVE(file, listeningDirectory) END FOR END IF // Move tsfile files tsfileFiles = searchFiles(sourceDirectory, "*tsfile*") IF tsfileFiles IS NOT EMPTY FOR EACH file IN tsfileFiles MOVE(file, listeningDirectory) END FOR END IF END FUNCTION FUNCTION searchFiles(directory, pattern) matchedFiles = [] FOR EACH file IN directory.files IF file.name MATCHES pattern APPEND file TO matchedFiles END IF END FOR RETURN matchedFiles END FUNCTION FUNCTION MOVE(sourceFile, targetDirectory) // Implement the logic of moving files from sourceFile to targetDirectory END FUNCTION
Prohibit setting the receiver directory of Pipe, the data directory for storing data, etc., as the listening directory.
Prohibit load_active_listening_fail_dir
from having the same directory as load_active_listening_dirs
, or each other's nesting.
Ensure that the load_active_listening_dirs
directory has sufficient permissions. After the load is successful, the files will be deleted. If there is no delete permission, it will lead to repeated loading.