blob: 97858fde5dfa296893c2fdd701841a3b317fdcbc [file] [log] [blame] [view]
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
[English](./README.md) | [Chinese](./README-zh.md)
# TsFile Tools Manual
## Introduction
## Development
### Prerequisites
To build the Java version of TsFile Tools, you must have the following dependencies installed:
1. Java >= 1.8 (1.8, 11 to 17 are verified. Make sure the environment variable is set).
2. Maven >= 3.6 (if you are compiling TsFile from source).
### Build with Maven
```sh
mvn clean package -P with-java -DskipTests
```
### Install to local machine
```
mvn install -P with-java -DskipTests
```
## schema 定义
| Parameter | Description | Required | Default |
|----------------|--------------------------|----------|------|
| table_name | Table name | Yes | |
| time_precision | Time precision (options: ms/us/ns) | No | ms |
| has_header | Whether it contains a header (options: true/false) | No | true |
| separator | Delimiter (options: , /tab/ ;) | No | , |
| null_format | Null value | No | |
| id_columns | Primary key columns, supports columns not in the CSV as hierarchy | No | |
| time_column | Time column | Yes | |
| csv_columns | Corresponding columns in the CSV in order | Yes | |
Explanation:
The "id_columns" sets values in order and supports using columns that do not exist in the CSV file as levels.
For example, if the CSV file has only five columns: "a", "b", "c", "d", and "time",
id_columns
a1 default aa
a
Among them, a1 is not in the CSV column and is a virtual column with a default value of aa
The content after csv_columns is the definition of the value column, with the first field in each row being the measurement point name in tsfile and the second field being the type
When a column in CSV does not need to be written to tsfile, it can be set to SKIP.
Example:
csv_columns
Region TEXT,
Factory Number TEXT,
Device Number TEXT,
SKIP,
SKIP,
Time INT64,
Temperature FLOAT,
Emission DOUBLE,
Data Example
CSV file content:
### sample data
CSV file content:
```
Region,FactoryNumber,DeviceNumber,Model,MaintenanceCycle,Time,Temperature,Emission
hebei,1001, 1,10,1,1,80.0,1000.0
hebei,1001,1,10,1,4,80.0,1000.0
hebei,1002,7,5,2,1,90.0,1200.0
```
Schema definition
```
table_name=root.db1
time_precision=ms
has_header=true
separator=,
null_format=\N
id_columns
Group DEFAULT Datang
Region
FactoryNumber
DeviceNumber
time_column=Time
csv_columns
RegionTEXT,
FactoryNumber TEXT,
DeviceNumber TEXT,
SKIP,
SKIP,
Time INT64,
Temperature FLOAT,
Emission DOUBLE,
```
## Commands
```
csv2tsfile.sh --source ./xxx/xxx --target /xxx/xxx --fail_dir /xxx/xxx
csv2tsfile.bat --source ./xxx/xxx --target /xxx/xxx --fail_dir /xxx/xxx
```