commit: 324acba9b9f4c844577a5c8d9ef20f9ee24bf654
[log]
author: Jackie Tien <jackietien97@gmail.com>
Wed Jun 03 17:10:56 2026 +0800
committer: GitHub <noreply@github.com>
Wed Jun 03 17:10:56 2026 +0800
tree: c1543145231f20fcaf850086ec928c288e534651
parent: 86ec4b992b8b14e73daed915b90c8da8528a9213 [diff]

Optimize get_timeseries_metadata() to eliminate N+1 I/O (~94x speedup) (#815)

* optimize get_timeseries_metadata() to eliminate N+1 I/O

- Add get_device_timeseries_meta_by_offset() that accepts pre-resolved
  offsets, skipping redundant load_device_index_entry() tree search and
  reusing the deserialized measurement index node via get_all_leaf()
  instead of re-reading the same bytes through
  load_all_measurement_index_entry()
- Add get_all_device_entries() to collect device IDs with their
  (start_offset, end_offset) in a single index tree traversal
- Rewrite get_timeseries_metadata() to use the above two methods
- Remove redundant PageArena::init() call in get_timeseries_metadata_impl

* fix int64_t truncation and error handling in get_all_device_entries

* optimize queryByRow with exact tag filter via B-tree direct lookup

When DeviceMetaIterator receives a single TagEq filter (exact match on
one tag column), construct the target device ID and use
load_device_index_entry() for O(log N) B-tree binary search instead of
traversing all internal nodes and scanning all leaf devices.

For ecg_dataset (53040 devices), this reduces queryByRow's
DeviceTaskIterator::next() from ~117ms to ~0.5ms per query (~230x).

Also adds bench_read.cpp for standalone C++ read path benchmarking
and makes load_device_index_entry() public for reuse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: clang-format

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Updae C++ CLAUDE.md to persist format way

* fix direct lookup for multi-tag-column devices

The TagEq direct B-tree lookup constructed a device ID with only
col_idx_+1 segments, but devices with multiple tag columns have more.
The segment count mismatch caused operator== to always return false,
so queries returned 0 rows. Guard the optimization to only activate
when the filter fully specifies the device ID (single tag column).

* add unit tests for timeseries metadata and direct B-tree lookup

- GetTimeseriesMetadataTableModel: test get_timeseries_metadata()
  with table model, covering get_all_device_entries (LEAF_DEVICE
  path) and get_device_timeseries_meta_by_offset
- GetTimeseriesMetadataMultiTable: test with two tables, covering
  the multi-table iteration in get_timeseries_metadata
- DirectLookupSingleTagColumn: test the TagEq direct B-tree lookup
  optimization with a single-tag-column table, covering
  try_setup_direct_lookup and load_results_direct
- DirectLookupNonExistDevice: test direct lookup returns 0 rows
  when the device does not exist

* delete examples/bench_read.cpp

* fix direct lookup guard for multi-tag-column tables

The previous guard (actual_segment_count != eq->col_idx_ + 1) still
allowed direct lookup when filtering on the last tag column in a
multi-tag table (e.g., 2 tags: segment_count=3, col_idx=2, 2+1=3).
The constructed device ID had empty segments for unfiltered tags,
causing B-tree lookup to find nothing. Restrict direct lookup to
single-tag tables only (segment_count == 2).

* style: clang-format

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

7 files changed

tree: c1543145231f20fcaf850086ec928c288e534651

README.md

English | 中文

TsFile Document

Introduction

TsFile is a columnar storage file format designed for time series data, which supports efficient compression, high throughput of read and write, and compatibility with various frameworks, such as Spark and Flink. It is easy to integrate TsFile into IoT big data processing frameworks.

Time series data is becoming increasingly important in a wide range of applications, including IoT, intelligent control, finance, log analysis, and monitoring systems.

TsFile is the first existing standard file format for time series data. Despite the widespread presence and significance of temporal data, there has been a longstanding absence of standardized file formats for its management. The advent of TsFile introduces a unified file format to facilitate users in managing temporal data.

Click for More Information

TsFile Features

TsFile offers several distinctive features and benefits:

Multi Language Independent Use: Multiple language SDK can be used to directly read and write TsFile, making it possible for some lightweight data reading and writing scenarios.
Efficient Writing and Compression: A column storage format tailored for time series, organizing data by device and ensuring continuous storage of data for each sequence, minimizing storage space. Compared to CSV, the compression ratio can be increased by more than 90%.
High Query Performance: By indexing devices, measurement, and time dimensions, TsFile implements fast filtering and querying of temporal data based on specific time ranges. Compared to general file formats, query throughput can be increased by 2-10 times.
Open Integration: TsFile is the underlying storage file format of the temporal database IoTDB, which can form a pluggable storage computing separation architecture with IoTDB. TsFile supports compatibility with Spark Flink and other big data software establish seamless ecosystem integration to ensure compatibility and interoperability across different data processing environments, and achieve deep analysis of temporal data across ecosystems.

TsFile Basic Concepts

TsFile can manage the time series data of multiple devices. Each device can have different measurement.

Each measurement of each device corresponds to a time series.

The TsFile Scheme defines a set of measurement for all devices, as shown in the table below (m1~m5)

Time	deviceId	m1	m2	m3	m4	m5
1	device1	1	2	3
2	device1	1	2	3
3	device2	1		3	4	5
4	device2	1		3	4	5
5	device3	1	2	3	4	5

Among them, Time and deviceId are built-in fields that do not need to be defined and can be written directly.

TsFile Design

File Structure

TsFile adopts a columnar storage design, similar to other file formats, primarily to optimize time-series data's storage efficiency and query performance. This design aligns with the nature of time series data, which often involves large volumes of similar data types recorded over time. However, TsFile was developed particularly with a structure of page, chunk, chunk group, and index:

Page: The basic unit for storing time series data, sorted by time in ascending order with separate columns for timestamps and values.
Chunk: Comprising metadata headers and several pages, each chunk belongs to one time series, with variable sizes allowing for different compression and encoding methods.
Chunk Group: Multiple chunks within a chunk group belong to one or multiple series of a device written in the same period, facilitating efficient query processing.
Index: The file metadata at the end of TsFile contains a chunk-level index and file-level statistics for efficient data access.

TsFile Architecture

Encoding and Compression

TsFile employs advanced encoding and compression techniques to optimize storage and access for time series data. It uses methods like run-length encoding (RLE), bit-packing, and Snappy for efficient compression, allowing separate encoding of timestamp and value columns for better data processing. Its unique encoding algorithms are designed specifically for the characteristics of time series data in IoT scenarios, focusing on regular time intervals and the correlation among series.

Its uniqueness lies in the encoding algorithm designed specifically for time series data characteristics, focusing on the correlation between time attributes and data.

The table below compares 3 file formats in different dimensions.

TsFile, CSV and Parquet in Comparison

Dimension	TsFile	CSV	Parquet
Data Model	IoT	Plain	Nested
Write Mode	Tablet, Line	Line	Line
Compression	Yes	No	Yes
Read Mode	Query, Scan	Scan	Query
Index on Series	Yes	No	No
Index on Time	Yes	No	No

Its development facilitates efficient data encoding, compression, and access, reflecting a deep understanding of industry needs, pioneering a path toward efficient, scalable, and flexible data analytics platforms.

Data Type	Recommended Encoding	Recommended Compression
INT32	TS_2DIFF	LZ4
INT64	TS_2DIFF	LZ4
FLOAT	GORILLA	LZ4
DOUBLE	GORILLA	LZ4
BOOLEAN	RLE	LZ4
TEXT	DICTIONARY	LZ4

more see Docs

Build and Use TsFile

Java

C++

Python