Optimize get_timeseries_metadata() to eliminate N+1 I/O (~94x speedup) (#815)

* optimize get_timeseries_metadata() to eliminate N+1 I/O

- Add get_device_timeseries_meta_by_offset() that accepts pre-resolved
  offsets, skipping redundant load_device_index_entry() tree search and
  reusing the deserialized measurement index node via get_all_leaf()
  instead of re-reading the same bytes through
  load_all_measurement_index_entry()
- Add get_all_device_entries() to collect device IDs with their
  (start_offset, end_offset) in a single index tree traversal
- Rewrite get_timeseries_metadata() to use the above two methods
- Remove redundant PageArena::init() call in get_timeseries_metadata_impl

* fix int64_t truncation and error handling in get_all_device_entries

* optimize queryByRow with exact tag filter via B-tree direct lookup

When DeviceMetaIterator receives a single TagEq filter (exact match on
one tag column), construct the target device ID and use
load_device_index_entry() for O(log N) B-tree binary search instead of
traversing all internal nodes and scanning all leaf devices.

For ecg_dataset (53040 devices), this reduces queryByRow's
DeviceTaskIterator::next() from ~117ms to ~0.5ms per query (~230x).

Also adds bench_read.cpp for standalone C++ read path benchmarking
and makes load_device_index_entry() public for reuse.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* style: clang-format

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Updae C++ CLAUDE.md to persist format way

* fix direct lookup for multi-tag-column devices

The TagEq direct B-tree lookup constructed a device ID with only
col_idx_+1 segments, but devices with multiple tag columns have more.
The segment count mismatch caused operator== to always return false,
so queries returned 0 rows. Guard the optimization to only activate
when the filter fully specifies the device ID (single tag column).

* add unit tests for timeseries metadata and direct B-tree lookup

- GetTimeseriesMetadataTableModel: test get_timeseries_metadata()
  with table model, covering get_all_device_entries (LEAF_DEVICE
  path) and get_device_timeseries_meta_by_offset
- GetTimeseriesMetadataMultiTable: test with two tables, covering
  the multi-table iteration in get_timeseries_metadata
- DirectLookupSingleTagColumn: test the TagEq direct B-tree lookup
  optimization with a single-tag-column table, covering
  try_setup_direct_lookup and load_results_direct
- DirectLookupNonExistDevice: test direct lookup returns 0 rows
  when the device does not exist

* delete examples/bench_read.cpp

* fix direct lookup guard for multi-tag-column tables

The previous guard (actual_segment_count != eq->col_idx_ + 1) still
allowed direct lookup when filtering on the last tag column in a
multi-tag table (e.g., 2 tags: segment_count=3, col_idx=2, 2+1=3).
The constructed device ID had empty segments for unfiltered tags,
causing B-tree lookup to find nothing. Restrict direct lookup to
single-tag tables only (segment_count == 2).

* style: clang-format

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7 files changed
tree: c1543145231f20fcaf850086ec928c288e534651
  1. .github/
  2. .mvn/
  3. cpp/
  4. docs/
  5. java/
  6. python/
  7. .asf.yaml
  8. .clang-format
  9. .gitattributes
  10. .gitignore
  11. checkstyle.xml
  12. CLAUDE.md
  13. codecov.yml
  14. doap_tsfile.rdf
  15. jenkins.pom
  16. Jenkinsfile
  17. LICENSE
  18. mvnw
  19. mvnw.cmd
  20. NOTICE
  21. pom.xml
  22. README-zh.md
  23. README.md
  24. RELEASE_NOTES.md
README.md

English | 中文

TsFile Document

codecov Maven Central PyPI

Introduction

TsFile is a columnar storage file format designed for time series data, which supports efficient compression, high throughput of read and write, and compatibility with various frameworks, such as Spark and Flink. It is easy to integrate TsFile into IoT big data processing frameworks.

Time series data is becoming increasingly important in a wide range of applications, including IoT, intelligent control, finance, log analysis, and monitoring systems.

TsFile is the first existing standard file format for time series data. Despite the widespread presence and significance of temporal data, there has been a longstanding absence of standardized file formats for its management. The advent of TsFile introduces a unified file format to facilitate users in managing temporal data.

Click for More Information

TsFile Features

TsFile offers several distinctive features and benefits:

  • Multi Language Independent Use: Multiple language SDK can be used to directly read and write TsFile, making it possible for some lightweight data reading and writing scenarios.

  • Efficient Writing and Compression: A column storage format tailored for time series, organizing data by device and ensuring continuous storage of data for each sequence, minimizing storage space. Compared to CSV, the compression ratio can be increased by more than 90%.

  • High Query Performance: By indexing devices, measurement, and time dimensions, TsFile implements fast filtering and querying of temporal data based on specific time ranges. Compared to general file formats, query throughput can be increased by 2-10 times.

  • Open Integration: TsFile is the underlying storage file format of the temporal database IoTDB, which can form a pluggable storage computing separation architecture with IoTDB. TsFile supports compatibility with Spark Flink and other big data software establish seamless ecosystem integration to ensure compatibility and interoperability across different data processing environments, and achieve deep analysis of temporal data across ecosystems.

TsFile Basic Concepts

TsFile can manage the time series data of multiple devices. Each device can have different measurement.

Each measurement of each device corresponds to a time series.

The TsFile Scheme defines a set of measurement for all devices, as shown in the table below (m1~m5)

TimedeviceIdm1m2m3m4m5
1device1123
2device1123
3device21345
4device21345
5device312345

Among them, Time and deviceId are built-in fields that do not need to be defined and can be written directly.

TsFile Design

File Structure

TsFile adopts a columnar storage design, similar to other file formats, primarily to optimize time-series data's storage efficiency and query performance. This design aligns with the nature of time series data, which often involves large volumes of similar data types recorded over time. However, TsFile was developed particularly with a structure of page, chunk, chunk group, and index:

  • Page: The basic unit for storing time series data, sorted by time in ascending order with separate columns for timestamps and values.

  • Chunk: Comprising metadata headers and several pages, each chunk belongs to one time series, with variable sizes allowing for different compression and encoding methods.

  • Chunk Group: Multiple chunks within a chunk group belong to one or multiple series of a device written in the same period, facilitating efficient query processing.

  • Index: The file metadata at the end of TsFile contains a chunk-level index and file-level statistics for efficient data access.

TsFile Architecture

Encoding and Compression

TsFile employs advanced encoding and compression techniques to optimize storage and access for time series data. It uses methods like run-length encoding (RLE), bit-packing, and Snappy for efficient compression, allowing separate encoding of timestamp and value columns for better data processing. Its unique encoding algorithms are designed specifically for the characteristics of time series data in IoT scenarios, focusing on regular time intervals and the correlation among series.

Its uniqueness lies in the encoding algorithm designed specifically for time series data characteristics, focusing on the correlation between time attributes and data.

The table below compares 3 file formats in different dimensions.

TsFile, CSV and Parquet in Comparison

DimensionTsFileCSVParquet
Data ModelIoTPlainNested
Write ModeTablet, LineLineLine
CompressionYesNoYes
Read ModeQuery, ScanScanQuery
Index on SeriesYesNoNo
Index on TimeYesNoNo

Its development facilitates efficient data encoding, compression, and access, reflecting a deep understanding of industry needs, pioneering a path toward efficient, scalable, and flexible data analytics platforms.

Data TypeRecommended EncodingRecommended Compression
INT32TS_2DIFFLZ4
INT64TS_2DIFFLZ4
FLOATGORILLALZ4
DOUBLEGORILLALZ4
BOOLEANRLELZ4
TEXTDICTIONARYLZ4

more see Docs

Build and Use TsFile

Java

C++

Python