The writing process of TsFile is shown in the following figure:
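In this process, each device corresponds to a ChunkGroupWriter, and each sensor (measurement point) corresponds to a ChunkWriter.
File writing is divided into three main operations, marked 1, 2, and 3 in the figure: writing data into the memory buffer, persisting ChunkGroups to disk, and closing the file.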
The TsFile file layer has two write interfaces:
One writes a single timestamp with multiple measurement points for one device.
The other writes multiple timestamps with multiple measurement points for one device.
When a write interface is called, the device's data is delivered to the corresponding ChunkGroupWriter, and each measurement point is delivered to the corresponding ChunkWriter for writing. The ChunkWriter encodes the data and packs it into pages.
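The routing described above can be sketched in a few lines of Java. This is a minimal illustration, not the TsFile implementation: the classes SketchFileWriter, SketchChunkGroupWriter, and SketchChunkWriter and the methods writeRecord and writeTablet are hypothetical stand-ins, values are restricted to long, and encoding into pages is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a ChunkWriter: buffers the points of one measurement.
class SketchChunkWriter {
    final List<long[]> points = new ArrayList<>(); // each element is {timestamp, value}

    void write(long time, long value) {
        points.add(new long[] {time, value});
    }
}

// Hypothetical stand-in for a ChunkGroupWriter: one ChunkWriter per measurement.
class SketchChunkGroupWriter {
    final Map<String, SketchChunkWriter> chunkWriters = new LinkedHashMap<>();

    void write(long time, Map<String, Long> values) {
        for (Map.Entry<String, Long> e : values.entrySet()) {
            chunkWriters
                .computeIfAbsent(e.getKey(), k -> new SketchChunkWriter())
                .write(time, e.getValue());
        }
    }
}

// Hypothetical stand-in for the file-level writer: one ChunkGroupWriter per device.
class SketchFileWriter {
    final Map<String, SketchChunkGroupWriter> groupWriters = new LinkedHashMap<>();

    // Record-style write: one device, one timestamp, several measurement points.
    void writeRecord(String device, long time, Map<String, Long> values) {
        groupWriters
            .computeIfAbsent(device, d -> new SketchChunkGroupWriter())
            .write(time, values);
    }

    // Tablet-style write: one device, several timestamps, several measurement points.
    // columns[c][r] is the value of measurement names[c] at times[r].
    void writeTablet(String device, long[] times, String[] names, long[][] columns) {
        for (int row = 0; row < times.length; row++) {
            Map<String, Long> rowValues = new LinkedHashMap<>();
            for (int col = 0; col < names.length; col++) {
                rowValues.put(names[col], columns[col][row]);
            }
            writeRecord(device, times[row], rowValues);
        }
    }
}
```

Both entry points end in the same per-device, per-measurement buffers, which is the property the text describes. The sketch implements the tablet write as repeated record writes only to keep it short.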
When the data in memory reaches a certain threshold, a persistence operation is triggered. Each persistence operation writes all data of all devices currently in memory to the TsFile on disk. Each device corresponds to a ChunkGroup, and each measurement point corresponds to a Chunk.
After persistence completes, the corresponding metadata is cached in memory, where it is used for queries and for generating the metadata at the end of the file.
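The threshold-triggered persistence and the metadata caching that follows it can be sketched as below. Everything here is an assumption made for illustration: FlushSketch, THRESHOLD_POINTS, and flushAll are hypothetical names, the threshold counts points rather than bytes of memory, and the "disk" is an in-memory list.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class FlushSketch {
    // Hypothetical threshold, counted in buffered points for simplicity.
    static final int THRESHOLD_POINTS = 4;

    final Map<String, List<Long>> buffered = new LinkedHashMap<>(); // device -> buffered values
    final List<String> disk = new ArrayList<>();                    // stand-in for the file on disk
    final List<String> cachedMetadata = new ArrayList<>();          // kept in memory after each flush
    int bufferedPoints = 0;

    void write(String device, long value) {
        buffered.computeIfAbsent(device, d -> new ArrayList<>()).add(value);
        bufferedPoints++;
        if (bufferedPoints >= THRESHOLD_POINTS) {
            flushAll();
        }
    }

    // Persists the data of every buffered device, one ChunkGroup per device,
    // and remembers where each ChunkGroup was written.
    void flushAll() {
        for (Map.Entry<String, List<Long>> e : buffered.entrySet()) {
            int offset = disk.size();
            disk.add("ChunkGroup(" + e.getKey() + ")=" + e.getValue());
            cachedMetadata.add(e.getKey() + "@" + offset);
        }
        buffered.clear();
        bufferedPoints = 0;
    }
}
```

A flush always covers every device in the buffer, not only the device whose write crossed the threshold. The cached metadata outlives the flush so that it can later be used to build the metadata at the end of the file.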
Based on the metadata cached in memory, TsFileMetadata is generated and appended to the end of the file (TsFileWriter.flushMetadataIndex()), and the file is then closed.
One of the most important steps in constructing TsFileMetadata is building the MetadataIndex tree. As mentioned before, the MetadataIndex is designed as a tree so that a query does not have to read every TimeseriesMetadata when the number of devices or measurements is large. Reading only the MetadataIndex nodes that a query needs reduces I/O and speeds up the query. The MetadataIndex tree is constructed as follows:
The input params of this method include deviceTimeseriesMetadataMap, the map from each device to its TimeseriesMetadata.
The whole method contains three parts:
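1. deviceTimeseriesMetadataMap is converted into deviceMetadataIndexMap. Specifically, for each device:
   - Initialize a queue for MetadataIndex nodes in this device
   - Initialize a leaf node of LEAF_MEASUREMENT type
   - While iterating over the device's TimeseriesMetadata, add an entry into currentIndexNode every MAX_DEGREE_OF_INDEX_NODE entries
   - After storing MAX_DEGREE_OF_INDEX_NODE entries, add currentIndexNode into queue, and point currentIndexNode to a new MetadataIndexNode
   - Generate the upper-level nodes from the leaf nodes in queue, until the final root node (this method will be described later), and put the “device-root node” map into deviceMetadataIndexMap
2. Check whether the number of devices exceeds MAX_DEGREE_OF_INDEX_NODE. If not, the root node of the MetadataIndex tree can be generated and returned:
   - Initialize the root node, which is of INTERNAL_MEASUREMENT type
   - For each entry in deviceMetadataIndexMap, convert it into an entry and add it to metadataIndexNode
   - Set the endOffset of root node and return it
3. If the number of devices exceeds MAX_DEGREE_OF_INDEX_NODE, the device index level of MetadataIndex tree is generated:
   - Initialize a queue for MetadataIndex nodes in device index level
   - Initialize a leaf node of LEAF_DEVICE type
   - For each entry in deviceMetadataIndexMap, convert it into an entry and add it to metadataIndexNode
   - After storing MAX_DEGREE_OF_INDEX_NODE entries, add currentIndexNode into queue, and point currentIndexNode to a new MetadataIndexNode
   - Generate the upper-level nodes from the leaf nodes in queue, until the final root node (this method will be described later)
   - Set the endOffset of root node and return it

The input params of the method that generates the root node include metadataIndexNodeQueue and the type of the nodes to create.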
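The method builds a tree from the nodes in metadataIndexNodeQueue and returns the root node:
- Create a new currentIndexNode of the specified type
- For each node taken from the queue, add a corresponding entry into currentIndexNode
- After storing MAX_DEGREE_OF_INDEX_NODE entries, add currentIndexNode into queue, and point currentIndexNode to a new MetadataIndexNode

The input params of the method that adds currentIndexNode into queue include the current MetadataIndexNode and the queue.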
This method sets the endOffset of the current MetadataIndexNode and puts it into the queue.
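The queue-based construction can be sketched as follows, covering the leaf phase for a single device and the collapse of the queue into one root. This is one reading of the steps above rather than the actual implementation. The classes SketchIndexNode and MetadataIndexSketch, the methods buildMeasurementLeaves, collapseToRoot, and enqueueNode, the string node types, and the position counter that stands in for the output file offset are all hypothetical. MAX_DEGREE_OF_INDEX_NODE is set to 2 only to keep the example small.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical node: holds object references where a real node would hold file offsets.
class SketchIndexNode {
    final String type;
    final List<String> entryKeys = new ArrayList<>();         // one key per entry
    final List<SketchIndexNode> children = new ArrayList<>(); // empty for leaf nodes
    long endOffset = -1;

    SketchIndexNode(String type) {
        this.type = type;
    }

    boolean isFull() {
        return entryKeys.size() >= MetadataIndexSketch.MAX_DEGREE_OF_INDEX_NODE;
    }
}

class MetadataIndexSketch {
    static final int MAX_DEGREE_OF_INDEX_NODE = 2; // tiny value, for illustration only
    static long position = 0;                      // stand-in for the output position

    // Sets the endOffset of the node and puts it into the queue.
    static void enqueueNode(SketchIndexNode node, Queue<SketchIndexNode> queue) {
        node.endOffset = position;
        queue.add(node);
    }

    // Leaf phase for one device: every MAX_DEGREE-th series contributes one entry,
    // and a leaf is queued once it already holds MAX_DEGREE entries.
    static Queue<SketchIndexNode> buildMeasurementLeaves(List<String> measurementIds) {
        Queue<SketchIndexNode> queue = new ArrayDeque<>();
        SketchIndexNode current = new SketchIndexNode("LEAF_MEASUREMENT");
        for (int i = 0; i < measurementIds.size(); i++) {
            if (i % MAX_DEGREE_OF_INDEX_NODE == 0) {
                if (current.isFull()) {
                    enqueueNode(current, queue);
                    current = new SketchIndexNode("LEAF_MEASUREMENT");
                }
                current.entryKeys.add(measurementIds.get(i));
            }
            position++; // pretend one TimeseriesMetadata was serialized
        }
        enqueueNode(current, queue);
        return queue;
    }

    // Groups the queued nodes under new parents, level by level, until one node remains.
    static SketchIndexNode collapseToRoot(Queue<SketchIndexNode> queue, String type) {
        int queueSize = queue.size();
        SketchIndexNode current = new SketchIndexNode(type);
        while (queueSize != 1) {
            for (int i = 0; i < queueSize; i++) {
                SketchIndexNode child = queue.poll();
                if (current.isFull()) {
                    enqueueNode(current, queue);
                    current = new SketchIndexNode(type);
                }
                current.entryKeys.add(child.entryKeys.get(0)); // entry keyed by the child's first key
                current.children.add(child);
                position++; // pretend the child node was serialized
            }
            enqueueNode(current, queue);
            current = new SketchIndexNode(type);
            queueSize = queue.size();
        }
        return queue.poll();
    }
}
```

With nine measurements s0 to s8 and a maximum degree of 2, the leaf phase yields three leaves and the collapse adds two internal levels above them. A reader that starts at the root therefore follows a few small nodes instead of scanning every TimeseriesMetadata.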