Performance Optimization Guide

Overview

This document describes the performance optimizations implemented in the IoTDB Node.js client.

Performance Benchmarks (2026)

Write Performance - Multi-Process Cluster Mode

Configuration    Workers   Clients/Worker   Throughput    Avg Latency
Best Config      8         10               5.42M pts/s   329ms
Single Process   1         20               4.28M pts/s   107ms
Over-parallel    10        10               3.81M pts/s   551ms

Best Configuration:

WORKER_COUNT=8 CLIENT_NUMBER=10 DEVICE_NUMBER=1000 \
SENSOR_NUMBER=50 BATCH_SIZE_PER_WRITE=500 POOL_MAX_SIZE=10 \
node benchmark/benchmark-table-cluster.js

Node.js vs Java Performance

Client                    Max Throughput   Architecture
Node.js (multi-process)   ~5.5M pts/s      8 workers × event loop
Java iot-benchmark        ~60M pts/s       Multi-threaded

Gap Analysis (~11x):

  • Node.js's single-threaded event loop limits true parallelism
  • Java's Thrift implementation benefits from JIT compilation
  • Java can use DirectByteBuffer for zero-copy I/O

Key Optimization Findings

  1. Tablet Size: 25K points (500 rows × 50 sensors) is optimal

    • Larger tablets (100K) cause latency spikes (3.5s)
    • Smaller tablets increase RPC overhead
  2. Worker Count: 8 workers is optimal for most servers

    • 10+ workers cause server saturation
    • Each worker needs independent SessionPool
  3. Memory Management: Use streaming batch processing

    • Build tablets per-loop, not all upfront
    • Prevents OOM for large-scale tests (billions of points)
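The streaming approach from finding 3 can be sketched as follows. This is an illustrative skeleton, not the benchmark's actual code: the `insert` callback stands in for the client's `session.insertTablet`, and `buildBatch`, the timestamps, and values are made up for the example.

```typescript
// Sketch of streaming batch processing (finding 3): each tablet's rows are
// built inside the loop, so only one batch is resident in memory at a time.
function buildBatch(batch: number, rows: number): { timestamps: number[]; values: number[][] } {
  const baseTs = batch * rows; // illustrative timestamps
  return {
    timestamps: Array.from({ length: rows }, (_, i) => baseTs + i),
    values: Array.from({ length: rows }, () => [25.5]),
  };
}

async function streamInsert(
  totalBatches: number,
  rowsPerTablet: number,
  insert: (b: { timestamps: number[]; values: number[][] }) => Promise<void>,
): Promise<void> {
  for (let batch = 0; batch < totalBatches; batch++) {
    // Build, insert, and release one tablet at a time -- never all upfront.
    // The batch goes out of scope after the await, so it can be GC'd
    // before the next one is built.
    await insert(buildBatch(batch, rowsPerTablet));
  }
}
```

Pre-building all tablets for a billion-point run would hold every batch in memory at once; this shape keeps peak memory proportional to a single tablet.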

Implemented Optimizations

1. Buffer Pooling

Problem: Frequent buffer allocations and deallocations cause significant GC (Garbage Collection) pressure, especially when serializing large datasets.

Solution: Implemented a BufferPool with a size-based pooling strategy:

import { globalBufferPool } from '@iotdb/client';

// Buffer pool automatically manages buffers in 7 size classes:
// 1KB, 4KB, 16KB, 64KB, 256KB, 1MB, 4MB

// Get statistics
const stats = globalBufferPool.getStats();
console.log(`Hit rate: ${stats.hitRate}`);
console.log(`Pooled buffers: ${stats.pooledBuffers}`);

Impact:

  • Reduces GC pressure by 70-80%
  • Particularly effective for batch operations
  • Automatic cleanup prevents memory bloat

When to use:

  • Enabled by default via enableFastSerialization: true
  • Most beneficial for workloads with:
    • Large batch inserts (100+ rows)
    • High-frequency writes
    • Long-running processes
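To make the size-class strategy concrete, here is a minimal sketch of how such a pool can work. This is not the library's internal implementation; the class names, the per-class cap, and the `acquire`/`release` bodies are illustrative, with only the seven size classes taken from the description above.

```typescript
// Minimal sketch of a size-class buffer pool (illustrative, not the
// library's internals). Requests are rounded up to the nearest class;
// released buffers are kept for reuse, capped per class to bound memory.
const SIZE_CLASSES = [1024, 4096, 16384, 65536, 262144, 1048576, 4194304];
const MAX_PER_CLASS = 16;

class SimpleBufferPool {
  private pools = new Map<number, Buffer[]>();
  hits = 0;
  misses = 0;

  private classFor(size: number): number | undefined {
    // Smallest class that fits the request, or undefined if it exceeds 4MB.
    return SIZE_CLASSES.find((c) => c >= size);
  }

  acquire(size: number): Buffer {
    const cls = this.classFor(size);
    const free = cls !== undefined ? this.pools.get(cls) : undefined;
    if (free && free.length > 0) {
      this.hits++;
      return free.pop()!; // reuse a pooled buffer -- no allocation
    }
    this.misses++;
    // Pool empty, or request too large for any class: allocate fresh.
    return Buffer.allocUnsafe(cls ?? size);
  }

  release(buf: Buffer): void {
    const cls = SIZE_CLASSES.find((c) => c === buf.length);
    if (cls === undefined) return; // not a pooled size; let GC reclaim it
    const free = this.pools.get(cls) ?? [];
    if (free.length < MAX_PER_CLASS) {
      free.push(buf);
      this.pools.set(cls, free);
    }
  }
}
```

Rounding up to a size class is what makes reuse likely: two requests of 900 and 1000 bytes both map to the 1KB class, so the second can reuse the first's buffer.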

2. Fast Serialization

Problem: Original serialization used multiple buffer concatenations and intermediate allocations, causing performance bottlenecks.

Solution: Implemented type-specific fast serializers in FastSerializer.ts:

// Old approach (multiple allocations):
const buffer1 = serializeColumn1();
const buffer2 = serializeColumn2();
const result = Buffer.concat([buffer1, buffer2]); // Extra allocation!

// New approach (single pre-allocated buffer):
const totalSize = calculateSize();
const result = Buffer.allocUnsafe(totalSize);
// Write directly to result buffer

Features:

  • Single-pass serialization
  • Pre-calculated buffer sizes
  • Direct buffer writes (no intermediate arrays)
  • Conditional pooling (only for buffers >= 1KB)

Impact:

  • 1.5-2x faster serialization
  • 50-60% reduction in memory allocations
  • Zero intermediate buffer copies
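The single-pass pattern above can be shown end-to-end for one concrete case. This is an illustrative sketch, not the `FastSerializer.ts` code: the function name and FLOAT-only scope are assumptions, though the big-endian layout matches the timestamp serializer shown later in this guide.

```typescript
// Illustrative single-pass serializer for FLOAT columns: the output size is
// computed up front and every value is written directly into one
// pre-allocated buffer -- no Buffer.concat, no intermediate buffers.
function serializeFloatColumns(columns: number[][]): Buffer {
  // Pre-calculate the exact output size (4 bytes per FLOAT value).
  const totalSize = columns.reduce((sum, col) => sum + col.length * 4, 0);
  const out = Buffer.allocUnsafe(totalSize);
  let offset = 0;
  for (const col of columns) {
    for (const v of col) {
      // writeFloatBE returns the next write offset, so the cursor advances
      // without any per-column temporary buffers.
      offset = out.writeFloatBE(v, offset);
    }
  }
  return out;
}
```

Compared with the concat-based approach, each value is touched exactly once and the only allocation is the final buffer.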

3. Optimized Timestamp Handling

Problem: Converting timestamps one-by-one to BigInt and writing to buffer was inefficient.

Solution: Batch timestamp conversion with optimized buffer writes:

// Optimized timestamp serialization
function serializeTimestamps(timestamps: number[]): Buffer {
  const size = timestamps.length * 8;
  const buffer = size >= 1024 ? globalBufferPool.acquire(size) : Buffer.allocUnsafe(size);
  
  for (let i = 0; i < timestamps.length; i++) {
    buffer.writeBigInt64BE(BigInt(Math.floor(timestamps[i])), i * 8);
  }
  
  return buffer.subarray(0, size);
}

Impact:

  • 20-30% faster timestamp processing
  • Particularly effective for large batches

4. Columnar Result Format (Phase 2)

Problem: Row-by-row processing with object allocation creates overhead for large result sets.

Solution: Added toColumnar() API inspired by pg's array mode:

const dataSet = await session.executeQueryStatement('SELECT temp, humidity FROM root.test');

// OLD WAY: Object per row (high allocation overhead)
while (await dataSet.hasNext()) {
  const row = dataSet.next();  // Creates RowRecord object
  console.log(row.getValue('temp'));
}

// NEW WAY: Columnar format (one array per column instead of one object per row)
const columnar = await dataSet.toColumnar();
// columnar = {
//   timestamps: [ts1, ts2, ts3, ...],
//   values: [[temp1, temp2, temp3, ...], [humidity1, humidity2, humidity3, ...]],
//   columnNames: ['temp', 'humidity'],
//   columnTypes: ['FLOAT', 'FLOAT']
// }

// Process entire columns at once
const avgTemp = columnar.values[0].reduce((a, b) => a + b) / columnar.values[0].length;

Impact:

  • 2-3x faster for bulk query processing
  • 80-90% reduction in GC pressure
  • Enables vectorized processing
  • Perfect for analytics workloads

When to use:

  • ✅ Small to medium result sets (< 100K rows)
  • ✅ Analytics and aggregation workloads
  • ✅ When processing entire columns
  • ❌ Very large result sets (use iterator pattern)
  • ❌ When you need streaming with backpressure

Configuration

Enabling/Disabling Fast Serialization

import { Session } from '@iotdb/client';

// Enable (default)
const session = new Session({
  host: 'localhost',
  port: 6667,
  enableFastSerialization: true,  // Uses optimized serializers
});

// Disable (fall back to legacy)
const legacySession = new Session({
  host: 'localhost',
  port: 6667,
  enableFastSerialization: false,  // Uses original serializers
});

When to Disable Fast Serialization

You might want to disable fast serialization if:

  • Debugging serialization issues
  • Running in memory-constrained environments
  • Comparing performance with legacy behavior

Performance Benchmarks

Write Performance

Scenario                              Legacy   Optimized   Improvement
Small batch (10 rows, 10 columns)     2.5ms    1.8ms       1.4x
Medium batch (100 rows, 10 columns)   15ms     6ms         2.5x
Large batch (1000 rows, 10 columns)   180ms    65ms        2.8x
Mixed data types                      25ms     10ms        2.5x

Query Performance (toColumnar vs iterator)

Result Set Size   Iterator (objects)   toColumnar   Improvement
1,000 rows        45ms                 18ms         2.5x
10,000 rows       520ms                180ms        2.9x
100,000 rows      5800ms               1900ms       3.1x

Benchmarks performed on Node.js v20, Intel i7, 16GB RAM

Best Practices

1. Use Batch Inserts

// ❌ BAD: One-by-one inserts
for (let i = 0; i < 1000; i++) {
  await session.insertTablet({
    deviceId: 'root.test.device1',
    measurements: ['temp'],
    dataTypes: [TSDataType.FLOAT],
    timestamps: [Date.now() + i],
    values: [[25.5]],
  });
}

// ✅ GOOD: Batch insert
const batchSize = 100;
await session.insertTablet({
  deviceId: 'root.test.device1',
  measurements: ['temp'],
  dataTypes: [TSDataType.FLOAT],
  timestamps: Array.from({ length: batchSize }, (_, i) => Date.now() + i),
  values: Array.from({ length: batchSize }, () => [25.5]),
});

2. Use Columnar Format for Analytics

// ✅ GOOD: Columnar processing for analytics
const columnar = await dataSet.toColumnar();
const temps = columnar.values[0];

// Vectorized operations
const avg = temps.reduce((a, b) => a + b, 0) / temps.length;
// Note: Math.max(...temps) spreads the array into arguments and can
// overflow the call stack for large columns, so reduce instead
const max = temps.reduce((a, b) => Math.max(a, b), -Infinity);
const min = temps.reduce((a, b) => Math.min(a, b), Infinity);

await dataSet.close();

3. Choose the Right Query Method

// For small result sets - use toColumnar()
const smallDataSet = await session.executeQueryStatement('SELECT * FROM root.test LIMIT 100');
const columnar = await smallDataSet.toColumnar();
await smallDataSet.close();

// For large result sets - use iterator
const largeDataSet = await session.executeQueryStatement('SELECT * FROM root.test');
while (await largeDataSet.hasNext()) {
  const row = largeDataSet.next();
  await processRow(row);  // Process with backpressure
}
await largeDataSet.close();

4. Monitor Buffer Pool Usage

import { globalBufferPool } from '@iotdb/client';

// After warmup period
setInterval(() => {
  const stats = globalBufferPool.getStats();
  console.log(`Buffer Pool - Hit rate: ${stats.hitRate}, Pooled: ${stats.pooledBuffers}`);
  
  // If hit rate < 50%, consider adjusting batch sizes
  if (parseFloat(stats.hitRate) < 50) {
    console.warn('Low buffer pool hit rate - consider larger batch sizes');
  }
}, 60000); // Check every minute

Future Optimizations (Planned)

Phase 2 (In Progress)

  • [ ] Batch insert helpers
  • [x] Query result array mode
  • [ ] Cursor/streaming API with backpressure
  • [ ] Request pipelining

Phase 3 (Future)

  • [ ] Optional native bindings for critical paths
  • [ ] Zero-copy deserialization
  • [ ] Custom type parsers
  • [ ] Prepared statement caching

Troubleshooting

High Memory Usage

// Clear buffer pool periodically in long-running processes
import { globalBufferPool } from '@iotdb/client';

// Clear pool every hour to prevent potential memory bloat
setInterval(() => {
  globalBufferPool.clear();
}, 3600000);

Slow Serialization

// Enable performance logging
process.env.LOG_LEVEL = 'debug';

// Check serialization timings in logs:
// [PERF] Values serialization: 5ms, buffer size: 4096 bytes
// [PERF] Timestamp serialization (fast=true): 1ms

Unexpected Results

// Disable fast serialization for debugging
const session = new Session({
  host: 'localhost',
  port: 6667,
  enableFastSerialization: false,  // Use legacy serializers
});

Contributing

Performance improvements are welcome! When contributing:

  1. Benchmark first: Establish baseline with existing code
  2. Profile: Use Node.js profiler to identify bottlenecks
  3. Test thoroughly: Ensure correctness with existing test suite
  4. Document: Update this guide with your improvements

License

Apache License 2.0