SessionDataSet Iterator Pattern Guide

This guide explains how to use the SessionDataSet iterator pattern for efficiently querying large datasets from Apache IoTDB.

Overview

The SessionDataSet provides an iterator-based approach to reading query results, similar to JDBC ResultSet or database cursors. This approach offers several advantages over loading all data into memory at once:

  • Memory Efficient: Only keeps the current batch in memory
  • Lazy Loading: Fetches data on-demand as you iterate
  • Large Datasets: Can handle result sets larger than available RAM
  • Resource Management: Proper cleanup of server-side resources
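If your runtime supports async generators, the hasNext()/next() pair can be wrapped so results are consumed with `for await` and cleanup happens automatically. This is a sketch, not part of the client API; the interfaces below are minimal stand-ins for illustration, not the real '@iotdb/client' types.

```typescript
// Sketch: wrap hasNext()/next() in an async generator so callers can use
// `for await`. The interfaces are minimal stand-ins, not the real client types.
interface RowRecordLike {
  getTimestamp(): number;
}

interface SessionDataSetLike {
  hasNext(): Promise<boolean>;
  next(): RowRecordLike;
  close(): Promise<void>;
}

async function* rows(dataSet: SessionDataSetLike): AsyncGenerator<RowRecordLike> {
  try {
    while (await dataSet.hasNext()) {
      yield dataSet.next();
    }
  } finally {
    // Runs even if the consumer breaks out of the loop early,
    // so server-side resources are always released.
    await dataSet.close();
  }
}
```

With a wrapper like this, the iteration loop becomes `for await (const row of rows(dataSet)) { ... }` and close() no longer needs to be called by hand.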

Basic Usage

Simple Iteration

import { Session } from '@iotdb/client';

const session = new Session({
  host: 'localhost',
  port: 6667,
  username: 'root',
  password: 'root',
});

await session.open();

// Execute query and get SessionDataSet
const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');

// Iterate through results
while (await dataSet.hasNext()) {
  const row = dataSet.next();
  
  // Access timestamp
  const timestamp = row.getTimestamp();
  
  // Access values by column name
  const temperature = row.getFloat('temperature');
  const humidity = row.getDouble('humidity');
  const status = row.getString('status');
  
  console.log(`${timestamp}: temp=${temperature}, humidity=${humidity}, status=${status}`);
}

// Always close the dataset when done
await dataSet.close();

await session.close();

API Reference

SessionDataSet

Methods

hasNext(): Promise<boolean>

Checks if there are more rows available. This may trigger fetching the next batch from the server.

while (await dataSet.hasNext()) {
  // Process next row
}

next(): RowRecord

Returns the next row. Must call hasNext() first to ensure a row is available.

if (await dataSet.hasNext()) {
  const row = dataSet.next();
}

close(): Promise<void>

Closes the dataset and releases server-side resources. Always call this when done.

await dataSet.close();

getColumnNames(): string[]

Returns an array of column names.

const columns = dataSet.getColumnNames();
console.log('Columns:', columns); // ['temperature', 'humidity', 'status']

getColumnTypes(): string[]

Returns an array of column data types.

const types = dataSet.getColumnTypes();
console.log('Types:', types); // ['FLOAT', 'DOUBLE', 'TEXT']

findColumn(columnName: string): number

Returns the zero-based index of a column by name.

const tempIndex = dataSet.findColumn('temperature'); // Returns 0
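When the same column is read on every row of a large result, findColumn() can be called once before the loop so the hot path uses the faster index-based getters. A sketch using minimal stand-in interfaces (not the real client types):

```typescript
// Sketch: resolve the column index once, then use index-based access in the
// hot loop. Interfaces are minimal stand-ins for illustration.
interface RowLike {
  getFloatByIndex(index: number): number;
}

interface DataSetLike {
  findColumn(columnName: string): number;
  hasNext(): Promise<boolean>;
  next(): RowLike;
}

async function averageColumn(dataSet: DataSetLike, column: string): Promise<number> {
  const idx = dataSet.findColumn(column); // one name lookup instead of one per row
  let sum = 0;
  let count = 0;
  while (await dataSet.hasNext()) {
    sum += dataSet.next().getFloatByIndex(idx);
    count++;
  }
  return count === 0 ? NaN : sum / count;
}
```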

toArray(): Promise<any[][]> (Deprecated)

Loads all remaining rows into memory as an array. Use only for small result sets.

// ⚠️ Not recommended for large datasets
const allRows = await dataSet.toArray();

RowRecord

Represents a single row of data from the query result.

Methods

Timestamp Access
const timestamp = row.getTimestamp(); // Returns number (milliseconds)

Access by Column Name
// Typed getters
const stringValue = row.getString('name');
const intValue = row.getInt('count');
const longValue = row.getLong('id');
const floatValue = row.getFloat('temperature');
const doubleValue = row.getDouble('humidity');
const boolValue = row.getBoolean('status');

// Generic getter (returns any)
const value = row.getValue('columnName');

Access by Column Index
const stringValue = row.getStringByIndex(0);
const intValue = row.getIntByIndex(1);
const floatValue = row.getFloatByIndex(2);
const value = row.getValueByIndex(3);

Null Checking
if (row.isNull('optionalColumn')) {
  console.log('Column is null');
} else {
  const value = row.getString('optionalColumn');
}

// By index
if (row.isNullByIndex(0)) {
  console.log('First column is null');
}

Get All Data
const fields = row.getFields(); // Returns array of field values
const array = row.toArray(); // Returns [timestamp, ...fields]

Advanced Usage

Configuring Fetch Size

Control how many rows are fetched in each batch:

const session = new Session({
  host: 'localhost',
  port: 6667,
  fetchSize: 1024, // Fetch 1024 rows at a time
});

const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');
// Will automatically fetch in batches of 1024 rows

Processing Large Result Sets

For very large result sets, use the iterator pattern to avoid memory issues:

const dataSet = await session.executeQueryStatement('SELECT * FROM root.large_dataset');

let count = 0;
let sum = 0;

while (await dataSet.hasNext()) {
  const row = dataSet.next();
  sum += row.getDouble('value');
  count++;
  
  // Log progress every 10000 rows
  if (count % 10000 === 0) {
    console.log(`Processed ${count} rows...`);
  }
}

console.log(`Total rows: ${count}, Average: ${sum / count}`);

await dataSet.close();

Error Handling

Always use try-finally to ensure cleanup:

let dataSet;
try {
  dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');
  
  while (await dataSet.hasNext()) {
    const row = dataSet.next();
    // Process row
  }
} catch (error) {
  console.error('Query error:', error);
  throw error;
} finally {
  if (dataSet) {
    await dataSet.close();
  }
}

Working with Multiple Columns

const dataSet = await session.executeQueryStatement(`
  SELECT temperature, humidity, pressure, status
  FROM root.weather.station1
  WHERE time > now() - 1h
`);

while (await dataSet.hasNext()) {
  const row = dataSet.next();
  
  const reading = {
    timestamp: new Date(row.getTimestamp()),
    temperature: row.getFloat('temperature'),
    humidity: row.getFloat('humidity'),
    pressure: row.getFloat('pressure'),
    status: row.getString('status'),
  };
  
  // Process reading
  if (!row.isNull('status') && reading.temperature > 30) {
    console.log('High temperature alert:', reading);
  }
}

await dataSet.close();

Aggregation Queries

const dataSet = await session.executeQueryStatement(`
  SELECT AVG(temperature), MAX(temperature), MIN(temperature)
  FROM root.test.d1
  GROUP BY ([2024-01-01, 2024-02-01), 1d)
`);

while (await dataSet.hasNext()) {
  const row = dataSet.next();
  
  console.log({
    timestamp: new Date(row.getTimestamp()),
    avg: row.getDouble('AVG(temperature)'),
    max: row.getDouble('MAX(temperature)'),
    min: row.getDouble('MIN(temperature)'),
  });
}

await dataSet.close();

Migration from Old Pattern

The old pattern loaded all data into memory at once. The new pattern uses lazy loading with iterators.

Old Approach (No Longer Supported)

// ❌ This no longer works - executeQueryStatement now returns SessionDataSet
// const result = await session.executeQueryStatement('SELECT * FROM root.test.d1');
// for (const row of result.rows) {
//   const timestamp = row[0];
//   const value1 = row[1];
//   const value2 = row[2];
//   console.log(timestamp, value1, value2);
// }

New Approach (Current Implementation)

// ✅ New way - lazy loading with iterator
const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');

while (await dataSet.hasNext()) {
  const row = dataSet.next();
  
  const timestamp = row.getTimestamp();
  const value1 = row.getValueByIndex(0);
  const value2 = row.getValueByIndex(1);
  
  console.log(timestamp, value1, value2);
}

await dataSet.close();

Migration Steps

  1. Change method call: executeQueryStatement() now returns SessionDataSet (not QueryResult)
  2. Replace array access: Use iterator pattern instead of result.rows
  3. Update row access: Use RowRecord methods instead of array indices
  4. Add cleanup: Always call await dataSet.close()

For Small Datasets

If you need all data at once for small result sets, use toArray():

const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');
const allRows = await dataSet.toArray(); // Loads everything into memory
// allRows is [[timestamp, value1, value2], ...]

⚠️ Warning: toArray() loads all data into memory. Only use for small result sets.

Best Practices

  1. Always Close: Use try-finally to ensure dataSet.close() is called
  2. Check hasNext(): Always call hasNext() before calling next()
  3. Use Fetch Size: Set appropriate fetch size based on memory and network
  4. Handle Nulls: Check isNull() before accessing nullable columns
  5. Typed Access: Use typed getters (getInt, getString) for type safety
  6. Avoid toArray(): Don't use toArray() for large datasets
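Several of these practices combine naturally into one pattern. A sketch using minimal stand-in interfaces (not the real client types):

```typescript
// Sketch: try/finally cleanup, hasNext() before next(), and isNull() checks
// in one place. Interfaces are minimal stand-ins for illustration.
interface RowLike {
  isNull(column: string): boolean;
  getDouble(column: string): number;
}

interface DataSetLike {
  hasNext(): Promise<boolean>;
  next(): RowLike;
  close(): Promise<void>;
}

async function collectNonNull(dataSet: DataSetLike, column: string): Promise<number[]> {
  const values: number[] = [];
  try {
    while (await dataSet.hasNext()) {
      const row = dataSet.next();
      if (!row.isNull(column)) {
        values.push(row.getDouble(column));
      }
    }
  } finally {
    await dataSet.close(); // release server-side resources no matter what
  }
  return values;
}
```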

Performance Tips

  1. Fetch Size: A larger fetch size means fewer network round trips but more memory use

    • Small datasets: 100-1000
    • Large datasets: 1000-10000
    • Very large: 10000-50000
  2. Batch Processing: Process rows in batches for better performance

const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');
const batch = [];

while (await dataSet.hasNext()) {
  batch.push(dataSet.next());
  
  if (batch.length >= 1000) {
    await processBatch(batch);
    batch.length = 0;
  }
}

if (batch.length > 0) {
  await processBatch(batch);
}

await dataSet.close();

  3. Column Access: Access by index is slightly faster than by name

// Faster
const value = row.getIntByIndex(0);

// Slightly slower (name lookup)
const value = row.getInt('temperature');

Troubleshooting

“No more rows available” Error

// ❌ Wrong - calling next() without checking hasNext()
const row = dataSet.next(); // May throw error

// ✅ Correct
if (await dataSet.hasNext()) {
  const row = dataSet.next();
}

“Column not found” Error

// Make sure column names match exactly (case-sensitive)
const columns = dataSet.getColumnNames();
console.log('Available columns:', columns);

// Use correct column name
const value = row.getString('temperature'); // Not 'Temperature' or 'temp'
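A small guard can turn a silent name mismatch into a clear error by consulting getColumnNames() up front. This is a sketch; `requireColumn` is a hypothetical helper, not part of the client API.

```typescript
// Sketch: fail fast with a helpful message when a column name is misspelled.
// `requireColumn` is a hypothetical helper, not part of '@iotdb/client'.
interface DataSetLike {
  getColumnNames(): string[];
}

function requireColumn(dataSet: DataSetLike, name: string): string {
  const columns = dataSet.getColumnNames();
  if (!columns.includes(name)) {
    throw new Error(
      `Column "${name}" not found (names are case-sensitive). Available: ${columns.join(', ')}`
    );
  }
  return name;
}
```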

Memory Issues

// ❌ Don't do this with large datasets
const allRows = await dataSet.toArray(); // Loads everything into memory

// ✅ Process iteratively instead
while (await dataSet.hasNext()) {
  const row = dataSet.next();
  // Process and discard each row
}

See Also