# SessionDataSet Iterator Pattern Guide

This guide explains how to use the SessionDataSet iterator pattern for efficiently querying large datasets from Apache IoTDB.

## Overview

The SessionDataSet provides an iterator-based approach to reading query results, similar to a JDBC ResultSet or a database cursor. This approach offers several advantages over loading all data into memory at once:

- **Memory Efficient**: Only keeps the current batch in memory
- **Lazy Loading**: Fetches data on demand as you iterate
- **Large Datasets**: Can handle result sets larger than available RAM
- **Resource Management**: Proper cleanup of server-side resources

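The contract behind `hasNext()`/`next()`/`close()` can be captured in a small TypeScript interface. The sketch below is illustrative only (`DataSetLike` and `ArrayDataSet` are our names, not part of `@iotdb/client`), but any code written against this shape works with both the real SessionDataSet and in-memory test doubles:

```typescript
// Illustrative shape of the iterator contract (not the real client typings).
interface DataSetLike<Row> {
  hasNext(): Promise<boolean>; // may fetch the next batch from the server
  next(): Row;                 // synchronous once hasNext() resolves to true
  close(): Promise<void>;      // releases any underlying resources
}

// A tiny in-memory implementation, useful as a test double.
class ArrayDataSet<Row> implements DataSetLike<Row> {
  private index = 0;
  constructor(private readonly rows: Row[]) {}
  async hasNext(): Promise<boolean> {
    return this.index < this.rows.length;
  }
  next(): Row {
    if (this.index >= this.rows.length) {
      throw new Error('No more rows available');
    }
    return this.rows[this.index++];
  }
  async close(): Promise<void> {
    // Nothing to release for in-memory data.
  }
}
```
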
## Basic Usage

### Simple Iteration

```typescript
import { Session } from '@iotdb/client';

const session = new Session({
  host: 'localhost',
  port: 6667,
  username: 'root',
  password: 'root',
});

await session.open();

// Execute query and get SessionDataSet
const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');

// Iterate through results
while (await dataSet.hasNext()) {
  const row = dataSet.next();

  // Access timestamp
  const timestamp = row.getTimestamp();

  // Access values by column name
  const temperature = row.getFloat('temperature');
  const humidity = row.getDouble('humidity');
  const status = row.getString('status');

  console.log(`${timestamp}: temp=${temperature}, humidity=${humidity}, status=${status}`);
}

// Always close the dataset when done
await dataSet.close();

await session.close();
```

## API Reference

### SessionDataSet

#### Methods

##### `hasNext(): Promise<boolean>`
Checks if there are more rows available. This may trigger fetching the next batch from the server.

```typescript
while (await dataSet.hasNext()) {
  // Process next row
}
```

##### `next(): RowRecord`
Returns the next row. Must call `hasNext()` first to ensure a row is available.

```typescript
if (await dataSet.hasNext()) {
  const row = dataSet.next();
}
```

##### `close(): Promise<void>`
Closes the dataset and releases server-side resources. Always call this when done.

```typescript
await dataSet.close();
```

##### `getColumnNames(): string[]`
Returns array of column names.

```typescript
const columns = dataSet.getColumnNames();
console.log('Columns:', columns); // ['temperature', 'humidity', 'status']
```

##### `getColumnTypes(): string[]`
Returns array of column data types.

```typescript
const types = dataSet.getColumnTypes();
console.log('Types:', types); // ['FLOAT', 'DOUBLE', 'TEXT']
```

##### `findColumn(columnName: string): number`
Returns the zero-based index of a column by name.

```typescript
const tempIndex = dataSet.findColumn('temperature'); // Returns 0
```

##### `toArray(): Promise<any[][]>` (Deprecated)
Loads all remaining rows into memory as an array. Use only for small result sets.

```typescript
// ⚠️ Not recommended for large datasets
const allRows = await dataSet.toArray();
```

### RowRecord

Represents a single row of data from the query result.

#### Methods

##### Timestamp Access

```typescript
const timestamp = row.getTimestamp(); // Returns number (milliseconds)
```

##### Access by Column Name

```typescript
// Typed getters
const stringValue = row.getString('name');
const intValue = row.getInt('count');
const longValue = row.getLong('id');
const floatValue = row.getFloat('temperature');
const doubleValue = row.getDouble('humidity');
const boolValue = row.getBoolean('status');

// Generic getter (returns any)
const value = row.getValue('columnName');
```

##### Access by Column Index

```typescript
const stringValue = row.getStringByIndex(0);
const intValue = row.getIntByIndex(1);
const floatValue = row.getFloatByIndex(2);
const value = row.getValueByIndex(3);
```

##### Null Checking

```typescript
if (row.isNull('optionalColumn')) {
  console.log('Column is null');
} else {
  const value = row.getString('optionalColumn');
}

// By index
if (row.isNullByIndex(0)) {
  console.log('First column is null');
}
```

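When many columns are nullable, the check-then-read pattern can be folded into a small helper. This is a sketch of our own (`getOrDefault` is not part of the client API); it is typed against just the two row methods it needs:

```typescript
// Minimal shape of the RowRecord methods the helper relies on.
interface NullableRow {
  isNull(column: string): boolean;
  getValue(column: string): unknown;
}

// Returns the column's value, or `fallback` when the column is null.
function getOrDefault<T>(row: NullableRow, column: string, fallback: T): T {
  return row.isNull(column) ? fallback : (row.getValue(column) as T);
}
```

Usage: `const status = getOrDefault(row, 'status', 'unknown');`
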
##### Get All Data

```typescript
const fields = row.getFields(); // Returns array of field values
const array = row.toArray(); // Returns [timestamp, ...fields]
```

## Advanced Usage

### Configuring Fetch Size

Control how many rows are fetched in each batch:

```typescript
const session = new Session({
  host: 'localhost',
  port: 6667,
  fetchSize: 1024, // Fetch 1024 rows at a time
});

const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');
// Will automatically fetch in batches of 1024 rows
```

### Processing Large Result Sets

For very large result sets, use the iterator pattern to avoid memory issues:

```typescript
const dataSet = await session.executeQueryStatement('SELECT * FROM root.large_dataset');

let count = 0;
let sum = 0;

while (await dataSet.hasNext()) {
  const row = dataSet.next();
  sum += row.getDouble('value');
  count++;

  // Log progress every 10000 rows
  if (count % 10000 === 0) {
    console.log(`Processed ${count} rows...`);
  }
}

console.log(`Total rows: ${count}, Average: ${sum / count}`);

await dataSet.close();
```

### Error Handling

Always use try/finally to ensure cleanup:

```typescript
let dataSet;
try {
  dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');

  while (await dataSet.hasNext()) {
    const row = dataSet.next();
    // Process row
  }
} catch (error) {
  console.error('Query error:', error);
  throw error;
} finally {
  if (dataSet) {
    await dataSet.close();
  }
}
```

### Working with Multiple Columns

```typescript
const dataSet = await session.executeQueryStatement(`
  SELECT temperature, humidity, pressure, status
  FROM root.weather.station1
  WHERE time > now() - 1h
`);

while (await dataSet.hasNext()) {
  const row = dataSet.next();

  const reading = {
    timestamp: new Date(row.getTimestamp()),
    temperature: row.getFloat('temperature'),
    humidity: row.getFloat('humidity'),
    pressure: row.getFloat('pressure'),
    status: row.getString('status'),
  };

  // Process reading
  if (!row.isNull('status') && reading.temperature > 30) {
    console.log('High temperature alert:', reading);
  }
}

await dataSet.close();
```

### Aggregation Queries

```typescript
const dataSet = await session.executeQueryStatement(`
  SELECT AVG(temperature), MAX(temperature), MIN(temperature)
  FROM root.test.d1
  GROUP BY ([2024-01-01, 2024-02-01), 1d)
`);

while (await dataSet.hasNext()) {
  const row = dataSet.next();

  console.log({
    timestamp: new Date(row.getTimestamp()),
    avg: row.getDouble('AVG(temperature)'),
    max: row.getDouble('MAX(temperature)'),
    min: row.getDouble('MIN(temperature)'),
  });
}

await dataSet.close();
```

## Migration from Old Pattern

The old pattern loaded all data into memory at once. The new pattern uses lazy loading with iterators.

### Old Approach (No Longer Supported)

```typescript
// ❌ This no longer works - executeQueryStatement now returns SessionDataSet
// const result = await session.executeQueryStatement('SELECT * FROM root.test.d1');
// for (const row of result.rows) {
//   const timestamp = row[0];
//   const value1 = row[1];
//   const value2 = row[2];
//   console.log(timestamp, value1, value2);
// }
```

### New Approach (Current Implementation)

```typescript
// ✅ New way - lazy loading with iterator
const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');

while (await dataSet.hasNext()) {
  const row = dataSet.next();

  const timestamp = row.getTimestamp();
  const value1 = row.getValueByIndex(0);
  const value2 = row.getValueByIndex(1);

  console.log(timestamp, value1, value2);
}

await dataSet.close();
```

### Migration Steps

1. **Change method call**: `executeQueryStatement()` now returns `SessionDataSet` (not `QueryResult`)
2. **Replace array access**: Use the iterator pattern instead of `result.rows`
3. **Update row access**: Use `RowRecord` methods instead of array indices
4. **Add cleanup**: Always call `await dataSet.close()`

### For Small Datasets

If you need all data at once for small result sets, use `toArray()`:

```typescript
const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');
const allRows = await dataSet.toArray(); // Loads everything into memory
// allRows is [[timestamp, value1, value2], ...]
```

⚠️ **Warning**: `toArray()` loads all data into memory. Only use it for small result sets.

## Best Practices

1. **Always Close**: Use try/finally to ensure `dataSet.close()` is called
2. **Check hasNext()**: Always call `hasNext()` before calling `next()`
3. **Use Fetch Size**: Set an appropriate fetch size based on memory and network constraints
4. **Handle Nulls**: Check `isNull()` before accessing nullable columns
5. **Typed Access**: Use typed getters (`getInt`, `getString`) for type safety
6. **Avoid toArray()**: Don't use `toArray()` for large datasets

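The `hasNext()`/`next()` pair also maps naturally onto JavaScript's async iteration protocol, so consumers can use `for await...of`. The adapter below is our own sketch (the client does not ship it); note that it deliberately leaves `close()` to the caller:

```typescript
// Adapts any hasNext()/next() pair to an async generator.
async function* rows<Row>(dataSet: {
  hasNext(): Promise<boolean>;
  next(): Row;
}): AsyncGenerator<Row, void, undefined> {
  while (await dataSet.hasNext()) {
    yield dataSet.next();
  }
}
```

Usage: `for await (const row of rows(dataSet)) { /* process row */ }`, followed by `await dataSet.close()`.
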
## Performance Tips

1. **Fetch Size**: A larger fetch size means fewer network calls but more memory used
   - Small datasets: 100-1000
   - Large datasets: 1000-10000
   - Very large: 10000-50000

2. **Batch Processing**: Process rows in batches for better performance

```typescript
const dataSet = await session.executeQueryStatement('SELECT * FROM root.test.d1');
const batch = [];

while (await dataSet.hasNext()) {
  batch.push(dataSet.next());

  if (batch.length >= 1000) {
    await processBatch(batch);
    batch.length = 0;
  }
}

if (batch.length > 0) {
  await processBatch(batch);
}

await dataSet.close();
```

3. **Column Access**: Access by index is slightly faster than access by name

```typescript
// Faster
const byIndex = row.getIntByIndex(0);

// Slightly slower (name lookup)
const byName = row.getInt('temperature');
```

## Troubleshooting

### "No more rows available" Error

```typescript
// ❌ Wrong - calling next() without checking hasNext()
const row = dataSet.next(); // May throw error

// ✅ Correct
if (await dataSet.hasNext()) {
  const row = dataSet.next();
}
```

### "Column not found" Error

```typescript
// Make sure column names match exactly (case-sensitive)
const columns = dataSet.getColumnNames();
console.log('Available columns:', columns);

// Use correct column name
const value = row.getString('temperature'); // Not 'Temperature' or 'temp'
```

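A small guard can turn a silent name mismatch into an actionable error that lists the available columns. `requireColumn` is our own sketch, built on the `getColumnNames()` method described above:

```typescript
// Minimal shape of the dataset method the guard needs.
interface ColumnInfo {
  getColumnNames(): string[];
}

// Throws a descriptive error unless `name` is an exact, case-sensitive match.
function requireColumn(dataSet: ColumnInfo, name: string): void {
  const columns = dataSet.getColumnNames();
  if (!columns.includes(name)) {
    throw new Error(`Column '${name}' not found. Available: ${columns.join(', ')}`);
  }
}
```

Calling `requireColumn(dataSet, 'temperature')` once before the iteration loop fails fast instead of erroring on the first row.
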
### Memory Issues

```typescript
// ❌ Don't do this with large datasets
const allRows = await dataSet.toArray(); // Loads everything into memory

// ✅ Process iteratively instead
while (await dataSet.hasNext()) {
  const row = dataSet.next();
  // Process and discard each row
}
```

## See Also

- [Implementation Guide](implementation.md)
- [Data Types Reference](data-types.md)
- [TypeScript Examples](typescript-examples.md)