SessionPool Exception Handling and Health Monitoring

Overview

The Apache IoTDB C# client library provides comprehensive exception handling and health monitoring capabilities for SessionPool operations. This document explains how to handle pool depletion scenarios, monitor pool health, and implement recovery strategies.

SessionPoolDepletedException

Description

SessionPoolDepletedException is thrown when the SessionPool cannot provide a client connection. This happens in one of the following situations:

  • All clients in the pool are currently in use
  • Client connections have failed and reconnection attempts were unsuccessful
  • The pool wait timeout has been exceeded

Exception Properties

The exception provides detailed diagnostic information through the following properties:

| Property | Type | Description |
|---|---|---|
| DepletionReason | string | A human-readable description of why the pool was depleted |
| AvailableClients | int | Number of currently available clients in the pool at the time of the exception |
| TotalPoolSize | int | The total configured size of the session pool |
| FailedReconnections | int | Number of failed reconnection attempts since the pool was opened |

Example Usage

using Apache.IoTDB;
using System;

try
{
    var sessionPool = new SessionPool.Builder()
        .Host("127.0.0.1")
        .Port(6667)
        .PoolSize(4)
        .Build();

    await sessionPool.Open();

    // Perform operations...
    await sessionPool.InsertRecordAsync("root.sg.d1", record);
}
catch (SessionPoolDepletedException ex)
{
    Console.WriteLine($"Pool depleted: {ex.DepletionReason}");
    Console.WriteLine($"Available clients: {ex.AvailableClients}/{ex.TotalPoolSize}");
    Console.WriteLine($"Failed reconnections: {ex.FailedReconnections}");

    // Implement recovery strategy (see below)
}

Pool Health Metrics

Monitoring Pool Status

The SessionPool class exposes real-time health metrics that can be used for monitoring and alerting:

var sessionPool = new SessionPool.Builder()
    .Host("127.0.0.1")
    .Port(6667)
    .PoolSize(8)
    .Build();

await sessionPool.Open();

// Check pool health
Console.WriteLine($"Available Clients: {sessionPool.AvailableClients}");
Console.WriteLine($"Total Pool Size: {sessionPool.TotalPoolSize}");
Console.WriteLine($"Failed Reconnections: {sessionPool.FailedReconnections}");

Health Metrics

| Metric | Property | Description | Recommended Threshold |
|---|---|---|---|
| Available Clients | AvailableClients | Number of idle clients ready for use | Alert if < 25% of pool size |
| Total Pool Size | TotalPoolSize | Configured maximum pool size | N/A (constant) |
| Failed Reconnections | FailedReconnections | Cumulative count of failed reconnection attempts | Alert if > 0 and increasing |

Failure Scenarios and Recovery Strategies

Scenario 1: Pool Exhaustion (High Load)

Symptoms:

  • SessionPoolDepletedException with reason “Connection pool is empty and wait time out”
  • AvailableClients = 0
  • FailedReconnections = 0 or low

Root Cause: Application workload exceeds pool capacity

Recovery Strategies:

  1. Increase Pool Size:
var sessionPool = new SessionPool.Builder()
    .Host("127.0.0.1")
    .Port(6667)
    .PoolSize(16)  // Increased from 8
    .Build();
  2. Implement Connection Retry with Backoff:
int maxRetries = 3;
int retryDelayMs = 1000;

for (int i = 0; i < maxRetries; i++)
{
    try
    {
        await sessionPool.InsertRecordAsync(deviceId, record);
        break;  // Success
    }
    catch (SessionPoolDepletedException ex) when (i < maxRetries - 1)
    {
        await Task.Delay(retryDelayMs * (i + 1));  // Linear backoff: 1 s, then 2 s
    }
}
  3. Optimize Operation Duration:
    • Reduce the time each client is held
    • Batch multiple operations together
    • Use async operations efficiently
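
When many callers retry at the same moment, delays that grow in fixed steps can send them all back to the pool together. The following is a minimal sketch of a delay calculator that grows exponentially, caps the delay, and adds random jitter. The `BackoffPolicy` class and its parameter names are hypothetical helpers for illustration and are not part of the IoTDB client.

```csharp
using System;

// Hypothetical helper for illustration; not part of the Apache IoTDB client.
public static class BackoffPolicy
{
    // Upper bound of the delay for a zero-based attempt:
    // baseDelayMs * 2^attempt, capped at maxDelayMs.
    public static int MaxDelayForAttempt(int attempt, int baseDelayMs, int maxDelayMs)
    {
        if (attempt < 0) throw new ArgumentOutOfRangeException(nameof(attempt));

        // Limit the shift and use long arithmetic so the value cannot overflow
        long delay = (long)baseDelayMs << Math.Min(attempt, 20);
        return (int)Math.Min(delay, maxDelayMs);
    }

    // "Full jitter": a random delay between 0 and the capped exponential bound,
    // so that concurrent callers do not all retry at the same instant.
    public static int NextDelay(int attempt, int baseDelayMs, int maxDelayMs, Random random)
    {
        int cap = MaxDelayForAttempt(attempt, baseDelayMs, maxDelayMs);
        return (int)(random.NextDouble() * cap);
    }
}
```

Inside the catch block of a retry loop, the delay could then be awaited with `await Task.Delay(BackoffPolicy.NextDelay(i, 1000, 30000, random));`, where `random` is a `Random` instance owned by the caller.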

Scenario 2: Network Connectivity Issues

Symptoms:

  • SessionPoolDepletedException with reason “Reconnection failed”
  • AvailableClients decreases over time
  • FailedReconnections > 0 and increasing

Root Cause: IoTDB server unreachable or network issues

Recovery Strategies:

  1. Reinitialize SessionPool:
catch (SessionPoolDepletedException ex) when (ex.FailedReconnections > 5)
{
    Console.WriteLine($"Critical: {ex.FailedReconnections} failed reconnections");

    // Close existing pool
    await sessionPool.Close();

    // Wait for network recovery
    await Task.Delay(5000);

    // Create new pool
    sessionPool = new SessionPool.Builder()
        .Host("127.0.0.1")
        .Port(6667)
        .PoolSize(8)
        .Build();

    await sessionPool.Open();
}
  2. Implement Circuit Breaker Pattern:
public class SessionPoolCircuitBreaker
{
    private const int FailureThreshold = 5;
    private static readonly TimeSpan OpenDuration = TimeSpan.FromMinutes(1);

    private readonly SessionPool _pool;
    private int _failureCount = 0;
    private bool _circuitOpen = false;
    private DateTime _lastFailureTime;

    public SessionPoolCircuitBreaker(SessionPool pool)
    {
        _pool = pool;
    }

    // Note: this class is not thread-safe; guard its state if it is shared across threads
    public async Task<T> ExecuteAsync<T>(Func<SessionPool, Task<T>> operation)
    {
        // While the circuit is open, fail fast until the cool-down period has elapsed
        if (_circuitOpen && DateTime.UtcNow - _lastFailureTime < OpenDuration)
        {
            throw new InvalidOperationException("Circuit breaker is open");
        }

        try
        {
            var result = await operation(_pool);
            _failureCount = 0;  // Reset on success
            _circuitOpen = false;
            return result;
        }
        catch (SessionPoolDepletedException)
        {
            _failureCount++;
            _lastFailureTime = DateTime.UtcNow;  // UTC avoids jumps from time zone or DST changes

            if (_failureCount >= FailureThreshold)
            {
                _circuitOpen = true;
                Console.WriteLine("Circuit breaker opened - too many failures");
            }
            throw;
        }
    }
}

Scenario 3: Server Overload

Symptoms:

  • Intermittent SessionPoolDepletedException
  • Both connection timeouts and reconnection failures

Root Cause: IoTDB server is overloaded

Recovery Strategies:

  1. Implement Rate Limiting:
using System.Threading;

private SemaphoreSlim _rateLimiter = new SemaphoreSlim(10, 10);  // Max 10 concurrent operations

public async Task RateLimitedInsert(string deviceId, RowRecord record)
{
    await _rateLimiter.WaitAsync();
    try
    {
        await sessionPool.InsertRecordAsync(deviceId, record);
    }
    finally
    {
        _rateLimiter.Release();
    }
}
  2. Add Timeout Configuration:
var sessionPool = new SessionPool.Builder()
    .Host("127.0.0.1")
    .Port(6667)
    .Timeout(120)  // Increased timeout for slow server
    .Build();
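
With the rate limiter shown above, callers wait indefinitely for a free slot when the server is slow. One possible variant, sketched here, bounds that wait and fails fast instead. The `BoundedRateLimiter` class is a hypothetical helper and not part of the IoTDB client; it relies only on `SemaphoreSlim.WaitAsync(TimeSpan)`, which returns false when the wait times out.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of the Apache IoTDB client.
public class BoundedRateLimiter
{
    private readonly SemaphoreSlim _slots;
    private readonly TimeSpan _maxWait;

    public BoundedRateLimiter(int maxConcurrent, TimeSpan maxWait)
    {
        _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);
        _maxWait = maxWait;
    }

    // Runs the operation if a slot frees up within maxWait;
    // otherwise throws TimeoutException without running it.
    public async Task RunAsync(Func<Task> operation)
    {
        if (!await _slots.WaitAsync(_maxWait))
        {
            throw new TimeoutException(
                $"No free slot within {_maxWait.TotalMilliseconds} ms");
        }

        try
        {
            await operation();
        }
        finally
        {
            // Always return the slot, even if the operation throws
            _slots.Release();
        }
    }
}
```

A caller could wrap an insert as `await limiter.RunAsync(() => sessionPool.InsertRecordAsync(deviceId, record));` and treat the `TimeoutException` as a signal to shed load rather than queue more work.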

Monitoring and Alerting Recommendations

Health Check Implementation

public class SessionPoolHealthCheck
{
    private readonly SessionPool _pool;

    public SessionPoolHealthCheck(SessionPool pool)
    {
        _pool = pool;
    }

    public HealthStatus CheckHealth()
    {
        var availableRatio = (double)_pool.AvailableClients / _pool.TotalPoolSize;

        if (_pool.FailedReconnections > 10)
        {
            return new HealthStatus
            {
                Status = "Critical",
                Message = $"High reconnection failures: {_pool.FailedReconnections}",
                Recommendation = "Check IoTDB server availability"
            };
        }

        if (availableRatio < 0.25)
        {
            return new HealthStatus
            {
                Status = "Warning",
                Message = $"Low available clients: {_pool.AvailableClients}/{_pool.TotalPoolSize}",
                Recommendation = "Consider increasing pool size"
            };
        }

        return new HealthStatus
        {
            Status = "Healthy",
            Message = $"Pool healthy: {_pool.AvailableClients}/{_pool.TotalPoolSize} available"
        };
    }
}

public class HealthStatus
{
    public string Status { get; set; }
    public string Message { get; set; }
    public string Recommendation { get; set; }
}

Metrics Collection for Monitoring Systems

// Example: Export metrics to Prometheus, StatsD, or similar.
// MetricsCollector is a placeholder for your metrics library's API.
public class SessionPoolMetricsCollector
{
    private readonly SessionPool _pool;

    public SessionPoolMetricsCollector(SessionPool pool)
    {
        _pool = pool;
    }

    public void CollectMetrics()
    {
        // Gauge: Current available clients
        MetricsCollector.Set("iotdb_pool_available_clients", _pool.AvailableClients);

        // Gauge: Total pool size
        MetricsCollector.Set("iotdb_pool_total_size", _pool.TotalPoolSize);

        // Cumulative count of failed reconnections, exported as its current value
        MetricsCollector.Set("iotdb_pool_failed_reconnections", _pool.FailedReconnections);

        // Calculated: Pool utilization percentage
        var utilization = (1.0 - (double)_pool.AvailableClients / _pool.TotalPoolSize) * 100;
        MetricsCollector.Set("iotdb_pool_utilization_percent", utilization);
    }
}

Recommended Alert Rules

  1. Critical Alerts:

    • FailedReconnections > 10: Server connectivity issues
    • AvailableClients == 0 for > 30 seconds: Complete pool exhaustion
  2. Warning Alerts:

    • AvailableClients < TotalPoolSize * 0.25: Pool under pressure
    • FailedReconnections > 0 and increasing: Network instability
  3. Info Alerts:

    • Pool utilization > 75% for extended periods: Consider scaling
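
Two of the rules above depend on history rather than on a single reading: exhaustion lasting more than 30 seconds, and a failure count that is increasing. The sketch below shows one way to evaluate the critical and warning rules from successive samples. `PoolAlertEvaluator` and `AlertLevel` are hypothetical names, the inputs are plain values that a caller would read from the pool's metric properties, and the info rule is left out because it needs a longer observation window.

```csharp
using System;

public enum AlertLevel { None, Warning, Critical }

// Hypothetical helper for illustration; not part of the Apache IoTDB client.
// Feed it one sample per monitoring interval.
public class PoolAlertEvaluator
{
    private static readonly TimeSpan ExhaustionWindow = TimeSpan.FromSeconds(30);

    private DateTime? _zeroSince;           // When AvailableClients first reached 0
    private int? _lastFailedReconnections;  // Value seen in the previous sample

    public AlertLevel Evaluate(int availableClients, int totalPoolSize,
                               int failedReconnections, DateTime nowUtc)
    {
        // Track how long the pool has been completely exhausted
        if (availableClients == 0)
        {
            if (_zeroSince == null) _zeroSince = nowUtc;
        }
        else
        {
            _zeroSince = null;
        }

        bool exhaustedTooLong =
            _zeroSince.HasValue && nowUtc - _zeroSince.Value > ExhaustionWindow;

        // "Increasing" means higher than in the previous sample
        bool failuresIncreasing =
            _lastFailedReconnections.HasValue &&
            failedReconnections > _lastFailedReconnections.Value;
        _lastFailedReconnections = failedReconnections;

        if (failedReconnections > 10 || exhaustedTooLong)
        {
            return AlertLevel.Critical;
        }

        if (availableClients < totalPoolSize * 0.25 ||
            (failedReconnections > 0 && failuresIncreasing))
        {
            return AlertLevel.Warning;
        }

        return AlertLevel.None;
    }
}
```

Passing the timestamp in as a parameter, instead of reading the clock inside the method, keeps the rules easy to test with fixed times.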

Best Practices

  1. Pool Sizing:

    • Start with poolSize = 2 × expected concurrent operations
    • Monitor and adjust based on actual usage patterns
    • Larger pools hold more server connections open; they improve throughput only while the server has capacity to spare
  2. Error Handling:

    • Always catch SessionPoolDepletedException specifically
    • Log exception properties for debugging
    • Implement appropriate retry logic based on depletion reason
  3. Monitoring:

    • Continuously monitor AvailableClients metric
    • Track FailedReconnections as a leading indicator of problems
    • Set up alerts before pool is completely depleted
  4. Resource Management:

    • Always call sessionPool.Close() when done
    • Use using statements or try-finally blocks for proper cleanup
    • Don't create multiple SessionPool instances unnecessarily

Example: Complete Production-Ready Implementation

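using Apache.IoTDB;
using System;
using System.Threading;
using System.Threading.Tasks;

public class ProductionSessionPoolManager
{
    private volatile SessionPool _pool;

    // Serializes pool reinitialization; unlike lock, SemaphoreSlim can be awaited
    private readonly SemaphoreSlim _reinitLock = new SemaphoreSlim(1, 1);

    // Stops the health monitor when Cleanup is called
    private readonly CancellationTokenSource _monitorCts = new CancellationTokenSource();
    private Task _monitorTask;

    public async Task Initialize()
    {
        await OpenPool();

        // Start health monitoring once; it keeps running across pool reinitializations
        if (_monitorTask == null)
        {
            _monitorTask = Task.Run(() => MonitorHealth(_monitorCts.Token));
        }
    }

    private async Task OpenPool()
    {
        var pool = new SessionPool.Builder()
            .Host("127.0.0.1")
            .Port(6667)
            .PoolSize(8)
            .Timeout(60)
            .Build();

        await pool.Open();
        _pool = pool;  // Publish the pool only after it has opened successfully
    }

    public async Task<T> ExecuteWithRetry<T>(Func<SessionPool, Task<T>> operation)
    {
        const int maxRetries = 3;
        const int baseDelayMs = 1000;

        for (int attempt = 0; attempt < maxRetries; attempt++)
        {
            var pool = _pool;  // Remember which pool this attempt used

            try
            {
                return await operation(pool);
            }
            catch (SessionPoolDepletedException ex)
            {
                Console.WriteLine($"Attempt {attempt + 1} failed: {ex.Message}");
                Console.WriteLine($"Pool state - Available: {ex.AvailableClients}/{ex.TotalPoolSize}, Failed reconnections: {ex.FailedReconnections}");

                if (attempt == maxRetries - 1)
                {
                    // Last attempt failed
                    if (ex.FailedReconnections > 5)
                    {
                        // Reinitialize the pool so that later calls can succeed
                        await ReinitializePool(pool);
                    }
                    throw;
                }

                // Exponential backoff: 1 s, then 2 s
                await Task.Delay(baseDelayMs * (1 << attempt));
            }
        }

        throw new InvalidOperationException("Should not reach here");
    }

    private async Task ReinitializePool(SessionPool failedPool)
    {
        await _reinitLock.WaitAsync();
        try
        {
            // Another caller already replaced the pool while this one was waiting
            if (!ReferenceEquals(_pool, failedPool))
            {
                return;
            }

            try
            {
                await failedPool.Close();
            }
            catch (Exception ex)
            {
                // The old pool may already be broken; log and continue
                Console.WriteLine($"Closing the old pool failed: {ex.Message}");
            }

            await Task.Delay(5000);  // Wait for server recovery
            await OpenPool();
        }
        finally
        {
            _reinitLock.Release();
        }
    }

    private async Task MonitorHealth(CancellationToken token)
    {
        while (!token.IsCancellationRequested)
        {
            try
            {
                await Task.Delay(10000, token);  // Check every 10 seconds

                var pool = _pool;
                var availableRatio = (double)pool.AvailableClients / pool.TotalPoolSize;

                if (pool.FailedReconnections > 10)
                {
                    Console.WriteLine($"CRITICAL: {pool.FailedReconnections} failed reconnections");
                }
                else if (availableRatio < 0.25)
                {
                    Console.WriteLine($"WARNING: Low available clients - {pool.AvailableClients}/{pool.TotalPoolSize}");
                }
            }
            catch (OperationCanceledException)
            {
                break;  // Cleanup was requested
            }
            catch (Exception ex)
            {
                Console.WriteLine($"Health check failed: {ex.Message}");
            }
        }
    }

    public async Task Cleanup()
    {
        _monitorCts.Cancel();

        // Awaiting a null task would throw, so check before closing
        if (_pool != null)
        {
            await _pool.Close();
        }
    }
}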

Summary

The SessionPool exception handling and health monitoring features provide comprehensive tools for building robust IoTDB applications:

  • Use SessionPoolDepletedException to understand and react to pool issues
  • Monitor AvailableClients, TotalPoolSize, and FailedReconnections metrics
  • Implement appropriate recovery strategies based on failure scenarios
  • Set up proactive monitoring and alerting to prevent issues
  • Follow best practices for pool sizing and resource management