SimExR: Simulation Execution and Reasoning Framework

A comprehensive framework for importing, executing, and analyzing scientific simulations with AI-powered reasoning capabilities.

🚀 Overview

SimExR is a FastAPI-based framework that provides a complete pipeline for:

  • Importing external simulation scripts from GitHub
  • Transforming scripts into standardized simulate(**params) functions
  • Executing single and batch simulations with automatic result storage
  • Analyzing results using AI-powered reasoning agents
  • Managing models, results, and conversations through REST APIs
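To make the target of the transformation step concrete, here is a minimal sketch of what a script might look like once it exposes a standardized simulate(**params) entry point. It integrates the van der Pol oscillator with a fixed-step RK4 scheme in pure Python. The parameter names (mu, z0, eval_time, t_iteration, plot) are taken from the examples in this README; the default values and the keys of the returned dict are assumptions, not SimExR's documented result schema.

```python
from typing import Any, Dict, List, Tuple


def simulate(**params: Any) -> Dict[str, Any]:
    """Integrate the van der Pol oscillator with fixed-step RK4.

    Illustrative only: the defaults and the keys of the returned dict are
    assumptions, not SimExR's documented result schema.
    """
    mu = float(params.get("mu", 1.0))
    x, y = (float(v) for v in params.get("z0", [1.5, 0.5]))
    eval_time = float(params.get("eval_time", 25))
    steps = int(params.get("t_iteration", 250))
    dt = eval_time / steps
    # The "plot" flag is accepted but ignored in this sketch.

    def rhs(x: float, y: float) -> Tuple[float, float]:
        # x' = y, y' = mu * (1 - x^2) * y - x
        return y, mu * (1.0 - x * x) * y - x

    t: List[float] = [0.0]
    xs: List[float] = [x]
    ys: List[float] = [y]
    for i in range(1, steps + 1):
        k1x, k1y = rhs(x, y)
        k2x, k2y = rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = rhs(x + dt * k3x, y + dt * k3y)
        x += dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0
        y += dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0
        t.append(i * dt)
        xs.append(x)
        ys.append(y)
    return {"mu": mu, "t": t, "x": xs, "y": ys}
```

Because every model shares this keyword-only call shape, a runner can execute any imported script with simulate(**parameters) without knowing the model's internals.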

๐Ÿ—๏ธ Architecture

Core Components

simexr_mod/
├── api/                    # FastAPI application and routers
│   ├── main.py            # Main API application
│   ├── dependencies.py    # Dependency injection
│   └── routers/           # API endpoint definitions
│       ├── simulation.py  # Simulation execution APIs
│       ├── reasoning.py   # AI reasoning APIs
│       ├── database.py    # Database read-only APIs
│       └── health.py      # Health check APIs
├── core/                   # Core business logic
│   ├── interfaces.py      # Abstract base classes
│   ├── patterns.py        # Design patterns implementation
│   └── services.py        # Main service layer
├── execute/               # Simulation execution engine
│   ├── loader/           # Script loading and transformation
│   ├── run/              # Simulation execution
│   └── test/             # Code testing and refinement
├── reasoning/             # AI reasoning engine
│   ├── agent/            # Reasoning agent implementation
│   ├── messages/         # LLM client implementations
│   └── base.py           # Base reasoning classes
├── db/                    # Database layer
│   ├── repositories/     # Data access layer
│   ├── services/         # Database services
│   └── utils/            # Database utilities
├── code/                  # Code processing utilities
│   ├── refactor/         # Code refactoring
│   ├── extract/          # Metadata extraction
│   └── utils/            # Code utilities
└── utils/                 # Configuration and utilities

🛠️ Installation & Setup

Prerequisites

  • Python 3.8+
  • Git
  • OpenAI API key

1. Clone and Setup Environment

# Clone the repository
git clone <repository-url>
cd simexr_mod

# Create virtual environment
python -m venv simexr_venv
source simexr_venv/bin/activate  # On Windows: simexr_venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

2. Configuration

Copy the example configuration file and add your OpenAI API key:

cp config.yaml.example config.yaml

Then edit config.yaml and replace YOUR_OPENAI_API_KEY_HERE with your actual OpenAI API key from https://platform.openai.com/account/api-keys.

3. Database Setup

The framework uses SQLite by default. The database will be automatically created at mcp.db on first run.
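To confirm that the database file was created, it can be opened with Python's standard sqlite3 module. The sketch below only lists table names, because the schema SimExR creates is not documented here. Note that sqlite3.connect creates an empty file when the path does not exist, so point it at a file the server has already written.

```python
import sqlite3
from typing import List


def list_tables(db_path: str = "mcp.db") -> List[str]:
    """Return the names of all tables in the SQLite file at db_path."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling list_tables() from the project root after the first server start should return a non-empty list if the automatic setup succeeded.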

🚀 Quick Start

Option 1: Web UI (Recommended)

Start the complete application with the user-friendly Streamlit interface:

source simexr_venv/bin/activate
python start_streamlit.py

This will automatically:

  • ✅ Start the API server
  • ✅ Launch the Streamlit web interface
  • ✅ Open your browser to http://localhost:8501

Option 2: API Only

Start just the API server for programmatic access:

source simexr_venv/bin/activate
python start_api.py --host 127.0.0.1 --port 8000

The server will be available at http://127.0.0.1:8000, with the interactive API documentation at http://127.0.0.1:8000/docs.

2. Using the Web Interface

Once the Streamlit app is running, you can:

  1. 📥 Import Models: Use the “Import Models” page to import scripts from GitHub
  2. ⚙️ Run Simulations: Use the “Run Simulations” page to execute simulations
  3. 📊 View Results: Use the “View Results” page to explore simulation data
  4. 🤖 AI Analysis: Use the “AI Analysis” page to ask questions about your results
  5. 🔍 Search Models: Use the “Model Search” page to find existing models

🔄 End-to-End Flow

Complete workflow: Import GitHub scripts → Transform with AI → Run simulations → Analyze results → Get AI insights. The system automatically handles script transformation, parameter extraction, and result storage, enabling researchers to go from raw code to AI-powered insights in minutes.

3. Using the API Directly

If you prefer to use the API directly:

# Import and transform a simulation
curl -X POST "http://127.0.0.1:8000/simulation/transform/github" \
  -H "Content-Type: application/json" \
  -d '{
    "github_url": "https://github.com/vash02/physics-systems-dataset/blob/main/vanderpol.py",
    "model_name": "vanderpol_transform",
    "max_smoke_iters": 3
  }'

# Run simulations
curl -X POST "http://127.0.0.1:8000/simulation/run" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "vanderpol_transform_eac8429aea8f",
    "parameters": {
      "mu": 1.5,
      "z0": [1.5, 0.5],
      "eval_time": 25,
      "t_iteration": 250,
      "plot": false
    }
  }'

# Analyze results with AI
curl -X POST "http://127.0.0.1:8000/reasoning/ask" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "vanderpol_transform_eac8429aea8f",
    "question": "What is the behavior of the van der Pol oscillator for mu=1.0 and mu=1.5? How do the trajectories differ?",
    "max_steps": 5
  }'
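The same three calls can be issued from Python. The following is a minimal sketch using only the standard library; the endpoint paths and payload fields are copied from the curl commands above, while the helper names (build_post, post_json) are hypothetical and the shape of the JSON responses is not assumed. The default timeout is generous because the reasoning endpoint took roughly 83 seconds in the tests reported in this README.

```python
import json
import urllib.request
from typing import Any, Dict

BASE_URL = "http://127.0.0.1:8000"  # matches the --host/--port used above


def build_post(path: str, payload: Dict[str, Any]) -> urllib.request.Request:
    """Build a JSON POST request without sending it."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_json(path: str, payload: Dict[str, Any], timeout: float = 120.0) -> Any:
    """Send the request and return the decoded JSON body."""
    with urllib.request.urlopen(build_post(path, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


if __name__ == "__main__":
    # Requires a running API server and an already imported model.
    run = post_json("/simulation/run", {
        "model_id": "vanderpol_transform_eac8429aea8f",
        "parameters": {"mu": 1.5, "z0": [1.5, 0.5], "eval_time": 25,
                       "t_iteration": 250, "plot": False},
    })
    print(run)
```

Splitting request construction from sending keeps the payload logic testable without a live server.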

๐ŸŒ Web Interface

The SimExR framework includes a modern, user-friendly web interface built with Streamlit:

📱 Interface Pages

  • 🏠 Dashboard: Overview of system status, recent activity, and quick actions
  • 📥 Import Models: Import and transform scripts from GitHub URLs
  • ⚙️ Run Simulations: Execute single or batch simulations with custom parameters
  • 📊 View Results: Explore simulation results with interactive data tables
  • 🤖 AI Analysis: Ask AI-powered questions about your simulation results
  • 🔍 Model Search: Search and browse all available models

🎯 Key Features

  • 🔍 Fuzzy Search: Intelligent model search with relevance scoring
  • 📊 Interactive Results: View and download simulation results as CSV
  • 🤖 AI Chat: Natural language analysis of simulation data
  • ⚙️ Parameter Management: Edit and manage simulation parameters
  • 📝 Script Editor: View and edit simulation scripts
  • 📋 Templates: Pre-built parameter templates for common systems

📊 API Endpoints

Health Check APIs

  • GET /health/status - System health status
  • POST /health/test - Run system tests

Simulation APIs

  • POST /simulation/transform/github - Import and transform GitHub scripts
  • POST /simulation/run - Run single simulation
  • POST /simulation/batch - Run batch simulations
  • GET /simulation/models - List all models
  • GET /simulation/models/search - Fuzzy search models by name
  • GET /simulation/models/{model_id} - Get model information
  • GET /simulation/models/{model_id}/results - Get simulation results
  • DELETE /simulation/models/{model_id}/results - Clear model results

Reasoning APIs

  • POST /reasoning/ask - Ask AI reasoning questions
  • GET /reasoning/history/{model_id} - Get reasoning history
  • GET /reasoning/conversations - Get all conversations
  • GET /reasoning/stats - Get reasoning statistics

Database APIs (Read-only)

  • GET /database/results - Get simulation results
  • GET /database/models - Get database models
  • GET /database/stats - Get database statistics

🧪 Testing Results

Complete Workflow Test

We successfully tested the complete workflow from GitHub import to AI analysis:

1. GitHub Script Import & Transformation

# Test URL: https://github.com/vash02/physics-systems-dataset/blob/main/vanderpol.py
# Result: Successfully imported and transformed into simulate(**params) function
# Model ID: vanderpol_transform_eac8429aea8f

2. Single Simulation Execution

# Parameters: mu=1.5, z0=[1.5, 0.5], eval_time=25, t_iteration=250
# Result: Successfully executed with detailed logging
# Execution time: ~0.06 seconds
# Data points: 250 time steps, 15x15 grid

3. Batch Simulation Execution

# Parameter grid: 2 different configurations
# Result: Successfully executed with tqdm progress bars
# Automatic result saving to database
# Execution time: ~0.5 seconds total

4. AI Reasoning Analysis

# Question: "What is the behavior of the van der Pol oscillator for mu=1.0 and mu=1.5?"
# Result: Comprehensive scientific analysis with:
# - Common behavior identification
# - Parameter-specific differences
# - Technical details and insights
# Execution time: ~83 seconds

API Performance Metrics

| API Endpoint                        | Status | Response Time | Features                             |
|-------------------------------------|--------|---------------|--------------------------------------|
| GET /health/status                  | ✅     | <100ms        | System health                        |
| POST /simulation/transform/github   | ✅     | ~5s           | Import + transform + refine          |
| POST /simulation/run                | ✅     | ~0.1s         | Single simulation + auto-save        |
| POST /simulation/batch              | ✅     | ~0.5s         | Batch simulation + tqdm + auto-save  |
| GET /simulation/models              | ✅     | <100ms        | 50 models listed                     |
| GET /simulation/models/search       | ✅     | <100ms        | Fuzzy search with relevance scoring  |
| GET /simulation/models/{id}/results | ✅     | <200ms        | Results with NaN handling            |
| POST /reasoning/ask                 | ✅     | ~83s          | AI analysis with 5 reasoning steps   |
| GET /reasoning/history/{id}         | ✅     | <100ms        | Conversation history                 |
| GET /reasoning/stats                | ✅     | <100ms        | 173 conversations, 18 models         |

Key Features Validated

✅ GitHub Integration: Successfully imports and transforms external scripts
✅ Code Refactoring: Converts scripts to standardized simulate(**params) format
✅ Automatic Result Saving: All simulations automatically saved to database
✅ Enhanced Logging: Detailed execution logs with result previews
✅ tqdm Progress Bars: Visual progress for batch operations
✅ NaN Handling: Proper JSON serialization of scientific data
✅ Fuzzy Search: Intelligent model search with relevance scoring
✅ AI Reasoning: Comprehensive analysis of simulation results
✅ Error Handling: Graceful handling of various error conditions
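The NaN handling item deserves a short illustration. Python's json.dumps emits bare NaN and Infinity tokens by default, which are not valid JSON and can break strict clients. One common remedy, shown here as a sketch and not necessarily how SimExR implements it, is to replace non-finite floats with None before serializing. The function names are hypothetical.

```python
import json
import math
from typing import Any


def sanitize(value: Any) -> Any:
    """Recursively replace NaN and infinite floats with None."""
    if isinstance(value, float) and not math.isfinite(value):
        return None
    if isinstance(value, dict):
        return {key: sanitize(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [sanitize(item) for item in value]
    return value


def to_strict_json(value: Any) -> str:
    # allow_nan=False makes json.dumps raise ValueError instead of emitting
    # NaN/Infinity, so any value missed by sanitize() fails loudly.
    return json.dumps(sanitize(value), allow_nan=False)
```

For example, to_strict_json({"x": [1.0, float("nan")]}) yields a document in which the missing sample appears as null.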

🔧 Advanced Usage

Custom Simulation Parameters

The framework supports dynamic parameter extraction and validation:

# Example parameter structure for van der Pol oscillator
parameters = {
    "mu": 1.5,                    # Damping parameter
    "z0": [1.5, 0.5],            # Initial conditions [x0, y0]
    "eval_time": 25,              # Simulation time
    "t_iteration": 250,           # Number of time steps
    "plot": False                 # Plotting flag
}
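How SimExR validates parameters internally is not described in this README. As a rough client-side illustration, a parameter dict can be checked against a small type schema before it is sent to the API. Both the schema (VANDERPOL_SCHEMA) and the function name are hypothetical and only mirror the example structure above.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical schema: accepted Python types for each parameter.
VANDERPOL_SCHEMA: Dict[str, Tuple[type, ...]] = {
    "mu": (int, float),
    "z0": (list,),
    "eval_time": (int, float),
    "t_iteration": (int,),
    "plot": (bool,),
}


def validate_parameters(
    parameters: Dict[str, Any], schema: Dict[str, Tuple[type, ...]]
) -> List[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    errors: List[str] = []
    for name, expected in schema.items():
        if name not in parameters:
            errors.append(f"missing parameter: {name}")
            continue
        value = parameters[name]
        # bool is a subclass of int, so reject it unless explicitly allowed.
        is_stray_bool = isinstance(value, bool) and bool not in expected
        if is_stray_bool or not isinstance(value, expected):
            wanted = " or ".join(t.__name__ for t in expected)
            errors.append(f"{name}: expected {wanted}, got {type(value).__name__}")
    for name in parameters:
        if name not in schema:
            errors.append(f"unknown parameter: {name}")
    return errors
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in one pass.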

Batch Simulation with Parameter Grids

curl -X POST "http://127.0.0.1:8000/simulation/batch" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "your_model_id",
    "parameter_grid": [
      {"param1": "value1", "param2": "value2"},
      {"param1": "value3", "param2": "value4"}
    ]
  }'
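Writing out every grid entry by hand becomes tedious for larger sweeps. Since parameter_grid is a list of parameter dicts, it can be generated as the Cartesian product of per-parameter value lists. The helper below is a hypothetical convenience on the client side, not part of the SimExR API.

```python
import itertools
from typing import Any, Dict, List, Optional, Sequence


def make_parameter_grid(
    axes: Dict[str, Sequence[Any]],
    fixed: Optional[Dict[str, Any]] = None,
) -> List[Dict[str, Any]]:
    """Expand value lists into one parameter dict per combination.

    axes maps a parameter name to the values to sweep; fixed holds
    parameters that stay constant in every grid point.
    """
    names = list(axes)
    grid: List[Dict[str, Any]] = []
    for combo in itertools.product(*(axes[name] for name in names)):
        point = dict(fixed or {})
        point.update(zip(names, combo))
        grid.append(point)
    return grid


# Two mu values with shared settings give a two-entry grid.
payload = {
    "model_id": "your_model_id",
    "parameter_grid": make_parameter_grid(
        {"mu": [1.0, 1.5]},
        fixed={"z0": [1.5, 0.5], "eval_time": 25, "t_iteration": 250, "plot": False},
    ),
}
```

The resulting payload can be serialized with json.dumps and posted to /simulation/batch in place of the hand-written body.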

Fuzzy Model Search

# Search by partial name
curl "http://127.0.0.1:8000/simulation/models/search?name=vanderpol&limit=5"

# Search by model type
curl "http://127.0.0.1:8000/simulation/models/search?name=lorenz&limit=3"
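The scoring algorithm behind the search endpoint is not documented in this README. To give a feel for what relevance scoring can look like, the sketch below ranks model names with difflib.SequenceMatcher from the standard library and boosts names that contain the query as a substring. The threshold and boost values are arbitrary assumptions.

```python
from difflib import SequenceMatcher
from typing import List, Tuple


def fuzzy_search(
    query: str, names: List[str], limit: int = 5, threshold: float = 0.3
) -> List[Tuple[str, float]]:
    """Return up to `limit` (name, score) pairs, best match first."""
    needle = query.lower()
    scored: List[Tuple[str, float]] = []
    for name in names:
        candidate = name.lower()
        score = SequenceMatcher(None, needle, candidate).ratio()
        if needle in candidate:
            # Substring hits rank above purely approximate matches.
            score = max(score, 0.9)
        if score >= threshold:
            scored.append((name, round(score, 3)))
    # Sort by descending score, then alphabetically for stable ties.
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:limit]
```

With this scheme a query such as "vanderpol" ranks "vanderpol_transform" first even though the full names differ in length.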

AI Reasoning with Custom Questions

curl -X POST "http://127.0.0.1:8000/reasoning/ask" \
  -H "Content-Type: application/json" \
  -d '{
    "model_id": "your_model_id",
    "question": "Analyze the stability of the system and identify bifurcation points",
    "max_steps": 10
  }'

๐Ÿ› Troubleshooting

Common Issues

  1. OpenAI API Key Error

    # Ensure API key is set in utils/config.yaml
    # Or set environment variable
    export OPENAI_API_KEY="your-key-here"
    
  2. Import Errors

    # Ensure virtual environment is activated
    source simexr_venv/bin/activate
    
    # Install missing dependencies
    pip install -r requirements.txt
    
  3. Database Connection Issues

    # Check database file permissions
    ls -la mcp.db
    
    # Recreate database if corrupted
    rm mcp.db
    # Restart server to recreate
    
  4. Simulation Execution Errors

    # Check script syntax
    python -m py_compile your_script.py
    
    # Verify simulate function exists
    grep -n "def simulate" your_script.py
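The grep check in item 4 matches text, so it also hits comments, strings, and nested definitions. A slightly stricter check, sketched here with the standard ast module, parses the script and looks for a top-level simulate function that accepts keyword arguments. The function name has_simulate is hypothetical, and the exact signature SimExR requires is assumed from the simulate(**params) convention used in this README.

```python
import ast


def has_simulate(source: str) -> bool:
    """Return True if source defines a top-level simulate(**kwargs) function.

    Raises SyntaxError if the source is not valid Python, which also covers
    the syntax check from the py_compile step.
    """
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name == "simulate":
            # node.args.kwarg is set only when the signature has **name.
            return node.args.kwarg is not None
    return False


if __name__ == "__main__":
    with open("your_script.py", encoding="utf-8") as handle:
        print(has_simulate(handle.read()))
```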
    

Debug Mode

Enable detailed logging by setting environment variables:

export LOG_LEVEL=DEBUG
export SIMEXR_DEBUG=true
python start_api.py --host 127.0.0.1 --port 8000
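How SimExR interprets these variables internally is not shown in this README. For custom scripts that should follow the same convention, one plausible mapping from the two environment variables to a Python logging level is sketched below; the precedence rule (SIMEXR_DEBUG overrides LOG_LEVEL) and the function name are assumptions.

```python
import logging
import os
from typing import Mapping, Optional


def configure_logging(env: Optional[Mapping[str, str]] = None) -> int:
    """Configure the root logger from LOG_LEVEL / SIMEXR_DEBUG and return the level."""
    env = os.environ if env is None else env
    level = getattr(logging, str(env.get("LOG_LEVEL", "INFO")).upper(), None)
    if not isinstance(level, int):
        level = logging.INFO  # fall back on unknown level names
    if str(env.get("SIMEXR_DEBUG", "")).lower() in {"1", "true", "yes"}:
        level = logging.DEBUG  # assumed precedence: debug flag wins
    logging.basicConfig(
        level=level,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    return level
```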

Performance Optimization

Database Optimization

  • Use appropriate indexes for large datasets
  • Implement result pagination for large result sets

Simulation Optimization

  • Use vectorized operations in simulation scripts
  • Implement parallel processing for batch simulations
  • Cache frequently used simulation results
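As an illustration of the caching suggestion, the sketch below memoizes results by a canonical JSON encoding of the parameters, so that repeated requests with the same values (in any key order) reuse the stored result. It is an in-memory example with hypothetical names; it assumes parameters are JSON-serializable and says nothing about how SimExR stores results.

```python
import json
from typing import Any, Callable, Dict


class ResultCache:
    """Memoize a simulate(**params) callable by its parameter values."""

    def __init__(self, simulate_fn: Callable[..., Any]) -> None:
        self._simulate = simulate_fn
        self._store: Dict[str, Any] = {}
        self.hits = 0

    @staticmethod
    def key(params: Dict[str, Any]) -> str:
        # sort_keys makes the key independent of argument order.
        return json.dumps(params, sort_keys=True)

    def run(self, **params: Any) -> Any:
        cache_key = self.key(params)
        if cache_key in self._store:
            self.hits += 1
        else:
            self._store[cache_key] = self._simulate(**params)
        return self._store[cache_key]
```

Keying on serialized parameters instead of using functools.lru_cache directly avoids the problem that list-valued parameters such as z0 are unhashable.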

AI Reasoning Optimization

  • Implement conversation caching
  • Use streaming responses for long analyses
  • Optimize prompt engineering for faster responses

🔮 Future Enhancements

Planned Features

  • Web UI: Interactive web interface for model management
  • Real-time Monitoring: Live simulation progress tracking
  • Distributed Computing: Multi-node simulation execution
  • Advanced Analytics: Statistical analysis and visualization
  • Model Versioning: Version control for simulation models
  • Plugin System: Extensible architecture for custom components
  • Computational Model MCP Server: MCP server for standardizing end-to-end scientific simulation workflows
  • Complete Agentic Control: Agentic control from experiment initiation through results analysis and reruns

Integration Possibilities

  • Jupyter Notebooks: Direct integration with Jupyter
  • Cloud Platforms: AWS, GCP, Azure deployment
  • Scientific Workflows: Integration with workflow engines
  • Data Lakes: Large-scale data storage and processing

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests for new functionality
  5. Submit a pull request

Support

For questions and support:

  • Create an issue on GitHub
  • Check the documentation at /docs
  • Review the API documentation at /docs

SimExR Framework - Empowering scientific simulation with AI reasoning capabilities.