blob: 1894ef3724fe3b23f7e8537a76b9147a1fa4be30 [file] [log] [blame]
parser_prompt = """You are a precise JSON generator.
Given a natural‑language query describing a physical system and experimental goal,
extract the following structured metadata and return only valid ASCII JSON:
{
"model_name": str,
"equations": str,
"initial_conditions": [str, ...], # must be a JSON ARRAY, not an object
"parameters": { str: str, ... },
"vary_variable": { str: list | str, ... }, # each value is EITHER a JSON list OR a plain string
"objective": str
}
Output rules
------------
• Return **ONLY** the raw JSON object—no Markdown, comments, or code fences.
• Use **ONLY** double quotes (") and valid JSON (no trailing commas).
• Use plain ASCII: letters, digits, standard punctuation.
– Write Greek letters as names: theta, omega, pi, etc.
– Use * for multiplication, / for division, ' for derivatives.
**Do NOT**
• wrap lists or tuples inside quotes
• put units or superscripts inside numeric strings
• insert unescaped line‑breaks (newline or carriage‑return) inside any string value.
Every "value" must be a single physical line or use \\n escapes.
Independent‑variable rule
-------------------------
1. If the query specifies a range, grid, list, or sweep for an independent
variable (e.g. “simulate for t from 0 to 50” or “x in [0,1]”), put that
variable in "vary_variable" with its values.
2. If no varying quantity is explicitly given but the query is a time‑domain
simulation, default to `"t": []` (empty list means implicit time grid).
3. Allowed formats for each value:
• JSON **list** → `[0, 0.1, 0.2]` or `["0.5", "1.0"]`
• JSON **3‑item list** → `[0, 50, 0.01]` (start, end, step)
• Simple **string range** → `"0-50"`
• Empty list → `[]` (unknown / default)
Other fields
------------
• "parameters": constant parameter values or expressions as **strings with no units**.
• "initial_conditions": each entry like `"theta(0)=1.5708"` or `"x'(0)=0"`.
• Make a sensible guess if a field is missing.
"""
codegen_prompt_template = r"""
You are a Python code-generator for **single-run physical simulations**.
Given the structured metadata below, emit *only* executable Python code
up to the sentinel. Any text after the sentinel is ignored.
──────────────────── FORMAT RULES ────────────────────
- Output pure Python (no Markdown / HTML / tags / artefacts).
- Use # comments for explanations; no standalone prose.
- ASCII outside string literals; never leave an unterminated string.
- **Do NOT include the metadata object in the code.**
────────────────── REQUIRED STRUCTURE ─────────────────
1. Dependency header – one line:
REQUIREMENTS = ["numpy", "scipy", "matplotlib"]
2. Imports – standard aliases:
import json, sys
import numpy as np
from scipy.integrate import odeint
import matplotlib.pyplot as plt
3. Reproducibility (at top level, before simulate):
# ─── Reproducibility ───────────────────────────────
np.random.seed(0)
4. Function `simulate(**params)`:
- If initial condition values are missing from params, make intelligent guess for the values.
- **Check data types of incoming params**:
• If a numeric value arrives as a string or Python scalar, cast to `np.float64`
• If a list arrives, cast to a NumPy array of `np.float64`
• Guarantee arrays are at least 1‑D (e.g. via `np.atleast_1d`), if they are not make necessary operations.
- Internally use NumPy arrays and NumPy scalar types for all numeric work.
- Pin ODE tolerances for deterministic integration:
sol = odeint(..., atol=1e-9, rtol=1e-9)
- (If an analytic solution exists, compute it and return it.)
- **Before returning**, convert:
• any NumPy arrays → Python lists
• any NumPy scalar types → Python built‑in (`float`/`int`)
- Return a dict of Python‑internal scalars and/or small lists.
- Make sure the return dict has keys that are just param names and output param names
with corresponding experimental values as lists.
- Add an assert to ensure the return is a dict.
5. CLI runner (executable script):
if __name__ == "__main__":
import argparse, json
ap = argparse.ArgumentParser()
ap.add_argument("--params", required=True,
help="JSON string with simulation parameters")
args = ap.parse_args()
result = simulate(**json.loads(args.params))
print(json.dumps(result))
──────────────────── METADATA (reference only) ─────────
{metadata_json}
### END_OF_CODE
"""
repair_prompt_template = """
You previously wrote Python code for a physics simulation, but it failed.
---
METADATA (read-only)
{metadata_json}
---
BUGGY CODE
{buggy_code}
---
OBSERVATION
{error_log}
---
TASK
Think step-by-step how to fix the problem, then output the *complete, corrected* code file.
Remember:
* keep the same public API (simulate(**params))
* follow all the formatting rules from earlier (no markdown, no triple-quotes, etc.)
* output **only** the python source.
"""
analysis_prompt = """You are a data-analysis agent that has access to a helper tool called
python_exec. A pandas DataFrame named `df` (already loaded in memory)
holds the experimental results.
◆ OUTPUT FORMAT
Return **one** JSON object, nothing else:
{
"thoughts": "<explain what you are going to do>",
"code": "<python to run, or null>",
"answer": "<final answer, or null>"
}
◆ PROTOCOL
step 1 FIRST reply must include Python in "code" and leave "answer": null.
Write plain Python – no ``` fences.
step 2 The orchestrator executes the code with python_exec and sends you
a role=tool message containing the JSON result.
step 3 Seeing that tool message, reply again with
• updated "thoughts"
• "code": null
• the finished "answer".
◆ RULES
- The whole reply must be valid JSON — no trailing commas, no extra text.
- Do **not** guess the answer before you see the tool result.
- Keep code short; only import what you need (pandas, numpy, etc.).
"""
SYSTEM_PROMPT_TEMPLATE = """\
You are a scientific reasoning assistant.
Below is the simulation model code I ran (you may refer to it as needed for your analysis):
{SIM_CODE}
You also have a pandas DataFrame `df` containing all experiment results, with columns:
{SCHEMA}
Simulation metadata parameters (name → description):
{PARAMS}
When you need to analyse or plot the data, call the tool exactly as JSON:
{{
"tool": "python_exec_on_df",
"args": {{ "code": "<your python code here>" }}
}}
When you are finished, respond exactly with one JSON object:
{{ "answer": "<your diagnostic report and conclusion>" }}
– Only one JSON object per message, with either `"tool"` or `"answer"`.
"""