llm_service: opt-in Bedrock prompt caching via cache_prefix
Add a cache_prefix parameter to call_llm. When provided for a Claude
model, the stable text (system prompt, ASVS requirement text, shared
inventory) is prepended to the first message as its own content block
marked cache_control={"type":"ephemeral"}, ahead of the volatile
content. litellm translates this OpenAI-format marker to Bedrock's native
cachePoint; cached input bills at ~10% on a hit.
- Marker carries NO ttl (Bedrock 400s on ttl; litellm #17250/#15880).
- Injection builds a NEW message list; caller's messages are not mutated.
Handles str content, existing block-form content, and empty lists.
- Gated to Claude models via _supports_prompt_caching; the OpenAI
'responses' path builds its own input from the original messages and is
untouched. stream_llm is intentionally not wired (the cached analysis
path uses call_llm).
- Sub-2048-token prefixes are silently uncached (no error), so pass only a
genuinely large prefix.
Rebuilt against the current gofannon tree (post bedrock-mythos STS work).
Unit-tested: block placement, no-ttl marker, no caller mutation,
block-form preservation, support gate.
Gofannon is a provider- and model-agnostic toolkit and web application for prototyping AI agents and the lightweight web UIs that wrap them. Subject matter experts compose tools, data sources, and decision paths through a guided interface, preview agent interactions in real time, and hand off working agent-driven experiences without committing to a single AI framework or model provider.
git clone https://github.com/The-AI-Alliance/gofannon.git cd gofannon/webapp/infra/docker docker-compose up --build
See the quickstart guide for details, including required environment configuration.
Full documentation lives in docs/ and is published at https://the-ai-alliance.github.io/gofannon/. Highlights:
Gofannon is the Welsh god of smithcraft. See About the name for the story behind the choice.
Planned features and their current status are tracked in ROADMAP.md.
Contributions are welcome. See CONTRIBUTING.md for how to get started, including the “good first issue” label for newcomers and contribution guides for adding tools, integrating new agentic frameworks, and extending the web UI.
Thanks to the open-source community for contributions and support that have made this project possible.
Gofannon is licensed under the Apache License, Version 2.0. See LICENSE for the full text.