Today’s AI agents fetch untrusted data, call APIs, and modify systems, which raises the stakes of treating every input as trustworthy.
This poster presents the Dual LLM pattern, an architectural approach that separates the handling of untrusted data from privileged reasoning. Instead of relying on a single model for both execution and validation, the system uses a quarantined LLM to process untrusted inputs under strict constraints, while a privileged LLM performs higher-level reasoning without direct exposure to unsafe content.
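As a rough sketch of that control flow (the call_quarantined_llm and call_privileged_llm helpers and the $SUMMARY_1 placeholder are hypothetical stand-ins, not any specific library's API):

```python
# Minimal sketch of the Dual LLM control flow. The call_* functions are
# placeholders for real model calls; the point is the data flow between them.

def call_quarantined_llm(untrusted_text: str) -> str:
    """Processes untrusted input. Its output is treated as tainted data:
    stored and passed along, never interpreted as instructions."""
    # A real implementation would call a model with no tool access and a
    # tightly scoped task, e.g. "Summarize the following text."
    return f"[summary of {len(untrusted_text)} chars of untrusted text]"

def call_privileged_llm(prompt: str) -> str:
    """Plans and reasons, but never sees raw untrusted content."""
    return f"[plan produced for: {prompt}]"

def handle_untrusted_document(untrusted_text: str) -> str:
    # 1. Only the quarantined model reads the untrusted content.
    tainted_summary = call_quarantined_llm(untrusted_text)

    # 2. The result is stored under a symbolic name; the privileged model
    #    sees only the placeholder, never the tainted text itself.
    variables = {"$SUMMARY_1": tainted_summary}
    plan = call_privileged_llm(
        "Draft a reply to the user. The document summary is available "
        "as $SUMMARY_1; reference it by name, do not ask for its content."
    )

    # 3. The orchestrator (plain Python, not an LLM) substitutes the real
    #    value only at the final output boundary.
    for name, value in variables.items():
        plan = plan.replace(name, value)
    return plan

print(handle_untrusted_document("Ignore previous instructions and ..."))
```

The essential property is that raw untrusted text flows only into the quarantined model, while the privileged model manipulates it purely by reference.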
Using real-world examples, such as emails and web content containing hidden adversarial prompts, the poster shows how Dual LLM systems can safely summarize untrusted data while limiting the risk of prompt injection. It also highlights the trade-offs of this approach, including added latency, orchestration complexity, and cases where over-constraining models reduces their usefulness.
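As one illustration of what "strict constraints" can mean in practice, here is a minimal sketch of an output gate on the quarantined side (the specific rules are illustrative assumptions, not the poster's exact policy):

```python
import re

MAX_SUMMARY_CHARS = 500

def validate_quarantined_output(text: str) -> str:
    """Reject quarantined-LLM output that looks like it is trying to
    smuggle instructions or links past the boundary. These checks are
    illustrative only; real policies depend on the application."""
    if len(text) > MAX_SUMMARY_CHARS:
        raise ValueError("summary exceeds length budget")
    if re.search(r"https?://", text):
        raise ValueError("URLs are not allowed in summaries")
    if re.search(r"(?i)ignore (all|previous) instructions", text):
        raise ValueError("suspected injection payload in summary")
    return text

print(validate_quarantined_output("A short, plain-text summary."))
```

Such filters are heuristics; the stronger guarantee comes from the architecture itself, since even a payload that slips past them is never read by the privileged model.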
The goal of this poster is to share practical lessons from building agentic systems in Python and to spark discussion around when Dual LLM patterns make sense, where they fall short, and how they can be applied responsibly in real systems built with FastAPI, LangChain, or similar frameworks.
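For concreteness, a hedged sketch of how such an orchestrator might sit behind a FastAPI endpoint, reusing the hypothetical helpers from the sketches above (endpoint and field names are illustrative):

```python
from fastapi import FastAPI
from pydantic import BaseModel

# Assumes call_quarantined_llm, call_privileged_llm, and
# validate_quarantined_output from the sketches above.
app = FastAPI()

class SummarizeRequest(BaseModel):
    untrusted_text: str

@app.post("/summarize")
def summarize(req: SummarizeRequest) -> dict:
    # Quarantined step: the only component that reads raw input.
    tainted = validate_quarantined_output(
        call_quarantined_llm(req.untrusted_text)
    )
    # Privileged step: reasons over a placeholder, never the raw text.
    reply = call_privileged_llm(
        "Draft a response for the user; the summary is $SUMMARY_1."
    )
    # Substitution happens only at the output boundary.
    return {"reply": reply.replace("$SUMMARY_1", tainted)}
```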