
RAMPART & Clarity: security testing for AI agents
Microsoft open-sources RAMPART and Clarity, two frameworks for security-testing AI agents at development time.
What happened
Microsoft has open-sourced two security testing frameworks for AI agents (autonomous AI systems capable of executing multi-step tasks like booking appointments, running code, or calling external APIs): RAMPART and Clarity.
RAMPART (*Risk Assessment and Measurement Platform for Agentic Red Teaming*) is a Pytest-native framework — it plugs directly into the test infrastructure most Python teams already run. It lets developers write security test cases for agentic AI (AI that acts autonomously across multi-step workflows), covering *prompt injection* (manipulating input to make an agent execute malicious instructions), unsafe tool use, privilege escalation within agent flows, and out-of-spec behaviors.
Clarity handles observability: tracing agent decisions, which tools it calls, in what order, and with what inputs. It makes *red teaming* (structured, controlled attack simulation) reproducible and auditable rather than ad-hoc.
Both tools are available on GitHub under the MIT license.
Why it matters
There was no standard for security-testing AI agents. Teams either ran ad-hoc manual red teams or relied on generic evals that don't cover agentic-specific vectors: persistent tool access, multi-turn context manipulation, and data exfiltration through external API calls.
Microsoft open-sourcing this sends two clear signals. First, the attack surface of AI agents is now documented and large enough to justify dedicated tooling. Second, the industry needs shared vocabulary for agentic security — and Microsoft is betting theirs becomes the baseline.
The Pytest-native design is the right call. Zero learning curve for any Python team. No new DSL (domain-specific language). No CI/CD overhaul. You write security tests the same way you write unit tests.
What to do
- Add RAMPART to your CI pipeline if you ship agents on any LLM — GPT-4o, Claude, Gemini, or open-weight models.
- Use Clarity to map your agent's tool call graph before any real red team exercise.
- Study the example test cases in the repo: they're the minimum baseline for *prompt injection*, tool-based privilege escalation, and data leakage coverage.
- If your agent touches external tools (APIs, filesystem, code execution), prioritize those vectors in your test suite — highest blast radius if compromised.
Security tooling for AI development stopped being optional. RAMPART makes the integration concrete, replicable, and frictionless.
Share this story
Help more people discover BBLabs News.
Want to get news like this every day?
Browse all articles