What should I do about this? (1)

Add RAMPART to your CI pipeline if you build LLM-powered agents.

What should I do about this? (2)

Use Clarity to map your agent's tool calls before red teaming.

What should I do about this? (3)

Review repo test cases to cover prompt injection and data leakage.

Back to homeIA

RAMPART & Clarity: security testing for AI agents

Microsoft open-sources RAMPART and Clarity, two frameworks for security-testing AI agents at development time.

by Gorka El Bochi Morillo

2 min read

·May 26, 2026

What happened

Microsoft has open-sourced two security testing frameworks for AI agents (autonomous AI systems capable of executing multi-step tasks like booking appointments, running code, or calling external APIs): RAMPART and Clarity.

RAMPART (*Risk Assessment and Measurement Platform for Agentic Red Teaming*) is a Pytest-native framework — it plugs directly into the test infrastructure most Python teams already run. It lets developers write security test cases for agentic AI (AI that acts autonomously across multi-step workflows), covering *prompt injection* (manipulating input to make an agent execute malicious instructions), unsafe tool use, privilege escalation within agent flows, and out-of-spec behaviors.

Clarity handles observability: tracing agent decisions, which tools it calls, in what order, and with what inputs. It makes *red teaming* (structured, controlled attack simulation) reproducible and auditable rather than ad-hoc.

Both tools are available on GitHub under the MIT license.

Why it matters

There was no standard for security-testing AI agents. Teams either ran ad-hoc manual red teams or relied on generic evals that don't cover agentic-specific vectors: persistent tool access, multi-turn context manipulation, and data exfiltration through external API calls.

Microsoft open-sourcing this sends two clear signals. First, the attack surface of AI agents is now documented and large enough to justify dedicated tooling. Second, the industry needs shared vocabulary for agentic security — and Microsoft is betting theirs becomes the baseline.

The Pytest-native design is the right call. Zero learning curve for any Python team. No new DSL (domain-specific language). No CI/CD overhaul. You write security tests the same way you write unit tests.

What to do

Add RAMPART to your CI pipeline if you ship agents on any LLM — GPT-4o, Claude, Gemini, or open-weight models.
Use Clarity to map your agent's tool call graph before any real red team exercise.
Study the example test cases in the repo: they're the minimum baseline for *prompt injection*, tool-based privilege escalation, and data leakage coverage.
If your agent touches external tools (APIs, filesystem, code execution), prioritize those vectors in your test suite — highest blast radius if compromised.

Security tooling for AI development stopped being optional. RAMPART makes the integration concrete, replicable, and frictionless.

Help more people discover BBLabs News.

RAMPART & Clarity: security testing for AI agents

Vertical Download image

LinkedIn X WhatsApp

Destacado

IA4 jun 20262 min

Malicious npm targets Claude AI user directory

npm package `mouse5212-super-formatter` exfiltrates files from Claude AI's user data directory to GitHub.

Audit global npm dependencies with `npm ls -g` to catch unknown packages.
Review `/mnt/user-data` contents and treat any sensitive files there as compromised.
Block unexpected outbound traffic to `api.github.com` from dev processes in your SIEM.

Gorka El Bochi Morillo

Leer artículo

IA3 jun 20262 min

ChatGPhish: how ChatGPT web summaries become phishing lures

ChatGPT's web summary renderer trusts external Markdown, enabling indirect prompt injection attacks that deliver phishing links inside trusted AI responses.

Leer artículo

IA2 jun 20262 min

Claude Mythos goes public: what the security delay means

Anthropic confirms Mythos-class Claude models will reach the public after a delay over software security risks.

Leer artículo

Want to get news like this every day?

Browse all articles

What happened

Why it matters

What to do

Related articles

Malicious npm targets Claude AI user directory

ChatGPhish: how ChatGPT web summaries become phishing lures

Claude Mythos goes public: what the security delay means