Microsoft Just Open-Sourced the AI Agent Red-Team Stack It Uses Internally



May 22, 2026

Microsoft just open-sourced the AI agent red-team stack it has been using internally

On May 20, 2026, Microsoft's AI Red Team — the internal unit that stress-tests the company's own AI systems — published two security tools on GitHub: RAMPART, a pytest-native framework for continuously red-teaming AI agents in CI, and Clarity, a structured design-review tool that pushes back on the architectural decisions an agent gets baked with. Both are free. Both have been used inside Microsoft before release.

The release matters because, until this week, the AI red-team workflow at Microsoft was mostly inaccessible to anyone outside Microsoft.

What the tools actually do

RAMPART — Risk Assessment and Measurement Platform for Agentic Red Teaming — builds on PyRIT, Microsoft's existing open-source generative-AI red-teaming library released in 2024. Where PyRIT is for security researchers probing a finished system, RAMPART is for engineers shipping one. Tests are written as pytest cases, each describing an adversarial scenario — cross-prompt injection, data exfiltration, behavioral regression — and routed to the agent through a thin adapter. Pass/fail results gate the CI build the same way an integration test does. Ram Shankar Siva Kumar, founder of Microsoft's AI Red Team, said the company's incident-response team used RAMPART to generate 100 variants of a single reported vulnerability and verify mitigations against each one — work that would have taken Microsoft experts weeks was completed in hours.

RAMPART ships with adapter examples, supports probabilistic re-runs with configurable pass thresholds, and is designed to live in the same pull request as the agent change it tests.
Clarity runs structured conversations covering problem definition, solution exploration, failure analysis, and decision tracking — writing every outcome to a .clarity-protocol/ directory as human-readable markdown.
Clarity's failure analysis uses multiple AI "thinkers" examining a proposed design from different angles — security, human factors, adversarial scenarios, operational concerns — then the engineering team works through the grouped results together.

Why it matters

The AI-agent attack surface has been documented in incidents this month alone: the Mini Shai-Hulud npm worm hitting Mistral AI and TanStack, the AI-developed 2FA-bypass zero-day Google's Threat Intelligence Group caught in the wild, the Comment-and-Control prompt-injection pattern that hijacked Claude Code, Gemini CLI, and Copilot Agent in the same week. The pattern across those incidents is the same: an agent's intended capabilities — reading a PR title, executing a tool, calling an API — become the attack surface. Static AppSec workflows do not catch this. The failure modes are probabilistic and behavioral, not deterministic.

RAMPART turns the red-team test into a regression test. Clarity moves the security review left to the point where the decisions are still cheap to change. Neither is novel as a concept — both are how mature AppSec works for traditional code. What is new is that a major vendor's internal AI red-team workflow is now an open-source artifact any team building agents can adopt this afternoon.

What to do about it

Clone RAMPART and write one failing test for your most-deployed agent. Start with cross-prompt injection — paste an attacker payload into the data source the agent reads from, and verify the agent does not act on it. One failing test in CI beats a quarterly red-team report.
Run Clarity before your next agent design review. The conversation it forces — what tool access does this agent need, what does failure look like, who signs off on irreversible actions — is the conversation that does not happen often enough.
Treat agent safety tests like integration tests. Gate the merge on them. Block builds on regressions. Add a new test for every new tool, data source, or capability the agent gets.

Bottom line

The defensive playbook for AI agents has been: hire a red team, hope they catch things before the attackers do. Microsoft just published a way to fold that work into the pull request. The tooling does not solve prompt injection — nothing does, yet — but it moves agent safety from a one-time review into a set of living artifacts engineers maintain on every commit. For teams shipping production agents, that is the cheapest control upgrade available this week.

Follow us on social media:

Microsoft Just Open-Sourced the AI Agent Red-Team Stack It Uses Internally

How to Spot a Deepfake Video in 60 Seconds

Popular articles

Microsoft just open-sourced the AI agent red-team stack it has been using internally

What the tools actually do

Why it matters

What to do about it

Bottom line

Related articles

A Single HTTP Request Hands Out Shells on Most Public ChromaDB Servers

AI Just Wrote a Working Zero-Day. The Exploitation Window Is Now Hours.

Google Catches the First AI-Built Zero-Day in the Wild