Platform · Redteam + Pentest

Attack your agent before someone else does.

Promptfoo's catalog of 60+ plugins, 18 strategies, and the Vouch-AI offensive agent — all bounded by a hard cost cap, all turning into findings.

Two surfaces, one workflow.

Promptfoo plugin catalog

60+ plugins: harmful, PII, BFLA/BOLA/RBAC, ASCII smuggling, debug access, divergent repetition, excessive agency, hallucination, hijacking, indirect prompt injection, intent, MCP exploitation, memory poisoning, off-topic, prompt extraction, RAG document exfiltration, RAG source attribution, reasoning DoS, shell/SQL/SSRF injection, tool discovery, vertical packs (financial, medical, pharmacy, ecommerce, insurance, telecom, real estate, teen safety).

Strategy mixers

base64 / hex / leet / rot13 / homoglyph / multilingual / audio / image / video, plus iterative attacks (crescendo, GOAT, hydra, simba, bestOfN, layer composite), `mischievousUser` multi-turn, `indirectWebPwn`, `authoritativeMarkupInjection`, `mathPrompt`, `likert`, `gcg`, `citation`.

Vouch-AI offensive agent (v0)

Goal-driven attacker that picks a skill, crafts a prompt, sends it to your agent's chat endpoint, and iterates based on the response. Multi-judge ensemble: deterministic (canary regex) + tool-oracle (high-blast tool fired with attacker target) + LLM judge.

Skill library

8 agent-attack patterns shipped: indirect prompt injection, RAG poisoning, memory poisoning, MCP exploitation, tool-call hijack, cross-tenant escape, approval bypass, confused deputy. Each skill is a markdown playbook.

Cost-bounded by default

Three budgets: turn count (default 8), USD cost (default $1.00), and 100k-token chat-history budget. Misconfigured runs can't drain credit.

Continuous learning

Every successful attack feeds the firewall classifier and the corpus the Mutual Defense Network shares (opt-in). The defender gets smarter as attackers find more.

Run a redteam pack on your agent.

Configure the target, pick a skill, hit go. Findings land in the inbox.