Open Source · Apache 2.0

Oubliette Dungeon

Automated adversarial testing for LLM applications. 57 attack scenarios across 8 categories. Know your blind spots before attackers do.

pip install oubliette-dungeon

57 Attack Scenarios, 8 Categories

Every scenario is mapped to MITRE ATLAS techniques and OWASP LLM Top 10 categories.

Prompt Injection: 12
Jailbreak: 8
Data Exfiltration: 7
Persona Override: 6
Logic Traps: 6
Encoding Attacks: 6
Multi-Turn: 6
Context Switching: 6
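
As a quick sanity check, the per-category counts above can be summed against the headline figure. The dictionary below only restates the numbers on this page; it is not loaded from the package, and the variable names are illustrative.

```python
# Scenario counts per category, copied by hand from the list above.
CATEGORY_COUNTS = {
    "Prompt Injection": 12,
    "Jailbreak": 8,
    "Data Exfiltration": 7,
    "Persona Override": 6,
    "Logic Traps": 6,
    "Encoding Attacks": 6,
    "Multi-Turn": 6,
    "Context Switching": 6,
}

# Sum the per-category counts and report the number of categories.
total = sum(CATEGORY_COUNTS.values())
print(f"{total} scenarios across {len(CATEGORY_COUNTS)} categories")
# prints "57 scenarios across 8 categories"
```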

Features

Everything you need to red team your LLM applications.

57 Attack Scenarios

YAML-defined attack scenarios mapped to MITRE ATLAS and OWASP LLM Top 10. Covers prompt injection, jailbreak, exfiltration, persona override, and more.

Multi-Provider Comparison

Run the same attack suite against OpenAI, Anthropic, Ollama, Azure, and more. Compare detection rates side by side.

React Dashboard

Interactive dashboard with real-time results, trend analysis, provider comparison charts, and scenario drill-down.

CLI + API

Click-based CLI for scripting and CI/CD. REST API for programmatic access. JSON and HTML report generation.

Tool Integrations

Optional adapters for PyRIT, DeepTeam, Garak, and AIX. Use industry-standard tools alongside Dungeon scenarios.

Scheduled Campaigns

Schedule recurring red team campaigns. Track detection rate trends over time. Get alerts when regressions are detected.
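
To make the provider comparison and regression alerts above concrete, here is a minimal sketch of the underlying arithmetic. It assumes each scenario result reduces to a boolean (True means the target detected the attack) and that a regression is a drop in detection rate larger than a chosen tolerance. The function names, the `tolerance` parameter, the provider labels, and the sample values are all hypothetical and are not taken from Dungeon's API or from real runs.

```python
from typing import Dict, List


def detection_rate(outcomes: List[bool]) -> float:
    """Fraction of scenarios detected, where True means 'detected'."""
    if not outcomes:
        raise ValueError("no outcomes to score")
    return sum(outcomes) / len(outcomes)


def compare_providers(results: Dict[str, List[bool]]) -> Dict[str, float]:
    """Detection rate per provider for the same attack suite."""
    return {name: detection_rate(outcomes) for name, outcomes in results.items()}


def find_regressions(
    previous: Dict[str, float],
    current: Dict[str, float],
    tolerance: float = 0.05,
) -> List[str]:
    """Providers whose detection rate fell by more than `tolerance`."""
    return sorted(
        name
        for name, rate in current.items()
        if name in previous and previous[name] - rate > tolerance
    )


# Placeholder data for illustration only; not measured results.
last_campaign = {"provider_a": 0.75, "provider_b": 0.50}
this_campaign = compare_providers(
    {
        "provider_a": [True, True, False, False],
        "provider_b": [True, False, True, False],
    }
)
regressed = find_regressions(last_campaign, this_campaign)
print(this_campaign, regressed)
```

With the placeholder data, `provider_a` drops from 0.75 to 0.5 and is flagged, while `provider_b` stays at 0.5 and is not.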

Quick Start

Three commands to your first red team campaign.

1. Install
   pip install oubliette-dungeon
2. Run all 57 scenarios against your target
   oubliette-dungeon run --target http://localhost:5000/api/chat
3. Compare providers
   oubliette-dungeon compare --providers openai,anthropic,ollama
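
For scripted or CI use, steps 2 and 3 can be wrapped in a small Python helper. This is a sketch under assumptions: it uses only the commands and flags shown above, the helper names are hypothetical, and this page does not document what the CLI's exit codes mean, so the return value is passed through uninterpreted.

```python
import subprocess
from typing import List


def build_run_command(target_url: str) -> List[str]:
    """Argument list for the `run` command shown in step 2."""
    return ["oubliette-dungeon", "run", "--target", target_url]


def build_compare_command(providers: List[str]) -> List[str]:
    """Argument list for the `compare` command shown in step 3."""
    return ["oubliette-dungeon", "compare", "--providers", ",".join(providers)]


def run_step(argv: List[str]) -> int:
    """Run one CLI step and return its exit code.

    Requires `oubliette-dungeon` to be installed and on PATH.
    The meaning of non-zero exit codes is an assumption left to the caller.
    """
    return subprocess.run(argv, check=False).returncode


# Build the commands without executing them.
run_argv = build_run_command("http://localhost:5000/api/chat")
compare_argv = build_compare_command(["openai", "anthropic", "ollama"])
print(" ".join(run_argv))
print(" ".join(compare_argv))
```

Passing an argument list rather than a shell string avoids quoting problems when the target URL contains characters such as `&` or `?`.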

MCP Server Integration

Run red team scenarios directly from Claude Desktop or any MCP-compatible client. No CLI needed.

list_scenarios

Browse and filter 57 attack scenarios by category, difficulty, or keyword.

run_scenario

Execute a single scenario against any target URL with full result analysis.

run_category

Run all scenarios in a category and get aggregated detection rate metrics.

get_results

Query historical test results from the built-in results database.

get_metrics

Compute detection rate, pass@k, risk density, and other security metrics.

export_report

Generate JSON or HTML reports from test campaign results.
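
The metrics named under get_metrics are not defined on this page. As a rough illustration, the sketch below assumes detection rate is detected attempts divided by total attempts, and that pass@k follows the standard unbiased estimator from code-generation evaluation, 1 - C(n-c, k) / C(n, k), for n attempts of which c succeeded. Whether Dungeon computes them this way is an assumption, the function names are hypothetical, and risk density is left out because nothing here says how it is computed.

```python
from math import comb


def detection_rate(detected: int, total: int) -> float:
    """Share of attack attempts that the target detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= detected <= total:
        raise ValueError("detected must be between 0 and total")
    return detected / total


def pass_at_k(n: int, c: int, k: int) -> float:
    """Probability that at least one of k attempts, drawn without
    replacement from n attempts with c successes, is a success.

    Standard estimator: 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n:
        raise ValueError("c must be between 0 and n")
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and n")
    # math.comb returns 0 when k > n - c, which yields 1.0 here.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Illustrative numbers only: 10 attempts, 3 successes.
print(detection_rate(7, 10))
print(pass_at_k(n=10, c=3, k=1))
print(pass_at_k(n=10, c=3, k=5))
```

With k = 1 the estimator reduces to c / n, which is a convenient way to check an implementation.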

Install and run
pip install oubliette-dungeon-mcp && oubliette-dungeon-mcp

Pairs with Oubliette Shield

Dungeon attacks. Shield defends. Run Dungeon against a Shield-protected endpoint to measure your detection rate and find gaps.

Start Testing Your AI Today

Free and open source. Enterprise support available.