Oubliette Shield
The AI firewall that detects prompt injections, deceives attackers with honeypots, and generates threat intelligence — all in under 2 milliseconds.
Three Pillars of AI Defense
Detection alone isn't enough. Oubliette combines detection, deception, and intelligence into a unified platform.
Detection
5-stage tiered ensemble pipeline. 85-90% detection rate with low false positives. Blocks obvious attacks in microseconds, reserves LLM judge for the 5-15% that need it.
Deception
Three deception modes turn attacks into intelligence-gathering operations. Honeypot returns fake data, Tarpit wastes attacker time, Redirect steers conversations.
Intelligence
Every detected attack generates structured threat intelligence. STIX 2.1 export, MITRE ATLAS mapping, IOC extraction, CEF logging for SIEM integration.
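For SIEM integration, each detection can be emitted as a CEF log line. A minimal sketch of what that formatting involves, assuming illustrative vendor/product fields and signature IDs (not Shield's actual event schema):

```python
# Illustrative CEF formatting sketch; vendor, signature IDs, and
# extension keys here are assumptions, not Shield's real output.

def cef_escape(value: str) -> str:
    """Escape the characters CEF reserves inside extension values."""
    return value.replace("\\", "\\\\").replace("=", "\\=").replace("\n", "\\n")

def format_cef(signature_id: str, name: str, severity: int, extensions: dict) -> str:
    ext = " ".join(f"{k}={cef_escape(str(v))}" for k, v in extensions.items())
    # CEF header: CEF:Version|Vendor|Product|DeviceVersion|SignatureID|Name|Severity|Extension
    return f"CEF:0|Oubliette|Shield|1.0|{signature_id}|{name}|{severity}|{ext}"

line = format_cef("PI-001", "Prompt Injection Detected", 8,
                  {"msg": "jailbreak attempt", "src": "10.0.0.5"})
print(line)
```

Any CEF-speaking SIEM (ArcSight, Splunk with a CEF add-on, etc.) can ingest lines in this shape directly from a local syslog file, which is what makes the air-gapped deployment mode workable.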
5-Stage Detection Pipeline
Block obvious attacks in microseconds. Reserve expensive LLM calls for the 5-15% that need them.
Input Sanitizer
<1ms: Strips 9 types of encoding attacks, Unicode obfuscation, and invisible characters before any analysis begins.
Pre-Filter
~10ms: 11 pattern-matching rules block obvious prompt injections, jailbreaks, and DAN attacks instantly. 1,550x faster than LLM-only.
ML Classifier
~2ms: LogisticRegression + TF-IDF with 733 features. F1=0.98, AUC=0.99. Catches sophisticated attacks the pre-filter misses.
LLM Judge
12 providers: Only 5-15% of inputs reach the LLM judge. Supports OpenAI, Anthropic, Azure, Bedrock, Vertex, Ollama, and more.
Session Tracker
Multi-turn: Accumulates attack signals across conversation turns. Escalates sessions when thresholds are exceeded.
"Most attacks are obvious — a pattern match catches it in 10 milliseconds. Only the truly ambiguous inputs need the full LLM judge."
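The tiered idea can be sketched in a few lines. This is a toy illustration of the routing logic, not Shield's internals: the patterns, the stand-in classifier, and the thresholds are all assumptions made up for the example.

```python
# Toy sketch of tiered detection: cheap stages short-circuit, and only
# ambiguous inputs reach the expensive judge. All rules are illustrative.
import re
import unicodedata

PREFILTER_PATTERNS = [
    re.compile(r"ignore (all )?previous instructions", re.I),
    re.compile(r"\bDAN\b"),
]

def sanitize(text: str) -> str:
    # Drop invisible format characters (e.g. zero-width spaces, category Cf).
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

def classify(text: str) -> float:
    # Stand-in for the ML classifier: a pseudo-score in [0, 1].
    suspicious = ("system prompt", "jailbreak", "override")
    return min(1.0, sum(w in text.lower() for w in suspicious) / 2)

def analyze(text: str, judge=None) -> str:
    text = sanitize(text)
    if any(p.search(text) for p in PREFILTER_PATTERNS):
        return "blocked:prefilter"      # rule hit, no model needed
    score = classify(text)
    if score >= 0.8:
        return "blocked:classifier"
    if 0.4 <= score < 0.8 and judge is not None:
        return judge(text)              # only ambiguous inputs pay LLM cost
    return "allowed"

print(analyze("ignore previous instructions"))  # blocked:prefilter
print(analyze("what's the weather?"))           # allowed
```

Note how the sanitizer runs first: stripping zero-width characters before pattern matching is what defeats obfuscation tricks like `ign​ore previous instructions`.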
ML Classifier Details
Purpose-built for speed and accuracy. Runs on every request without adding perceptible latency.
Architecture
- Model: LogisticRegression
- Feature Extraction: TF-IDF + Structural
- Feature Dimensions: 733
- Training Samples: 1,365
- Categories: 553 benign / 812 malicious
Performance
- F1 Score: 0.98
- AUC-ROC: 0.99
- Inference Time: ~2ms
- False Positive Rate: Low
- Cross-Val F1 Mean: 0.986
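For readers less familiar with the metric: F1 is the harmonic mean of precision and recall. A quick sketch of the arithmetic on a hypothetical confusion matrix (the counts below are illustrative, not Shield's actual evaluation data):

```python
# Illustrative arithmetic only: how an F1 around 0.98 arises from a
# confusion matrix. These counts are hypothetical, not the real eval.
def f1(tp: int, fp: int, fn: int) -> float:
    precision = tp / (tp + fp)   # of flagged inputs, how many were attacks
    recall = tp / (tp + fn)      # of actual attacks, how many were flagged
    return 2 * precision * recall / (precision + recall)

# e.g. 800 of 812 malicious samples caught, with 14 benign false alarms:
score = f1(tp=800, fp=14, fn=12)
print(round(score, 3))  # 0.984
```

The harmonic mean punishes imbalance, so a high F1 means the classifier is neither trigger-happy (low precision) nor permissive (low recall).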
12 LLM Backends
Use any LLM provider as your judge backend. Switch with a single config change.
Plus any OpenAI-compatible server
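The "single config change" can be pictured as a provider registry keyed by name. This is a stub sketch of the pattern, not Shield's internals; provider identifiers other than "openai" (which appears in Quick Start) are assumed here.

```python
# Stub sketch of one-value backend switching; the real Shield wires up
# actual SDK clients. Provider names below are assumed identifiers.
JUDGE_BACKENDS = {
    "openai":    lambda text: "verdict-from-openai",
    "anthropic": lambda text: "verdict-from-anthropic",
    "ollama":    lambda text: "verdict-from-local-ollama",  # on-prem / air-gapped
}

def make_judge(provider: str):
    try:
        return JUDGE_BACKENDS[provider]
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None

judge = make_judge("ollama")  # switching backends is one config value
print(judge("suspicious input"))
```

The same lookup shape is what lets an OpenAI-compatible server slot in: anything speaking that wire protocol can sit behind one registry entry.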
Quick Start
Drop-in integration with your existing stack. Choose your framework:
pip install oubliette-shield[langchain,fastapi,litellm,crewai,haystack]

from oubliette_shield import Shield

shield = Shield(provider="openai")
result = shield.analyze("user input here")
if result.blocked:
    print("Attack detected!", result.verdict)

Detection Capabilities
Comprehensive coverage across all known prompt injection categories.
Deploy Anywhere
From cloud to air-gapped SCIF environments. Zero cloud dependencies required.
Cloud
Docker/K8s in front of OpenAI, Anthropic, Azure, or Bedrock
On-Premise
Ollama backend, SQLite storage, zero external dependencies
Air-Gapped
SCIF-ready, no internet required, CEF to local SIEM only
MCP Server Integration
Use Shield as an MCP server in Claude Desktop, Claude Code, or any MCP-compatible client. Every tool call is security-scanned in real time.
analyze
Scan text for prompt injection and jailbreak attacks. Returns verdict, ML score, and MITRE ATLAS mapping.
validate_tool_call
Validate MCP tool arguments for injection attacks, path traversal, SSRF, and credential leaks.
scan_output
Scan LLM output for secrets, PII, suspicious URLs, invisible text, and data leakage.
get_session
Retrieve session state including threat counts, escalation status, and attack pattern history.
list_honey_tools
List available deception tool definitions for injection into MCP server tool lists.
export_threat_intel
Export STIX 2.1 threat intelligence bundle for a session's detected attacks.
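A STIX 2.1 export is just a JSON bundle of typed objects. A minimal sketch of the shape `export_threat_intel` would produce, assuming an indicator per detected attack; the pattern string and field values are illustrative, not Shield's actual output:

```python
# Sketch of a STIX 2.1 bundle's shape; object contents are illustrative.
import json
import uuid
from datetime import datetime, timezone

def stix_indicator(pattern: str) -> dict:
    now = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.000Z")
    return {
        "type": "indicator",
        "spec_version": "2.1",
        "id": f"indicator--{uuid.uuid4()}",
        "created": now,
        "modified": now,
        "pattern": pattern,
        "pattern_type": "stix",
        "valid_from": now,
    }

bundle = {
    "type": "bundle",
    "id": f"bundle--{uuid.uuid4()}",
    "objects": [
        stix_indicator("[artifact:payload_bin MATCHES 'ignore previous instructions']"),
    ],
}
print(json.dumps(bundle, indent=2))
```

Because the bundle is standard STIX 2.1, it can be handed to any TIP or SIEM that speaks TAXII/STIX without translation.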
pip install oubliette-shield-mcp && oubliette-shield-mcp

Compliance-Ready from Day One
Mapped to every major AI security framework. Audit-ready documentation included.
Start Protecting Your AI Today
Free and open source. Enterprise support available.