Open Source · Apache 2.0

Oubliette Shield

The AI firewall that detects prompt injections, deceives attackers with honeypots, and generates threat intelligence — all in under 2 milliseconds.

Three Pillars of AI Defense

Detection alone isn't enough. Oubliette combines detection, deception, and intelligence into a unified platform.

Detection

A 5-stage tiered ensemble pipeline with an 85-90% detection rate and low false positives. It blocks obvious attacks in microseconds and reserves the LLM judge for the 5-15% of inputs that need it.

F1: 0.98 · AUC: 0.99 · 2ms ML · 85-90%

Deception

Three deception modes turn attacks into intelligence-gathering operations. Honeypot returns fake data, Tarpit wastes attacker time, Redirect steers conversations.

Honeypot · Tarpit · Redirect · Honey Tokens
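As a rough illustration of how the three modes could be dispatched, here is a minimal Python sketch. The `Mode` enum, the `deceive` function, and the fake payloads are hypothetical names chosen for this example; they are not taken from the Oubliette Shield API.

```python
import secrets
import time
from enum import Enum


class Mode(Enum):
    HONEYPOT = "honeypot"
    TARPIT = "tarpit"
    REDIRECT = "redirect"


def deceive(mode, delay_seconds=2.0, sleep=time.sleep):
    """Return a decoy reply for a detected attack (illustrative only)."""
    if mode is Mode.HONEYPOT:
        # Honey token: a unique fake secret. If it ever shows up elsewhere,
        # it can be traced back to this attacker session.
        token = "sk-fake-" + secrets.token_hex(8)
        return {"reply": f"Internal API key: {token}", "honey_token": token}
    if mode is Mode.TARPIT:
        # Waste the attacker's time before giving a useless answer.
        sleep(delay_seconds)
        return {"reply": "Still processing your request...", "honey_token": None}
    # Redirect: steer the conversation back to a harmless topic.
    return {
        "reply": "I can help with product questions. What would you like to know?",
        "honey_token": None,
    }
```

Passing `sleep` as a parameter keeps the tarpit delay testable without actually waiting.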

Intelligence

Every detected attack generates structured threat intelligence. STIX 2.1 export, MITRE ATLAS mapping, IOC extraction, CEF logging for SIEM integration.

STIX 2.1 · MITRE ATLAS · IOC Extract · CEF Logging
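To make the CEF logging idea concrete, the sketch below formats one detection as a CEF line using the standard header layout (`CEF:Version|Vendor|Product|Device Version|Event Class ID|Name|Severity|Extension`). The vendor, product, and version strings, the event ID, and the `to_cef` helper are assumptions for illustration; the fields Oubliette Shield actually emits may differ.

```python
def _escape_header(value):
    # In CEF header fields, backslashes and pipes must be escaped.
    return str(value).replace("\\", "\\\\").replace("|", "\\|")


def _escape_extension(value):
    # In CEF extension values, backslashes, equals signs, and newlines are escaped.
    return (
        str(value)
        .replace("\\", "\\\\")
        .replace("=", "\\=")
        .replace("\n", "\\n")
    )


def to_cef(event_id, name, severity, extensions,
           vendor="Oubliette", product="Shield", version="1.0"):
    """Build a single CEF:0 log line (vendor/product/version are placeholders)."""
    header = "|".join(
        ["CEF:0"]
        + [_escape_header(v) for v in (vendor, product, version, event_id, name, severity)]
    )
    extension = " ".join(
        f"{key}={_escape_extension(value)}" for key, value in extensions.items()
    )
    return f"{header}|{extension}"


line = to_cef(
    event_id="PI-001",                 # hypothetical event class ID
    name="Prompt injection detected",
    severity=8,
    extensions={"src": "203.0.113.7", "msg": "ignore previous instructions"},
)
print(line)
```

A SIEM that ingests CEF over syslog can parse a line like this without a custom parser, which is the main reason to prefer it over free-form log text.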

5-Stage Detection Pipeline

Block obvious attacks in microseconds. Reserve expensive LLM calls for the 5-15% that need them.
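The tiering idea can be sketched as a short cascade in which each stage either returns a verdict or hands the input to the next, more expensive stage. This is an illustrative sketch only: the `classify` function, the stage callables, and the `low`/`high` thresholds are assumptions, not the project's actual interface or tuning.

```python
def classify(text, prefilter, ml_score, llm_judge, low=0.2, high=0.8):
    """Return (verdict, deciding_stage) using a cheap-to-expensive cascade.

    prefilter(text) -> bool     True when a pattern rule matches
    ml_score(text)  -> float    estimated attack probability in [0, 1]
    llm_judge(text) -> bool     True when the LLM judges the input malicious
    """
    # Cheapest check first: obvious attacks never reach the later stages.
    if prefilter(text):
        return "malicious", "pre-filter"

    # The ML classifier settles confident cases in either direction.
    score = ml_score(text)
    if score >= high:
        return "malicious", "ml-classifier"
    if score <= low:
        return "benign", "ml-classifier"

    # Only ambiguous inputs pay for an LLM call.
    verdict = "malicious" if llm_judge(text) else "benign"
    return verdict, "llm-judge"
```

With thresholds like these, the share of traffic that reaches the LLM judge depends on how often the classifier score falls between `low` and `high`.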

1

Input Sanitizer

<1ms

Strips 9 types of encoding attacks, Unicode obfuscation, and invisible characters before any analysis begins.
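A minimal sketch of this kind of sanitization, using only Python's standard `unicodedata` module, might look like the following. It covers just two of the techniques named above (Unicode normalization and invisible-character removal); the `sanitize` function is a hypothetical helper, not the library's implementation.

```python
import unicodedata


def sanitize(text):
    """Normalize look-alike characters and drop invisible format characters."""
    # NFKC folds compatibility forms (e.g. fullwidth letters) into plain ones.
    normalized = unicodedata.normalize("NFKC", text)
    # Category "Cf" covers zero-width spaces/joiners, BOMs, and bidi controls.
    return "".join(ch for ch in normalized if unicodedata.category(ch) != "Cf")


print(sanitize("ig\u200bnore previous instructions"))
print(sanitize("\uff49\uff47\uff4e\uff4f\uff52\uff45"))
```

One trade-off of dropping every format character is that zero-width joiners inside emoji sequences are removed too, which is usually acceptable for text that is only being analyzed rather than displayed.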

2

Pre-Filter

~10ms

11 pattern-matching rules block obvious prompt injections, jailbreaks, and DAN attacks without calling a model. 1,550x faster than an LLM-only check.
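A pattern-matching pre-filter of this kind can be sketched with a handful of compiled regular expressions. The three rules below are made-up examples for illustration; they are not the 11 rules shipped with Oubliette Shield.

```python
import re

# Hypothetical rules: (rule name, compiled pattern).
RULES = [
    (
        "instruction_override",
        re.compile(
            r"\bignore\s+(?:all\s+)?(?:previous|prior|above)\s+instructions\b",
            re.IGNORECASE,
        ),
    ),
    (
        "jailbreak_dan",
        re.compile(r"\bdo\s+anything\s+now\b", re.IGNORECASE),
    ),
    (
        "prompt_extraction",
        re.compile(
            r"\b(?:reveal|show|print)\s+(?:me\s+)?(?:your|the)\s+system\s+prompt\b",
            re.IGNORECASE,
        ),
    ),
]


def prefilter(text):
    """Return the name of the first matching rule, or None if nothing matches."""
    for name, pattern in RULES:
        if pattern.search(text):
            return name
    return None
```

Compiling the patterns once at import time keeps the per-request cost to a few regex scans.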

3

ML Classifier

~2ms

LogisticRegression + TF-IDF with 733 features. F1=0.98, AUC=0.99. Catches sophisticated attacks the pre-filter misses.

4

LLM Judge

12 providers

Only 5-15% of inputs reach the LLM judge. Supports OpenAI, Anthropic, Azure, Bedrock, Vertex, Ollama, and more.

5

Session Tracker

multi-turn

Accumulates attack signals across conversation turns. Escalates sessions when thresholds are exceeded.
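One simple way to accumulate signals across turns is a per-session running score with decay, escalating once the score crosses a threshold. The `SessionTracker` class below, along with its `threshold` and `decay` parameters, is a hypothetical sketch and not the project's actual tracker.

```python
from collections import defaultdict


class SessionTracker:
    """Keep a decayed running attack score per session (illustrative only)."""

    def __init__(self, threshold=1.0, decay=0.9):
        self.threshold = threshold
        self.decay = decay
        self._scores = defaultdict(float)

    def record(self, session_id, turn_score):
        """Add one turn's score; return True if the session should escalate."""
        # Older turns count a little less each time a new turn arrives.
        updated = self._scores[session_id] * self.decay + turn_score
        self._scores[session_id] = updated
        return updated >= self.threshold
```

This catches multi-turn escalation where no single message is suspicious enough to block on its own, while the decay lets a session recover after an isolated borderline turn.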

"Most attacks are obvious — a pattern match catches it in 10 milliseconds. Only the truly ambiguous inputs need the full LLM judge."

ML Classifier Details

Purpose-built for speed and accuracy. Runs on every request without adding perceptible latency.

Architecture

  • Model: LogisticRegression
  • Feature Extraction: TF-IDF + Structural
  • Feature Dimensions: 733
  • Training Samples: 1,365
  • Categories: 553 benign / 812 malicious

Performance

  • F1 Score: 0.98
  • AUC-ROC: 0.99
  • Inference Time: ~2ms
  • False Positive Rate: Low
  • Cross-Val F1 Mean: 0.986
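To show why inference with this architecture is cheap, the sketch below scores an input the way a logistic regression over TF-IDF features does: weight the known terms, L2-normalize, take a dot product with the coefficients, and apply a sigmoid. The vocabulary, IDF weights, coefficients, and bias are toy values invented for illustration; the real model has 733 features and learned weights.

```python
import math
import re
from collections import Counter

# Toy model parameters (made up for this example).
IDF = {"ignore": 2.0, "previous": 1.8, "instructions": 1.5, "weather": 1.2, "please": 0.5}
COEF = {"ignore": 1.9, "previous": 1.4, "instructions": 1.1, "weather": -1.6, "please": -0.3}
BIAS = -1.0


def tfidf(text):
    """Return an L2-normalized TF-IDF vector over the known vocabulary."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    counts = Counter(t for t in tokens if t in IDF)
    weighted = {term: count * IDF[term] for term, count in counts.items()}
    norm = math.sqrt(sum(v * v for v in weighted.values()))
    return {term: v / norm for term, v in weighted.items()} if norm else {}


def attack_probability(text):
    """Logistic regression: sigmoid of bias plus the weighted feature sum."""
    z = BIAS + sum(COEF[term] * value for term, value in tfidf(text).items())
    return 1.0 / (1.0 + math.exp(-z))
```

Scoring is a sparse dot product followed by one exponential, which is why a model like this can run on every request.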

12 LLM Backends

Use any LLM provider as your judge backend. Switch with a single config change.

OpenAI
Anthropic
Azure OpenAI
AWS Bedrock
Google Vertex
Google Gemini
Ollama
LiteLLM
Cohere
Mistral
Groq
vLLM

Plus any OpenAI-compatible server

Quick Start

Drop-in integration with your existing stack. Choose your framework:

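quickstart.py

from oubliette_shield import Shield

# Create a shield that uses OpenAI as the LLM judge backend.
shield = Shield(provider="openai")

# Run the input through the detection pipeline.
result = shield.analyze("user input here")

if result.is_attack:
    print("Attack detected!", result.verdict)

Install with framework extras (quoted so shells such as zsh do not expand the brackets):

pip install "oubliette-shield[langchain,fastapi,litellm,crewai,haystack]"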

Detection Capabilities

Coverage across ten categories of prompt injection:

Instruction override / prompt injection
Persona override / identity manipulation
DAN and jailbreak attempts
Hypothetical framing attacks
Logic traps and indirect prompts
Prompt extraction / system prompt leakage
Context switching attacks
Multi-turn escalation patterns
Encoding and obfuscation attacks
Output manipulation / response steering

Deploy Anywhere

From cloud to air-gapped SCIF environments. Zero cloud dependencies required.

Cloud

Docker/K8s in front of OpenAI, Anthropic, Azure, or Bedrock

On-Premise

Ollama backend, SQLite storage, zero external dependencies

Air-Gapped

SCIF-ready, no internet required, CEF to local SIEM only

Compliance-Ready from Day One

Mapped to nine security frameworks and standards. Audit-ready documentation included.

OWASP LLM Top 10: 10/10 categories
OWASP Agentic AI: 15/15 categories
MITRE ATLAS: 13 techniques
NIST AI RMF 1.0: 4 functions
NIST SP 800-53: 9 controls
CMMC 2.0: 5 domains
CWE: 13 identifiers
CVSS v3.1: Severity mapping
NIST CSF 2.0: 12 subcategories

Start Protecting Your AI Today

Free and open source. Enterprise support available.