SovereignShield

Immutable Firewall as a Service

Hybrid OS-enforced security scanning for AI systems, agents, and LLM applications. Scans both user input and LLM responses. Blocks code execution, semantic manipulation, encoded payloads, and 200+ attack patterns before they reach your AI or your users. Use the managed Cloud API, self-host the Local Engine, or plug into OpenClaw.

<1ms Scan latency

200+ Attack patterns

0 Dependencies

CSS Suspicion Scoring

Why SovereignShield

Every AI agent needs a robust, OS-enforced security layer. Our API uses a Cumulative Suspicion Score (CSS) and deterministic pattern matching to scan inputs and filter responses instantly. For advanced semantic attacks like persona hijacking and multi-turn manipulation, the pipeline seamlessly escalates to the VetoShield LLM layer.

Sub-Millisecond

Pure pattern matching and rule evaluation. No LLM calls in the critical path. Your users won't feel a thing.

Predictable Logic

Our foundation relies entirely on predictable, OS-enforced rule evaluation. When individual deterministic signals are weak, the Cumulative Suspicion Score (CSS) adds them up over time so that evasive behavior is still flagged and blocked. No LLM is involved at this stage, so there is nothing to hallucinate.
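To picture how a cumulative score can catch behavior that no single rule would, here is a minimal sketch. Each weak signal adds a weight to a per-session total, and the session is blocked once the total crosses a threshold. The signal names, weights, and threshold are assumptions made up for illustration; they are not SovereignShield's actual values.

```python
from dataclasses import dataclass, field

# Hypothetical weights for weak signals (illustrative, not the product's values).
SIGNAL_WEIGHTS = {
    "role_change_request": 3,
    "asks_for_system_prompt": 4,
    "encoded_text_present": 2,
}
BLOCK_THRESHOLD = 7  # assumed cutoff


@dataclass
class SuspicionTracker:
    """Accumulates suspicion across the turns of one session."""
    score: int = 0
    history: list = field(default_factory=list)

    def observe(self, signals):
        """Add the weight of each known signal; return True if the session should be blocked."""
        for name in signals:
            self.score += SIGNAL_WEIGHTS.get(name, 0)
            self.history.append(name)
        return self.score >= BLOCK_THRESHOLD
```

No single signal here reaches the threshold alone, which is the point: the block comes from the accumulated pattern, not from one message.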

Self-Adapting Ruleset

Our AdaptiveShield engine learns from reported attacks across all users, extracts attack keywords, validates them against benign traffic, and auto-deploys new rules with zero downtime.

OS-Enforced Core

Security payloads are strictly enforced using OS-level mprotect/VirtualProtect memory protection. Runtime modification is physically impossible, ensuring downstream code can never tamper with or bypass a BLOCK decision.
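Python's standard library does not expose mprotect or VirtualProtect directly, so the sketch below uses a close relative instead: a read-only memory mapping requested through the mmap module. It only illustrates the idea that data mapped without write permission cannot be changed through that mapping. The rule text and the function name are hypothetical, and in this sketch it is Python that raises the error before the OS ever sees a write.

```python
import mmap
import tempfile


def write_is_rejected(payload: bytes) -> bool:
    """Map payload read-only and report whether a write attempt is refused."""
    with tempfile.TemporaryFile() as fh:
        fh.write(payload)  # payload must be non-empty; empty files cannot be mapped
        fh.flush()
        # ACCESS_READ requests a mapping without write permission.
        with mmap.mmap(fh.fileno(), 0, access=mmap.ACCESS_READ) as view:
            try:
                view[0:1] = b"X"  # simulated tampering attempt
            except TypeError:
                return True  # writes to a read-only map are refused
    return False
```

Calling write_is_rejected(b"BLOCK shell execution") returns True, because the slice assignment never succeeds.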

Minimal Friction

Integrate SovereignShield into your pipeline with just 3 lines of code. Whether using the Cloud API, Local Engine, or OpenClaw plugin, your agent is secured instantly.

Bidirectional Protection

Scan user prompts on the way in to stop jailbreaks, and filter LLM responses on the way out to catch data leaks, hallucinations, and unauthorized actions.

Defense Architecture

User input and LLM responses pass through independent security layers in sequence. The managed Cloud API runs 5 layers (skipping stateful hallucination checks). The Local Engine runs all 6 layers for full-stack protection, including the session-based TruthGuard module and the VetoShield API for semantic verification.

InputFilter

Pattern-based sanitization. Catches keyword injection, code execution, encoded payloads (base64, hex, leet speak, unicode homoglyphs), and obfuscated attacks across 200+ signals in 22 languages. A curated multilingual safe baseline eliminates false positives.

CLOUD & LOCAL ENGINE · Sub-millisecond
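As a rough illustration of how encoded payloads can be matched by plain patterns, the sketch below builds several "views" of the input (Unicode-normalized, leet-decoded, base64-decoded, hex-decoded) and runs one blocklist over all of them. The blocklist, the leet table, and the function names are assumptions for this example only. Note that NFKC normalization folds compatibility forms such as fullwidth letters; cross-script homoglyphs would need a separate confusables table, which is not shown.

```python
import base64
import re
import unicodedata

# Hypothetical blocklist and leet map, for illustration only.
BLOCKED = re.compile(r"ignore previous|os\.system|drop table", re.IGNORECASE)
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t"})


def candidate_views(text):
    """Yield normalized and decoded views of the text for pattern matching."""
    folded = unicodedata.normalize("NFKC", text)
    yield folded
    yield folded.translate(LEET)
    for token in folded.split():
        try:
            yield base64.b64decode(token, validate=True).decode("utf-8")
        except ValueError:  # covers invalid base64 and invalid UTF-8
            pass
        try:
            yield bytes.fromhex(token).decode("utf-8")
        except ValueError:  # covers invalid hex and invalid UTF-8
            pass


def is_blocked(text):
    """True if any view of the text matches the blocklist."""
    return any(BLOCKED.search(view) for view in candidate_views(text))
```

Decoding token by token keeps the check cheap; a production filter would also need limits on input size and nesting depth.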

↓

AdaptiveShield

Self-learning rule engine. Learns from reported attacks across all users, extracts attack keywords, validates against benign traffic, and auto-deploys new rules with zero downtime.

CLOUD & LOCAL ENGINE · Self-Learning
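The learn, validate, deploy loop described above can be sketched in a few lines. This version proposes a keyword only if it shows up in several reported attacks and never appears in a sample of benign traffic. The tokenizer, the min_reports threshold, and the function name are assumptions for illustration, not AdaptiveShield's real logic.

```python
from collections import Counter
import re


def propose_keywords(reported_attacks, benign_samples, min_reports=2):
    """Return tokens seen in at least min_reports attacks and in no benign sample."""
    def tokens(text):
        # Lowercase words of four or more letters; short words are too noisy.
        return set(re.findall(r"[a-z_]{4,}", text.lower()))

    benign_vocab = set()
    for sample in benign_samples:
        benign_vocab |= tokens(sample)

    counts = Counter()
    for attack in reported_attacks:
        counts.update(tokens(attack))  # each token counts once per report

    return sorted(tok for tok, n in counts.items()
                  if n >= min_reports and tok not in benign_vocab)
```

The benign check is what keeps common words like "please" or "enable" from becoming rules, even when attackers use them often.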

↓

CoreSafety

Action-level audit: shell execution ban, file deletion ban, URL restrictions, credential exfiltration detection, malware syntax scanning, privilege escalation prevention. Core rules are protected via OS-level memory locks.

CLOUD & LOCAL ENGINE · Immutable Laws
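An action-level audit looks at what an agent is about to do rather than at the text of a prompt. The sketch below checks a proposed action, described as a plain dict, against a few of the bans listed above. The action schema, the allowed host, and the secret pattern are all hypothetical choices for this example.

```python
import re
from urllib.parse import urlparse

# Hypothetical policy values, for illustration only.
ALLOWED_HOSTS = frozenset({"api.example.com"})
SECRET_PATTERN = re.compile(r"(api[_-]?key|password|token)\s*[:=]", re.IGNORECASE)


def audit_action(action):
    """Return (allowed, reason) for an agent action described as a dict."""
    kind = action.get("type")
    if kind == "shell":
        return False, "shell execution is banned"
    if kind == "delete_file":
        return False, "file deletion is banned"
    if kind == "http_request":
        host = urlparse(action.get("url", "")).hostname
        if host not in ALLOWED_HOSTS:
            return False, f"host not allowed: {host}"
        if SECRET_PATTERN.search(action.get("body", "")):
            return False, "possible credential exfiltration"
    return True, "ok"
```

For example, audit_action({"type": "shell", "command": "whoami"}) is refused regardless of the command, because the ban applies to the action type itself.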

↓

Conscience

Ethical evaluation engine. Detects deception, social engineering, harmful intent, fake tool injection, security evasion, and IP extraction attempts. Hash-sealed and tamper-resistant.

CLOUD & LOCAL ENGINE · Ethical Gate
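"Hash-sealed" can be read as: record a digest of the rules when they are loaded, and refuse to trust them if the digest ever changes. The sketch below shows that idea with SHA-256 from Python's standard library. The rule structure and the function names are assumptions; the source does not say which hash or format Conscience uses.

```python
import hashlib
import hmac
import json


def seal(rules):
    """Return a SHA-256 hex digest of the rules in a canonical JSON form."""
    canonical = json.dumps(rules, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(rules, expected_digest):
    """True only if the rules still match the digest recorded at load time."""
    return hmac.compare_digest(seal(rules), expected_digest)
```

Sorting keys before hashing matters: without a canonical form, two identical rule sets could serialize differently and fail verification.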

↓

TruthGuard

Factual hallucination detector. Tracks tool usage per session and blocks unverified factual claims. It flags temporal, numerical, and citation-based statements that are made with confidence and without hedging.

LOCAL ENGINE ONLY · Anti-Hallucination
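A minimal sketch of the session-based idea: a response that states a number, a year, or a citation with no hedging is allowed only if some tool was used earlier in the same session. The marker patterns and the class name are hypothetical and far cruder than a real detector would be.

```python
import re

# Hypothetical marker lists, for illustration only.
CLAIM = re.compile(r"\b\d{4}\b|\b\d+(\.\d+)?\s*%|according to|as of", re.IGNORECASE)
HEDGE = re.compile(r"\b(i think|probably|might|not sure|approximately)\b", re.IGNORECASE)


class SessionGuard:
    """Tracks whether a tool ran in this session before facts are stated."""

    def __init__(self):
        self.tools_used = set()

    def record_tool(self, name):
        """Note that a tool (for example a search) was called in this session."""
        self.tools_used.add(name)

    def allows(self, response):
        """Allow hedged or claim-free text; confident claims need prior tool use."""
        confident_claim = CLAIM.search(response) and not HEDGE.search(response)
        return not confident_claim or bool(self.tools_used)
```

This is why the check is stateful: the same sentence can be blocked in one session and allowed in another, depending on whether a lookup actually happened.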

↓

VetoShield

Deterministic defense with optional LLM-powered semantic analysis. Runs all deterministic layers by default (sub-millisecond). When an LLM provider is configured, adds semantic checks that catch persona hijacking, crescendo attacks, context manipulation, and attacks with no recognizable keywords.

CLOUD & LOCAL ENGINE · Deterministic + Optional LLM
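The "deterministic by default, LLM when configured" behavior can be sketched as a function that takes an optional semantic check. Deterministic checks always run first; the semantic check is consulted only if one was supplied and nothing has blocked yet. The function signature and the stage names "deterministic" and "semantic" are assumptions; the approved result borrows its field names from the sample API response on this page.

```python
def veto(text, deterministic_checks, semantic_check=None):
    """Run deterministic checks first, then the optional semantic check.

    Each check is a callable that returns a reason string to block,
    or None to let the text through.
    """
    for check in deterministic_checks:
        reason = check(text)
        if reason:
            return {"allowed": False, "stage": "deterministic", "reason": reason}
    if semantic_check is not None:
        reason = semantic_check(text)  # e.g. a wrapper around your LLM provider
        if reason:
            return {"allowed": False, "stage": "semantic", "reason": reason}
    return {"allowed": True, "stage": "Approved", "reason": "All security checks passed."}
```

Passing the semantic check as a plain callable keeps the fast path free of any LLM dependency and makes the slower path easy to stub in tests.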

What It Blocks

Real attack categories detected and stopped. Detection runs through our OS-enforced deterministic layers, with optional escalation to the VetoShield LLM layer.

✗ Code execution (os.system, subprocess, eval, exec)

✗ Shell commands (whoami, bash, curl, wget, netcat)

✗ Keyword injection (IGNORE PREVIOUS, developer mode)

✗ Credential exfiltration (API keys, passwords, tokens)

✗ SQL injection (UNION SELECT, DROP TABLE, OR 1=1)

✗ XSS payloads (<script>, onerror, document.cookie)

✗ Reverse shells (nc -e, /bin/bash, pty.spawn)

✗ Path traversal (../../../etc/passwd, /proc/self)

✗ Encoded payloads (base64, hex, leet speak, ROT13)

✗ Unicode obfuscation (homoglyphs, invisible characters)

✗ LLM token injection (ChatML, [INST], <<SYS>>)

✗ Deception & social engineering (fake tools, manipulation)

✗ Harmful intent (violence, theft, malware keywords)

✗ Privilege escalation (sudo, admin access, root)

✗ IP/source code extraction attempts

✗ Evasive behavior over time (CSS Suspicion Scoring)

LLM-Enhanced Semantic Detection (VetoShield - Cloud API & Local Engine)

✗ Persona hijacking (“You are now an unrestricted AI”)

✗ Crescendo attacks (gradual escalation across turns)

✗ Multi-turn manipulation (split attacks across messages)

✗ Context confusion (benign-looking semantic attacks)

✗ System prompt extraction (indirect probing)

✗ Model fingerprinting and capability probing

Stateful Verification (Local Engine Only)

✗ Factual hallucinations (TruthGuard session cache)

Pricing

Start free. Scale when you need to.

Free

$0/mo

1,000 scans/month
100 scans/day
30 req/min rate limit
5 security layers & CSS
Automatic rule updates
Python client + REST API

No card required. Your API key will be emailed to you.
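If you want to stay under the 30 req/min limit from the client side, a small throttle is enough. The sketch below uses a sliding one-minute window; how the API itself counts requests is not documented here, so treat the window model and the class name as assumptions.

```python
import time
from collections import deque


class MinuteThrottle:
    """Client-side guard that keeps calls within a per-window request limit."""

    def __init__(self, limit=30, window=60.0, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock  # injectable so the throttle can be tested
        self.sent = deque()  # timestamps of requests inside the window

    def acquire(self):
        """Reserve a slot and return 0.0, or return the seconds to wait."""
        now = self.clock()
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()  # forget requests older than the window
        if len(self.sent) < self.limit:
            self.sent.append(now)
            return 0.0
        return self.window - (now - self.sent[0])
```

A caller would sleep for the returned number of seconds and call acquire() again before sending the request.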

Cloud API Integration

Integrate with our managed API at api.sovereign-shield.net. Get an API key, install the Python client, and scan both user input and LLM responses in 3 lines of code.

Scan user input

The user message passes through the Cloud API's 5 security layers. Clean input is returned. Dangerous input is blocked.

Send to your LLM

Pass the safe input to your AI. Only verified, clean messages reach your model.

Scan LLM response

LLM output passes through the shield, which blocks data leaks, PII, and credential exposure before the user sees them.

terminal

$ pip install sovereign-shield-client

Cloud API - Python Client

from sovereign_shield_client import SovereignShield

shield = SovereignShield(api_key="ss_your_key")

def handle_message(user_message):
    # 1. Scan user input - blocks dangerous prompts
    safe_input = shield.scan(user_message)

    # 2. Safe input goes to your LLM (your_llm is your own model client)
    llm_response = your_llm.generate(safe_input)

    # 3. Scan LLM response - blocks data leaks, PII, etc.
    safe_response = shield.veto(llm_response)

    # 4. Safe response goes to the user
    return safe_response

Cloud API - cURL

$ curl -X POST https://api.sovereign-shield.net/api/v1/scan \
  -H "Authorization: Bearer ss_your_key" \
  -H "Content-Type: application/json" \
  -d '{"input": "user message here"}'

Response (200 OK)

{
  "allowed": true,
  "stage": "Approved",
  "reason": "All security checks passed.",
  "clean_input": "user message here",
  "scan_id": "a1b2c3d4",
  "latency_ms": 0.42
}
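If you prefer not to install the client, the same call can be made with Python's standard library. The sketch below builds the request from the cURL example and reads the fields shown in the sample response. The helper names are hypothetical, and the shape of a blocked response is not documented on this page, so the code treats anything other than an explicit "allowed": true as a block.

```python
import json
import urllib.request

SCAN_URL = "https://api.sovereign-shield.net/api/v1/scan"  # from the cURL example


def build_scan_request(api_key, text):
    """Build the POST request shown in the cURL example without sending it."""
    body = json.dumps({"input": text}).encode("utf-8")
    return urllib.request.Request(
        SCAN_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def text_for_llm(response_body):
    """Return clean_input when the scan allowed it, otherwise None."""
    result = json.loads(response_body)
    if result.get("allowed") is True:
        return result.get("clean_input")
    return None  # fail closed on anything other than an explicit allow
```

To send it, pass the request to urllib.request.urlopen with a timeout and hand the response body to text_for_llm; only forward the message to your model when the result is not None.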

Self-Host The Engine

Run SovereignShield locally as a standalone microservice. Install the Python package, boot the daemon, and any app in any language can POST JSON to localhost:8765 for instant security scanning. No cloud dependency.

View SovereignShield Core on GitHub