Hybrid OS-enforced security scanning for AI systems, agents, and LLM applications. Scans both user input and LLM responses. Blocks code execution, semantic manipulation, encoded payloads, and 200+ attack patterns before they reach your AI or your users. Use the managed Cloud API, self-host the Local Engine, or plug into OpenClaw.
Every AI agent needs a robust, OS-enforced security layer. Our API uses a Cumulative Suspicion Score (CSS) and deterministic pattern matching to scan inputs and filter responses without an LLM call. For advanced semantic attacks such as persona hijacking and multi-turn manipulation, the pipeline can escalate to the optional VetoShield LLM layer.
Pure pattern matching and rule evaluation. No LLM calls in the critical path. Your users won't feel a thing.
Our foundation relies entirely on predictable, OS-enforced rule evaluation. When deterministic signals are weak, the Cumulative Suspicion Score (CSS) accurately flags and blocks evasive behavior without hallucinations.
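A cumulative score of this kind can be illustrated with a minimal sketch. It is not SovereignShield's implementation: the signal patterns, point weights, and threshold below are hypothetical values chosen only to show how several weak signals can add up to a block.

```python
import re

# Hypothetical weak signals with integer point weights; not SovereignShield's real rules.
WEAK_SIGNALS = [
    (re.compile(r"ignore (all )?(previous|prior) instructions", re.I), 6),
    (re.compile(r"\bsystem prompt\b", re.I), 3),
    (re.compile(r"\bpretend (you are|to be)\b", re.I), 3),
    (re.compile(r"\bno restrictions\b", re.I), 4),
]
BLOCK_THRESHOLD = 10  # assumed cutoff for this sketch


def cumulative_suspicion(text: str) -> int:
    """Sum the weights of every weak signal that matches the text."""
    return sum(weight for pattern, weight in WEAK_SIGNALS if pattern.search(text))


def decide(text: str) -> str:
    """Return "BLOCK" once the summed score reaches the threshold."""
    return "BLOCK" if cumulative_suspicion(text) >= BLOCK_THRESHOLD else "ALLOW"
```

In this sketch a single weak signal, such as a mention of the system prompt, passes on its own, while a message that combines several of them crosses the threshold. The same input always produces the same score, which is what makes the decision deterministic.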
Our AdaptiveShield engine learns from reported attacks across all users, extracts attack keywords, validates them against benign traffic, and auto-deploys new rules with zero downtime.
Security payloads are locked in place with OS-level memory protection (mprotect on POSIX systems, VirtualProtect on Windows). A write to the protected memory faults at the OS level, so downstream code cannot silently tamper with or bypass a BLOCK decision.
Integrate SovereignShield into your pipeline with just 3 lines of code. Whether using the Cloud API, Local Engine, or OpenClaw plugin, your agent is secured instantly.
Scan user prompts on the way in to stop jailbreaks, and filter LLM responses on the way out to catch data leaks, hallucinations, and unauthorized actions.
User input and LLM responses pass through independent security layers in sequence. The managed Cloud API runs 5 layers (skipping stateful hallucination checks). The Local Engine runs all 6 layers for full-stack protection, including the session-based TruthGuard module and the VetoShield API for semantic verification.
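The "independent layers in sequence" idea can be sketched as follows. This is an illustrative assumption, not the engine's code: the two layers and their rules are hypothetical stand-ins, and the result fields only mirror the `allowed`, `stage`, and `reason` fields of the sample API response.

```python
from typing import Callable, Optional

# A layer returns a block reason, or None when the text passes.
Layer = Callable[[str], Optional[str]]


def keyword_layer(text: str) -> Optional[str]:
    # Stand-in for a pattern-based layer (hypothetical rule).
    return "shell command detected" if "rm -rf" in text else None


def length_layer(text: str) -> Optional[str]:
    # Stand-in for a second, independent layer (hypothetical rule).
    return "input too long" if len(text) > 10_000 else None


def run_pipeline(text: str, layers: list[tuple[str, Layer]]) -> dict:
    """Run each layer in order and stop at the first one that blocks."""
    for name, layer in layers:
        reason = layer(text)
        if reason is not None:
            return {"allowed": False, "stage": name, "reason": reason}
    return {"allowed": True, "stage": "Approved", "reason": "All security checks passed."}


LAYERS = [("KeywordLayer", keyword_layer), ("LengthLayer", length_layer)]
```

Because each layer only sees the text and returns its own verdict, a layer can be added or left out (as the Cloud API does with the stateful checks) without touching the others.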
Pattern-based sanitization. Catches keyword injection, code execution, encoded payloads (base64, hex, leet speak, unicode homoglyphs), and obfuscated attacks across 200+ signals in 22 languages. A curated multilingual safe baseline eliminates false positives.
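One way to picture scanning for encoded payloads is to generate normalized and decoded views of the input and run the same patterns over each view. The sketch below is an assumption-laden illustration, not the shipped layer: the leet-speak map, the single blocklist pattern, and the base64 token heuristic are all hypothetical. Unicode NFKC folding handles compatibility forms such as fullwidth letters; cross-script homoglyphs would need a separate confusables map that is not shown here.

```python
import base64
import binascii
import re
import unicodedata

# Hypothetical leet-speak map and blocklist, for illustration only.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"})
BLOCKLIST = re.compile(r"ignore previous instructions", re.I)
BASE64_TOKEN = re.compile(r"[A-Za-z0-9+/]{16,}={0,2}")


def candidate_views(text: str) -> list[str]:
    """Return the folded text plus leet-normalized and base64-decoded variants."""
    folded = unicodedata.normalize("NFKC", text)  # folds fullwidth and other compatibility forms
    views = [folded, folded.translate(LEET)]
    for token in BASE64_TOKEN.findall(folded):
        try:
            decoded = base64.b64decode(token, validate=True).decode("utf-8")
        except (binascii.Error, UnicodeDecodeError):
            continue  # not valid base64-encoded text; skip this token
        views.append(decoded)
    return views


def is_blocked(text: str) -> bool:
    """Block when any view of the text matches the blocklist."""
    return any(BLOCKLIST.search(view) for view in candidate_views(text))
```

Scanning several views rather than rewriting the input keeps the original text intact for clean messages while still exposing the hidden form of an obfuscated one.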
Self-learning rule engine. Learns from reported attacks across all users, extracts attack keywords, validates against benign traffic, and auto-deploys new rules with zero downtime.
Action-level audit: shell execution ban, file deletion ban, URL restrictions, credential exfiltration detection, malware syntax scanning, privilege escalation prevention. Core rules are protected via OS-level memory locks.
Ethical evaluation engine. Detects deception, social engineering, harmful intent, fake tool injection, security evasion, and IP extraction attempts. Hash-sealed and tamper-resistant.
Factual hallucination detector. Tracks tool usage per session and blocks unverified factual claims: temporal, numerical, and citation-based statements that carry confidence markers and no hedging.
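A session-based check of this kind might look roughly like the sketch below. Everything in it is a hypothetical reading of the description, not the TruthGuard module itself: the class name, the claim and hedge patterns, and the rule that one recorded tool call counts as verification are assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical markers of confident factual claims; illustration only.
CLAIM_MARKERS = [
    re.compile(r"\b(19|20)\d{2}\b"),        # temporal: a specific year
    re.compile(r"\b\d+(\.\d+)?\s?%"),       # numerical: a percentage
    re.compile(r"\baccording to\b", re.I),  # citation-style phrasing
]
HEDGES = re.compile(r"\b(may|might|approximately|I think|not sure)\b", re.I)


class SessionTracker:
    """Counts tool calls per session so claims can be checked against them."""

    def __init__(self) -> None:
        self._tool_calls = defaultdict(int)

    def record_tool_call(self, session_id: str) -> None:
        self._tool_calls[session_id] += 1

    def allows(self, session_id: str, response: str) -> bool:
        """Allow a response unless it makes an unhedged claim with no tool call behind it."""
        makes_claim = any(marker.search(response) for marker in CLAIM_MARKERS)
        hedged = HEDGES.search(response) is not None
        verified = self._tool_calls[session_id] > 0
        return not makes_claim or hedged or verified
```

The state lives per session, which is consistent with this check being described as stateful and running only in the Local Engine.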
Deterministic defense with optional LLM-powered semantic analysis. Runs all deterministic layers by default (sub-millisecond). When an LLM provider is configured, adds semantic checks that catch persona hijacking, crescendo attacks, context manipulation, and attacks with no recognizable keywords.
Real attack categories detected and stopped. Detection runs through our OS-enforced layers, with an optional escalation path to VetoShield.
Start free. Scale when you need to.
No card required. Your API key will be emailed to you.
Integrate with our managed API at api.sovereign-shield.net. Get an API key, install the Python client, and scan both user input and LLM responses in 3 lines of code.
The user message passes through the Cloud API's security layers. Clean input is returned. Dangerous input is blocked.
Pass the safe input to your AI. Only verified, clean messages reach your model.
LLM output passes through the shield, which blocks data leaks, PII, and credential exposure before the user sees them.
$ pip install sovereign-shield-client
from sovereign_shield_client import SovereignShield

shield = SovereignShield(api_key="ss_your_key")

def handle_message(user_message):
    # 1. Scan user input - blocks dangerous prompts
    safe_input = shield.scan(user_message)
    # 2. Safe input goes to your LLM (your_llm is a placeholder for your own model client)
    llm_response = your_llm.generate(safe_input)
    # 3. Scan LLM response - blocks data leaks, PII, etc.
    safe_response = shield.veto(llm_response)
    # 4. Safe response goes to the user
    return safe_response
$ curl -X POST https://api.sovereign-shield.net/api/v1/scan \
-H "Authorization: Bearer ss_your_key" \
-H "Content-Type: application/json" \
-d '{"input": "user message here"}'
{
  "allowed": true,
  "stage": "Approved",
  "reason": "All security checks passed.",
  "clean_input": "user message here",
  "scan_id": "a1b2c3d4",
  "latency_ms": 0.42
}
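If you prefer not to install the client, the same request can be made from Python's standard library. This is a minimal sketch built only from the curl example above: the URL, headers, and `input` field come from that example, while the function names, the timeout, and the fail-closed handling of a missing `allowed` field are assumptions of the sketch.

```python
import json
import urllib.request

API_URL = "https://api.sovereign-shield.net/api/v1/scan"  # endpoint from the curl example


def build_scan_request(api_key: str, user_input: str) -> urllib.request.Request:
    """Build the same POST request as the curl example."""
    body = json.dumps({"input": user_input}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def scan(api_key: str, user_input: str, timeout: float = 5.0) -> dict:
    """Send the request and return the decoded JSON response."""
    request = build_scan_request(api_key, user_input)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def is_allowed(result: dict) -> bool:
    # Assumption of this sketch: treat a missing "allowed" field as a block (fail closed).
    return result.get("allowed") is True
```

A caller would use `result = scan("ss_your_key", user_message)` and forward `result["clean_input"]` to the model only when `is_allowed(result)` is true. How the API reports a blocked input at the HTTP level is not shown above, so error handling around `urlopen` is left out of the sketch.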
Run SovereignShield locally as a standalone microservice. Install the Python package, boot the daemon, and any app in any language can POST JSON to localhost:8765 for instant security scanning. No cloud dependency.
$ pip install sovereign-shield
# Boots a minimal-dependency HTTP server on localhost:8765
# Any app (Python, Node, Go, etc.) can POST JSON to scan inputs.
$ sovereign-shield-daemon
Using the OpenClaw AI agent framework? This plugin hooks directly into OpenClaw's before_tool_call lifecycle. Every shell command, file write, and OS call your agent makes is intercepted and verified before the host machine executes it. Requires either the local SovereignShield daemon running or a Cloud API key.
$ openclaw plugins install github.com/mattijsmoens/openclaw-sovereign-shield
# Local mode (default): run the SovereignShield daemon first
# pip install sovereign-shield && sovereign-shield-daemon
# Cloud mode: export your API key instead
# export SOVEREIGN_SHIELD_API_KEY="your_key"
# export SOVEREIGN_SHIELD_MODE="remote"