Policy & Threats
CyberCage protects your AI development environment by detecting and blocking security threats in MCP (Model Context Protocol) traffic. You control what to protect through policies that define which threats to monitor and how to respond.
How It Works
CyberCage inspects all MCP communication between your AI assistants and MCP servers. Each message passes through three stages:
- Threat Detection - The system identifies potential security issues using both pattern-based detection and AI analysis
- Policy Check - Your configured policies determine whether to allow or block the activity
- Response - Actions are blocked or allowed based on policy, with full logging for audit and investigation
All activity is logged in the dashboard for security review, whether blocked or allowed.
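The three stages above can be sketched in a few lines of Python. This is an illustrative sketch only, not CyberCage's implementation or API: the `Inspector` class, the `PATTERNS` table, and the two regular expressions are hypothetical, and the AI analysis half of detection is not modeled.

```python
import re
from dataclasses import dataclass, field

# Hypothetical pattern table (assumption). Real detection also uses AI analysis.
PATTERNS = {
    "credential_exfiltration": re.compile(r"\.ssh/id_\w+|AWS_SECRET_ACCESS_KEY"),
    "code_execution": re.compile(r"\beval\(|\bexec\("),
}


@dataclass
class Inspector:
    # Enabled policies, as category -> "DENY" or "ALLOW" (assumed shape).
    policies: dict
    audit_log: list = field(default_factory=list)

    def inspect(self, message: str) -> bool:
        """Return True if the message may proceed, False if it is blocked."""
        # 1. Threat detection: first enabled category whose pattern matches.
        category = next(
            (c for c, p in PATTERNS.items()
             if c in self.policies and p.search(message)),
            None,
        )
        # 2. Policy check: a message with no detected threat is allowed.
        action = self.policies[category] if category else "ALLOW"
        # 3. Response: every message is logged, whether blocked or allowed.
        self.audit_log.append(
            {"category": category, "action": action, "message": message}
        )
        return action != "DENY"
```

With `Inspector({"credential_exfiltration": "DENY"})`, a message that references `~/.ssh/id_rsa` is blocked and logged, while an unrelated message is allowed and still logged.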
What CyberCage Protects Against
CyberCage detects and blocks threats across 11 categories:
AI-Specific Threats
- Prompt Injection - Detects attempts to manipulate your AI assistant's behavior, including jailbreak attempts, instruction overrides, and context poisoning.
- Tool Poisoning - Identifies malicious MCP tools or attempts to modify legitimate tool behavior.
Credential & Data Theft
- Credential Exfiltration - Catches attempts to steal SSH keys, cloud credentials (AWS, GCP, Azure), API tokens, browser credentials, and other secrets.
- Data Exfiltration - Detects unauthorized data extraction, including file theft, database dumps, and encoded data transfers.
Execution & System Attacks
- Code Execution - Blocks unauthorized code execution attempts, including shell injection, eval/exec patterns, and reverse shells.
- System Tampering - Identifies modifications to system files, security settings, logs, and package managers.
- Persistence - Detects backdoor installation, cron jobs, SSH key manipulation, and service installations designed to maintain access.
- Privilege Escalation - Catches attempts to gain elevated privileges through SUID exploitation, sudo abuse, or container escapes.
- Process Injection - Identifies code injection into running processes through various techniques.
- Kernel Modification - Detects rootkits, kernel module loading, and other kernel-level attacks.
- Defense Evasion - Catches obfuscation, proxy tunneling, history disabling, and other anti-detection techniques.
Policies: Control Your Protection
Policies determine which threats CyberCage actively monitors and how it responds when threats are detected.
How Policies Work
CyberCage provides a catalog of threat detection policies. Your organization chooses which policies to enable based on your security needs. Each policy:
- Targets a specific threat category (e.g., credential theft, prompt injection)
- Has a severity level (Critical, High, Medium, Low, Info)
- Can be configured to DENY (block the request) or ALLOW (permit the request and log it)
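As a rough mental model, a policy can be pictured as a small record with the three attributes listed above. The sketch below is hypothetical: the `Policy`, `Severity`, and `Action` names, the `enabled` flag, and the example policy name are assumptions for illustration and do not reflect CyberCage's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    CRITICAL = "Critical"
    HIGH = "High"
    MEDIUM = "Medium"
    LOW = "Low"
    INFO = "Info"


class Action(Enum):
    DENY = "DENY"    # block the request
    ALLOW = "ALLOW"  # permit the request, still logged


@dataclass(frozen=True)
class Policy:
    name: str                     # hypothetical identifier
    category: str                 # e.g. "credential_exfiltration"
    severity: Severity
    action: Action = Action.DENY  # assumed default for this sketch
    enabled: bool = True


# Example instance with invented values.
policy = Policy("block-ssh-key-reads", "credential_exfiltration", Severity.CRITICAL)
```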
Policy Presets
Choose a preset configuration when setting up your organization:
- Essential Only - Critical threats only (minimal protection, fewer alerts)
- Recommended - Critical + High threats (balanced security)
- Maximum Protection - All threat categories (strongest security)
- Custom - Select individual policies for your specific needs
By default, new organizations start with Critical and High severity policies enabled, which matches the Recommended preset. This covers credential theft, data exfiltration, code execution, privilege escalation, prompt injection, and tool poisoning.
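One way to think about the presets is as severity filters over the policy catalog. The sketch below assumes that reading; the `CATALOG` entries, their severity assignments, and the `enabled_policies` helper are invented for illustration and are not taken from the CyberCage catalog. The Custom preset is an explicit selection, so it is not modeled as a filter.

```python
# Hypothetical catalog entries: (category, severity). Severities are assumptions.
CATALOG = [
    ("credential_exfiltration", "Critical"),
    ("prompt_injection", "High"),
    ("defense_evasion", "Medium"),
]

# Each preset maps to the set of severities it enables.
PRESETS = {
    "essential": {"Critical"},
    "recommended": {"Critical", "High"},
    "maximum": {"Critical", "High", "Medium", "Low", "Info"},
}


def enabled_policies(preset, catalog=CATALOG):
    """Return the categories a preset would enable, in catalog order."""
    severities = PRESETS[preset]
    return [category for category, severity in catalog if severity in severities]
```

Under these assumptions, `enabled_policies("recommended")` returns the Critical and High entries and leaves the Medium one disabled.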
Managing Policies
From the dashboard, you can:
- Enable or disable policies at any time
- View how often each policy has triggered
- See which threats each policy has caught
- Add policies from the catalog as your needs evolve
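If detection events were available as plain records, the per-policy trigger counts the dashboard shows could be approximated as follows. This is a sketch under that assumption: the event shape, the `policy` key, and the policy names are hypothetical, and the source does not describe an export format.

```python
from collections import Counter


def trigger_counts(events):
    """Count detections per policy name, most frequent first."""
    return Counter(event["policy"] for event in events).most_common()


# Invented sample events for illustration.
events = [
    {"policy": "block-ssh-key-reads", "action": "DENY"},
    {"policy": "flag-eval-patterns", "action": "ALLOW"},
    {"policy": "block-ssh-key-reads", "action": "DENY"},
]
```

A policy that dominates this list while mostly catching legitimate activity is a natural candidate for review.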
When Threats Are Detected
CyberCage responds immediately based on your policy configuration:
| Policy Action | What Happens |
|---|---|
| DENY | Blocks the request and logs details to the dashboard |
| ALLOW | Permits the request and logs the event for audit |
Every threat detection is logged with full context, including what was detected, the threat category, and the complete request details. Use the dashboard to investigate threats, see what triggered each detection, and adjust policies as needed.
Investigating Threats
The dashboard provides detailed threat reports showing:
- What threat was detected and why
- The threat category and severity
- Full request and response details
- When it occurred and which user/application was involved
See the Threat Investigation Guide for detailed workflows on analyzing and responding to threats.
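To make the report fields concrete, the sketch below models a threat report as a record and filters for blocked events in one category since a given time. The `ThreatReport` type, its field names, and the `blocked_since` helper are hypothetical; they mirror the fields listed above rather than any documented CyberCage data format.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class ThreatReport:
    detected: str          # what was detected and why
    category: str          # threat category
    severity: str          # Critical, High, Medium, Low, or Info
    action: str            # "DENY" or "ALLOW"
    request: dict          # request and response details
    user: str
    application: str
    occurred_at: datetime


def blocked_since(reports, category, since):
    """Return blocked reports in one category at or after `since`."""
    return [
        r for r in reports
        if r.category == category and r.action == "DENY" and r.occurred_at >= since
    ]


# Invented sample data for illustration.
reports = [
    ThreatReport("SSH private key read", "credential_exfiltration", "Critical",
                 "DENY", {"tool": "read_file"}, "alice", "editor",
                 datetime(2025, 1, 2, tzinfo=timezone.utc)),
    ThreatReport("eval pattern", "code_execution", "High",
                 "ALLOW", {"tool": "run"}, "bob", "editor",
                 datetime(2025, 1, 3, tzinfo=timezone.utc)),
]
```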
Reducing False Positives
CyberCage is designed to minimize false positives while catching real threats. If a policy blocks legitimate activity:
- Review the threat report in the dashboard to understand what triggered detection
- Temporarily disable the policy if needed
- Adjust policy settings or work with your security team to tune detection
- Consider whether the activity should be an exception
Next Steps
- Threat Investigation Guide - Investigate and respond to detected threats
- Applications - Configure protected applications
- MCP Servers - Manage server catalog and approvals
- Dashboard Overview - Navigate the security dashboard