350+ verified vulnerabilities across 195+ AI/ML repositories. 166 responsible disclosures. Two novel vulnerability classes discovered.
Offensive security methodology applied to AI safety evaluation. Find the gaps. Document them. Ship the fix.
195+ repos: Scanned 195+ repositories across NVIDIA, Microsoft, Meta, Google, HuggingFace, OpenAI, and 50+ other organizations. 350+ verified vulnerabilities. 166 formal disclosure reports.
2 novel classes: Identified two previously undocumented vulnerability classes affecting model serialization formats and sandboxed code execution environments. Details pending coordinated disclosure.
166 disclosures: Critical and high-severity findings in PyTorch, DeepSpeed, BentoML, TorchServe, Ray, Ollama, vLLM, LangChain, and dozens of production ML systems. Coordinated disclosure in progress.
PR #798: Accepted pull request to UK AISI's ControlArena benchmark. Demonstrated monitor prompt injection -- agents evading their own safety oversight.
Same pipeline that found 350+ vulnerabilities across NVIDIA, Microsoft, Meta, Google, and HuggingFace infrastructure. Applied to your systems. Scoped engagements with CVE-quality findings.
Automated adversarial analysis of your ML stack. Deserialization, injection, auth bypass, model format exploits, supply chain. Same methodology behind 350+ verified vulnerabilities across production systems at NVIDIA, Microsoft, Meta, and Google.
We test the evaluators. Monitor bypass, compound judge failures, signal dilution, sandbox escapes. Your safety infrastructure is an attack surface -- we prove it before an adversarial agent does.
Systematic adversarial campaigns against your agents and tool integrations. Privilege escalation, exfiltration, goal hijacking, memory poisoning. CVE-quality findings with reproduction steps.
We scope. We test. You get a report. No retainers, no ongoing fees unless you want them. Typical engagement: 2-4 weeks.
Book a Scoping Call
Six packages. All free. All MIT licensed. Download the full stack or pick individual tools.
These started as internal tools for our adversarial research. Policy engine to test guardrails. Content scanner to probe classifiers. Monitor to detect behavioral drift. We use them daily. You should too.
npx @authensor/create-authensor my-agent
@authensor/aegis: Content safety scanner. 210+ detection rules. Prompt injection, memory poisoning, PII, credential leaks. Zero dependencies, sub-ms latency.
@authensor/sentinel: Real-time behavioral monitor. EWMA/CUSUM anomaly detection. Per-agent baselines, deny rate tracking, chain depth alerts.
@authensor/engine: Declarative policy evaluation. Session forbidden sequences, budget enforcement, constraint checking. Synchronous, pure, zero dependencies.
@authensor/mcp-server: Transparent policy proxy for any MCP server. Implements SEP authorization protocol. Drop-in protection for Claude Desktop and any MCP client.
@authensor/redteam: Adversarial red team harness. 15 attack seeds mapped to MITRE ATT&CK. Automated safety regression testing.
@authensor/safeclaw: Local agent gating for Claude Code. Browser dashboard, approval workflows, audit ledger. One command install.
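To make "EWMA/CUSUM anomaly detection" with per-agent baselines concrete, here is a minimal sketch of the general technique. It is not the @authensor/sentinel API: every name below (`DriftState`, `DriftOptions`, `observe`, and the parameter values) is hypothetical and chosen only for illustration.

```typescript
// Illustrative only: NOT the @authensor/sentinel API. All names are hypothetical.

interface DriftState {
  mean: number;      // EWMA estimate of the agent's baseline value
  cusumHigh: number; // one-sided CUSUM accumulator for upward shifts
}

interface DriftOptions {
  alpha: number;     // EWMA smoothing factor in (0, 1]
  slack: number;     // deviations smaller than this are ignored
  threshold: number; // alarm once the accumulator exceeds this
}

function updateDrift(
  state: DriftState,
  value: number,
  opts: DriftOptions
): { state: DriftState; alarm: boolean } {
  // Accumulate only the part of the deviation above baseline + slack.
  const cusumHigh = Math.max(0, state.cusumHigh + (value - state.mean - opts.slack));
  const alarm = cusumHigh > opts.threshold;
  // Update the baseline after scoring, so a new value cannot mask its own deviation.
  const mean = opts.alpha * value + (1 - opts.alpha) * state.mean;
  // Reset the accumulator after an alarm so the next shift is detected afresh.
  return { state: { mean, cusumHigh: alarm ? 0 : cusumHigh }, alarm };
}

// One baseline per agent, keyed by a caller-supplied agent id.
const baselines = new Map<string, DriftState>();

function observe(agentId: string, denyRate: number, opts: DriftOptions): boolean {
  const prev = baselines.get(agentId);
  if (prev === undefined) {
    // The first observation seeds the baseline and never alarms.
    baselines.set(agentId, { mean: denyRate, cusumHigh: 0 });
    return false;
  }
  const { state, alarm } = updateDrift(prev, denyRate, opts);
  baselines.set(agentId, state);
  return alarm;
}
```

The slow-moving EWMA tracks what is normal for each agent, while the CUSUM accumulator catches sustained shifts that a single-sample threshold would miss. In this sketch a deny rate that jumps and stays high trips the alarm after a few observations rather than on the first spike.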
Wrap any agent action with guard(): policy evaluation, content scanning, and audit logging then happen automatically.
# Download the full safety stack
npx @authensor/create-authensor my-agent
cd my-agent && npm install

# Or install individual tools:
npm install @authensor/aegis       # Content scanner
npm install @authensor/sentinel    # Behavioral monitor
npm install @authensor/engine      # Policy engine
npm install @authensor/mcp-server  # MCP Gateway
npm install @authensor/redteam     # Red team harness
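The guard() pattern described above (policy evaluation, content scanning, and audit logging wrapped around an action) can be sketched as follows. This is a hypothetical, self-contained illustration, not the actual Authensor API: the `Policy`, `Session`, and `Action` shapes, the single regex standing in for a rule set, and the `guard` signature are all assumptions.

```typescript
// Illustrative only: a hypothetical guard(), NOT the actual Authensor API.

type Decision = { allow: true } | { allow: false; reason: string };

interface Action {
  tool: string;  // name of the tool being called, e.g. "http.post"
  input: string; // serialized arguments
  cost: number;  // budget units this call would consume
}

interface Policy {
  budget: number;                 // maximum total cost per session
  forbiddenSequences: string[][]; // tool calls that may not occur back to back in this order
}

interface Session {
  spent: number;
  history: string[]; // tools already executed in this session
  audit: string[];   // append-only log lines
}

// Pure, synchronous policy check: budget first, then forbidden sequences.
function evaluate(policy: Policy, session: Session, action: Action): Decision {
  if (session.spent + action.cost > policy.budget) {
    return { allow: false, reason: "budget exceeded" };
  }
  const next = [...session.history, action.tool];
  for (const seq of policy.forbiddenSequences) {
    const tail = next.slice(-seq.length);
    if (tail.length === seq.length && seq.every((t, i) => t === tail[i])) {
      return { allow: false, reason: `forbidden sequence: ${seq.join(" -> ")}` };
    }
  }
  return { allow: true };
}

// Toy content scan: one pattern stands in for a real rule set.
function scan(input: string): Decision {
  return /ignore (all )?previous instructions/i.test(input)
    ? { allow: false, reason: "possible prompt injection" }
    : { allow: true };
}

// Runs the checks, records the outcome, and only then executes the action.
function guard<T>(policy: Policy, session: Session, action: Action, run: () => T): T {
  for (const decision of [evaluate(policy, session, action), scan(action.input)]) {
    if (!decision.allow) {
      session.audit.push(`DENY ${action.tool}: ${decision.reason}`);
      throw new Error(decision.reason);
    }
  }
  const result = run();
  session.spent += action.cost;
  session.history.push(action.tool);
  session.audit.push(`ALLOW ${action.tool}`);
  return result;
}
```

The sketch keeps `guard` synchronous for brevity; an async variant would await `run()` before updating the session. Denied actions are logged and never executed, so the audit trail records both outcomes.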
We apply offensive security methodology to AI safety evaluation. Penetration testing for guardrails. Red teaming for agents. Adversarial probing for classifiers.
The safety stack is our toolkit, open-sourced. The red teaming is what we do with it.
195+ repos audited. 350+ verified vulnerabilities. 166 responsible disclosures. 2 novel vulnerability classes discovered. ControlArena contributor (UK AISI).
Download the free safety stack. Or hire the team that built it to red team your systems.