Runtime safety for AI agents.

AI agents reason in natural language. The consequences are real money, real emails, real production code. Chimera is the deterministic layer that sits between an agent and its tools, so "mostly safe" becomes "provably safe."
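To make "deterministic layer" concrete, here is a minimal sketch of what a policy gate between an agent and its tools can look like. All names here (PolicyEngine, Decision, wire_transfer, the rule signature) are our illustration, not Chimera's actual API:

    from dataclasses import dataclass
    from typing import Any, Callable

    @dataclass(frozen=True)
    class Decision:
        allowed: bool
        reason: str

    # A rule inspects one tool call and either decides or abstains (None).
    Rule = Callable[[str, dict[str, Any]], Decision | None]

    class PolicyEngine:
        """Evaluates every tool call against explicit, deterministic rules."""

        def __init__(self, rules: list[Rule]):
            self.rules = rules

        def check(self, tool: str, args: dict[str, Any]) -> Decision:
            for rule in self.rules:
                decision = rule(tool, args)
                if decision is not None:
                    return decision  # first rule that decides wins
            return Decision(False, "no rule matched: deny by default")

    # Example rule: hard-cap wire transfers regardless of what the model wants.
    def cap_transfers(tool: str, args: dict[str, Any]) -> Decision | None:
        if tool != "wire_transfer":
            return None
        if args.get("amount_usd", 0) > 10_000:
            return Decision(False, "amount exceeds hard limit")
        return Decision(True, "within limit")

    engine = PolicyEngine([cap_transfers])
    print(engine.check("wire_transfer", {"amount_usd": 50_000}))
    # Decision(allowed=False, reason='amount exceeds hard limit')

The point of the sketch: the gate is plain code with a deny-by-default fallback, so the same input always produces the same decision, whatever the model upstream says.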

The problem

Today's agent safety is probabilistic.

Modern AI agents call tools, move money, send emails, write code. The safety stack of RLHF, instruction hierarchies, and prompt filters catches most attacks. "Most" isn't enough when an agent is approving wire transfers or filing compliance reports.

Our research benchmarked GPT-4o, Claude Sonnet 4, and Gemini 2.0 Flash against 73 adversarial scenarios across 12 categories. Frontier models scored between 75% and 86% attack resistance. Six attacks bypassed every model on every run. Same input, three different providers, identical failures. The vulnerabilities are structural, not provider-specific.

Adversarial scenarios: 73
Attack categories: 12
API calls in study: 1,062
Universal bypasses: 6
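
For readers who want those figures pinned down: attack resistance is the share of runs on which a model blocks the attack, and a universal bypass is a scenario that succeeds against every model on every run. A hypothetical sketch of that tally, assuming per-run results are recorded as (scenario, model, succeeded) tuples; the schema is illustrative, not the study's actual data format:

    from collections import defaultdict

    # Each benchmark run: (scenario_id, model_name, attack_succeeded).
    # Values are illustrative; the study logged 1,062 such calls.
    runs = [
        ("S01", "gpt-4o", False),
        ("S01", "claude-sonnet-4", True),
        ("S02", "gemini-2.0-flash", False),
    ]

    blocked = defaultdict(int)
    total = defaultdict(int)
    always_succeeded = defaultdict(lambda: True)  # scenario -> bypassed every run?
    models_hit = defaultdict(set)                 # scenario -> models it bypassed

    for scenario, model, succeeded in runs:
        total[model] += 1
        if not succeeded:
            blocked[model] += 1
        always_succeeded[scenario] &= succeeded
        if succeeded:
            models_hit[scenario].add(model)

    # Attack resistance: fraction of runs on which the model blocked the attack.
    for model in total:
        print(f"{model}: {blocked[model] / total[model]:.0%} attack resistance")

    # Universal bypasses: scenarios that succeeded on every run, every model.
    universal = [s for s in always_succeeded
                 if always_succeeded[s] and len(models_hit[s]) == len(total)]
    print(f"{len(universal)} universal bypasses")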
What we build

Three pieces, one system.

Why now

The compliance window is open.

EU AI Act enforcement begins in 2026. AI agents are moving from labs into production. Companies that figure out runtime safety first will own the category, the way Sentry owned errors, Vercel owned deploys, and Snyk owned dependencies.

We're building from a research foundation, not a vibes-based one: peer-reviewable benchmarks, formally verifiable policies, signed audit trails. The kind of infrastructure that holds up under regulatory scrutiny five years from now.
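
As one example of what "signed audit trails" can mean in practice, here is a hypothetical sketch of a tamper-evident audit record using an HMAC over the decision contents. The schema and key handling are illustrative assumptions, not Chimera's actual format:

    import hashlib
    import hmac
    import json
    import time

    # In practice this would be a managed key (KMS/HSM), not a constant.
    SECRET_KEY = b"illustrative-key-only"

    def signed_audit_record(tool: str, args: dict, allowed: bool, reason: str) -> dict:
        """Log one policy decision with an HMAC so the trail is tamper-evident."""
        record = {
            "ts": time.time(),
            "tool": tool,
            "args": args,
            "allowed": allowed,
            "reason": reason,
        }
        payload = json.dumps(record, sort_keys=True).encode()
        record["sig"] = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
        return record

    def verify(record: dict) -> bool:
        """Recompute the HMAC over everything except the signature."""
        body = {k: v for k, v in record.items() if k != "sig"}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, record["sig"])

    entry = signed_audit_record(
        "wire_transfer", {"amount_usd": 50_000}, False, "amount exceeds hard limit"
    )
    assert verify(entry)

Because each record carries a signature over its own contents, an auditor can detect any after-the-fact edit to the trail without trusting the system that produced it.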

Who built this

From neuro-symbolic research to runtime infrastructure.

Project Chimera began as a research effort in neuro-symbolic causal AI: an attempt to give learned models the kind of structured, verifiable reasoning that pure deep learning couldn't guarantee. Working on it, we kept hitting the same wall. The interesting failures weren't in the model. They were in what the model decided to do at runtime.

That insight reshaped the project. The same deterministic principles we used for reasoning became the basis for a runtime policy layer. Today Chimera is the safety infrastructure between AI agents and the consequences of their decisions.

We're early. We're shipping fast. If you're building agents in production and runtime safety is on your mind, we want to hear from you.