When building safety for AI agents, there are two fundamentally different approaches: deterministic and probabilistic. Understanding the difference is critical for building systems that actually work under attack.
Probabilistic safety uses the language model itself to make safety decisions, for example by placing safety instructions in the system prompt and relying on the model to follow them.
Such approaches are probabilistic because the model's output is not guaranteed: the same input can produce different outputs. Under adversarial conditions such as prompt injection, the model can be convinced to override its safety instructions.
Deterministic safety uses code-based rules that run outside the language model. A policy engine evaluates the tool call against declared rules. The same input always produces the same decision. The language model cannot influence the evaluation.
```javascript
// Deterministic: always blocks this pattern
if (toolName === 'shell.execute' && /rm -rf/.test(args.command)) {
  return { action: 'block' };
}
```
This code blocks any command containing `rm -rf`, regardless of what the model says, what the prompt contains, or how creative the injection is. The decision happens in JavaScript, not in the model.
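To make the idea of a policy engine evaluating tool calls against declared rules more concrete, here is a minimal sketch. The rule shape (`id`, `tool`, `when`, `action`), the `evaluate` function, the `http.request` tool name, and the host allowlist are all illustrative assumptions, not the API of any particular product. The flag check is also not exhaustive (it ignores long options such as `--recursive`).

```javascript
// Hypothetical helper: true when an `rm` invocation carries both a
// recursive flag (-r or -R) and a force flag (-f), in any short-flag
// grouping such as -rf, -fr, or -r -f. Long options are NOT covered.
function isRecursiveForceDelete(command) {
  const tokens = command.trim().split(/\s+/);
  const rmIndex = tokens.indexOf('rm');
  if (rmIndex === -1) return false;
  const flags = tokens
    .slice(rmIndex + 1)
    .filter((token) => /^-[a-zA-Z]+$/.test(token))
    .join('');
  return /[rR]/.test(flags) && flags.includes('f');
}

// Hypothetical helper: returns the hostname, or null for an unparsable URL.
function hostOf(url) {
  try {
    return new URL(url).hostname;
  } catch {
    return null;
  }
}

// Assumed allowlist, for illustration only.
const allowedHosts = new Set(['api.example.com']);

// Declared rules: plain data plus pure predicates. No model is involved.
const rules = [
  {
    id: 'block-recursive-force-delete',
    tool: 'shell.execute',
    when: (args) => isRecursiveForceDelete(String(args.command ?? '')),
    action: 'block',
  },
  {
    id: 'block-unlisted-hosts',
    tool: 'http.request',
    when: (args) => !allowedHosts.has(hostOf(String(args.url ?? ''))),
    action: 'block',
  },
];

// First matching rule wins; a call that matches no rule is allowed.
function evaluate(toolCall, ruleSet = rules) {
  for (const rule of ruleSet) {
    if (rule.tool === toolCall.toolName && rule.when(toolCall.args ?? {})) {
      return { action: rule.action, ruleId: rule.id };
    }
  }
  return { action: 'allow', ruleId: null };
}

console.log(evaluate({ toolName: 'shell.execute', args: { command: 'rm -r -f /tmp/x' } }));
console.log(evaluate({ toolName: 'shell.execute', args: { command: 'ls -la' } }));
```

Because the rules are pure functions of the tool name and arguments, nothing in the prompt or the model's reasoning can change which branch runs.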
| Property | Probabilistic | Deterministic |
|----------|---------------|---------------|
| Consistency | Variable | Same input = same output |
| Attack resistance | Can be bypassed | Cannot be bypassed by model |
| Coverage | Can handle novel cases | Only handles declared rules |
| Speed | Slow (LLM inference) | Fast (microseconds) |
| Testability | Difficult | Unit-testable |
| Auditability | Nondeterministic | Fully auditable |
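The testability and coverage rows can be illustrated with a small table-driven check. This is only a sketch: `checkCommand`, `cases`, `runCases`, and `isDeterministic` are hypothetical names, and the rule is the same literal `rm -rf` pattern used earlier in this guide.

```javascript
// The rule under test: a pure function of its input.
function checkCommand(command) {
  return /rm -rf/.test(command) ? 'block' : 'allow';
}

// Each case pins down one expected decision.
const cases = [
  { command: 'rm -rf /', expected: 'block' },
  { command: 'sudo rm -rf /var/data', expected: 'block' },
  { command: 'ls -la', expected: 'allow' },
  { command: 'rm notes.txt', expected: 'allow' },
  // Documents a known gap: the literal pattern does not cover split flags.
  { command: 'rm -r -f /', expected: 'allow' },
];

// Returns the cases whose actual decision differs from the expected one.
function runCases(fn, testCases) {
  return testCases.filter((c) => fn(c.command) !== c.expected);
}

// Repeated evaluation of the same input must yield a single distinct result.
function isDeterministic(fn, input, runs = 100) {
  const results = new Set();
  for (let i = 0; i < runs; i += 1) results.add(fn(input));
  return results.size === 1;
}

const failures = runCases(checkCommand, cases);
console.log(failures.length === 0 ? 'all cases pass' : failures);
console.log(isDeterministic(checkCommand, 'rm -rf /'));
```

The last case makes the coverage trade-off explicit: the rule does exactly what it declares and nothing more, so closing the gap means writing a new rule and a new test.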
Deterministic safety handles known threats with certainty. Probabilistic safety can reason about novel situations. The right architecture uses both.
The deterministic layer is the floor. It guarantees a minimum level of safety. The probabilistic layer adds nuance above that floor.
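One way to picture the floor-plus-nuance layering is the sketch below. It assumes a hypothetical `classifyRisk` callback that stands in for a model-based risk score between 0 and 1, and a hypothetical threshold of 0.8. It is kept synchronous for brevity; a real model call would normally be asynchronous.

```javascript
// Deterministic floor: code-based rule, evaluated first.
function deterministicCheck(toolCall) {
  const command = String(toolCall.args?.command ?? '');
  if (toolCall.toolName === 'shell.execute' && /rm -rf/.test(command)) {
    return { action: 'block', layer: 'deterministic' };
  }
  return { action: 'allow', layer: 'deterministic' };
}

// Layered decision: the probabilistic layer can only add restrictions.
// It is never consulted for, and cannot overturn, a deterministic block.
function decide(toolCall, classifyRisk, threshold = 0.8) {
  const floor = deterministicCheck(toolCall);
  if (floor.action === 'block') {
    return floor;
  }
  // classifyRisk is an assumed stand-in for an LLM-based classifier.
  const risk = classifyRisk(toolCall);
  if (risk >= threshold) {
    return { action: 'block', layer: 'probabilistic' };
  }
  return { action: 'allow', layer: 'probabilistic' };
}

// Usage with a stub classifier that always reports low risk.
const lowRisk = () => 0.1;
console.log(decide({ toolName: 'shell.execute', args: { command: 'rm -rf /' } }, lowRisk));
console.log(decide({ toolName: 'shell.execute', args: { command: 'ls -la' } }, lowRisk));
```

Even if an injection convinced the classifier that `rm -rf /` was harmless, the first branch would already have returned, which is the sense in which the deterministic layer acts as a floor.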
When you are under attack, you need guarantees, not probabilities. A prompt injection that bypasses system prompt instructions will not bypass a policy engine running in code. This is why deterministic enforcement is the foundation of any production safety system.
Build the deterministic rules first. Add probabilistic analysis as an enhancement, not a replacement.