Policy evaluation latency depends partly on how quickly the system retrieves the active policy. Fetching policies from PostgreSQL on every request adds unnecessary overhead. Redis caching eliminates this bottleneck, keeping policy lookups under a millisecond.
Policy definitions are the primary cache target. These are small JSON documents (typically under 50 KB) that change infrequently. Cache them with a TTL of 30 to 60 seconds. When a policy is updated, invalidate the cache key immediately.
Policy lookup results map an agent ID or tenant ID to its active policy version. Caching these mappings avoids the join between the agents and policies tables on every request.
Rate limiting counters track action counts per agent per time window. Redis's atomic increment operations are ideal for this.
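A fixed-window counter built on atomic increments can be sketched as follows. The key layout is illustrative, not Authensor's actual schema, and `FakeRedis` is a dict-backed stand-in so the pattern runs without a live server; in production `incr` and `expire` would be calls on a real Redis client.

```python
import time

class FakeRedis:
    """Minimal in-memory stand-in for a Redis client (INCR/EXPIRE only)."""
    def __init__(self):
        self.counts = {}
        self.ttls = {}
    def incr(self, key):
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]
    def expire(self, key, seconds):
        self.ttls[key] = seconds

def allow_action(r, agent_id, limit, window_s=60, now=None):
    """Fixed-window rate limit: one counter per agent per time window."""
    now = time.time() if now is None else now
    window = int(now) // window_s
    key = f"ratelimit:{agent_id}:{window}"
    count = r.incr(key)           # atomic on a real Redis server
    if count == 1:
        r.expire(key, window_s)   # counter expires along with its window
    return count <= limit
```

Because `INCR` is atomic, concurrent requests from multiple control plane nodes never lose counts.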
Do not cache evaluation results. Each evaluation depends on the specific action envelope, which varies per request. Caching decisions risks returning stale safety verdicts.
Use a dedicated Redis instance for safety infrastructure. Sharing Redis with your application cache risks eviction of safety-critical data under memory pressure.
Set maxmemory-policy to volatile-lru, which under memory pressure evicts only keys that carry a TTL and leaves persistent keys untouched. Give policy definitions TTLs, and give rate limiting counters TTLs matching their time window, so both remain evictable.
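A minimal configuration sketch for the dedicated instance (the 256 MB cap is illustrative; size it to your deployment):

```conf
# redis.conf for the dedicated safety-infrastructure instance
# (the 256 MB cap below is illustrative)
maxmemory 256mb
# evict only TTL-bearing keys when the memory cap is reached
maxmemory-policy volatile-lru
```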
When a policy is created or updated through Authensor's control plane API, the handler writes to PostgreSQL and then deletes the corresponding Redis key. The next request triggers a cache miss, loading the fresh policy from the database.
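The write-then-invalidate sequence can be sketched as below. `FakePolicyStore` and `FakeCache` are dict-backed stand-ins for PostgreSQL and Redis; the handler shape and key names are illustrative, not Authensor's actual API.

```python
class FakePolicyStore:
    """Dict-backed stand-in for the PostgreSQL policies table."""
    def __init__(self):
        self.rows = {}
    def save_policy(self, policy_id, body):
        self.rows[policy_id] = body
    def load_policy(self, policy_id):
        return self.rows[policy_id]

class FakeCache:
    """Dict-backed stand-in for Redis (DELETE only)."""
    def __init__(self):
        self.store = {}
    def delete(self, key):
        self.store.pop(key, None)

def update_policy(db, cache, policy_id, body):
    """Write-then-invalidate: persist to the source of truth first, then
    drop the cached copy so the next read repopulates from the fresh row."""
    db.save_policy(policy_id, body)       # 1. database write
    cache.delete(f"policy:{policy_id}")   # 2. cache invalidation
```

Ordering matters: deleting the key before the database commit would let a concurrent reader re-cache the stale row.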
For multi-node deployments, use Redis Pub/Sub to broadcast invalidation events. Each control plane instance subscribes to the invalidation channel and clears its local in-process cache when notified.
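The broadcast flow can be sketched with a synchronous stand-in for Pub/Sub. The channel name and function shapes are assumptions; in production, `publish` would be a Redis `PUBLISH` and each node would run a subscriber loop on the real channel.

```python
INVALIDATION_CHANNEL = "policy-invalidations"   # channel name is illustrative

class FakeBus:
    """Synchronous stand-in for Redis Pub/Sub, so the flow runs in-process."""
    def __init__(self):
        self.subscribers = {}
    def subscribe(self, channel, callback):
        self.subscribers.setdefault(channel, []).append(callback)
    def publish(self, channel, message):
        for cb in self.subscribers.get(channel, []):
            cb(message)

def attach_node(bus, local_cache):
    """Each control plane node clears its in-process cache when notified."""
    bus.subscribe(INVALIDATION_CHANNEL,
                  lambda policy_id: local_cache.pop(f"policy:{policy_id}", None))

def broadcast_invalidation(bus, policy_id):
    bus.publish(INVALIDATION_CHANNEL, policy_id)
```

One publish reaches every subscribed node, so no instance serves a stale in-process copy after an update.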
If Redis is unavailable, fall back to direct database queries. Never fail open by skipping policy evaluation because the cache is down. Authensor's control plane treats cache misses as a normal code path, not an error condition.
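The degraded read path can be sketched as a cache-aside lookup that falls back to the database on any connection failure. `DownCache` simulates an unreachable Redis; the interfaces are illustrative, not Authensor's actual client.

```python
class DownCache:
    """Stand-in for a Redis client whose connection is failing."""
    def get(self, key):
        raise ConnectionError("redis unavailable")
    def set(self, key, value, ttl_s):
        raise ConnectionError("redis unavailable")

class FakeDB:
    """Dict-backed stand-in for PostgreSQL."""
    def __init__(self, rows):
        self.rows = rows
    def load_policy(self, policy_id):
        return self.rows[policy_id]

def get_policy(cache, db, policy_id, ttl_s=60):
    """Cache-aside read: a Redis outage degrades to direct DB queries,
    and policy evaluation itself is never skipped."""
    key = f"policy:{policy_id}"
    try:
        cached = cache.get(key)
        if cached is not None:
            return cached
    except ConnectionError:
        return db.load_policy(policy_id)   # cache down: read source of truth
    policy = db.load_policy(policy_id)     # normal cache miss
    try:
        cache.set(key, policy, ttl_s)      # repopulate, best-effort only
    except ConnectionError:
        pass
    return policy
```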
Monitor cache hit rates. A healthy deployment should see 95%+ hit rates for policy lookups. Low hit rates indicate either TTLs that are too short or an invalidation problem.
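The hit rate can be computed from the `keyspace_hits` and `keyspace_misses` counters that Redis reports in its `INFO stats` section; in production the input would come from `redis_client.info("stats")`.

```python
def cache_hit_rate(stats):
    """Hit rate from Redis INFO stats counters (keyspace_hits/keyspace_misses)."""
    hits = stats["keyspace_hits"]
    misses = stats["keyspace_misses"]
    total = hits + misses
    return hits / total if total else 0.0
```

Note these counters are instance-wide since server start; per-key-prefix hit rates require instrumenting the application's own reads.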
Policy caching is lightweight. A deployment with 1,000 distinct policies uses roughly 50 MB of Redis memory. Rate limiting counters for 100,000 agents with minute-level granularity add another 50 MB. A small Redis instance handles most deployments comfortably.
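The sizing above can be reproduced with a back-of-envelope estimate. The ~500 bytes per counter (key name plus Redis per-key overhead) is an assumption consistent with the figures quoted, not a measured value.

```python
def estimate_redis_mb(num_policies, avg_policy_kb=50,
                      num_counters=0, bytes_per_counter=500):
    """Back-of-envelope Redis sizing for policies plus rate-limit counters."""
    policies_mb = num_policies * avg_policy_kb / 1024
    counters_mb = num_counters * bytes_per_counter / (1024 * 1024)
    return policies_mb + counters_mb
```

For 1,000 policies and 100,000 agent counters, this lands near the ~100 MB total quoted above.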