Privilege escalation in AI agents occurs when the agent accesses tools, data, or capabilities beyond what it was authorized for. This can happen through tool chaining, credential reuse, or exploiting gaps in the policy.
- **Tool chaining:** The agent uses one tool to gain access to another. For example, a file-read tool reads a configuration file containing database credentials, which the agent then uses with a database tool.
- **Permission gap exploitation:** The policy allows `database.query` for SELECT statements but does not restrict `database.execute`, which runs arbitrary SQL.
- **Role confusion:** In multi-agent systems, one agent assumes the identity or permissions of another.
- **Runtime modification:** The agent uses a tool to modify its own configuration, policy, or credentials.
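The permission-gap pattern, for instance, is closed by denying the broader tool explicitly rather than leaving it unaddressed. A minimal sketch in the same policy syntax used in this guide (the `database.*` tool names are illustrative):

```yaml
# Allow read-only queries; block the unrestricted variant outright
- tool: "database.query"
  action: allow
- tool: "database.execute"
  action: block
  reason: "Arbitrary SQL execution is not permitted"
```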
The strongest defense is a deny-by-default policy where only explicitly allowed tools are accessible:
```yaml
rules:
  # Explicitly allowed tools
  - tool: "search.web"
    action: allow
  - tool: "calculator.compute"
    action: allow

  # Everything else blocked
  - tool: "*"
    action: block
    reason: "Tool not in allowlist"
```
With this pattern, the agent cannot use any tool that was not specifically authorized. New tools require an explicit policy update.
Block access to sensitive files that contain credentials or configuration:
```yaml
- tool: "file.read"
  action: block
  when:
    args.path:
      matches: "\\.env$|\\.config$|credentials|secrets|/etc/shadow"
  reason: "Access to credential files blocked"
```
Block access to tools that could modify the agent's own permissions:
```yaml
- tool: "config.*"
  action: block
  reason: "Agents cannot modify their own configuration"
- tool: "iam.*"
  action: block
  reason: "Agents cannot modify IAM policies"
- tool: "policy.*"
  action: block
  reason: "Agents cannot modify safety policies"
```
Use separate, minimally scoped credentials for each tool. If the agent discovers one credential, it cannot use it to reach other systems, because each credential is scoped to a single purpose.
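A sketch of what per-tool credential scoping might look like in a deployment manifest. The field names here are illustrative, not part of any specific product's schema:

```yaml
credentials:
  - name: "db-reader"
    used_by: "database.query"
    scope: "SELECT on analytics schema only"  # read-only, single schema
  - name: "search-key"
    used_by: "search.web"
    scope: "query-only API key"               # cannot reach any database
```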
A sequence of denied actions targeting increasingly sensitive tools is a sign of privilege escalation attempts:
```
BLOCK: config.read            (attempt 1)
BLOCK: file.read /etc/shadow  (attempt 2)
BLOCK: shell.execute whoami   (attempt 3)
```
Sentinel detects this pattern through denial rate monitoring. Configure alerts for sessions with multiple denied actions targeting different tool categories.
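Such an alert might be expressed as configuration along these lines. The field names are hypothetical, since alerting schemas vary by deployment:

```yaml
alerts:
  - name: "possible-privilege-escalation"
    condition:
      denied_actions: ">= 3"              # multiple blocks in one session
      window: "5m"
      distinct_tool_categories: ">= 2"    # e.g. config.*, file.*, shell.*
    action: "notify-on-call"
```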