There is a tension at the center of every AI agent deployment: the reason you built an agent is automation, but the reason you need oversight is that automation can fail catastrophically. Teams resolve this tension in one of two ways, and both are wrong.
The first approach is full autonomy. The agent does whatever the policy engine allows, and the policy engine allows almost everything. This works fine until the agent deletes a production database, sends a mass email to the wrong list, or transfers money to the wrong account. At that point, "the agent had full access" is not a defense — it is the problem.
The second approach is full oversight. Every action requires human approval. This feels safe, but it is not a real deployment. If a human has to approve every file read, every API call, every database query, you have not built an agent — you have built a very expensive UI for making your team do the agent's work manually.
The right answer is in the middle. The hard part is knowing exactly where.
Think of agent oversight as a spectrum with five positions:
Full autonomy — No human involvement. The agent acts on any decision the policy engine allows. Appropriate only for extremely low-risk, reversible, scoped operations where the blast radius of a mistake is near zero.
Notify on anomaly — The agent acts autonomously, but the system alerts humans when behavior deviates from the baseline. Appropriate for medium-risk operations where you want observability without blocking speed.
Review on threshold — Actions below a threshold (dollar amount, resource scope, sensitivity level) are autonomous. Actions above the threshold are routed for approval. Appropriate for most production agents.
Review on category — Certain categories of actions always require approval regardless of arguments: write operations, external communications, credential access. Appropriate for regulated environments.
Full oversight — Everything requires approval. Not practical for most operations, but appropriate for the highest-risk single actions (production deploys, large financial transfers, irreversible data operations).
Authensor supports all five positions through a combination of ALLOW, REVIEW, and DENY decisions, threshold conditions, and Sentinel behavioral alerts.
Before showing what works, it is worth being explicit about what does not.
Approval fatigue. If reviewers see 200 approval requests per day and 199 of them are routine, the 200th gets approved reflexively. The cognitive overhead of constant review trains people to rubber-stamp. A workflow that generates too many low-stakes approvals provides the illusion of oversight while defeating its purpose.
Slow approvals that block agents. If the agent has to wait four hours for a human to approve a file read, you have broken the agent. Teams respond to this by either expanding the autonomous allow-list (reducing oversight) or abandoning the approval workflow entirely.
Approvals without context. "Agent wants to call create_transfer" is not enough information for a meaningful approval decision. Reviewers need the full intent: exact arguments, the policy rule that flagged it, the agent's recent history, and why this specific call triggered review.
No escalation path. What happens if the approver is unavailable? If there is no timeout policy and no escalation, a pending approval blocks the agent indefinitely.
Authensor addresses all four failure modes.
The key design decision is: which actions go to humans? This should be explicit in your policy, not implicit in your hope that humans will catch problems.
```yaml
# authensor-policy.yaml
version: "1"
rules:
  # Read operations: fully autonomous
  - id: read-autonomous
    action:
      - read_file
      - query_database
      - search_web
      - get_customer_record
    decision: ALLOW

  # External communications: always require review
  - id: external-comms-review
    action:
      - send_email
      - post_to_slack
      - send_sms
      - create_webhook
    decision: REVIEW
    meta:
      reviewer_group: "comms-approvers"
      timeout_minutes: 60
      on_timeout: DENY

  # Financial operations above threshold: require review
  - id: large-transfer-review
    action: create_transfer
    condition:
      args:
        amount:
          greaterThan: 10000
    decision: REVIEW
    meta:
      reviewer_group: "finance-ops"
      timeout_minutes: 30
      on_timeout: DENY

  # Financial operations below threshold: autonomous
  - id: small-transfer-allow
    action: create_transfer
    decision: ALLOW

  # Data deletion: always require review
  - id: destructive-ops-review
    action:
      - delete_record
      - drop_table
      - purge_bucket
    decision: REVIEW
    meta:
      reviewer_group: "data-team"
      require_multi_party: true
      approval_count: 2
      timeout_minutes: 120
      on_timeout: DENY
```
Notice the on_timeout: DENY setting. When no approver responds within the window, the action is denied. The agent fails closed. This is the correct default. An approval workflow where timeout means "go ahead anyway" is not a safety control.
When the Authensor control plane returns a REVIEW decision, the agent should pause, not fail. The intent has been logged, an approval request has been queued, and the agent needs to wait for the outcome.
```typescript
import { AuthensorClient } from "@authensor/sdk";

const client = new AuthensorClient({
  apiKey: process.env.AUTHENSOR_KEY,
  agentId: "agent-finance-prod",
});

// executeTool is the agent's own tool dispatcher (not shown here).
async function executeWithApproval(action: string, args: Record<string, unknown>) {
  const result = await client.evaluate({
    action,
    resource: args.resource as string,
    context: { ...args, agentId: "agent-finance-prod" },
  });

  if (result.decision === "ALLOW") {
    return await executeTool(action, args);
  }

  if (result.decision === "REVIEW") {
    console.log(`Action ${action} queued for approval (receipt: ${result.receiptId})`);

    // Wait for human decision — the SDK polls until resolved or timeout
    const approval = await client.waitForApproval(result.receiptId, {
      pollIntervalMs: 5000,
      timeoutMs: 1800000, // 30 minutes
    });

    if (approval.decision === "APPROVED") {
      return await executeTool(action, args);
    }
    throw new Error(`Action rejected by reviewer: ${approval.reason}`);
  }

  throw new Error(`Action denied: ${result.reason}`);
}
```
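Because the gate is just a branch on the decision value, the routing logic is easy to unit-test without a live control plane. A minimal sketch, using a hypothetical stub in place of the real SDK client that mirrors the policy above:

```typescript
type Decision = "ALLOW" | "REVIEW" | "DENY";

// Hypothetical stand-in for client.evaluate, mirroring the example policy:
// small transfers are autonomous, large ones route to review, reads are
// allowed, and anything unmatched fails closed.
function stubEvaluate(action: string, args: Record<string, unknown>): Decision {
  if (action === "create_transfer") {
    return (args.amount as number) > 10000 ? "REVIEW" : "ALLOW";
  }
  if (["read_file", "query_database", "search_web"].includes(action)) {
    return "ALLOW";
  }
  return "DENY";
}

console.log(stubEvaluate("create_transfer", { amount: 500 }));   // "ALLOW"
console.log(stubEvaluate("create_transfer", { amount: 50000 })); // "REVIEW"
console.log(stubEvaluate("drop_table", {}));                     // "DENY"
```

Testing the branch logic this way catches threshold mistakes before they reach production, where a wrong comparison means either an unreviewed large transfer or a flood of needless approvals.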
The approval request that a reviewer sees should contain everything needed to make an informed decision — not everything in the system, but the right things.
```typescript
// What the reviewer sees when they open an approval request
interface ApprovalRequest {
  receiptId: string;
  agentId: string;
  action: string;
  resource: string;
  args: Record<string, unknown>;
  matchedRule: string;           // Which rule triggered review
  ruleReason: string;            // Human-readable explanation
  agentRecentActions: Receipt[]; // Last N actions by this agent
  aegisScanResult: string;       // Was the input scanned? Any flags?
  requestedAt: string;
  expiresAt: string;             // When the request times out
}
```
The reviewer does not need to know how the policy engine works internally. They need to know: what is the agent trying to do, what arguments are involved, is this unusual for this agent, and is there anything suspicious about the request. Authensor surfaces all of this in the approval request.
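Wherever the approval lands (Slack, email, a dashboard), the notification itself should carry the essentials. A sketch of a hypothetical formatting helper over a subset of the fields above, not part of the Authensor SDK:

```typescript
// Hypothetical helper: condense an approval request into a one-line summary
// for a notification. Field names follow the ApprovalRequest interface above.
interface ApprovalSummaryInput {
  agentId: string;
  action: string;
  resource: string;
  matchedRule: string;
  expiresAt: string;
}

function summarizeApproval(req: ApprovalSummaryInput): string {
  return (
    `${req.agentId} wants ${req.action} on ${req.resource} ` +
    `(rule: ${req.matchedRule}, expires: ${req.expiresAt})`
  );
}

console.log(summarizeApproval({
  agentId: "agent-finance-prod",
  action: "create_transfer",
  resource: "acct-9921",
  matchedRule: "large-transfer-review",
  expiresAt: "2026-08-02T12:00:00Z",
}));
```

The point of the one-liner is triage speed: a reviewer should be able to tell at a glance whether a request needs a closer look, then open the full record (arguments, history, scan results) only when it does.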
For the highest-risk operations, a single approver is not sufficient. A compromised approver account, a coerced employee, or a simple mistake by one person should not be enough to authorize an irreversible action.
```yaml
rules:
  - id: production-deploy-approval
    action: deploy_to_production
    decision: REVIEW
    meta:
      reviewer_group: "engineering-leads"
      require_multi_party: true
      approval_count: 2        # Two approvals required
      approval_quorum: "any"   # Any two from the group
      timeout_minutes: 240
      on_timeout: DENY
```

```typescript
// The SDK handles multi-party coordination automatically.
// The action is not released until the required number of
// distinct approvals has been collected.
const approval = await client.waitForApproval(result.receiptId, {
  pollIntervalMs: 10000,
  timeoutMs: 14400000, // 4 hours
});

// approval.decision is APPROVED only when all required approvals have been collected.
// approval.approvers lists who approved and when.
```
This satisfies the four-eyes principle required in many regulated environments and significantly raises the bar for insider threat scenarios.
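The release rule itself is simple to state: count distinct approvers, never raw approval events. A sketch of that semantics, illustrating the behavior described above rather than the SDK's actual implementation, with veto-on-rejection as an added assumption:

```typescript
interface ApprovalEvent {
  approver: string;
  decision: "APPROVED" | "REJECTED";
}

// Sketch of the multi-party release rule: an action is released only when
// the required number of *distinct* approvers have approved. The veto on
// any rejection is an assumption for this sketch, not documented behavior.
function isReleased(events: ApprovalEvent[], requiredCount: number): boolean {
  if (events.some((e) => e.decision === "REJECTED")) return false;
  // A Set of approver identities means one person approving twice
  // still counts as a single approval.
  const approvers = new Set(
    events.filter((e) => e.decision === "APPROVED").map((e) => e.approver)
  );
  return approvers.size >= requiredCount;
}
```

The distinct-approver check is the part that matters for insider-threat scenarios: without it, a single compromised account could satisfy a two-party requirement by approving twice.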
The design principle for avoiding approval fatigue is: humans should only see decisions that require genuine judgment. Routine operations at normal parameters should be autonomous. Review should be reserved for the cases where a human's contextual judgment genuinely adds value.
Three rules help achieve this:
Set thresholds deliberately. Do not set the review threshold for financial transfers at $1 if your agent routinely processes $5,000 transfers. Calibrate thresholds against the actual risk profile of your operations.
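One concrete way to calibrate is from historical data: pick the threshold at a high percentile of routine transaction amounts, so only the top tail of normal traffic routes to review. A sketch with a hypothetical helper (nearest-rank percentile), not an Authensor feature:

```typescript
// Hypothetical calibration helper: choose a review threshold from historical
// transfer amounts so that roughly (1 - percentile) of routine traffic
// would trigger REVIEW.
function percentileThreshold(amounts: number[], percentile: number): number {
  const sorted = [...amounts].sort((a, b) => a - b);
  // Nearest-rank method: index of the value at the requested percentile.
  const rank = Math.ceil(percentile * sorted.length) - 1;
  return sorted[Math.max(0, rank)];
}

const history = [120, 450, 800, 1500, 2300, 4000, 5200, 7500, 9000, 48000];
// With p = 0.9, roughly the top 10% of historical transfers route to review.
console.log(percentileThreshold(history, 0.9)); // 9000
```

Recalibrate periodically: as the agent's workload grows, a threshold that once caught 1% of traffic can drift toward catching 10%, which is how approval fatigue creeps in.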
Tier your reviewer groups. Low-risk review requests (time-of-day violations, soft rate limit warnings) go to an operations team. High-risk requests (destructive operations, large transfers, anomaly-triggered reviews) go to senior staff or security.
Use Sentinel to catch what policy cannot. Some rogue behavior is not visible at the individual action level — it emerges from patterns. Sentinel's behavioral monitoring generates alerts when deny rates spike, action volumes exceed baseline, or delegation chains get unusually deep. These alerts go to humans without requiring every individual action to go through review.
```typescript
import { SentinelMonitor } from "@authensor/sentinel";

const sentinel = new SentinelMonitor({
  agentId: "agent-finance-prod",
  alertThresholds: {
    denyRateIncrease: 3.0,  // Alert if deny rate triples
    actionVolumeSpike: 5.0, // Alert if volume spikes 5x
    chainDepthMax: 4,
  },
  onAlert: async (alert) => {
    await notifySlack({
      channel: "#security-alerts",
      message: `Behavioral anomaly: ${alert.agentId} — ${alert.description}`,
      receiptId: alert.latestReceiptId,
    });
  },
});
```
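The thresholds above are ratios against a baseline. A sketch of the comparison semantics they imply, as an illustration only, not Sentinel's actual detection code:

```typescript
interface BehaviorSample {
  denyRate: number;     // denied actions / total actions in the window
  actionVolume: number; // total actions in the window
  chainDepth: number;   // deepest delegation chain observed
}

// Illustrative check matching the alertThresholds in the config above:
// deny rate tripled, volume spiked 5x, or delegation chain deeper than 4.
function shouldAlert(current: BehaviorSample, baseline: BehaviorSample): boolean {
  return (
    current.denyRate >= baseline.denyRate * 3.0 ||
    current.actionVolume >= baseline.actionVolume * 5.0 ||
    current.chainDepth > 4
  );
}
```

Note that these are relative checks: a 2% deny rate jumping to 7% fires the alert, even though 7% might be unremarkable for a different agent with a noisier baseline.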
Human attention is finite. The goal of a well-designed oversight system is to direct that attention where it matters most, not to spray every action at every reviewer.
Article 14 of the EU AI Act requires that high-risk AI systems can be "effectively overseen by natural persons" and that humans can "intervene in the operation of the high-risk AI system or interrupt the system." The REVIEW decision workflow is direct compliance with this requirement.
Every REVIEW decision is recorded in the receipt chain with the approver's identity and timestamp. This creates a verifiable record that human oversight was exercised. The ability to set on_timeout: DENY satisfies the "interrupt the system" requirement — if oversight is unavailable, the system stops rather than proceeding unsupervised.
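One concrete way a receipt chain becomes "verifiable" is hash-linking: each receipt carries the hash of its predecessor, so altering any entry breaks every link after it. A sketch under that assumption; the field names and hashing scheme here are illustrative, not Authensor's actual wire format:

```typescript
import { createHash } from "crypto";

// Illustrative chained-receipt shape: prevHash binds each receipt
// to the one before it.
interface ChainedReceipt {
  receiptId: string;
  approverId: string;
  prevHash: string; // hash of the previous receipt ("" for the first)
}

function hashReceipt(r: ChainedReceipt): string {
  return createHash("sha256")
    .update(`${r.receiptId}|${r.approverId}|${r.prevHash}`)
    .digest("hex");
}

// Walk the chain and confirm every link points at the true hash
// of its predecessor.
function verifyChain(receipts: ChainedReceipt[]): boolean {
  for (let i = 1; i < receipts.length; i++) {
    if (receipts[i].prevHash !== hashReceipt(receipts[i - 1])) return false;
  }
  return true;
}
```

For an auditor, this property is what turns a log into evidence: retroactively editing who approved what would require rewriting the entire chain from that point forward.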
The deadline for high-risk system compliance is August 2, 2026.
Run npx create-authensor to scaffold a project with REVIEW-based approval workflows configured out of the box. The scaffolded policy has sensible defaults for the most common categories of actions that benefit from review. Adjust thresholds and reviewer groups to match your organization.
Docs at authensor.com/docs. The code is open source at github.com/authensor/authensor — feedback and contributions welcome.