
What is human-in-the-loop AI?

Authensor

Human-in-the-loop (HITL) is a design pattern where an AI system includes checkpoints that require human review and approval before the system takes certain actions. The human is literally in the loop: the system cannot complete its task without human input at these checkpoints.

Why keep humans in the loop

AI agents make mistakes. They hallucinate, misinterpret instructions, and get manipulated by prompt injection. For low-stakes tasks, these mistakes are tolerable. For high-stakes tasks, they are not.

A human reviewer catches mistakes that the AI cannot catch on its own:

  • The agent is about to send an email to the wrong person
  • The agent's interpretation of a vague instruction is incorrect
  • The agent is operating on stale or wrong information
  • The agent is attempting an action that makes no sense in context

HITL patterns

Pre-execution review: The agent proposes an action and waits for approval before executing it. This is the most common pattern.
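A minimal sketch of the pre-execution pattern (the `ProposedAction` type and `execute` function are illustrative, not a real API): the agent can propose anything, but nothing runs until a reviewer flips the approval bit.

```python
# Pre-execution review gate: proposals are inert until explicitly approved.
from dataclasses import dataclass, field

@dataclass
class ProposedAction:
    tool: str
    args: dict = field(default_factory=dict)
    approved: bool = False

def execute(action: ProposedAction) -> str:
    # The gate: refuse to run anything that has not been approved.
    if not action.approved:
        raise PermissionError(f"{action.tool} requires human approval")
    return f"executed {action.tool}"

proposal = ProposedAction(tool="email.send", args={"to": "ops@example.com"})
try:
    execute(proposal)          # blocked: not yet approved
except PermissionError:
    pass
proposal.approved = True       # a human reviewer signs off
result = execute(proposal)     # now it runs
```

The point of the shape is that approval is a separate, explicit step the agent cannot perform on its own behalf.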

Batch review: The agent queues up a set of actions, and a human reviews and approves them as a batch. Efficient for high-volume, similar actions.
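The batch pattern might look like this (a sketch with hypothetical helpers, not a prescribed interface): the agent accumulates similar actions, and a single review pass approves the whole set.

```python
# Batch review: queue similar actions, approve them in one review pass.
queue = [{"tool": "file.write", "path": f"report_{i}.txt"} for i in range(3)]

def review_batch(actions):
    # A real reviewer would inspect each item here; this sketch
    # approves the lot to show the shape of the flow.
    return [dict(a, approved=True) for a in actions]

approved = review_batch(queue)
executed = [a["path"] for a in approved if a["approved"]]
```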

Exception-based review: The agent operates autonomously until it encounters an action the policy flags as risky. Only flagged actions go to a human.

Confidence-based escalation: The agent escalates when its own confidence in the action is below a threshold. "I'm not sure this is the right API endpoint, please confirm."
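Confidence-based escalation reduces to a threshold check. A sketch (the threshold value and `route` function are illustrative):

```python
# Confidence-based escalation: act autonomously above a threshold,
# escalate to a human below it.
CONFIDENCE_THRESHOLD = 0.8

def route(action: str, confidence: float) -> str:
    if confidence >= CONFIDENCE_THRESHOLD:
        return "auto-execute"
    return "escalate"

high = route("call a well-known API endpoint", 0.95)
low = route("pick between two similar API endpoints", 0.45)
```

The hard part in practice is not the check but getting a calibrated confidence signal out of the agent in the first place.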

Implementation with policy rules

Exception-based HITL maps directly to policy escalation rules:

rules:
  # Low-risk: automatic
  - tool: "search.web"
    action: allow
  - tool: "file.read"
    action: allow

  # Medium-risk: human review
  - tool: "file.write"
    action: escalate
    reason: "File writes require review"
  - tool: "email.send"
    action: escalate
    reason: "Outbound email requires review"

  # High-risk: blocked
  - tool: "shell.execute"
    action: block
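One way an enforcement layer might evaluate rules like the YAML above (a sketch, assuming first-match-wins semantics and an escalate-by-default fallback, neither of which the document specifies):

```python
# Evaluate a tool call against an ordered rule list: the first rule
# matching the tool name decides; unmatched tools escalate by default.
RULES = [
    {"tool": "search.web",    "action": "allow"},
    {"tool": "file.read",     "action": "allow"},
    {"tool": "file.write",    "action": "escalate", "reason": "File writes require review"},
    {"tool": "email.send",    "action": "escalate", "reason": "Outbound email requires review"},
    {"tool": "shell.execute", "action": "block"},
]

def decide(tool: str) -> str:
    for rule in RULES:
        if rule["tool"] == tool:
            return rule["action"]
    return "escalate"  # safe default: unknown tools go to a human
```

Defaulting unknown tools to escalate (rather than allow) keeps the policy fail-safe when the agent gains a capability the rules have not caught up with.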

The cost of HITL

HITL slows things down. Every escalation pauses the agent and waits for a human to respond. This creates a bottleneck. The key is to calibrate: escalate too much and you eliminate the productivity benefit of the agent. Escalate too little and you accept more risk.

Start with aggressive escalation (many things require approval) and relax over time as you build confidence in the agent and your policies. Track the approval rate: if 99% of escalations are approved, you can likely convert some of those rules to auto-allow.
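Tracking the approval rate is a one-line computation. A sketch (the data shape is illustrative): per-rule decision history in, fraction approved out.

```python
# Approval-rate tracking: a rule whose escalations are almost always
# approved is a candidate for conversion to auto-allow.
def approval_rate(decisions):
    approved = sum(1 for d in decisions if d == "approved")
    return approved / len(decisions)

history = ["approved"] * 99 + ["denied"]
rate = approval_rate(history)   # 0.99: likely safe to relax this rule
```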

The human must actually review

HITL only works if the human reviewer actually reads and evaluates the request. "Approve all" is not a review. Present the reviewer with clear context: what the agent wants to do, why the policy flagged it, and what the consequences might be. Make approval a deliberate act, not a rubber stamp.
