
Configuring the Aegis content safety scanner

Authensor

Aegis is Authensor's content safety scanner. It runs in-process with zero runtime dependencies, analyzing text for prompt injection attempts, PII exposure, credential leaks, and other content threats. This guide covers configuration options beyond the defaults.

Default detectors

Out of the box, Aegis includes these detectors:

| Detector | What it catches |
|----------|-----------------|
| prompt_injection | Instruction overrides, role injection, delimiter attacks |
| pii | Email addresses, phone numbers, social security numbers |
| credentials | API keys, tokens, passwords in plaintext |
| code_injection | SQL injection, shell injection in tool arguments |
| encoding_tricks | Base64-encoded instructions, Unicode homoglyphs |

Selecting detectors

Enable only the detectors you need:

```typescript
const guard = createGuard({
  policy,
  aegis: {
    enabled: true,
    detectors: ['prompt_injection', 'credentials'],
  }
});
```

Fewer detectors means faster scanning. If your agent never handles PII, disable the PII detector to reduce false positives.

Threshold tuning

Each detector returns a confidence score between 0 and 1. The threshold determines the cutoff for blocking:

```typescript
aegis: {
  enabled: true,
  threshold: 0.7,           // Global threshold
  detectorThresholds: {
    prompt_injection: 0.6,  // More sensitive for injections
    pii: 0.9,               // Less sensitive for PII
  }
}
```
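The resolution order is: a per-detector threshold wins if present, otherwise the global threshold applies. A minimal sketch of that lookup (assumed semantics; `effectiveThreshold` and `blocks` are illustrative helpers, not Aegis exports):

```typescript
// Assumed config shape mirroring the snippet above.
type ThresholdConfig = {
  threshold: number;                          // global cutoff
  detectorThresholds?: Record<string, number>; // per-detector overrides
};

// Per-detector override wins; otherwise fall back to the global threshold.
function effectiveThreshold(config: ThresholdConfig, detector: string): number {
  return config.detectorThresholds?.[detector] ?? config.threshold;
}

const config: ThresholdConfig = {
  threshold: 0.7,
  detectorThresholds: { prompt_injection: 0.6, pii: 0.9 },
};

// A detector's finding blocks when its score meets the effective threshold.
const blocks = (detector: string, score: number): boolean =>
  score >= effectiveThreshold(config, detector);
```

With this config, a prompt-injection finding at 0.65 blocks (cutoff 0.6), while the same score from a detector without an override does not (cutoff 0.7).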

Custom patterns

Add your own detection patterns for domain-specific threats:

```typescript
aegis: {
  enabled: true,
  customPatterns: [
    {
      name: 'internal_url',
      type: 'credentials',
      pattern: /https?:\/\/internal\..+\.corp/,
      score: 0.95,
      description: 'Internal corporate URL detected in agent input',
    },
    {
      name: 'trading_instruction',
      type: 'prompt_injection',
      pattern: /execute.*trade|buy.*shares|sell.*position/i,
      score: 0.85,
      description: 'Potential unauthorized trading instruction',
    }
  ]
}
```

Allowlists

Reduce false positives by allowlisting known-safe patterns:

```typescript
aegis: {
  enabled: true,
  allowlist: [
    /support@yourcompany\.com/,           // Your support email
    /api\.yourcompany\.com\/v[0-9]+/,     // Your API URLs
  ]
}
```

Content matching an allowlist pattern is excluded from scanning.
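One way to picture this exclusion: spans matching an allowlist pattern are masked out before the detectors run, so they can never contribute to a finding. A sketch under that assumption (`maskAllowlisted` is a hypothetical helper, not an Aegis API):

```typescript
// Replace every allowlisted match with whitespace of equal length so
// detectors see neutral content and offsets elsewhere stay stable.
function maskAllowlisted(input: string, allowlist: RegExp[]): string {
  let masked = input;
  for (const pattern of allowlist) {
    // Ensure the global flag so every occurrence is masked, not just the first.
    const flags = pattern.flags.includes('g') ? pattern.flags : pattern.flags + 'g';
    masked = masked.replace(
      new RegExp(pattern.source, flags),
      (m) => ' '.repeat(m.length),
    );
  }
  return masked;
}

const safe = maskAllowlisted(
  'Escalate to support@yourcompany.com immediately',
  [/support@yourcompany\.com/],
);
// The email is blanked out; everything else is untouched.
```

Masking rather than deleting keeps character offsets intact, which matters if scan results report the position of a match.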

Scan results

When Aegis detects a threat, the guard decision includes scan details:

```typescript
const decision = guard('search.web', { query: maliciousInput });

if (decision.threats) {
  for (const threat of decision.threats) {
    console.log(threat.type);       // 'prompt_injection'
    console.log(threat.detector);   // 'instruction_override'
    console.log(threat.score);      // 0.92
    console.log(threat.snippet);    // The matching text
  }
}
```

Performance

Aegis scans are synchronous and run in microseconds for typical inputs. Scanning a 10,000-character input with all detectors enabled takes under 1ms on modern hardware. This is negligible compared to LLM inference time.

For very large inputs (100K+ characters), consider scanning only the first N characters or splitting the input into chunks. Most injection attacks appear early in the text where they can influence the model's behavior.
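Both strategies are straightforward to implement yourself before passing text to the guard. A minimal sketch (the `MAX_SCAN_CHARS` limit and both helpers are illustrative choices, not Aegis options):

```typescript
// Illustrative cap on how much of a large input to scan.
const MAX_SCAN_CHARS = 20_000;

// Option 1: scan only the leading prefix, where injections usually appear.
function scanPrefix(input: string): string {
  return input.length > MAX_SCAN_CHARS ? input.slice(0, MAX_SCAN_CHARS) : input;
}

// Option 2: split the input into fixed-size chunks and scan each one.
function chunkInput(input: string, size: number = MAX_SCAN_CHARS): string[] {
  const chunks: string[] = [];
  for (let i = 0; i < input.length; i += size) {
    chunks.push(input.slice(i, i + size));
  }
  return chunks;
}
```

Chunking catches attacks buried deep in the text at the cost of scanning everything; prefix scanning trades that coverage for a hard bound on latency.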
