Prompt injection detection falls into two broad categories: rule-based (regex) and machine learning (ML). Each has distinct strengths. Production systems benefit from using both.
Regex-based detection matches input text against predefined patterns. Common patterns target phrases like "ignore previous instructions," "you are now," or "system prompt" in various forms.
Strengths: Near-zero latency. Fully deterministic. No external dependencies. Easy to audit and explain. Authensor's Aegis scanner includes a regex-based detection layer that processes input in microseconds.
Weaknesses: Only catches what you write rules for. Attackers bypass regex by misspelling words, using synonyms, inserting Unicode characters, or encoding payloads. Maintaining a comprehensive ruleset requires constant updates.
ML-based detection uses trained models to classify input as benign or malicious. Approaches range from fine-tuned BERT classifiers to embedding similarity search.
Strengths: Generalizes to novel attack patterns. Catches semantic variations that regex misses. Handles multilingual attacks better.
Weaknesses: Higher latency (10 to 100 milliseconds). Probabilistic, so false positives and negatives are inevitable. Requires training data and periodic retraining. Harder to explain individual decisions.
Use regex as your first filter and ML as your second. This pattern optimizes for both speed and coverage.
Regex runs first and catches known, high-confidence attack patterns instantly. The vast majority of requests pass through in microseconds. ML runs second on all inputs, catching the creative attacks that regex misses.
If either layer flags the input, the policy engine decides the response. Authensor supports configuring multiple detection layers with independent thresholds and actions per layer.
Start with regex. It is free, fast, and catches the most common attacks. Add ML detection when you have budget and traffic volume that justifies the investment. Use your logs from regex-detected attacks as training data for your ML model.
Review both layers quarterly. New attack patterns need new regex rules. ML models need fresh training data from real-world attacks.
Explore more guides on AI agent safety, prompt injection, and building secure systems.
View All Guides