Retrieval Augmented Generation (RAG) is a technique that enhances language model outputs by retrieving relevant information from external knowledge sources before generating a response. Instead of relying solely on knowledge encoded during training, the model accesses up-to-date, domain-specific documents at inference time.
A RAG pipeline has three stages:
Retrieval. The user's query is used to search a knowledge base, typically a vector database containing embedded document chunks. The search returns the most semantically similar documents.
Augmentation. The retrieved documents are inserted into the model's context window alongside the user's query. This provides the model with specific, relevant information to draw from.
Generation. The model generates a response informed by both the query and the retrieved context. The result is typically more accurate and grounded than a response from the model's parametric knowledge alone.
RAG is widely used in enterprise AI deployments because it allows models to answer questions about proprietary data, recent events, and domain-specific topics without fine-tuning.
However, RAG introduces a specific safety concern: the retrieved documents become part of the model's input. If those documents contain adversarial content, the model's behavior can be manipulated. This is the indirect prompt injection attack vector.
Consider a RAG-powered customer service agent. It retrieves product documentation to answer questions. If an attacker modifies a document in the knowledge base to include hidden instructions, the agent will retrieve those instructions and potentially follow them. The model cannot reliably distinguish between legitimate documentation and injected commands.
Securing RAG pipelines requires multiple layers of defense. Content scanning should inspect retrieved documents before they enter the context window. Policy engines should constrain what actions the agent can take regardless of what the retrieved context suggests. Audit trails should record which documents were retrieved for each interaction, enabling forensic analysis when incidents occur.
RAG amplifies both the utility and the attack surface of AI agents. Building it safely requires treating every retrieved document as potentially untrusted input.
Explore more guides on AI agent safety, prompt injection, and building secure systems.
View All Guides