Agentic RAG is an architecture in which an LLM acts as an agent: it autonomously and iteratively generates search queries, evaluates the results, and decides whether to retrieve again. The aim is answer accuracy that simple single-turn RAG struggles to reach.
A standard RAG pipeline runs as a linear flow: "user question → vector search → pass retrieved documents to LLM → generate answer." This is sufficient when the intent of the question is clear and the necessary information can be retrieved in a single search, but in practice a single search often fails to return everything the answer requires.
In Agentic RAG, the LLM itself determines whether "the search results are insufficient" or "the query should be changed," rewriting the query or querying a different data source as needed. By incorporating multi-step reasoning, it can progressively collect and integrate multiple pieces of information to construct a final answer.
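The retrieve, evaluate, and rewrite cycle described above can be sketched as a small loop. This is a minimal illustration rather than a reference implementation: the in-memory `SOURCES`, the keyword-based `search`, and the `demo_judge` and `demo_rewrite` functions are all hypothetical stand-ins. In a real system the judge and the rewriter would be LLM calls, and `search` would hit a vector store or another backend.

```python
from typing import Callable

# Hypothetical in-memory data sources; a real system would query a vector
# store, a search API, or a database instead.
SOURCES: dict[str, dict[str, str]] = {
    "wiki": {"vacation policy": "Employees receive 20 days of paid leave."},
    "hr_faq": {"paid leave carryover": "Up to 5 unused days carry over."},
}

def search(source: str, query: str) -> list[str]:
    """Toy keyword search: return docs whose key shares a word with the query."""
    words = set(query.lower().split())
    return [doc for key, doc in SOURCES[source].items() if words & set(key.split())]

# Stand-ins for LLM decisions (assumed signatures, not a real library API).
Judge = Callable[[str, list[str]], bool]                  # "is the evidence sufficient?"
Rewriter = Callable[[str, list[str]], tuple[str, str]]    # -> (next source, next query)

def agentic_retrieve(question: str, judge: Judge, rewrite: Rewriter,
                     max_steps: int = 4) -> list[str]:
    """Search, evaluate, and re-search until the judge is satisfied or steps run out."""
    evidence: list[str] = []
    source, query = "wiki", question  # first attempt: the raw question on a default source
    for _ in range(max_steps):
        for doc in search(source, query):
            if doc not in evidence:
                evidence.append(doc)
        if judge(question, evidence):
            break
        # Evidence is insufficient: let the "LLM" choose a new source and query.
        source, query = rewrite(question, evidence)
    return evidence

def demo_judge(question: str, evidence: list[str]) -> bool:
    # Hard-coded rule for the demo: two facts are needed to answer.
    return len(evidence) >= 2

def demo_rewrite(question: str, evidence: list[str]) -> tuple[str, str]:
    # Hard-coded rewrite for the demo: switch to the FAQ with a narrower query.
    return ("hr_faq", "paid leave carryover")

docs = agentic_retrieve("How does the vacation policy handle carryover?",
                        demo_judge, demo_rewrite)
```

Passing the judge and the rewriter in as callables keeps the control flow separate from the model calls, so the loop can be exercised with stubs like the ones shown here.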
Consider the example of querying an internal knowledge base. A question such as "Which proposal templates were used in the top 3 deals by sales last month?" requires multiple steps: searching sales data → identifying the deals → searching the proposal documents for each deal. By having the agent handle this decomposition and sequential search, the user can obtain an answer with a single question.
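The decomposition in this example can be sketched as two dependent lookups, where the output of the first step becomes the input of the second. Everything in the block is an assumption made for illustration: the deal names, sales figures, and template names are made-up sample data, and the plan is hard-coded, whereas in Agentic RAG the LLM would derive the steps from the question itself.

```python
# Hypothetical sample data standing in for a sales system and a document store.
SALES_LAST_MONTH: dict[str, int] = {
    "Deal A": 120,
    "Deal B": 300,
    "Deal C": 90,
    "Deal D": 210,
}
PROPOSAL_TEMPLATES: dict[str, str] = {
    "Deal A": "template-standard",
    "Deal B": "template-enterprise",
    "Deal C": "template-lite",
    "Deal D": "template-standard",
}

def search_top_deals(n: int) -> list[str]:
    """Step 1: search sales data and identify the top-n deals by sales."""
    return sorted(SALES_LAST_MONTH, key=SALES_LAST_MONTH.get, reverse=True)[:n]

def search_proposal_template(deal: str) -> str:
    """Step 2: search the proposal documents for a single deal."""
    return PROPOSAL_TEMPLATES[deal]

def answer_question(n: int = 3) -> dict[str, str]:
    """Run the steps in order; step 2 depends on the result of step 1."""
    deals = search_top_deals(n)
    return {deal: search_proposal_template(deal) for deal in deals}

result = answer_question()
```

The point of the sketch is the dependency between the steps: the second search cannot be issued until the first has identified the deals, which is why a single vector search is not enough for this kind of question.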
However, as the number of agent loop iterations increases, so do latency and token costs. Setting a loop limit and designing the system to return intermediate progress via streaming are essential for production use.
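One way to combine a loop limit with streamed progress is to write the agent loop as a generator, as in the sketch below. The event shape (`type`, `iteration`, `note`, `hit_limit`) and the `step` callable are assumptions chosen for this example, not part of any particular framework. A real `step` would perform one search-and-evaluate round and report whether the agent is done.

```python
from typing import Callable, Iterator

# Assumed signature: one agent iteration returns (progress note, finished?).
Step = Callable[[str, list[str]], tuple[str, bool]]

def run_agent(question: str, step: Step, max_iterations: int = 3) -> Iterator[dict]:
    """Yield a progress event per iteration, then one final event."""
    notes: list[str] = []
    done = False
    for i in range(1, max_iterations + 1):
        note, done = step(question, notes)
        notes.append(note)
        # Emit intermediate progress right away so the caller can display it.
        yield {"type": "progress", "iteration": i, "note": note}
        if done:
            break
    # hit_limit tells the caller the loop was cut off before the agent finished.
    yield {"type": "final", "notes": notes, "hit_limit": not done}

def never_done(question: str, notes: list[str]) -> tuple[str, bool]:
    # Demo step that never reports completion, to show the limit taking effect.
    return (f"search attempt {len(notes) + 1}", False)

events = list(run_agent("example question", never_done, max_iterations=2))
```

Because the generator yields after every iteration, a caller can forward each event to the client as it arrives, and the `hit_limit` flag lets the final response say that the answer is based on partial evidence.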


RAG (Retrieval-Augmented Generation) is a technique that improves the accuracy and currency of responses by retrieving relevant information from external knowledge sources and appending the results to the input of an LLM.

A next-generation RAG architecture that combines knowledge graphs and vector search, leveraging relationships between entities to improve retrieval accuracy.

Agentic AI is a general term for AI systems that interpret goals and autonomously repeat the cycle of planning, executing, and verifying actions without requiring step-by-step human instruction.


A data model that represents entities and their relationships in a graph structure. It is used to improve the accuracy of RAG and AI search.