Agentic RAG is an architecture in which an LLM acts as an agent: it autonomously and iteratively generates search queries, evaluates the retrieved results, and decides whether to retrieve again, achieving answer accuracy that simple single-turn RAG cannot match.
## Differences from Conventional RAG

A standard RAG pipeline operates in a linear flow: user question → vector search → pass retrieved documents to the LLM → generate answer. This is sufficient when the intent of the question is clear and the necessary information can be retrieved in a single search, but in practice a single search frequently fails to yield everything required. In Agentic RAG, the LLM itself judges that the search results are insufficient or that the query should be changed, and then rewrites the query or queries a different data source as needed. By incorporating multi-step reasoning, it can progressively collect and integrate multiple pieces of information to construct the final answer.

## When Is It Effective?

Consider querying an internal knowledge base. A question such as "Which proposal templates were used in the top 3 deals by sales last month?" requires multiple steps: search the sales data → identify the deals → search the proposal documents for each deal. By having the agent handle this decomposition and sequential search, the user obtains an answer from a single question.

However, as the number of agent-loop iterations increases, so do latency and token cost. For production use, it is essential to set a loop limit and to design the system to return intermediate progress via streaming.
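The loop described above, including the iteration cap, can be sketched as follows. This is a minimal illustration, not a specific framework's API: `search` and `llm` are hypothetical callables standing in for a retriever and a chat-completion call.

```python
# Minimal sketch of an agentic retrieval loop with a hard iteration cap.
# `search` and `llm` are hypothetical stand-ins, not a real library API.
MAX_ITERATIONS = 4  # bound latency and token cost

def agentic_rag(question, search, llm):
    """Iteratively search, judge sufficiency, and rewrite the query."""
    query = question
    evidence = []
    for _ in range(MAX_ITERATIONS):
        results = search(query)          # e.g. vector or keyword search
        evidence.extend(results)
        verdict = llm(
            f"Question: {question}\nEvidence: {evidence}\n"
            "Reply SUFFICIENT if the evidence answers the question, "
            "otherwise reply with a rewritten search query."
        )
        if verdict.strip() == "SUFFICIENT":
            break
        query = verdict                  # the agent rewrote the query
    return llm(f"Answer using this evidence: {evidence}\nQuestion: {question}")
```

In production, the loop body is also a natural place to stream intermediate progress (the current query and result count) back to the user before the final answer is ready.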


RAG (Retrieval-Augmented Generation) is a technique that improves the accuracy and currency of responses by retrieving relevant information from external knowledge sources and appending the results to the input of an LLM.
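The retrieve-then-augment flow can be sketched in a few lines; `retrieve` and `llm` here are hypothetical helpers, not a specific library's API.

```python
# Minimal single-turn RAG sketch: retrieve, then append the results to the input.
# `retrieve` and `llm` are hypothetical stand-ins for a vector store lookup
# and an LLM call.
def rag_answer(question, retrieve, llm, top_k=3):
    docs = retrieve(question, top_k)     # e.g. top-k vector similarity search
    context = "\n".join(docs)            # external knowledge appended to input
    return llm(f"Context:\n{context}\n\nQuestion: {question}")
```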

Agentic AI is a general term for AI systems that interpret goals and autonomously repeat the cycle of planning, executing, and verifying actions without requiring step-by-step human instruction.

A2A (Agent-to-Agent Protocol) is a communication protocol that enables different AI agents to perform capability discovery, task delegation, and state synchronization, published by Google in April 2025.

Human-in-the-Loop (HITL) is a design approach in which humans participate at designated points in an automated AI-driven process to review, approve, or correct its outputs, helping establish reliable AI-driven business process automation.

Agent Skills are reusable instruction sets defined to enable AI agents to perform specific tasks or areas of expertise, functioning as modular units that extend the capabilities of an agent.
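As an illustration of skills as modular units, the sketch below registers named instruction sets on an agent and injects the relevant ones into its prompt. The `Skill` structure, registry, and matching logic are hypothetical, not any specific framework's API.

```python
# Illustrative sketch: skills as reusable instruction sets attached to an agent.
# This data model is hypothetical, not a real framework's schema.
from dataclasses import dataclass, field

@dataclass
class Skill:
    name: str
    description: str
    instructions: str  # reusable instruction set injected into the prompt

@dataclass
class Agent:
    skills: dict = field(default_factory=dict)

    def register(self, skill: Skill) -> None:
        self.skills[skill.name] = skill

    def system_prompt(self, task: str) -> str:
        # Naive matching for illustration: include a skill when its name
        # appears in the task description.
        relevant = [s for s in self.skills.values() if s.name in task]
        return "\n\n".join(s.instructions for s in relevant)
```

Because each skill is self-contained, skills can be added, removed, or shared between agents without modifying the agent's core logic.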