RAG (Retrieval-Augmented Generation)

RAG (Retrieval-Augmented Generation) is a technique that improves the accuracy and currency of responses by retrieving relevant information from external knowledge sources and appending the results to the input of an LLM.
LLMs only possess knowledge up to their training cutoff date. Moreover, even with the knowledge they do have, they can be confidently wrong (hallucination). RAG has established itself as a practical solution to these two weaknesses.
The mechanism is intuitive. When a user asks a question, the system first retrieves relevant documents from internal documents or a knowledge base. The retrieved results are then passed to the LLM along with the question. The LLM grounds its response in the provided documents rather than relying on its own knowledge alone. Because sources can be cited explicitly, responses are easier to verify.
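The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a production design: the `DOCUMENTS` store, the `retrieve` and `build_prompt` helpers, and the word-overlap scoring are all hypothetical, and the final call to an LLM is left out because it depends on the provider in use.

```python
import re

# Hypothetical in-memory knowledge base: document id -> text.
DOCUMENTS = {
    "hr-manual": "Employees accrue paid leave at 1.5 days per month.",
    "it-manual": "Password resets are requested through the internal help desk portal.",
    "travel-policy": "Travel expenses must be submitted within 30 days of the trip.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: dict[str, str], k: int = 2) -> list[tuple[str, str]]:
    """Return up to k (doc_id, text) pairs sharing the most words with the question.

    Word overlap is a stand-in (an assumption) for a real search backend.
    """
    q_tokens = tokenize(question)
    scored = [
        (len(q_tokens & tokenize(text)), doc_id, text)
        for doc_id, text in documents.items()
    ]
    scored = [item for item in scored if item[0] > 0]  # drop documents with no overlap
    scored.sort(key=lambda item: (-item[0], item[1]))  # best score first, ties by id
    return [(doc_id, text) for _, doc_id, text in scored[:k]]


def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Combine the question with numbered sources so the answer can cite them."""
    sources = "\n".join(
        f"[{i}] ({doc_id}) {text}" for i, (doc_id, text) in enumerate(hits, start=1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{sources}\n\n"
        f"Question: {question}\nAnswer:"
    )


question = "How do I request a password reset?"
hits = retrieve(question, DOCUMENTS, k=1)
prompt = build_prompt(question, hits)
print(prompt)  # this string is what would be sent to the LLM
```

Numbering the sources in the prompt is what makes citation possible: the model can refer to `[1]`, and a reader can trace that number back to the document id.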
RAG consists of four components: document preprocessing (chunking), vector embedding, similarity search (semantic search), and prompt construction for the LLM. Each step involves design choices, and even something as simple as how chunks are split can significantly affect response quality.
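To make the first three components concrete, here is a small sketch of chunking, embedding, and similarity search. It rests on loud assumptions: `chunk_words`, `embed`, `cosine`, and `search` are hypothetical helpers, and the word-count vector in `embed` is only a stand-in for a learned embedding model, so it matches shared words rather than meaning. The `size` and `overlap` parameters are the kind of chunking choice the paragraph above refers to.

```python
import math
import re
from collections import Counter


def chunk_words(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into chunks of `size` words, each sharing `overlap` words with the previous one."""
    step = size - overlap
    if step <= 0:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):  # the last chunk already reaches the end
            break
    return chunks


def embed(text: str) -> Counter:
    """Toy embedding (assumption): a sparse word-count vector, not a learned model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def search(query: str, chunks: list[str], k: int = 1) -> list[str]:
    """Return the k chunks most similar to the query."""
    q_vec = embed(query)
    return sorted(chunks, key=lambda c: cosine(q_vec, embed(c)), reverse=True)[:k]


TEXT = (
    "Refunds are issued within 14 days of a return request. "
    "Shipping is free for orders above 50 dollars."
)
chunks = chunk_words(TEXT, size=8, overlap=2)
top = search("When are refunds issued?", chunks, k=1)
print(chunks)
print(top)
```

With these settings the first chunk ends in the middle of the refund sentence, so "return request" lands in the next chunk. Adjusting `size` and `overlap`, or splitting on sentence boundaries instead, changes what a retrieved chunk can actually tell the LLM.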
The distinction between RAG and fine-tuning is frequently debated, but they serve different roles. RAG is a method for "having the model reference external knowledge," while fine-tuning is a method for "adjusting the model's behavior and tone." If the goal is to have the model accurately answer questions based on internal manuals, RAG is the reasonable starting point; if the goal is to standardize the format and style of responses, fine-tuning is the better fit. Many projects combine the two.