Chunk size refers to the size (in number of tokens or characters) of the unit into which documents are split when stored in a vector store within a RAG pipeline. It is a critical parameter that directly affects retrieval accuracy and answer quality.
## Why Splitting Is Necessary

LLMs have an upper limit on their context window. Since hundreds of pages of internal manuals cannot be passed in as-is, documents must be split into appropriately granular units (chunking), vectorized, and only the sections relevant to a query retrieved. "How large to make each cut" is the question of chunk size.

## Too Large or Too Small Both Cause Problems

If chunks are too small, a single chunk lacks sufficient context, so even when retrieved it may not contain the information the LLM needs to construct an answer. Conversely, if chunks are too large, irrelevant information enters as noise, degrading answer accuracy while also increasing token costs. Generally, 256–1,024 tokens is considered a starting point, but the optimal value depends on the domain and the nature of the queries. For short Q&A content such as FAQs, a smaller size is appropriate; for documents where surrounding context matters, such as technical specifications, a larger size is the standard practical choice.

## The Technique of Overlap

To mitigate the problem of context being cut off at chunk boundaries, "overlap" (partially duplicating adjacent chunks) is commonly used. For example, with a chunk size of 512 tokens and an overlap of 64 tokens, the last 64 tokens of the previous chunk are repeated at the beginning of the next chunk. This improves accuracy in both BM25 and vector search, though storage and index size increase as a result.
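The sliding-window behavior described above can be sketched with a minimal character-based splitter. This is an illustrative helper, not a specific library's API; a token-based version works the same way, operating on tokenizer output instead of characters. The name `chunk_text` and the example parameters are assumptions for illustration:

```python
def chunk_text(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into chunks of at most `chunk_size` characters,
    repeating the last `overlap` characters of each chunk at the
    start of the next (sliding window)."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the final chunk already reaches the end of the text
    return chunks

# Example with small numbers: chunk_size=10, overlap=3 → the window
# advances 7 characters at a time, so each chunk begins with the
# last 3 characters of the previous one.
parts = chunk_text("abcdefghijklmnopqrstuvwxyz", chunk_size=10, overlap=3)
# parts[0] = "abcdefghij", parts[1] = "hijklmnopq", ...
```

In a real pipeline, each element of `parts` would then be embedded and stored in the vector store; the overlap is what lets a sentence straddling a boundary survive intact in at least one chunk.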


A2A (Agent-to-Agent Protocol) is a communication protocol that enables different AI agents to perform capability discovery, task delegation, and state synchronization, published by Google in April 2025.

Acceptance testing is a testing method that verifies whether developed features meet business requirements and user stories, from the perspective of the product owner and stakeholders.

Agent Skills are reusable instruction sets that enable AI agents to perform specific tasks or cover specific areas of expertise, functioning as modular units that extend an agent's capabilities.


Agentic AI is a general term for AI systems that interpret goals and autonomously repeat the cycle of planning, executing, and verifying actions without requiring step-by-step human instruction.