What Is Semantic Search?

Fundamental Differences from Keyword Search

Traditional keyword search (sparse models, typified by BM25) scores a document by whether the words in the query appear in it. Searching for "automobile" returns documents that contain "automobile," but it cannot retrieve documents that use synonyms such as "car" or "auto" instead.
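To make the vocabulary-mismatch problem concrete, here is a minimal sketch of BM25 scoring in plain Python. The whitespace tokenizer, the parameter defaults (`k1=1.5`, `b=0.75`), and the three sample documents are assumptions chosen for illustration; production engines add stemming, stop-word handling, and tuned parameters.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive lowercase whitespace tokenizer (an assumption for this sketch).
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document for the given query."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # a term that never appears contributes nothing
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample documents.
docs = [
    "automobile maintenance schedule",
    "how to wash a car at home",
    "auto insurance rates explained",
]
scores = bm25_scores("automobile", docs)
print(scores)
```

Only the first document receives a non-zero score: the documents that say "car" and "auto" share no token with the query, so a purely lexical scorer cannot see them.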

Semantic search addresses this limitation. It converts text into vectors of hundreds to thousands of dimensions using an embedding model, then performs nearest-neighbor search over those vectors in a vector database. "I want to improve my automobile's fuel efficiency" and "ways to reduce a car's gasoline consumption" share almost no overlapping vocabulary, yet they are mapped to nearby positions in semantic space and will therefore match.
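The nearest-neighbor step can be sketched with cosine similarity. The 3-dimensional vectors below are hand-made stand-ins for embedding model output (real embeddings have hundreds to thousands of dimensions), and the brute-force scan stands in for the approximate index a vector database would normally use; the function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query_vec, index, k=1):
    """Brute-force top-k search over a {text: vector} mapping."""
    ranked = sorted(
        index.items(),
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


# Toy vectors, NOT produced by a real embedding model.
index = {
    "ways to reduce a car's gasoline consumption": [0.9, 0.1, 0.1],
    "steps for the resignation process": [0.1, 0.9, 0.2],
    "office cafeteria menu": [0.1, 0.2, 0.9],
}

# Stand-in vector for "I want to improve my automobile's fuel efficiency".
query_vec = [0.8, 0.2, 0.1]
top = nearest(query_vec, index, k=1)
print(top)
```

The query and the top result share no keywords; the match comes entirely from the vectors pointing in nearly the same direction.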

Where It Excels and Where It Falls Short

Semantic search handles paraphrases, synonyms, and concept-level queries well. It delivers high recall for queries that are worded differently but share the same intent, such as "steps for the resignation process" and "what to do when leaving a company." This makes it a good fit for internal knowledge bases and FAQ search.

On the other hand, it struggles with queries that require exact vocabulary matches, such as model numbers (XR-990), statute numbers, or program code. In embedding space, "XR-990" and "XR-991" may be mapped to nearly identical positions, which makes them hard to tell apart. To compensate for this weakness, hybrid search that combines semantic search with BM25 is widely used in practice.
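One common way to combine the two result lists is Reciprocal Rank Fusion (RRF). The text does not say which fusion method is meant, so treat this as one possible approach. The constant `k=60` is a conventional default, and the two rankings are made-up examples of what a BM25 engine and a vector search might return for the query "XR-990".

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranked list.

    Each document earns 1 / (k + rank) from every list it appears in.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical rankings for the query "XR-990".
bm25_ranking = ["XR-990 manual", "XR-990 FAQ", "XR-991 manual"]
semantic_ranking = ["XR-991 manual", "XR-990 manual", "XR-series overview"]

fused = reciprocal_rank_fusion([bm25_ranking, semantic_ranking])
print(fused)
```

In this example the semantic ranking puts the wrong model number first, but the exact-match signal from BM25 lifts "XR-990 manual" back to the top of the fused list.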

Role in RAG

In RAG (Retrieval-Augmented Generation), semantic search serves as the core of the retrieval phase. The user's question is vectorized, semantically relevant chunks are retrieved from an external knowledge base, and these are passed to the LLM. If retrieval accuracy is low at this stage, the LLM generates its response from irrelevant documents, which can lead to hallucinations.
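The retrieval phase can be sketched as follows. The chunk texts, their 2-dimensional vectors, the question vector, and the prompt wording are all invented for illustration; in a real pipeline the vectors would come from an embedding model and the prompt would be sent to an LLM, neither of which is shown here.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


def retrieve(question_vec, chunks, k=2):
    """Return the texts of the k chunks most similar to the question."""
    ranked = sorted(chunks, key=lambda c: cosine(question_vec, c["vector"]), reverse=True)
    return [c["text"] for c in ranked[:k]]


def build_prompt(question, contexts):
    """Assemble the text that would be passed to the LLM."""
    context_block = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}"
    )


# Hypothetical knowledge-base chunks with toy vectors.
chunks = [
    {"text": "Submit the resignation form to HR two weeks before your last day.", "vector": [0.9, 0.1]},
    {"text": "Return your laptop and badge on your final day.", "vector": [0.8, 0.3]},
    {"text": "The cafeteria opens at 11:30.", "vector": [0.1, 0.9]},
]

question = "what to do when leaving a company"
question_vec = [0.85, 0.2]  # stand-in for the embedded question

contexts = retrieve(question_vec, chunks, k=2)
prompt = build_prompt(question, contexts)
print(prompt)
```

Whatever `retrieve` returns is all the LLM gets to see, which is why a poor top-k directly degrades the generated answer.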

The practical keys to improving retrieval quality lie in selecting the right embedding model (whether multilingual support is needed, or whether domain-specific fine-tuning is effective) and in designing chunk sizes. In the author's experience, simply changing the chunk size from 256 tokens to 512 tokens with the same model has shifted Recall@10 by more than 10 points. Evaluating the model and chunk size together has become a cardinal rule.
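A minimal sketch of the two knobs mentioned here: fixed-size chunking and the Recall@k metric. It assumes the document is already tokenized into a list (real token counts depend on the embedding model's tokenizer), and the chunk IDs in the example are hypothetical.

```python
def chunk_tokens(tokens, chunk_size):
    """Split a token list into consecutive chunks of at most chunk_size tokens."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


def recall_at_k(retrieved_ids, relevant_ids, k=10):
    """Fraction of the relevant items that appear in the top-k retrieved items."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = set(retrieved_ids[:k]) & relevant
    return len(hits) / len(relevant)


# A stand-in document of 1024 tokens.
tokens = list(range(1024))
chunks_256 = chunk_tokens(tokens, 256)
chunks_512 = chunk_tokens(tokens, 512)
print(len(chunks_256), len(chunks_512))

# Hypothetical retrieval result and ground-truth labels for one query.
retrieved = ["c3", "c7", "c1", "c9"]
relevant = ["c1", "c2"]
print(recall_at_k(retrieved, relevant, k=3))
```

To compare configurations, one would rebuild the index for each (model, chunk size) pair and average `recall_at_k` over a labeled query set, keeping the queries and relevance labels fixed.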
