Embedding is a technique that transforms unstructured data such as text, images, and audio into fixed-length numerical vectors while preserving semantic relationships.
A computer cannot determine from raw strings that "apple" and "orange" are similar. Embedding solves this problem. When "apple" is converted into a vector like [0.23, -0.41, 0.87, ...] with hundreds of dimensions, the vector for "orange" is close by while "automobile" is far away. Semantic closeness becomes numerical closeness.

Embeddings play a core role inside LLMs as well. Input text is first tokenized, and each token is converted into an embedding vector. The Transformer processes this sequence of vectors to generate output.

In practice, sentence-level embeddings are used most frequently. Models such as OpenAI's text-embedding-3-small and Cohere's embed-v4 convert entire sentences into single vectors. Storing these vectors in a vector database enables semantic search and the retrieval layer for RAG.

When selecting a model, dimensionality, supported languages, and cost are the key criteria. For Japanese or Thai language processing, benchmarking multilingual model accuracy beforehand is important.
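The "closeness" above is usually measured with cosine similarity. A minimal sketch, using made-up 3-dimensional vectors (real embeddings have hundreds of dimensions):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: near 1.0 = same direction, near 0 or negative = dissimilar."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors for illustration only; a real embedding model would produce these.
apple      = [0.23, -0.41, 0.87]
orange     = [0.20, -0.38, 0.90]
automobile = [-0.75, 0.60, 0.10]

print(cosine_similarity(apple, orange))      # high: semantically close
print(cosine_similarity(apple, automobile))  # low: semantically distant
```

The same function works unchanged on real model output, since embedding APIs return plain float arrays.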


A vector database stores text, images, and other data as numerical vectors (embeddings) and provides fast semantic-similarity search, typically via approximate nearest-neighbor (ANN) indexes such as HNSW.
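Conceptually, a vector database maps each record to a vector and returns the records whose vectors are closest to a query vector. A minimal in-memory sketch with a brute-force scan (real systems such as FAISS, pgvector, or Qdrant replace the scan with an ANN index; the store contents here are invented for illustration):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# Hypothetical store: id -> (embedding vector, original text).
store = {
    "doc1": ([0.9, 0.1, 0.0], "refund policy"),
    "doc2": ([0.1, 0.9, 0.1], "shipping times"),
    "doc3": ([0.8, 0.2, 0.1], "return procedure"),
}

def search(query_vec: list[float], k: int = 2):
    """Return the top-k records by cosine similarity to the query vector."""
    scored = [(cosine(query_vec, vec), doc_id, text)
              for doc_id, (vec, text) in store.items()]
    return sorted(scored, reverse=True)[:k]

print(search([0.85, 0.15, 0.05]))  # returns the two documents nearest the query
```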

Hybrid search is a technique that combines keyword-based full-text search (such as BM25) with vector search (semantic search), leveraging the strengths of both to improve retrieval accuracy.
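One common way to combine the two rankings is Reciprocal Rank Fusion (RRF), which merges result lists by rank position and so avoids having to normalize BM25 scores against cosine similarities. A sketch with invented document IDs (k=60 is the conventional RRF constant):

```python
def rrf(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Fuse multiple ranked lists: each doc scores sum(1 / (k + rank))."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_results   = ["doc3", "doc1", "doc7"]  # keyword ranking (illustrative)
vector_results = ["doc1", "doc4", "doc3"]  # semantic ranking (illustrative)
print(rrf([bm25_results, vector_results]))
```

Documents that appear high in both lists (doc1, doc3) rise to the top, which is the behavior hybrid search is after.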

Context Engineering is a technical discipline focused on systematically designing and optimizing the context provided to AI models — including codebase structure, commit history, design intent, and domain knowledge.
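In practice this often comes down to assembling the most valuable sources into a limited context window. A minimal sketch of budget-aware context assembly; the source names, the priority ordering, and the 4-characters-per-token estimate are all illustrative assumptions, not a real tokenizer or framework API:

```python
def estimate_tokens(text: str) -> int:
    """Rough heuristic (~4 chars/token); a real system would use a tokenizer."""
    return max(1, len(text) // 4)

def build_context(sources: list[tuple[str, str]], budget: int) -> str:
    """Greedily include sources (pre-sorted by priority) that fit the token budget."""
    parts, used = [], 0
    for name, text in sources:
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # skip sources that would overflow the window
        parts.append(f"## {name}\n{text}")
        used += cost
    return "\n\n".join(parts)

# Hypothetical sources, ordered from most to least important.
sources = [
    ("design intent", "Payments module must stay idempotent."),
    ("commit history", "abc123: extracted RetryPolicy from PaymentClient."),
    ("domain knowledge", "Refunds settle in 3-5 business days."),
]
print(build_context(sources, budget=22))  # lowest-priority source gets dropped
```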


PEFT (Parameter-Efficient Fine-Tuning) is a family of techniques, such as LoRA, adapters, and prompt tuning, that customizes a large model by training only a small fraction of its parameters while the base weights stay frozen, cutting fine-tuning cost by 90% or more compared with full fine-tuning.
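The cost reduction follows directly from parameter arithmetic. In LoRA, a weight update to a d x d matrix is approximated by two low-rank factors B (d x r) and A (r x d), so only d*r + r*d values are trained. A sketch of the count for one layer, using illustrative sizes rather than any specific model:

```python
d = 4096  # hypothetical hidden dimension
r = 8     # LoRA rank (typical values range from 4 to 64)

full_params = d * d          # full fine-tuning updates the whole d x d matrix
lora_params = d * r + r * d  # LoRA trains only B (d x r) and A (r x d)

reduction = 1 - lora_params / full_params
print(f"full: {full_params:,}  LoRA: {lora_params:,}  reduction: {reduction:.1%}")
```

With these assumed sizes the trainable-parameter reduction exceeds 99% per matrix, which is why the overall fine-tuning cost drop is commonly quoted as 90% or more.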

Chunk size refers to the size (in number of tokens or characters) of the unit into which documents are split when stored in a vector store within a RAG pipeline. It is a critical parameter that directly affects retrieval accuracy and answer quality.
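A common baseline is fixed-size chunking with overlap, so that a sentence cut at a chunk boundary still appears intact in the neighboring chunk. A minimal character-based sketch (the sizes are illustrative; token-based splitting works the same way on a token list):

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character chunks; adjacent chunks share `overlap` chars."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    step = chunk_size - overlap
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks

sample = "".join(str(i % 10) for i in range(500))
print(len(chunk_text(sample)))  # number of chunks for a 500-character input
```

Tuning chunk_size trades precision against context: small chunks retrieve more precisely but may lose surrounding meaning, while large chunks preserve context but dilute the embedding.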