Gemini Embedding 2 is a multimodal embedding model developed by Google that maps text, images, video, audio, and documents into a single shared vector space.
Unlike conventional embedding models that handle only text, the defining feature of this model is its ability to map 5 types of media into a single semantic space. For example, an audio clip of an abnormal factory sound and a text document describing the corresponding equipment troubleshooting procedure can be placed in close proximity in vector space — enabling cross-modal search within a single model. In RAG pipelines where non-text knowledge needs to be searchable, this significantly reduces the overhead of preparing separate models for each modality.
The input window is 8,192 tokens, allowing for larger chunk sizes. Output dimensionality goes up to 3,072, but thanks to the Matryoshka-style representation, vectors can be truncated to 1,536 (balanced) or 768 (optimized for low-latency search) with little loss in quality. Task-type parameters are also available, adjusting the geometric properties of the output vectors for use cases such as retrieval and classification.
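The Matryoshka-style truncation described above can be sketched in a few lines: keep the leading dimensions and re-normalize. The 3,072-dim vector below is synthetic stand-in data; with the real API you would typically request the output dimensionality directly instead of truncating client-side.

```python
import math
import random

def truncate_matryoshka(vec, dim):
    """Keep the leading `dim` dimensions and L2-normalize the result."""
    head = vec[:dim]
    norm = math.sqrt(sum(x * x for x in head))
    return [x / norm for x in head]

# Synthetic stand-in for a 3,072-dim embedding.
random.seed(0)
full = [random.gauss(0, 1) for _ in range(3072)]

balanced = truncate_matryoshka(full, 1536)  # balanced
fast = truncate_matryoshka(full, 768)       # low-latency search

print(len(balanced), len(fast))  # 1536 768
```

Because Matryoshka training packs the most informative components into the leading dimensions, the truncated vectors remain usable for similarity search at a fraction of the storage cost.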
With native support for over 100 languages, the model is well-suited for multilingual RAG and cross-lingual search. Official integrations with LangChain, LlamaIndex, Weaviate, Qdrant, and ChromaDB are provided, enabling seamless incorporation into existing vector database infrastructure.
Pricing is $0.25 per 1 million tokens, with a free tier available. Migrating from the older text-embedding-004 is as simple as swapping the model ID, but because the vector spaces differ, existing indexes must be rebuilt. When fully leveraging multimodal input, careful design is required — including decisions on the granularity at which images and audio are included in the index, and balancing search accuracy against storage costs.
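At $0.25 per million tokens, the cost of the index rebuild a migration requires can be estimated up front. The corpus size below is a hypothetical example, not a figure from the source:

```python
PRICE_PER_MILLION_TOKENS = 0.25  # USD, per the pricing above

def embedding_cost(total_tokens: int) -> float:
    """Estimated USD cost for embedding `total_tokens` tokens."""
    return total_tokens / 1_000_000 * PRICE_PER_MILLION_TOKENS

# Hypothetical corpus: 200k chunks averaging 500 tokens each.
tokens = 200_000 * 500
print(f"${embedding_cost(tokens):.2f}")  # $25.00
```

Running the same arithmetic against your own corpus makes it easy to weigh a full reindex against keeping the old model in place.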


Embedding is a technique that transforms unstructured data such as text, images, and audio into fixed-length numerical vectors while preserving semantic relationships.
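"Preserving semantic relationships" means, concretely, that similar inputs map to nearby vectors, most often measured by cosine similarity. The 3-dimensional vectors below are toy values for illustration only:

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 = same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": related concepts point in similar directions.
cat = [0.9, 0.1, 0.0]
kitten = [0.85, 0.2, 0.05]
invoice = [0.0, 0.1, 0.95]

print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```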

A vector database stores text, images, and other data as numerical vectors (embeddings) and provides fast search based on semantic similarity.
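Stripped to its essentials, a vector database stores (id, vector) pairs and ranks them by similarity to a query vector. The brute-force sketch below illustrates the idea; production systems replace the linear scan with approximate nearest-neighbor indexes (e.g. HNSW) to scale. All names here are illustrative.

```python
import math

class TinyVectorStore:
    """Brute-force in-memory vector store; real databases use ANN indexes."""

    def __init__(self):
        self.items = []  # list of (doc_id, vector)

    def add(self, doc_id, vector):
        self.items.append((doc_id, vector))

    def search(self, query, k=3):
        """Return the ids of the k vectors most similar to `query`."""
        def cos(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            na = math.sqrt(sum(x * x for x in a))
            nb = math.sqrt(sum(x * x for x in b))
            return dot / (na * nb)
        ranked = sorted(self.items, key=lambda it: cos(query, it[1]), reverse=True)
        return [doc_id for doc_id, _ in ranked[:k]]

store = TinyVectorStore()
store.add("doc-a", [1.0, 0.0])
store.add("doc-b", [0.7, 0.7])
store.add("doc-c", [0.0, 1.0])
print(store.search([0.9, 0.1], k=2))  # ['doc-a', 'doc-b']
```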

Hybrid search is a technique that combines keyword-based full-text search (such as BM25) with vector search (semantic search), leveraging the strengths of both to improve retrieval accuracy.
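One common way to combine the two result lists is Reciprocal Rank Fusion (RRF), which needs only the rank positions, not the raw scores. The ranked lists below are illustrative:

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(doc) = sum over lists of 1 / (k + rank)."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

bm25_hits = ["doc3", "doc1", "doc7"]    # keyword (BM25) ranking
vector_hits = ["doc1", "doc4", "doc3"]  # semantic (vector) ranking

print(rrf_fuse([bm25_hits, vector_hits]))  # ['doc1', 'doc3', 'doc4', 'doc7']
```

A document that ranks moderately well in both lists (doc1 here) beats one that appears in only one list, which is exactly the behavior hybrid search is after.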


Byte Pair Encoding (BPE) is an algorithm that merges frequently co-occurring character sequences and splits text into subword units. It directly affects the input/output cost and processing speed of LLMs; for low-resource languages, an insufficient dedicated vocabulary forces decomposition down to the byte level.
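One merge step of this algorithm can be sketched as follows: count adjacent symbol pairs, then replace every occurrence of the most frequent pair with a single merged token. Real tokenizers repeat this until a target vocabulary size is reached.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return pairs.most_common(1)[0][0]

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with a single merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

tokens = list("low lower lowest")
pair = most_frequent_pair(tokens)  # one of the most frequent adjacent pairs
tokens = merge_pair(tokens, pair)
print(pair, tokens)
```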