Generating Embeddings with Pydantic AI
A comprehensive guide to generating embeddings with Pydantic AI across providers including OpenAI, Google, Cohere, AWS Bedrock, and VoyageAI.
Posted on: 2026-03-05 by AI Assistant

Embeddings are vector representations of text that capture semantic meaning. They’re essential for building semantic search, Retrieval-Augmented Generation (RAG) systems, similarity detection, and classification. Pydantic AI provides a unified interface for generating embeddings across multiple providers.
In this post, we’ll explore how to use the Embedder class in Pydantic AI.
Quick Start
The Embedder class is the high-level interface for generating embeddings. You can start by embedding a search query or multiple documents:
```python
from pydantic_ai import Embedder
import asyncio

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    # Embed a search query
    result = await embedder.embed_query('What is machine learning?')
    print(f'Embedding dimensions: {len(result.embeddings[0])}')

    # Embed multiple documents at once
    docs = [
        'Machine learning is a subset of AI.',
        'Deep learning uses neural networks.',
        'Python is a programming language.',
    ]
    result = await embedder.embed_documents(docs)
    print(f'Embedded {len(result.embeddings)} documents')

asyncio.run(main())
```
Tip: Some embedding models optimize differently for queries and documents. Use `embed_query()` for search queries and `embed_documents()` for content you’re indexing.
Exploring the Embedding Result
All embed methods return an EmbeddingResult containing the embeddings along with useful metadata like token usage and cost.
```python
from pydantic_ai import Embedder
import asyncio

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    result = await embedder.embed_query('Hello world')

    # Access embeddings - each is a sequence of floats
    embedding = result.embeddings[0]
    print(f'Dimensions: {len(embedding)}')

    # Check usage
    print(f'Tokens used: {result.usage.input_tokens}')

asyncio.run(main())
```
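Once you have the raw vectors, comparing them is plain arithmetic. Here is a minimal cosine-similarity sketch using only the standard library; the toy vectors stand in for real embeddings, which would come from `result.embeddings`:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (1.0 = identical direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors standing in for real embedding output
query_vec = [0.1, 0.3, 0.5]
doc_vec = [0.2, 0.6, 0.9]

print(f'Similarity: {cosine_similarity(query_vec, doc_vec):.4f}')
```

In a real search pipeline you would compute this score between a query embedding and each document embedding, then rank documents by score.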
Supported Providers
Pydantic AI supports several leading embedding providers.
1. OpenAI and Compatible Models
Pydantic AI works seamlessly with OpenAI’s embeddings API and any OpenAI-compatible provider. Install the required packages with pip install "pydantic-ai-slim[openai]".
```python
from pydantic_ai import Embedder
import asyncio

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    result = await embedder.embed_query('Hello world')
    print(len(result.embeddings[0]))

asyncio.run(main())
```
OpenAI’s text-embedding-3-* models also support dimension reduction:
```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings import EmbeddingSettings

embedder = Embedder(
    'openai:text-embedding-3-small',
    settings=EmbeddingSettings(dimensions=256),
)
```
2. Google (Gemini and Vertex AI)
Google’s embedding models can be accessed via the Gemini API (Google AI Studio) or Vertex AI (pip install "pydantic-ai-slim[google]").
```python
embedder = Embedder('google-gla:gemini-embedding-001')
```
3. Cohere
Cohere offers excellent multilingual models (pip install "pydantic-ai-slim[cohere]") and supports extra settings, such as controlling truncation for long inputs.
```python
from pydantic_ai import Embedder
from pydantic_ai.embeddings.cohere import CohereEmbeddingSettings

embedder = Embedder(
    'cohere:embed-v4.0',
    settings=CohereEmbeddingSettings(
        dimensions=512,
        cohere_truncate='END',
    ),
)
```
4. VoyageAI
VoyageAI provides specialized models for code, finance, and legal domains (pip install "pydantic-ai-slim[voyageai]").
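Usage follows the same one-liner pattern as the other providers. Note that the model name below is an assumption for illustration; check VoyageAI’s documentation for the current model names:

```python
from pydantic_ai import Embedder

# 'voyage-3' is an assumed model name; substitute a current VoyageAI model
embedder = Embedder('voyageai:voyage-3')
```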
5. AWS Bedrock
With Bedrock (pip install "pydantic-ai-slim[bedrock]"), you can access Amazon Titan, Cohere, and Amazon Nova models.
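A sketch of the Bedrock pattern, assuming AWS credentials are configured in your environment; the model ID below is an assumption, so check the Bedrock console for the IDs available in your region:

```python
from pydantic_ai import Embedder

# Model ID is an assumption; verify it against your Bedrock region's model list
embedder = Embedder('bedrock:amazon.titan-embed-text-v2:0')
```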
6. Local Embeddings with Sentence Transformers
For offline use, privacy, or avoiding API costs, you can run embeddings locally using the sentence-transformers library (pip install "pydantic-ai-slim[sentence-transformers]").
```python
embedder = Embedder('sentence-transformers:all-MiniLM-L6-v2')
```
Settings and Token Counting
The EmbeddingSettings object provides common configurations like dimensions and truncate.
You can also check token counts before making a request to avoid exceeding limits:
```python
from pydantic_ai import Embedder
import asyncio

embedder = Embedder('openai:text-embedding-3-small')

async def main():
    token_count = await embedder.count_tokens('Hello world, this is a test.')
    print(f'Tokens: {token_count}')

    max_tokens = await embedder.max_input_tokens()
    print(f'Max tokens: {max_tokens}')

asyncio.run(main())
```
Conclusion
Whether you need a cutting-edge proprietary model or want to run an embedding model locally for privacy, Pydantic AI’s unified Embedder interface makes it simple to integrate semantic understanding into your application.