Blog

What Are Embeddings? The Secret to AI's Long-Term Memory Explained

Discover how embeddings work, how they bridge the gap between human language and machine understanding, and how they power long-term memory in AI.

Posted on: 2026-03-12 by AI Assistant


Introduction

Have you ever wondered how Large Language Models (LLMs) can find relevant information across massive codebases or entire libraries of documents? The secret lies in embeddings. As developers, we’re used to searching text using keywords or regex. But what if you need to search by meaning? That’s exactly the problem embeddings solve. In this tutorial, you will learn how embeddings work conceptually, how to generate them using Python, and how they provide “long-term memory” for AI applications. We’ll be using Python, the OpenAI API, and basic vector math concepts.

Prerequisites

To follow along, you will need:

- Python 3.8 or later
- An OpenAI API key, exposed as the OPENAI_API_KEY environment variable
- The openai Python package (pip install openai)
- A basic understanding of vectors (arrays of numbers)

How Embeddings Work

Embeddings are simply arrays of numbers (vectors) that represent the semantic meaning of text. Words or sentences with similar meanings will have vectors that are closer together in a high-dimensional mathematical space.

Here is a quick example of how you can generate embeddings using the official OpenAI Python package:

import os
from openai import OpenAI

# Initialize the client (reads the API key from the environment)
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

def get_embedding(text, model="text-embedding-3-small"):
    # Newlines can degrade embedding quality, so replace them with spaces
    text = text.replace("\n", " ")
    return client.embeddings.create(input=[text], model=model).data[0].embedding

# Generate embeddings for two semantically similar phrases
vector_a = get_embedding("I love programming in Python.")
vector_b = get_embedding("Coding in Python is my favorite thing to do.")

print(f"Vector A length: {len(vector_a)}")
print(f"First 5 dimensions of A: {vector_a[:5]}")

These vectors typically have hundreds or thousands of dimensions (e.g., 1536 for OpenAI’s text-embedding-3-small). To see how similar two pieces of text are, we calculate the cosine similarity between their vectors. If the cosine similarity is close to 1, the texts are highly related.
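Cosine similarity is just the dot product of two vectors divided by the product of their lengths. Here is a minimal sketch using only the standard library; the tiny 3-dimensional vectors are made up for illustration (real embeddings like the ones above have hundreds of dimensions), so the exact scores are not meaningful, only their relative order:

```python
import math

def cosine_similarity(a, b):
    """Compute cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy vectors: two that point in nearly the same direction,
# and two that are perpendicular (completely unrelated).
similar = cosine_similarity([1.0, 2.0, 3.0], [1.1, 2.1, 2.9])
unrelated = cosine_similarity([1.0, 0.0, 0.0], [0.0, 1.0, 0.0])

print(f"Similar vectors:   {similar:.4f}")    # close to 1
print(f"Unrelated vectors: {unrelated:.4f}")  # 0.0
```

In practice you would pass vector_a and vector_b from the earlier snippet into cosine_similarity instead of these toy values.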

Putting It All Together

By storing these vectors in a vector database, you can give your AI applications long-term memory. When a user asks a question, you generate an embedding for the question, search the database for the vectors closest to it, and feed the original text of those matches into your LLM as context.
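The retrieval step can be sketched with a plain Python list standing in for the vector database. The embeddings here are tiny hand-made vectors rather than real API output, so the example runs without an API key; in a real application each vector would come from a call like get_embedding above:

```python
import math

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

# A minimal in-memory "vector DB": (text, embedding) pairs.
# The 3-dimensional vectors are hand-made stand-ins for real embeddings.
documents = [
    ("Python is great for scripting.", [0.9, 0.1, 0.0]),
    ("The weather is sunny today.",    [0.0, 0.2, 0.9]),
    ("I enjoy writing Python code.",   [0.8, 0.3, 0.1]),
]

def search(query_vector, docs, top_k=2):
    """Return the texts of the top_k documents most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in docs]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:top_k]]

# A query vector pointing in a "Python-related" direction
results = search([0.85, 0.2, 0.05], documents)
print(results)  # the two Python-related documents rank highest
```

A real vector database does the same nearest-neighbor search, but with approximate indexing so it stays fast across millions of vectors; the texts returned by search are what you would paste into your LLM prompt as context.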

Conclusion & Next Steps

You’ve just learned what embeddings are and how to generate them! They are the fundamental building block for advanced AI patterns like Retrieval-Augmented Generation (RAG). Next Steps: Try generating embeddings for a small text file you have locally and calculating the similarities between different paragraphs. Questions? Drop a comment below!