
Embedding Models: Choosing Between OpenAI, Azure, and Open Source

Embedding models convert text into dense vector representations, enabling semantic search and similarity comparisons. The model you choose affects both the quality and the cost of your AI applications.
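To make "similarity comparison" concrete, here is a minimal sketch of cosine similarity, a common way to compare two embeddings. The three-dimensional vectors and their names are made-up stand-ins for illustration; real embeddings have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings (illustrative values only)
query = [0.9, 0.1, 0.0]
doc_about_search = [0.8, 0.2, 0.1]
doc_about_cooking = [0.0, 0.1, 0.9]

print(cosine_similarity(query, doc_about_search))   # close to 1.0: similar direction
print(cosine_similarity(query, doc_about_cooking))  # close to 0.0: nearly orthogonal
```

Texts with related meaning should produce vectors that point in similar directions, which is what semantic search relies on.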

Azure OpenAI Embeddings

Azure OpenAI offers text-embedding-ada-002 and the newer text-embedding-3 models (text-embedding-3-small and text-embedding-3-large), which provide high-quality embeddings with minimal setup.

import os

from openai import AzureOpenAI

client = AzureOpenAI(
    api_key=os.environ["AZURE_OPENAI_KEY"],
    api_version="2024-08-01-preview",
    azure_endpoint=os.environ["AZURE_OPENAI_ENDPOINT"]
)

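def get_embeddings(
    texts: list[str],
    model: str = "text-embedding-3-small",
    dimensions: int | None = None,
) -> list[list[float]]:
    """Get embeddings for a list of texts.

    On Azure, `model` is the name of your deployment. `dimensions` is only
    supported by text-embedding-3 models, so leave it as None for ada-002.
    """
    # Only send `dimensions` when the caller asks for a reduced size
    kwargs = {"dimensions": dimensions} if dimensions is not None else {}
    response = client.embeddings.create(model=model, input=texts, **kwargs)

    return [item.embedding for item in response.data]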

# Usage
texts = ["How to implement RAG", "Building search applications"]
embeddings = get_embeddings(texts)

Open Source Alternatives

The Sentence Transformers library generates embeddings locally, with no API calls.

from sentence_transformers import SentenceTransformer

# Load a high-quality open source model
model = SentenceTransformer('BAAI/bge-large-en-v1.5')

def get_local_embeddings(texts: list[str]) -> list[list[float]]:
    """Generate embeddings locally."""
    embeddings = model.encode(texts, normalize_embeddings=True)
    return embeddings.tolist()

# For multilingual support
multilingual_model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-mpnet-base-v2')
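Because normalize_embeddings=True returns unit-length vectors, a plain dot product already equals cosine similarity, so ranking documents against a query reduces to sorting dot products. The sketch below assumes that property; top_k and the toy two-dimensional vectors are hypothetical names and values for illustration, not part of Sentence Transformers.

```python
def top_k(
    query_vec: list[float], doc_vecs: list[list[float]], k: int = 3
) -> list[tuple[int, float]]:
    """Return (index, score) pairs for the k highest dot products.

    Assumes every vector is already unit length, so the dot product
    is the cosine similarity.
    """
    scores = [
        (i, sum(q * d for q, d in zip(query_vec, doc)))
        for i, doc in enumerate(doc_vecs)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


# Toy unit vectors standing in for the output of an embedding function
docs = [[1.0, 0.0], [0.6, 0.8], [0.0, 1.0]]
query = [0.8, 0.6]

# The second document (index 1) points closest to the query, then index 0
print(top_k(query, docs, k=2))
```

For larger collections, a vector database or an approximate nearest neighbor index would typically replace this linear scan.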

Comparison Table

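| Model | Dimensions | Cost | Latency | Quality |
| --- | --- | --- | --- | --- |
| text-embedding-3-small | 512-1536 | Low | Low | Good |
| text-embedding-3-large | 256-3072 | Medium | Low | Excellent |
| BGE-large-en | 1024 | Free | Medium | Excellent |
| E5-large-v2 | 1024 | Free | Medium | Very Good |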
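The dimension count also drives storage cost once vectors are indexed. As a rough sketch, assuming each value is stored as a 4-byte float32 and ignoring any index overhead (both are assumptions, and index_size_mb is a hypothetical helper), the raw size per million vectors works out as follows:

```python
def index_size_mb(num_vectors: int, dimensions: int, bytes_per_value: int = 4) -> float:
    """Raw vector storage in megabytes (10**6 bytes), excluding index overhead."""
    return num_vectors * dimensions * bytes_per_value / 1_000_000


for name, dims in [
    ("text-embedding-3-small (reduced)", 512),
    ("BGE-large-en", 1024),
    ("text-embedding-3-small (full)", 1536),
    ("text-embedding-3-large (full)", 3072),
]:
    print(f"{name}: {index_size_mb(1_000_000, dims):,.0f} MB per million vectors")
```

Halving the dimensions halves the raw storage, which is one reason the reducible dimensions of the text-embedding-3 models can matter at scale.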

Selection Criteria

Use Azure OpenAI embeddings when you need simplicity and consistent quality and can afford API costs. Choose open source when you need to control costs at scale, require offline operation, or have specific domain requirements that benefit from fine-tuning.
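These criteria can be written down as a small decision helper. This is only a sketch of the rule of thumb above; the Requirements fields and the recommend_backend function are hypothetical names, and a real decision would also weigh quality benchmarks on your own data.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    needs_offline: bool = False            # must run without network access
    needs_fine_tuning: bool = False        # domain-specific tuning is expected to help
    cost_sensitive_at_scale: bool = False  # API costs would dominate at your volume


def recommend_backend(req: Requirements) -> str:
    """Return "open-source" if any open source criterion applies, else "azure-openai"."""
    if req.needs_offline or req.needs_fine_tuning or req.cost_sensitive_at_scale:
        return "open-source"
    return "azure-openai"


print(recommend_backend(Requirements()))                    # azure-openai
print(recommend_backend(Requirements(needs_offline=True)))  # open-source
```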

Michael John Peña

Senior Data Engineer based in Sydney. Writing about data, cloud, and technology.