RAG • Chapter 2

Text Embeddings & Embedding Models

RAG engineering module on Text Embeddings & Embedding Models.

6 note blocks · 4 exam topics

🎯 Exam Focus Areas

Evaluate chunking and embedding strategies.
Understand Vector DB indexing architectures like HNSW.
Analyze RAG prompts for injection vulnerabilities.
Calculate and utilize RAGAS evaluation metrics.

Embeddings are numerical representations of text in which semantic meaning is captured as a position in a high-dimensional vector space. In this space, words or sentences with similar meanings lie closer together, as measured by a metric such as cosine similarity.

Advanced System Mechanics

Historically, techniques like TF-IDF (sparse keyword weights) or Word2Vec (one static vector per word) were used, but modern RAG relies on Transformer-based embedding models (e.g., OpenAI's text-embedding-ada-002, or open-source models like BGE and MiniLM). These models generate dense vectors (often 384 to 1536 dimensions) that capture contextual relationships across a whole sentence or passage, rather than just keyword frequencies.
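To make the limitation of keyword-based representations concrete, here is a minimal sketch that scores two paraphrases using raw word counts only. The helper names `bow_vector` and `sparse_cosine` are hypothetical, and the whitespace tokenizer is a deliberate simplification (no IDF weighting, stemming, or stop-word removal), so treat it as an illustration rather than a real TF-IDF implementation.

```python
import math
from collections import Counter

def bow_vector(text):
    # Naive bag-of-words: lowercase, split on whitespace, count each term.
    return Counter(text.lower().split())

def sparse_cosine(a, b):
    # Cosine similarity between two sparse term-count vectors.
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

s0 = bow_vector("The cat sits outside")
s2 = bow_vector("The feline rests outdoors")

# The only shared token is "the", so the keyword-level score stays low
# even though the two sentences mean nearly the same thing.
keyword_score = sparse_cosine(s0, s2)
print(f"Keyword overlap (Cat/Feline): {keyword_score:.4f}")
```

The two sentences share a single token out of four each, so the count-based score is 0.25. A dense embedding model is expected to rate the same pair as far more similar, because it encodes meaning rather than surface vocabulary.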

1. Understand the vector space implications of this concept.
2. Identify potential hallucination risks.
3. Optimize for low latency and high relevance.
4. Ensure robust system prompts.

Implementation Blueprint

import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer('all-MiniLM-L6-v2')

sentences = ["The cat sits outside", "A man is playing guitar", "The feline rests outdoors"]
embeddings = model.encode(sentences)

# Cosine similarity: dot product divided by the product of the vector norms
def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Compare sentence 0 and sentence 2, which share meaning but almost no words
print(f"Similarity (Cat/Feline): {cosine_similarity(embeddings[0], embeddings[2]):.4f}")
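In a RAG pipeline, the same cosine score is typically used to rank stored chunks against a query embedding. The sketch below shows that ranking step in plain Python. The 3-dimensional vectors are made-up stand-ins for real model output, and `top_k` is a hypothetical helper, not part of sentence_transformers or any vector database API.

```python
import math

def cosine(a, b):
    # Cosine similarity for two equal-length lists of floats.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_vec, doc_vecs, k=2):
    # Score every document against the query, then keep the k best.
    scored = [(cosine(query_vec, vec), idx) for idx, vec in enumerate(doc_vecs)]
    scored.sort(reverse=True)
    return [(idx, score) for score, idx in scored[:k]]

# Assumed toy embeddings: documents 0 and 2 point in roughly the same
# direction as the query, while document 1 does not.
doc_vecs = [
    [0.9, 0.1, 0.0],
    [0.0, 0.2, 0.9],
    [0.8, 0.3, 0.1],
]
query_vec = [1.0, 0.2, 0.0]

ranked = top_k(query_vec, doc_vecs, k=2)
for idx, score in ranked:
    print(f"doc {idx}: {score:.4f}")
```

This brute-force scan compares the query with every stored vector, which is fine for a handful of chunks. At larger scale, an approximate index such as HNSW usually replaces the exhaustive loop.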

📝 Quick Revision Points

1. Review the differences between similarity metrics.
2. Practice the LangChain/LlamaIndex code snippets.
3. Understand the HyDE architecture deeply.
4. Memorize the security guardrail implementations.
