LAB 4 | Hybrid Search

Three Ways to Search

Every search strategy has blind spots. See exactly what each approach finds and what it misses, then see why hybrid search captures the best of both vector and keyword search.

VECTOR SEARCH

Understands meaning

Converts your query into a vector of 1,536 numbers (an embedding generated by an OpenAI model).

Finds chunks whose vectors point in nearly the same direction, even if the words differ.

✓ "time off" finds "annual leave"

✓ "vacation days" finds "PTO policy"

✗ Can miss exact keyword matches like "W-77B"

✗ Can return vague semantic neighbors
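
As a rough illustration of "pointing in the same direction", the sketch below ranks chunks by cosine similarity. It assumes cosine similarity is the comparison being used (the lab does not say which measure it applies), and it uses made-up 3-number vectors in place of real 1,536-number embeddings. The names `cosineSimilarity` and `rankByVector` and all sample data are hypothetical.

```javascript
// Cosine similarity: 1 means same direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error("vectors must have the same length");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank chunks by similarity to the query vector, best match first.
function rankByVector(queryVec, chunks) {
  return chunks
    .map((c) => ({ id: c.id, score: cosineSimilarity(queryVec, c.vector) }))
    .sort((x, y) => y.score - x.score);
}

// Toy data: invented 3-dimensional vectors, not real embeddings.
const toyChunks = [
  { id: "annual-leave", vector: [0.9, 0.1, 0.0] },
  { id: "expense-form", vector: [0.0, 0.2, 0.9] },
];
const toyQuery = [0.8, 0.2, 0.1]; // stands in for the embedding of "time off"
const vectorRanking = rankByVector(toyQuery, toyChunks);
console.log(vectorRanking);
```

Because the score depends only on direction, two texts with no words in common can still rank highly, which is also why an exact identifier such as "W-77B" may get no special treatment.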

BM25 KEYWORD

Counts exact words

Scores each chunk by how often your exact words appear (TF), boosted when those words are rare across all chunks (IDF).

Runs entirely in your browser — pure JavaScript, no Elasticsearch, no server.

✓ "W-77B form" finds the exact form name

✓ Fast, explainable, zero AI cost

✗ "time off" misses "annual leave"

✗ Synonyms and paraphrases that share no words with the query score zero
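
The TF and IDF description above can be sketched as a small BM25 scorer. This is a minimal sketch, not the lab's actual code: the tokenizer, the commonly used defaults k1 = 1.2 and b = 0.75, and the Lucene-style IDF formula are all assumptions, and names such as `buildBm25Index` and `bm25Score` are hypothetical. Standard BM25 also normalizes for chunk length, which the sketch includes.

```javascript
// Lowercase and split on anything that is not a letter, digit, or hyphen,
// so "W-77B" stays a single token. (Assumed tokenizer.)
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9-]+/).filter(Boolean);
}

// Precompute per-chunk term counts, document frequencies, and average length.
function buildBm25Index(chunks) {
  const docs = chunks.map((text) => {
    const tokens = tokenize(text);
    const tf = new Map();
    for (const t of tokens) tf.set(t, (tf.get(t) || 0) + 1);
    return { tf, length: tokens.length };
  });
  const df = new Map();
  for (const d of docs) {
    for (const t of d.tf.keys()) df.set(t, (df.get(t) || 0) + 1);
  }
  const avgLength = docs.reduce((sum, d) => sum + d.length, 0) / (docs.length || 1);
  return { docs, df, avgLength };
}

// Score one chunk against a query. Only exact token matches contribute.
function bm25Score(index, docIdx, query, k1 = 1.2, b = 0.75) {
  const doc = index.docs[docIdx];
  const N = index.docs.length;
  let score = 0;
  for (const term of new Set(tokenize(query))) {
    const tf = doc.tf.get(term) || 0;
    if (tf === 0) continue; // term absent from this chunk
    const n = index.df.get(term); // number of chunks containing the term
    const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5)); // rarer term, bigger boost
    const lengthNorm = 1 - b + b * (doc.length / index.avgLength);
    score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
  }
  return score;
}

// Invented sample chunks.
const bm25Chunks = [
  "Submit the W-77B form to payroll before month end.",
  "Employees accrue annual leave every pay period.",
];
const bm25Index = buildBm25Index(bm25Chunks);
const formScores = bm25Chunks.map((_, i) => bm25Score(bm25Index, i, "W-77B form"));
const timeOffScores = bm25Chunks.map((_, i) => bm25Score(bm25Index, i, "time off"));
console.log(formScores, timeOffScores);
```

In this toy data the query "W-77B form" scores only the first chunk, while "time off" scores nothing at all, because neither word appears in the chunk about annual leave.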

HYBRID = BEST OF BOTH

Weighted combination

Blends both scores using a single parameter α (alpha):

hybrid = α × vector_score + (1−α) × bm25_score

✓ Catches exact keywords AND semantic matches

✓ α is fixed at 0.5 here. Tune it in Lab 5.

✗ Requires both embeddings and a BM25 index

✗ α needs tuning per use case
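
The weighted combination above can be sketched as follows. One thing the formula leaves open is scale: raw BM25 scores are unbounded, while similarity scores usually sit in a narrow range, so this sketch min-max normalizes each score list to 0..1 before blending. That normalization step is an assumption rather than something the lab states, and `minMaxNormalize`, `hybridScores`, and the sample scores are hypothetical.

```javascript
// Rescale a list of scores to the range 0..1. (Assumed step, not from the lab.)
function minMaxNormalize(scores) {
  const min = Math.min(...scores);
  const max = Math.max(...scores);
  if (max === min) return scores.map(() => 0); // all equal: no ranking signal
  return scores.map((s) => (s - min) / (max - min));
}

// hybrid = alpha * vector_score + (1 - alpha) * bm25_score, computed per chunk.
function hybridScores(vectorScores, bm25Scores, alpha = 0.5) {
  if (vectorScores.length !== bm25Scores.length) {
    throw new Error("score lists must cover the same chunks");
  }
  const v = minMaxNormalize(vectorScores);
  const k = minMaxNormalize(bm25Scores);
  return v.map((vs, i) => alpha * vs + (1 - alpha) * k[i]);
}

// Invented scores for three chunks.
const sampleVector = [0.9, 0.5, 0.7]; // chunk 0 is the best semantic match
const sampleBm25 = [0, 4, 3];         // chunk 1 is the best keyword match
const blended = hybridScores(sampleVector, sampleBm25, 0.5);
console.log(blended);
```

With α = 0.5, the third chunk ranks first in this toy data because it scores reasonably well on both signals. Setting α to 1 reduces the blend to pure vector search, and setting it to 0 reduces it to pure BM25.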

📚 BM25 runs entirely in your browser · Pure JavaScript · No Elasticsearch · No server required · Index built from the same chunks as vector search

Try these to see the difference: