LAB 4 | Hybrid Search

Three Ways to Search

Every search strategy has blind spots. See exactly what each approach finds and what it misses, then see why hybrid search captures the best of both.

VECTOR SEARCH
Understands meaning

Converts your query into a vector of 1,536 numbers (an embedding generated by an OpenAI model).

Finds chunks whose vectors point in the same direction — even if the words differ.

"time off" finds "annual leave"
"vacation days" finds "PTO policy"
Misses exact keyword matches like "W-77B"
Can return vague semantic neighbors
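
To make "point in the same direction" concrete, here is a minimal sketch that ranks chunks by cosine similarity. It assumes cosine similarity is the comparison used, which this lab does not state explicitly. The three-number vectors are made-up stand-ins for real 1,536-dimensional embeddings, and `cosineSimilarity`, `queryVec`, and `chunks` are hypothetical names.

```javascript
// Cosine similarity: 1 means same direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // avoid dividing by zero
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy vectors (assumed values, not real embeddings).
const queryVec = [0.9, 0.1, 0.0]; // stands in for the query "time off"
const chunks = [
  { text: "annual leave", vec: [0.8, 0.2, 0.1] },
  { text: "W-77B form", vec: [0.0, 0.1, 0.9] },
];

// Score every chunk, then sort from most to least similar.
const ranked = chunks
  .map((c) => ({ text: c.text, score: cosineSimilarity(queryVec, c.vec) }))
  .sort((x, y) => y.score - x.score);

console.log(ranked[0].text); // "annual leave"
```

The query and "annual leave" share no words, yet their vectors nearly align, so that chunk ranks first.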
BM25 KEYWORD
Counts exact words

Scores each chunk by how often your exact words appear (TF), boosted when those words are rare across all chunks (IDF).

Runs entirely in your browser — pure JavaScript, no Elasticsearch, no server.

"W-77B form" finds exact form name
Fast, explainable, zero AI cost
"time off" misses "annual leave"
Chunks that use only synonyms or paraphrases score zero
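
The TF and IDF pieces can be sketched in a few lines of plain JavaScript. This is an illustrative Okapi BM25 scorer, not necessarily the code this lab runs. The tokenizer, the defaults `k1 = 1.2` and `b = 0.75`, the IDF variant, and the sample chunks are all assumptions.

```javascript
// Lowercase and split on anything that is not a letter, digit, or hyphen,
// so "W-77B" survives as the single token "w-77b".
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9-]+/).filter(Boolean);
}

// Score every chunk against the query with Okapi BM25.
// k1 and b are conventional defaults, assumed here.
function bm25Scores(query, chunks, k1 = 1.2, b = 0.75) {
  const docs = chunks.map(tokenize);
  const N = docs.length;
  const avgLen = docs.reduce((sum, d) => sum + d.length, 0) / N;

  // Document frequency: how many chunks contain each term.
  const df = new Map();
  for (const d of docs) {
    for (const term of new Set(d)) {
      df.set(term, (df.get(term) || 0) + 1);
    }
  }

  const queryTerms = [...new Set(tokenize(query))];
  return docs.map((d) => {
    let score = 0;
    for (const term of queryTerms) {
      const tf = d.filter((t) => t === term).length; // term frequency (TF)
      if (tf === 0) continue; // no exact match contributes nothing
      const n = df.get(term);
      // A common non-negative IDF variant: rarer terms get a bigger boost.
      const idf = Math.log(1 + (N - n + 0.5) / (n + 0.5));
      const lengthNorm = 1 - b + (b * d.length) / avgLen;
      score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
    }
    return score;
  });
}

const bmChunks = [
  "Submit the W-77B form to request annual leave",
  "Annual leave accrues monthly for all staff",
  "The cafeteria opens at eight",
];
const w77 = bm25Scores("W-77B form", bmChunks);
const timeOff = bm25Scores("time off", bmChunks);
console.log(w77, timeOff);
```

Only the first chunk scores above zero for "W-77B form", while "time off" scores zero everywhere because neither word appears in any chunk, even though two chunks discuss annual leave.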
HYBRID = BEST OF BOTH
Weighted combination

Blends both scores using a single parameter α (alpha), where α = 1 is pure vector search and α = 0 is pure BM25:

hybrid = α × vector_score + (1−α) × bm25_score

Catches exact keywords AND semantic matches
Fixed α = 0.5 here. Tune it in Lab 5.
Requires both embeddings and BM25 index
α needs tuning per use-case
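
The blend formula above can be sketched as follows. Raw BM25 scores have no upper bound while similarity scores usually sit near the 0 to 1 range, so this sketch first rescales both lists with min-max normalization. That normalization step is an assumption, not something this lab specifies, and `minMax`, `hybridScores`, and the sample scores are hypothetical.

```javascript
// Rescale a list of scores to [0, 1] so the two signals are comparable.
function minMax(scores) {
  const lo = Math.min(...scores);
  const hi = Math.max(...scores);
  if (hi === lo) return scores.map(() => 0); // all equal: no signal
  return scores.map((s) => (s - lo) / (hi - lo));
}

// hybrid = alpha * vector_score + (1 - alpha) * bm25_score, per chunk.
function hybridScores(vectorScores, bm25Raw, alpha = 0.5) {
  if (vectorScores.length !== bm25Raw.length) {
    throw new Error("score lists must cover the same chunks");
  }
  const v = minMax(vectorScores);
  const k = minMax(bm25Raw);
  return v.map((vs, i) => alpha * vs + (1 - alpha) * k[i]);
}

// Made-up scores for three chunks.
const vecScores = [0.82, 0.8, 0.1]; // chunks 0 and 1 are both semantically close
const kwScores = [0.0, 3.1, 0.0]; // only chunk 1 contains the exact keyword
const blended = hybridScores(vecScores, kwScores);

console.log(blended.indexOf(Math.max(...blended))); // 1
```

With α = 0.5, chunk 1 wins because it is both semantically close and an exact keyword match. Passing `alpha = 1` reproduces the pure vector ranking, where chunk 0 comes first.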
📚 The BM25 index is built from the same chunks as vector search.
Try these to see the difference: