Hybrid Retrieval Explained

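LH42 uses a hybrid retrieval approach that combines three complementary methods: dense vectors, sparse vectors, and BM25. Each method ranks documents on its own, and the rankings are then fused into a single result list.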

The Three Pillars

1. Dense Vectors (Semantic Search)

Dense vectors capture semantic meaning. Texts with similar meanings map to nearby vectors, even when they share no words.

"car" ≈ "automobile" ≈ "vehicle"

We use BGE-M3, a multilingual embedding model that:

Supports 100+ languages
Captures nuanced meaning
Handles synonyms and paraphrasing
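
To make "similar vectors" concrete, here is a minimal sketch of cosine similarity, a common way to compare dense vectors. The three-dimensional vectors are made up for illustration and do not come from BGE-M3; the function name is also hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Made-up toy vectors (assumption); real embeddings have far more dimensions.
toy_vectors = {
    "car": [0.9, 0.1, 0.0],
    "automobile": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}

sim_close = cosine_similarity(toy_vectors["car"], toy_vectors["automobile"])
sim_far = cosine_similarity(toy_vectors["car"], toy_vectors["banana"])
```

With these toy values, "car" and "automobile" score close to 1 while "car" and "banana" score close to 0, which is the behavior a semantic search relies on.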

2. Sparse Vectors (SPLADE)

Sparse vectors identify important keywords and expand queries with related terms.

Query: "ML model training"
Expanded: ["machine learning", "neural network", "training", "optimization", "model"]

Benefits:

Explicit term matching
Query expansion
Handles rare terms well
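
As a rough illustration of how sparse vectors are matched, the sketch below stores each vector as a term-to-weight mapping and scores a document by the dot product over shared terms. The weights are invented for the example; a real SPLADE model would produce them, and the helper name is hypothetical.

```python
def sparse_dot(query_vec, doc_vec):
    """Dot product of two sparse vectors stored as {term: weight} dicts."""
    # Iterate over the smaller dict so the loop is as short as possible.
    if len(query_vec) > len(doc_vec):
        query_vec, doc_vec = doc_vec, query_vec
    return sum(w * doc_vec[t] for t, w in query_vec.items() if t in doc_vec)

# Hypothetical weights for the expanded query "ML model training".
query_vec = {"machine learning": 1.2, "training": 1.0, "model": 0.8, "optimization": 0.4}
doc_a = {"training": 0.9, "model": 0.7, "gpu": 0.5}
doc_b = {"recipe": 1.1, "oven": 0.6}

score_a = sparse_dot(query_vec, doc_a)  # shares "training" and "model"
score_b = sparse_dot(query_vec, doc_b)  # shares no terms
```

Only terms present in both vectors contribute, so a document that shares no terms with the expanded query scores zero.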

3. BM25 (Keyword Matching)

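BM25 is a classic ranking function. It scores a document by how often each query term appears in it, weighted by the term's rarity across the collection (IDF) and normalized by document length.

Score = Σ IDF(term) × TF(term, doc) × (k1 + 1) / (TF(term, doc) + k1 × (1 - b + b × docLen/avgDocLen))

Here the sum runs over the query terms, TF(term, doc) is the number of times the term occurs in the document, k1 controls how quickly repeated occurrences stop adding to the score, and b controls how strongly long documents are penalized.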

Benefits:

Fast and efficient
Handles exact matches
No model required
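
The BM25 formula can be sketched in a few lines of plain Python. This is an illustrative implementation, not LH42's internal code: the defaults k1=1.5 and b=0.75 and the smoothed IDF variant are common choices assumed here, and tokenization is a simple whitespace split.

```python
import math
from collections import Counter

def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Score each tokenized document against the query terms with BM25."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        score = 0.0
        for term in query_terms:
            if tf[term] == 0:
                continue  # term absent from this document
            # Smoothed IDF (assumption; the formula above leaves IDF unspecified).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "bm25 ranks documents by term frequency".split(),
    "dense vectors capture semantic meaning".split(),
    "bm25 bm25 keyword matching".split(),
]
scores = bm25_scores(["bm25", "keyword"], docs)
```

In this toy corpus the third document scores highest because it matches both query terms, and the second scores zero because it matches neither.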

Reciprocal Rank Fusion (RRF)

We combine results using RRF, which:

Gets top results from each method
Assigns scores based on rank position
Combines scores with configurable weights

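RRF_score(d) = Σ_i w_i / (k + rank_i(d))

Here rank_i(d) is the position of document d in the results of method i (1 for the top result), w_i is the weight of that method, and k is a smoothing constant that dampens the advantage of the very top ranks.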
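
The fusion step might look like the sketch below. It assumes that each method's weight multiplies its reciprocal-rank term and that k defaults to 60, a value commonly used for RRF; neither detail is confirmed for LH42, and the function name is hypothetical.

```python
from collections import defaultdict

def weighted_rrf(rankings, weights, k=60):
    """Fuse ranked lists of document IDs with weighted Reciprocal Rank Fusion.

    rankings: {method: [doc_id, ...]} ordered best first.
    weights:  {method: float} multiplier applied to that method's contribution.
    """
    fused = defaultdict(float)
    for method, ranked_ids in rankings.items():
        w = weights.get(method, 1.0)  # unweighted methods count fully (assumption)
        for rank, doc_id in enumerate(ranked_ids, start=1):
            fused[doc_id] += w / (k + rank)
    # Highest fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)

rankings = {
    "dense": ["d1", "d2", "d3"],
    "sparse": ["d2", "d1", "d4"],
    "bm25": ["d2", "d4", "d1"],
}
weights = {"dense": 0.5, "sparse": 0.3, "bm25": 0.2}
fused = weighted_rrf(rankings, weights)
```

Because only rank positions are used, the raw scores of the three methods never need to be on the same scale. A document that ranks well in several lists, such as d2 here, rises above one that tops a single list.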

Configuring Weights

Tune the balance for your use case:

python

results = client.search(
    query="technical documentation",
    options={
        "weights": {
            "dense": 0.5,   # Semantic similarity
            "sparse": 0.3,  # Term expansion
            "bm25": 0.2     # Exact matching
        }
    }
)

When to Use Each Method

Use Case	Recommended Weights (dense / sparse / bm25)
General search	0.5 / 0.3 / 0.2
Technical docs	0.3 / 0.3 / 0.4
Conversational	0.6 / 0.3 / 0.1
Exact matching	0.2 / 0.2 / 0.6
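
One way to keep these recommendations in code is a small preset table. The sketch assumes the three numbers in each row are in dense / sparse / bm25 order, which matches the general-search row and the configuration example; the preset keys and the helper name are hypothetical.

```python
# Presets copied from the table, assumed to be in dense / sparse / bm25 order.
WEIGHT_PRESETS = {
    "general": {"dense": 0.5, "sparse": 0.3, "bm25": 0.2},
    "technical_docs": {"dense": 0.3, "sparse": 0.3, "bm25": 0.4},
    "conversational": {"dense": 0.6, "sparse": 0.3, "bm25": 0.1},
    "exact_matching": {"dense": 0.2, "sparse": 0.2, "bm25": 0.6},
}

def search_options(use_case):
    """Build an options dict with the weights for a named preset."""
    if use_case not in WEIGHT_PRESETS:
        raise ValueError(f"unknown use case: {use_case!r}")
    # Copy so callers can tweak the weights without changing the preset.
    return {"weights": dict(WEIGHT_PRESETS[use_case])}

options = search_options("technical_docs")
# The result can then be passed along, e.g.:
# results = client.search(query="technical documentation", options=options)
```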

Reranking

After initial retrieval, a reranker scores the top results for final ordering:

python

results = client.search(
    query="...",
    options={
        "rerank": True,
        "rerank_model": "cross-encoder"
    }
)
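
Conceptually, reranking rescores each candidate against the query and re-sorts. The sketch below shows that shape with a pluggable scoring function. The scorer used here is a trivial token-overlap stand-in, not a cross-encoder, and all names are hypothetical.

```python
def rerank(query, candidates, score_fn, top_k=3):
    """Re-order candidates by score_fn(query, text), highest first."""
    scored = [(score_fn(query, text), text) for text in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

def overlap_score(query, text):
    """Stand-in scorer: fraction of query tokens that appear in the text.

    A real cross-encoder would pass the query and text through a model together.
    """
    q_tokens = set(query.lower().split())
    t_tokens = set(text.lower().split())
    return len(q_tokens & t_tokens) / len(q_tokens) if q_tokens else 0.0

candidates = [
    "Guide to baking sourdough bread",
    "Training a machine learning model",
    "Model deployment checklist",
]
reranked = rerank("machine learning model training", candidates, overlap_score, top_k=2)
```

Scoring every query-document pair costs more than the initial retrieval, which is why the reranker only sees the top results.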