Introduction
You can find all the C# code samples here: RAG GitHub Repository
Before reading this post I recommend reading:
- Introduction to embeddings: Capture the meaning of data
- Vector Search in Azure AI Search: A Practical Guide
- Naive RAG Explained: The Core Pattern
Hey all!
One of the biggest traps I see developers fall into when building RAG (Retrieval Augmented Generation) pipelines is going “all-in” on vector search. Don’t get me wrong… vectors are great for capturing semantic meaning, but they are not so good at finding specific part numbers, SKUs, IDs, or exact product names.
In this blog post, I break down why the most robust architecture isn’t “either/or”: it’s Hybrid Search in RAG. By combining the battle-tested BM25 algorithm with modern Vector Search, we can build a retrieval system that is both smart and precise.
Hybrid Search in RAG is considered the best approach for most setups (though there are scenarios where BM25 or Vector Search alone might be the better option).
This blog post is all about how I built this Hybrid Search in Azure AI Search for the “Galactic Voyages” demo fleet FAQ system (see an example of such a starship below).
{
  "Id": "gv-shuttle-001",
  "Title": "Aurora-Class Shuttle",
  "ProductId": "GV-99-ALPHA",
  "Category": "Shuttle",
  "Overview": "The Aurora-Class Shuttle is a small ship used for short trips between nearby planets and stations.",
  "Specifications": {
    "TopSpeed": "Warp 2.1",
    "Fuel": "Helium-3 Capsules",
    "Seats": 32,
    "ArtificialGravity": false
  },
  "Features": [
    "Simple seating",
    "Automatic safety announcements"
  ],
  "Notes": "The Aurora-Class is reliable and easy to maintain, making it the everyday shuttle of the fleet."
}

The Secret Sauce: The Inverted Index
To understand why full-text search is so fast, you have to understand the Inverted Index. When I push starship data into Azure AI Search, the engine doesn’t just store the JSON. It parses every field marked as searchable and builds a specialized data structure.

Think of it as a mapping where the “Term” is the key and the “Document IDs” are the values. But it goes deeper than just storage. Before a word enters the index, it passes through a Lexical Analyzer. This process:
- Lowercases everything for consistency.
- Removes Stop Words (like “the,” “is,” “a”) that don’t add value.
- Converts words to their root form (so “flying” and “fly” both point to the same entry).
For now, all you need to remember is that Azure AI Search builds an inverted index for every field in your index that is marked as Searchable and does not store vectors (like OverviewVector and NotesVector in this example). In this specific scenario that means 9 inverted indexes.
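To make the idea concrete, here is a minimal sketch (plain C#, no Azure SDK) of the analyze-then-index flow. The tiny stop-word list and the suffix-stripping “stemmer” are my own simplifications for illustration, not the actual Lucene analyzer:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class InvertedIndexSketch
{
    private static readonly HashSet<string> StopWords = new() { "the", "is", "a", "for" };

    // Simplified lexical analysis: lowercase, drop stop words, crude "stemming".
    public static IEnumerable<string> Analyze(string text) =>
        text.Split(' ', StringSplitOptions.RemoveEmptyEntries)
            .Select(t => t.ToLowerInvariant().Trim('.', ','))
            .Where(t => !StopWords.Contains(t))
            .Select(t => t.EndsWith("ing") ? t[..^3] : t); // e.g. "flying" -> "fly"

    // Term -> set of document ids containing that term.
    public static Dictionary<string, HashSet<string>> Build(Dictionary<string, string> docs)
    {
        var index = new Dictionary<string, HashSet<string>>();
        foreach (var (id, text) in docs)
        {
            foreach (var term in Analyze(text))
            {
                if (!index.TryGetValue(term, out var ids))
                    index[term] = ids = new HashSet<string>();
                ids.Add(id);
            }
        }
        return index;
    }
}
```

Looking up a query term is then a single dictionary access per term, which is exactly why lexical retrieval is so fast.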

QueryType, SearchMode and SearchFields
Let’s talk about what a query sent to an Azure AI Search index looks like.
{
  "search": "Solar Sails",
  "queryType": "simple",
  "searchMode": "any",
  "count": true
}

The two properties you need to understand first are QueryType and SearchMode. These two settings determine how strictly or flexibly the Azure AI Search engine interprets your user’s input.
- QueryType: This is your toggle between the Simple and Full query parsers. While Simple is great for basic keyword matching, switching to Full is what unlocks the more advanced retrieval features.
- SearchMode: This defines the logic used to include a document in the results. Any (the default) means a document matches if at least one of your search terms is found, whereas All requires every single term to be present, drastically increasing the precision of your retrieval, which is not always desired (it depends on your use case).
Another property you should know is searchFields. The rule is really simple: if you want BM25 in Azure AI Search to take only specific fields into account (marked as searchable, of course), you list them in that property; otherwise all searchable fields are used during the retrieval operation.
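In the C# SDK (Azure.Search.Documents), the same three knobs live on SearchOptions. A sketch, assuming the field names from the demo index:

```csharp
using Azure.Search.Documents;
using Azure.Search.Documents.Models;

var options = new SearchOptions
{
    QueryType = SearchQueryType.Simple,    // SearchQueryType.Full enables Lucene syntax
    SearchMode = SearchMode.Any,           // SearchMode.All requires every term to match
    IncludeTotalCount = true,              // maps to "count": true in the REST payload
    SearchFields = { "Overview", "Notes" } // restrict BM25 to these searchable fields
};
// var response = await searchClient.SearchAsync<StarshipSearchDocumentResult>("Solar Sails", options);
```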
Beyond “Simple” Queries: Lucene Syntax
Most of us start with simple keyword searches, but Azure AI Search supports the Lucene query language. This is where you can get really creative with your retrieval logic.
First of all, we must indicate that we want to use the Lucene query syntax by setting the queryType property to “full”. By doing that, Azure AI Search starts interpreting your query in a different way (it builds a query tree with various expressions supporting the Lucene syntax).
One commonly used feature is Fuzzy Search. By using the Levenshtein distance, we can handle user typos gracefully: the engine calculates how many character changes are needed to turn the misspelled term into an indexed one.
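Under the hood, this is the classic Levenshtein edit distance. Here is a textbook dynamic-programming sketch of it (my own illustration, not the engine’s actual implementation, which also caps the distance at 2):

```csharp
using System;

public static class Fuzzy
{
    // Levenshtein distance: the minimum number of single-character edits
    // (insertions, deletions, substitutions) to turn string a into string b.
    public static int Levenshtein(string a, string b)
    {
        var d = new int[a.Length + 1, b.Length + 1];
        for (int i = 0; i <= a.Length; i++) d[i, 0] = i;
        for (int j = 0; j <= b.Length; j++) d[0, j] = j;

        for (int i = 1; i <= a.Length; i++)
            for (int j = 1; j <= b.Length; j++)
            {
                int cost = a[i - 1] == b[j - 1] ? 0 : 1;        // substitution cost
                d[i, j] = Math.Min(
                    Math.Min(d[i - 1, j] + 1, d[i, j - 1] + 1), // deletion / insertion
                    d[i - 1, j - 1] + cost);
            }
        return d[a.Length, b.Length];
    }
}
```

“Saiils” is one deletion away from “Sails”, so a distance of 1 is enough to bridge the typo.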
Query (note the double “i” in the “Saiils” term):
{
  "search": "Solar Saiils~1",
  "queryType": "full",
  "searchMode": "all",
  "count": true
}

It found the result because I specified “~1” as the allowed edit distance.
{
  "@odata.context": "https://deployed-in-azure-aisearch.search.windows.net/indexes('starships-index')/$metadata#docs(*)",
  "@odata.count": 1,
  "value": [
    {
      "@search.score": 3.609089,
      "Id": "gv-luxury-007",
      "Title": "Celestial-Cruise-Liner",
      "ProductId": "GV-55-ECHO",
      "Category": "Luxury",
      "Overview": "An opulent passenger vessel offering premium travel experiences between core galactic hubs.",
      "TopSpeed": "Warp 2.5",
      "Fuel": "Solar Sails",
      "Seats": 500,
      "ArtificialGravity": true,
      "Features": [
        "Panoramic observation deck",
        "Hydroponic gardens"
      ],
      "Notes": "The pinnacle of luxury travel, where the journey is more important than the destination."
    }
  ]
}

Other useful Lucene query features:
- Regular Expression Search: Pattern-match specific strings by enclosing a lower-case regex between forward slashes. For example, if you need to find documents following a specific ID format like /[a-z]{2}-\d{3}/, this is the perfect way to identify data series without knowing an exact name.
- Proximity Search: Find terms located near each other by adding a tilde ~ and a distance number to a phrase. Searching for "Helium Capsules"~3 will return documents where Helium and Capsules occur within 3 words of one another.
- Term Boosting: Use the caret ^ symbol with a multiplier to prioritize specific terms within a single query. If you search Heavy^2 Lifter, you are telling the engine that the term Heavy is twice as important as Lifter. (If you are familiar with the Scoring Profile concept, the difference is that a Scoring Profile is focused on certain fields within an index, whereas Term Boosting, as the name suggests, is focused on a term. These two techniques can be combined for a “double boost”.)
- Wildcard Search: Perform flexible matching using * for multiple characters or ? for a single character. A search for Stell* will catch any term starting with that prefix, like Stellar.
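From C#, all of these are just strings passed as the search text (with queryType set to full); escaping is the only language-specific detail. The examples below mirror the features above:

```csharp
// Lucene query strings for Azure AI Search (send with queryType = "full").
string fuzzy     = "Solar Saiils~1";         // fuzzy: tolerate 1 character edit
string regex     = @"/[a-z]{2}-\d{3}/";      // regex between forward slashes
string proximity = "\"Helium Capsules\"~3";  // terms within 3 words of each other
string boosted   = "Heavy^2 Lifter";         // "Heavy" weighted twice as much
string wildcard  = "Stell*";                 // prefix match
```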
The 4 stages of the search process
When you send a query to Azure AI Search, it doesn’t just “look for a word.” Behind the scenes, the engine executes a sophisticated pipeline to ensure the results are both fast and accurate. Understanding these four stages is key to troubleshooting why a specific document is (or isn’t) showing up.
- Query Parsing: The engine starts by breaking down your search string into a query tree. If you’re using the Full query type, this is where it identifies your special syntax, like those Fuzzy distances, Regex patterns, Proximity markers, or Term Boosts we discussed, to understand the structural intent of the request.
- Lexical Analysis: Just like during the indexing phase, your search terms are passed through a lexical analyzer. The query is lowercased, stop words are stripped out, and terms are reduced to their root forms (stemming). This ensures that a search for “traveling” matches a document containing “travel.”
- Document Retrieval: This is where the BM25 algorithm does the heavy lifting. The engine scans the Inverted Indexes to find all documents that contain the analyzed terms. It discards the “noise” and pulls the candidate documents into a preliminary result set.
- Scoring and Ordering: In the final stage, the engine assigns a relevance score to each retrieved document based on TF-IDF and BM25 logic. The results are then sorted in descending order by this score (thanks to which the most relevant TOP N starships can be then embedded in a prompt which is sent to an LLM – the essence of the RAG pattern).
How BM25 Scores Your Data
We often talk about “relevance,” but how is it actually calculated? In Azure AI Search, we use BM25 (Best Matching 25). This is an evolution of the classic TF-IDF (Term Frequency – Inverse Document Frequency).
To understand how this math works, imagine you are building a search engine for a real estate agency. A user enters the query: “house with a large garden and a swimming pool.” How should the engine score the results?
- Term Frequency (TF): This measures how often a term appears in a specific document. If a listing mentions “swimming pool” five times, it’s likely more relevant than a listing that mentions it once. The rule is simple: the higher the term frequency, the better the score.
- Inverse Document Frequency (IDF): This measures how unique a term is across your entire index. In a real estate database, the word “house” probably appears in every single document. Because it’s so common, its IDF is very low, it doesn’t help us narrow things down. However, “swimming pool” is much rarer. Therefore, its IDF is high, and the engine gives it much more weight when calculating the final score.
How is BM25 different from TF-IDF?
BM25 builds on the ideas behind TF-IDF but adds important improvements. Both methods boost rare terms and reduce the weight of very common ones, but BM25 also limits how much repeated terms can influence the score and adjusts for document length. These refinements make BM25 more stable and effective in real-world search scenarios.
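A back-of-the-envelope C# sketch makes the difference visible. The formulas below use the common k1 = 1.2 and b = 0.75 defaults from the BM25 literature; the exact constants and IDF smoothing Azure AI Search applies internally may differ:

```csharp
using System;

public static class Bm25Sketch
{
    // Classic TF-IDF: the score grows linearly with term frequency.
    public static double TfIdf(int tf, int totalDocs, int docsWithTerm) =>
        tf * Math.Log((double)totalDocs / docsWithTerm);

    // BM25: term frequency saturates (controlled by k1) and longer-than-average
    // documents are penalized (controlled by b).
    public static double Bm25(int tf, int totalDocs, int docsWithTerm,
                              double docLength, double avgDocLength,
                              double k1 = 1.2, double b = 0.75)
    {
        double idf = Math.Log(1 + (totalDocs - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
        double norm = tf + k1 * (1 - b + b * docLength / avgDocLength);
        return idf * tf * (k1 + 1) / norm;
    }
}
```

With these formulas, going from 5 to 50 occurrences of “swimming pool” barely moves the BM25 score, while the TF-IDF score keeps growing linearly; that saturation is exactly the stability improvement described above.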
Lexical vs. Semantic – Pros and Cons
Now that we know enough about full-text BM25 search, let’s compare it to Vector Search.

As you can see in the graphic above, both search methods have their strong and weak points. Let me break it down simply for you.
Full-Text Search (Lexical / BM25)
I like to think of Full-Text search as the precision tool.
- The Good: If you are looking for an exact match, like a specific error code or a product SKU, it will find it perfectly. It is also incredibly fast, computationally cheap to run, and fully explainable: you can always prove exactly why a document was returned.
- The Bad: It is very literal. It suffers from what I call “synonym blindness”: for example, it doesn’t know that “car” and “automobile” are the same thing (synonym maps can mitigate that problem, but they are not a perfect solution). It also ignores context and intent because it treats search terms very “mechanically”.
Vector Search (Semantic)
Vector search, on the other hand, is the meaning engine.
- The Good: It actually understands the intent behind your query. For example, it can recognize that someone searching for “my phone won’t turn on” is related to “battery diagnostics” or “power issues,” even if those exact words aren’t used. It also handles typos effortlessly and has the “magic” ability to match queries across different languages natively.
- The Bad: Because it looks for similar meanings, it has a “fuzzy” problem. It can struggle to find exact technical strings or specific model numbers. It is also much more resource-intensive, requiring more compute power and therefore being quite expensive. It also operates a bit like a “black box”, making it hard (or even impossible) to explain exactly why one result ranked higher than another (we generate embeddings and then just perform a vector search; all we understand is the math formula, e.g. Cosine similarity).
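That “math formula” is the one transparent piece of the pipeline. A minimal cosine similarity in C# (only the embeddings themselves are opaque; the comparison is just this):

```csharp
using System;

public static class VectorMath
{
    // Cosine similarity: 1.0 = same direction, 0 = orthogonal, -1 = opposite.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length) throw new ArgumentException("Dimensions must match.");
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];     // how much the vectors point the same way
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```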
C# Example
Now, let’s analyze the C# example.
Secure Access using RBAC
In this example, I use the DefaultAzureCredential class (Azure.Identity NuGet package).
Whenever I can, I will encourage you to use Entra ID, RBAC roles, and Managed Identities instead of API keys. If you want to adhere to the best security guidelines, you should never use API keys (read more here).
When running locally, the DefaultAzureCredential falls back to VisualStudioCredential, which can obtain an access token because I’m signed in to Visual Studio under Tools > Options > Azure Service Authentication. In production, this same code will typically use ManagedIdentityCredential instead, without requiring any changes (or EnvironmentCredential if you host some service outside of the Azure environment).
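If you ever want that fallback order to be explicit, Azure.Identity also offers ChainedTokenCredential. The sketch below mirrors the scenarios just described (the order shown is my choice for this example, not the full chain DefaultAzureCredential evaluates):

```csharp
using Azure.Core;
using Azure.Identity;

// An explicit chain covering the scenarios above: tried in order,
// the first credential that can produce a token wins.
TokenCredential credential = new ChainedTokenCredential(
    new ManagedIdentityCredential(),   // production inside Azure
    new VisualStudioCredential(),      // local dev, signed in to Visual Studio
    new EnvironmentCredential());      // self-hosted outside Azure
```

An explicit chain can also speed up local startup, since credential types you never use are skipped entirely.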
Here is the initialization code:
public class HybridRagExample
{
    private readonly EmbeddingClient _embeddingClient;
    private readonly ChatClient _chatClient;
    private readonly SearchClient _searchClient;
    private readonly SearchService _searchService;

    public HybridRagExample()
    {
        var credential = new DefaultAzureCredential();

        _searchClient = new SearchClient(
            new Uri(Environment.GetEnvironmentVariable("AZURE_AI_SEARCH_URI")!),
            indexName: Environment.GetEnvironmentVariable("AZURE_AI_SEARCH_INDEX")!,
            credential);

        var openAiClient = new AzureOpenAIClient(
            new Uri(Environment.GetEnvironmentVariable("AZURE_OPEN_AI_CLIENT_URI")!),
            credential);

        _embeddingClient = openAiClient.GetEmbeddingClient(Environment.GetEnvironmentVariable("AZURE_OPEN_AI_EMBEDDING_CLIENT_DEPLOYMENT_NAME")!);
        _chatClient = openAiClient.GetChatClient(Environment.GetEnvironmentVariable("AZURE_OPEN_AI_EMBEDDING_CHAT_CLIENT_DEPLOYMENT_NAME")!);

        _searchService = new SearchService(_searchClient, _embeddingClient);
    }
}

To make this work securely without API keys, I assigned the following roles:
- Azure AI Search – Search Index Data Contributor
- Microsoft Foundry – Azure AI User
This setup ensures that authentication is handled entirely through Azure AD and RBAC, allowing me to follow best‑practice security guidelines while avoiding hard‑coded secrets.
Implementation in C#
With the infrastructure in place, the next step is to implement the actual search logic. The SearchService class encapsulates all the search methods that can be selected in this tiny C# app:

FullText Search (BM25)
The simplest method is a standard full‑text search. The query string is passed directly to Azure AI Search, and the results are collected. SearchOptions does not contain any VectorSearchOptions here.
public async Task<IReadOnlyList<StarshipSearchDocumentResult>> InvokeFullTextSearchAsync(string question, SearchOptions searchOptions)
{
    var response = await searchClient.SearchAsync<StarshipSearchDocumentResult>(question, searchOptions);
    return await CollectDocumentsAsync(response.Value);
}

Vector Search
For vector search, the user’s question is first converted into an embedding. That embedding becomes the query vector, which is then matched against the OverviewVector field. Please note that searchText is null, meaning no full-text search is invoked.
public async Task<IReadOnlyList<StarshipSearchDocumentResult>> InvokeVectorSearchAsync(string question, SearchOptions searchOptions)
{
    var embedding = await embeddingClient.GenerateEmbeddingAsync(question);
    var queryVector = embedding.Value.ToFloats().ToArray();

    searchOptions.VectorSearch = new VectorSearchOptions
    {
        Queries =
        {
            new VectorizedQuery(queryVector)
            {
                KNearestNeighborsCount = searchOptions.Size,
                Fields = { nameof(StarshipSearchDocument.OverviewVector) }
            }
        }
    };

    var response = await searchClient.SearchAsync<StarshipSearchDocumentResult>(searchText: null, searchOptions);
    return await CollectDocumentsAsync(response.Value);
}
Hybrid Search: FullText Search + Vector
Hybrid search combines BM25 with vector relevance. In this example, the vector query is given a higher weight (2.0f), making semantic similarity more influential than keyword matching:
public async Task<IReadOnlyList<StarshipSearchDocumentResult>> InvokeHybridFullTextVectorAsync(string question, SearchOptions searchOptions)
{
    var embedding = await embeddingClient.GenerateEmbeddingAsync(question);
    var queryVector = embedding.Value.ToFloats().ToArray();

    searchOptions.VectorSearch = new VectorSearchOptions
    {
        Queries =
        {
            new VectorizedQuery(queryVector)
            {
                KNearestNeighborsCount = searchOptions.Size,
                Fields = { nameof(StarshipSearchDocument.OverviewVector) },
                Weight = 2.0f
            }
        }
    };

    var response = await searchClient.SearchAsync<StarshipSearchDocumentResult>(question, searchOptions);
    return await CollectDocumentsAsync(response.Value);
}

Multi‑Vector Search
Documents in our index contain two vector fields, and in this case both OverviewVector and NotesVector are used. Each field receives its own VectorizedQuery, and Azure AI Search merges the scores. Again, searchText is null.
public async Task<IReadOnlyList<StarshipSearchDocumentResult>> InvokeVectorVectorAsync(string question, SearchOptions searchOptions)
{
    var embedding = await embeddingClient.GenerateEmbeddingAsync(question);
    var queryVector = embedding.Value.ToFloats().ToArray();

    searchOptions.VectorSearch = BuildVectorSearchOptions(
        queryVector,
        searchOptions.Size ?? 3,
        nameof(StarshipSearchDocument.OverviewVector),
        nameof(StarshipSearchDocument.NotesVector));

    var response = await searchClient.SearchAsync<StarshipSearchDocumentResult>(searchText: null, searchOptions);
    return await CollectDocumentsAsync(response.Value);
}
Hybrid Search with Multiple Vector Fields
This version combines all three signals: BM25, OverviewVector, and NotesVector into a single ranking:
public async Task<IReadOnlyList<StarshipSearchDocumentResult>> InvokeHybridFullTextVectorVectorAsync(string question, SearchOptions searchOptions)
{
    var embedding = await embeddingClient.GenerateEmbeddingAsync(question);
    var queryVector = embedding.Value.ToFloats().ToArray();

    searchOptions.VectorSearch = BuildVectorSearchOptions(
        queryVector,
        searchOptions.Size ?? 3,
        nameof(StarshipSearchDocument.OverviewVector),
        nameof(StarshipSearchDocument.NotesVector));

    var response = await searchClient.SearchAsync<StarshipSearchDocumentResult>(question, searchOptions);
    return await CollectDocumentsAsync(response.Value);
}
Reciprocal Rank Fusion (RRF)
When you run a Hybrid Search, you get two sets of results: one from BM25 and one from your Vector Index. But how do you merge a score of 3.5 (BM25) with a score of 0.85 (Cosine Similarity)?
You don’t. Azure AI Search uses RRF. It looks at the rank of the documents in each list and combines them into a single, unified list.
You can control this step by setting the Weight property in the VectorizedQuery class. The default value is 1.0. If you assign 2.0, that particular VectorizedQuery will be twice as important as the other query invoked in parallel (which could be a full‑text search query or another vector search query). If you want to lower the importance of a given vector query, assign a value below 1.0, such as 0.5.
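The fusion step itself can be sketched in a few lines. This is my own illustration of the RRF idea, assuming the commonly cited k = 60 constant; Azure AI Search performs this internally and only exposes the Weight knob:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RrfSketch
{
    // Fuse ranked lists of document ids: each list contributes
    // weight / (k + rank) per document, so only positions matter,
    // never the incompatible raw scores (BM25 vs cosine similarity).
    public static List<(string Id, double Score)> Fuse(
        IEnumerable<(string[] RankedIds, double Weight)> lists, double k = 60)
    {
        var scores = new Dictionary<string, double>();
        foreach (var (rankedIds, weight) in lists)
        {
            for (int rank = 0; rank < rankedIds.Length; rank++)
            {
                scores.TryGetValue(rankedIds[rank], out var s);
                scores[rankedIds[rank]] = s + weight / (k + rank + 1);
            }
        }
        return scores.Select(kv => (Id: kv.Key, Score: kv.Value))
                     .OrderByDescending(x => x.Score)
                     .ToList();
    }
}
```

Fusing a BM25 list with weight 1.0 and a vector list with weight 2.0 reproduces the behavior of the hybrid method above: a document ranked first by the vector query can outrank one ranked first by BM25.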
Results
Let’s walk through a few example queries. The dataset is small (just 10 records), which makes it harder to show noticeable differences, but the patterns are still visible.
Demo 1 – BM25 wins
When searching for “GV‑55‑FOXTROT”, BM25 returns it as the top match. No surprises here. Vector Search struggles with this kind of exact identifier, and Hybrid Search also places “GV‑55‑FOXTROT” in the first position.

Demo 2 – Vector Search wins
For the query “comfortable trip”, BM25 fails to return meaningful results.

I chose the word comfortable deliberately: there are two luxury ships in the dataset, and while BM25 can’t connect the dots (the word comfortable doesn’t appear anywhere in the data), Vector Search easily identifies the correct top matches based on semantic similarity.

Demo 3 – Hybrid Search
When searching for “A vessel designed to offer refined hospitality and elevated comfort during interstellar journeys.”, Hybrid Search produces the best results. Both BM25 and Vector Search return reasonably accurate matches, but the hybrid approach (with the vector query weight set to 2.0) ranks the most relevant ship at the top.

Summary
I hope you learned something truly useful about how these search methods interact.
I want you to remember that even if vector search is the shiny new technology on the market (maybe not sooo new in 2026…), BM25 remains a very powerful option that you should always consider when deciding which method to choose.
In most scenarios, the hybrid approach is the best choice because it doesn’t force you to choose between exact-match precision and semantic meaning. So don’t treat vectors as a holy grail, but rather as one half of a complete, high-performance search strategy.
Thanks for reading and see you in the next post!
