Introduction
You can find all the C# code samples here: Embeddings GitHub Repository
I encourage you to read this post first to get the most out of the information provided here:
In the previous post, we learned what an embedding is and how we can compare vectors using cosine similarity. Equipped with that knowledge, we can now delve into the next topic: vector databases in Azure.
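As a quick reminder of how that comparison works, here is a minimal, self-contained sketch of cosine similarity (the sample repository uses TensorPrimitives.CosineSimilarity from System.Numerics.Tensors instead of a hand-written loop):

```csharp
using System;

// Cosine similarity = dot(a, b) / (|a| * |b|); ranges from -1 to 1.
float CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");
    float dot = 0f, normA = 0f, normB = 0f;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}

float[] v1 = { 1f, 0f, 0f };
float[] v2 = { 0f, 1f, 0f };
Console.WriteLine(CosineSimilarity(v1, v1)); // 1 (same direction)
Console.WriteLine(CosineSimilarity(v1, v2)); // 0 (orthogonal)
```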
Let me encourage you to continue by sharing one interesting fact:
More than 80% of all data in the world is unstructured: text, images, videos, audio, and more.
With embeddings and vector databases, you gain a key that unlocks limitless opportunities to enhance your applications.
Let’s get started and explore the world of vector databases.
Basics
Custom Vector Database
To lay the foundations for further considerations, let’s first create our own vector database.
Every vector database must support two basic operations:
- Indexing a new record – public void Index(VectorSearchRecord? vectorDocument) in our example
- Searching for similar records by performing a vector search – public IReadOnlyCollection<VectorSearchResult> Search(float[] queryVector, int topK) in our example
For the sake of the example let’s store all our vectors in a dictionary:
private readonly Dictionary<string, VectorSearchRecord> _vectors = new();

Indexing a new record
public void Index(VectorSearchRecord? vectorDocument)
{
ArgumentNullException.ThrowIfNull(vectorDocument);
if (vectorDocument.Vector.Length != SUPPORTED_VECTOR_DIMENSION)
{
throw new InvalidOperationException($"Invalid vector dimension. The only supported dimension is {SUPPORTED_VECTOR_DIMENSION}.");
}
if (_vectors.ContainsKey(vectorDocument.Id))
{
throw new InvalidOperationException($"A document with ID '{vectorDocument.Id}' already exists.");
}
_vectors[vectorDocument.Id] = vectorDocument;
}

As you can see, the logic is very simple: we check that the ID is unique and add the record to the dictionary. The more important validation is the one that ensures the vector length matches the dimension expected by the database (the user always provides the vector dimension when defining a vector search index).
❗❗❗Comparing vectors of different lengths makes no sense.
Searching for similar records
public IReadOnlyCollection<VectorSearchResult> Search(float[] queryVector, int topK)
{
if (queryVector.Length != SUPPORTED_VECTOR_DIMENSION)
{
throw new InvalidOperationException($"Invalid vector dimension. The only supported dimension is {SUPPORTED_VECTOR_DIMENSION}.");
}
if (topK <= 0)
{
throw new ArgumentOutOfRangeException(nameof(topK), "topK must be greater than zero.");
}
return _vectors.Values
.Select(record => new
{
Document = record,
Similarity = TensorPrimitives.CosineSimilarity(queryVector, record.Vector)
})
.OrderByDescending(x => x.Similarity)
.Take(topK)
.Select(x => new VectorSearchResult
{
Id = x.Document.Id,
Similarity = x.Similarity,
Data = x.Document.Data
})
.ToList();
}

Based on this method, we can conclude that vector search belongs to the category of nearest‑neighbor search; with the topK parameter, the example above is a form of kNN (k‑nearest neighbors) search.
We can also see that a naive solution could follow these steps:
- Compute the similarity between the query vector and all vectors stored in the database
- Order all results by similarity in descending order
- Return the topK vectors with the highest similarity
Looks simple, right? If the idea has already crossed your mind to implement your own custom vector database in a real project, then I must warn you… it’s not that simple!
Before we delve into some of the real challenges, I’d like to clarify one more thing.
Quality of embeddings
I prepared some sample test data to make sure one additional aspect is clear to you (and I encourage you to explore the source code yourself as well). For each of our four well‑known keywords (Mars, Apollo 11, Neil Armstrong, and Curiosity Rover), I created 5 different phrases related to that topic. I also added 20 phrases that are completely unrelated to any of these subjects (marked with a Random tag). The program compares each of our 4 keywords against every phrase in the test dataset.
Below are the results when using the text-embedding-3-small embedding model deployed in Microsoft Foundry.
Top 5 similar items to "Mars":
- [Tag:Mars] Mars exploration: 0.75
- [Tag:Mars] Mars atmosphere: 0.74
- [Tag:Curiosity Rover] Mars rover: 0.72
- [Tag:Curiosity Rover] Mars Science Laboratory: 0.70
- [Tag:Mars] Martian surface: 0.68
Top 5 similar items to "Apollo 11":
- [Tag:Apollo 11] Moon landing mission: 0.59
- [Tag:Neil Armstrong] First man on the Moon: 0.55
- [Tag:Neil Armstrong] Buzz Aldrin: 0.53
- [Tag:Apollo 11] NASA 1969: 0.47
- [Tag:Curiosity Rover] Rover landing: 0.46
Top 5 similar items to "Neil Armstrong":
- [Tag:Neil Armstrong] Buzz Aldrin: 0.66
- [Tag:Neil Armstrong] First man on the Moon: 0.63
- [Tag:Apollo 11] Moon landing mission: 0.56
- [Tag:Neil Armstrong] Astronaut: 0.56
- [Tag:Neil Armstrong] NASA astronaut corps: 0.47
Top 5 similar items to "Curiosity Rover":
- [Tag:Curiosity Rover] Mars rover: 0.75
- [Tag:Curiosity Rover] Mars Science Laboratory: 0.62
- [Tag:Curiosity Rover] Rover landing: 0.60
- [Tag:Mars] Mars exploration: 0.58
- [Tag:Mars] Red Planet: 0.48

Now let's run the same comparison with another embedding model, text-embedding-ada-002.
Top 5 similar items to "Mars":
- [Tag:Mars] Mars atmosphere: 0.90
- [Tag:Curiosity Rover] Mars rover: 0.90
- [Tag:Mars] Mars exploration: 0.90
- [Tag:Curiosity Rover] Mars Science Laboratory: 0.88
- [Tag:Mars] Red Planet: 0.88
Top 5 similar items to "Apollo 11":
- [Tag:Apollo 11] Moon landing mission: 0.91
- [Tag:Neil Armstrong] First man on the Moon: 0.89
- [Tag:Apollo 11] NASA 1969: 0.89
- [Tag:Neil Armstrong] Astronaut: 0.88
- [Tag:Neil Armstrong] NASA astronaut corps: 0.86
Top 5 similar items to "Neil Armstrong":
- [Tag:Neil Armstrong] Buzz Aldrin: 0.89
- [Tag:Neil Armstrong] First man on the Moon: 0.87
- [Tag:Neil Armstrong] Astronaut: 0.87
- [Tag:Neil Armstrong] NASA astronaut corps: 0.86
- [Tag:Apollo 11] Moon landing mission: 0.85
Top 5 similar items to "Curiosity Rover":
- [Tag:Curiosity Rover] Mars rover: 0.94
- [Tag:Mars] Mars exploration: 0.89
- [Tag:Curiosity Rover] Mars Science Laboratory: 0.89
- [Tag:Curiosity Rover] Rover landing: 0.88
- [Tag:Mars] Red Planet: 0.86

As you can see, the similarity scores are much higher than in the previous example. Be careful with the interpretation, though: absolute similarity scores are not comparable across embedding models, because each model produces its own score distribution. A higher score does not automatically mean that the model captures the meaning of the text better; what matters is how well each model ranks related phrases above unrelated ones.
I showed you these 2 examples to make sure you understand one crucial point: the quality of your embeddings matters a lot BUT…
it’s not the vector database’s job to create meaningful embeddings. Its job is simply to perform vector search as fast and as accurately as possible.
Think of it like SQL Server. SQL Server doesn’t magically understand your data or fix poorly structured tables. It just stores rows and retrieves them efficiently. If your schema is bad, SQL Server won’t save you. Similarly, if your embeddings are low‑quality or inconsistent, even the best vector database won’t make them meaningful.
Once we have that matter clarified, we can finally move on to the main limitation of our custom vector database and the biggest challenge that inevitably appears: how do you perform vector search efficiently on large datasets?
Vector Search algorithms
In our custom database, we worked with just 40 vectors. That's perfectly fine for a toy example, but now imagine you're dealing with 100 million vectors. I think we can both agree that computing the similarity score for all 100 million vectors on every single query wouldn't be the most efficient approach.
For large datasets, vector search is usually performed efficiently thanks to approximate nearest‑neighbor (ANN) techniques. To understand how ANN achieves this efficiency, it helps to look at the main algorithms behind it.
ℹ️ Understanding these low-level details isn't necessary, but I believe that getting familiar with the terminology will help you recognize it when working with various vector databases.
The most popular vector search algorithms nowadays are:
- Graph‑based:
- HNSW (Hierarchical Navigable Small World Graphs) – a graph‑based structure that lets you jump quickly to the region where your nearest neighbors are likely to be.
- DiskANN (Microsoft Disk‑Accelerated ANN) – a graph‑based algorithm designed for very large vector datasets, combining an in‑memory graph with SSD‑optimized storage to achieve high accuracy at massive scale.
- Hash‑based:
- LSH (Locality‑Sensitive Hashing) – maps similar vectors into the same buckets, drastically reducing the search space.
- Cluster‑based:
- PQ (Product Quantization) – compresses vectors into compact codes so similarity can be computed extremely fast.
- IVF (Inverted File Index) – clusters vectors and searches only the most relevant clusters instead of scanning everything.
- Tree‑based:
- Random Projection Trees – splits the vector space into tree structures so you only explore the most promising branches.
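To make the cluster-based idea concrete, here is a toy sketch in the spirit of IVF. The centroids are hard-coded (a real index would learn them with k-means), and only the single nearest cluster is scanned at query time instead of the whole dataset:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy IVF sketch: vectors are grouped by their nearest centroid, and a query
// only scans the cluster whose centroid is closest to the query vector.
var centroids = new[] { new[] { 0f, 0f }, new[] { 10f, 10f } };
var clusters = centroids.Select(_ => new List<float[]>()).ToArray();

void Add(float[] vector) => clusters[NearestCentroid(vector)].Add(vector);

// Search only the nearest cluster instead of all stored vectors.
float[] Search(float[] query) =>
    clusters[NearestCentroid(query)].OrderBy(v => Distance(v, query)).First();

int NearestCentroid(float[] v) =>
    Enumerable.Range(0, centroids.Length).OrderBy(i => Distance(centroids[i], v)).First();

float Distance(float[] a, float[] b)
{
    float sum = 0f;
    for (int i = 0; i < a.Length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
    return sum; // squared Euclidean distance; sufficient for ranking
}

Add(new[] { 0.5f, 0.5f });
Add(new[] { 9f, 9f });
var nearest = Search(new[] { 8f, 8f });
Console.WriteLine($"Nearest: ({nearest[0]}, {nearest[1]})"); // Nearest: (9, 9)
```

This is of course approximate: a true nearest neighbor sitting just across a cluster boundary would be missed, which is exactly the accuracy-for-speed trade-off ANN algorithms make (real implementations typically probe several nearby clusters to reduce that risk).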
Below are the most popular libraries that implement these algorithms:
- FAISS (Facebook AI Similarity Search) – a high‑performance library by Meta that implements HNSW, IVF, PQ, and more, with optional GPU acceleration. Written in C++.
- Google ScaNN (Google Scalable Nearest Neighbors) – optimized for extremely fast and accurate ANN search, especially for high‑dimensional embeddings.
- Annoy (Approximate Nearest Neighbors Oh Yeah) – a lightweight library that uses random projection trees to perform efficient nearest‑neighbor queries. Used by Spotify.
Custom Vector Database using FAISS
In the CustomVectorDbFaissExample, you can find a simple custom implementation of a vector database (DeployedInAzureVectorDbFaiss) that uses the FAISS library. As I mentioned earlier, FAISS is written in C++, so I used the FaissNet library, which acts as a wrapper.
public DeployedInAzureVectorDbFaiss()
{
_index = FaissNet.Index.Create(
SUPPORTED_VECTOR_DIMENSION,
"IDMap2,Flat",
FaissNet.MetricType.METRIC_INNER_PRODUCT);
}

As you can see above, I used two comma‑separated values: IDMap2 and Flat. You can find more details about the available configuration options in the FAISS documentation, but let me briefly explain these two:
- IDMap2 – a FAISS index wrapper that adds ID management on top of a base index
- Flat – a brute‑force index (exact search, no ANN). I could have used an HNSW or IVF index, among others.
I also used the METRIC_INNER_PRODUCT metric for finding similar vectors. You can find all the available options here.
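One detail worth noting: the inner product is equivalent to cosine similarity only when the vectors are normalized to unit length (OpenAI embedding models return unit-length vectors, so this holds here). If your vectors are not normalized, you can normalize them before indexing; a minimal sketch:

```csharp
using System;

// Scale a vector to unit length so that inner product == cosine similarity.
float[] Normalize(float[] vector)
{
    float norm = 0f;
    foreach (var x in vector) norm += x * x;
    norm = MathF.Sqrt(norm);
    if (norm == 0f)
        throw new ArgumentException("Cannot normalize a zero vector.");
    return Array.ConvertAll(vector, x => x / norm);
}

var unit = Normalize(new[] { 3f, 4f }); // length 5 -> (0.6, 0.8)
Console.WriteLine($"({unit[0]}, {unit[1]})");
```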
I don’t expect you’ll be using the FAISS library directly in most real‑world projects, but having a basic understanding of how it works under the hood can be surprisingly helpful. It gives you the vocabulary to understand documentation, debug issues, and make better decisions when choosing or configuring a vector database.
This is what the search function looks like:
public IReadOnlyCollection<VectorSearchResult> Search(float[] queryVector, int topK)
{
var result = _index.SearchFlat(1, queryVector, topK);
var distance = result.Item1;
var ids = result.Item2;
return ids.Zip(distance)
.Select(kvp => new VectorSearchResult
{
Id = kvp.First.ToString(),
Similarity = kvp.Second,
Data = _storage[kvp.First.ToString()].Data
})
.ToList();
}

Vector DBs in Azure
Now that we’ve covered the essential concepts and have a basic understanding of how vector search works under the hood, we can take a look at what options are available to us in Azure.
The options available in Azure for vector search are:
- Azure AI Search
- Azure Cosmos DB for NoSQL
- Azure DocumentDB
- Azure Cosmos DB for PostgreSQL
- Azure Database for PostgreSQL
- Azure SQL Database
I encourage you to read this page which provides a very good starting point for considering which service to use.
ℹ️ My main remark about the “Key Requirements” concerns the first question: “Do you frequently insert, update or delete vector data and need search results to stay up to date?”. In my opinion, having the data and the corresponding index within the same DB is the most performant option; however, for more than 90% of applications, using a separate vector database (like Azure AI Search) with an asynchronous update pipeline is perfectly acceptable. With the right techniques, the data can still remain close to real‑time.
Criteria for choosing the right vector DB
I hope that what we’ve discussed so far gives you some insight into how vector DBs work, and that this helps you choose the right service. Below are a few questions you should ask yourself:
- Does the service support the embedding dimension you need (e.g., 1536, 3072)?
- Do you need more than one vector per document (e.g., title + body + image)?
- Does the service support full‑text search, hybrid search, and built‑in reranking?
- Which indexing algorithms does the service support (HNSW, IVF, etc.)?
There are other criteria you should consider too:
- What vector ingestion rate do you expect (100/s, 1000/s, etc.)?
- How many vector search queries per second do you anticipate?
- What latency do you require for vector search operations (<1ms, <10ms, <50ms, etc.)?
- How much accuracy does your vector search workload require?
- Different ANN algorithms (HNSW, IVF, PQ, LSH, RP‑Trees) make different trade‑offs between speed and accuracy (very often you can configure the accuracy vs. latency trade‑off)
- Does it support vector quantization?
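To give a sense of what quantization means in practice, here is a minimal sketch of scalar quantization: each float32 component is mapped to a single byte, cutting memory usage to a quarter at the cost of some precision. Real implementations calibrate the value range from the data; here it is simply assumed to be [-1, 1]:

```csharp
using System;

// Map each float in [-1, 1] to a byte (0..255): 4x less memory, some precision lost.
float min = -1f, max = 1f;

byte Quantize(float x) =>
    (byte)Math.Clamp((int)MathF.Round((x - min) / (max - min) * 255f), 0, 255);

float Dequantize(byte b) => min + b / 255f * (max - min);

float original = 0.5f;
byte q = Quantize(original);
Console.WriteLine($"{original} -> {q} -> {Dequantize(q)}"); // round-trips close to 0.5
```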
Summary
Vector databases play a crucial role in modern AI applications by enabling fast and accurate similarity search across massive collections of embeddings. In this post, we explored how vector search works at a fundamental level, why embedding quality matters, and what challenges arise when scaling from a toy example to millions of vectors. We also looked at the core concepts behind ANN algorithms and reviewed the vector search options available in Azure, along with practical criteria to help you choose the right service for your workload.
In the next posts, we’ll dive deeper into several of these Azure services, examine their strengths and trade‑offs, and walk through concrete examples of vector search.
Thanks for reading and see you in the next post!