Vector Search in Azure Cosmos DB for NoSQL: A Practical Guide

Introduction

You can find all the C# code samples here: Embeddings GitHub Repository

I encourage you to read these posts first to get the most out of the information provided here:

After reading the previous posts, you now have a solid foundation to focus on implementing vector search in Azure Cosmos DB for NoSQL. I also hope you now feel equipped to make an informed decision about “why” and “when” you might want to choose this service for your applications.

So let’s not waste your precious time and get straight to the point!

Demo

In the previous post, we went through a simple custom implementation of a vector database in C# (with and without the FAISS library). I would like to continue using the same sample dataset here, but instead of a custom local vector DB, we’ll use Azure Cosmos DB for NoSQL.

Creating the Vector Index

First of all, we have to enable the Vector Search feature in our Cosmos DB instance.

Enabling Vector Search for NoSQL API feature in Azure Cosmos DB.

Now, let’s define a simple C# class that represents the intended schema:

namespace DeployedInAzure.EmbeddingsExamples.CosmosDbForNoSql
{
    public record CosmosDbForNoSqlDocumentModel
    {
        public required string id { get; init; }
        public required string Phrase { get; init; }
        public required List<string> Tags { get; init; } = [];
        public required float[] Vector { get; init; }
    }
}

Once we have our class defined, we can create a new container in Cosmos DB.

❗❗❗ Currently vector indexes can only be defined when creating a new container.

Create a new vector index of flat type in Azure Cosmos DB.

As you can see, we need to configure a few key properties when creating a vector index:

Path – the document property that stores our vector
Data Type – the type of the vector values. We are going to store vector embeddings, so float32 is the best option (the other options are uint8 and int8)
Distance function – determines how similarity is measured between vectors. Common options include cosine, euclidean, and dotProduct. In this case, we’re using cosine, which is well-suited for semantic similarity (do you remember TensorPrimitives.CosineSimilarity that we used in the previous examples?)
Dimensions – the number of elements in each vector
Index type – defines how the vector index is built and stored. Supported values: flat, quantizedFlat, and diskANN (you can find all the details here)

As you can see, we’ve just come across the first limitation of the flat index type: it supports at most 505 dimensions. The flat index type is a good fit for so‑called “brute‑force” (exact) searches, which is essentially the mechanism we implemented in the previous post (DeployedInAzureVectorDb and DeployedInAzureVectorDbFaiss). Most embedding models produce vectors with more than 505 dimensions (1536 in our example), so we’re forced to choose another index type. Let’s go with quantizedFlat (I’ll dive deep into vector quantization in one of the next posts, so don’t miss it!) and continue.

Create a new vector index of quantized flat type in Azure Cosmos DB.

I left the quantizationByteSize option empty. This setting controls the accuracy vs latency trade‑off, and leaving it blank means Azure will choose the best option automatically.
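
To get a feel for what this setting trades off, here is a quick back-of-the-envelope calculation. It is only a sketch: it assumes that quantizationByteSize is the number of bytes each vector occupies in the quantized index, and it uses 96 purely as an illustrative value next to the 1536 dimensions from our example.

```csharp
using System;

// Assumption: 1536 float32 dimensions, as in the embeddings used in this post.
const int dimensions = 1536;
const int bytesPerFloat32 = sizeof(float); // 4 bytes

// Illustrative value only; when the option is left empty, the service picks one.
const int quantizationByteSize = 96;

// Size of one uncompressed vector versus its quantized representation.
int rawBytesPerVector = dimensions * bytesPerFloat32;
int compressionRatio = rawBytesPerVector / quantizationByteSize;

Console.WriteLine($"Raw vector size: {rawBytesPerVector} bytes");
Console.WriteLine($"Quantized size:  {quantizationByteSize} bytes");
Console.WriteLine($"Compression:     {compressionRatio}x");
```

Under these assumptions, a larger byte size keeps more information per vector (better accuracy, more work per query), while a smaller one compresses harder and gives up some accuracy.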

Now let’s see what JSON was generated behind the scenes (the Indexing Policy tab).

{
    "indexingMode": "consistent",
    "automatic": true,
    "includedPaths": [
        {
            "path": "/*"
        }
    ],
    "excludedPaths": [
        {
            "path": "/\"_etag\"/?"
        },
        {
            "path": "/Vector/*"
        }
    ],
    "fullTextIndexes": [],
    "vectorIndexes": [
        {
            "path": "/Vector",
            "type": "quantizedFlat",
            "quantizerType": "product",
            "quantizationByteSize": 96
        }
    ]
}

As you can see, all the information we provided appears under the vectorIndexes property. Since it’s an array, you can define more than one vector index if needed. Also notice that the /Vector path was added to the excludedPaths section. This is intentional because excluding the vector field from the regular index improves insertion performance.

❗❗❗ If it wasn’t excluded, every vector write would incur significantly higher RU charges and increased latency.
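
If you prefer code over the portal, the same settings can be expressed with the .NET SDK. The snippet below is a minimal sketch rather than the setup used in this post: it assumes a recent Microsoft.Azure.Cosmos package that includes the vector policy types, a partition key path of /id, and a CosmosClient that is allowed to create containers. As far as I know, the data‑plane RBAC roles do not cover creating databases or containers, so treat the CreateAsync helper (a hypothetical name) as something you would run with a suitably authorized client.

```csharp
using System;
using System.Collections.ObjectModel;
using System.Threading.Tasks;
using Microsoft.Azure.Cosmos;

// Builds container settings that mirror the portal configuration described above.
// Assumption: the partition key path is "/id".
ContainerProperties BuildVectorContainerProperties()
{
    var properties = new ContainerProperties(id: "VectorSearchContainer", partitionKeyPath: "/id")
    {
        // Describes the vector stored in each document.
        VectorEmbeddingPolicy = new VectorEmbeddingPolicy(new Collection<Embedding>
        {
            new Embedding
            {
                Path = "/Vector",
                DataType = VectorDataType.Float32,
                DistanceFunction = DistanceFunction.Cosine,
                Dimensions = 1536
            }
        }),
        // Describes how that vector is indexed.
        IndexingPolicy = new IndexingPolicy
        {
            VectorIndexes = new Collection<VectorIndexPath>
            {
                new VectorIndexPath { Path = "/Vector", Type = VectorIndexType.QuantizedFlat }
            }
        }
    };

    // Index everything except the raw vector, as in the generated JSON policy.
    properties.IndexingPolicy.IncludedPaths.Add(new IncludedPath { Path = "/*" });
    properties.IndexingPolicy.ExcludedPaths.Add(new ExcludedPath { Path = "/Vector/*" });
    return properties;
}

// Hypothetical helper: `client` must be authorized to create databases and containers.
async Task CreateAsync(CosmosClient client)
{
    Database database = await client.CreateDatabaseIfNotExistsAsync("MyDatabase");
    await database.CreateContainerIfNotExistsAsync(BuildVectorContainerProperties());
}

var containerProperties = BuildVectorContainerProperties();
Console.WriteLine(containerProperties.IndexingPolicy.VectorIndexes[0].Type);
```

Keeping the settings in a small builder function makes it easy to review them next to the JSON policy and to reuse them in deployment scripts.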

Assigning RBAC role

The index is created, so now we should ensure that our application can access Azure Cosmos DB. After reading this article, I hope I convinced you to rely on Azure RBAC instead of connection strings, so let’s assign an appropriate role.

We cannot do this directly in the Azure portal, so let’s use the Azure CLI (you can find all the details here). After logging in with az login, let’s check which roles are available by running:

az cosmosdb sql role definition list -g "deployed-in-azure-ai-search" -a "deployed-in-azure-cosmosdb"

There are two built‑in roles available:

Cosmos DB Built‑in Data Reader (id ends with ...001)
Cosmos DB Built‑in Data Contributor (id ends with ...002)

We’ll choose the latter (...002), because our sample app needs to both read and write data in Cosmos DB. Based on the results from the previous query, we can now fill in all the required parameters for az cosmosdb sql role assignment create.

ℹ️ I’m using my user security principal ID to run it locally, but in production you’ll likely use a system-assigned managed identity from a compute service such as Azure Functions or Azure App Service (or a user-assigned managed identity).

Okay, we’ve already done the configuration, so let’s focus on the code now.

Data Indexing

First, we need to add a reference to Microsoft.Azure.Cosmos in order to use CosmosClient. Once that’s done, we can instantiate the client.

❗❗❗ Normally, you would register it as a singleton in your DI container.

private readonly CosmosClient _cosmosClient = new CosmosClient("https://deployed-in-azure-cosmosdb.documents.azure.com", new DefaultAzureCredential());

private Container VectorSearchContainer => _cosmosClient
    .GetDatabase("MyDatabase")
    .GetContainer("VectorSearchContainer");
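
As for the singleton registration mentioned above, a minimal sketch could look like the following. It assumes the Microsoft.Extensions.DependencyInjection package and reuses the endpoint, database, and container names from this post. Registering the Container as well is just a convenience I added for illustration.

```csharp
using System;
using Azure.Identity;
using Microsoft.Azure.Cosmos;
using Microsoft.Extensions.DependencyInjection;

var services = new ServiceCollection();

// One CosmosClient for the whole application lifetime.
services.AddSingleton(_ => new CosmosClient(
    "https://deployed-in-azure-cosmosdb.documents.azure.com",
    new DefaultAzureCredential()));

// Hypothetical convenience registration so consumers can inject the Container directly.
services.AddSingleton(sp => sp.GetRequiredService<CosmosClient>()
    .GetDatabase("MyDatabase")
    .GetContainer("VectorSearchContainer"));

using var provider = services.BuildServiceProvider();

// Both resolutions return the same CosmosClient instance.
var first = provider.GetRequiredService<CosmosClient>();
var second = provider.GetRequiredService<CosmosClient>();
Console.WriteLine(object.ReferenceEquals(first, second));
```

Sharing a single client matters because each CosmosClient manages its own connections and caches, so creating one per request wastes resources.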

Now we can create embeddings (the same way as we did in the previous examples) and push that data to Cosmos.

private async Task UpsertSampleDocumentsAsync()
{
    var documentsToBeIndexed = new List<CosmosDbForNoSqlDocumentModel>();

    foreach (var item in TestData.GetAllTestData().Select((phraseAndTagPair, index) => (phraseAndTagPair, Index: index + 1)))
    {
        // this could be run in parallel if needed too using Task.WhenAll
        var response = await _embeddingClient.GenerateEmbeddingAsync(item.phraseAndTagPair.Phrase);

        var document = new CosmosDbForNoSqlDocumentModel()
        {
            id = item.Index.ToString(),
            Phrase = item.phraseAndTagPair.Phrase,
            Vector = response.Value.ToFloats().ToArray(),
            Tags = [item.phraseAndTagPair.Tag]
        };

        documentsToBeIndexed.Add(document);
    }

    var container = VectorSearchContainer;

    // If you use Visual Studio authentication and get 401 or 403 even though the 'Cosmos DB Built‑in Data Contributor' RBAC role is assigned,
    // set the `AZURE_TENANT_ID` environment variable to the Entra tenant ID where the Azure Cosmos DB account is deployed.
    await Task.WhenAll(documentsToBeIndexed.Select(document => container.UpsertItemAsync(document, new PartitionKey(document.id))));

    Console.WriteLine($"`{documentsToBeIndexed.Count}` documents were upserted to Cosmos DB successfully!");
}

After this operation is completed we can see our documents with vectors in the Data Explorer in Cosmos.

Sample document with a vector field in Azure Cosmos DB data explorer.

We’re now halfway through the process, so let’s turn our attention to vector search now!

Vector Search using VectorDistance function

Azure Cosmos DB for NoSQL provides several built‑in functions, and one of them is VectorDistance, which measures the similarity between two vectors. Below you can see what such a query looks like when written as raw SQL. Note that ORDER BY does not use DESC: when VectorDistance is used in ORDER BY, Cosmos DB orders the results from most similar to least similar on its own (for example, descending for cosine similarity and ascending for Euclidean distance).

public async Task<IReadOnlyCollection<CosmosDbForNoSqlVectorSearchResult>> SearchAsync(float[] queryVector, int topK)
{
    var sql = """
        SELECT TOP @topK
            c.id,
            c.Phrase,
            c.Tags,
            VectorDistance(c.Vector, @queryVector) AS SimilarityScore
        FROM c
        ORDER BY VectorDistance(c.Vector, @queryVector)
        """;

    var query = new QueryDefinition(sql)
        .WithParameter("@topK", topK)
        .WithParameter("@queryVector", queryVector);

    var results = new List<CosmosDbForNoSqlVectorSearchResult>();

    using var iterator = VectorSearchContainer.GetItemQueryIterator<CosmosDbForNoSqlVectorSearchResult>(query);    
    while (iterator.HasMoreResults)
    {
        var partialResponse = await iterator.ReadNextAsync();
        results.AddRange(partialResponse);
    }

    return results;
}

The same logic can also be written in LINQ. I encourage you to use LINQ unless you truly need full control over the SQL query, which might be necessary in some edge cases:

public async Task<IReadOnlyCollection<CosmosDbForNoSqlVectorSearchResult>> SearchAsync(float[] queryVector, int topK)
{
    var queryable = VectorSearchContainer.GetItemLinqQueryable<CosmosDbForNoSqlDocumentModel>()
        .OrderByRank(x => CosmosLinqExtensions.VectorDistance(x.Vector, queryVector, false, options: null))
        .Select(x => new CosmosDbForNoSqlVectorSearchResult()
        {
            id = x.id,
            Phrase = x.Phrase,
            Tags = x.Tags,
            SimilarityScore = CosmosLinqExtensions.VectorDistance(x.Vector, queryVector, false, options: null)
        })
        .Take(topK);

    var results = new List<CosmosDbForNoSqlVectorSearchResult>();
    using var iterator = queryable.ToFeedIterator();
    while (iterator.HasMoreResults)
    {
        var partialResponse = await iterator.ReadNextAsync();
        results.AddRange(partialResponse);
    }

    return results.ToList();
}

Below are the results:

Top 5 similar items to "Mars":
- [Tag:Mars] Mars exploration: 0.75
- [Tag:Mars] Mars atmosphere: 0.74
- [Tag:Curiosity Rover] Mars rover: 0.72
- [Tag:Curiosity Rover] Mars Science Laboratory: 0.70
- [Tag:Mars] Martian surface: 0.68

Top 5 similar items to "Apollo 11":
- [Tag:Apollo 11] Moon landing mission: 0.59
- [Tag:Neil Armstrong] First man on the Moon: 0.55
- [Tag:Neil Armstrong] Buzz Aldrin: 0.53
- [Tag:Apollo 11] NASA 1969: 0.47
- [Tag:Curiosity Rover] Rover landing: 0.46

Top 5 similar items to "Neil Armstrong":
- [Tag:Neil Armstrong] Buzz Aldrin: 0.66
- [Tag:Neil Armstrong] First man on the Moon: 0.63
- [Tag:Apollo 11] Moon landing mission: 0.56
- [Tag:Neil Armstrong] Astronaut: 0.56
- [Tag:Neil Armstrong] NASA astronaut corps: 0.47

Top 5 similar items to "Curiosity Rover":
- [Tag:Curiosity Rover] Mars rover: 0.75
- [Tag:Curiosity Rover] Mars Science Laboratory: 0.62
- [Tag:Curiosity Rover] Rover landing: 0.60
- [Tag:Mars] Mars exploration: 0.58
- [Tag:Mars] Red Planet: 0.48
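
The ordering rule mentioned earlier (descending for cosine similarity, ascending for Euclidean distance) is easy to verify locally. The sketch below is not what Cosmos DB runs internally. It is a plain brute‑force comparison over three made‑up 3‑dimensional vectors, and it shows that both orderings put the closest vector first.

```csharp
using System;
using System.Linq;

// Cosine similarity: a higher value means the vectors are more similar.
double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Euclidean distance: a lower value means the vectors are more similar.
double EuclideanDistance(float[] a, float[] b)
{
    double sum = 0;
    for (int i = 0; i < a.Length; i++)
    {
        double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.Sqrt(sum);
}

// Made-up vectors; the real embeddings in this post have 1536 dimensions.
var query = new float[] { 1f, 0f, 0f };
var documents = new (string Name, float[] Vector)[]
{
    ("far", new float[] { 0f, 1f, 0f }),
    ("near", new float[] { 0.9f, 0.1f, 0f }),
    ("middle", new float[] { 0.5f, 0.5f, 0f }),
};

// Cosine: sort descending. Euclidean: sort ascending.
var byCosine = documents
    .OrderByDescending(d => CosineSimilarity(d.Vector, query))
    .Select(d => d.Name)
    .ToArray();
var byEuclidean = documents
    .OrderBy(d => EuclideanDistance(d.Vector, query))
    .Select(d => d.Name)
    .ToArray();

Console.WriteLine(string.Join(", ", byCosine));
Console.WriteLine(string.Join(", ", byEuclidean));
```

This is also why the scores in the results above fall as you move down each list: with the cosine function configured on the index, a higher score means a closer match.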

We can, of course, combine vector search with a regular WHERE clause. This lets us first narrow down the set of records (based on business rules) and then order them using VectorDistance. The example below assumes we only want results that contain a specific tag and uses the ARRAY_CONTAINS function (the same can be done with a LINQ query):

public async Task<IReadOnlyCollection<CosmosDbForNoSqlVectorSearchResult>> SearchAsync(float[] queryVector, int topK, string tag)
{
    var sql = """
        SELECT TOP @topK
            c.id,
            c.Phrase,
            c.Tags,
            VectorDistance(c.Vector, @queryVector) AS SimilarityScore
        FROM c
        WHERE ARRAY_CONTAINS(c.Tags, @tag)
        ORDER BY VectorDistance(c.Vector, @queryVector)
        """;

    var query = new QueryDefinition(sql)
        .WithParameter("@topK", topK)
        .WithParameter("@queryVector", queryVector)
        .WithParameter("@tag", tag);

    var results = new List<CosmosDbForNoSqlVectorSearchResult>();

    using var iterator = VectorSearchContainer.GetItemQueryIterator<CosmosDbForNoSqlVectorSearchResult>(query);
    while (iterator.HasMoreResults)
    {
        var partialResponse = await iterator.ReadNextAsync();
        results.AddRange(partialResponse);
    }

    return results;
}

Below are the results:

Top 5 similar items to "Mars" with the Tags filter applied:
- [Tag:Mars] Mars exploration: 0.75
- [Tag:Mars] Mars atmosphere: 0.74
- [Tag:Mars] Martian surface: 0.68
- [Tag:Mars] Red Planet: 0.68
- [Tag:Mars] Olympus Mons: 0.43

Top 5 similar items to "Apollo 11" with the Tags filter applied:
- [Tag:Apollo 11] Moon landing mission: 0.59
- [Tag:Apollo 11] NASA 1969: 0.47
- [Tag:Apollo 11] Lunar module: 0.44
- [Tag:Apollo 11] Saturn V rocket: 0.41
- [Tag:Apollo 11] Sea of Tranquility: 0.25

Top 5 similar items to "Neil Armstrong" with the Tags filter applied:
- [Tag:Neil Armstrong] Buzz Aldrin: 0.66
- [Tag:Neil Armstrong] First man on the Moon: 0.63
- [Tag:Neil Armstrong] Astronaut: 0.56
- [Tag:Neil Armstrong] NASA astronaut corps: 0.47
- [Tag:Neil Armstrong] Space suit: 0.31

Top 5 similar items to "Curiosity Rover" with the Tags filter applied:
- [Tag:Curiosity Rover] Mars rover: 0.75
- [Tag:Curiosity Rover] Mars Science Laboratory: 0.62
- [Tag:Curiosity Rover] Rover landing: 0.60
- [Tag:Curiosity Rover] Martian soil analysis: 0.47
- [Tag:Curiosity Rover] Gale Crater: 0.42

Also note that I never return a raw vector from Cosmos DB. In 99% of scenarios, you don’t need the raw vector at all. What really matters is getting a result that contains the TOP N most similar vectors to the one you provided in the query.

Summary

I hope that after reading this article, a few things have become clearer. My goal was to show that once you understand embeddings, indexes, and the mechanics behind similarity search, Cosmos DB becomes an approachable and powerful option.

If you followed along step by step, you now have everything you need to build your own vector‑powered features on top of Azure Cosmos DB.

Thanks for your time, and I wish you plenty of success as you implement vector search in Azure Cosmos DB!

See you in the next post!

Tags: Azure Cosmos DB, Microsoft Foundry
