In Getting Started with Azure AI Search: Features, Benefits, and Basics Explained, we laid the foundation by exploring what Azure AI Search can do and why it matters. Then, in Effective Data Indexing in Azure AI Search: Data Sources and Indexers (Pull Approach), we looked at how to bring data in automatically using pull-based indexing. Now it’s time to take things a step further – this article focuses on the push approach, a custom method that gives you full control over how and when your data is indexed.
Introduction
Push indexing in Azure AI Search is all about control. Unlike the pull approach described in the previous article, where data sources and indexers define how information is ingested, the push method has no concept of a data source or indexer at all. Instead, documents are sent directly to the index through APIs, giving you full flexibility to decide what, when, and how data is ingested. This makes push indexing ideal for scenarios that demand precision, customization, or integration with non‑Azure sources.
How Push Indexing Works

To see push indexing in action, let’s revisit the SpaceEntity concept introduced in the previous article. What we are building here is a solid foundation for index synchronization, assuming we keep our data in Azure Cosmos DB.
The solution leverages the Cosmos DB Change Feed in conjunction with an Azure Function App, which consumes changes as they appear in the feed. From there, we use the Azure.Search.Documents NuGet package, a client SDK that abstracts the raw REST API contract and exposes easy-to-use methods for pushing data into the index. This eliminates the need to manually shape HTTP requests.
Finally, you’ll notice the SpaceEntityIndexModel class, which directly maps to the index schema. This model ensures that the structure of your ingested documents aligns with the schema defined in Azure AI Search, making the pipeline both reliable and maintainable.
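To make the flow concrete, below is a minimal sketch of such a Change Feed trigger using the isolated worker model. The database, container, and connection setting names are assumptions for illustration, as is the shape of the SpaceEntity source document; the SpaceEntityIndexer class, which performs the actual push, is shown later in this article.

using Microsoft.Azure.Functions.Worker;
using Microsoft.Extensions.Logging;

namespace DeployedInAzure.AzureAiSearchSamples
{
    public class SpaceEntityChangeFeedFunction(
        SpaceEntityIndexer indexer,
        ILogger<SpaceEntityChangeFeedFunction> logger)
    {
        [Function("SpaceEntityChangeFeed")]
        public async Task RunAsync(
            [CosmosDBTrigger(
                databaseName: "space-db",          // assumed name
                containerName: "space-entities",   // assumed name
                Connection = "CosmosDbConnection", // assumed app setting
                LeaseContainerName = "leases",
                CreateLeaseContainerIfNotExists = true)] IReadOnlyList<SpaceEntity> changes,
            CancellationToken cancellationToken)
        {
            logger.LogInformation("Received {Count} changed documents from the Change Feed", changes.Count);

            // Map each changed source document to the index model and push the whole batch.
            var documents = changes.Select(entity => new SpaceEntityIndexModel
            {
                Id = entity.Id,
                Name = entity.Name,
                Type = entity.Type,
                DistanceInLightYears = entity.DistanceInLightYears
            });

            await indexer.UploadAllAsync(documents, cancellationToken);
        }
    }
}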
Registering the SearchClient and Pushing Data
First, we need to grant the Azure Function App’s managed identity the correct permissions. Assign the RBAC role Search Index Data Contributor to that identity at the scope of your Azure AI Search service. This ensures the Function App can securely add, update, and delete documents in the index. While access keys are also supported, modern solutions that follow security best practices should avoid them.
Once this role is configured, we can proceed with registering the SearchClient in the DI container.
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Configuration;
using Azure.Identity;
using Azure.Search.Documents;

namespace DeployedInAzure.AzureAiSearchSamples
{
    public static class ServiceCollectionExtensions
    {
        // C# 14 extension member syntax: members in this block extend IServiceCollection.
        extension(IServiceCollection services)
        {
            public IServiceCollection RegisterAzureAISearchClients(IConfiguration configuration)
            {
                // DefaultAzureCredential resolves the managed identity when running in Azure.
                var credential = new DefaultAzureCredential();
                var endpoint = new Uri(configuration["AiSearch:Uri"]);

                // Register the client under a key so the right instance can be injected per index.
                services.AddKeyedSingleton(
                    serviceKey: AzureAiSearchIndexes.SPACE_ENTITIES,
                    new SearchClient(endpoint, configuration["AiSearch:Indexes:SpaceEntities:IndexName"], credential));

                return services;
            }
        }
    }
}
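The AzureAiSearchIndexes type referenced above is not part of this snippet; a minimal version is just a constants class holding one well-known key per index. Below is an assumed shape, together with the startup wiring for an isolated worker Function App (your host setup may differ):

// AzureAiSearchIndexes.cs
namespace DeployedInAzure.AzureAiSearchSamples
{
    // Assumed shape: one service key per index the application works with.
    public static class AzureAiSearchIndexes
    {
        public const string SPACE_ENTITIES = "space-entities";
    }
}

// Program.cs (isolated worker): wire up the registration at startup.
using Microsoft.Extensions.Hosting;
using DeployedInAzure.AzureAiSearchSamples;

var host = new HostBuilder()
    .ConfigureFunctionsWorkerDefaults()
    .ConfigureServices((context, services) =>
        services.RegisterAzureAISearchClients(context.Configuration))
    .Build();

host.Run();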
Based on this code, it’s worth highlighting a few important points.
First, the use of DefaultAzureCredential ensures that once the Function App is deployed to Azure, it can automatically fetch its managed identity (system‑assigned or user‑assigned). That identity is then used to authenticate and authorize against the Azure AI Search service, leveraging the Search Index Data Contributor role. This approach avoids the need for access keys and aligns with best practices for security.
Second, notice the use of services.AddKeyedSingleton. Registering SearchClient as a singleton is important: it is the standard pattern when working with Azure SDK clients in C#, since the clients are thread-safe and designed to be created once and reused. By registering it as a keyed service, we gain the ability to inject a specific “keyed” instance of SearchClient wherever it’s needed.
This becomes especially valuable when working with multiple indexes. In the past, developers often had to rely on custom wrappers or factory methods to manage multiple clients. With keyed services, the process is much simpler and more explicit.
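For instance, adding a client for a second index keeps the registration symmetrical; each client simply gets its own key. The planets index below is hypothetical, purely to illustrate the pattern:

// Hypothetical second index, registered alongside the space entities client.
services.AddKeyedSingleton(
    serviceKey: AzureAiSearchIndexes.SPACE_ENTITIES,
    new SearchClient(endpoint, configuration["AiSearch:Indexes:SpaceEntities:IndexName"], credential));

services.AddKeyedSingleton(
    serviceKey: "planets", // assumed key and configuration path
    new SearchClient(endpoint, configuration["AiSearch:Indexes:Planets:IndexName"], credential));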
In the next snippet, we’ll see just how easy it is to inject a specific SearchClient instance into the service responsible for indexing data.
using Azure.Search.Documents;
using Azure.Search.Documents.Models;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;

namespace DeployedInAzure.AzureAiSearchSamples
{
    public class SpaceEntityIndexer(
        ILogger<SpaceEntityIndexer> logger,
        [FromKeyedServices(AzureAiSearchIndexes.SPACE_ENTITIES)] SearchClient searchClient)
    {
        public async Task UploadAllAsync(IEnumerable<SpaceEntityIndexModel> documents, CancellationToken cancellationToken)
        {
            var batch = IndexDocumentsBatch.Upload(documents);
            IndexDocumentsResult result = await searchClient.IndexDocumentsAsync(batch, options: null, cancellationToken);

            // Inspect per-document outcomes and log the ones that failed.
            foreach (var failedIndexingAttempt in result.Results.Where(r => !r.Succeeded))
            {
                logger.LogError("Failed to index document with key {Key}: {ErrorMessage}", failedIndexingAttempt.Key, failedIndexingAttempt.ErrorMessage);
            }
        }

        public async Task UploadAllAndThrowAsync(IEnumerable<SpaceEntityIndexModel> documents, CancellationToken cancellationToken)
        {
            var batch = IndexDocumentsBatch.Upload(documents);
            var options = new IndexDocumentsOptions
            {
                // Fail fast: throw if any document in the batch fails to index.
                ThrowOnAnyError = true
            };
            await searchClient.IndexDocumentsAsync(batch, options, cancellationToken);
        }
    }
}
In this code, note the use of FromKeyedServices in the primary constructor. It ensures that the class receives the exact SearchClient instance associated with the specified key, which is especially important when multiple clients are registered.
The class exposes two methods: UploadAllAsync and UploadAllAndThrowAsync.
UploadAllAndThrowAsync will throw an exception if at least one operation in the batch fails. This enforces a strict fail‑fast behavior, signaling that the indexing process did not fully succeed. However, it’s important to note that Azure AI Search does not provide transactional rollback, which means that documents successfully indexed before the failure remain in the index.

UploadAllAsync, on the other hand, does not set ThrowOnAnyError (so it defaults to false). In this case, the method will not throw an exception if some operations fail. Instead, the call returns a result object that allows you to inspect individual outcomes and determine which documents succeeded and which failed.
Choosing between these two approaches depends on the specific needs of your project:
- Use UploadAllAndThrowAsync when you need immediate failure signaling and cannot silently allow partial successes.
- Use UploadAllAsync when you want more flexibility and prefer to handle failures on a per‑document basis (a possible extension of this pattern is sketched below).
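As an illustration of the second approach, here is a hedged sketch of an additional method for SpaceEntityIndexer that collects the keys of failed documents so the caller can decide what to do with them. The method name and return type are assumptions for illustration, not part of the original sample.

// Hypothetical addition to SpaceEntityIndexer: report failed keys instead of just logging.
public async Task<IReadOnlyList<string>> UploadAllAndReportAsync(
    IEnumerable<SpaceEntityIndexModel> documents, CancellationToken cancellationToken)
{
    var batch = IndexDocumentsBatch.Upload(documents);
    IndexDocumentsResult result = await searchClient.IndexDocumentsAsync(batch, options: null, cancellationToken);

    // Collect the keys of documents that failed, e.g. to retry or dead-letter them.
    return result.Results
        .Where(r => !r.Succeeded)
        .Select(r => r.Key)
        .ToList();
}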
It is also worth mentioning that we used the Upload action, IndexDocumentsBatch.Upload(documents), which works like an upsert operation: the document is created if it does not exist, or fully replaced if it does. Other supported actions include the following (a short sketch follows the list):
- Merge – updates only the specified fields of an existing document; fails if the document does not exist.
- MergeOrUpload – behaves like merge if the document exists, or upload if it is new.
- Delete – removes the entire document from the index; to clear a single field, use merge and set that field to null.
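For completeness, here is a minimal sketch of what the other actions look like with the same SDK. The key values below are illustrative only, and documents refers to the same collection of index models as before:

// Merge: update only the provided fields of documents that already exist in the index.
var mergeBatch = IndexDocumentsBatch.Merge(documents);

// MergeOrUpload: merge when the document exists, upload when it is new.
var mergeOrUploadBatch = IndexDocumentsBatch.MergeOrUpload(documents);

// Delete: remove entire documents by their key field ("id" in our index schema).
var deleteBatch = IndexDocumentsBatch.Delete("id", new[] { "entity-1", "entity-2" });

// Each batch is pushed the same way as the Upload batch shown earlier.
await searchClient.IndexDocumentsAsync(mergeBatch, options: null, cancellationToken);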
To ensure everything functions correctly, we need to verify that the SpaceEntityIndexModel class maps properly to our index definition.
using System.Text.Json.Serialization;

namespace DeployedInAzure.AzureAiSearchSamples
{
    public record SpaceEntityIndexModel
    {
        public required string Id { get; init; }
        public required string Name { get; init; }
        public required string Type { get; init; }

        // Maps the C# property to the snake_case field name defined in the index schema.
        [JsonPropertyName("distance_light_years")]
        public required double DistanceInLightYears { get; init; }
    }
}
Below is a truncated version of the index definition.
{
  ...
  "name": "space-entities-index",
  "fields": [
    {
      "name": "id",
      "type": "Edm.String",
      "key": true,
      ...
    },
    {
      "name": "name",
      "type": "Edm.String",
      "key": false,
      ...
    },
    {
      "name": "type",
      "type": "Edm.String",
      "key": false,
      ...
    },
    {
      "name": "distance_light_years",
      "type": "Edm.Double",
      "key": false,
      ...
    }
  ],
  ...
}

Please note that exactly one field has to be marked as the key (identifier) using "key": true. In our case it is the Id of the SpaceEntityIndexModel.
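If you prefer to manage the index definition from code instead of the portal, the same schema can be created with SearchIndexClient from the same SDK. Below is a minimal sketch; per-field capabilities such as searchable or filterable are omitted for brevity, the endpoint variable is assumed to be configured as in the registration code, and note that creating or updating an index requires control-plane permissions (for example, the Search Service Contributor role) beyond Search Index Data Contributor:

using Azure.Identity;
using Azure.Search.Documents.Indexes;
using Azure.Search.Documents.Indexes.Models;

var indexClient = new SearchIndexClient(endpoint, new DefaultAzureCredential());

var index = new SearchIndex("space-entities-index")
{
    Fields =
    {
        // Exactly one field must be marked as the key.
        new SimpleField("id", SearchFieldDataType.String) { IsKey = true },
        new SearchableField("name"),
        new SearchableField("type"),
        new SimpleField("distance_light_years", SearchFieldDataType.Double)
    }
};

// Creates the index if it does not exist, or updates the existing definition.
await indexClient.CreateOrUpdateIndexAsync(index);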
Summary
This article explored push indexing in Azure AI Search, a method that gives you full control over when and how data is ingested. Unlike the pull approach, push indexing sends documents directly to the index via APIs, making it ideal for custom scenarios.
We demonstrated this with the Cosmos DB Change Feed and an Azure Function App, using the Azure.Search.Documents SDK and a SpaceEntityIndexModel to ensure schema alignment. Key practices included secure authentication with DefaultAzureCredential and registering SearchClient as a keyed singleton for easy multi‑index support.
Finally, the SpaceEntityIndexer demonstrated two strategies:
- UploadAllAndThrowAsync – enforces fail‑fast behavior by throwing if any document fails, though documents indexed before the failure remain in the index.
- UploadAllAsync – provides flexible handling by inspecting per‑document results, allowing you to log or otherwise handle successes and failures individually.
At this point we know enough about both the pull and the push approach, so it’s the right time to compare them in the next post. See you there!
