Introduction
I encourage you to read these posts first to get the most out of the information provided here:
- Getting Started with Azure AI Search: Features, Benefits, and Basics Explained
- Effective Data Indexing in Azure AI Search: Data Sources and Indexers (Pull Approach)
- Mastering Data Indexing in Azure AI Search: Take Full Control (Push Approach)
We have already discussed the basics of Azure AI Search, followed by how to leverage data sources and indexers for automated data ingestion using the pull approach. In another post, we explored the push approach, where the responsibility for indexing the data lies entirely on our side. With that foundational knowledge of the indexing strategies, we are now ready to compare these two approaches and address the key question that often arises in the early stages of working with Azure AI Search: when should you use the pull approach, and when the push approach?
When to use the Pull approach
The fact that, thanks to data sources and indexers, you can easily index your data and make it searchable is very appealing. In addition, you can apply various powerful skillsets using Azure AI Services such as Azure Vision, Azure AI Document Intelligence, AI Translator, and others to enrich the data. Let’s now consider the most suitable use cases for choosing this approach.
Pull indexing is best suited for:
- Rapid prototyping or proof‑of‑concept projects.
- Ingesting data from Azure‑native sources with minimal effort.
- Teams that prefer automation over custom development.
- Scenarios where scheduled refreshes are sufficient.
- Important: Indexers can be scheduled, but the minimum refresh interval is 5 minutes. If this is not acceptable in your project, push indexing or a hybrid approach is your only option!
- If your solution requires AI enrichment (skillsets) or integrated vectorization, you must use the pull model. Skillsets are bound to indexers and cannot run independently, so push indexing is not supported in these scenarios.
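To make the scheduling limit concrete, here is a minimal sketch of how an indexer definition with a schedule could be assembled as a REST payload. The helper names (`to_iso8601_duration`, `build_indexer_definition`) and the resource names in the usage note are hypothetical; the `schedule.interval` property takes an ISO 8601 duration, and the sketch assumes the allowed range is 5 minutes to 1 day.

```python
from datetime import timedelta

MIN_INTERVAL = timedelta(minutes=5)  # shortest interval an indexer schedule accepts
MAX_INTERVAL = timedelta(days=1)     # assumed upper bound for an indexer schedule


def to_iso8601_duration(interval: timedelta) -> str:
    """Format a timedelta as an ISO 8601 duration such as 'PT5M' or 'P1D'."""
    total = int(interval.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    date_part = f"{days}D" if days else ""
    time_part = "".join(
        f"{value}{unit}"
        for value, unit in ((hours, "H"), (minutes, "M"), (seconds, "S"))
        if value
    )
    return "P" + date_part + ("T" + time_part if time_part else "")


def build_indexer_definition(name: str, data_source: str, index: str,
                             interval: timedelta) -> dict:
    """Build the JSON body of an indexer that runs on a fixed schedule."""
    if not MIN_INTERVAL <= interval <= MAX_INTERVAL:
        raise ValueError("indexer schedule must be between 5 minutes and 1 day")
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "schedule": {"interval": to_iso8601_duration(interval)},
    }
```

For example, `build_indexer_definition("celestial-indexer", "celestial-cosmos", "celestial-index", timedelta(minutes=5))` produces a schedule interval of `PT5M`, while asking for 30 seconds raises a `ValueError`, which is exactly the situation where push or hybrid indexing comes into play.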
When to use the Push approach
What I like about the push approach is the control it gives us (and as developers, we do like having control, don’t we?). Instead of relying on data sources and indexers to handle ingestion automatically, we take full responsibility for deciding what gets indexed and when. This lets us tailor the process to complex scenarios, custom pipelines, or data that doesn’t fit predefined data sources. It also allows us to integrate indexing directly into our applications, ensuring updates happen exactly as we intend. With that in mind, let’s look at the situations where the push approach makes the most sense.
Push indexing works best in these scenarios:
- The minimum 5‑minute refresh interval is not acceptable in your project.
- Data originates outside Azure (e.g., CRM, ERP, third‑party APIs) and there is no built‑in data source you could simply use.
- You need per‑document control with actions like Upload, Merge, MergeOrUpload, and Delete.
- Complex transformations or enrichment are required before indexing.
- Integration with CI/CD pipelines or event‑driven workflows is needed.
- Important: Please remember that if you decide to rely fully on the push approach, you need to create a process that allows you to sync all data to the index yourself. In most cases, you will be updating individual index documents, but having the ability to index all data on demand is essential for enterprise solutions.
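The per‑document actions and the "index everything on demand" requirement can be sketched together. The snippet below only builds request bodies for the documents indexing REST endpoint; it does not send them. The function names are hypothetical, the choice of `mergeOrUpload` for a full sync is an assumption (it creates missing documents and updates existing ones), and the batch size of 1000 reflects the documented per‑request limit as I understand it, so please verify it for your service tier.

```python
from typing import Iterable, Iterator

# Action names as they appear in the "@search.action" property of a request.
VALID_ACTIONS = {"upload", "merge", "mergeOrUpload", "delete"}
MAX_BATCH_SIZE = 1000  # assumed per-request limit on the number of actions


def make_action(action: str, document: dict) -> dict:
    """Wrap a document in a single index action."""
    if action not in VALID_ACTIONS:
        raise ValueError(f"unsupported action: {action}")
    return {"@search.action": action, **document}


def full_sync_batches(documents: Iterable[dict],
                      batch_size: int = MAX_BATCH_SIZE) -> Iterator[dict]:
    """Yield request bodies that re-index every document, batch by batch."""
    batch = []
    for document in documents:
        batch.append(make_action("mergeOrUpload", document))
        if len(batch) == batch_size:
            yield {"value": batch}
            batch = []
    if batch:  # flush the final, possibly smaller batch
        yield {"value": batch}
```

Because `full_sync_batches` is a generator, it can stream documents from any source (a database cursor, a paged API) without loading everything into memory, which matters when the full data set is large.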
Hybrid strategies
I can imagine scenarios where readers feel uncertain because there are good reasons to leverage the pull approach, but a constraint such as the 5‑minute minimum refresh interval quickly becomes a limitation. Dear reader, all is not lost. We can take the best of both approaches and combine them!
Imagine we’re storing celestial objects in Azure Cosmos DB with fields like name, type, description and distance_light_years.
{
  "id": "1",
  "name": "Sirius",
  "type": "Star",
  "description": "Sirius is the brightest star in the night sky, located in the constellation Canis Major.",
  "distance_light_years": 8.6,
  "_rid": "9oRCANrQQLABAAAAAAAAAA==",
  "_self": "dbs/9oRCAA==/colls/9oRCANrQQLA=/docs/9oRCANrQQLABAAAAAAAAAA==/",
  "_etag": "\"00005800-0000-5600-0000-693d14210000\"",
  "_attachments": "attachments/",
  "_ts": 1765610529
}
Now, for the sake of this example, let’s imagine that we need to fulfill two business requirements:
- The description field must be translated into a few languages, but immediate updates are not mandatory. This data can be refreshed as rarely as every 24 hours.
- The business also requires that changes to the priority_level property be reflected immediately, defined as within 30 seconds. The priority_level field may have two states, and an event is published to the Azure Service Bus queue priority_level_changed whenever its value changes:
  - normal – default state for most objects.
  - critical_observation – urgent cases requiring immediate attention.
Below is an example of an architecture that fulfills both of these requirements:

Let’s first focus on the upper part of the diagram. Here we see an Azure Function App that consumes messages from a specific Azure Service Bus queue (using a built‑in binding, for example) and then invokes the Merge operation to update a single field, priority_level. Keep in mind that the Merge operation assumes the document already exists; if it does not, the operation fails for that document. Edge cases like this, for example an event arriving before the indexer has ingested the object for the first time, should be considered when implementing the final solution.
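As a rough illustration of the push half, the sketch below turns a queue message into a Merge request and reports documents that were not found. Several things here are assumptions rather than facts from this article: the message body is taken to be JSON with `id` and `priority_level` properties, `id` is taken to be the index key, the API version is just one that I know exists, and the function names are hypothetical. The Service Bus trigger itself is omitted; only the body of the handler is shown, using the standard library instead of an SDK.

```python
import json
import urllib.request
from typing import List

API_VERSION = "2024-07-01"  # assumption: use the version your service supports


def build_merge_payload(message_body: str) -> dict:
    """Build a Merge request that touches only the priority_level field."""
    event = json.loads(message_body)  # assumed shape: {"id": ..., "priority_level": ...}
    return {
        "value": [
            {
                "@search.action": "merge",
                "id": event["id"],
                "priority_level": event["priority_level"],
            }
        ]
    }


def missing_documents(response_body: dict) -> List[str]:
    """Return keys whose merge failed because the document was not found."""
    return [
        result["key"]
        for result in response_body.get("value", [])
        if result.get("statusCode") == 404
    ]


def push_priority_change(service: str, index: str, api_key: str,
                         message_body: str) -> List[str]:
    """Send the merge to the index and return the keys that were not found."""
    url = (f"https://{service}.search.windows.net/indexes/{index}"
           f"/docs/index?api-version={API_VERSION}")
    request = urllib.request.Request(
        url,
        data=json.dumps(build_merge_payload(message_body)).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return missing_documents(json.load(response))
```

If `push_priority_change` returns a non‑empty list, one option is to let the message be retried later or to fall back to a MergeOrUpload with the full document; which is right depends on whether a partially populated document is acceptable in your index.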
In the lower part of the diagram, we see an indexer that uses Azure Cosmos DB as a data source and points to our container with celestial objects. The skillset attached to the indexer contains the Microsoft.Skills.Text.TranslationSkill skill, which can translate the description field into multiple languages using the Azure Translator service behind the scenes. The schedule configured in the indexer definition (an ISO 8601 interval such as PT24H, rather than a CRON expression) ensures the indexer runs every 24 hours, as required by the business.
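For the pull half, a skillset for the translation requirement might look roughly like the sketch below. It assumes one TranslationSkill instance per target language, hypothetical index fields named `description_<language>`, and hypothetical helper names. The billing resource that a skillset normally needs for Azure AI services is deliberately left out, so treat this as a fragment rather than a complete definition.

```python
from typing import List


def _field_name(language: str) -> str:
    """Derive an index-safe field name, e.g. 'pt-br' -> 'description_pt_br'."""
    return "description_" + language.replace("-", "_").lower()


def build_translation_skillset(name: str, languages: List[str]) -> dict:
    """Build a skillset body with one translation skill per target language."""
    skills = [
        {
            "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
            "name": f"translate-description-{language}",
            "context": "/document",
            "defaultToLanguageCode": language,
            "inputs": [{"name": "text", "source": "/document/description"}],
            "outputs": [
                {"name": "translatedText", "targetName": _field_name(language)}
            ],
        }
        for language in languages
    ]
    return {"name": name, "skills": skills}


def build_output_field_mappings(languages: List[str]) -> List[dict]:
    """Map each enriched node to the index field of the same name."""
    return [
        {
            "sourceFieldName": f"/document/{_field_name(language)}",
            "targetFieldName": _field_name(language),
        }
        for language in languages
    ]
```

The result of `build_output_field_mappings` would go into the indexer's `outputFieldMappings` property, so that each translated description ends up in its own searchable field.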
ℹ️ Please remember that you can define additional indexers and data sources that point to the same index.
Summary
This article explained the pull, push, and hybrid approaches in Azure AI Search. The pull model works well when scheduled updates and automated AI enrichment are enough, while the push model is better when you need full control and immediate changes. Each approach has clear strengths, but in many projects the most effective solution is to combine them. Pull indexers can handle stable data and AI enrichment tasks, while push merges ensure that urgent updates, such as changes to the priority_level field, are applied right away.
In the end, hybrid indexing gives you both reliability and speed: scheduled updates for predictable data, and real‑time changes when fast reactions are required.
