Introduction
Hey everyone!
You can find all the C# code samples here: MicrosoftAgentFramework GitHub Repo
If you are looking for information to deepen your expertise about RAG in Microsoft Agent Framework, I am sure you will find this post helpful.
We will go through the basics, discussing the AIContextProvider class and TextSearchProvider class. We will cover different TextSearchBehavior options like BeforeAIInvoke and OnDemandFunctionCalling, and we will also analyze which approach to use: an ordinary tool call using the AIFunctionFactory.Create method or an AIContextProvider.
We will go beyond the basics so that you will know how to connect such an AIContextProvider with the chat history and what customization options are available. All of that will help you master RAG in Microsoft Agent Framework.
It might be helpful to read these posts first:
- Microsoft Agent Framework Tutorial: Get Started with AI Agents in .NET
- Chat History in Microsoft Agent Framework: Moving to Azure Production Storage
- Chat History in Microsoft Agent Framework: Service-Managed Chat History
Let’s begin.
Let’s provide the context!
The essence of RAG in the Microsoft Agent Framework (and in general) is providing additional context… so let’s provide it!
In the SimpleAiContextProviderExample, you can see such logic in the constructor:
public SimpleAiContextProviderExample()
{
    // trimmed for brevity
    _mafAgent = new ChatClientAgent(responsesClient, new ChatClientAgentOptions()
    {
        Name = "Simple Context Provider Test",
        ChatOptions = new ChatOptions
        {
            Instructions = "You are a helpful assistant."
        },
        AIContextProviders = [new SimpleKeywordContextProvider()]
    });
}

As you can see, I am passing SimpleKeywordContextProvider to the AIContextProviders collection when creating a new ChatClientAgent. So, the first thing to remember is that you can provide many AIContextProviders, not just a single one.
Let’s see what such a simple provider looks like:
using System.Text;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

public class SimpleKeywordContextProvider : AIContextProvider
{
    protected override ValueTask<AIContext> ProvideAIContextAsync(
        InvokingContext context,
        CancellationToken cancellationToken = default)
    {
        // Use the text of the most recent user message as the search input.
        var lastUserMessage = context.AIContext.Messages?
            .LastOrDefault(m => m.Role == ChatRole.User)?.Text ?? string.Empty;
        var customContext = new AIContext();
        var contextBuilder = new StringBuilder();
        // Primitive keyword search; a real implementation would query a search index.
        if (lastUserMessage.Contains("remote", StringComparison.OrdinalIgnoreCase) ||
            lastUserMessage.Contains("work", StringComparison.OrdinalIgnoreCase))
        {
            contextBuilder.AppendLine("Source: Corporate Remote Work Policy 2026\n" +
                "Content: The 2026 remote work policy allows employees to work from anywhere within Poland for up to 3 days per week. Mondays are mandatory office days.");
        }
        if (contextBuilder.Length > 0)
        {
            Console.WriteLine(" -> Matching context discovered! Injecting grounding data.");
            customContext.Messages = [
                new ChatMessage(ChatRole.User, $"Use this background data to accurately answer the user:\n{contextBuilder}")
            ];
        }
        else
        {
            Console.WriteLine(" -> No matching keywords found. Proceeding with base prompt.");
        }
        return ValueTask.FromResult(customContext);
    }
}

You can see that we inherit from the AIContextProvider base class and override the ProvideAIContextAsync method, which receives an InvokingContext (which consists of AIAgent, AgentSession, and AIContext) and returns an AIContext.
I use a primitive search method here, but in reality, you will be using your intelligent search (such as Azure AI Search, Azure Cosmos DB, or others like Neo4j, Qdrant, etc.). I perform a search operation using the last user message here, but of course, you can customize it however you want. We will get back to it later.
The most important piece of this logic is:
if (contextBuilder.Length > 0)
{
    Console.WriteLine(" -> Matching context discovered! Injecting grounding data.");
    customContext.Messages = [
        new ChatMessage(ChatRole.User, $"Use this background data to accurately answer the user:\n{contextBuilder}")
    ];
}

where I am adding the context to the AIContext by simply creating a ChatMessage (Microsoft.Extensions.AI) with a User role and the appropriate text. Please also remember that AIContext consists of Tools, Instructions, and Messages, but in this particular example, I pass just Messages to enrich the context.
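One thing to keep in mind about the keyword check in the provider above: Contains does substring matching, so "work" also fires on words like "network". Below is a minimal sketch of a stricter whole-word check. MatchesAnyKeyword and policyKeywords are hypothetical names used only for illustration; they are not part of the framework or the sample repo.

```csharp
using System;
using System.Linq;

// Hypothetical helper: true when any keyword appears as a whole word in the message.
bool MatchesAnyKeyword(string message, string[] keywords)
{
    // Replace punctuation with spaces so "work?" is tokenized as "work".
    var normalized = new string(message.Select(c => char.IsLetterOrDigit(c) ? c : ' ').ToArray());
    var words = normalized.Split(' ', StringSplitOptions.RemoveEmptyEntries);
    return words.Any(w => keywords.Contains(w, StringComparer.OrdinalIgnoreCase));
}

var policyKeywords = new[] { "remote", "work" };

Console.WriteLine(MatchesAnyKeyword("Can I work remotely on Monday?", policyKeywords)); // True ("work")
Console.WriteLine(MatchesAnyKeyword("Is the network down?", policyKeywords));           // False
```

The trade-off is that whole-word matching no longer catches "remotely" through "remote", which is one more reason to move to a real search backend as soon as the keyword list grows.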
Now, let’s take a deep dive into how AIContextProvider works behind the scenes, including how it merges various AIContext objects when multiple providers are registered to support RAG in the Microsoft Agent Framework.
The 4 Lifecycle Hooks in AIContextProvider
In the KeywordSearchAiContextProviderExample, I register CustomKeywordSearchContextProvider as the only AIContextProvider. Here, I would like to focus on the four methods that we can override when using AIContextProvider, including the ProvideAIContextAsync method you already know:
public class CustomKeywordSearchContextProvider : AIContextProvider
{
    public CustomKeywordSearchContextProvider() : base(
        provideInputMessageFilter: null,
        storeInputRequestMessageFilter: null,
        storeInputResponseMessageFilter: null) { }

    protected override async ValueTask<AIContext> InvokingCoreAsync(InvokingContext context, CancellationToken cancellationToken = default)
    {
        // trimmed for brevity
    }

    protected override ValueTask<AIContext> ProvideAIContextAsync(InvokingContext context, CancellationToken cancellationToken = default)
    {
        // trimmed for brevity
    }

    protected override async ValueTask InvokedCoreAsync(InvokedContext context, CancellationToken cancellationToken = default)
    {
        // trimmed for brevity
    }

    protected override ValueTask StoreAIContextAsync(InvokedContext context, CancellationToken cancellationToken = default)
    {
        // trimmed for brevity
    }
}

I trimmed the content of these methods because ProvideAIContextAsync looks the same as in the previous example, and for the other three methods, I just added logging and invoked the base implementation. Let’s run the agent and analyze the console output:
You: remote work on Monday
[1. InvokingCoreAsync] - Pipeline orchestration wrapper triggered.
[2. ProvideAIContextAsync] - Custom business lookup logic processing...
-> Match successfully discovered. Injecting text as transient prompt context.
[3. InvokedCoreAsync] - Post-turn pipeline interceptor starting...
[4. StoreAIContextAsync] - Turn completed. Evaluating outcome state...
-> Session step completed. Succeeded processing 1 response messages.
Agent: On Monday, remote work is not allowed under the 2026 policy.
Policy summary:
- Remote work is allowed from anywhere within Poland
- Up to 3 days per week
- Mondays are mandatory office days
So for Monday, you need to work from the office.

You can clearly see the exact order of execution, and if you read my previous post about the chat history provider, you will find many similarities here. We can easily split this execution pipeline into two distinct phases: before the LLM call and after the LLM call. The first two methods are executed before the model runs, while the other two are invoked immediately after the LLM returns its response.
But the most powerful similarity lies directly in the base constructor filters: provideInputMessageFilter, storeInputRequestMessageFilter, and storeInputResponseMessageFilter. Just like the built-in filters found in the chat history providers, these predicates give you control over the pipeline message stream:
- provideInputMessageFilter: Controls which historical messages from the active session are allowed to pass into this specific provider’s ProvideAIContextAsync method. If you leave this as null, the framework defaults to an external-only filter (AgentRequestMessageSourceType.External). This means that by default, your provider will only see the current turn’s fresh user input, completely blinding it to past chat history unless you explicitly change it (I will change it later to include the chat history as well).
- storeInputRequestMessageFilter & storeInputResponseMessageFilter: Define which input requests and output response messages are tracked or preserved by the provider’s storage lifecycle hooks. Similar to the input filter, the request storage filter also defaults to an external-only filter if left unconfigured, while the response filter acts as a no-op, allowing all assistant outputs through.
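To make the effect of the default filter more concrete, here is a minimal, framework-free sketch. The tuple (Source, Role, Text) is only a stand-in for a chat message, the source labels are plain strings rather than the real AgentRequestMessageSourceType values, and VisibleTexts is a hypothetical helper. The actual filter delegates in the framework may have a different shape.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Stand-in for the messages of one turn: (Source, Role, Text). Not the framework's ChatMessage type.
var session = new List<(string Source, string Role, string Text)>
{
    ("ChatHistory", "user", "Can I work remotely?"),
    ("ChatHistory", "assistant", "Yes, up to 3 days per week."),
    ("External", "user", "What about Mondays?"),
};

// Hypothetical helper: applies a filter and returns the texts a provider would get to see.
List<string> VisibleTexts(Func<(string Source, string Role, string Text), bool> filter) =>
    session.Where(filter).Select(m => m.Text).ToList();

// Behaviour described above for a null filter: only the fresh input of the current turn.
var byDefault = VisibleTexts(m => m.Source == "External");

// A more permissive filter that also lets past chat history through.
var withHistory = VisibleTexts(m => m.Source == "External" || m.Source == "ChatHistory");

Console.WriteLine($"Default filter sees {byDefault.Count} message(s).");       // 1
Console.WriteLine($"History-aware filter sees {withHistory.Count} message(s)."); // 3
```

With the default, a follow-up question like "What about Mondays?" reaches the provider without the earlier turns that give it meaning, which is why widening the filter matters for multi-turn RAG.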
Understanding the 4 Methods
Important Note: In the vast majority of RAG scenarios, you do not need to override anything other than ProvideAIContextAsync. The framework provides robust base implementations for the other hooks, meaning you should only touch them if you need advanced out-of-band telemetry, custom caching, or complex pipeline modifications.
Here is what each of these four methods does behind the scenes, including how they call into one another:

InvokingCoreAsync
This method serves as the entry point of the provider’s setup sequence before the LLM is called, merging all registered AIContext objects by combining their Tools, Instructions, and Messages. Before doing anything else, it runs the session history through your provideInputMessageFilter to decide exactly what data your provider can see. It then calls ProvideAIContextAsync with those filtered messages, takes the resulting grounding messages, and stamps each one with AgentRequestMessageSourceType.AIContextProvider and your class name for proper pipeline attribution.
ProvideAIContextAsync
This is the core lifecycle method where you implement your custom business lookup logic, such as executing keyword matching or querying a vector database like Azure AI Search. It returns a transient AIContext containing the specific grounding data that will be appended to the current turn’s prompt.

InvokedCoreAsync
This method triggers immediately after the LLM finishes generating its response but before that output is handed back to the user. It acts as a post-turn interceptor, first checking if the turn failed, and then running the accumulated request and response messages through your storeInputRequestMessageFilter and storeInputResponseMessageFilter. Once it isolates exactly which messages are allowed through the filters, it invokes StoreAIContextAsync with those filtered results.
StoreAIContextAsync
This is the final lifecycle hook executed at the very end of a conversational turn to handle state persistence. It receives the already filtered collections directly from InvokedCoreAsync, allowing you to save runtime metadata, audit logs, or updated session state parameters back into your persistent storage layer.
In this post, we will skip that part completely because I focus entirely on context injection and not on saving any data.
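To recap the call order described above, here is a simplified, framework-free sketch of how the four hooks chain together. Every function is a stand-in that only borrows the name of the real method, messages are reduced to (Source, Text) tuples, and skipping the store step on a failed turn is my assumption about what "checking if the turn failed" leads to.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var log = new List<string>();

// Stand-in for ProvideAIContextAsync: the custom lookup logic.
List<string> ProvideContext(List<string> visibleTexts)
{
    log.Add("2. ProvideAIContextAsync");
    var grounding = new List<string>();
    if (visibleTexts.Any(t => t.Contains("remote", StringComparison.OrdinalIgnoreCase)))
    {
        grounding.Add("Mondays are mandatory office days.");
    }
    return grounding;
}

// Stand-in for InvokingCoreAsync: filter the input, call the hook, stamp the results.
List<(string Source, string Text)> InvokingCore(List<(string Source, string Text)> request)
{
    log.Add("1. InvokingCoreAsync");
    var visible = request.Where(m => m.Source == "External").Select(m => m.Text).ToList();
    return ProvideContext(visible)
        .Select(text => (Source: "AIContextProvider", Text: text))
        .ToList();
}

// Stand-in for StoreAIContextAsync: receives the already filtered messages.
void StoreContext(List<string> requestTexts, List<string> responseTexts)
{
    log.Add($"4. StoreAIContextAsync ({responseTexts.Count} response message(s))");
}

// Stand-in for InvokedCoreAsync: check for failure, filter, then hand over to the store hook.
void InvokedCore(List<(string Source, string Text)> request, List<string> responses, bool failed)
{
    log.Add("3. InvokedCoreAsync");
    if (failed) return; // assumption: nothing is stored for a failed turn
    var requestTexts = request.Where(m => m.Source == "External").Select(m => m.Text).ToList();
    StoreContext(requestTexts, responses);
}

var turnRequest = new List<(string Source, string Text)> { ("External", "remote work on Monday") };
var injected = InvokingCore(turnRequest);
// The model call would happen here, with the injected messages added to the prompt.
var turnResponses = new List<string> { "On Monday, remote work is not allowed." };
InvokedCore(turnRequest, turnResponses, failed: false);

foreach (var line in log) Console.WriteLine(line);
```

The printed order mirrors the console output shown earlier: two hooks before the model call and two after it.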
TextSearchProvider deep dive
There is a class that you can use out of the box from the Microsoft.Agents.AI namespace to inject additional context, and that class is TextSearchProvider. Let’s see how it works:
public TextSearchProviderExample()
{
    _mafAgent = new ChatClientAgent(responsesClient, new ChatClientAgentOptions()
    {
        // trimmed for brevity
        AIContextProviders = [
            new TextSearchProvider(SearchMethodAsync, new TextSearchProviderOptions
            {
                SearchTime = TextSearchProviderOptions.TextSearchBehavior.BeforeAIInvoke,
                RecentMessageMemoryLimit = 2,
                FunctionToolDescription = "This tool contains information about work policies and benefits in various offices and locations."
            })
        ]
    });
}

TextSearchResults and the Prompt
private Task<IEnumerable<TextSearchProvider.TextSearchResult>> SearchMethodAsync(string query, CancellationToken cancellationToken)
{
    List<TextSearchProvider.TextSearchResult> results = [];
    if (query.Contains("remote", StringComparison.OrdinalIgnoreCase))
    {
        results.Add(new()
        {
            RawRepresentation = null,
            SourceName = "Corporate Remote Work Policy 2026",
            SourceLink = "https://internal.company.com/policies/remote-2026",
            Text = "The 2026 remote work policy allows employees to work from anywhere within Poland for up to 3 days per week. Mondays are mandatory office days."
        });
    }
    return Task.FromResult<IEnumerable<TextSearchProvider.TextSearchResult>>(results);
}

When you use TextSearchProvider, your search delegate (like SearchMethodAsync in my example) must return a collection of TextSearchResult objects. Each result contains properties like Text, SourceName, SourceLink, and RawRepresentation. Behind the scenes, the framework formats these results into a text block and appends them directly to the prompt as a standard ChatMessage with a User role so the model can use them for grounding.
Customization Options
You have plenty of flexibility to control how this context behaves by using TextSearchProviderOptions. You can configure options like ContextPrompt to prepend custom instructions right before the search results, or CitationsPrompt to enforce strict citation rules. If you want full control over the final string layout, you can assign a custom delegate to the ContextFormatter property, which overrides the default formatting entirely. This is how it looks with the default setting if there is a single match found:
## Additional Context
Consider the following information from source documents when responding to the user:
SourceDocName: Corporate Remote Work Policy 2026
SourceDocLink: https://internal.company.com/policies/remote-2026
Contents: The 2026 remote work policy allows employees to work from anywhere within Poland for up to 3 days per week. Mondays are mandatory office days.
----
Include citations to the source document with document name and link if document name and link is available.
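If the default layout above does not fit your prompt style, a custom formatter can produce any string you like. Here is a minimal sketch of a compact, numbered layout. FormatAsNumberedSources is a hypothetical function and the tuple (SourceName, SourceLink, Text) is a stand-in for TextSearchResult; the exact delegate signature that ContextFormatter expects is not shown in this post, so check it before wiring something like this in.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical formatter: renders search results as a compact, numbered source list.
string FormatAsNumberedSources(IReadOnlyList<(string SourceName, string SourceLink, string Text)> results)
{
    if (results.Count == 0) return string.Empty;

    var sb = new StringBuilder("Answer using only the numbered sources below and cite them as [n].\n");
    for (var i = 0; i < results.Count; i++)
    {
        var (name, link, text) = results[i];
        sb.Append($"[{i + 1}] {name} ({link}): {text}\n");
    }
    return sb.ToString();
}

var sampleResults = new List<(string SourceName, string SourceLink, string Text)>
{
    ("Corporate Remote Work Policy 2026",
     "https://internal.company.com/policies/remote-2026",
     "Remote work is allowed for up to 3 days per week. Mondays are mandatory office days."),
};

Console.Write(FormatAsNumberedSources(sampleResults));
```

A numbered layout like this keeps the injected block short and gives the model a simple handle ([1], [2], ...) to cite, at the cost of dropping the more verbose default headings.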
BeforeAIInvoke vs. OnDemandFunctionCalling
The choice of SearchTime completely changes how the query is shaped and executed:
- BeforeAIInvoke: The framework automatically triggers the search before every single LLM call. It evaluates the current context using a rolling conversation window controlled by RecentMessageMemoryLimit.
- OnDemandFunctionCalling: The search provider stops running automatically and turns into an explicit tool call. The framework exposes it to the agent as a functional tool, using the text you provided in FunctionToolDescription to advertise it to the model.

The query is shaped differently here (What is the policy for remote work on Monday?) because the LLM generates the search arguments dynamically based on its understanding of the user’s intent, instead of just grabbing the text from the last few messages.
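For contrast, here is a rough sketch of how a BeforeAIInvoke-style query could be assembled from a rolling window. BuildSearchQuery is a hypothetical helper, and joining the messages with newlines is my assumption; the framework's actual concatenation logic may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical query builder: joins the most recent remembered messages with the new input.
string BuildSearchQuery(IEnumerable<string> rememberedMessages, string currentInput, int recentMessageMemoryLimit)
{
    var recent = rememberedMessages.TakeLast(recentMessageMemoryLimit);
    return string.Join("\n", recent.Append(currentInput));
}

var remembered = new List<string>
{
    "Hi",
    "Tell me about the Krakow office",
    "Can I work remotely?",
};

// With a limit of 2, only the two most recent remembered messages are included.
Console.WriteLine(BuildSearchQuery(remembered, "What about Mondays?", 2));
```

The resulting query is raw conversation text, not a well-formed question. That is the key difference from OnDemandFunctionCalling, where the model writes the search argument itself.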
Pitfalls to Watch Out For
While TextSearchProvider simplifies a lot of plumbing, I ran into two architectural limitations that can easily catch you off guard when moving toward a production setup.
1. The Single Parameter Constraint
The underlying search delegate wrapped by the out-of-the-box TextSearchProvider is built to accept only a single string parameter (the text query). If you want to design a tool with multiple parameters, such as query, country, and officeCity, because you need to apply filtering inside Azure AI Search (for example: filter="Country eq 'Poland' and City eq 'Krakow'"), you cannot do it using this built-in provider. To handle multi-parameter structured queries, you will need to bypass TextSearchProvider and either implement your own custom class inheriting from AIContextProvider to expose those complex tools, or register them as standalone tools using the AIFunctionFactory.Create method.
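As an illustration of what such a multi-parameter tool has to do internally, here is a minimal sketch that turns the optional country and officeCity arguments into an OData filter string. BuildFilter and ODataLiteral are hypothetical helpers, and the Country and City field names are taken from the example above; your index schema will likely differ.

```csharp
using System;
using System.Collections.Generic;

// Wraps a value as an OData string literal; embedded single quotes are doubled.
string ODataLiteral(string value) => "'" + value.Replace("'", "''") + "'";

// Hypothetical helper: builds a filter expression from the optional tool parameters.
string BuildFilter(string country, string officeCity)
{
    var clauses = new List<string>();
    if (!string.IsNullOrWhiteSpace(country)) clauses.Add($"Country eq {ODataLiteral(country)}");
    if (!string.IsNullOrWhiteSpace(officeCity)) clauses.Add($"City eq {ODataLiteral(officeCity)}");
    return string.Join(" and ", clauses);
}

Console.WriteLine(BuildFilter("Poland", "Krakow")); // Country eq 'Poland' and City eq 'Krakow'
Console.WriteLine(BuildFilter("Poland", ""));       // Country eq 'Poland'
```

A search function with the signature (query, country, officeCity) could call a helper like this and pass the result to the search client together with the text query. Escaping the values matters here because the arguments are generated by the model.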
2. The Empty StateBag Dilemma in BeforeAIInvoke
When you use BeforeAIInvoke, the provider relies on reading past message context from the session state layer. However, when a new HTTP request arrives at your API, this state bag will be completely empty unless you serialize and deserialize the entire AgentSession object across requests. If you are only saving the output of your chat history provider to an external database, the TextSearchProvider will have no access to it, blinding its rolling message window on subsequent turns. You can check the HistoryReadingAiContextProviderExample in the repo to see how to solve that problem.
Choosing the Right Pattern
Now you may wonder, which approach should you use? There are 3 distinct approaches available:
- AIContextProvider.ProvideAIContextAsync, which provides a User chat message within the AIContext.
- AIContextProvider.ProvideAIContextAsync, which returns an executable tool within the AIContext class.
- Ordinary standalone function tool calling (tools: [AIFunctionFactory.Create(YourSearchMethod)]).
My way of thinking is the following: if you want to gain easy access to the 4 lifecycle hooks that AIContextProvider provides, you should go that route.
Do you need to ensure that the context is injected all the time, no matter what? Then choose the AIContextProvider with a User chat message approach. If you don’t have such a requirement and want to delegate the decision of invoking a given tool to the LLM, and let the model formulate the question or fill multiple parameters for your search function, then wrapping it in a provider as an on-demand tool is the right way to go.
If you want to start quickly, you can just use TextSearchProvider and choose either BeforeAIInvoke or OnDemandFunctionCalling. In scenarios that do not require much customization, it might be the simplest and fastest option.
On the other hand, if you want to use an ordinary tool call and you don’t need these lifecycle hooks at all, just register it directly as a standalone tool.
Summary
I hope that this post helped you understand the options available for RAG in the Microsoft Agent Framework. Moving beyond the default configurations gives us much finer control over how our models interact with backend data. With that knowledge and your specific project requirements in mind, you can now decide how to inject that additional context.
Thank you for reading and see you in the next post!
