Introduction
You can find all the C# code samples here: RAG GitHub Repository.
Graph RAG is the next topic in the RAG series. It is more advanced than the earlier posts, so you may want to start with the more basic topics first:
- Naive RAG Explained: The Core Pattern
- Hybrid Search in RAG: BM25 + Vectors for Better Retrieval
- Semantic Ranking in Azure AI Search: How Cross-Encoders Improve RAG Retrieval
- Multi-Query Retrieval for RAG: Query Rewrites in Azure AI Search
- HyDE for RAG in Azure: Improve Retrieval with Hypothetical Embeddings
- Chunking Strategies for RAG with C#: Fixed-Size, Semantic, Parent-Child and more
- Contextual Retrieval for RAG: Additional Context to Boost Accuracy
This post is inspired by the Graph RAG approach proposed by Microsoft, but I want to emphasise that it is a gentle introduction to the Graph RAG topic, especially for C# developers and for those for whom graph databases are rather an exotic topic. Please do not expect advanced graph topics to be covered (like creating communities or the different variants of graph search).
I believe that after reading this article you will be ready to delve into the nuances of the Graph RAG that Microsoft proposed.
Let’s discuss a basic Graph RAG based on a real C# example using Neo4j as a graph database.
Why naive RAG is not enough

When I talk about naive RAG, I mean the simple solution where we take data from a source, split it into chunks, create vectors, and then push those chunks into a search index (for example, Azure AI Search).
You start testing your solution. You upload the first file and ask a specific question. You are happy to see that the information you wanted was found in the top results of a vector search. You try a few more questions, everything seems to work, and then you pass the solution to the QA team and go to grab a coffee, satisfied with the results.
A few moments later…
The testers start their work and quickly find problems. They ask a very general question that requires understanding the whole document, and the AI starts giving low-quality or irrelevant answers. This exposes a key weakness: standard RAG is poor at understanding the “big picture” or summarizing concepts across a large document. It is hard for the system to grasp general ideas when it can only look at the data through small, separate chunks.
Then, they ask another question where the answer is not in a single chunk, but is scattered across different sections. Again, the AI fails. This is a classic case where baseline RAG fails to “connect the dots”. It is difficult for the system to put together different pieces of information that are linked by shared details to give a complete answer. This happens because the AI cannot easily move between different parts of the document to find how they are related.
The Solution
Let’s focus on the main limitation of classic RAG which is its inability to “connect the dots.” If you think about connecting the dots, what kind of data structure comes to mind?
Graphs!
Graphs are all about creating nodes (also known as vertices) and relationships (also known as edges). They are often used in solutions where relationships are so numerous and rich that they cannot be easily expressed in a traditional SQL database.
Knowledge Graphs

The RAG pattern is about injecting the right piece of data into the LLM. To do this better, we create knowledge graphs with:
- Nodes: Entities that represent various logical concepts in your data source (like a service name or a technical term).
- Relationships: Links between these entities that show how one concept is related to another (for example, “Service A Depends On Service B”).
The goal of a knowledge graph is to capture as much meaningful information as possible from your source document. However, it doesn’t stop there. You can also connect information from other documents. Because of this, the knowledge boundary is no longer limited to a single file, but can span across many different sources.
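To make this more concrete, a per-chunk knowledge graph can be modeled with a few simple records. This is a hypothetical sketch to build intuition; the exact record shapes in the repository may differ:

```csharp
// Hypothetical record shapes for a per-chunk knowledge graph.
// An entity is a node; the optional embedding enables vector search later.
public record Entity(string Name, string Type, string Description, float[]? Embedding = null);

// A relationship links two entities by name; the weight expresses its strength.
public record Relationship(string Source, string Target, string Label, string Description, double Weight);

// The full graph extracted from one chunk of text.
public record KnowledgeGraph(List<Entity> Entities, List<Relationship> Relationships);
```

With shapes like these, merging graphs from many documents is just a matter of combining entity and relationship lists and resolving duplicates.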
How LLMs can help
Sometimes you already have a graph database with meaningful data in your project. But in most cases, when you start with Graph RAG, you face a challenge: how do you create such a knowledge graph while processing a document in your data ingestion pipeline?
Use an LLM!
LLMs are excellent at finding entities and relationships in text, so we should leverage this capability. We can craft a specific prompt and simply ask the model: “Hey, given this chunk of text, please identify the main entities and the relationships between them to create a knowledge graph.”
Now that we have the basic concepts, let’s analyze step by step how the entire Graph RAG process works.
How Graph RAG works
Let’s split the entire process into logical steps so it is easier to understand how Graph RAG works. In this post and in the C# example, I will be using a markdown file that contains Microsoft technical documentation (you can find it in the 08_GraphRAG project under Data/grounding-data-design.md).
Source Data Ingestion

For each of our source documents, we follow these steps:
- Split the document into chunks: We don’t have to use any specific chunking strategies just because we are using Graph RAG. You should stick to the strategy that works best for your classic RAG pipeline.
- Call an LLM for each chunk to build a knowledge graph: When we create an entity, we also create a vector that represents it. In my example, I use a combination of `Entity:Name` + `Entity:Description` to create this vector.
- Add metadata: We can ask the LLM to do additional things, such as measuring the strength of a relationship between two entities.
- Aggregate and Merge: Finally, we take these “per-chunk” knowledge graphs and merge them with the existing data in our graph database.
Our goal is to build a knowledge graph based on many different documents. This ensures that relationships between concepts in different files are preserved.
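The ingestion steps above can be sketched as a small pipeline. This is a simplified, hypothetical outline; the delegate parameters stand in for the chunker, the LLM extraction call, and the database merge described above:

```csharp
// Simplified ingestion pipeline: chunk -> extract per-chunk graph -> merge.
public static class IngestionPipeline
{
    // TGraph is whatever shape your per-chunk knowledge graph has.
    public static async Task IngestDocumentAsync<TGraph>(
        string document,
        Func<string, IReadOnlyList<string>> chunk,            // chunking strategy
        Func<string, Task<TGraph>> extractGraphAsync,         // LLM extraction per chunk
        Func<TGraph, Task> mergeAsync)                        // merge into the graph database
    {
        foreach (var piece in chunk(document))
        {
            // One small knowledge graph per chunk...
            var perChunkGraph = await extractGraphAsync(piece);

            // ...merged into the aggregated graph (dedup/conflict resolution happens here).
            await mergeAsync(perChunkGraph);
        }
    }
}
```

Keeping the steps behind delegates like this also makes the pipeline easy to test with stubs before wiring up the real LLM and Neo4j calls.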
The Importance of Aggregation
The aggregation step is very important because there are a few “edge cases” to consider, such as:
- What to do when a duplicate of an existing `Entity → Relationship → Entity` path is found.
- What to do when another entity with the same name and label is found, but it has a different description.
I don’t want to focus on these edge cases in this blog post, and in the C# example, I used a very simple approach. However, I want to emphasize that the quality of the final knowledge graph directly affects the quality of the results you get. I hope it is an understandable trade-off… better input > better output, as always 🙂
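To illustrate how simple such a strategy can be, here is a hypothetical merge that deduplicates entities by name and type (keeping the longer description) and relationships by source, label, and target (keeping the highest weight). The repository may resolve these conflicts differently; simplified record shapes are used for illustration:

```csharp
// Simplified shapes for illustrating the merge step.
public record Entity(string Name, string Type, string Description);
public record Relationship(string Source, string Target, string Label, double Weight);

public static class GraphMerge
{
    // Deduplicate entities by (Name, Type); prefer the more detailed description.
    public static List<Entity> MergeEntities(IEnumerable<Entity> entities) =>
        entities
            .GroupBy(e => (e.Name, e.Type))
            .Select(g => g.OrderByDescending(e => e.Description.Length).First())
            .ToList();

    // Deduplicate relationships by (Source, Label, Target); keep the strongest weight.
    public static List<Relationship> MergeRelationships(IEnumerable<Relationship> rels) =>
        rels
            .GroupBy(r => (r.Source, r.Label, r.Target))
            .Select(g => g.OrderByDescending(r => r.Weight).First())
            .ToList();
}
```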
The Search Phase
In classic RAG, the search phase is very straightforward:
- Convert the user prompt into a vector.
- Perform a vector search (or a hybrid search).
- Return the TOP N most similar vectors.
- Retrieve the raw chunks that were used to create these vectors.
- Inject the chunks into the final prompt.
The Graph RAG search pipeline follows a very similar path.

- Convert the user prompt into a query vector.
- Perform a vector search against our graph database. (Yes, this is possible, and I will show you how to do it using Neo4j).
- Return the TOP N most similar “Seed Entities.” These are the entry points to our graph.
- Traverse the graph. We can now follow the relationships from these entry points to find the most relevant entities and connections.
- Format the data. We take the information we found and format it into a structure that the LLM can easily understand.
- Inject the data into the final LLM prompt.
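The formatting step (step 5) can be as simple as flattening each discovered connection into a readable triple. A hypothetical sketch, with names chosen for illustration:

```csharp
// One traversal result: source -[label]-> target, plus the relationship description.
public record GraphTriple(string Source, string Label, string Target, string Description);

public static class GraphFormatter
{
    // Render triples as plain text the LLM can consume in the final prompt.
    public static string Format(IEnumerable<GraphTriple> triples) =>
        string.Join(Environment.NewLine,
            triples.Select(t => $"({t.Source}) -[{t.Label}]-> ({t.Target}): {t.Description}"));
}
```

A line-per-triple format like this keeps the injected context compact and easy for the model to follow.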
I think that is enough theory, so now let’s jump straight into the C# example, which uses models deployed in Microsoft Foundry and Neo4j as the graph database.
C# example
These are the dependencies I use in this C# example:
public class GraphRAGExample
{
private readonly ChatClient _chatClient;
private readonly EmbeddingClient _embeddingClient;
private readonly TextChunker _textChunker;
private readonly GraphDb _graphDb;
public GraphRAGExample()
{
var openAiClient = new AzureOpenAIClient(
new Uri(Environment.GetEnvironmentVariable("AZURE_OPEN_AI_CLIENT_URI")!),
new DefaultAzureCredential());
_chatClient = openAiClient.GetChatClient(Environment.GetEnvironmentVariable("AZURE_OPEN_AI_EMBEDDING_CHAT_CLIENT_DEPLOYMENT_NAME")!);
_embeddingClient = openAiClient.GetEmbeddingClient(Environment.GetEnvironmentVariable("AZURE_OPEN_AI_EMBEDDING_CLIENT_DEPLOYMENT_NAME")!);
_textChunker = new TextChunker();
_graphDb = new GraphDb();
}

In the constructor of the `GraphRAGExample` class, we initialize the following services:
- Secure AI Clients: I set up the `ChatClient` and `EmbeddingClient` using `AzureOpenAIClient`. By using `DefaultAzureCredential`, I ensure the application is ready for a secure, keyless production environment (I use the `Azure AI User` RBAC role). I use `gpt-5.4-mini` for chat and `text-embedding-ada-002` for creating embeddings. Both are deployed in Microsoft Foundry.
- Chunking: The `TextChunker` is configured to use fixed-size chunking with an overlap. This simple strategy is sufficient here, but you can use a more advanced technique too.
- Graph Database: Finally, I initialize the `GraphDb` wrapper, which manages our Neo4j connections and provides a clean interface for our Cypher queries (Cypher is a graph-optimized query language).
Data indexing
Let me skip the chunking part and focus on what matters most in the indexing pipeline: creating a knowledge graph per chunk.
private async Task<KnowledgeGraph> GetKnowledgeGraphAsync(string chunk)
{
ChatCompletion chatCompletion = await _chatClient.CompleteChatAsync(new List<ChatMessage>
{
new SystemChatMessage(Prompts.GetSystemPrompt()),
new UserChatMessage(Prompts.GetUserPrompt(chunk))
},
new ChatCompletionOptions()
{
ResponseFormat = ChatResponseFormat.CreateJsonSchemaFormat(
"knowledge_graph",
jsonSchema: BinaryData.FromString(JsonSchemas.KnowledgeGraph)),
Temperature = 0
});
var result = JsonSerializer.Deserialize<KnowledgeGraph>(chatCompletion.Content[0].Text, _jsonSerializerOptions)
?? throw new InvalidOperationException("Failed to deserialize rewrites.");
var embeddingInputs = result.Entities.Select(e => $"{e.Name}: {e.Description}").ToList();
// in a real production app you should batch these embeddingInputs to not exceed some limits
var embeddingResult = await _embeddingClient.GenerateEmbeddingsAsync(embeddingInputs);
var entitiesWithEmbeddings = result.Entities
.Zip(embeddingResult.Value, (entity, embedding) => entity with { Embedding = embedding.ToFloats().ToArray() })
.ToList();
return result with { Entities = entitiesWithEmbeddings };
}

This method is the engine that transforms raw text into a structured knowledge graph. I use the LLM to extract entities and relationships using a strict JSON schema and zero temperature to ensure the results are more predictable and easy to parse. Once I have the data, I generate embeddings by combining each entity’s name and description (`$"{e.Name}: {e.Description}"`) into a single string to represent its semantic meaning. These vectors are then linked back to the entities using a simple `Zip` operation before returning the final result.
We should also ensure that the vector search index exists.
public async Task EnsureVectorIndexAsync()
{
await _driver.ExecutableQuery($$"""
CREATE VECTOR INDEX {{VectorIndexName}} IF NOT EXISTS
FOR (e:Entity)
ON e.embedding
OPTIONS {
indexConfig: {
`vector.dimensions`: {{EmbeddingDimensions}},
`vector.similarity_function`: 'cosine'
}
}
""")
.ExecuteAsync();
}

Now it is the right time to check what the system prompt responsible for creating these knowledge graphs looks like:
public static string GetSystemPrompt()
{
return """
# ROLE
You are a Principal Azure Solutions Architect. Your goal is to convert technical documentation into a Layered Knowledge Graph. You must categorize entities by their functional domain to ensure architectural consistency.
# TASK
Analyze the SOURCE_DOCUMENT_CHUNK. Extract Entities and Relationships, mapping them strictly to the Domain-Driven Schema below.
# THE DOMAIN-DRIVEN SCHEMA (CRITICAL)
## 1. RESOURCE DOMAIN (The "What")
- Type: `AZURE_RESOURCE`
- Focus: Deployable, billable resources and locations.
- Examples: Azure AI Search, Cosmos DB, North Europe Region.
## 2. DESIGN DOMAIN (The "How")
- Type: `LOGICAL_CONCEPT`
- Focus: Architectural patterns, methodologies, and abstract designs.
- Examples: RAG, Multi-tenant Architecture, Semantic Search, Chunking.
## 3. FUNCTIONAL DOMAIN (The "Parts")
- Type: `TECHNICAL_FEATURE`
- Focus: Specific knobs, sub-components, or capabilities within a resource.
- Examples: HNSW Index, Managed Identity, Private Endpoint, Vector Quantization.
## 4. GOVERNANCE DOMAIN (The "Why")
- Type: `QUALITY_ATTRIBUTE`
- Focus: Non-functional requirements, metrics, SLAs, and security constraints.
- Examples: Latency, 99.9% Availability, SOC2 Compliance, Cost Optimization.
# NAMING & GRANULARITY RULES
1. ATOMIC NAMES: Use "Azure AI Search", not "The implementation of the search service".
2. NO META-DATA: Do not extract article titles, section headers, or "the documentation" as entities.
3. STANDARDIZATION: Use industry-standard terms. Map "Managed ID" to "Managed Identity".
# RELATIONSHIP PREDICATES (Layer-to-Layer)
- `CONTAINS`: Structural (Resource -> Feature).
- `IMPLEMENTS`: Logic (Resource -> Concept).
- `DEPENDS_ON`: Infrastructure (Resource -> Resource).
- `SECURES`: Protection (Feature -> Resource/Concept).
- `IMPACTS`: Performance (Feature/Resource -> Quality Attribute).
# RELATIONSHIP WEIGHT (Centrality Score)
For every relationship you extract, assign a `weight` (float, 0.0–1.0) that reflects how central that connection is to the meaning of the text:
- **1.0**: Primary dependency — the relationship is the core point of the sentence/paragraph (e.g., the text exists to explain this connection).
- **0.5**: Supporting detail — the relationship is mentioned to clarify or back up a main point.
- **0.1**: Brief mention — the relationship appears only in passing and is not elaborated upon.
Use intermediate values (e.g., 0.7, 0.3) when the centrality falls between those anchor points.
# EXAMPLE: CROSS-DOMAIN MAPPING
**Input:** "Using Private Endpoints in Azure AI Search reduces the attack surface but may impact latency."
**Output:**
{
"entities": [
{"name": "Azure AI Search", "type": "AZURE_RESOURCE", "description": "AI-powered retrieval service."},
{"name": "Private Endpoint", "type": "TECHNICAL_FEATURE", "description": "Network interface for private service access."},
{"name": "Latency", "type": "QUALITY_ATTRIBUTE", "description": "Network and processing delay metric."}
],
"relationships": [
{"source": "Azure AI Search", "target": "Private Endpoint", "label": "CONTAINS", "description": "Service supports private network integration.", "weight": 0.8},
{"source": "Private Endpoint", "target": "Latency", "label": "IMPACTS", "description": "Network encapsulation can introduce measurable delay.", "weight": 0.7}
]
}
""";
}

The system prompt is a set of rules that tells the LLM how to understand and categorize your technical data. I really recommend creating a domain-specific system prompt if possible, because using terms from your own field (like Azure architecture here) makes the knowledge graph much more accurate. By defining clear rules for naming and relationships, you help the AI find important connections that a general prompt would probably miss. The more domain knowledge and specific rules you put into this prompt, the better and more professional your final knowledge graph will be.
Below you can see what the LLM generated based on a single chunk (it is per-chunk knowledge graph).

There were 22 entities (the screenshot is truncated) and 22 relationships created.

The last step is to build an aggregated knowledge graph for the entire document and push it to Neo4j using Cypher (I don’t want to focus on Cypher too much here, but you can find all the queries in the repository). This part of the logic is very important because, as mentioned before, you need to define a strategy for resolving conflicts that may appear (like a duplicate entity or a duplicate relationship).
Performing Search

There are 3 parameters which control the simple graph traversal algorithm used in this C# example:
- topK: This determines how many “seed nodes” we find using the initial vector search. These are the entry points where the graph exploration begins.
- traversalDepth: This defines the maximum number of steps (or “hops”) the algorithm takes to explore the neighborhood around your seed nodes.
- minPathScore: This acts as a quality filter that only keeps paths where the combined strength of the relationships meets your minimum requirement.
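In the example, these knobs can be grouped into a small options type (a hypothetical sketch; the repository may pass them as plain parameters, and the default values here are only illustrative). Note that `traversalDepth` is interpolated into the query text (`{{traversalDepth}}`) rather than passed as a `$` parameter, because Cypher does not allow the bounds of a variable-length pattern to be query parameters:

```csharp
// Tuning knobs for the graph traversal search (defaults are illustrative).
public record GraphSearchOptions(
    int TopK = 5,               // number of seed nodes from the initial vector search
    int TraversalDepth = 4,     // maximum number of hops from each seed node
    double MinPathScore = 0.4); // minimum multiplied relationship weight for a path
```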
You can find the full query in the repo but I would like to focus on the first 2 phases of that Cypher query because it explains the most important part:
// Phase 1: vector ANN search — find seed nodes closest to the query
CALL db.index.vector.queryNodes('{{VectorIndexName}}', $topK, $queryVector)
YIELD node AS seedNode, score
WITH collect(seedNode) AS seedNodes
// Phase 2: optional variable-depth traversal; seed nodes are kept even when no paths qualify
UNWIND seedNodes AS seedNode
OPTIONAL MATCH path = (seedNode)-[rels*1..{{traversalDepth}}]-(endNode)
WITH seedNodes,
collect(DISTINCT CASE
WHEN rels IS NOT NULL
AND reduce(score = 1.0, r IN rels | score * coalesce(r.weight, 0.5)) >= $minPathScore
THEN {nodes: nodes(path), rels: rels}
ELSE null
END) AS qualifiedPaths

I would like to describe the 3 lines from that query:
- Line 2: I use a vector search to find the `topK` seed nodes based on a `queryVector`; these are the semantic entry points to our graph based on the user’s query.
- Line 8: This step explores the neighborhood around those seeds up to a specific depth (`traversalDepth`), keeping the seed nodes even if no connections are found (`OPTIONAL MATCH`).
- Line 12: I calculate a path score by multiplying the weights of all relationships and filter out any paths that are too “weak” for the LLM (`minPathScore`).
If I increase the traversalDepth, the number of results grows very fast. This is why the reduce function is so useful: it calculates the “strength” of a path and helps me filter out the noise. You should look further than just the nearest neighbors, because exploring deeper into the graph is exactly how you find those abstract and important concepts.
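The `reduce` expression translates directly to C#, which may help build intuition. A small helper sketch, where a missing weight counts as 0.5, mirroring `coalesce(r.weight, 0.5)` in the query:

```csharp
public static class PathScoring
{
    // Multiply relationship weights along a path, starting from 1.0.
    // Mirrors: reduce(score = 1.0, r IN rels | score * coalesce(r.weight, 0.5))
    public static double Score(IEnumerable<double?> weights) =>
        weights.Aggregate(1.0, (score, w) => score * (w ?? 0.5));
}
```

Because every weight is at most 1.0, the score can only shrink as the path gets longer, so deep paths survive the `minPathScore` filter only when every hop along the way is strong.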
Let me show you what it looks like in Neo4j when I change the value of the minPathScore parameter. I will use the following query, assuming one of our entry points is Query performance, and keep traversalDepth at 4:
MATCH p=(n:Entity)-[*1..4]-(m)
WHERE n.name IN ['Query performance'] AND reduce(score = 1.0, rel IN relationships(p) | score * rel.weight) >= [MIN_PATH_SCORE]
RETURN p

Min. path score = 0.2 (one arrow points to the entry point node and the 2 others point to the total number of nodes and relationships found). Results: Nodes=56 and Relationships=63:

Min. path score = 0.4, Results: Nodes=44 and Relationships=48

Min. path score = 0.6, Results: Nodes=15 and Relationships=14

I showed this simple example to demonstrate how to perform such a search operation. It is mostly meant to build intuition and understanding of the topic, so please do not treat it as production-ready.
What I want you to remember from that section are 2 things:
- 1st – to perform an effective search, you should first find the entry points to the graph (by using a vector search, for instance)
- 2nd – once the entry points are found, you should perform graph traversal, ideally quite deep. To do that, you also need a way to assess whether the route from your entry point to a node is really relevant. Using the reduce function with relationship weights is a great starting point for this.
Of course, the last step is just to format the response received from such a query and inject it into a final prompt.
That is a basic Graph RAG in a nutshell using C# and Neo4j.
Other variations of Graph RAG
Now that we have covered the basics, let me briefly mention a few alternative setups you may consider while implementing Graph RAG in your project.
- Entity extraction from the query: You can use NLP or an LLM to pull entities directly from the user’s question to find the best entry points in your graph.
- Neo4j search variations: You can use full-text search (+ fuzzy search) or even a hybrid search directly within Neo4j to locate your nodes.
- Mixing graph with “unstructured” retrieval: If you already have an Azure AI Search index using Hybrid Search with a Semantic Reranker, you can treat Neo4j as a supplement. You can run two queries in parallel: one to get the “structured knowledge graph” from Neo4j and a second one against Azure AI Search. This allows you to provide the original text verbatim and include citations.
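The parallel setup from the last point can be sketched with `Task.WhenAll`. The delegate parameters are hypothetical stand-ins for the Neo4j and Azure AI Search calls:

```csharp
public static class HybridRetriever
{
    // Run the graph retrieval and the text retrieval concurrently;
    // neither depends on the other, so there is no reason to await them in sequence.
    public static async Task<(string GraphContext, string TextChunks)> RetrieveAsync(
        Func<Task<string>> getGraphContextAsync,   // structured context from Neo4j
        Func<Task<string>> getSearchChunksAsync)   // verbatim chunks from Azure AI Search
    {
        var graphTask = getGraphContextAsync();
        var chunksTask = getSearchChunksAsync();
        await Task.WhenAll(graphTask, chunksTask);
        return (await graphTask, await chunksTask);
    }
}
```

Both results can then be placed in separate sections of the final prompt, so the LLM gets the structured relationships and the original text with citations.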
Summary
I hope that this blog post helped you to build some intuition around the Graph RAG pattern especially if that topic is completely new to you. I believe that now you are ready to explore more advanced concepts used in Microsoft Graph RAG implementation.
Thanks for reading and see you in the next post!
