# Getting Started with GraphRAG SDK
A step-by-step tutorial for building a knowledge graph from documents and querying it with natural language.
## 1. Prerequisites
- Python 3.10+
- FalkorDB (easiest via Docker -- see below)
- An LLM API key from one of the supported providers:
    - Azure OpenAI
    - OpenAI
    - Anthropic
## 2. Installation
Install the SDK with all optional dependencies:
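A minimal sketch, assuming the package is published on PyPI as graphrag-sdk with an `all` extra (adjust the name if your distribution differs):

```bash
pip install "graphrag-sdk[all]"
```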
For a local editable install from a cloned repo:
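Something like the following, assuming the standard clone-and-install flow (the repository URL is a placeholder):

```bash
git clone https://github.com/<org>/graphrag-sdk.git  # placeholder URL
cd graphrag-sdk
pip install -e ".[all]"
```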
## 3. Start FalkorDB
Run FalkorDB as a Docker container:
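For example, with the official falkordb/falkordb image (6379 is the database port, 3000 serves the browser UI):

```bash
docker run -d --name falkordb \
  -p 6379:6379 -p 3000:3000 \
  falkordb/falkordb:latest
```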
Verify it is running:
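Two quick checks, assuming redis-cli is installed (FalkorDB speaks the Redis protocol):

```bash
docker ps --filter name=falkordb       # container should show as "Up"
redis-cli -h localhost -p 6379 ping    # should print PONG
```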
## 4. Configure Environment
Set the environment variables for your LLM provider. The example below uses Azure OpenAI:
```bash
export AZURE_OPENAI_API_KEY="your-key"
export AZURE_OPENAI_ENDPOINT="https://your-resource.openai.azure.com/"
export AZURE_OPENAI_API_VERSION="2024-12-01-preview"
```
If you use a .env file, load it yourself before importing the SDK (e.g., via python-dotenv or export commands). The SDK reads environment variables but does not auto-load .env files.
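For example, with python-dotenv:

```python
from dotenv import load_dotenv

load_dotenv()  # reads .env into os.environ; must run before the SDK import below

from graphrag_sdk import GraphRAG  # now sees the variables loaded above
```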
## 5. Define a Schema
A GraphSchema tells the extraction pipeline which entity and relationship types to look for in your documents.
```python
from graphrag_sdk import GraphSchema, EntityType, RelationType

schema = GraphSchema(
    entities=[
        EntityType(label="Person", description="A human being"),
        EntityType(label="Organization", description="A company or institution"),
        EntityType(label="Place", description="A geographic location"),
    ],
    relations=[
        RelationType(label="WORKS_AT", description="Employment relationship"),
        RelationType(label="LOCATED_IN", description="Geographic location"),
        RelationType(label="KNOWS", description="Personal acquaintance"),
    ],
)
```
You can add as many entity and relationship types as your domain requires. Descriptions help the LLM decide when to extract each type.
## 6. Initialize GraphRAG
Create a GraphRAG instance by providing a connection, LLM, embedder, and schema:
```python
from graphrag_sdk import GraphRAG, ConnectionConfig, LiteLLM, LiteLLMEmbedder

rag = GraphRAG(
    connection=ConnectionConfig(host="localhost", graph_name="my_graph"),
    llm=LiteLLM(model="azure/gpt-4.1"),
    embedder=LiteLLMEmbedder(model="azure/text-embedding-3-large", dimensions=256),
    schema=schema,
    embedding_dimension=256,  # must match your embedding model's output dimension
)
```
The default embedding_dimension is 256, matching the Matryoshka-truncated output requested from text-embedding-3-large above. If your embedding model produces a different dimensionality (e.g., 1024, 1536, or 3072), set embedding_dimension accordingly.
ConnectionConfig accepts additional parameters such as port, username, password, and query_timeout_ms. See docs/configuration.md for the full list.
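For instance, a connection to a password-protected instance might look like this (a sketch: the parameter names come from the list above, but the values and defaults are illustrative):

```python
connection = ConnectionConfig(
    host="db.internal.example.com",
    port=6379,                # FalkorDB's default port
    username="default",
    password="s3cret",
    graph_name="my_graph",
    query_timeout_ms=30_000,  # abort queries that run longer than 30 s
)
```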
Alternative providers: The SDK also exports OpenRouterLLM and OpenRouterEmbedder for use with OpenRouter. See docs/configuration.md for details.
## 7. Ingest a Document

### From a file path
```python
result = await rag.ingest("path/to/document.txt")
print(f"Created {result.nodes_created} nodes, {result.relationships_created} relationships")
print(f"Indexed {result.chunks_indexed} chunks")
```
### From raw text
```python
result = await rag.ingest("acme_doc", text="Acme Corp was founded in 1985 by Jane Doe in Austin, Texas.")
print(f"Created {result.nodes_created} nodes, {result.relationships_created} relationships")
print(f"Indexed {result.chunks_indexed} chunks")
```
The ingestion pipeline runs a 9-step process: Load, Chunk, Lexical Graph, Extract (includes quality filtering), Prune, Resolve, Write, then Mentions and Chunk Indexing in parallel.
## 8. Query the Knowledge Graph

### Retrieve context only
Use retrieve() when you want to inspect the context or use your own LLM:
```python
context = await rag.retrieve("Who works at Acme Corp?")
for item in context.items:
    print(f"[{item.score:.2f}] {item.content[:100]}...")
```
### Generate an answer
Use completion() for the full RAG pipeline — retrieval + answer generation:
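In its simplest form:

```python
result = await rag.completion("Who works at Acme Corp?")
print(result.answer)
```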
### With context inspection
Pass return_context=True to see which chunks and entities the retriever used to build the answer:
```python
result = await rag.completion("Who works at Acme Corp?", return_context=True)
for item in result.retriever_result.items:
    print(f"[{item.metadata.get('section', '')}] {item.content[:100]}...")
```
### Multi-turn conversations
completion() supports native multi-turn conversations. Messages are passed directly to the LLM's chat API as structured messages:
```python
from graphrag_sdk import ChatMessage

result = await rag.completion(
    "What happened to her after that?",
    history=[
        ChatMessage(role="user", content="Who works at Acme Corp?"),
        ChatMessage(role="assistant", content="Jane Doe works at Acme Corp."),
    ],
)
print(result.answer)
```
You can also pass history as plain dicts:
```python
result = await rag.completion(
    "Tell me more.",
    history=[
        {"role": "user", "content": "Who founded Acme?"},
        {"role": "assistant", "content": "Jane Doe founded Acme in 1985."},
    ],
)
```
Supported roles: "system", "user", "assistant". Invalid roles raise ValueError.
## 9. Inspect Graph Contents
Use get_statistics() to see a summary of what the graph contains:
```python
stats = await rag.graph_store.get_statistics()
print(f"Nodes: {stats['node_count']}, Edges: {stats['edge_count']}")
```
You can also run raw Cypher queries against the graph:
```python
results = await rag.graph_store.query_raw(
    "MATCH (p:Person)-[:WORKS_AT]->(o:Organization) RETURN p.name, o.name LIMIT 10"
)
for row in results.result_set:
    print(row)
```
## 10. Finalize
After all documents have been ingested, run finalize() to deduplicate entities, backfill embeddings, and create indexes:
```python
results = await rag.finalize()
print(f"Deduplicated: {results['entities_deduplicated']}")
print(f"Embedded: {results['entities_embedded']} entities, {results['relationships_embedded']} relationships")
```
This step is important for query accuracy. It merges duplicate entities (e.g., "J. Doe" and "Jane Doe") and ensures all entities have vector embeddings for semantic search.
## 11. Next Steps
- docs/configuration.md -- Tuning connection settings, chunking parameters, and retrieval options.
- docs/strategies.md -- Custom extraction and resolution strategies.
- docs/benchmark.md -- Reproducing benchmark results on the GraphRAG-Bench Novel corpus (20 novels, 2,010 questions).
## Synchronous API
If you are not in an async context, use the synchronous convenience methods:
```python
result = rag.ingest_sync("path/to/document.txt")
context = rag.retrieve_sync("Who works at Acme Corp?")
result = rag.completion_sync("Who works at Acme Corp?")
results = rag.finalize_sync()
```
These wrap the async methods in asyncio.run() and are useful for scripts and notebooks.