Strategy Reference
GraphRAG SDK uses the Strategy pattern for every algorithmic concern. Each concern has an abstract base class (ABC) with one or more built-in implementations. You can swap any implementation or write your own.
Overview
| # | Concern | ABC | Built-in Implementations |
|---|---|---|---|
| 1 | Loading | LoaderStrategy | TextLoader, PdfLoader, MarkdownLoader |
| 2 | Chunking | ChunkingStrategy | FixedSizeChunking, SentenceTokenCapChunking, ContextualChunking, CallableChunking, StructuralChunking |
| 3 | Extraction | ExtractionStrategy | GraphExtraction |
| 4 | Resolution | ResolutionStrategy | ExactMatchResolution, DescriptionMergeResolution |
| 5 | Retrieval | RetrievalStrategy | LocalRetrieval, MultiPathRetrieval |
| 6 | Reranking | RerankingStrategy | CosineReranker |
1. LoaderStrategy
Reads raw text from a data source.
ABC
from graphrag_sdk import LoaderStrategy

class LoaderStrategy(ABC):
    @abstractmethod
    async def load(self, source: str, ctx: Context) -> DocumentOutput:
        """Load text from the given source identifier."""
        ...
Built-in: TextLoader
Reads plain text and markdown files.
from graphrag_sdk.ingestion.loaders.text_loader import TextLoader
loader = TextLoader(encoding="utf-8") # default encoding
Built-in: PdfLoader
Extracts text from PDF files. Requires pip install graphrag-sdk[pdf].
Built-in: MarkdownLoader
Extracts text from Markdown files. Requires pip install graphrag-sdk[markdown].
Design Note: Markup Preservation
For complex elements like tables, lists, and code blocks, MarkdownLoader intentionally outputs the raw markdown source (including pipes |, list dashes -, and code fences) rather than stripping the syntax. While this introduces minor syntax "noise", it preserves critical structural cues (such as spatial column alignment and nested indentation) that the LLM requires during the Extraction phase to accurately parse relational data.
Default Behavior
If no loader is specified in ingest():
- .pdf files use PdfLoader
- .md files use MarkdownLoader
- Everything else uses TextLoader
- If text= is passed directly, the loader is skipped
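The default selection rules above amount to a dispatch on the file extension. The following is a minimal sketch under that assumption: `pick_loader_name` and `DEFAULT_LOADERS` are hypothetical names used for illustration, not SDK APIs, and the case-insensitive suffix match is also an assumption.

```python
from pathlib import Path

# Hypothetical mapping that mirrors the documented defaults.
DEFAULT_LOADERS = {".pdf": "PdfLoader", ".md": "MarkdownLoader"}


def pick_loader_name(source=None, text=None):
    """Return the name of the loader the defaults would select, or None."""
    if text is not None:
        # Text was passed directly, so no loader runs.
        return None
    suffix = Path(source).suffix.lower()  # lowercasing is an assumption
    return DEFAULT_LOADERS.get(suffix, "TextLoader")


for candidate in ("report.pdf", "notes.md", "data.csv"):
    print(candidate, "->", pick_loader_name(candidate))
```

Passing `loader=` explicitly to `ingest()` bypasses this kind of dispatch entirely.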
Writing Your Own
class HtmlLoader(LoaderStrategy):
    async def load(self, source: str, ctx: Context) -> DocumentOutput:
        # Requires beautifulsoup4; imported lazily so the SDK does not depend on it
        from bs4 import BeautifulSoup
        with open(source) as f:
            soup = BeautifulSoup(f.read(), "html.parser")
        # Strip all tags and return only the visible text
        return DocumentOutput(text=soup.get_text())

await rag.ingest("page.html", loader=HtmlLoader())
2. ChunkingStrategy
Splits document text into overlapping chunks for processing.
ABC
from graphrag_sdk import ChunkingStrategy

class ChunkingStrategy(ABC):
    @abstractmethod
    async def chunk(self, text: str, ctx: Context) -> TextChunks:
        """Split text into chunks."""
        ...
Built-in: FixedSizeChunking
Fixed-size character windows with configurable overlap.
from graphrag_sdk.ingestion.chunking_strategies.fixed_size import FixedSizeChunking
chunker = FixedSizeChunking(
    chunk_size=1000,    # characters per chunk (default: 1000)
    chunk_overlap=100,  # overlap between chunks (default: 100)
)
Tuning guidance:
- Default (1000/100) works well for general use
- Benchmark-winning config uses 1500/200 for richer extraction context
- Smaller chunks (500) for fine-grained retrieval, larger (2000) for broader context
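To make the effect of `chunk_size` and `chunk_overlap` concrete, here is a minimal sketch of fixed-size character windowing. It is an illustration of the general technique, not the SDK's implementation; `fixed_size_chunks` is a hypothetical helper, and the handling of the final window is an assumption.

```python
def fixed_size_chunks(text, chunk_size=1000, chunk_overlap=100):
    """Split text into character windows that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap  # how far each window advances
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
        start += step
    return chunks


print(fixed_size_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
# → ['abcd', 'defg', 'ghij']
```

Each window starts `chunk_size - chunk_overlap` characters after the previous one, so a larger overlap yields more chunks for the same text.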
Built-in: SentenceTokenCapChunking
Splits at sentence boundaries (never mid-sentence) and enforces a hard token cap per chunk using tiktoken. No LLM or embedder required.
from graphrag_sdk.ingestion.chunking_strategies.sentence_token_cap import SentenceTokenCapChunking
chunker = SentenceTokenCapChunking(
    max_tokens=512,               # max tokens per chunk (default: 512)
    overlap_sentences=2,          # sentences shared between chunks (default: 2)
    encoding_name="cl100k_base",  # tiktoken encoding (default: cl100k_base)
)
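The packing idea behind sentence-boundary chunking can be sketched as follows. This is an illustrative approximation, not the SDK's code: `count_tokens` is a whitespace word count standing in for tiktoken, the regex sentence splitter is a simplification, and a single sentence longer than the cap is left unsplit here.

```python
import re


def count_tokens(text):
    # Stand-in for a tiktoken encoder: counts whitespace-separated words.
    return len(text.split())


def sentence_cap_chunks(text, max_tokens=512, overlap_sentences=2):
    """Pack whole sentences into chunks of at most max_tokens tokens."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current = [], []
    for sentence in sentences:
        if current and count_tokens(" ".join(current + [sentence])) > max_tokens:
            chunks.append(" ".join(current))
            # Carry the trailing sentences over as overlap.
            current = current[-overlap_sentences:] if overlap_sentences else []
            # Drop overlap sentences that would push the new chunk over the cap.
            while current and count_tokens(" ".join(current + [sentence])) > max_tokens:
                current.pop(0)
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


print(sentence_cap_chunks("A b c. D e f. G h i. J k l.", max_tokens=6, overlap_sentences=1))
# → ['A b c. D e f.', 'D e f. G h i.', 'G h i. J k l.']
```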
Built-in: ContextualChunking
Sentence-boundary chunking with LLM-generated context prefixes prepended to each chunk (Anthropic's contextual retrieval approach). Improves retrieval for cross-chunk co-reference questions.
from graphrag_sdk.ingestion.chunking_strategies.contextual_chunking import ContextualChunking
chunker = ContextualChunking(
    llm=my_llm,
    max_tokens=512,              # token cap per chunk (default: 512)
    overlap_sentences=2,         # sentence overlap (default: 2)
    max_document_tokens=16_000,  # truncation limit for the doc reference in prompts (default: 16000)
)
Cost note: generates one LLM call per chunk at ingestion time.
Built-in: CallableChunking (bring your own framework)
Adapts any text -> list[str] function into a chunking strategy. Use this to plug in any chunking library -- LlamaIndex, LangChain, Unstructured, spaCy, or your own logic -- without the SDK carrying those dependencies.
Works with sync functions, async functions, and callable classes.
from graphrag_sdk.ingestion.chunking_strategies.callable_chunking import CallableChunking
# Plain function
chunker = CallableChunking(lambda text: text.split("\n\n"))
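Supporting sync functions, async functions, and callable classes through one adapter usually comes down to checking whether the call returned an awaitable. The sketch below shows that general technique in isolation; `run_chunk_fn` is a hypothetical helper and does not reflect how CallableChunking is implemented internally.

```python
import asyncio
import inspect


async def run_chunk_fn(fn, text):
    """Call a sync or async text -> list[str] callable and return a list."""
    result = fn(text)
    if inspect.isawaitable(result):
        # Async functions return a coroutine that still has to be awaited.
        result = await result
    return list(result)


def split_paragraphs(text):
    return text.split("\n\n")


async def split_lines(text):
    return text.splitlines()


class WordSplitter:
    """Callable class: instances behave like a chunking function."""

    def __call__(self, text):
        return text.split()


print(asyncio.run(run_chunk_fn(split_paragraphs, "first\n\nsecond")))
print(asyncio.run(run_chunk_fn(split_lines, "first\nsecond")))
print(asyncio.run(run_chunk_fn(WordSplitter(), "first second")))
```

Note that a slow synchronous callable blocks the event loop in this sketch; offloading it with `asyncio.to_thread` is one option if that matters.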
Built-in: StructuralChunking
Groups content by heading hierarchy into token-bounded chunks. Each chunk stores a breadcrumbs metadata field that is written as a property on the Chunk node in the knowledge graph, making section paths directly queryable via Cypher.
from graphrag_sdk.ingestion.chunking_strategies.structural_chunking import StructuralChunking
chunker = StructuralChunking(
    max_tokens=512,       # max tokens per chunk (default: 512)
    overlap_sentences=2,  # sentences shared between chunks (default: 2)
)
Design Features:
- Strict Fallback Configuration: If you supply a custom fallback_chunker (to handle elements that individually exceed max_tokens), you cannot pass shorthand arguments like overlap_sentences or encoding_name to StructuralChunking. Those must be configured directly on your custom fallback chunker instance. This prevents configuration parameters from being silently dropped.
- Deep-Tree Resilience: While loaders like MarkdownLoader produce flat element lists, the internal _flatten algorithm uses a recursive DFS approach. This guarantees future compatibility with highly nested DOM structures (like HTML or DOCX parsers) while preserving full hierarchical breadcrumbs.
- Graceful Raw Text Fallback: Designed to compose safely with any loader. If the preceding loader does not extract structural AST elements (e.g., PdfLoader or TextLoader which output elements=None), the chunker gracefully bypasses its structural logic and delegates the entire raw text to the fallback chunker, without crashing or dropping content.
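To illustrate what a breadcrumbs value represents, the sketch below derives a heading path for each section of a Markdown string. It is a simplified stand-in, not the SDK's algorithm: it works on raw text rather than loader elements, ignores token limits, and the `" > "` separator and the `sections_with_breadcrumbs` helper are assumptions made for this example.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")


def sections_with_breadcrumbs(markdown_text):
    """Return (breadcrumbs, body) pairs, one per non-empty section."""
    stack = []     # (level, title) pairs forming the current heading path
    sections = []
    body = []

    def flush():
        text = "\n".join(body).strip()
        if text:
            crumbs = " > ".join(title for _, title in stack)
            sections.append((crumbs, text))
        body.clear()

    for line in markdown_text.splitlines():
        match = HEADING.match(line)
        if match:
            flush()  # close the section that ended at this heading
            level = len(match.group(1))
            # Pop headings at the same or deeper level before descending.
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, match.group(2).strip()))
        else:
            body.append(line)
    flush()
    return sections


doc = "# Guide\nIntro.\n## Install\nRun pip.\n## Usage\nCall it.\n"
for crumbs, text in sections_with_breadcrumbs(doc):
    print(f"{crumbs}: {text}")
```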
Writing Your Own
class SentenceChunking(ChunkingStrategy):
    async def chunk(self, text: str, ctx: Context) -> TextChunks:
        import nltk
        sentences = nltk.sent_tokenize(text)
        chunks = []
        for i, sent in enumerate(sentences):
            chunks.append(TextChunk(text=sent, index=i))
        return TextChunks(chunks=chunks)
3. ExtractionStrategy
Extracts entities, relationships, and entity mentions from text chunks.
ABC
from graphrag_sdk import ExtractionStrategy

class ExtractionStrategy(ABC):
    @abstractmethod
    async def extract(
        self,
        chunks: TextChunks,
        schema: GraphSchema,
        ctx: Context,
    ) -> GraphData:
        """Extract graph data from text chunks."""
        ...
Built-in: GraphExtraction
Composable 2-step extraction with pluggable entity NER and LLM relationship extraction.
Step 1 -- Entity NER (pluggable via EntityExtractor ABC):
- GLiNERExtractor (default): Local GLiNER transformer model, no API calls. Returns typed entities with confidence scores and character spans.
- LLMExtractor: Uses a structured NER prompt. Returns entities with confidence, spans, and descriptions.
- Custom: Subclass EntityExtractor and implement extract_entities().
Step 2 -- LLM Verify + Relationship Extraction: The LLM receives the pre-extracted entities and original text, verifies entities (removes invalid, adds missed), and extracts relationships with descriptions, keywords, confidence, and evidence spans.
Entity Ontology: Every extracted entity is mapped to a known type from the ontology. Entities that don't match any type are labeled "Unknown". There are three ways to define the ontology:
1. Use the defaults (11 built-in types, good for general use):
from graphrag_sdk import GraphExtraction
# Uses: Person, Organization, Technology, Product, Location, Date,
# Event, Concept, Law, Dataset, Method
extractor = GraphExtraction(llm=llm)
2. Pass entity_types directly (overrides defaults completely):
# Biomedical domain
extractor = GraphExtraction(
    llm=llm,
    entity_types=["Gene", "Protein", "Disease", "Drug", "Pathway"],
)
# Legal domain
extractor = GraphExtraction(
    llm=llm,
    entity_types=["Person", "Organization", "Law", "Court", "Jurisdiction", "Date"],
)
3. Use GraphSchema entities (schema types override both defaults and entity_types):
from graphrag_sdk import GraphRAG, GraphSchema, EntityType
schema = GraphSchema(entities=[
    EntityType(label="Vehicle", description="Cars, trucks, etc."),
    EntityType(label="Road", description="Streets, highways, etc."),
    EntityType(label="Location", description="Cities, countries, etc."),
])
# Schema entity types are automatically used for extraction
rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, schema=schema)
await rag.ingest("traffic_report.txt")
# Extraction uses: ["Vehicle", "Road", "Location"]
The priority order is: schema.entities > entity_types parameter > defaults.
Default entity types: Person, Organization, Technology, Product, Location, Date, Event, Concept, Law, Dataset, Method.
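The priority order can be read as a simple fallback chain. The sketch below is an illustration only: `effective_entity_types` is a hypothetical helper, and treating an empty list the same as "not provided" is an assumption rather than documented behavior.

```python
DEFAULT_ENTITY_TYPES = [
    "Person", "Organization", "Technology", "Product", "Location", "Date",
    "Event", "Concept", "Law", "Dataset", "Method",
]


def effective_entity_types(schema_labels=None, entity_types=None):
    """Resolve the ontology: schema labels, then entity_types, then defaults."""
    if schema_labels:
        return list(schema_labels)
    if entity_types:
        return list(entity_types)
    return list(DEFAULT_ENTITY_TYPES)


# Schema labels win even when entity_types is also given.
print(effective_entity_types(["Vehicle", "Road"], ["Gene", "Protein"]))
# → ['Vehicle', 'Road']
```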
Choosing an Entity Extractor
from graphrag_sdk import GraphExtraction, GLiNERExtractor, LLMExtractor
# Default: GLiNER for step 1, LLM for step 2
extractor = GraphExtraction(llm=llm)
# Use LLM for step 1 instead of GLiNER
extractor = GraphExtraction(
    llm=llm,
    entity_extractor=LLMExtractor(llm),
)
# GLiNER with custom threshold
extractor = GraphExtraction(
    llm=llm,
    entity_extractor=GLiNERExtractor(threshold=0.6),
)
# With coreference resolution
from graphrag_sdk import FastCorefResolver
extractor = GraphExtraction(
    llm=llm,
    coref_resolver=FastCorefResolver(),  # pip install graphrag-sdk[fastcoref]
)
Entity Extractors:
| Class | Description | Parameters |
|---|---|---|
| GLiNERExtractor | Local GLiNER model (default, no API calls) | threshold=0.75, model_name="urchade/gliner_medium-v2.1" |
| LLMExtractor | LLM-based NER via structured prompt | llm (required), threshold=0.75 |
| Custom subclass | Your own EntityExtractor subclass | Implement extract_entities() |
All extractors share the same threshold behavior: entities with confidence below the threshold are labeled "Unknown".
Graph output:
- All relationships use RELATES edge type. The original type (e.g. WORKS_AT) is in properties["rel_type"].
- Entity IDs are type-qualified: compute_entity_id("Paris", "Location") -> "paris__location".
- Character spans stored as properties["spans"] = {chunk_id: [{start, end}]} on both entities and relationships.
- Entity mentions (MENTIONED_IN edges) link entities to source chunks.
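The graph output conventions above can be pictured with a small sketch. Everything here is illustrative: `sketch_entity_id` only reproduces the documented example (`"Paris"`, `"Location"` gives `"paris__location"`) by lowercasing and joining with a double underscore, which may be less than the real `compute_entity_id` normalization does, and the dictionary layout in `sketch_relationship` is a hypothetical shape, not the SDK's data model.

```python
def sketch_entity_id(name, label):
    """Type-qualified ID: same name with a different label gives a different ID."""
    return f"{name.strip().lower()}__{label.strip().lower()}"


def sketch_relationship(source_id, target_id, rel_type):
    """Every edge is typed RELATES; the original type lives in properties."""
    return {
        "type": "RELATES",
        "source": source_id,
        "target": target_id,
        "properties": {"rel_type": rel_type},
    }


city = sketch_entity_id("Paris", "Location")
person = sketch_entity_id("Paris", "Person")
print(city, person)
print(sketch_relationship(person, city, "LIVES_IN"))
```

Qualifying the ID by type is what keeps the two "Paris" entities from colliding when resolution matches on `id`.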
Writing Your Own Entity Extractor
Subclass EntityExtractor and implement extract_entities():
from graphrag_sdk import EntityExtractor, GraphExtraction
from graphrag_sdk.core.models import ExtractedEntity
class SpaCyExtractor(EntityExtractor):
    def __init__(self, model_name="en_core_web_sm"):
        import spacy
        self._nlp = spacy.load(model_name)

    async def extract_entities(self, text, entity_types, source_chunk_id):
        import asyncio
        # Run the blocking spaCy pipeline in a worker thread
        doc = await asyncio.to_thread(self._nlp, text)
        return [
            ExtractedEntity(
                name=ent.text,
                type=ent.label_,
                description="",
                source_chunk_ids=[source_chunk_id],
                spans={source_chunk_id: [{"start": ent.start_char, "end": ent.end_char}]},
            )
            for ent in doc.ents
        ]

# Use it
extractor = GraphExtraction(llm=llm, entity_extractor=SpaCyExtractor())
await rag.ingest("doc.txt", extractor=extractor)
Writing Your Own Extraction Strategy
Replace the entire 2-step pipeline by subclassing ExtractionStrategy:
class MyExtraction(ExtractionStrategy):
    async def extract(self, chunks, schema, ctx):
        nodes, rels = [], []
        for chunk in chunks.chunks:
            # Your extraction logic
            ...
        return GraphData(nodes=nodes, relationships=rels)

await rag.ingest("doc.txt", extractor=MyExtraction())
4. ResolutionStrategy
Deduplicates entities that refer to the same real-world thing.
ABC
from graphrag_sdk import ResolutionStrategy

class ResolutionStrategy(ABC):
    @abstractmethod
    async def resolve(
        self,
        graph_data: GraphData,
        ctx: Context,
    ) -> ResolutionResult:
        """Deduplicate entities in the graph data."""
        ...
Returns ResolutionResult with deduplicated nodes, remapped relationships, and merged_count.
Built-in: ExactMatchResolution
Deduplicates by exact property match (default: id). Fast, no LLM calls.
from graphrag_sdk.ingestion.resolution_strategies.exact_match import ExactMatchResolution
resolver = ExactMatchResolution(
    resolve_property="id",  # property to match on (default: "id")
)
When to use: Default. Fast and deterministic. Works well when extraction produces consistent entity IDs.
Built-in: DescriptionMergeResolution
Deduplicates by (normalized name, label) -- same-name entities with different labels (e.g. Person "Paris" vs Location "Paris") are kept separate. Merges descriptions:
- If there are fewer than force_summary_threshold descriptions: concatenates them
- Otherwise (at or above the threshold): uses the LLM to summarize them into a single description
from graphrag_sdk.ingestion.resolution_strategies.description_merge import DescriptionMergeResolution
resolver = DescriptionMergeResolution(
    llm=llm,                    # LLMInterface for summarization (optional)
    force_summary_threshold=3,  # Trigger LLM summary at this many descriptions (default: 3)
    max_summary_tokens=500,     # Max tokens for LLM summary (default: 500)
)
When to use: Multi-document ingestion where the same entity appears with different descriptions. Used in the benchmark-winning pipeline.
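The merge rule for descriptions can be sketched in a few lines. This is an illustration under stated assumptions, not the SDK's implementation: `merge_descriptions` is a hypothetical helper, name normalization is reduced to strip-and-lowercase, exact duplicate descriptions are dropped, and the sketch falls back to concatenation when no summarizer is supplied. The `summarize` callable stands in for the LLM call.

```python
from collections import defaultdict


def merge_descriptions(entities, force_summary_threshold=3, summarize=None):
    """Group (name, label, description) triples and merge their descriptions."""
    groups = defaultdict(list)
    for name, label, description in entities:
        bucket = groups[(name.strip().lower(), label)]
        if description and description not in bucket:
            bucket.append(description)

    merged = {}
    for key, descriptions in groups.items():
        if len(descriptions) < force_summary_threshold or summarize is None:
            merged[key] = " ".join(descriptions)    # few descriptions: concatenate
        else:
            merged[key] = summarize(descriptions)   # many descriptions: summarize
    return merged


entities = [
    ("Paris", "Location", "Capital of France."),
    ("paris", "Location", "City on the Seine."),
    ("Paris", "Person", "A Trojan prince."),
]
print(merge_descriptions(entities))
```

The two Location entries collapse into one key while the Person entry stays separate, matching the (normalized name, label) rule.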
5. RetrievalStrategy
Searches the knowledge graph to find context for answering a question. Uses the Template Method pattern: search() handles validation and formatting, and you implement _execute().
ABC
from graphrag_sdk import RetrievalStrategy

class RetrievalStrategy(ABC):
    def __init__(self, graph_store=None, vector_store=None):
        self.graph_store = graph_store
        self.vector_store = vector_store

    async def search(self, query: str, ctx: Context = None, **kwargs) -> RetrieverResult:
        """Public API: validate -> execute -> format."""
        ...

    @abstractmethod
    async def _execute(self, query: str, ctx: Context, **kwargs) -> RawSearchResult:
        """Implement your search logic here."""
        ...
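For readers unfamiliar with the Template Method pattern, the following self-contained sketch shows the validate -> execute -> format flow in miniature. The classes `SketchRetrieval` and `KeywordRetrieval` are hypothetical and independent of the SDK; the validation rule and the dictionary result shape are assumptions chosen for brevity.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchRetrieval(ABC):
    async def search(self, query, **kwargs):
        # Step 1: validate the input once, for every subclass.
        if not isinstance(query, str) or not query.strip():
            raise ValueError("query must be a non-empty string")
        query = query.strip()
        # Step 2: delegate the actual search to the subclass hook.
        raw = await self._execute(query, **kwargs)
        # Step 3: format the raw records into a uniform result.
        return {"query": query, "items": list(raw)}

    @abstractmethod
    async def _execute(self, query, **kwargs):
        """Subclasses implement only the search logic."""


class KeywordRetrieval(SketchRetrieval):
    def __init__(self, documents):
        self.documents = documents

    async def _execute(self, query, **kwargs):
        words = query.lower().split()
        return [d for d in self.documents if any(w in d.lower() for w in words)]


retriever = KeywordRetrieval(["Graphs store edges", "Vectors store floats"])
print(asyncio.run(retriever.search("edges")))
```

Because callers only ever use `search()`, every strategy gets the same validation and output shape regardless of how `_execute()` finds its records.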
Built-in: LocalRetrieval
Simple retrieval: vector search on chunks + 1-hop entity traversal.
from graphrag_sdk.retrieval.strategies.local import LocalRetrieval
retriever = LocalRetrieval(
    graph_store=rag.graph_store,
    vector_store=rag.vector_store,
    embedder=embedder,
    top_k=5,                # chunks to retrieve (default: 5)
    include_entities=True,  # include connected entities (default: True)
)
When to use: Simple use cases, low latency requirements, small graphs.
Built-in: MultiPathRetrieval
Production-grade retrieval with RELATES edge vector search, 2-path entity discovery, 4-path chunk retrieval, and cosine reranking. This is the default and the benchmark-winning strategy.
from graphrag_sdk import MultiPathRetrieval
retriever = MultiPathRetrieval(
    graph_store=rag.graph_store,
    vector_store=rag.vector_store,
    embedder=embedder,
    llm=llm,
    chunk_top_k=15,        # final chunks after reranking (default: 15)
    max_entities=30,       # total entity cap (default: 30)
    max_relationships=20,  # max relationships in context (default: 20)
    rel_top_k=15,          # RELATES edge vector search results (default: 15)
    keyword_limit=10,      # max keywords from question (default: 10)
)
rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, retrieval_strategy=retriever)
When to use: Default choice. Best accuracy on benchmark. Handles complex multi-hop questions.
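Writing Your Own
class GlobalRetrieval(RetrievalStrategy):
    def __init__(self, embedder, graph_store=None, vector_store=None):
        super().__init__(graph_store=graph_store, vector_store=vector_store)
        self.embedder = embedder

    async def _execute(self, query, ctx, **kwargs):
        # Your custom retrieval logic. Embed the query first: embed_query is an
        # assumed method name, substitute whatever your embedder exposes.
        query_vector = await self.embedder.embed_query(query)
        results = await self.vector_store.search(query_vector, top_k=20)
        return RawSearchResult(records=results)

retriever = GlobalRetrieval(embedder, graph_store=rag.graph_store, vector_store=rag.vector_store)
rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, retrieval_strategy=retriever)
6. RerankingStrategy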
Reranks retrieval results before they are passed to the LLM for answer generation.
ABC
from graphrag_sdk import RerankingStrategy

class RerankingStrategy(ABC):
    @abstractmethod
    async def rerank(
        self,
        query: str,
        result: RetrieverResult,
        ctx: Context,
    ) -> RetrieverResult:
        """Rerank and filter retrieval results."""
        ...
Built-in: CosineReranker
Reranks by cosine similarity between query embedding and item embeddings.
from graphrag_sdk import CosineReranker
reranker = CosineReranker(
    embedder=embedder,
    top_k=15,  # keep top N results (default: 15)
)
result = await rag.completion("question", reranker=reranker)
Note: MultiPathRetrieval already includes cosine reranking internally. The standalone CosineReranker is useful when using LocalRetrieval or a custom strategy.
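Cosine reranking itself is a small computation: score each item's embedding against the query embedding, sort by score, and keep the top results. The sketch below shows that technique in plain Python with hand-written vectors; `cosine_similarity` and `cosine_rerank` are hypothetical helpers, and the `(payload, embedding)` item shape is an assumption rather than the SDK's result type.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cosine_rerank(query_vector, items, top_k=15):
    """items is a list of (payload, embedding) pairs; returns the top_k payloads."""
    scored = [(cosine_similarity(query_vector, emb), payload) for payload, emb in items]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # highest similarity first
    return [payload for _, payload in scored[:top_k]]


items = [("a", [1.0, 0.0]), ("b", [0.0, 1.0]), ("c", [1.0, 1.0])]
print(cosine_rerank([1.0, 0.0], items, top_k=2))
# → ['a', 'c']
```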
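Writing Your Own
class LLMReranker(RerankingStrategy):
    def __init__(self, llm, top_k=10):
        self.llm = llm
        self.top_k = top_k

    async def rerank(self, query, result, ctx):
        # Score each item with the LLM, then sort by score (highest first)
        scored = []
        for item in result.items:
            score = await self.llm.ainvoke(f"Rate relevance 0-10: {query} vs {item.content}")
            scored.append((float(score.content), item))
        # Sort on the score only, so tied scores never compare items directly
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return RetrieverResult(items=[item for _, item in scored[: self.top_k]])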