Strategy Reference

GraphRAG SDK uses the Strategy pattern for every algorithmic concern. Each concern has an abstract base class (ABC) with one or more built-in implementations. You can swap any implementation or write your own.

Overview

| # | Concern | ABC | Built-in Implementations |
|---|---------|-----|--------------------------|
| 1 | Loading | LoaderStrategy | TextLoader, PdfLoader, MarkdownLoader |
| 2 | Chunking | ChunkingStrategy | FixedSizeChunking, SentenceTokenCapChunking, ContextualChunking, CallableChunking, StructuralChunking |
| 3 | Extraction | ExtractionStrategy | GraphExtraction |
| 4 | Resolution | ResolutionStrategy | ExactMatchResolution, DescriptionMergeResolution |
| 5 | Retrieval | RetrievalStrategy | LocalRetrieval, MultiPathRetrieval |
| 6 | Reranking | RerankingStrategy | CosineReranker |

1. LoaderStrategy

Reads raw text from a data source.

ABC

from graphrag_sdk import LoaderStrategy

class LoaderStrategy(ABC):
    @abstractmethod
    async def load(self, source: str, ctx: Context) -> DocumentOutput:
        """Load text from the given source identifier."""
        ...

Built-in: TextLoader

Reads plain text and markdown files.

from graphrag_sdk.ingestion.loaders.text_loader import TextLoader

loader = TextLoader(encoding="utf-8")  # default encoding

Built-in: PdfLoader

Extracts text from PDF files. Requires pip install graphrag-sdk[pdf].

from graphrag_sdk.ingestion.loaders.pdf_loader import PdfLoader

loader = PdfLoader()

Built-in: MarkdownLoader

Extracts text from Markdown files. Requires pip install graphrag-sdk[markdown].

from graphrag_sdk.ingestion.loaders.markdown_loader import MarkdownLoader

loader = MarkdownLoader()

Design Note: Markup Preservation For complex elements like tables, lists, and code blocks, MarkdownLoader intentionally outputs the raw markdown source (including pipes |, list dashes -, and code fences) rather than stripping the syntax. While this introduces minor syntax "noise", it preserves critical structural cues (such as spatial column alignment and nested indentation) that the LLM requires during the Extraction phase to accurately parse relational data.

Default Behavior

If no loader is specified in ingest():
- .pdf files use PdfLoader
- .md files use MarkdownLoader
- Everything else uses TextLoader
- If text= is passed directly, the loader is skipped
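
The default rules above amount to a dispatch on file extension. The following is a minimal sketch of that decision, not the SDK's own code: pick_loader_name is a hypothetical helper, it returns loader names as strings so that it runs without the SDK installed, and the case-insensitive extension match is an assumption.

```python
from pathlib import Path
from typing import Optional


def pick_loader_name(source: Optional[str] = None, text: Optional[str] = None) -> Optional[str]:
    """Return the name of the loader the default rules would select (hypothetical helper)."""
    if text is not None:
        return None  # text= was passed directly, so no loader runs
    suffix = Path(source).suffix.lower()  # assumption: extension match ignores case
    if suffix == ".pdf":
        return "PdfLoader"
    if suffix == ".md":
        return "MarkdownLoader"
    return "TextLoader"  # everything else


print(pick_loader_name("report.pdf"))
print(pick_loader_name("notes.md"))
print(pick_loader_name("data.csv"))
print(pick_loader_name(text="inline text"))
```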

Writing Your Own

class HtmlLoader(LoaderStrategy):
    async def load(self, source: str, ctx: Context) -> DocumentOutput:
        # Requires: pip install beautifulsoup4
        from bs4 import BeautifulSoup
        # Read the file with an explicit encoding and keep only the visible text
        with open(source, encoding="utf-8") as f:
            soup = BeautifulSoup(f.read(), "html.parser")
        return DocumentOutput(text=soup.get_text())

await rag.ingest("page.html", loader=HtmlLoader())

2. ChunkingStrategy

Splits document text into overlapping chunks for processing.

ABC

from graphrag_sdk import ChunkingStrategy

class ChunkingStrategy(ABC):
    @abstractmethod
    async def chunk(self, text: str, ctx: Context) -> TextChunks:
        """Split text into chunks."""
        ...

Built-in: FixedSizeChunking

Fixed-size character windows with configurable overlap.

from graphrag_sdk.ingestion.chunking_strategies.fixed_size import FixedSizeChunking

chunker = FixedSizeChunking(
    chunk_size=1000,    # characters per chunk (default: 1000)
    chunk_overlap=100,  # overlap between chunks (default: 100)
)

Tuning guidance:
- Default (1000/100) works well for general use
- Benchmark-winning config uses 1500/200 for richer extraction context
- Smaller chunks (500) for fine-grained retrieval, larger (2000) for broader context
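
To make the effect of chunk_size and chunk_overlap concrete, here is a minimal sketch of fixed-size character windows with overlap. It is a standalone illustration, not FixedSizeChunking's implementation: fixed_size_chunks is a hypothetical helper, and how the SDK treats the final partial window is not confirmed by this page.

```python
def fixed_size_chunks(text: str, chunk_size: int = 1000, chunk_overlap: int = 100) -> list[str]:
    """Slice text into character windows; each window starts chunk_size - chunk_overlap after the last."""
    if chunk_size <= 0 or not 0 <= chunk_overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= chunk_overlap < chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


print(fixed_size_chunks("abcdefghij", chunk_size=4, chunk_overlap=1))
```

With a 10-character input, a window of 4 and an overlap of 1, each chunk repeats the last character of the previous one.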

Built-in: SentenceTokenCapChunking

Splits at sentence boundaries (never mid-sentence) and enforces a hard token cap per chunk using tiktoken. No LLM or embedder required.

from graphrag_sdk.ingestion.chunking_strategies.sentence_token_cap import SentenceTokenCapChunking

chunker = SentenceTokenCapChunking(
    max_tokens=512,         # max tokens per chunk (default: 512)
    overlap_sentences=2,    # sentences shared between chunks (default: 2)
    encoding_name="cl100k_base",  # tiktoken encoding (default: cl100k_base)
)
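
The packing logic can be sketched as a greedy loop over sentences. This is an illustration under stated assumptions, not the SDK's algorithm: sentences are split with a simple regex, tokens are counted by whitespace as a stand-in for tiktoken, and a single sentence longer than max_tokens is emitted unsplit (the real strategy enforces a hard cap). The names count_tokens and sentence_chunks are hypothetical.

```python
import re


def count_tokens(s: str) -> int:
    # Stand-in for a tiktoken count: whitespace-separated words (assumption)
    return len(s.split())


def sentence_chunks(text: str, max_tokens: int = 512, overlap_sentences: int = 2) -> list[str]:
    """Greedily pack whole sentences into chunks, carrying trailing sentences over as overlap."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks: list[str] = []
    current: list[str] = []
    for sentence in sentences:
        if current and count_tokens(" ".join(current + [sentence])) > max_tokens:
            chunks.append(" ".join(current))
            # Start the next chunk with the last few sentences of the previous one
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
            # Drop overlap from the front until the new sentence fits under the cap
            while current and count_tokens(" ".join(current + [sentence])) > max_tokens:
                current = current[1:]
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "One two three. Four five six. Seven eight nine. Ten eleven twelve."
for chunk in sentence_chunks(text, max_tokens=6, overlap_sentences=1):
    print(chunk)
```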

Built-in: ContextualChunking

Sentence-boundary chunking with LLM-generated context prefixes prepended to each chunk (Anthropic's contextual retrieval approach). Improves retrieval for cross-chunk co-reference questions.

from graphrag_sdk.ingestion.chunking_strategies.contextual_chunking import ContextualChunking

chunker = ContextualChunking(
    llm=my_llm,
    max_tokens=512,              # token cap per chunk (default: 512)
    overlap_sentences=2,         # sentence overlap (default: 2)
    max_document_tokens=16_000,  # truncation limit for the doc reference in prompts (default: 16000)
)

Cost note: generates one LLM call per chunk at ingestion time.

Built-in: CallableChunking (bring your own framework)

Adapts any text -> list[str] function into a chunking strategy. Use this to plug in any chunking library -- LlamaIndex, LangChain, Unstructured, spaCy, or your own logic -- without the SDK carrying those dependencies.

Works with sync functions, async functions, and callable classes.

from graphrag_sdk.ingestion.chunking_strategies.callable_chunking import CallableChunking

# Plain function
chunker = CallableChunking(lambda text: text.split("\n\n"))
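
The three supported callable shapes look like this. The sketch runs without the SDK: call_chunker is a hypothetical helper showing one way an adapter could accept both sync and async callables (by awaiting the result only when it is awaitable), and it is not CallableChunking's actual implementation.

```python
import asyncio
import inspect


async def call_chunker(fn, text: str) -> list[str]:
    """Call fn(text) and await the result only if it is awaitable (hypothetical adapter)."""
    result = fn(text)
    if inspect.isawaitable(result):
        result = await result
    return list(result)


def split_paragraphs(text: str) -> list[str]:
    # Sync function
    return text.split("\n\n")


async def split_lines(text: str) -> list[str]:
    # Async function
    return text.splitlines()


class WordWindows:
    # Callable class: non-overlapping windows of `size` words
    def __init__(self, size: int):
        self.size = size

    def __call__(self, text: str) -> list[str]:
        words = text.split()
        return [" ".join(words[i:i + self.size]) for i in range(0, len(words), self.size)]


print(asyncio.run(call_chunker(split_paragraphs, "a\n\nb")))
print(asyncio.run(call_chunker(split_lines, "a\nb")))
print(asyncio.run(call_chunker(WordWindows(2), "a b c")))
```

Each of split_paragraphs, split_lines, and WordWindows(2) has the text -> list[str] shape described above, so each should be usable in place of the lambda in the CallableChunking example.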

Built-in: StructuralChunking

Groups content by heading hierarchy into token-bounded chunks. Each chunk stores a breadcrumbs metadata field that is written as a property on the Chunk node in the knowledge graph, making section paths directly queryable via Cypher.

from graphrag_sdk.ingestion.chunking_strategies.structural_chunking import StructuralChunking

chunker = StructuralChunking(
    max_tokens=512,  # max tokens per chunk (default: 512)
    overlap_sentences=2,  # sentences shared between chunks (default: 2)
)

Design Features:
- Strict Fallback Configuration: If you supply a custom fallback_chunker (to handle elements that individually exceed max_tokens), you cannot pass shorthand arguments like overlap_sentences or encoding_name to StructuralChunking. Those must be configured directly on your custom fallback chunker instance. This prevents configuration parameters from being silently dropped.
- Deep-Tree Resilience: While loaders like MarkdownLoader produce flat element lists, the internal _flatten algorithm uses a recursive DFS approach. This guarantees future compatibility with highly nested DOM structures (like HTML or DOCX parsers) while preserving full hierarchical breadcrumbs.
- Graceful Raw Text Fallback: Designed to compose safely with any loader. If the preceding loader does not extract structural AST elements (e.g., PdfLoader or TextLoader, which output elements=None), the chunker gracefully bypasses its structural logic and delegates the entire raw text to the fallback chunker, without crashing or dropping content.
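
As an illustration of how breadcrumbs follow the heading hierarchy, the sketch below walks a flat element list and tracks the open headings on a stack. The element tuples, the " > " separator, and the name breadcrumbs_for are all assumptions made for this example; the SDK's element model and breadcrumb format may differ.

```python
def breadcrumbs_for(elements):
    """Attach a heading path to every text element.

    elements: list of ("heading", level, title) or ("text", body) tuples (hypothetical shape).
    """
    stack = []  # open headings as (level, title), outermost first
    out = []
    for element in elements:
        if element[0] == "heading":
            _, level, title = element
            # A new heading closes every open heading at the same or a deeper level
            while stack and stack[-1][0] >= level:
                stack.pop()
            stack.append((level, title))
        else:
            path = " > ".join(title for _, title in stack)
            out.append({"text": element[1], "breadcrumbs": path})
    return out


doc = [
    ("heading", 1, "Guide"), ("text", "intro"),
    ("heading", 2, "Install"), ("text", "pip"),
    ("heading", 2, "Usage"), ("text", "run"),
    ("heading", 1, "FAQ"), ("text", "q"),
]
for row in breadcrumbs_for(doc):
    print(row)
```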

Writing Your Own

class SentenceChunking(ChunkingStrategy):
    async def chunk(self, text: str, ctx: Context) -> TextChunks:
        import nltk
        sentences = nltk.sent_tokenize(text)
        chunks = []
        for i, sent in enumerate(sentences):
            chunks.append(TextChunk(text=sent, index=i))
        return TextChunks(chunks=chunks)

3. ExtractionStrategy

Extracts entities, relationships, and entity mentions from text chunks.

ABC

from graphrag_sdk import ExtractionStrategy

class ExtractionStrategy(ABC):
    @abstractmethod
    async def extract(
        self,
        chunks: TextChunks,
        schema: GraphSchema,
        ctx: Context,
    ) -> GraphData:
        """Extract graph data from text chunks."""
        ...

Built-in: GraphExtraction

Composable 2-step extraction with pluggable entity NER and LLM relationship extraction.

Step 1 -- Entity NER (pluggable via EntityExtractor ABC):
- GLiNERExtractor (default): Local GLiNER transformer model, no API calls. Returns typed entities with confidence scores and character spans.
- LLMExtractor: Uses a structured NER prompt. Returns entities with confidence, spans, and descriptions.
- Custom: Subclass EntityExtractor and implement extract_entities().

Step 2 -- LLM Verify + Relationship Extraction: The LLM receives the pre-extracted entities and original text, verifies entities (removes invalid, adds missed), and extracts relationships with descriptions, keywords, confidence, and evidence spans.

Entity Ontology: Every extracted entity is mapped to a known type from the ontology. Entities that don't match any type are labeled "Unknown". There are three ways to define the ontology:

1. Use the defaults (11 built-in types, good for general use):

from graphrag_sdk import GraphExtraction

# Uses: Person, Organization, Technology, Product, Location, Date,
#       Event, Concept, Law, Dataset, Method
extractor = GraphExtraction(llm=llm)

2. Pass entity_types directly (overrides defaults completely):

# Biomedical domain
extractor = GraphExtraction(
    llm=llm,
    entity_types=["Gene", "Protein", "Disease", "Drug", "Pathway"],
)

# Legal domain
extractor = GraphExtraction(
    llm=llm,
    entity_types=["Person", "Organization", "Law", "Court", "Jurisdiction", "Date"],
)

3. Use GraphSchema entities (schema types override both defaults and entity_types):

from graphrag_sdk import GraphRAG, GraphSchema, EntityType

schema = GraphSchema(entities=[
    EntityType(label="Vehicle", description="Cars, trucks, etc."),
    EntityType(label="Road", description="Streets, highways, etc."),
    EntityType(label="Location", description="Cities, countries, etc."),
])

# Schema entity types are automatically used for extraction
rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, schema=schema)
await rag.ingest("traffic_report.txt")
# Extraction uses: ["Vehicle", "Road", "Location"]

The priority order is: schema.entities > entity_types parameter > defaults.
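
The priority order can be written as a short fallback chain. This is a sketch of the documented rule only: effective_entity_types is a hypothetical helper, and treating an empty schema or an empty entity_types list as "not provided" is an assumption.

```python
DEFAULT_ENTITY_TYPES = [
    "Person", "Organization", "Technology", "Product", "Location", "Date",
    "Event", "Concept", "Law", "Dataset", "Method",
]


def effective_entity_types(schema_labels=None, entity_types=None):
    """Apply the order schema.entities > entity_types parameter > defaults (hypothetical helper)."""
    if schema_labels:   # labels taken from GraphSchema entities win
        return list(schema_labels)
    if entity_types:    # otherwise the entity_types parameter
        return list(entity_types)
    return list(DEFAULT_ENTITY_TYPES)


print(effective_entity_types(["Vehicle", "Road"], ["Gene", "Protein"]))
print(effective_entity_types(None, ["Gene", "Protein"]))
print(len(effective_entity_types()))
```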

Default entity types: Person, Organization, Technology, Product, Location, Date, Event, Concept, Law, Dataset, Method.

Choosing an Entity Extractor

from graphrag_sdk import GraphExtraction, GLiNERExtractor, LLMExtractor

# Default: GLiNER for step 1, LLM for step 2
extractor = GraphExtraction(llm=llm)

# Use LLM for step 1 instead of GLiNER
extractor = GraphExtraction(
    llm=llm,
    entity_extractor=LLMExtractor(llm),
)

# GLiNER with custom threshold
extractor = GraphExtraction(
    llm=llm,
    entity_extractor=GLiNERExtractor(threshold=0.6),
)

# With coreference resolution
from graphrag_sdk import FastCorefResolver

extractor = GraphExtraction(
    llm=llm,
    coref_resolver=FastCorefResolver(),  # pip install graphrag-sdk[fastcoref]
)

Entity Extractors:

| Class | Description | Parameters |
|-------|-------------|------------|
| GLiNERExtractor | Local GLiNER model (default, no API calls) | threshold=0.75, model_name="urchade/gliner_medium-v2.1" |
| LLMExtractor | LLM-based NER via structured prompt | llm (required), threshold=0.75 |
| Custom subclass | Your own EntityExtractor subclass | Implement extract_entities() |

All extractors share the same threshold behavior: entities with confidence below the threshold are labeled "Unknown".
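
A minimal sketch of that threshold rule, using plain dicts in place of the SDK's entity objects (the dict layout and the name apply_threshold are assumptions). An entity whose confidence equals the threshold keeps its type, since only values below it are relabeled.

```python
def apply_threshold(entities, threshold=0.75):
    """Relabel entities whose confidence is below the threshold as "Unknown"."""
    return [
        {**entity, "type": entity["type"] if entity["confidence"] >= threshold else "Unknown"}
        for entity in entities
    ]


found = [
    {"name": "Paris", "type": "Location", "confidence": 0.91},
    {"name": "the plan", "type": "Concept", "confidence": 0.40},
]
for entity in apply_threshold(found):
    print(entity["name"], "->", entity["type"])
```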

Graph output:
- All relationships use RELATES edge type. The original type (e.g. WORKS_AT) is in properties["rel_type"].
- Entity IDs are type-qualified: compute_entity_id("Paris", "Location") -> "paris__location".
- Character spans stored as properties["spans"] = {chunk_id: [{start, end}]} on both entities and relationships.
- Entity mentions (MENTIONED_IN edges) link entities to source chunks.
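
The sketch below mirrors the documented output shape with plain dicts. It is not the SDK's code: sketch_entity_id only reproduces the one documented example (lowercase name and type joined by a double underscore), so any further normalization done by compute_entity_id is not covered, and the relationship dict layout is hypothetical.

```python
def sketch_entity_id(name: str, label: str) -> str:
    # Reproduces the documented example only; the real normalization may do more
    return f"{name.strip().lower()}__{label.strip().lower()}"


def sketch_relationship(source_id: str, target_id: str, rel_type: str) -> dict:
    # Every edge is typed RELATES; the extracted type is kept as a property
    return {
        "type": "RELATES",
        "source": source_id,
        "target": target_id,
        "properties": {"rel_type": rel_type},
    }


city = sketch_entity_id("Paris", "Location")
person = sketch_entity_id("Paris", "Person")
print(city, person)
print(sketch_relationship(person, city, "LIVES_IN"))
```

Because the type is part of the ID, the Person "Paris" and the Location "Paris" get distinct IDs and do not collapse into one node.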

Writing Your Own Entity Extractor

Subclass EntityExtractor and implement extract_entities():

from graphrag_sdk import EntityExtractor, GraphExtraction
from graphrag_sdk.core.models import ExtractedEntity

class SpaCyExtractor(EntityExtractor):
    def __init__(self, model_name="en_core_web_sm"):
        import spacy
        self._nlp = spacy.load(model_name)

    async def extract_entities(self, text, entity_types, source_chunk_id):
        import asyncio
        doc = await asyncio.to_thread(self._nlp, text)
        return [
            ExtractedEntity(
                name=ent.text,
                type=ent.label_,  # spaCy's own label (e.g. "PERSON", "ORG"); map it to your entity_types if they differ
                description="",
                source_chunk_ids=[source_chunk_id],
                spans={source_chunk_id: [{"start": ent.start_char, "end": ent.end_char}]},
            )
            for ent in doc.ents
        ]

# Use it
extractor = GraphExtraction(llm=llm, entity_extractor=SpaCyExtractor())
await rag.ingest("doc.txt", extractor=extractor)

Writing Your Own Extraction Strategy

Replace the entire 2-step pipeline by subclassing ExtractionStrategy:

class MyExtraction(ExtractionStrategy):
    async def extract(self, chunks, schema, ctx):
        nodes, rels = [], []
        for chunk in chunks.chunks:
            # Your extraction logic
            ...
        return GraphData(nodes=nodes, relationships=rels)

await rag.ingest("doc.txt", extractor=MyExtraction())

4. ResolutionStrategy

Deduplicates entities that refer to the same real-world thing.

ABC

from graphrag_sdk import ResolutionStrategy

class ResolutionStrategy(ABC):
    @abstractmethod
    async def resolve(
        self,
        graph_data: GraphData,
        ctx: Context,
    ) -> ResolutionResult:
        """Deduplicate entities in the graph data."""
        ...

Returns ResolutionResult with deduplicated nodes, remapped relationships, and merged_count.

Built-in: ExactMatchResolution

Deduplicates by exact property match (default: id). Fast, no LLM calls.

from graphrag_sdk.ingestion.resolution_strategies.exact_match import ExactMatchResolution

resolver = ExactMatchResolution(
    resolve_property="id",  # property to match on (default: "id")
)

When to use: Default. Fast and deterministic. Works well when extraction produces consistent entity IDs.
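
Exact-match deduplication can be sketched as keeping the first node seen for each property value and remapping relationship endpoints onto the survivors. The dict-based nodes and the name exact_match_resolve are assumptions for illustration; the SDK works on GraphData and returns a ResolutionResult.

```python
def exact_match_resolve(nodes, relationships, resolve_property="id"):
    """Return (deduplicated nodes, remapped relationships, merged_count)."""
    canonical = {}  # property value -> first node seen with that value
    remap = {}      # original node id -> id of the surviving node
    for node in nodes:
        survivor = canonical.setdefault(node[resolve_property], node)
        remap[node["id"]] = survivor["id"]
    remapped = [
        {**rel,
         "source": remap.get(rel["source"], rel["source"]),
         "target": remap.get(rel["target"], rel["target"])}
        for rel in relationships
    ]
    return list(canonical.values()), remapped, len(nodes) - len(canonical)


nodes = [
    {"id": "a", "name": "Paris"},
    {"id": "b", "name": "Paris"},
    {"id": "c", "name": "Rome"},
]
rels = [{"source": "b", "target": "c", "type": "RELATES"}]
print(exact_match_resolve(nodes, rels, resolve_property="name"))
```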

Built-in: DescriptionMergeResolution

Deduplicates by (normalized name, label) -- same-name entities with different labels (e.g. Person "Paris" vs Location "Paris") are kept separate. Merges descriptions:
- If an entity has fewer than force_summary_threshold descriptions: concatenates them
- If it has force_summary_threshold or more: uses the LLM to summarize them into a single description

from graphrag_sdk.ingestion.resolution_strategies.description_merge import DescriptionMergeResolution

resolver = DescriptionMergeResolution(
    llm=llm,                       # LLMInterface for summarization (optional)
    force_summary_threshold=3,     # Trigger LLM summary at this many descriptions (default: 3)
    max_summary_tokens=500,        # Max tokens for LLM summary (default: 500)
)

When to use: Multi-document ingestion where the same entity appears with different descriptions. Used in the benchmark-winning pipeline.
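
The merge rule can be sketched as grouping by (normalized name, label) and choosing between concatenation and summarization per group. This is an illustration, not the SDK's implementation: entities are plain dicts, normalization is reduced to strip-and-lowercase, descriptions are joined with a space, and summarize is a hypothetical callable standing in for the LLM call.

```python
from collections import defaultdict


def merge_descriptions(entities, summarize, force_summary_threshold=3):
    """Group by (normalized name, label), then concatenate or summarize each group's descriptions."""
    groups = defaultdict(list)
    for entity in entities:
        key = (entity["name"].strip().lower(), entity["label"])
        groups[key].append(entity["description"])
    merged = {}
    for key, descriptions in groups.items():
        if len(descriptions) < force_summary_threshold:
            merged[key] = " ".join(descriptions)     # few descriptions: concatenate
        else:
            merged[key] = summarize(descriptions)    # threshold reached: summarize
    return merged


entities = [
    {"name": "Paris", "label": "Location", "description": "Capital of France."},
    {"name": "paris", "label": "Location", "description": "City on the Seine."},
    {"name": "Paris ", "label": "Location", "description": "Host of the 2024 Olympics."},
    {"name": "Paris", "label": "Person", "description": "A Trojan prince."},
]
result = merge_descriptions(entities, summarize=lambda ds: f"summary of {len(ds)} descriptions")
for key, description in result.items():
    print(key, "->", description)
```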


5. RetrievalStrategy

Searches the knowledge graph to find context for answering a question. Uses the Template Method pattern: search() handles validation and formatting, and you implement only _execute().

ABC

from graphrag_sdk import RetrievalStrategy

class RetrievalStrategy(ABC):
    def __init__(self, graph_store=None, vector_store=None):
        self.graph_store = graph_store
        self.vector_store = vector_store

    async def search(self, query: str, ctx: Context = None, **kwargs) -> RetrieverResult:
        """Public API: validate -> execute -> format."""
        ...

    @abstractmethod
    async def _execute(self, query: str, ctx: Context, **kwargs) -> RawSearchResult:
        """Implement your search logic here."""
        ...
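
The division of labor between search() and _execute() can be sketched with a toy base class that runs without the SDK. The class names, the empty-query check, and the string formatting are assumptions for illustration; what the real search() validates and how it builds a RetrieverResult is not shown on this page.

```python
import asyncio
from abc import ABC, abstractmethod


class SketchRetrieval(ABC):
    """Toy Template Method: search() fixes the workflow, subclasses fill in _execute()."""

    async def search(self, query: str, **kwargs) -> list[str]:
        if not query or not query.strip():        # validate (assumed check)
            raise ValueError("query must be a non-empty string")
        raw = await self._execute(query.strip(), **kwargs)   # execute
        return [str(record) for record in raw]    # format

    @abstractmethod
    async def _execute(self, query: str, **kwargs):
        ...


class KeywordRetrieval(SketchRetrieval):
    def __init__(self, documents):
        self.documents = documents

    async def _execute(self, query, **kwargs):
        # Only the search logic lives here; validation and formatting are inherited
        return [doc for doc in self.documents if query.lower() in doc.lower()]


retriever = KeywordRetrieval(["Graph databases", "Vector search"])
print(asyncio.run(retriever.search("  graph ")))
```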

Built-in: LocalRetrieval

Simple retrieval: vector search on chunks + 1-hop entity traversal.

from graphrag_sdk.retrieval.strategies.local import LocalRetrieval

retriever = LocalRetrieval(
    graph_store=rag.graph_store,
    vector_store=rag.vector_store,
    embedder=embedder,
    top_k=5,                # chunks to retrieve (default: 5)
    include_entities=True,  # include connected entities (default: True)
)

When to use: Simple use cases, low latency requirements, small graphs.

Built-in: MultiPathRetrieval

Production-grade retrieval with RELATES edge vector search, 2-path entity discovery, 4-path chunk retrieval, and cosine reranking. This is the default and the benchmark-winning strategy.

from graphrag_sdk import MultiPathRetrieval

retriever = MultiPathRetrieval(
    graph_store=rag.graph_store,
    vector_store=rag.vector_store,
    embedder=embedder,
    llm=llm,
    chunk_top_k=15,         # final chunks after reranking (default: 15)
    max_entities=30,        # total entity cap (default: 30)
    max_relationships=20,   # max relationships in context (default: 20)
    rel_top_k=15,           # RELATES edge vector search results (default: 15)
    keyword_limit=10,       # max keywords from question (default: 10)
)

rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, retrieval_strategy=retriever)

When to use: Default choice. Best accuracy on benchmark. Handles complex multi-hop questions.

Writing Your Own

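class GlobalRetrieval(RetrievalStrategy):
    def __init__(self, embed_query, graph_store=None, vector_store=None):
        super().__init__(graph_store=graph_store, vector_store=vector_store)
        # embed_query is a hypothetical async callable (str -> vector) that you supply
        self.embed_query = embed_query

    async def _execute(self, query, ctx, **kwargs):
        # Your custom retrieval logic: embed the question, then search the vector store
        query_vector = await self.embed_query(query)
        results = await self.vector_store.search(query_vector, top_k=20)
        return RawSearchResult(records=results)

retriever = GlobalRetrieval(embed_query, graph_store=rag.graph_store, vector_store=rag.vector_store)
rag = GraphRAG(connection=conn, llm=llm, embedder=embedder, retrieval_strategy=retriever)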

6. RerankingStrategy

Reranks retrieval results before they are passed to the LLM for answer generation.

ABC

from graphrag_sdk import RerankingStrategy

class RerankingStrategy(ABC):
    @abstractmethod
    async def rerank(
        self,
        query: str,
        result: RetrieverResult,
        ctx: Context,
    ) -> RetrieverResult:
        """Rerank and filter retrieval results."""
        ...

Built-in: CosineReranker

Reranks by cosine similarity between query embedding and item embeddings.

from graphrag_sdk import CosineReranker

reranker = CosineReranker(
    embedder=embedder,
    top_k=15,          # keep top N results (default: 15)
)

result = await rag.completion("question", reranker=reranker)

Note: MultiPathRetrieval already includes cosine reranking internally. The standalone CosineReranker is useful when using LocalRetrieval or a custom strategy.
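
For reference, cosine reranking reduces to scoring each item's embedding against the query embedding and keeping the top_k highest. The sketch below is pure Python and is not CosineReranker's code: the (content, embedding) pair layout and the function names are assumptions, and a zero-length vector is scored as 0.0 by choice.

```python
import math


def cosine(a, b) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cosine_rerank(query_vector, items, top_k=15):
    """items: list of (content, embedding) pairs; returns the top_k contents, best first."""
    ranked = sorted(items, key=lambda item: cosine(query_vector, item[1]), reverse=True)
    return [content for content, _ in ranked[:top_k]]


items = [("a", [0.0, 1.0]), ("b", [1.0, 0.0]), ("c", [1.0, 1.0])]
print(cosine_rerank([1.0, 0.0], items, top_k=2))
```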

Writing Your Own

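class LLMReranker(RerankingStrategy):
    def __init__(self, llm, top_k=10):
        self.llm = llm      # LLM used to score each item
        self.top_k = top_k  # number of items to keep

    async def rerank(self, query, result, ctx):
        # Score each item with the LLM and sort
        scored = []
        for item in result.items:
            score = await self.llm.ainvoke(f"Rate relevance 0-10: {query} vs {item.content}")
            # Assumes the reply is a bare number; add parsing if your LLM returns prose
            scored.append((float(score.content), item))
        # Sort on the score only, so tied scores never compare the items themselves
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return RetrieverResult(items=[item for _, item in scored[:self.top_k]])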