Chunking Strategies

Learn about document chunking strategies and how to optimize for your use case.

7 min read | Updated 2026-01-16

How documents are split into chunks directly affects search quality: chunks that are too large dilute relevance, while chunks that are too small lose surrounding context.

Default Strategy

python
{
  "strategy": "recursive",
  "chunk_size": 512,      # tokens
  "chunk_overlap": 64,    # tokens
  "separators": ["\n\n", "\n", ". ", " "]
}

Strategies

Recursive (Default)

Splits on natural boundaries (paragraphs, sentences, words).

python
client.documents.upload(
    file,
    chunking={"strategy": "recursive"}
)
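
To make the idea concrete, here is a minimal sketch of recursive splitting: try the coarsest separator first and fall back to finer ones only for pieces that are still too large. It is an illustration, not the service's implementation. The helper names `count_tokens` and `recursive_split` are hypothetical, and whitespace-separated words stand in for tokens.

```python
def count_tokens(text):
    # Assumption: whitespace-separated words stand in for real tokens.
    return len(text.split())


def recursive_split(text, chunk_size, separators=("\n\n", "\n", ". ", " ")):
    """Split on the coarsest separator first, recursing into oversized pieces."""
    if count_tokens(text) <= chunk_size:
        return [text] if text.strip() else []
    if not separators:
        # No separators left: fall back to a hard split on words.
        words = text.split()
        return [" ".join(words[i:i + chunk_size])
                for i in range(0, len(words), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks, current = [], ""
    for piece in text.split(sep):
        candidate = piece if not current else current + sep + piece
        if count_tokens(candidate) <= chunk_size:
            current = candidate  # keep merging small pieces together
            continue
        if current:
            chunks.append(current)
            current = ""
        if count_tokens(piece) <= chunk_size:
            current = piece
        else:
            # The piece alone is too large: retry with finer separators.
            chunks.extend(recursive_split(piece, chunk_size, finer))
    if current:
        chunks.append(current)
    return chunks


print(recursive_split("a b c\n\nd e f g h\n\ni j", chunk_size=3))
```

In this sketch, paragraphs that fit are kept whole, and only the oversized middle paragraph is broken down further.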

Fixed Size

Splits text into equal-sized chunks, regardless of content structure.

python
chunking={"strategy": "fixed", "chunk_size": 256}
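
As a rough illustration of what fixed-size splitting means, the sketch below slices a token list into consecutive windows of `chunk_size`. The function name `fixed_chunks` is hypothetical, and the integer list stands in for real tokenizer output.

```python
def fixed_chunks(tokens, chunk_size):
    """Slice tokens into consecutive windows of chunk_size; the last may be shorter."""
    return [tokens[i:i + chunk_size] for i in range(0, len(tokens), chunk_size)]


tokens = list(range(600))  # placeholder for 600 real tokens
chunks = fixed_chunks(tokens, chunk_size=256)
print([len(c) for c in chunks])
```

Every chunk except possibly the last has exactly `chunk_size` tokens, so a boundary can fall mid-sentence.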

Semantic

Groups related content together using embeddings.

python
chunking={"strategy": "semantic", "similarity_threshold": 0.7}
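
One plausible reading of `similarity_threshold` is sketched below: consecutive sentences stay in the same chunk while the cosine similarity of their embeddings is at or above the threshold. This is an assumption about the general technique, not a description of the service's algorithm. The function names are hypothetical, and the two-dimensional vectors are toy values standing in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def semantic_groups(sentences, vectors, similarity_threshold=0.7):
    """Start a new group whenever similarity to the previous sentence drops."""
    groups = []
    for i, sentence in enumerate(sentences):
        if i > 0 and cosine(vectors[i - 1], vectors[i]) >= similarity_threshold:
            groups[-1].append(sentence)
        else:
            groups.append([sentence])
    return groups


sentences = ["Cats purr.", "Kittens purr too.", "Invoices are due monthly."]
vectors = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0)]  # toy embeddings
print(semantic_groups(sentences, vectors))
```

Under this reading, a higher threshold produces more, smaller groups, and a lower threshold merges more loosely related sentences.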

By Header

Splits on document headers (H1, H2, etc.).

python
chunking={"strategy": "header", "max_levels": 2}
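
The sketch below shows one way header-based splitting could work on Markdown input. It assumes `max_levels` is the deepest header level that starts a new chunk, so deeper headers stay inside their parent section. That interpretation, and the function name `split_by_header`, are assumptions for illustration.

```python
import re


def split_by_header(markdown, max_levels=2):
    """Start a new section at every Markdown header of depth <= max_levels."""
    header = re.compile(r"^#{1,%d} " % max_levels)
    sections, current = [], []
    for line in markdown.splitlines():
        if header.match(line) and current:
            sections.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        sections.append("\n".join(current))
    return sections


doc = "# A\nintro\n## B\ntext\n### C\nmore\n## D\nend"
for section in split_by_header(doc, max_levels=2):
    print(repr(section))
```

With `max_levels=2`, the `### C` subsection stays inside the `## B` chunk rather than becoming its own chunk.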

Chunk Size Guidelines

Document Type      Recommended Size
Technical docs     256-512 tokens
Legal documents    512-1024 tokens
Conversations      128-256 tokens
Code               256-512 tokens

Overlap

Overlap repeats the last tokens of each chunk at the start of the next one, so context isn't lost at chunk boundaries:

python
chunking={
    "chunk_size": 512,
    "chunk_overlap": 64  # 12.5% overlap
}
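
To illustrate the arithmetic: with `chunk_size` 512 and `chunk_overlap` 64, each new chunk starts 512 - 64 = 448 tokens after the previous one, and 64 / 512 = 12.5% of each chunk repeats its predecessor. The sketch below shows this with a sliding window. The function name `overlapping_windows` is hypothetical, and the integer list stands in for real tokens.

```python
def overlapping_windows(tokens, chunk_size=512, chunk_overlap=64):
    """Slide a window of chunk_size tokens, advancing by chunk_size - chunk_overlap."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    stride = chunk_size - chunk_overlap
    windows = []
    for start in range(0, len(tokens), stride):
        windows.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # this window already reaches the end of the document
    return windows


tokens = list(range(1000))  # placeholder for 1000 real tokens
windows = overlapping_windows(tokens)
print([(w[0], w[-1]) for w in windows])
```

The last 64 tokens of each window are the first 64 tokens of the next, so a sentence that straddles a boundary appears whole in at least one chunk if it is shorter than the overlap.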

Custom Metadata

Attach document metadata to each chunk to give it context:

python
chunking={
    "include_metadata": True,
    "metadata_fields": ["title", "section", "page_number"]
}
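
As a rough picture of the effect, the sketch below copies the selected fields from a document-level metadata record onto every chunk. The chunk shape (`text` plus `metadata`) and the helper name `attach_metadata` are assumptions for illustration, not the service's actual output format.

```python
def attach_metadata(chunks, document_metadata, metadata_fields):
    """Copy the selected metadata fields onto each chunk (hypothetical shape)."""
    selected = {
        field: document_metadata[field]
        for field in metadata_fields
        if field in document_metadata  # silently skip fields the document lacks
    }
    return [{"text": chunk, "metadata": dict(selected)} for chunk in chunks]


document_metadata = {"title": "Handbook", "section": "Onboarding", "author": "HR"}
enriched = attach_metadata(
    ["First chunk.", "Second chunk."],
    document_metadata,
    metadata_fields=["title", "section", "page_number"],
)
print(enriched[0])
```

In this sketch, fields not listed in `metadata_fields` (here `author`) are dropped, and listed fields that the document lacks (here `page_number`) are omitted rather than filled with a placeholder.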