Agentic RAG is a retrieval approach where AI agents use primitive search tools (grep, glob, LSP servers) to find information on demand, instead of pre-indexing content into vector databases with embeddings and chunking.

How is agentic RAG different from traditional RAG?

Traditional RAG pre-processes content into chunks, generates embeddings, stores them in vector databases, and uses similarity search. Agentic RAG lets agents search directly with tools like grep, file traversal, and structured symbol lookups at query time.

Why does Claude Code use agentic RAG instead of vector databases?

Claude Code uses agentic RAG because it avoids chunking artifacts, preserves full context, handles code structure better via LSP servers, and scales to large repositories without embedding costs. The agent decides what to search and when.

What is PageIndex and how does it work?

PageIndex is a RAG approach that uses graph-based indexing without vector embeddings. It organizes content hierarchically (projects, files, sections, pages) and retrieves via graph traversal instead of similarity search, reducing chunking and embedding overhead.

When should I use traditional RAG vs agentic RAG?

Use traditional RAG for semantic similarity queries on unstructured text (e.g., finding similar documents). Use agentic RAG for structured data like code, well-organized documentation, or when you need precise symbol-level retrieval with full context.

RAG vs Agentic RAG: why search beats embeddings for code | explainx.ai Blog

A growing debate in the AI retrieval space: do you need vector databases at all?

Traditional Retrieval-Augmented Generation (RAG) has become the standard for giving large language models access to external knowledge. The pattern is familiar:

Chunk your documents into pieces
Generate embeddings for each chunk
Store embeddings in a vector database
At query time, embed the query and search for similar chunks
Feed retrieved chunks to the LLM

But there is another approach gaining traction: agentic RAG. Instead of pre-indexing everything, you give the agent primitive search tools and let it find what it needs on demand.

The traditional RAG pipeline

RAG has been the dominant pattern for grounding LLMs in external knowledge since 2020. Here is how it works:

1. Chunking

Break documents into smaller pieces (typically 256-1024 tokens) because:

Embeddings have size limits
Smaller chunks improve retrieval precision
LLMs have context window constraints

Problem: Chunking destroys context. A function split across two chunks loses coherence. Overlapping windows help but add redundancy.
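To make that failure mode concrete, here is a minimal sketch of fixed-size chunking with overlap. It splits on whitespace as a stand-in for a real tokenizer, and the function name `chunk_tokens`, the window size, and the overlap are illustrative assumptions, not any particular library's API.

```python
def chunk_tokens(text: str, size: int = 8, overlap: int = 2) -> list[str]:
    """Split text into overlapping windows of whitespace-separated tokens.

    Whitespace splitting is a stand-in for a real tokenizer (assumption).
    """
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    tokens = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(" ".join(tokens[start:start + size]))
        if start + size >= len(tokens):
            break  # the last window already reached the end of the text
    return chunks


source = (
    "def authenticate(user, password): "
    "record = db.find(user) "
    "if record is None: return False "
    "return verify(password, record.hash)"
)

chunks = chunk_tokens(source, size=8, overlap=2)
for i, chunk in enumerate(chunks):
    print(i, repr(chunk))
```

The function signature ends up in the first chunk and the final return statement in the last one, so no single chunk contains the whole function. The overlap repeats a couple of tokens across the boundary but does not restore the missing body.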

2. Embedding

Convert each chunk into a dense vector (e.g., 768 or 1536 dimensions) using models like:

OpenAI text-embedding-3
Cohere embed-v3
Sentence-BERT variants

Problem: Embeddings are lossy. Semantic similarity doesn't always match intent. Code structure matters more than surface-level similarity.

3. Vector storage

Store embeddings in specialized databases:

Pinecone
Weaviate
Chroma
Qdrant
pgvector

Problem: Infrastructure overhead. You now manage an additional database, syncing, versioning, and reindexing when content changes.

4. Similarity search

At query time:

Embed the user query
Find top-k nearest neighbors (cosine similarity, dot product)
Return associated chunks

Problem: Similarity search is fuzzy by design, and most vector databases use approximate nearest-neighbor indexes. You might miss exact matches or retrieve irrelevant "similar" content.
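The ranking step can be sketched in a few lines. This is a brute-force version in plain Python, assuming made-up three-dimensional vectors in place of real embeddings; the chunk names and the query vector are hypothetical, and production systems use an approximate index rather than scoring every chunk.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query: list[float], index: dict[str, list[float]], k: int = 2) -> list[tuple[str, float]]:
    """Brute-force nearest neighbors: score every chunk, keep the k best."""
    scored = [(chunk_id, cosine(query, vec)) for chunk_id, vec in index.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)[:k]


# Made-up 3-dimensional "embeddings"; real ones have hundreds of dimensions.
index = {
    "login()": [0.9, 0.1, 0.0],
    "logout()": [0.8, 0.2, 0.1],
    "render_chart()": [0.0, 0.1, 0.9],
}
query = [0.85, 0.15, 0.05]  # hypothetical embedding of "login function"

results = top_k(query, index, k=2)
for chunk_id, score in results:
    print(f"{chunk_id}: {score:.3f}")
```

With these toy vectors the logout chunk scores almost as high as the login chunk, which is the "similar but not what you asked for" problem in miniature.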

5. LLM generation

Feed retrieved chunks as context to the LLM and generate a response.

Problem: Chunks might not contain enough context. The LLM doesn't know what was excluded.

The case for agentic RAG

Agentic RAG flips the script: instead of pre-indexing, give the agent search tools and let it decide what to retrieve.

What is agentic RAG?

Agentic RAG means:

No pre-chunking - Content stays intact
No embeddings - No dense vectors
No vector database - No similarity search
Tool-based search - Agents use grep, glob, file reads, LSP servers, symbol search

The agent gets tools like:

Grep: Search file contents by regex
Glob: Find files matching patterns
Read: Read specific files
LSP servers: Navigate code symbols (functions, classes, imports)
Structured traversal: Follow links, references, imports
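The first three tools are small enough to sketch directly. The version below is a minimal illustration using only the Python standard library; the function names `glob_files`, `grep`, and `read_file` and the throwaway repository are assumptions made for the example, not the tool definitions of any specific agent.

```python
import re
import tempfile
from pathlib import Path


def glob_files(root: Path, pattern: str) -> list[Path]:
    """Glob: find files under root matching a pattern such as '**/auth*.ts'."""
    return sorted(p for p in root.glob(pattern) if p.is_file())


def grep(paths: list[Path], regex: str) -> list[tuple[Path, int, str]]:
    """Grep: return (path, line number, line) for every line matching regex."""
    compiled = re.compile(regex)
    hits = []
    for path in paths:
        lines = path.read_text(encoding="utf-8").splitlines()
        for lineno, line in enumerate(lines, start=1):
            if compiled.search(line):
                hits.append((path, lineno, line.strip()))
    return hits


def read_file(path: Path) -> str:
    """Read: return the whole file, with no chunking."""
    return path.read_text(encoding="utf-8")


# Build a tiny throwaway repository so the sketch runs anywhere.
repo = Path(tempfile.mkdtemp())
(repo / "src").mkdir()
(repo / "src" / "auth.ts").write_text(
    "import { hash } from './crypto';\n"
    "export function authenticate(user: string, pw: string) {\n"
    "  return hash(pw) === lookup(user);\n"
    "}\n",
    encoding="utf-8",
)
(repo / "src" / "chart.ts").write_text("export function draw() {}\n", encoding="utf-8")

candidates = glob_files(repo, "**/auth*.ts")
hits = grep(candidates, r"function\s+authenticate")
for path, lineno, line in hits:
    print(f"{path.name}:{lineno}: {line}")
full_text = read_file(hits[0][0])
```

Nothing here is indexed ahead of time: the glob narrows the search space, the grep pinpoints the line, and the read returns the full file, imports included.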

At query time, the agent decides:

What to search for
Which files to read
How to combine information

Why agentic RAG works for code

Code is structured, not unstructured text. Traditional RAG treats code like documents, but code has:

Syntax trees - Functions, classes, variables, imports
Symbols - Definitions, references, call graphs
File systems - Organized hierarchies
Build systems - Dependencies, modules

Agentic RAG exploits this structure. Instead of embedding code chunks and hoping similarity search finds the right function, the agent can:

Glob for files matching **/auth*.ts
Grep for function authenticate
Read the exact file
LSP query for all references to authenticate
Follow imports to understand dependencies

This is deterministic and context-preserving. No chunking artifacts, no missed symbols, no lossy embeddings.
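Symbol-level lookup is what separates this from plain text search. A language server answers it through requests such as textDocument/definition and textDocument/references; as a rough stand-in, the sketch below uses Python's built-in `ast` module on Python source instead of a real LSP server on TypeScript. The helper `find_symbol` and the sample source are hypothetical.

```python
import ast


def find_symbol(source: str, name: str) -> dict[str, list[int]]:
    """Return line numbers where `name` is defined and where it is called."""
    found = {"definitions": [], "calls": []}
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            found["definitions"].append(node.lineno)
        elif isinstance(node, ast.Call):
            func = node.func
            # Covers plain calls like f() and attribute calls like obj.f().
            called = func.id if isinstance(func, ast.Name) else getattr(func, "attr", None)
            if called == name:
                found["calls"].append(node.lineno)
    found["definitions"].sort()
    found["calls"].sort()
    return found


source = '''\
def authenticate(user, password):
    return check(user, password)

def login(request):
    if authenticate(request.user, request.password):
        return "ok"
    # authenticate mentioned in a comment is not a call
    return "denied"

def api_login(request):
    return authenticate(request.user, request.token)
'''

symbols = find_symbol(source, "authenticate")
print(symbols)
```

A plain grep for authenticate would also match the comment line. Working from the syntax tree returns only the definition and the real call sites, which is the kind of precision the LSP step adds.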

Claude Code's agentic RAG approach

Claude Code has been using agentic RAG for over a year. The team has repeatedly stated:

"The best way to do RAG is agentic RAG. No indexing. No database. No nothing. Just let the agent search with primitive tools or structured symbol traversal (ex: LSP servers for code)."

How Claude Code does it

Claude Code gives the agent tools:

Glob: Find files by pattern (e.g., **/*.tsx)
Grep: Search file contents (e.g., pattern: "export function")
Read: Read specific files
Task (Explore agent): Multi-step codebase exploration
LSP integration (via MCP servers): Symbol-level code navigation

When you ask Claude Code a question like:

"Where is the authentication logic?"

The agent doesn't query a vector database. It:

Globs for **/auth*.ts, **/login*.ts
Greps for authenticate, login, session
Reads promising files
Follows references via LSP
Synthesizes an answer from exact matches

For code, this is often faster, cheaper in total, and more accurate than traditional RAG, although each query uses more tokens.

Why it beats traditional RAG

Traditional RAG	Agentic RAG (Claude Code)
Pre-chunks code, loses structure	Preserves full file context
Embeds chunks, lossy representation	Exact text search (grep, glob)
Similarity search, probabilistic	Deterministic pattern matching
Retrieves partial chunks	Reads entire functions/classes
Misses cross-file references	Follows imports via LSP
Requires vector DB infra	Uses filesystem + grep
Expensive to maintain	No index to maintain

The "RAG industry is about to get cooked" claim

A recent tweet sparked debate:

"The entire RAG industry is about to get cooked. Researchers have built a new RAG approach that:

does not need a vector DB

does not embed data

involves no chunking

performs no similarity search"

The approach in question: PageIndex.

What is PageIndex?

PageIndex is a graph-based retrieval system that reimagines RAG without vector embeddings.

GitHub: VectifyAI/PageIndex

Website: pageindex.ai

How PageIndex works

PageIndex organizes content hierarchically:

Project (root)
├── File (e.g., README.md)
│   ├── Section (## Heading)
│   │   └── Page (paragraph or code block)
│   └── Section
└── File

Key insights:

Hierarchical structure - Content naturally organizes into projects → files → sections → pages
Graph relationships - Nodes connect via parent-child, sibling, and cross-reference edges
No embeddings - Retrieval uses graph traversal, not similarity search
No chunking - Sections and pages preserve natural boundaries
Deterministic retrieval - Follow paths through the graph

At query time:

Parse the query to identify intent (e.g., "authentication logic")
Traverse the graph to find relevant nodes (files, sections)
Retrieve full context from matching nodes
Feed context to LLM
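The hierarchy and traversal described above can be illustrated with a toy tree. This is not PageIndex's actual API or algorithm; it is a minimal sketch of the file → section → page structure, where the `Node` class, the heading parser, and the keyword match (a stand-in for query parsing) are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One node in the hierarchy: a file, a section, or a page."""
    kind: str
    title: str
    text: str = ""
    children: list["Node"] = field(default_factory=list)


def build_file_tree(name: str, markdown: str) -> Node:
    """Turn '## ' headings into sections and blank-line-separated blocks into pages."""
    root = Node("file", name)
    current = root
    for block in markdown.strip().split("\n\n"):
        block = block.strip()
        if block.startswith("## "):
            current = Node("section", block[3:].strip())
            root.children.append(current)
        elif block:
            current.children.append(Node("page", block.splitlines()[0], block))
    return root


def search(node: Node, keyword: str, path: tuple[str, ...] = ()) -> list[tuple[tuple[str, ...], Node]]:
    """Depth-first traversal returning sections whose title mentions the keyword."""
    here = path + (node.title,)
    matches = []
    if node.kind == "section" and keyword.lower() in node.title.lower():
        matches.append((here, node))
    for child in node.children:
        matches.extend(search(child, keyword, here))
    return matches


readme = """
## Installation

Run the installer.

## Authentication

Call login() with a token.

Tokens expire after one hour.
"""

tree = build_file_tree("README.md", readme)
for path, section in search(tree, "authentication"):
    context = "\n\n".join(page.text for page in section.children)
    print(" > ".join(path))
    print(context)
```

The retrieved context is a whole section with all of its pages, bounded by the document's own headings instead of a token count.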

PageIndex vs vector RAG

Vector RAG	PageIndex
Embeddings required	No embeddings
Similarity search	Graph traversal
Arbitrary chunking	Natural boundaries (sections, pages)
Probabilistic	Deterministic
No structure	Hierarchical graph

When PageIndex excels

PageIndex works best for:

Well-structured documentation - Markdown with clear headings
Code repositories - Files, modules, functions
Technical wikis - Hierarchical pages
API references - Organized by endpoints, methods

PageIndex struggles with:

Unstructured text - Blog posts, articles without headings
Semantic similarity queries - "Find documents similar to X"
Large media - Images, videos (no text to graph)

Traditional RAG vs agentic RAG vs PageIndex

Let's compare the three approaches on a concrete example.

Example: "Find the login function in this codebase"

Traditional RAG:

Chunk all files (500 token chunks, 50 token overlap)
Embed each chunk (1536-dim vectors)
Store in vector DB (e.g., Pinecone)
At query time: embed "login function"
Retrieve top-10 similar chunks
Feed chunks to LLM

Problems:

Login function might span multiple chunks
Chunks might include irrelevant code
Similarity search might return "logout" or "session" code
No guarantee of finding the exact function

Agentic RAG (Claude Code):

Grep for function login or class.*Login
Read matching files
LSP query for login symbol
Read function definition + references
Synthesize answer

Benefits:

Exact match, no false positives
Full function context
Follow references to callers
No infrastructure overhead

PageIndex:

Build graph: project → files → functions
Query: traverse graph for nodes matching "login"
Retrieve the full function node + its parent file for context
Feed to LLM

Benefits:

Natural boundaries (function = page)
Graph preserves relationships
No chunking artifacts
No embeddings needed

The token cost argument

Agentic RAG has a downside: token usage.

Critics argue:

"The people behind Claude Code explain how useful it is to give agents free reign on reading files. Of course — they bill by tokens!"

Fair point. Agentic RAG is more expensive per query because:

Agents read full files, not pre-selected chunks
Multiple tool calls (grep, read, LSP) consume tokens
Exploration is iterative (agent might search multiple times)

Counter-argument:

Traditional RAG has costs too:

Embedding costs - Generating embeddings for large codebases
Vector DB costs - Storage, indexing, syncing
Maintenance costs - Reindexing when content changes
False retrievals - Irrelevant chunks waste LLM tokens anyway

For code-heavy or documentation-heavy use cases, agentic RAG is often cheaper in total because:

No embedding generation
No vector database
Higher accuracy = fewer retries
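Which side of this argument wins depends on query volume, and the trade-off is simple arithmetic. The sketch below models each approach as a fixed cost plus a per-query cost; the function names are hypothetical and the numbers are placeholder cost units chosen for illustration, not measurements of any real system.

```python
import math


def total_cost(fixed: float, per_query: float, queries: int) -> float:
    """Fixed costs (embedding, vector DB, reindexing) plus per-query token cost."""
    return fixed + per_query * queries


def break_even_queries(fixed_a: float, per_query_a: float,
                       fixed_b: float, per_query_b: float):
    """Smallest query count at which option A is strictly cheaper than option B.

    Returns None when A never becomes cheaper.
    """
    if per_query_a >= per_query_b:
        return 0 if fixed_a < fixed_b else None
    threshold = (fixed_a - fixed_b) / (per_query_b - per_query_a)
    return max(0, math.floor(threshold) + 1)


# Placeholder cost units, not real prices:
# option A = traditional RAG (high fixed cost, low per-query cost)
# option B = agentic RAG (no fixed cost, higher per-query cost)
n = break_even_queries(fixed_a=1000, per_query_a=1, fixed_b=0, per_query_b=5)
print(n)
print(total_cost(1000, 1, n), total_cost(0, 5, n))
```

Below the break-even point the approach with no index is cheaper in total; above it, the fixed costs of indexing are amortized. Plug in your own embedding, hosting, and token prices to see where your workload falls.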

When traditional RAG still wins

Agentic RAG is not a silver bullet. Traditional RAG is better for:

1. Semantic similarity queries

If you need to find documents semantically similar to a query, embeddings excel:

"Find articles about climate change policy"

Agentic RAG can't grep for "climate change policy" if those exact words don't appear. Embeddings capture semantic meaning.

2. Large unstructured corpora

If you have millions of documents with no clear structure, vector search is more efficient than letting an agent explore files one by one.

3. Multi-modal retrieval

If you need to search images, audio, or video, embeddings are the only option. Agents can't grep pixels.

4. Pre-filtered contexts

If you want to narrow context before the agent starts working, RAG can surface top candidates. The agent then refines.

Hybrid approaches: the pragmatic middle ground

The best systems often combine both:

RAG + agentic refinement

Use vector search to retrieve top-20 candidate chunks
Give agent tools to read full files, follow references
Agent refines retrieval with grep, LSP, structured traversal

Example:

RAG surfaces "auth.ts is relevant"
Agent reads full file, greps for authenticate, follows imports
Agent combines RAG candidates + exploration results
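The refinement step in that example can be sketched as follows. The vector search itself is faked with a hard-coded candidate list, and the repository is an in-memory dictionary; `refine`, the file names, and their contents are assumptions made for illustration.

```python
import re


def refine(candidates: list[str], files: dict[str, str], pattern: str) -> dict[str, list[int]]:
    """Agentic step: open each candidate file in full and grep it for a pattern."""
    compiled = re.compile(pattern)
    refined = {}
    for name in candidates:
        text = files.get(name)
        if text is None:
            continue  # stale index entry: the file no longer exists
        lines = [
            lineno
            for lineno, line in enumerate(text.splitlines(), start=1)
            if compiled.search(line)
        ]
        if lines:
            refined[name] = lines
    return refined


# In-memory stand-in for a repository (file name -> contents).
files = {
    "auth.ts": "import { db } from './db';\nexport function authenticate() {}\n",
    "session.ts": "export function refresh() {}\n",
}
# Pretend these came back from a vector search, best match first.
vector_candidates = ["auth.ts", "session.ts", "deleted.ts"]

result = refine(vector_candidates, files, r"\bauthenticate\b")
print(result)
```

The vector stage only has to be roughly right. The exact-match pass drops the merely similar session.ts and the stale deleted.ts entry, leaving the file and line the agent should read next.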

PageIndex + agentic traversal

Use PageIndex graph to find relevant sections
Agent traverses graph with tool calls
Agent reads pages, follows cross-references

Real-world examples

Claude Code (agentic RAG in production)

Claude Code's approach has been battle-tested on large codebases:

Grep + Glob for initial discovery
Read for full context
Explore agent for multi-step codebase navigation
LSP servers (via MCP) for symbol-level traversal

Result: fast, accurate, no vector DB overhead.

Anthropic docs (traditional RAG)

Anthropic's documentation uses traditional RAG:

Embed all docs pages
Store in vector DB
Similarity search at query time

Why? Documentation is less structured than code. Semantic similarity matters more.

PageIndex (graph-based alternative)

PageIndex is experimental but shows promise for:

Well-organized documentation sites
Code repositories with clear module structure
Technical wikis

Early benchmarks show PageIndex outperforms vector RAG on structured datasets but underperforms on unstructured text.

Practical recommendations

Choose traditional RAG if:

Your data is unstructured (blog posts, articles, books)
You need semantic similarity ("find documents like X")
You have millions of documents (pre-filtering saves time)
You work with multi-modal data (images, audio, video)

Choose agentic RAG if:

Your data is structured (code, organized docs, APIs)
You need exact matches (functions, classes, symbols)
You want full context (no chunking artifacts)
You can afford higher token costs per query
You want zero infrastructure overhead (no vector DB)

Choose PageIndex if:

Your data has clear hierarchies (files → sections → pages)
You want deterministic retrieval (graph traversal)
You want to avoid chunking and embedding overhead
Your content is well-structured (markdown, code, wikis)

Choose hybrid if:

You want speed + accuracy (RAG for candidates, agent for refinement)
Your data is mixed (structured + unstructured)
You want to balance cost (RAG is cheaper upfront) and accuracy (agents refine)

The future of RAG

The RAG landscape is evolving:

Agentic RAG is gaining traction for code and structured data
Graph-based approaches like PageIndex challenge vector dominance
Hybrid systems combine the best of both worlds
Long-context LLMs (200K+ tokens) reduce the need for retrieval in some cases

Key insight: RAG architecture should match data structure.

Code → agentic RAG + LSP
Unstructured text → vector RAG
Hierarchical docs → PageIndex or hybrid
Mixed data → hybrid RAG + agentic refinement

Bottom line

The "RAG industry is getting cooked" claim is partly true:

For code, agentic RAG is superior (Claude Code proves it)
For structured docs, PageIndex offers a simpler alternative
For unstructured text, vector RAG remains the best option

Agentic RAG is not a replacement for traditional RAG. It is a specialized tool for domains where structure matters more than semantics.

PageIndex is a promising middle ground: no embeddings, no chunking, but still deterministic retrieval via graphs.

The real lesson: stop treating all data the same. Code is not text. Wikis are not articles. Match your retrieval strategy to your data structure, and you'll get better results at lower cost.

For code and structured docs, the future is agentic. For everything else, embeddings still have their place.

Related resources:

Claude Code approach to codebase understanding - How Claude Code uses agentic search
What are agent skills? Complete guide - Understanding agent tool use
What is MCP? Model Context Protocol explained - How agents connect to tools like LSP servers

External links:

PageIndex GitHub repository
PageIndex official site
llms.txt specification - Related approach to structured documentation
