A growing debate in the AI retrieval space: do you need vector databases at all?
Traditional Retrieval-Augmented Generation (RAG) has become the standard for giving large language models access to external knowledge. The pattern is familiar:
- Chunk your documents into pieces
- Generate embeddings for each chunk
- Store embeddings in a vector database
- At query time, embed the query and search for similar chunks
- Feed retrieved chunks to the LLM
But there is another approach gaining traction: agentic RAG. Instead of pre-indexing everything, you give the agent primitive search tools and let it find what it needs on demand.
The traditional RAG pipeline
RAG has been the dominant pattern for grounding LLMs in external knowledge since 2020. Here is how it works:
1. Chunking
Break documents into smaller pieces (typically 256-1024 tokens) because:
- Embedding models have input size limits
- Smaller chunks improve retrieval precision
- LLMs have context window constraints
Problem: Chunking destroys context. A function split across two chunks loses coherence. Overlapping windows help but add redundancy.
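To make the problem concrete, here is a minimal sketch of fixed-size chunking with overlap. It assumes whitespace-split words stand in for real tokens; the name `chunk_tokens` and the window sizes are illustrative and not taken from any particular library.

```python
def chunk_tokens(tokens, size=500, overlap=50):
    """Split a token list into fixed-size windows that overlap by `overlap` tokens."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + size])
        if start + size >= len(tokens):
            break  # the last window already reaches the end
    return chunks


# Whitespace "tokens" stand in for a real tokenizer here (an assumption).
source = "def authenticate ( user , password ) : token = sign ( user ) return token"
tokens = source.split()
chunks = chunk_tokens(tokens, size=8, overlap=2)
for i, chunk in enumerate(chunks):
    print(i, " ".join(chunk))
```

With these toy settings the `def` line and the `return` statement end up in different chunks, which is the loss of coherence described above. The overlap repeats two tokens at each boundary but does not bring the function back together.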
2. Embedding
Convert each chunk into a dense vector (e.g., 768 or 1536 dimensions) using models like:
- OpenAI text-embedding-3 (small and large)
- Cohere embed-v3
- Sentence-BERT variants
Problem: Embeddings are lossy. Semantic similarity doesn't always match intent. Code structure matters more than surface-level similarity.
3. Vector storage
Store embeddings in specialized databases:
- Pinecone
- Weaviate
- Chroma
- Qdrant
- pgvector
Problem: Infrastructure overhead. You now manage an additional database, syncing, versioning, and reindexing when content changes.
4. Similarity search
At query time:
- Embed the user query
- Find top-k nearest neighbors (cosine similarity, dot product)
- Return associated chunks
Problem: Nearest-neighbor search ranks by vector similarity, and most vector databases use approximate indexes to do it. You might miss exact matches or retrieve irrelevant "similar" content.
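The search step can be sketched as a brute-force cosine top-k in plain Python. The three-dimensional vectors and chunk names below are made up for illustration. Real embeddings have hundreds of dimensions, and production vector databases typically replace this exhaustive scan with an approximate index.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, index, k=2):
    """Return the k (chunk_id, score) pairs with the highest cosine similarity."""
    scored = [(chunk_id, cosine(query_vec, vec)) for chunk_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical toy embeddings, chosen by hand for this example.
index = {
    "login": [0.9, 0.1, 0.0],
    "logout": [0.8, 0.2, 0.1],
    "parse_csv": [0.0, 0.1, 0.9],
}
results = top_k([1.0, 0.0, 0.0], index, k=2)
print(results)
```

In this toy index the `logout` chunk ranks just behind `login`, because the two vectors point in nearly the same direction. Nothing in the ranking separates "similar" from "the one you asked for".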
5. LLM generation
Feed retrieved chunks as context to the LLM and generate a response.
Problem: Chunks might not contain enough context. The LLM doesn't know what was excluded.
The case for agentic RAG
Agentic RAG flips the script: instead of pre-indexing, give the agent search tools and let it decide what to retrieve.
What is agentic RAG?
Agentic RAG means:
- No pre-chunking - Content stays intact
- No embeddings - No dense vectors
- No vector database - No similarity search
- Tool-based search - Agents use grep, glob, file reads, LSP servers, symbol search
The agent gets tools like:
- Grep: Search file contents by regex
- Glob: Find files matching patterns
- Read: Read specific files
- LSP servers: Navigate code symbols (functions, classes, imports)
- Structured traversal: Follow links, references, imports
At query time, the agent decides:
- What to search for
- Which files to read
- How to combine information
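A rough sketch of what such primitive tools could look like, using only the Python standard library. The function names `glob_files`, `grep`, and `read_file` are assumptions made for this example and are not the actual tool implementations of any agent product.

```python
import re
import tempfile
from pathlib import Path


def glob_files(root, pattern):
    """Find files under root matching a glob pattern such as '**/auth*.ts'."""
    return sorted(p for p in Path(root).glob(pattern) if p.is_file())


def grep(paths, regex):
    """Return (path, line_number, line) for every line matching the regex."""
    compiled = re.compile(regex)
    hits = []
    for path in paths:
        text = path.read_text(encoding="utf-8")
        for number, line in enumerate(text.splitlines(), start=1):
            if compiled.search(line):
                hits.append((path, number, line.strip()))
    return hits


def read_file(path):
    """Return the full, unchunked contents of a file."""
    return Path(path).read_text(encoding="utf-8")


# Tiny throwaway repo so the example runs anywhere.
with tempfile.TemporaryDirectory() as repo:
    src = Path(repo, "src")
    src.mkdir()
    (src / "auth.ts").write_text(
        "export function authenticate(user) {\n  return true;\n}\n", encoding="utf-8"
    )
    (src / "utils.ts").write_text("export const noop = () => {};\n", encoding="utf-8")

    candidates = glob_files(repo, "**/auth*.ts")           # what to search
    hits = grep(candidates, r"function\s+authenticate")    # where it matches
    contents = read_file(hits[0][0])                       # full context
```

In an agent loop the model would choose the pattern and regex itself and decide whether to search again. Here those choices are hard-coded to keep the sketch short.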
Why agentic RAG works for code
Code is structured, not unstructured text. Traditional RAG treats code like documents, but code has:
- Syntax trees - Functions, classes, variables, imports
- Symbols - Definitions, references, call graphs
- File systems - Organized hierarchies
- Build systems - Dependencies, modules
Agentic RAG exploits this structure. Instead of embedding code chunks and hoping similarity search finds the right function, the agent can:
- Glob for files matching `**/auth*.ts`
- Grep for `function authenticate`
- Read the exact file
- LSP query for all references to `authenticate`
- Follow imports to understand dependencies
This is deterministic and context-preserving. No chunking artifacts, no missed symbols, no lossy embeddings.
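As an illustration of the "follow imports" step, the sketch below walks relative imports between TypeScript files with a regular expression. This is a deliberately crude stand-in for what an LSP server does with a real parser. The helper `follow_imports` and the rule that every import resolves to a sibling `.ts` file are assumptions made for this example.

```python
import re
import tempfile
from pathlib import Path

# Matches relative specifiers such as: from './session' or from "../crypto"
IMPORT_RE = re.compile(r"""from\s+['"](\.{1,2}/[^'"]+)['"]""")


def follow_imports(entry, seen=None):
    """Depth-first walk over relative imports, returning visited files in order."""
    seen = [] if seen is None else seen
    entry = Path(entry).resolve()
    if entry in seen or not entry.is_file():
        return seen
    seen.append(entry)
    for spec in IMPORT_RE.findall(entry.read_text(encoding="utf-8")):
        # Assumption: imports omit the extension and point at a .ts file.
        follow_imports(Path(str(entry.parent / spec) + ".ts"), seen)
    return seen


with tempfile.TemporaryDirectory() as repo:
    root = Path(repo)
    (root / "auth.ts").write_text(
        "import { load } from './session';\nimport { sign } from './crypto';\n",
        encoding="utf-8",
    )
    (root / "session.ts").write_text("import { sign } from './crypto';\n", encoding="utf-8")
    (root / "crypto.ts").write_text("export const sign = (x) => x;\n", encoding="utf-8")
    visited = [p.name for p in follow_imports(root / "auth.ts")]

print(visited)
```

Each file is visited once, even when several modules import it, so the walk terminates on circular imports as well.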
Claude Code's agentic RAG approach
Claude Code has been using agentic RAG for over a year. The team has repeatedly stated:
"The best way to do RAG is agentic RAG. No indexing. No database. No nothing. Just let the agent search with primitive tools or structured symbol traversal (ex: LSP servers for code)."
How Claude Code does it
Claude Code gives the agent tools:
- Glob: Find files by pattern (e.g., `**/*.tsx`)
- Grep: Search file contents (e.g., `pattern: "export function"`)
- Read: Read specific files
- Task (Explore agent): Multi-step codebase exploration
- LSP integration (via MCP servers): Symbol-level code navigation
When you ask Claude Code a question like:
"Where is the authentication logic?"
The agent doesn't query a vector database. It:
- Globs for `**/auth*.ts`, `**/login*.ts`
- Greps for `authenticate`, `login`, `session`
- Reads promising files
- Follows references via LSP
- Synthesizes an answer from exact matches
For code, this tends to be more accurate than traditional RAG and needs no index, though it can cost more tokens per query.
Why it beats traditional RAG
| Traditional RAG | Agentic RAG (Claude Code) |
|---|---|
| Pre-chunks code, loses structure | Preserves full file context |
| Embeds chunks, lossy representation | Exact text search (grep, glob) |
| Similarity search, probabilistic | Deterministic pattern matching |
| Retrieves partial chunks | Reads entire functions/classes |
| Misses cross-file references | Follows imports via LSP |
| Requires vector DB infra | Uses filesystem + grep |
| Expensive to maintain | No index to maintain |
The "RAG industry is about to get cooked" claim
A recent tweet sparked debate:
"The entire RAG industry is about to get cooked. Researchers have built a new RAG approach that:
- does not need a vector DB
- does not embed data
- involves no chunking
- performs no similarity search"
The approach in question: PageIndex.
What is PageIndex?
PageIndex is a graph-based retrieval system that reimagines RAG without vector embeddings.
GitHub: VectifyAI/PageIndex
Website: pageindex.ai
How PageIndex works
PageIndex organizes content hierarchically:
Project (root)
├── File (e.g., README.md)
│ ├── Section (## Heading)
│ │ └── Page (paragraph or code block)
│ └── Section
└── File
Key insights:
- Hierarchical structure - Content naturally organizes into projects → files → sections → pages
- Graph relationships - Nodes connect via parent-child, sibling, and cross-reference edges
- No embeddings - Retrieval uses graph traversal, not similarity search
- No chunking - Sections and pages preserve natural boundaries
- Deterministic retrieval - Follow paths through the graph
At query time:
- Parse the query to identify intent (e.g., "authentication logic")
- Traverse the graph to find relevant nodes (files, sections)
- Retrieve full context from matching nodes
- Feed context to LLM
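The query steps above can be approximated with a small heading tree built from markdown. This is only a sketch of the general idea. It does not use PageIndex's actual API or data model, and the `Node`, `build_tree`, and `find_sections` names, along with the simple keyword match on titles, are assumptions made for this example.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    title: str
    level: int
    body: list = field(default_factory=list)
    children: list = field(default_factory=list)


def build_tree(markdown, root_title="document"):
    """Nest sections by heading depth; text stays attached to its own section."""
    root = Node(root_title, 0)
    stack = [root]
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            node = Node(match.group(2).strip(), len(match.group(1)))
            while stack[-1].level >= node.level:
                stack.pop()  # climb back up to this heading's parent
            stack[-1].children.append(node)
            stack.append(node)
        elif line.strip():
            stack[-1].body.append(line)
    return root


def find_sections(node, keyword, path=()):
    """Yield (path, node) for every section whose title contains the keyword."""
    here = path + (node.title,)
    if node.level > 0 and keyword.lower() in node.title.lower():
        yield here, node
    for child in node.children:
        yield from find_sections(child, keyword, here)


doc = """# Service
Intro text.
## Authentication
Tokens are signed with HMAC.
### Login flow
POST /login returns a session token.
## Billing
Invoices are generated monthly.
"""
tree = build_tree(doc)
matches = list(find_sections(tree, "login"))
for path, node in matches:
    print(" > ".join(path), "|", " ".join(node.body))
```

Each match comes back with its full path from the root, so the parent sections are available as context without any chunk boundaries being drawn through the text.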
PageIndex vs vector RAG
| Vector RAG | PageIndex |
|---|---|
| Embeddings required | No embeddings |
| Similarity search | Graph traversal |
| Arbitrary chunking | Natural boundaries (sections, pages) |
| Probabilistic | Deterministic |
| No structure | Hierarchical graph |
When PageIndex excels
PageIndex works best for:
- Well-structured documentation - Markdown with clear headings
- Code repositories - Files, modules, functions
- Technical wikis - Hierarchical pages
- API references - Organized by endpoints, methods
PageIndex struggles with:
- Unstructured text - Blog posts, articles without headings
- Semantic similarity queries - "Find documents similar to X"
- Large media - Images, videos (no text to graph)
Traditional RAG vs agentic RAG vs PageIndex
Let's compare three approaches:
Example: "Find the login function in this codebase"
Traditional RAG:
- Chunk all files (500 token chunks, 50 token overlap)
- Embed each chunk (1536-dim vectors)
- Store in vector DB (e.g., Pinecone)
- At query time: embed "login function"
- Retrieve top-10 similar chunks
- Feed chunks to LLM
Problems:
- Login function might span multiple chunks
- Chunks might include irrelevant code
- Similarity search might return "logout" or "session" code
- No guarantee of finding the exact function
Agentic RAG (Claude Code):
- Grep for `function login` or `class.*Login`
- Read matching files
- LSP query for `login` symbol
- Read function definition + references
- Synthesize answer
Benefits:
- Exact pattern match instead of fuzzy similarity
- Full function context
- Follow references to callers
- No infrastructure overhead
PageIndex:
- Build graph: project → files → functions (sections)
- Query: traverse graph for nodes matching "login"
- Retrieve full function (page) + parent file (section)
- Feed to LLM
Benefits:
- Natural boundaries (function = page)
- Graph preserves relationships
- No chunking artifacts
- No embeddings needed
The token cost argument
Agentic RAG has a downside: token usage.
Critics argue:
"The people behind Claude Code explain how useful it is to give agents free reign on reading files. Of course — they bill by tokens!"
Fair point. Agentic RAG is more expensive per query because:
- Agents read full files, not pre-selected chunks
- Multiple tool calls (grep, read, LSP) consume tokens
- Exploration is iterative (agent might search multiple times)
Counter-argument:
Traditional RAG has costs too:
- Embedding costs - Generating embeddings for large codebases
- Vector DB costs - Storage, indexing, syncing
- Maintenance costs - Reindexing when content changes
- False retrievals - Irrelevant chunks waste LLM tokens anyway
For code-heavy or documentation-heavy use cases, agentic RAG is often cheaper in total because:
- No embedding generation
- No vector database
- Higher accuracy = fewer retries
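Whether this holds for a given workload comes down to simple arithmetic: a fixed setup and infrastructure cost on one side, a higher per-query cost on the other. The sketch below writes that comparison out. Every number in it is a placeholder in arbitrary units, not a measured price, and the function names are invented for this example.

```python
def total_cost(setup, monthly_infra, per_query, queries_per_month, months):
    """Total spend = one-off setup + recurring infrastructure + per-query usage."""
    return setup + months * (monthly_infra + per_query * queries_per_month)


def break_even_queries(setup, monthly_infra, per_query_rag, per_query_agent, months):
    """Monthly query volume above which the indexed pipeline becomes cheaper.

    Returns None when the agent is not more expensive per query.
    """
    extra_per_query = per_query_agent - per_query_rag
    if extra_per_query <= 0:
        return None
    return (setup + months * monthly_infra) / (months * extra_per_query)


# Placeholder inputs in arbitrary cost units; substitute your own measurements.
SETUP, INFRA, RAG_QUERY, AGENT_QUERY, MONTHS = 100, 50, 0.01, 0.05, 12

threshold = break_even_queries(SETUP, INFRA, RAG_QUERY, AGENT_QUERY, MONTHS)
rag_total = total_cost(SETUP, INFRA, RAG_QUERY, threshold, MONTHS)
agent_total = total_cost(0, 0, AGENT_QUERY, threshold, MONTHS)
print(round(threshold), round(rag_total, 2), round(agent_total, 2))
```

Below the threshold the fixed costs of the vector pipeline dominate and the agentic approach is cheaper in total. Above it, the per-query premium adds up. The model ignores retries and engineering time, both of which the list above suggests can shift the balance.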
When traditional RAG still wins
Agentic RAG is not a silver bullet. Traditional RAG is better for:
1. Semantic similarity queries
If you need to find documents semantically similar to a query, embeddings excel:
"Find articles about climate change policy"
Agentic RAG can't grep for "climate change policy" if those exact words don't appear. Embeddings capture semantic meaning.
2. Large unstructured corpora
If you have millions of documents with no clear structure, vector search is more efficient than letting an agent explore files one by one.
3. Multi-modal retrieval
If you need to search images, audio, or video, embeddings are the only option. Agents can't grep pixels.
4. Pre-filtered contexts
If you want to narrow context before the agent starts working, RAG can surface top candidates. The agent then refines.
Hybrid approaches: the pragmatic middle ground
The best systems often combine both:
RAG + agentic refinement
- Use vector search to retrieve top-20 candidate chunks
- Give agent tools to read full files, follow references
- Agent refines retrieval with grep, LSP, structured traversal
Example:
- RAG surfaces "auth.ts is relevant"
- Agent reads full file, greps for `authenticate`, follows imports
- Agent combines RAG candidates + exploration results
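A minimal sketch of that refinement step, assuming the vector search has already returned candidate file paths with similarity scores. The `refine_candidates` helper, the in-memory `files` mapping, and the scores are all hypothetical and stand in for a real vector store and filesystem.

```python
import re


def refine_candidates(candidates, files, pattern):
    """Keep vector-search candidates whose full file contains an exact regex match.

    candidates: list of (path, similarity_score) from a hypothetical vector search.
    files: mapping of path -> full file text (stands in for reading from disk).
    """
    compiled = re.compile(pattern)
    refined = []
    for path, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        lines = files.get(path, "").splitlines()
        hits = [n for n, line in enumerate(lines, start=1) if compiled.search(line)]
        if hits:
            refined.append({"path": path, "score": score, "lines": hits})
    return refined


files = {
    "src/auth.ts": (
        "import { sign } from './crypto';\n"
        "export function authenticate(user) {\n"
        "  return sign(user);\n"
        "}\n"
    ),
    "src/logout.ts": "export function logout(session) {\n  session.clear();\n}\n",
}
# Made-up scores: the similar-looking logout file ranks first.
candidates = [("src/logout.ts", 0.81), ("src/auth.ts", 0.79)]
refined = refine_candidates(candidates, files, r"\bauthenticate\b")
print(refined)
```

The vector scores narrow the search space, and the exact match then drops the look-alike candidate and points at the line where the symbol is defined.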
PageIndex + agentic traversal
- Use PageIndex graph to find relevant sections
- Agent traverses graph with tool calls
- Agent reads pages, follows cross-references
Real-world examples
Claude Code (agentic RAG in production)
Claude Code's approach has been battle-tested on large codebases:
- Grep + Glob for initial discovery
- Read for full context
- Explore agent for multi-step codebase navigation
- LSP servers (via MCP) for symbol-level traversal
Result: fast, accurate, no vector DB overhead.
Anthropic docs (traditional RAG)
Anthropic's documentation uses traditional RAG:
- Embed all docs pages
- Store in vector DB
- Similarity search at query time
Why? Documentation is less structured than code. Semantic similarity matters more.
PageIndex (graph-based alternative)
PageIndex is experimental but shows promise for:
- Well-organized documentation sites
- Code repositories with clear module structure
- Technical wikis
Early benchmarks show PageIndex outperforms vector RAG on structured datasets but underperforms on unstructured text.
Practical recommendations
Choose traditional RAG if:
- Your data is unstructured (blog posts, articles, books)
- You need semantic similarity ("find documents like X")
- You have millions of documents (pre-filtering saves time)
- You work with multi-modal data (images, audio, video)
Choose agentic RAG if:
- Your data is structured (code, organized docs, APIs)
- You need exact matches (functions, classes, symbols)
- You want full context (no chunking artifacts)
- You can afford higher token costs per query
- You want zero infrastructure overhead (no vector DB)
Choose PageIndex if:
- Your data has clear hierarchies (files → sections → pages)
- You want deterministic retrieval (graph traversal)
- You want to avoid chunking and embedding overhead
- Your content is well-structured (markdown, code, wikis)
Choose hybrid if:
- You want speed + accuracy (RAG for candidates, agent for refinement)
- Your data is mixed (structured + unstructured)
- You want to balance cost (RAG is cheaper upfront) and accuracy (agents refine)
The future of RAG
The RAG landscape is evolving:
- Agentic RAG is gaining traction for code and structured data
- Graph-based approaches like PageIndex challenge vector dominance
- Hybrid systems combine the best of both worlds
- Long-context LLMs (200K+ tokens) reduce how often retrieval is needed at all
Key insight: RAG architecture should match data structure.
- Code → agentic RAG + LSP
- Unstructured text → vector RAG
- Hierarchical docs → PageIndex or hybrid
- Mixed data → hybrid RAG + agentic refinement
Bottom line
The "RAG industry is getting cooked" claim is partly true:
- For code, agentic RAG is superior (Claude Code proves it)
- For structured docs, PageIndex offers a simpler alternative
- For unstructured text, vector RAG remains the best option
Agentic RAG is not a replacement for traditional RAG. It is a specialized tool for domains where structure matters more than semantics.
PageIndex is a promising middle ground: no embeddings, no chunking, but still deterministic retrieval via graphs.
The real lesson: stop treating all data the same. Code is not text. Wikis are not articles. Match your retrieval strategy to your data structure, and you'll get better results at lower cost.
For code and structured docs, the future is agentic. For everything else, embeddings still have their place.
Related resources:
- Claude Code approach to codebase understanding - How Claude Code uses agentic search
- What are agent skills? Complete guide - Understanding agent tool use
- What is MCP? Model Context Protocol explained - How agents connect to tools like LSP servers
External links:
- PageIndex GitHub repository
- PageIndex official site
- llms.txt specification - Related approach to structured documentation