LlamaIndex - Data Framework for LLM Applications
A data framework for connecting LLMs to your own data.
When to use LlamaIndex
Use LlamaIndex when:
- Building RAG (retrieval-augmented generation) applications
- Answering questions over private documents
- Ingesting data from multiple sources (300+ connectors)
- Creating knowledge bases for LLMs
- Building chatbots over enterprise data
- Extracting structured data from documents
Metrics:
- 45,100+ GitHub stars
- 23,000+ repositories use LlamaIndex
- 300+ data connectors (LlamaHub)
- 1,715+ contributors
- v0.14.7 (stable)
Use an alternative instead:
- LangChain: more general-purpose, better for agents
- Haystack: production search pipelines
- txtai: lightweight semantic search
- Chroma: when you only need vector storage
Quick start
Installation
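# Starter bundle: core plus the default OpenAI integrations
pip install llama-index
# Or install the core and only the integrations you need
pip install llama-index-core
pip install llama-index-llms-openai
pip install llama-index-embeddings-openai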
5-line RAG example
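from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# The defaults use OpenAI for the LLM and embeddings, so OPENAI_API_KEY must be set
documents = SimpleDirectoryReader("data").load_data()  # load every file under ./data
index = VectorStoreIndex.from_documents(documents)     # chunk, embed, and index
query_engine = index.as_query_engine()                 # retrieval plus answer synthesis
response = query_engine.query("What did the author do growing up?")
print(response)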
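To make the example above less opaque, here is a rough, dependency-free sketch of what a vector index and query engine do conceptually: embed each chunk, rank chunks by cosine similarity to the query, and pass the best ones to an LLM as context. The bag-of-words `embed` function and the `build_prompt` helper are stand-ins invented for illustration. LlamaIndex itself uses a real embedding model and its own prompt templates.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy embedding: lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Hypothetical prompt layout: retrieved context first, then the question."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Context:\n{joined}\n\nQuestion: {query}\nAnswer:"


chunks = [
    "The author grew up writing short stories",
    "The author later studied painting in Florence",
    "Lisp is a family of programming languages",
]
query = "What did the author do growing up"
context = top_k(query, chunks, k=1)
prompt = build_prompt(query, context)  # this string is what would be sent to the LLM
print(prompt)
```

The retrieval step is the part that `similarity_top_k` controls in the real library; the generation step is a single LLM call on the assembled prompt.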
Core concepts
1. Data connectors - Load documents
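import os

from llama_index.core import SimpleDirectoryReader, Document
from llama_index.readers.web import SimpleWebPageReader  # pip install llama-index-readers-web
from llama_index.readers.github import GithubClient, GithubRepositoryReader  # pip install llama-index-readers-github

# Local files
documents = SimpleDirectoryReader("./data").load_data()

# Web pages
reader = SimpleWebPageReader()
documents = reader.load_data(["https://example.com"])

# GitHub repository (the reader needs an authenticated client)
github_client = GithubClient(github_token=os.environ["GITHUB_TOKEN"])
reader = GithubRepositoryReader(github_client=github_client, owner="user", repo="repo")
documents = reader.load_data(branch="main")

# Build a document by hand
doc = Document(
    text="This is the document content",
    metadata={"source": "manual", "date": "2025-01-01"},
)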
2. Indices - Structure data
from llama_index.core import VectorStoreIndex, SummaryIndex, TreeIndex
from llama_index.core import load_index_from_storage, StorageContext

vector_index = VectorStoreIndex.from_documents(documents)  # embedding-based similarity search
summary_index = SummaryIndex.from_documents(documents)     # formerly ListIndex
tree_index = TreeIndex.from_documents(documents)           # hierarchical summaries

# Persist an index to disk
vector_index.storage_context.persist(persist_dir="./storage")

# Load it back later
storage_context = StorageContext.from_defaults(persist_dir="./storage")
index = load_index_from_storage(storage_context)
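Persisting matters because building an index re-embeds every document. A common pattern is to load from disk when a persisted copy exists and build only otherwise. The sketch below shows that control flow with generic callables so it runs without LlamaIndex; `load_or_build` and the `toy_*` functions are hypothetical helpers, and the "index" here is just a dict saved as JSON.

```python
import json
import os
import tempfile
from typing import Callable, TypeVar

T = TypeVar("T")


def load_or_build(
    persist_dir: str,
    build: Callable[[], T],
    persist: Callable[[T, str], None],
    load: Callable[[str], T],
) -> T:
    """Reuse a persisted index if one exists; otherwise build and persist it."""
    if os.path.isdir(persist_dir) and os.listdir(persist_dir):
        return load(persist_dir)
    index = build()
    os.makedirs(persist_dir, exist_ok=True)
    persist(index, persist_dir)
    return index


build_calls = []  # records how often the expensive build step runs


def toy_build() -> dict:
    build_calls.append(1)
    return {"chunks": ["alpha", "beta"]}


def toy_persist(index: dict, path: str) -> None:
    with open(os.path.join(path, "index.json"), "w") as f:
        json.dump(index, f)


def toy_load(path: str) -> dict:
    with open(os.path.join(path, "index.json")) as f:
        return json.load(f)


storage = os.path.join(tempfile.mkdtemp(), "storage")
first = load_or_build(storage, toy_build, toy_persist, toy_load)   # builds, then persists
second = load_or_build(storage, toy_build, toy_persist, toy_load)  # loads from disk
print(len(build_calls))
```

With LlamaIndex, the three callables would wrap the calls shown in this section: `VectorStoreIndex.from_documents(documents)` to build, `index.storage_context.persist(persist_dir=...)` to persist, and `load_index_from_storage(StorageContext.from_defaults(persist_dir=...))` to load.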
3. Query engines - Ask questions
# Basic query
query_engine = index.as_query_engine()
response = query_engine.query("What is the main topic?")
print(response)

# Streaming: print tokens as they arrive
query_engine = index.as_query_engine(streaming=True)
response = query_engine.query("Explain quantum computing")
for text in response.response_gen:
    print(text, end="", flush=True)

# Custom configuration
query_engine = index.as_query_engine(
    similarity_top_k=3,       # number of chunks to retrieve
    response_mode="compact",  # pack chunks into as few LLM calls as possible
    verbose=True,
)
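As a rough mental model of `response_mode="compact"`: retrieved chunks are concatenated into as few prompts as will fit a size budget, so fewer LLM calls are needed than with one call per chunk. The sketch below packs chunks greedily under a character budget. The `pack_chunks` helper and the `max_chars` budget are illustrative assumptions; the library works in tokens and with its own prompt templates.

```python
def pack_chunks(chunks: list[str], max_chars: int, sep: str = "\n\n") -> list[str]:
    """Greedily concatenate chunks into as few groups as fit within max_chars."""
    packed: list[str] = []
    current = ""
    for chunk in chunks:
        candidate = chunk if not current else current + sep + chunk
        if len(candidate) <= max_chars or not current:
            # Fits in the current group (an oversized chunk gets a group of its own).
            current = candidate
        else:
            # Close the current group and start a new one with this chunk.
            packed.append(current)
            current = chunk
    if current:
        packed.append(current)
    return packed


chunks = ["a" * 40, "b" * 40, "c" * 40, "d" * 40]
groups = pack_chunks(chunks, max_chars=100)
print(len(chunks), "chunks packed into", len(groups), "prompts")
```

Each packed group would correspond to one LLM call, which is why a larger `similarity_top_k` does not necessarily mean proportionally more calls in this mode.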
4. Retrievers - Find relevant chunks
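from llama_index.core.retrievers import BaseRetriever
from llama_index.core.vector_stores import ExactMatchFilter, MetadataFilters

# Basic retriever
retriever = index.as_retriever(similarity_top_k=5)
nodes = retriever.retrieve("machine learning")

# Retriever with a metadata filter
retriever = index.as_retriever(
    similarity_top_k=3,
    filters=MetadataFilters(filters=[ExactMatchFilter(key="category", value="tutorial")]),
)

# Custom retriever: implement _retrieve and return a list of scored nodes
class CustomRetriever(BaseRetriever):
    def _retrieve(self, query_bundle):
        nodes = []  # replace with your own retrieval logic
        return nodes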
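Conceptually, a filtered retriever narrows the candidates by metadata and then ranks what remains by similarity. The dependency-free sketch below shows one such order of operations. The `Node` dataclass, its precomputed `score` field, and the `retrieve` function are hypothetical stand-ins, not LlamaIndex classes, and a real vector store may apply filters differently.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    text: str
    score: float  # similarity to the query, assumed to be precomputed
    metadata: dict = field(default_factory=dict)


def retrieve(nodes: list[Node], filters: dict, similarity_top_k: int) -> list[Node]:
    """Keep nodes whose metadata matches every filter exactly, then return the best k."""
    matching = [
        n for n in nodes
        if all(n.metadata.get(key) == value for key, value in filters.items())
    ]
    return sorted(matching, key=lambda n: n.score, reverse=True)[:similarity_top_k]


nodes = [
    Node("Intro to ML", 0.91, {"category": "tutorial"}),
    Node("ML release notes", 0.95, {"category": "changelog"}),
    Node("Gradient descent walkthrough", 0.80, {"category": "tutorial"}),
    Node("Installing the CLI", 0.20, {"category": "tutorial"}),
]
results = retrieve(nodes, {"category": "tutorial"}, similarity_top_k=2)
print([n.text for n in results])
```

Note that the highest-scoring node is excluded because it fails the filter: filters restrict the candidate set rather than re-weight scores.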
Agents with tools
Basic agent
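import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI

def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b

def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b

llm = OpenAI(model="gpt-4o")

# Plain functions are wrapped as tools; their docstrings become the tool descriptions
agent = FunctionAgent(tools=[multiply, add], llm=llm)

async def main():
    # Workflow-based agents are asynchronous: await run() instead of calling chat()
    response = await agent.run("What is 25 * 17 + 142?")
    print(response)

asyncio.run(main())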
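To see why the docstrings and type hints on `multiply` and `add` matter, consider the bookkeeping a function-calling agent needs: a name, description, and parameter list for each tool, plus a way to execute whichever call the model requests. The sketch below shows that with the standard library only. `describe_tool` and `dispatch` are illustrative helpers, not LlamaIndex APIs, and the two hard-coded calls stand in for decisions an LLM would make.

```python
import inspect


def multiply(a: int, b: int) -> int:
    """Multiply two numbers."""
    return a * b


def add(a: int, b: int) -> int:
    """Add two numbers."""
    return a + b


def describe_tool(fn) -> dict:
    """Derive a tool description from the function's name, docstring, and type hints."""
    params = {
        name: param.annotation.__name__
        for name, param in inspect.signature(fn).parameters.items()
    }
    return {"name": fn.__name__, "description": inspect.getdoc(fn), "parameters": params}


TOOLS = {fn.__name__: fn for fn in (multiply, add)}


def dispatch(name: str, arguments: dict):
    """Run the tool the model asked for with the arguments it supplied."""
    return TOOLS[name](**arguments)


# Stand-in for the calls a model might emit for "What is 25 * 17 + 142?"
product = dispatch("multiply", {"a": 25, "b": 17})
total = dispatch("add", {"a": product, "b": 142})
print(describe_tool(multiply))
print(total)
```

A vague or missing docstring gives the model less to go on when choosing between tools, so it is worth writing them as carefully as any other prompt text.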
RAG agent (document search + tools)
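import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.tools import QueryEngineTool

# Expose the index as a tool the agent can call
index = VectorStoreIndex.from_documents(documents)
query_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    name="python_docs",
    description="Useful for answering questions about Python programming",
)

# Combine document search with the function tools (reuses llm, multiply, and add from the basic agent)
agent = FunctionAgent(tools=[query_tool, multiply, add], llm=llm)

async def main():
    response = await agent.run("According to the docs, what is Python used for?")
    print(response)

asyncio.run(main())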
Advanced RAG patterns
Chat engine (conversational)
from llama_index.core.chat_engine import CondensePlusContextChatEngine
chat_engine = index.as_chat_engine(
chat_mode="conden