LitGPT - Clean LLM Implementations
Quick start
LitGPT provides 20+ pretrained LLM implementations with clean, readable code and production-ready training workflows.
Installation:
pip install 'litgpt[extra]'
Load and use any model:
from litgpt import LLM
llm = LLM.load("microsoft/phi-2")
result = llm.generate(
    "What is the capital of France?",
    max_new_tokens=50,
    temperature=0.7,
)
print(result)
List available models:
litgpt download list
Common workflows
Workflow 1: Fine-tune on custom dataset
Copy this checklist:
Fine-Tuning Setup:
- [ ] Step 1: Download pretrained model
- [ ] Step 2: Prepare dataset
- [ ] Step 3: Configure training
- [ ] Step 4: Run fine-tuning
Step 1: Download pretrained model
litgpt download meta-llama/Meta-Llama-3-8B
litgpt download microsoft/phi-2
litgpt download google/gemma-2b
Models are saved to the checkpoints/ directory.
Step 2: Prepare dataset
LitGPT supports multiple formats:
Alpaca format (instruction-response):
[
  {
    "instruction": "What is the capital of France?",
    "input": "",
    "output": "The capital of France is Paris."
  },
  {
    "instruction": "Translate to Spanish: Hello, how are you?",
    "input": "",
    "output": "Hola, ¿cómo estás?"
  }
]
Save as data/my_dataset.json.
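A malformed record is easier to catch before training starts than in the middle of a run. The sketch below checks and writes an Alpaca-style file using only the standard library. The helper names (validate_alpaca_records, write_dataset) are hypothetical and not part of LitGPT, and treating "input" as optional is an assumption.

```python
import json
import tempfile
from pathlib import Path

# Assumption: "instruction" and "output" are mandatory, "input" is optional.
REQUIRED_KEYS = ("instruction", "output")


def validate_alpaca_records(records):
    """Return a list of problems; an empty list means the data looks usable."""
    if not isinstance(records, list):
        return ["top-level JSON value must be a list of records"]
    problems = []
    for i, rec in enumerate(records):
        if not isinstance(rec, dict):
            problems.append(f"record {i}: expected an object, got {type(rec).__name__}")
            continue
        for key in REQUIRED_KEYS:
            value = rec.get(key)
            if not isinstance(value, str) or not value.strip():
                problems.append(f"record {i}: missing or empty '{key}'")
        if not isinstance(rec.get("input", ""), str):
            problems.append(f"record {i}: 'input' must be a string")
    return problems


def write_dataset(records, path):
    """Validate, then write the records as UTF-8 JSON with non-ASCII kept readable."""
    problems = validate_alpaca_records(records)
    if problems:
        raise ValueError("; ".join(problems))
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(records, ensure_ascii=False, indent=2), encoding="utf-8")
    return path


records = [
    {"instruction": "What is the capital of France?", "input": "",
     "output": "The capital of France is Paris."},
    {"instruction": "Translate to Spanish: Hello, how are you?", "input": "",
     "output": "Hola, ¿cómo estás?"},
]

# Written to a temporary directory here; use data/my_dataset.json in practice.
out_path = write_dataset(records, Path(tempfile.mkdtemp()) / "my_dataset.json")
reloaded = json.loads(out_path.read_text(encoding="utf-8"))
```

Writing with ensure_ascii=False and an explicit UTF-8 encoding keeps accented characters such as "¿cómo estás?" intact in the file.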
Step 3: Configure training
litgpt finetune \
meta-llama/Meta-Llama-3-8B \
--data JSON \
--data.json_path data/my_dataset.json \
--train.max_steps 1000 \
--train.learning_rate 2e-5 \
--train.micro_batch_size 1 \
--train.global_batch_size 16
litgpt finetune_lora \
microsoft/phi-2 \
--data JSON \
--data.json_path data/my_dataset.json \
--lora_r 16 \
--lora_alpha 32 \
--lora_dropout 0.05 \
--train.max_steps 1000 \
--train.learning_rate 1e-4
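The --train.micro_batch_size and --train.global_batch_size flags together determine how many micro-batches are accumulated before each optimizer step. The sketch below shows that arithmetic under the usual assumption that the global batch is split evenly across devices. The function name is hypothetical, and the exact rounding LitGPT applies internally is not confirmed here.

```python
def gradient_accumulation_steps(global_batch_size, micro_batch_size, devices=1):
    """Micro-batches each device processes per optimizer step.

    Assumes global_batch_size = micro_batch_size * devices * accumulation_steps.
    """
    if global_batch_size % devices != 0:
        raise ValueError("global_batch_size must be divisible by the device count")
    per_device = global_batch_size // devices
    if per_device % micro_batch_size != 0:
        raise ValueError("per-device batch must be divisible by micro_batch_size")
    return per_device // micro_batch_size


# Values from the first command above: micro batch 1, global batch 16, one GPU.
steps_single_gpu = gradient_accumulation_steps(16, 1)

# The same global batch spread over 4 GPUs needs fewer accumulation steps.
steps_four_gpus = gradient_accumulation_steps(16, 1, devices=4)
```

Under these assumptions, a smaller micro batch lowers peak memory, and the added accumulation steps keep the effective batch size unchanged.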
Step 4: Run fine-tuning
Training saves checkpoints to out/finetune/ automatically.
Monitor training:
tail -f out/finetune/logs.txt
tensorboard --logdir out/finetune/lightning_logs
Workflow 2: LoRA fine-tuning on single GPU
Most memory-efficient option.
LoRA Training:
- [ ] Step 1: Choose base model
- [ ] Step 2: Configure LoRA parameters
- [ ] Step 3: Train with LoRA
- [ ] Step 4: Merge LoRA weights (optional)
Step 1: Choose base model
For limited GPU memory (12-16GB):
- Phi-2 (2.7B) - Best quality/size tradeoff
- Llama 3.2 1B - Smallest, fastest
- Gemma 2B - Good reasoning
Step 2: Configure LoRA parameters
litgpt finetune_lora \
microsoft/phi-2 \
--data JSON \
--data.json_path data/my_dataset.json \
--lora_r 16 \
--lora_alpha 32 \
--lora_dropout 0.05 \
--lora_query true \
--lora_key false \
--lora_value true \
--lora_projection true \
--lora_mlp false \
--lora_head false
LoRA rank guide:
r=8: Lightweight, 2-4MB adapters
r=16: Standard, good quality
r=32: High capacity, use for complex tasks
r=64: Maximum quality, 4× larger adapters than r=16
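Adapter size grows linearly with the rank, because LoRA adds two small matrices, A (r x d_in) and B (d_out x r), to each adapted weight. The sketch below makes the scaling concrete for a hypothetical model shape. The dimensions, layer count, and 2 bytes per parameter are illustrative assumptions and not the values of any model listed above, so the absolute sizes will differ for a real checkpoint.

```python
def lora_param_count(r, d_in, d_out, n_adapted_matrices):
    """Trainable parameters added by LoRA: r * (d_in + d_out) per adapted matrix."""
    return n_adapted_matrices * r * (d_in + d_out)


# Hypothetical model shape, chosen only for illustration.
D_MODEL = 1024
N_LAYER = 12
ADAPTED_PER_LAYER = 3  # query, value, and projection, mirroring the flags above


def adapter_size_mb(r, bytes_per_param=2):
    """Approximate adapter size in MB, assuming 16-bit weights."""
    params = lora_param_count(r, D_MODEL, D_MODEL, N_LAYER * ADAPTED_PER_LAYER)
    return params * bytes_per_param / 1e6


for r in (8, 16, 32, 64):
    print(f"r={r:>2}: {lora_param_count(r, D_MODEL, D_MODEL, N_LAYER * ADAPTED_PER_LAYER):,} params, "
          f"{adapter_size_mb(r):.1f} MB")
```

Doubling r doubles the adapter, which is why r=64 is four times the size of r=16 for the same set of adapted layers.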
Step 3: Train with LoRA
litgpt finetune_lora \
microsoft/phi-2 \
--data JSON \
--data.json_path data/my_dataset.json \
--lora_r 16 \
--train.epochs 3 \
--train.learning_rate 1e-4 \
--train.micro_batch_size 4 \
--train.global_batch_size 32 \
--out_dir out/phi2-lora
Step 4: Merge LoRA weights (optional)
Merge the LoRA adapters into the base model for deployment:
litgpt merge_lora \
out/phi2-lora/final \
--out_dir out/phi2-merged
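Merging folds the low-rank update into the base weight, so inference no longer needs a separate adapter path. The pure-Python sketch below illustrates the standard LoRA merge formula, W' = W + (alpha / r) * B A, on a tiny matrix. It demonstrates the math only and is not LitGPT's merge code; the function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def merge_lora(weight, lora_a, lora_b, alpha, r):
    """Return W + (alpha / r) * (B @ A), the merged weight."""
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(weight, delta)]


# Tiny example: d_out=2, d_in=3, rank r=1, alpha=2.
W = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0]]
A = [[1.0, 2.0, 3.0]]   # shape (r, d_in)
B = [[0.5], [-1.0]]     # shape (d_out, r)

merged = merge_lora(W, A, B, alpha=2, r=1)

# The merged weight gives the same output as base path plus scaled adapter path.
x = [[1.0], [1.0], [1.0]]
base_plus_adapter = [[wx[0] + 2.0 * bax[0]]
                     for wx, bax in zip(matmul(W, x), matmul(B, matmul(A, x)))]
merged_output = matmul(merged, x)
```

Because the merged weight has the same shape as the original, the merged checkpoint loads like any other model.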
Now use the merged model:
from litgpt import LLM
llm = LLM.load("out/phi2-merged")
Workflow 3: Pretrain from scratch
Train a new model on your own domain data.
Pretraining:
- [ ] Step 1: Prepare pretraining dataset
- [ ] Step 2: Configure model architecture
- [ ] Step 3: Set up multi-GPU training
- [ ] Step 4: Launch pretraining
Step 1: Prepare pretraining dataset
LitGPT expects tokenized data. Use prepare_dataset.py:
python scripts/prepare_dataset.py \
--source_path data/my_corpus.txt \
--checkpoint_dir checkpoints/tokenizer \
--destination_path data/pretrain \
--split train,val
Step 2: Configure model architecture
Edit a config file or use an existing one:
model_name: pythia-160m
block_size: 2048
vocab_size: 50304
n_layer: 12
n_head: 12
n_embd: 768
rotary_percentage: 0.25
parallel_residual: true
bias: true
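A quick parameter estimate helps confirm that a config matches the intended model size. The sketch below plugs the n_layer, n_embd, and vocab_size values from the config above into a common back-of-the-envelope formula. It assumes a 4x MLP expansion and an untied output head, and it ignores biases and layer norms, so treat the result as approximate.

```python
def estimate_params(n_layer, n_embd, vocab_size, tied_embeddings=False):
    """Rough transformer parameter count.

    Per layer: 4 * n_embd^2 for attention (QKV plus output projection)
    and 8 * n_embd^2 for an MLP with 4x expansion (assumption).
    """
    per_layer = 12 * n_embd * n_embd
    embedding = vocab_size * n_embd
    head = 0 if tied_embeddings else vocab_size * n_embd
    return n_layer * per_layer + embedding + head


# Values taken from the config above.
total = estimate_params(n_layer=12, n_embd=768, vocab_size=50304)
print(f"approx. {total / 1e6:.0f}M parameters")
```

With the values above the estimate comes out near 160M, in line with the model name.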
Step 3: Set up multi-GPU training
litgpt pretrain \
--config config/pythia-160m.yaml \
--data.data_dir data/pretrain \
--train.max_tokens 10_000_000_000
litgpt pretrain \
--config config/pythia-1b.yaml \
--data.data_dir data/pretrain \
--devices 8 \
--train.max_tokens 100_000_000_000
Step 4: Launch pretraining
For large-scale pretraining on a cluster:
sbatch --nodes=8 --gpus-per-node=8 \
pretrain_script.sh
litgpt pretrain \
--config config/pythia-1b.yaml \
--data.data_dir /shared/data/pretrain \
--devices 8 \
--num_nodes 8 \
--train.global_batch_size 512 \
--train.max_tokens 300_000_000_000
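The token budget, global batch size, and sequence length together fix how many optimizer steps a run takes. The sketch below works through that arithmetic for the cluster command above. It assumes every sample is a full 2048-token block (the block_size from the Pythia config shown earlier) and that the global batch is split evenly across all GPUs; the function name is hypothetical.

```python
import math


def pretraining_plan(max_tokens, global_batch_size, block_size, devices, num_nodes):
    """Derive tokens per step, total optimizer steps, and per-GPU batch size."""
    world_size = devices * num_nodes
    if global_batch_size % world_size != 0:
        raise ValueError("global_batch_size must be divisible by the total GPU count")
    tokens_per_step = global_batch_size * block_size
    return {
        "world_size": world_size,
        "tokens_per_step": tokens_per_step,
        "optimizer_steps": math.ceil(max_tokens / tokens_per_step),
        "samples_per_gpu_per_step": global_batch_size // world_size,
    }


# Values from the multi-node command above; block_size is an assumption.
plan = pretraining_plan(
    max_tokens=300_000_000_000,
    global_batch_size=512,
    block_size=2048,
    devices=8,
    num_nodes=8,
)
for key, value in plan.items():
    print(f"{key}: {value:,}")
```

Dividing the measured tokens-per-second throughput into max_tokens then gives a rough wall-clock estimate for the job.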
Workflow 4: Convert and deploy model
Export LitGPT models for production.
Model Deployment:
- [ ] Step 1: Test inference locally
- [ ] Step 2: Quantize model (optional)
- [ ] Step 3: Convert to GGUF (for llama.cpp)
- [ ] Step 4: Deploy with API
Step 1: Test inference locally
from litgpt import LLM
llm = LLM.load("out/phi2-lora/final")
print(llm.generate("What is machine learning?"))
for token in llm.generate("Explain quantum computing", stream=True):
    print(token, end="", flush=True)
prompts