GrepAI Storage with PostgreSQL
This skill covers using PostgreSQL with the pgvector extension as the storage backend for GrepAI.
When to Use This Skill
- Team environments with shared index
- Large codebases (10K+ files)
- Need concurrent access
- Integration with existing PostgreSQL infrastructure
Prerequisites
- PostgreSQL 14+ with pgvector extension
- Database user with create table permissions
- Network access to PostgreSQL server
Advantages
| Benefit |
Description |
| ๐ฅ Team sharing |
Multiple users can access same index |
| ๐ Scalable |
Handles large codebases |
| ๐ Concurrent |
Multiple simultaneous searches |
| ๐พ Persistent |
Data survives machine restarts |
| ๐ง Familiar |
Standard database tooling |
Setting Up PostgreSQL with pgvector
Option 1: Docker (Recommended for Development)
docker run -d \
--name grepai-postgres \
-e POSTGRES_USER=grepai \
-e POSTGRES_PASSWORD=grepai \
-e POSTGRES_DB=grepai \
-p 5432:5432 \
pgvector/pgvector:pg16
Option 2: Install on Existing PostgreSQL
sudo apt install postgresql-16-pgvector
git clone https://github.com/pgvector/pgvector.git
cd pgvector
make
sudo make install
Then enable the extension:
CREATE EXTENSION IF NOT EXISTS vector;
Option 3: Managed Services
- Supabase: pgvector included by default
- Neon: pgvector available
- AWS RDS: Install pgvector extension
- Azure Database: pgvector available
Configuration
Basic Configuration
store:
backend: postgres
postgres:
dsn: postgres://user:password@localhost:5432/grepai
With Environment Variable
store:
backend: postgres
postgres:
dsn: ${DATABASE_URL}
Set the environment variable:
export DATABASE_URL="postgres://user:password@localhost:5432/grepai"
Full DSN Options
store:
backend: postgres
postgres:
dsn: postgres://user:password@host:5432/database?sslmode=require
DSN components:
user: Database username
password: Database password
host: Server hostname or IP
5432: Port (default: 5432)
database: Database name
sslmode: SSL mode (disable, require, verify-full)
SSL Modes
| Mode |
Description |
Use Case |
disable |
No SSL |
Local development |
require |
SSL required |
Production |
verify-full |
SSL + verify certificate |
High security |
store:
backend: postgres
postgres:
dsn: postgres://user:[email protected]:5432/grepai?sslmode=require
Database Schema
GrepAI automatically creates these tables:
CREATE TABLE IF NOT EXISTS embeddings (
id SERIAL PRIMARY KEY,
file_path TEXT NOT NULL,
chunk_index INTEGER NOT NULL,
content TEXT NOT NULL,
start_line INTEGER,
end_line INTEGER,
embedding vector(768),
created_at TIMESTAMP DEFAULT NOW(),
UNIQUE(file_path, chunk_index)
);
CREATE INDEX ON embeddings USING ivfflat (embedding vector_cosine_ops);
Verifying Setup
Check pgvector Extension
psql -U grepai -d grepai
SELECT * FROM pg_extension WHERE extname = 'vector';
\dt
Test Connection from GrepAI
grepai status
Performance Tuning
PostgreSQL Configuration
For better vector search performance:
SET work_mem = '256MB';
SET effective_cache_size = '4GB';
SET shared_buffers = '1GB';
Index Tuning
For large indices, tune the IVFFlat index:
CREATE INDEX ON embeddings
USING ivfflat (embedding vector_cosine_ops)
WITH (lists = 100);
Rule of thumb: lists = sqrt(rows)
Concurrent Access
PostgreSQL handles concurrent access automatically:
- Multiple
grepai search commands work simultaneously
- One
grepai watch daemon per codebase
- Many users can share the same index
Team Setup
Shared Database
All team members point to the same database:
store:
backend: postgres
postgres:
dsn: postgres://team:secret@shared-db.company.com:5432/grepai
Per-Project Databases
For isolated projects, use separate databases:
createdb -U postgres grepai_projecta
createdb -U postgres grepai_projectb
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai_projecta
Backup and Restore
Backup
pg_dump -U grepai -d grepai > grepai_backup.sql
Restore
psql -U grepai -d grepai < grepai_backup.sql
Migrating from GOB
- Set up PostgreSQL with pgvector
- Update configuration:
store:
backend: postgres
postgres:
dsn: postgres://user:pass@localhost:5432/grepai
- Delete old index:
rm .grepai/index.gob
- Re-index:
grepai watch
Common Issues
โ Problem: FATAL: password authentication failed
โ
Solution: Check DSN credentials and pg_hba.conf
โ Problem: ERROR: extension "vector" is not available
โ
Solution: Install pgvector:
sudo apt install postgresql-16-pgvector
โ Problem: ERROR: type "vector" does not exist
โ
Solution: Enable extension in the database:
CREATE EXTENSION IF NOT EXISTS vector;
โ Problem: Connection refused
โ
Solution:
- Check PostgreSQL is running
- Verify host and port
- Check firewall rules
โ Problem: Slow searches
โ
Solution:
- Add IVFFlat index
- Increase
work_mem
- Vacuum and analyze tables
Best Practices
- Use environment variables: Don't commit credentials
- Enable SSL: For remote databases
- Regular backups: pg_dump before major changes
- Monitor performance: Check query times
- Index maintenance: Regular VACUUM ANALYZE
Output Format
PostgreSQL storage status:
โ
PostgreSQL Storage Configured
Backend: PostgreSQL + pgvector
Host: localhost:5432
Database: grepai
SSL: disabled
Contents:
- Files: 2,450
- Chunks: 12,340
- Vector dimension: 768
Performance:
- Connection: OK
- IVFFlat index: Yes
- Search latency: ~50ms