Cloudflare Vectorize
Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers.
Status: Production Ready ✅
Last Updated: 2026-01-21
Dependencies: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings)
Latest Versions: [email protected], @cloudflare/[email protected]
Token Savings: ~70%
Errors Prevented: 14
Dev Time Saved: ~4 hours
What This Skill Provides
Core Capabilities
- ✅ Index Management: Create, configure, and manage vector indexes
- ✅ Vector Operations: Insert, upsert, query, delete, and list vectors (list-vectors added August 2025)
- ✅ Metadata Filtering: Advanced filtering with 10 metadata indexes per index
- ✅ Semantic Search: Find similar vectors using cosine, euclidean, or dot-product metrics
- ✅ RAG Patterns: Complete retrieval-augmented generation workflows
- ✅ Workers AI Integration: Native embedding generation with @cf/baai/bge-base-en-v1.5
- ✅ OpenAI Integration: Support for text-embedding-3-small/large models
- ✅ Document Processing: Text chunking and batch ingestion pipelines
- ✅ Testing Setup: Vitest configuration with Vectorize bindings
Templates Included
- basic-search.ts - Simple vector search with Workers AI
- rag-chat.ts - Full RAG chatbot with context retrieval
- document-ingestion.ts - Document chunking and embedding pipeline
- metadata-filtering.ts - Advanced filtering patterns
⚠️ Vectorize V2 Breaking Changes (September 2024)
IMPORTANT: Vectorize V2 became GA in September 2024 with significant breaking changes.
What Changed in V2
Performance Improvements:
- Index capacity: 200,000 → 5 million vectors per index
- Query latency: 549ms → 31ms median (18× faster)
- TopK limit: 20 → 100 results per query
- Scale limits: 100 → 50,000 indexes per account
- Namespace limits: 100 → 50,000 namespaces per index
Breaking API Changes:
- Async Mutations: all mutations are now asynchronous and return a mutationId:
const result = await env.VECTORIZE_INDEX.insert(vectors);
console.log(result.mutationId); // identifies this write; it is applied asynchronously
- returnMetadata Parameter: boolean → string enum:
{ returnMetadata: true } // V1 (boolean)
{ returnMetadata: 'all' | 'indexed' | 'none' } // V2 (string enum)
- Metadata Indexes Required Before Insert:
  - V2 requires metadata indexes to be created BEFORE vectors are inserted
  - Vectors added before a metadata index exists won't be indexed for that property
  - You must re-upsert those vectors after creating the metadata index
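When porting V1 query code, the boolean flag has to be translated to the new enum. The following is a minimal sketch, assuming that V1's `true` corresponds to `'all'` and `false` to `'none'`. The helper name `toReturnMetadataLevel` is hypothetical and not part of the Vectorize API.

```typescript
// The three levels accepted by V2 queries, as listed above.
type ReturnMetadataLevel = "all" | "indexed" | "none";

// Hypothetical migration helper: accepts a legacy V1 boolean or a V2 level and
// always returns a V2 level. The true -> "all" mapping is an assumption.
function toReturnMetadataLevel(value: boolean | ReturnMetadataLevel): ReturnMetadataLevel {
  if (typeof value === "boolean") return value ? "all" : "none";
  return value;
}

// Example: upgrading a V1-style options object.
const legacyOptions = { topK: 5, returnMetadata: true };
const v2Options = {
  ...legacyOptions,
  returnMetadata: toReturnMetadataLevel(legacyOptions.returnMetadata),
};
console.log(v2Options); // { topK: 5, returnMetadata: 'all' }
```

If only filterable fields are needed in the response, `'indexed'` is the narrower choice.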
V1 Deprecation Timeline:
- December 2024: Can no longer create V1 indexes
- Existing V1 indexes: Continue to work (other operations unaffected)
- Migration: Use the wrangler vectorize --deprecated-v1 flag for V1 operations
Wrangler Version Required:
Check Mutation Status
const info = await env.VECTORIZE_INDEX.describe();
console.log(info.mutationId);
console.log(info.processedUpToMutation);
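Because writes are asynchronous, code that inserts and then immediately queries can miss the new vectors. The sketch below polls `describe()` until `processedUpToMutation` matches the `mutationId` returned by the write. It assumes the awaited write is the most recent one, since a later processed mutation would replace the reported ID. `waitForMutation`, `IndexLike`, and the retry parameters are illustrative names, not part of the Vectorize API.

```typescript
// Structural stand-in for the part of the Vectorize binding used here, so the
// sketch is self-contained. In a Worker, env.VECTORIZE_INDEX fills this role.
interface IndexLike {
  describe(): Promise<{ processedUpToMutation: string | number }>;
}

// Polls describe() until the index reports `mutationId` as its last processed
// mutation. Resolves to true on success, false if the attempt budget runs out.
async function waitForMutation(
  index: IndexLike,
  mutationId: string,
  maxAttempts = 10,
  delayMs = 1000,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const info = await index.describe();
    if (String(info.processedUpToMutation) === mutationId) return true;
    // Wait before asking again.
    await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
  }
  return false;
}
```

A typical call would be `await waitForMutation(env.VECTORIZE_INDEX, result.mutationId)` after an insert or upsert. Polling like this suits tests and one-off ingestion scripts better than latency-sensitive request handlers.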
Critical Setup Rules
⚠️ MUST DO BEFORE INSERTING VECTORS
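# 1. Create the index (768 dimensions matches @cf/baai/bge-base-en-v1.5)
npx wrangler vectorize create my-index \
  --dimensions=768 \
  --metric=cosine
# 2. Create a metadata index for each property you plan to filter on
npx wrangler vectorize create-metadata-index my-index \
  --property-name=category \
  --type=string
npx wrangler vectorize create-metadata-index my-index \
  --property-name=timestamp \
  --type=number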
Why: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property.
Index Configuration (Cannot Be Changed Later)
Wrangler Configuration
wrangler.jsonc:
{
  "name": "my-vectorize-worker",
  "main": "src/index.ts",
  "compatibility_date": "2025-10-21",
  "vectorize": [
    {
      "binding": "VECTORIZE_INDEX",
      "index_name": "my-index"
    }
  ],
  "ai": {
    "binding": "AI"
  }
}
TypeScript Types
export interface Env {
  VECTORIZE_INDEX: VectorizeIndex;
  AI: Ai;
}
interface VectorizeVector {
  id: string;
  values: number[] | Float32Array | Float64Array;
  namespace?: string;
  metadata?: Record<string, string | number | boolean | string[]>;
}
interface VectorizeMatches {
  matches: Array<{
    id: string;
    score: number;
    values?: number[];
    metadata?: Record<string, any>;
    namespace?: string;
  }>;
  count: number;
}
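To show how these types fit together, here is a minimal sketch of a semantic search helper that embeds the query text with Workers AI and passes the vector to Vectorize. The local `SearchEnv` and `Match` interfaces are simplified stand-ins for the binding types so the example is self-contained. It assumes the embedding response exposes its vectors under `data`, and `semanticSearch` is an illustrative name.

```typescript
// Simplified stand-ins for the Workers AI and Vectorize bindings (assumptions
// for self-containment; a real Worker would use the Env interface above).
interface Match {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}
interface SearchEnv {
  AI: {
    run(model: string, input: { text: string[] }): Promise<{ data: number[][] }>;
  };
  VECTORIZE_INDEX: {
    query(
      vector: number[],
      options: { topK: number; returnMetadata: "all" | "indexed" | "none" },
    ): Promise<{ matches: Match[]; count: number }>;
  };
}

// Embeds `text` with the 768-dimension BGE model, then returns the nearest matches.
async function semanticSearch(env: SearchEnv, text: string, topK = 5): Promise<Match[]> {
  const embedding = await env.AI.run("@cf/baai/bge-base-en-v1.5", { text: [text] });
  const vector = embedding.data[0];
  if (!vector) throw new Error("embedding model returned no vector");
  const result = await env.VECTORIZE_INDEX.query(vector, { topK, returnMetadata: "all" });
  return result.matches;
}
```

The index must have been created with the same dimension count as the embedding model (768 here), otherwise the query vector will not match the index.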
Metadata Filter Operators (V2)
Vectorize V2 supports advanced metadata filtering with range queries:
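{ category: "docs" } // equality (implicit $eq)
{ status: { $ne: "archived" } } // not equal
{ category: { $in: ["docs", "tutorials"] } } // matches any listed value
{ category: { $nin: ["deprecated", "draft"] } } // matches none of the listed values
{ timestamp: { $gte: 1704067200, $lt: 1735689600 } } // numeric range
{ url: { $gte: "/docs/workers", $lt: "/docs/workersz" } } // string range used as a prefix match
{ "author.id": "user123" } // nested field via dot notation
{ category: "docs", language: "en", "metadata.published": true } // multiple conditions, combined with AND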
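The string-range prefix pattern above can be wrapped in a small helper. This is a sketch that mirrors the `"/docs/workers"` to `"/docs/workersz"` example. `prefixFilter` is a hypothetical name, and the approach assumes string values are compared lexicographically.

```typescript
type RangeFilter = { $gte: string; $lt: string };

// Builds a range filter matching values that start with `prefix`, using the
// same "append z" upper bound as the example above.
function prefixFilter(property: string, prefix: string): Record<string, RangeFilter> {
  if (prefix.length === 0) throw new Error("prefix must not be empty");
  return { [property]: { $gte: prefix, $lt: prefix + "z" } };
}

// Top-level keys are combined with AND, so the helper composes with equality checks.
const filter = { category: "docs", ...prefixFilter("url", "/docs/workers") };
console.log(JSON.stringify(filter));
// {"category":"docs","url":{"$gte":"/docs/workers","$lt":"/docs/workersz"}}
```

The resulting object can be passed as the `filter` option of a query. The `"z"` upper bound is a simplification: values whose next character sorts at or after `z` (for example `{`, `~`, or most non-ASCII characters) fall outside the range.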
Metadata Best Practices
1. Cardinality Considerations
Low Cardinality (Good for $eq filters):
metadata: {
  category: "docs",
  language: "en",
  published: true
}
High Cardinality (Avoid in range queries):
metadata: {
  user_id: "uuid-v4...",
  timestamp_ms: 1704067200123
}
2. Metadata Limits
- Max 10 metadata indexes per Vectorize index
- Max 10 KiB metadata per vector
- String indexes: only the first 64 bytes (UTF-8) of each value are indexed
- Number indexes: Float64 precision
- Filter size: Max 2048 bytes (compact JSON)
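These limits can be checked before data is sent to Vectorize. The sketch below assumes that sizes are measured as the UTF-8 byte length of the compact `JSON.stringify` output, which may differ slightly from how the service counts. `checkMetadata` and `checkFilter` are hypothetical helper names.

```typescript
const MAX_METADATA_BYTES = 10 * 1024; // 10 KiB of metadata per vector
const MAX_FILTER_BYTES = 2048; // filter size as compact JSON
const INDEXED_STRING_BYTES = 64; // only this many bytes of a string are indexed

type Metadata = Record<string, string | number | boolean | string[]>;

const encoder = new TextEncoder();
// UTF-8 byte length of the compact JSON form (assumed measurement method).
const jsonBytes = (value: unknown): number => encoder.encode(JSON.stringify(value)).length;

// Returns a list of human-readable problems; an empty list means no issue was found.
function checkMetadata(metadata: Metadata, indexedStringProps: string[] = []): string[] {
  const problems: string[] = [];
  if (jsonBytes(metadata) > MAX_METADATA_BYTES) {
    problems.push(`metadata exceeds ${MAX_METADATA_BYTES} bytes`);
  }
  for (const prop of indexedStringProps) {
    const value = metadata[prop];
    if (typeof value === "string" && encoder.encode(value).length > INDEXED_STRING_BYTES) {
      problems.push(`"${prop}" exceeds ${INDEXED_STRING_BYTES} bytes; the remainder is not indexed`);
    }
  }
  return problems;
}

// True when the filter fits within the filter size limit.
function checkFilter(filter: Record<string, unknown>): boolean {
  return jsonBytes(filter) <= MAX_FILTER_BYTES;
}
```

Running `checkMetadata` during ingestion surfaces oversized records early, and the 64-byte warning flags indexed values that could collide once only their first 64 bytes are considered.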
3. Vector Dimension Limit
Current Limit: 1536 dimensions per vector
Source: GitHub Issue #8729
Supported Embedding Models:
- Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions ✅
- OpenAI text-embedding-3-small: 1536 dimensions ✅
- OpenAI text-embedding-3-large: 3072 dimensions ❌ (requires dimension reduction)
Unsupported Models (>1536 dimensions):
- nomic-embed-code: 3584 dimensions
- Qodo-Embed-1-7B: >1536 dimensions
Workaround: Use dimensionality reduction (e.g., PCA) to compress embeddings to 1536 or fewer dimensions, though this may reduce semantic quality.
Feature Request: Higher dimension support is under consideration. Use the Limit Increase Request Form if this blocks your use case.
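As a simpler alternative to PCA, an oversized embedding can be truncated and re-normalized. Note that this is not PCA: it is only reasonable for models trained to tolerate truncation (OpenAI's embeddings guide describes this for its text-embedding-3 models), and it is an assumption that a given model qualifies. `truncateAndNormalize` is a hypothetical helper name.

```typescript
const MAX_VECTORIZE_DIMENSIONS = 1536; // current per-vector limit stated above

// Keeps the first `target` components and rescales the result to unit length.
// `target` must equal the dimension count the index was created with.
function truncateAndNormalize(
  embedding: number[],
  target: number = MAX_VECTORIZE_DIMENSIONS,
): number[] {
  if (!Number.isInteger(target) || target < 1 || target > MAX_VECTORIZE_DIMENSIONS) {
    throw new RangeError(`target must be an integer between 1 and ${MAX_VECTORIZE_DIMENSIONS}`);
  }
  if (embedding.length < target) {
    throw new RangeError(`embedding has ${embedding.length} dimensions, fewer than ${target}`);
  }
  const head = embedding.slice(0, target);
  const norm = Math.sqrt(head.reduce((sum, v) => sum + v * v, 0));
  // Leave an all-zero vector untouched to avoid dividing by zero.
  return norm === 0 ? head : head.map((v) => v / norm);
}
```

For example, a 3072-dimension text-embedding-3-large vector would become a 1536-dimension unit vector. Retrieval quality after truncation should be checked on your own data, since the source notes that any reduction may cost semantic quality.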
4. Key Restrictions
metadata: {
"": "value",