Zvec is an open-source, in-process vector database built by Alibaba Group. It runs as a library embedded directly inside your application — no external server process required. It supports dense and sparse vectors, hybrid search, write-ahead logging (WAL) for persistence, and multi-process read concurrency.

How is Zvec different from Faiss or Chroma?

Faiss is a similarity search library: it can save and load index files, but it has no write-ahead log, no structured metadata filtering, and no database-style API. Server-based databases such as Qdrant run as a separate process, and Chroma is commonly deployed that way too. Zvec combines the speed of a native library with persistence (WAL), structured filters, sparse vector support, and multi-process read access, all in-process with no server overhead.

What platforms does Zvec support?

Zvec supports Linux (x86_64, ARM64), macOS (ARM64), and Windows (x86_64). As of v0.4.0 it also supports Android (arm64-v8a) and iOS (arm64) via the official Flutter/Dart FFI package.

Does Zvec persist data between restarts?

Yes. Zvec uses write-ahead logging (WAL): each insert is written to the log and flushed before the call returns, so acknowledged writes survive a process crash or power failure.

Can multiple processes read from the same Zvec collection?

Yes. Multiple processes can open the same collection in read mode simultaneously. Writes require single-process exclusive access.

Is Zvec production-ready?

Zvec has been battle-tested within Alibaba Group in production workloads. It has 9.9k+ GitHub stars, 8 releases, and 27 contributors as of June 2026.

Zvec: Alibaba In-Process Vector Database — Open Source 2026 | explainx.ai Blog

Alibaba just open-sourced the vector database they've been running in production. Zvec — an in-process, embedded vector database — landed on GitHub in late 2025 and has already reached 9.9k stars. The v0.5.0 release dropped in June 2026, adding WAL guarantees, libaio support, and prefetch configuration for tuning search latency.

The pitch is straightforward: the speed of a native library, the durability of a real database, and zero server setup. If you've ever added Chroma or Qdrant to a project just to get local vector search and felt like it was too much infrastructure for what you needed — Zvec is the answer.

What Is Zvec?

Zvec is an in-process vector database — a library that runs inside your application rather than as a separate server. You import it, open a collection (a directory on disk), insert vectors, and query them. There is no daemon to start, no port to configure, no connection pooling to manage.

This puts it in the same category as SQLite, not Postgres. The analogy is intentional: just as SQLite gave developers a production-grade relational database they could embed in any app, Zvec gives developers a production-grade vector database with the same zero-infrastructure profile.

Alibaba ran Zvec internally across multiple production workloads before open-sourcing it, so it has already been exercised at serious scale.

Why In-Process Matters

Most popular vector databases (Qdrant, Weaviate, Chroma in server mode, Milvus) require running a separate process. That's fine for teams with dedicated infra, but it creates real friction for:

Local development — you need Docker or a running daemon before writing a single line of retrieval code
Edge / mobile — shipping a server process to Android or iOS is not practical
CLI tools and notebooks — standing up infrastructure for a script is over-engineering
Serverless functions — ephemeral environments make external connections expensive

Zvec eliminates all of these friction points. Since it's just a library, it starts the instant your process starts and shuts down cleanly when your process exits.

For context on where in-process vector search fits in the broader RAG landscape, see our comparison of RAG vs MCP for context-aware AI systems.

Core Features

Dense + Sparse Vector Support

Zvec handles both dense vectors (typical embeddings from models like text-embedding-3) and sparse vectors (BM25-style term weights from models like SPLADE). Both can be queried in a single call, enabling true hybrid retrieval without a separate keyword search layer.
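To make the idea concrete, here is a minimal pure-Python sketch of hybrid scoring. It assumes a simple weighted sum of dense cosine similarity and a sparse dot product; the function names and the `alpha` weight are hypothetical and are not Zvec's API or its actual fusion method.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def sparse_dot(a, b):
    """Dot product of two sparse vectors stored as {term_id: weight} dicts."""
    if len(a) > len(b):
        a, b = b, a  # iterate over the smaller dict
    return sum(w * b[t] for t, w in a.items() if t in b)

def hybrid_score(query, doc, alpha=0.5):
    """Weighted sum of dense and sparse scores (alpha is a hypothetical knob)."""
    dense = cosine(query["dense"], doc["dense"])
    sparse = sparse_dot(query["sparse"], doc["sparse"])
    return alpha * dense + (1 - alpha) * sparse

docs = {
    "doc_1": {"dense": [1.0, 0.0], "sparse": {7: 1.0}},
    "doc_2": {"dense": [0.0, 1.0], "sparse": {3: 1.0}},
}
query = {"dense": [1.0, 0.0], "sparse": {7: 0.5}}

# Rank document ids by combined score, best first.
ranked = sorted(docs, key=lambda d: hybrid_score(query, docs[d]), reverse=True)
print(ranked)
```

Here doc_1 wins on both signals: it points in the same direction as the query and shares term 7 with it. In practice the two score types live on different scales, so the weighting usually needs tuning per dataset.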

Hybrid Search with Filters

Combine semantic similarity with structured filters in one query. This matters for most real-world retrieval tasks where you want "most semantically similar to X" combined with "must have field Y = value Z."
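The semantics of a filtered query can be sketched with a brute-force version: keep only the records whose fields match, then rank what is left by similarity. This is an illustration of the behavior only; `filtered_search` and the record layout are assumptions, and how a real engine evaluates filters internally is implementation-specific.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def filtered_search(records, query_vec, required, topk=2):
    """Keep records matching every key/value in `required`, then rank by dot product."""
    candidates = [
        r for r in records
        if all(r["fields"].get(k) == v for k, v in required.items())
    ]
    candidates.sort(key=lambda r: dot(r["vector"], query_vec), reverse=True)
    return [(r["id"], dot(r["vector"], query_vec)) for r in candidates[:topk]]

records = [
    {"id": "doc_1", "vector": [0.9, 0.1], "fields": {"category": "tech"}},
    {"id": "doc_2", "vector": [1.0, 0.0], "fields": {"category": "finance"}},
    {"id": "doc_3", "vector": [0.2, 0.8], "fields": {"category": "tech"}},
]

hits = filtered_search(records, [1.0, 0.0], {"category": "tech"})
print(hits)
```

Note that doc_2 is the closest vector overall but is excluded because its category does not match, which is exactly the case a post-hoc filter on the top results can get wrong.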

WAL Persistence

Write-ahead logging means every insert is durable before it returns. Zvec survives process crashes and power failures without data loss — which separates it from pure in-memory libraries like FAISS.
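The general write-ahead pattern can be sketched in a few lines: append the record to a log, force it to disk, and only then apply it in memory; on startup, replay the log. This is a toy illustration, not Zvec's on-disk format or recovery logic, and `TinyWAL` is a hypothetical name.

```python
import json
import os
import tempfile

class TinyWAL:
    """Toy store: log each write durably, then apply it in memory."""

    def __init__(self, path):
        self.path = path
        self.data = {}
        self._replay()
        self.log = open(path, "a", encoding="utf-8")

    def _replay(self):
        # Rebuild in-memory state from the log left by a previous run.
        if not os.path.exists(self.path):
            return
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                try:
                    record = json.loads(line)
                except json.JSONDecodeError:
                    break  # torn final record from a crash: stop replaying here
                self.data[record["id"]] = record["vector"]

    def insert(self, doc_id, vector):
        self.log.write(json.dumps({"id": doc_id, "vector": vector}) + "\n")
        self.log.flush()             # push Python's buffer to the OS
        os.fsync(self.log.fileno())  # ask the OS to push it to stable storage
        self.data[doc_id] = vector   # only now apply the change in memory

    def close(self):
        self.log.close()

wal_path = os.path.join(tempfile.mkdtemp(), "collection.wal")
store = TinyWAL(wal_path)
store.insert("doc_1", [0.1, 0.2])
store.close()  # simulate a restart: drop the object, reopen from disk

recovered = TinyWAL(wal_path)
print(recovered.data)
recovered.close()
```

The `fsync` call is what makes the insert durable before it returns, and it is also the main cost of the pattern. A production log would add checksums, batching, and truncation of torn records, all of which this sketch omits.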

Multi-Process Read Concurrency

Multiple processes can open the same Zvec collection simultaneously for reads. Only writes require exclusive single-process access. This makes it viable for multi-worker serving scenarios.
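One common way to enforce "many readers or one writer" on a directory is an advisory file lock: readers take a shared lock, a writer takes an exclusive one. The sketch below shows that pattern with POSIX `flock` (so it does not run on Windows). It is an assumption about how such a rule could be implemented, not a description of Zvec's locking, and `open_locked` is a hypothetical helper.

```python
import fcntl
import os
import tempfile

lock_path = os.path.join(tempfile.mkdtemp(), "collection.lock")
open(lock_path, "w").close()  # create an empty lock file

def open_locked(path, exclusive):
    """Open the lock file and take a shared or exclusive lock without blocking."""
    f = open(path, "r")
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(f, mode | fcntl.LOCK_NB)
    except BlockingIOError:
        f.close()
        raise
    return f

# Two readers can hold the shared lock at the same time.
reader_a = open_locked(lock_path, exclusive=False)
reader_b = open_locked(lock_path, exclusive=False)

# A writer is refused while any reader holds the lock.
try:
    open_locked(lock_path, exclusive=True)
    writer_blocked = False
except BlockingIOError:
    writer_blocked = True

# Once the readers are gone, the writer gets exclusive access.
reader_a.close()
reader_b.close()
writer = open_locked(lock_path, exclusive=True)
writer.close()
print(writer_blocked)
```

Each `open` call creates its own lock holder, so the same conflict rules apply whether the readers and writer live in one process (as here) or in separate worker processes.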

Indexing: HNSW and DiskANN

Zvec supports HNSW (the standard graph-based ANN algorithm) and DiskANN (Microsoft's disk-based index, designed for billion-scale search on commodity hardware with limited RAM). DiskANN support predates v0.4.0; the release notes below list it under the v0.3.x era.
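Graph-based indexes answer a query by walking a neighbor graph toward the query point instead of scanning every vector. The sketch below shows that core step on a single flat graph built by brute force. It is a deliberately simplified illustration, not HNSW itself (which adds a layer hierarchy and a candidate queue) and not Zvec's implementation; all names are hypothetical.

```python
import math
import random

def build_knn_graph(points, k):
    """Brute-force k-nearest-neighbour graph: node index -> neighbour indices."""
    graph = {}
    for i, p in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        graph[i] = others[:k]
    return graph

def greedy_search(points, graph, query, entry=0):
    """Move to the neighbour closest to the query until no neighbour improves."""
    current = entry
    current_d = math.dist(points[current], query)
    while True:
        best = min(graph[current], key=lambda j: math.dist(points[j], query))
        best_d = math.dist(points[best], query)
        if best_d >= current_d:
            return current, current_d  # local minimum: stop here
        current, current_d = best, best_d

random.seed(0)
points = [(random.random(), random.random()) for _ in range(200)]
graph = build_knn_graph(points, k=8)
query = (0.5, 0.5)

found, found_d = greedy_search(points, graph, query)
exact_d = min(math.dist(p, query) for p in points)  # exhaustive baseline
print(found, round(found_d, 4), round(exact_d, 4))
```

The walk only evaluates a handful of nodes per step, which is where the speed comes from, but it can stop at a local minimum that is not the true nearest neighbor. That gap between `found_d` and `exact_d` is what "approximate" means in ANN.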

Quick Start

python

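import zvec

# Placeholder embeddings so the snippet is self-contained (an assumption for illustration);
# in practice these come from your embedding model and must match the schema dimension (1536).
embedding_1 = [0.1] * 1536
embedding_2 = [0.2] * 1536
query_embedding = [0.1] * 1536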

# Define schema
schema = zvec.CollectionSchema(
    name="docs",
    vectors=zvec.VectorSchema("embedding", zvec.DataType.VECTOR_FP32, 1536),
)

# Open (creates if not exists)
collection = zvec.create_and_open(path="./my_collection", schema=schema)

# Insert
collection.insert([
    zvec.Doc(id="doc_1", vectors={"embedding": embedding_1}, fields={"category": "tech"}),
    zvec.Doc(id="doc_2", vectors={"embedding": embedding_2}, fields={"category": "finance"}),
])

# Query with filter
results = collection.query(
    zvec.VectorQuery("embedding", vector=query_embedding),
    topk=10,
    filters={"category": "tech"},
)

print(results)  # [{'id': 'doc_1', 'score': 0.94, ...}, ...]

Node.js is also supported via npm install @zvec/zvec — the API mirrors the Python SDK.

What's New in Recent Releases

v0.5.0 (June 2026)

libaio support for Linux async I/O (lower read latency on DiskANN workloads)
Prefetch configuration exposed as search params (PO and PL) for tuning I/O prefetch depth during ANN search
Further compiler warning fixes with -Werror across all CI platforms

v0.4.0 (May 2026)

Dart/Flutter SDK: official package with FFI bindings for Android (arm64-v8a) and iOS (arm64) — no manual native compilation required
iOS build support: expanding cross-platform coverage to Apple mobile
Enlarged topK limit: relaxed the upper bound for larger recall scenarios
Bug fixes: SQ8 quantizer recall drop, Windows path handling, sparse vector index ordering

Earlier milestones

DiskANN index support (v0.3.x era)
Full-text search (FTS) support
SQ8 scalar quantization
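For readers unfamiliar with SQ8, the general technique stores each float as one byte by mapping a per-dimension value range onto the integers 0 to 255. The sketch below shows a textbook min/max variant in pure Python; it is an assumption for illustration and may differ from how Zvec's quantizer is trained or encoded. The function names are hypothetical.

```python
def sq8_train(vectors):
    """Per-dimension minimum and maximum over the training vectors."""
    dims = range(len(vectors[0]))
    lo = [min(v[d] for v in vectors) for d in dims]
    hi = [max(v[d] for v in vectors) for d in dims]
    return lo, hi

def sq8_encode(vec, lo, hi):
    """Map each float to an integer code in 0..255, one byte per dimension."""
    codes = []
    for x, l, h in zip(vec, lo, hi):
        scale = (h - l) or 1.0             # avoid division by zero on constant dimensions
        q = round((x - l) / scale * 255)
        codes.append(max(0, min(255, q)))  # clamp values outside the trained range
    return bytes(codes)

def sq8_decode(codes, lo, hi):
    """Approximate reconstruction of the original floats."""
    return [l + c / 255 * ((h - l) or 1.0) for c, l, h in zip(codes, lo, hi)]

vectors = [[0.0, -1.0, 2.0], [1.0, 1.0, 4.0], [0.5, 0.0, 3.0]]
lo, hi = sq8_train(vectors)

codes = sq8_encode(vectors[2], lo, hi)
approx = sq8_decode(codes, lo, hi)
max_err = max(abs(a - b) for a, b in zip(approx, vectors[2]))
print(len(codes), max_err)
```

Storage drops from four bytes per dimension (FP32) to one, at the cost of a reconstruction error bounded by roughly one quantization step per dimension. That error is why quantizer bugs tend to show up as recall drops.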

How It Compares

Feature	Zvec	FAISS	Chroma	Qdrant
In-process	Yes	Yes	Optional	No
Persistence (WAL)	Yes	No	Yes	Yes
Hybrid search	Yes	No	Limited	Yes
Sparse vectors	Yes	No	No	Yes
Mobile support	Yes (Flutter)	No	No	No
Multi-process reads	Yes	No	No	N/A
DiskANN	Yes	No	No	No
Setup required	None	None	Server optional	Server required

For a deeper dive into where vector search fits vs. keyword retrieval in code contexts, see RAG vs Agentic RAG: why search beats embeddings for code retrieval.

Performance

Alibaba benchmarks Zvec against billion-scale datasets. The headline claims: searches across billions of vectors in milliseconds with the HNSW index, and larger-scale recall with DiskANN. The project publishes full benchmark methodology, configurations, and results at their docs site.

For comparison, Google's own vector compression work (TurboVec/TurboQuant) recently demonstrated compressing 10M vectors from 31GB to 4GB — a different angle on the same infrastructure problem of making vector search practical at scale. See our coverage of Google TurboVec and TurboQuant for how the tradeoffs compare.
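A back-of-the-envelope estimate shows why billion-scale search pushes toward disk-based indexes and compression. The sketch below counts raw vector bytes only; the one-billion count is illustrative, the 1536 dimension is borrowed from the Quick Start schema, and index structures and metadata are ignored.

```python
def raw_size_bytes(num_vectors, dims, bytes_per_value):
    """Size of the raw vector payload only (no index structures, no metadata)."""
    return num_vectors * dims * bytes_per_value

GB = 10**9
n, dims = 1_000_000_000, 1536  # illustrative count; dimension from the Quick Start

fp32 = raw_size_bytes(n, dims, 4)  # 4 bytes per FP32 value
sq8 = raw_size_bytes(n, dims, 1)   # 1 byte per value after 8-bit quantization

print(f"FP32: {fp32 / GB:.0f} GB, SQ8: {sq8 / GB:.0f} GB")
# prints "FP32: 6144 GB, SQ8: 1536 GB"
```

Even the quantized payload is well beyond the RAM of a single commodity machine, which is the gap that disk-resident indexes are meant to cover.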

Who Should Use Zvec

Good fit:

Python or Node.js apps that need local vector search without infrastructure overhead
RAG pipelines in notebooks, scripts, or serverless functions
Mobile apps (Flutter/React Native with FFI) that need on-device semantic search
Production services where you want embedded, not networked, vector retrieval
Teams who want DiskANN's disk-based billion-scale search without Milvus's operational complexity

Less ideal:

Multi-tenant SaaS needing shared vector infrastructure across many isolated users
Workloads requiring horizontal write scaling across many nodes
Teams already invested in a managed vector database with cloud-native features

Getting It

bash

# Python
pip install zvec

# Node.js
npm install @zvec/zvec

Source is at github.com/alibaba/zvec under Apache 2.0. The project has 27 contributors, 8 releases, and active CI across Linux, macOS, Windows, Android, and iOS.

Zvec is the kind of infrastructure release that quietly becomes load-bearing in a lot of projects. In-process vector search with WAL durability, hybrid filtering, and mobile support fills a real gap — and the Alibaba production provenance means you're not betting on an untested project.

Related posts

MinerU 3.4: PDF and Office Parsing for LLM, RAG, and Agent Workflows

PixelRAG: Berkeley's Visual RAG That Reads Web Pages as Screenshots (Not HTML)

Turso: The SQLite-Compatible Database Rewritten in Rust — MVCC, Async I/O, Vector Search, and an MCP Server
