MinerU is an open-source document parsing engine from OpenDataLab (github.com/opendatalab/MinerU) that converts PDF, images, DOCX, PPTX, and XLSX into machine-readable Markdown and JSON for RAG, agents, and pre-training pipelines. Born during InternLM pre-training; now at v3.4.0 with ~69.7k GitHub stars.

What changed in MinerU 3.4?

Released June 18, 2026. Pipeline backend OCR upgraded to PP-OCRv6 (~11% accuracy gain on OmniDocBench v1.6). OCR processing speed improved ~100%. Model download logic adds automatic source selection by network environment and prioritizes local cache reuse before remote downloads.

How accurate is MinerU?

On OmniDocBench v1.6 end-to-end scores: pipeline backend 86.47; hybrid/VLM high effort 95.39; hybrid medium (new default) 95.26; VLM HTTP client 95.30. Medium effort trades 0.13 points vs high for 35–220% speed gains depending on platform.

Can MinerU run on CPU only?

Yes. Use the pipeline backend: mineru -p input.pdf -o output/ -b pipeline. Pipeline runs on pure CPU with min 16GB RAM (32GB recommended). Hybrid and VLM backends require GPU (min 8GB VRAM) or Apple Silicon for acceleration.

How do I install MinerU?

pip install uv && uv pip install -U "mineru[all]" — includes all core features on Windows, Linux, and macOS. Docker deployment supported on Linux and Windows WSL2. CLI: mineru -p input_path -o output_path.

What license does MinerU use?

Since v3.1.0 (April 2026), MinerU uses the MinerU Open Source License — based on Apache 2.0 with additional conditions — replacing the earlier AGPLv3. This lowers friction for commercial deployments while keeping the project open.

MinerU 3.4 Guide — PDF/Office to Markdown for RAG & Agents | explainx.ai Blog

explainx.ainewsletter3.5k

workshops ↗

MinerU 3.4 Guide — PDF/Office to Markdown for RAG & Agents | explainx.ai Blog | explainx.ai

If your RAG pipeline still treats PDFs as "extract text with PyPDF and hope," you are leaving layout, tables, formulas, and multi-column structure on the floor. MinerU — OpenDataLab's document parsing engine with ~69.7k GitHub stars — exists to fix that: turn complex PDFs and Office documents into LLM-ready Markdown and JSON with headings, tables, formulas, and images preserved.

Version 3.4.0 landed June 18, 2026 with a focused upgrade: PP-OCRv6 for the pipeline backend (~11% OCR accuracy gain on OmniDocBench v1.6), roughly 100% faster OCR processing, and smarter model download / cache reuse. For agent builders, MinerU is increasingly the default ingestion layer before chunking, embedding, and retrieval.

TL;DR

Detail	MinerU 3.4
Repo	github.com/opendatalab/MinerU
Docs	opendatalab.github.io/MinerU
Latest release	mineru-3.4.0 (June 2026)
Stars / forks	~69.7k / ~5.9k
Inputs	PDF, images, DOCX, PPTX, XLSX

Backend	Accuracy (OmniDocBench v1.6 E2E)	CPU	GPU	Best for
pipeline	86.47	✅	Optional	Homelab, CPU-only, batch OCR
hybrid medium (default)	95.26	❌	8GB+ VRAM	Daily production — speed/accuracy balance
hybrid high	95.39	❌	8GB+ VRAM	Max accuracy, image analysis
vlm / vlm-http-client	95.30	❌	2GB+ VRAM (client)	OpenAI-compatible remote servers

Platform	Text PDF speedup	OCR scenario speedup
Linux	~80%	~35%
Windows	~90%	~45%
macOS	~220%	~50%

	pipeline	hybrid / vlm
OS	Linux 2019+, Windows, macOS 14+	Same
Python	3.10–3.13 (Windows: 3.10–3.12)	Same
RAM	Min 16GB, rec 32GB+	Min 16GB
VRAM	4GB optional	Min 8GB (hybrid), 2GB (http client)
Disk	Min 20GB SSD recommended	Min 2GB (+ models)

Tool	Strength	Trade-off
MinerU 3.4	Full stack, multi-backend, Office formats, router	Heavy install, GPU for best accuracy
Unlimited-OCR	One-shot long PDFs, SGLang throughput	Vision-model path, different architecture
Mistral OCR 4	Managed API, bounding boxes, confidence	Not self-hosted weights
Generic PyPDF	Fast, trivial	No layout, tables, or formulas

Demo	Notes
Official web app	Full features, login required
OpenDataLab	Same as official
ModelScope Gradio	Core parsing, no login
HuggingFace Gradio	Core parsing, no login

Post	Connection
Baidu Unlimited-OCR	Alternative long-horizon parsing approach
Mistral OCR 4	Managed document AI API comparison
RAG vs agentic RAG	What to do with parsed documents
Liquid AI LFM2.5-230M	Edge extraction after MinerU ingestion
arXiv AI-generated errors ban	Why grounded document pipelines matter
Vesuvius Challenge scroll read	Extreme document recovery — ML ink detection + human transcription

MinerU 3.4: PDF and Office Parsing for LLM, RAG, and Agent Workflows

TL;DR

Related posts

Baidu's Unlimited-OCR: One-Shot Long-Horizon Document Parsing Is Here

Mistral OCR 4: Bounding Boxes, Document AI, and the New OCR API

PixelRAG: Berkeley's Visual RAG That Reads Web Pages as Screenshots (Not HTML)

Why MinerU Matters for RAG and Agents

Version 3.4: What Changed (June 18, 2026)

PP-OCRv6 upgrade

~100% OCR speed improvement

Model download and cache

Parsing Backends Compared

VLM model: MinerU2.5-Pro

Key Features

Quick Start

Install

Parse a document (GPU path)

Parse on CPU only

Docker

Production: mineru-router and Multi-GPU

Hardware Requirements (Summary)

MinerU vs Alternatives (June 2026)

License Evolution

Online Demos (Try Before Deploy)

Summary

TL;DR

Related posts

Baidu's Unlimited-OCR: One-Shot Long-Horizon Document Parsing Is Here

Mistral OCR 4: Bounding Boxes, Document AI, and the New OCR API

PixelRAG: Berkeley's Visual RAG That Reads Web Pages as Screenshots (Not HTML)

Why MinerU Matters for RAG and Agents

Version 3.4: What Changed (June 18, 2026)

PP-OCRv6 upgrade

~100% OCR speed improvement

Model download and cache

Parsing Backends Compared

VLM model: MinerU2.5-Pro

Key Features

Quick Start

Install

Parse a document (GPU path)

Parse on CPU only

Docker

Production: mineru-router and Multi-GPU

Hardware Requirements (Summary)

MinerU vs Alternatives (June 2026)

License Evolution

Online Demos (Try Before Deploy)

Related explainx.ai coverage

Summary