markdown-tools
daymade/claude-code-skills · updated Apr 8, 2026
Convert documents to high-quality markdown with intelligent multi-tool orchestration.
Markdown Tools
Dual Mode Architecture
| Mode | Speed | Quality | Use Case |
|---|---|---|---|
| Quick (default) | Fast | Good | Drafts, simple documents |
| Heavy | Slower | Best | Final documents, complex layouts |
Quick Start
Installation
# Required: PDF/DOCX/PPTX support
uv tool install "markitdown[pdf]"
pip install pymupdf4llm
brew install pandoc
Basic Conversion
# Quick Mode (default) - fast, single best tool
uv run --with pymupdf4llm --with markitdown scripts/convert.py document.pdf -o output.md
# Heavy Mode - multi-tool parallel execution with merge
uv run --with pymupdf4llm --with markitdown scripts/convert.py document.pdf -o output.md --heavy
# Check available tools
uv run scripts/convert.py --list-tools
Tool Selection Matrix
| Format | Quick Mode Tool | Heavy Mode Tools |
|---|---|---|
| PDF | pymupdf4llm | pymupdf4llm + markitdown |
| DOCX | pandoc | pandoc + markitdown |
| PPTX | markitdown | markitdown + pandoc |
| XLSX | markitdown | markitdown |
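The matrix above can be read as a lookup from file extension and mode to a list of tools. The following is a minimal sketch of that idea only. The names `TOOL_MATRIX` and `select_tools` are hypothetical, and the actual selection logic in `scripts/convert.py` may differ.

```python
from pathlib import Path

# Tool choices copied from the Tool Selection Matrix. TOOL_MATRIX and
# select_tools are hypothetical names, not taken from scripts/convert.py.
TOOL_MATRIX = {
    ".pdf": {"quick": ["pymupdf4llm"], "heavy": ["pymupdf4llm", "markitdown"]},
    ".docx": {"quick": ["pandoc"], "heavy": ["pandoc", "markitdown"]},
    ".pptx": {"quick": ["markitdown"], "heavy": ["markitdown", "pandoc"]},
    ".xlsx": {"quick": ["markitdown"], "heavy": ["markitdown"]},
}


def select_tools(path: str, heavy: bool = False) -> list[str]:
    """Return the tools the matrix lists for this file type and mode."""
    suffix = Path(path).suffix.lower()
    if suffix not in TOOL_MATRIX:
        raise ValueError(f"unsupported format: {suffix or path}")
    return TOOL_MATRIX[suffix]["heavy" if heavy else "quick"]
```

For example, `select_tools("report.docx", heavy=True)` returns both tools listed for DOCX in Heavy Mode.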
Tool Characteristics
- pymupdf4llm: LLM-optimized PDF conversion with native table detection and image extraction
- markitdown: Microsoft's universal converter, good for Office formats
- pandoc: Excellent structure preservation for DOCX/PPTX
Heavy Mode Workflow
Heavy Mode runs multiple tools in parallel and selects the best segments:
- Parallel Execution: Run all applicable tools simultaneously
- Segment Analysis: Parse each output into segments (tables, headings, images, paragraphs)
- Quality Scoring: Score each segment based on completeness and structure
- Intelligent Merge: Select best version of each segment across tools
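The first two steps above (parallel execution and segment analysis) might look roughly like the sketch below. It assumes each tool is wrapped in a plain Python callable that takes a file path and returns markdown. The helper names (`run_in_parallel`, `classify`, `split_segments`) and the blank-line segmentation rule are illustrative assumptions, not the behavior of `scripts/convert.py`.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def run_in_parallel(converters: dict[str, Callable[[str], str]], path: str) -> dict[str, str]:
    """Run every converter on the same file at once; return tool name -> markdown."""
    with ThreadPoolExecutor(max_workers=max(1, len(converters))) as pool:
        futures = {name: pool.submit(fn, path) for name, fn in converters.items()}
        return {name: future.result() for name, future in futures.items()}


def classify(block: str) -> str:
    """Guess the segment type of one block from its first line (assumed heuristic)."""
    first = block.lstrip().splitlines()[0]
    if first.startswith("|"):
        return "table"
    if re.match(r"#{1,6} ", first):
        return "heading"
    if first.startswith("!["):
        return "image"
    if re.match(r"([-*+]|\d+\.) ", first):
        return "list"
    return "paragraph"


def split_segments(markdown: str) -> list[tuple[str, str]]:
    """Split markdown on blank lines into (segment_type, text) pairs."""
    blocks = [b.strip() for b in re.split(r"\n\s*\n", markdown) if b.strip()]
    return [(classify(b), b) for b in blocks]
```

Threads are a reasonable fit here if the converters mostly wait on subprocesses or file I/O. Aligning segments across tools before merging is the harder part and is not covered by this sketch.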
Merge Criteria
| Segment Type | Selection Criteria |
|---|---|
| Tables | More rows/columns, proper header separator |
| Images | Alt text present, local paths preferred |
| Headings | Proper hierarchy, appropriate length |
| Lists | More items, nested structure preserved |
| Paragraphs | Content completeness |
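As an illustration of how criteria like these could be turned into scores, here is a small sketch for tables and images. The scoring weights and the function names (`score_table`, `score_image`, `pick_best`) are assumptions made for this example. The real scoring in `scripts/merge_outputs.py` is not shown in this document and may work differently.

```python
import re
from typing import Callable


def is_separator(line: str) -> bool:
    """True for a header separator row such as |---|---|."""
    stripped = line.strip()
    return "-" in stripped and set(stripped) <= set("|-: ")


def score_table(text: str) -> int:
    """More cells score higher; a header separator earns an assumed bonus of 5."""
    rows = [ln for ln in text.splitlines() if ln.strip().startswith("|")]
    data_rows = [ln for ln in rows if not is_separator(ln)]
    cells = sum(len(ln.strip().strip("|").split("|")) for ln in data_rows)
    bonus = 5 if any(is_separator(ln) for ln in rows) else 0
    return cells + bonus


def score_image(text: str) -> int:
    """Reward alt text and local (non-URL) paths; the weights are illustrative."""
    match = re.match(r"!\[(.*?)\]\((.*?)\)", text.strip())
    if match is None:
        return 0
    alt, target = match.groups()
    score = 1
    if alt.strip():
        score += 2
    if not re.match(r"(https?:)?//", target):
        score += 1
    return score


def pick_best(candidates: dict[str, str], scorer: Callable[[str], int]) -> tuple[str, str]:
    """Return (tool, text) for the highest-scoring version; ties go to the first."""
    tool = max(candidates, key=lambda name: scorer(candidates[name]))
    return tool, candidates[tool]
```

Given two versions of the same table from different tools, `pick_best(versions, score_table)` returns the tool name and text of the more complete one.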
Image Extraction
# Extract images with metadata
uv run --with pymupdf scripts/extract_pdf_images.py document.pdf -o ./assets
# Generate markdown references file
uv run --with pymupdf scripts/extract_pdf_images.py document.pdf --markdown refs.md
Output:
- Images: assets/img_page1_1.png, assets/img_page2_1.jpg
- Metadata: assets/images_metadata.json (page, position, dimensions)
Quality Validation
# Validate conversion quality
uv run --with pymupdf scripts/validate_output.py document.pdf output.md
# Generate HTML report
uv run --with pymupdf scripts/validate_output.py document.pdf output.md --report report.html
Quality Metrics
| Metric | Pass | Warn | Fail |
|---|---|---|---|
| Text Retention | >95% | 85-95% | <85% |
| Table Retention | 100% | 90-99% | <90% |
| Image Retention | 100% | 80-99% | <80% |
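The thresholds in the table can be expressed as a small classifier. This is a sketch only: `retention` and `classify_retention` are hypothetical names, and the handling of boundary values that the table leaves open (exactly 95% text retention, or 99.5% table retention) is an assumption here, with both treated as Warn.

```python
# (pass threshold, warn floor) per metric, taken from the Quality Metrics table.
THRESHOLDS = {
    "text": (95.0, 85.0),
    "table": (100.0, 90.0),
    "image": (100.0, 80.0),
}


def retention(original_count: int, converted_count: int) -> float:
    """Percentage of original items (characters, tables, images) found in the output."""
    if original_count == 0:
        return 100.0  # nothing to lose, so nothing was lost
    return 100.0 * converted_count / original_count


def classify_retention(metric: str, percent: float) -> str:
    """Map a retention percentage to 'pass', 'warn', or 'fail'."""
    pass_at, warn_floor = THRESHOLDS[metric]
    # Text passes strictly above 95%; tables and images need the full 100%.
    passed = percent > pass_at if metric == "text" else percent >= pass_at
    if passed:
        return "pass"
    if percent >= warn_floor:
        return "warn"
    return "fail"
```

For instance, a document with 10 tables of which 9 survive conversion has 90% table retention, which falls in the Warn band.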
Merge Outputs Manually
# Merge multiple markdown files
python scripts/merge_outputs.py output1.md output2.md -o merged.md
# Show segment attribution
python scripts/merge_outputs.py output1.md output2.md -o merged.md --verbose
Path Conversion (Windows/WSL)
# Windows → WSL conversion
python scripts/convert_path.py "C:\Users\name\Documents\file.pdf"
# Output: /mnt/c/Users/name/Documents/file.pdf
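The mapping shown above (drive letter to a lowercase `/mnt/<drive>` prefix, backslashes to forward slashes) can be sketched in a few lines. The function name `windows_to_wsl` is hypothetical, and `scripts/convert_path.py` may handle more cases, such as UNC paths, which this sketch rejects.

```python
import re


def windows_to_wsl(path: str) -> str:
    """Convert a drive-letter Windows path to its /mnt/<drive>/... WSL form."""
    match = re.fullmatch(r"([A-Za-z]):[\\/]*(.*)", path)
    if match is None:
        raise ValueError(f"not a drive-letter Windows path: {path!r}")
    drive, rest = match.groups()
    # Split on either separator and drop empty parts from doubled slashes.
    parts = [part for part in re.split(r"[\\/]+", rest) if part]
    return "/".join(["/mnt", drive.lower(), *parts])
```

Inside a WSL shell, the built-in `wslpath` utility performs the same conversion and is usually the safer choice when it is available.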
Common Issues
"No conversion tools available"
# Install all tools
pip install pymupdf4llm
uv tool install "markitdown[pdf]"
brew install pandoc
FontBBox warnings during PDF conversion
- These are harmless font-parsing warnings; the output is still correct
Images missing from output
- Use Heavy Mode for better image preservation
- Or extract images separately with scripts/extract_pdf_images.py
Tables broken in output
- Use Heavy Mode; it selects the most complete version of each table
- Or check table retention with scripts/validate_output.py
Bundled Scripts
| Script | Purpose |
|---|---|
| convert.py | Main orchestrator with Quick/Heavy mode |
| merge_outputs.py | Merge multiple markdown outputs |
| validate_output.py | Quality validation with HTML report |
| extract_pdf_images.py | PDF image extraction with metadata |
| convert_path.py | Windows to WSL path converter |
References
- references/heavy-mode-guide.md - Detailed Heavy Mode documentation
- references/tool-comparison.md - Tool capabilities comparison
- references/conversion-examples.md - Batch operation examples