How do I install doc-parser?

Run `npx skills add https://github.com/claude-office-skills/skills --skill doc-parser` in your terminal. You need to have run `npx skills init` once in your project first.

Which agent frameworks does doc-parser support?

doc-parser works with any agent framework supported by the Skills registry, including Claude Code, Cursor, GitHub Copilot, Cline, Codex, and Gemini CLI.

Is doc-parser free to use?

Yes. doc-parser is free to install and use. It is published by claude-office-skills and listed in the open explainx.ai skill registry.

Where can I read ratings and reviews for doc-parser?

Community ratings and review text appear on this explainx.ai skill page below the description. Reviews use a 1–5 scale and may include short written feedback from signed-in members.

doc-parser

claude-office-skills/skills · updated Apr 15, 2026

$ npx skills add https://github.com/claude-office-skills/skills --skill doc-parser

summary

This skill enables advanced document parsing using docling, IBM's state-of-the-art document understanding library. Parse complex PDFs, Word documents, and images while preserving structure, extracting tables and figures, and handling multi-column layouts.

skill.md

Document Parser Skill

Overview

This skill enables advanced document parsing using docling, IBM's state-of-the-art document understanding library. Parse complex PDFs, Word documents, and images while preserving structure, extracting tables and figures, and handling multi-column layouts.

How to Use

Provide the document to parse
Specify what you want to extract (text, tables, figures, etc.)
I'll parse it and return structured data

Example prompts:

"Parse this PDF and extract all tables"
"Convert this academic paper to structured markdown"
"Extract figures and captions from this document"
"Parse this report preserving the document structure"

Domain Knowledge

docling Fundamentals

from docling.document_converter import DocumentConverter

# Initialize converter
converter = DocumentConverter()

# Convert document
result = converter.convert("document.pdf")

# Access parsed content
doc = result.document
print(doc.export_to_markdown())

Supported Formats

Format	Extension	Notes
PDF	.pdf	Native and scanned
Word	.docx	Full structure preserved
PowerPoint	.pptx	Slides as sections
Images	.png, .jpg	OCR + layout analysis
HTML	.html	Structure preserved
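
The table above can be turned into a quick pre-flight check before any conversion is attempted. The following is a minimal sketch, assuming the extensions listed in the table are the only ones to accept; the names `SUPPORTED_EXTENSIONS`, `detect_format`, and `split_supported` are hypothetical helpers and not part of docling.

```python
from pathlib import Path

# Extensions copied from the Supported Formats table (illustrative, not exhaustive)
SUPPORTED_EXTENSIONS = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".pptx": "PowerPoint",
    ".png": "Images",
    ".jpg": "Images",
    ".html": "HTML",
}


def detect_format(path):
    """Return the format name for a path, or None if the extension is not listed."""
    return SUPPORTED_EXTENSIONS.get(Path(path).suffix.lower())


def split_supported(paths):
    """Partition paths into (supported, unsupported) lists, preserving order."""
    supported, unsupported = [], []
    for path in paths:
        (supported if detect_format(path) else unsupported).append(path)
    return supported, unsupported
```

Only the paths in the first list would then be handed to `converter.convert(...)`; the second list can be reported back to the user.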

Basic Usage

from docling.document_converter import DocumentConverter

# Create converter
converter = DocumentConverter()

# Convert single document
result = converter.convert("report.pdf")

# Access document
doc = result.document

# Export options
markdown = doc.export_to_markdown()
text = doc.export_to_text()
json_doc = doc.export_to_dict()
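
To keep the three exports around after a run, they can be written to disk. This is a sketch only: `save_exports` is a hypothetical helper, and it assumes the `doc` object offers the three export methods used in the snippet above.

```python
import json
from pathlib import Path


def save_exports(doc, out_dir, stem="document"):
    """Write markdown, plain-text, and JSON exports of a parsed document.

    `doc` is assumed to provide export_to_markdown(), export_to_text(),
    and export_to_dict(), as used in the Basic Usage snippet.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    paths = {
        "markdown": out / f"{stem}.md",
        "text": out / f"{stem}.txt",
        "json": out / f"{stem}.json",
    }
    paths["markdown"].write_text(doc.export_to_markdown(), encoding="utf-8")
    paths["text"].write_text(doc.export_to_text(), encoding="utf-8")
    # default=str keeps the dump from failing on values JSON cannot encode
    paths["json"].write_text(
        json.dumps(doc.export_to_dict(), indent=2, ensure_ascii=False, default=str),
        encoding="utf-8",
    )
    return paths
```

A call such as `save_exports(result.document, "out", stem="report")` would produce `out/report.md`, `out/report.txt`, and `out/report.json`.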

Advanced Configuration

from docling.document_converter import DocumentConverter, PdfFormatOption
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions

# Configure pipeline
pipeline_options = PdfPipelineOptions()
pipeline_options.do_ocr = True
pipeline_options.do_table_structure = True
pipeline_options.table_structure_options.do_cell_matching = True

# Create converter with options; PDF pipeline options are passed per input format
converter = DocumentConverter(
    allowed_formats=[InputFormat.PDF, InputFormat.DOCX],
    format_options={
        InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
    }
)

result = converter.convert("document.pdf")

Document Structure

# Document hierarchy
doc = result.document

# Access metadata
print(doc.name)
print(doc.origin)

# Iterate through content; iterate_items() yields (item, level) pairs
for element, level in doc.iterate_items():
    print(f"Label: {element.label}")

    # Only text-bearing items carry a .text attribute
    if hasattr(element, "text"):
        print(f"Text: {element.text}")

    if element.label == "table":
        print(f"Rows: {element.data.num_rows}")
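
A quick way to get a feel for a document's structure is to count its items by kind. The sketch below assumes that `doc.iterate_items()` yields `(item, level)` pairs and that each item exposes a `label` attribute, which is how docling v2 is understood to behave; `summarize_items` itself is a hypothetical helper.

```python
from collections import Counter


def summarize_items(items):
    """Count document items by label.

    `items` is an iterable of (item, level) pairs, which is what
    doc.iterate_items() is assumed to yield.
    """
    counts = Counter()
    for item, _level in items:
        label = getattr(item, "label", "unknown")
        # Labels may be enum members; use their value when one is present
        counts[str(getattr(label, "value", label))] += 1
    return dict(counts)
```

Calling `summarize_items(doc.iterate_items())` would return a mapping from label names to counts, which is handy for checking whether tables or pictures were detected at all.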

Extracting Tables

from docling.document_converter import DocumentConverter
import pandas as pd  # export_to_dataframe() returns a pandas DataFrame

def extract_tables(doc_path):
    """Extract all tables from document."""
    converter = DocumentConverter()
    result = converter.convert(doc_path)
    doc = result.document
    
    tables = []
    
    # doc.tables lists every table item found in the document
    for table in doc.tables:
        # Get table data
        table_data = table.export_to_dataframe()
        tables.append({
            'page': table.prov[0].page_no if table.prov else None,
            'dataframe': table_data
        })
    
    return tables

# Usage
tables = extract_tables("report.pdf")
for i, table in enumerate(tables):
    print(f"Table {i+1} on page {table['page']}:")
    print(table['dataframe'])
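
Once tables are extracted, a common next step is to persist them. The following sketch writes each table to its own CSV file; `save_tables_as_csv` and the file naming scheme are assumptions made for illustration, and the only requirement on each `dataframe` is a pandas-style `to_csv()` method.

```python
from pathlib import Path


def save_tables_as_csv(tables, out_dir):
    """Write each extracted table to its own CSV file and return the paths.

    `tables` is the list of {'page': ..., 'dataframe': ...} dicts returned by
    extract_tables(); each dataframe only needs a pandas-style to_csv() method.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    written = []
    for index, table in enumerate(tables, start=1):
        # Tables without provenance have no page number
        page = table["page"] if table["page"] is not None else "unknown"
        csv_path = out / f"table_{index}_page_{page}.csv"
        table["dataframe"].to_csv(csv_path, index=False)
        written.append(csv_path)
    return written
```

For example, `save_tables_as_csv(extract_tables("report.pdf"), "tables")` would leave one CSV per table in the `tables` directory.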

Extracting Figures

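import os

from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption

def extract_figures(doc_path, output_dir):
    """Extract figures with captions."""
    # Picture images are only kept when the pipeline is asked to generate them
    pipeline_options = PdfPipelineOptions()
    pipeline_options.generate_picture_images = True
    converter = DocumentConverter(
        format_options={
            InputFormat.PDF: PdfFormatOption(pipeline_options=pipeline_options)
        }
    )
    result = converter.convert(doc_path)
    doc = result.document
    
    figures = []
    os.makedirs(output_dir, exist_ok=True)
    
    # doc.pictures lists every picture item found in the document
    for picture in doc.pictures:
        figure_info = {
            'caption': picture.caption_text(doc) or None,
            'page': picture.prov[0].page_no if picture.prov else None,
        }
        
        # Save image if available
        image = picture.get_image(doc)
        if image is not None:
            img_path = os.path.join(output_dir, f"figure_{len(figures)+1}.png")
            image.save(img_path)
            figure_info['path'] = img_path
        
        figures.append(figure_info)
    
    return figures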
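
The list returned by `extract_figures` can be stored next to the saved images so that captions and page numbers are not lost. This is a minimal sketch; `write_figure_manifest` and the manifest layout are hypothetical and only rely on the `caption`, `page`, and `path` keys used above.

```python
import json
from pathlib import Path


def write_figure_manifest(figures, manifest_path):
    """Save figure metadata as JSON and return the number of saved images."""
    entries = [
        {
            "index": index,
            "caption": figure.get("caption"),
            "page": figure.get("page"),
            # 'path' is only present when an image was written to disk
            "path": figure.get("path"),
        }
        for index, figure in enumerate(figures, start=1)
    ]
    Path(manifest_path).write_text(
        json.dumps(entries, indent=2, ensure_ascii=False, default=str),
        encoding="utf-8",
    )
    return sum(1 for entry in entries if entry["path"])
```

The return value gives a quick check on how many figures actually had image data, as opposed to metadata only.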

Handling Multi-column Layouts

from docling.document_converter import DocumentConverter

def parse_multicolumn(doc_path):
    """Parse document with multi-column layout."""
    
    converter = DocumentConverter()
    result = converter.convert(doc_path)
    doc = result.document
    
    # docling automatically handles column detection
    # Text is returned in reading order
    
    structured_content = []
    
    for element, _level in doc.iterate_items():
        content_ite

Ratings

4.5 ★★★★★ · 70 reviews

★★★★★ Shikha Mishra · Dec 28, 2024
Useful defaults in doc-parser — fewer surprises than typical one-off scripts, and it plays nicely with `npx skills` flows.
★★★★★ Aditi Smith · Dec 12, 2024
We added doc-parser from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
★★★★★ Kabir Wang · Dec 12, 2024
doc-parser is among the better-maintained entries we tried; worth keeping pinned for repeat workflows.
★★★★★ Michael Flores · Dec 4, 2024
doc-parser reduced setup friction for our internal harness; good balance of opinion and flexibility.
★★★★★ Michael Haddad · Nov 23, 2024
We added doc-parser from the explainx registry; install was straightforward and the SKILL.md answered most questions upfront.
★★★★★ Aditi Martinez · Nov 3, 2024
doc-parser reduced setup friction for our internal harness; good balance of opinion and flexibility.
★★★★★ Ama Martin · Nov 3, 2024
Solid pick for teams standardizing on skills: doc-parser is focused, and the summary matches what you get after install.
★★★★★ Sophia Gupta · Oct 22, 2024
Registry listing for doc-parser matched our evaluation — installs cleanly and behaves as described in the markdown.
★★★★★ Kabir Tandon · Oct 22, 2024
doc-parser has been reliable in day-to-day use. Documentation quality is above average for community skills.
★★★★★ Ishan Rao · Oct 14, 2024
doc-parser fits our agent workflows well — practical, well scoped, and easy to wire into existing repos.
