paper-audit
bahayonghang/academic-writing-skills · updated Apr 8, 2026
Paper Audit Skill v4.2
paper-audit is now deep-review-first. Its core job is to behave like a serious reviewer: find technical, methodological, claim-level, and cross-section issues; keep script-backed findings separate from reviewer judgment; and return a structured issue bundle plus a revision roadmap.
Use it for audit and review. Do not use it as the first tool for source editing, sentence rewriting, or build fixing.
What This Skill Produces
- quick-audit: fast submission-readiness screen with script-backed findings
- deep-review: reviewer-style structured issue bundle with major/moderate/minor findings
- gate: PASS/FAIL decision calibrated for submission blockers
- re-audit: compare the current issue bundle against a previous audit
- polish: precheck-only handoff into a polishing workflow
The primary product is no longer just a score. For deep-review, the main outputs are:
- `final_issues.json`
- `overall_assessment.txt`
- `review_report.md`
- `revision_roadmap.md`
Do Not Use
- direct source surgery on `.tex`/`.typ` files
- compilation debugging as the main task
- free-form literature survey writing
- cosmetic grammar cleanup without an audit goal
Critical Rules
- Never rewrite the paper source unless the user explicitly switches to an editing skill.
- Never fabricate references, baselines, or reviewer evidence.
- Always distinguish `[Script]` from `[LLM]` findings.
- Always anchor reviewer findings to a quote, section, or exact textual location.
- Be conservative with OCR noise, formatting quirks, and obvious copy-editing trivia.
- Review like a careful reader: understand the author's intended meaning before flagging an issue.
Mode Selection
| Requested intent | Mode |
|---|---|
| "check my paper", "quick audit", "submission readiness" | quick-audit |
| "review my paper", "simulate peer review", "harsh review", "deep review" | deep-review |
| "is this ready to submit", "gate this submission", "blockers only" | gate |
| "did I fix these issues", "re-audit", "compare against old review" | re-audit |
| "polish the writing, but only if safe" | polish |
Legacy aliases still work for one compatibility cycle:
- `self-check` -> `quick-audit`
- `review` -> `deep-review`
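A minimal Python sketch of the routing table above. The keyword phrases and the deep-review fallback are illustrative assumptions; the real skill infers the mode from free-form requests, not from this exact matcher.

```python
# Hypothetical sketch of intent -> mode routing; keyword lists are
# abbreviated from the table above, and the fallback is an assumption.
LEGACY_ALIASES = {"self-check": "quick-audit", "review": "deep-review"}

INTENT_KEYWORDS = {
    "quick-audit": ["check my paper", "quick audit", "submission readiness"],
    "deep-review": ["review my paper", "peer review", "harsh review", "deep review"],
    "gate": ["ready to submit", "gate this submission", "blockers only"],
    "re-audit": ["did i fix", "re-audit", "old review"],
    "polish": ["polish"],
}

def infer_mode(request: str) -> str:
    """Resolve legacy aliases first, then match intent keywords."""
    text = request.lower().strip()
    if text in LEGACY_ALIASES:
        return LEGACY_ALIASES[text]
    for mode, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return mode
    return "deep-review"  # assumed default: the reviewer-style path
```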
Committee Focus Routing (deep-review)
For deep-review, use the Academic Pre-Review Committee by default. This is a 5-role review pass:
- Editor (desk-reject screen)
- Reviewer 1 (theory contribution)
- Reviewer 3 (literature dialogue / gap)
- Reviewer 2 (methodology transparency)
- Reviewer 4 (logic chain)
If the user requests a single dimension, run only the matching committee role(s).
If --focus ... is provided, it overrides keyword inference:
- `--focus full` (default)
- `--focus editor|theory|literature|methodology|logic`
Keyword map (English + Chinese):
- editor: "desk reject", "pre-screen", "editor", "EIC", "主编", "预筛", "初筛"
- theory: "theory", "contribution", "novelty", "theoretical dialogue", "理论", "贡献", "创新性"
- literature: "related work", "literature", "research gap", "citation", "文献", "综述", "Research Gap", "引用"
- methodology: "methods", "sample", "coding", "data", "design", "SRQR", "方法", "样本", "编码", "数据", "研究设计", "透明度"
- logic: "logic", "argument", "causal", "structure", "论证", "因果", "逻辑", "结构"
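The focus routing above can be sketched as a small lookup. The phrase lists here are abbreviated, and plain substring matching is an assumption about how the skill actually infers focus.

```python
# Illustrative routing for the keyword map above (abbreviated lists,
# assumed substring matching).
FOCUS_KEYWORDS = {
    "editor": ["desk reject", "pre-screen", "editor", "eic", "主编", "预筛", "初筛"],
    "theory": ["theory", "contribution", "novelty", "理论", "贡献", "创新性"],
    "literature": ["related work", "literature", "research gap", "citation",
                   "文献", "综述", "引用"],
    "methodology": ["methods", "sample", "coding", "data", "design", "srqr",
                    "方法", "样本", "数据"],
    "logic": ["logic", "argument", "causal", "structure",
              "论证", "因果", "逻辑", "结构"],
}

def infer_focus(request: str, focus_flag: str = None) -> list:
    """--focus overrides keyword inference; no match means the full committee."""
    if focus_flag:
        return list(FOCUS_KEYWORDS) if focus_flag == "full" else focus_flag.split("|")
    text = request.lower()
    hits = [role for role, keywords in FOCUS_KEYWORDS.items()
            if any(keyword in text for keyword in keywords)]
    return hits or list(FOCUS_KEYWORDS)  # nothing matched -> all five roles
```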
Output language: match the user's request language. If ambiguous, match the paper language.
Review Standard
Read these references before running reviewer-style work:
- `references/REVIEW_CRITERIA.md`
- `references/DEEP_REVIEW_CRITERIA.md`
- `references/CHECKLIST.md`
- `references/CONSOLIDATION_RULES.md`
- `references/ISSUE_SCHEMA.md`
The deep-review workflow uses a 16-part issue taxonomy:
- formula / derivation errors
- notation inconsistency
- prose vs formal object mismatch
- numerical inconsistency
- missing justification
- overclaim or claim inaccuracy
- ambiguity that can mislead a careful reader
- underspecified methods / missing information
- internal contradiction
- self-consistency of standards
- table structure violations
- abstract structural incompleteness
- theory contribution deficiency
- qualitative methodology opacity
- pseudo-innovation / straw man
- paragraph-level argument incoherence
Workflow
Common Step 0
Parse $ARGUMENTS and infer the mode if the user did not provide one. State the inferred mode before running commands if you had to infer it.
quick-audit
- Run: `uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode quick-audit ...`
- Present a concise report: Submission Blockers first, then Quality Improvements, then checklist items.
- Mark quick-audit findings with `[Script]` provenance.
- If the user clearly wants reviewer-depth critique after the quick screen, escalate to deep-review.
deep-review
Use this as the default reviewer-style path.
Phase 1: Prepare workspace
Run:
uv run python -B "$SKILL_DIR/scripts/prepare_review_workspace.py" <paper> --output-dir ./review_results
This creates:
- `full_text.md`
- `metadata.json`
- `section_index.json`
- `claim_map.json`
- `paper_summary.md`
- `sections/*.md`
- `comments/`
- `references/` (minimal copies for reviewer agents)
- `committee/` (committee reviewer artifacts)
Phase 2: Automated Phase 0 audit
Run:
uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode deep-review ...
Treat this as Phase 0 only. It supplies script-backed context and scores, not the final review.
Phase 3: Committee + Review Lanes
Phase 3A: Academic Pre-Review Committee (default)
Decide committee focus:
- If `--focus ...` is provided, use it.
- Otherwise infer from the user request using the keyword map in "Committee Focus Routing".
- If nothing matches, default to `full` (all five roles).
Dispatch the committee reviewers (in this exact order) and have them write artifacts into the workspace:
- `agents/committee_editor_agent.md` writes `committee/editor.md` and `comments/committee_editor.json`
- `agents/committee_theory_agent.md` writes `committee/theory.md` and `comments/committee_theory.json`
- `agents/committee_literature_agent.md` writes `committee/literature.md` and `comments/committee_literature.json`
- `agents/committee_methodology_agent.md` writes `committee/methodology.md` and `comments/committee_methodology.json`
- `agents/committee_logic_agent.md` writes `committee/logic.md` and `comments/committee_logic.json`
If subagents are unavailable, run the committee reviewers inline, but keep the same file outputs.
Then write `committee/consensus.md`:
- include: overall score (1-10), ordered priorities, and the top 3 issues to fix first
- scoring formula:
  - start at 9.0
  - subtract `1.5 * (# major) + 0.7 * (# moderate) + 0.2 * (# minor)`
  - floor at 1.0
  - if the Editor verdict is Desk Reject, cap the score at 4.0
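The scoring formula above can be sketched directly; only the rounding behavior is an assumption.

```python
def consensus_score(major: int, moderate: int, minor: int,
                    desk_reject: bool = False) -> float:
    """Committee consensus score per the formula above, on a 1.0-9.0 scale."""
    score = 9.0 - (1.5 * major + 0.7 * moderate + 0.2 * minor)
    score = max(score, 1.0)      # floor at 1.0
    if desk_reject:
        score = min(score, 4.0)  # an Editor desk-reject caps the score
    return round(score, 1)
```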
Note: `render_deep_review_report.py` automatically embeds `committee/*.md` into `review_report.md` when present.
Phase 3B: Section and cross-cutting review lanes (coverage)
Read:
- `references/SUBAGENT_TEMPLATES.md`
- `references/REVIEW_LANE_GUIDE.md`
Then dispatch reviewer tasks for:
- section lanes:
  - introduction / related work
  - methods
  - results
  - discussion / conclusion
  - appendix, if present
- cross-cutting lanes:
  - claims vs evidence
  - notation and numeric consistency
  - evaluation fairness and reproducibility
  - self-standard consistency
  - prior-art and novelty grounding
Each lane writes a JSON array into comments/.
If subagents are unavailable, use the built-in deterministic fallback lane pass in scripts/audit.py so the workflow still writes lane-compatible JSON into comments/ before consolidation.
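A hypothetical example of one lane's output file. The field names follow the Output Contract later in this document; the finding itself is invented purely to show the shape.

```python
import json
import pathlib

# Invented example finding: the fields follow the Output Contract schema,
# but the content is illustrative only.
lane_comments = [{
    "title": "missing baseline for the main comparison",
    "quote": "our method outperforms all baselines",
    "explanation": "No baseline numbers are reported for this setting.",
    "comment_type": "missing_information",
    "severity": "major",
    "confidence": "medium",
    "source_kind": "llm",
    "review_lane": "claims_vs_evidence",
}]

# Each lane writes a JSON array into comments/ inside the review workspace.
comments_dir = pathlib.Path("review_results/comments")
comments_dir.mkdir(parents=True, exist_ok=True)
(comments_dir / "claims_vs_evidence.json").write_text(
    json.dumps(lane_comments, indent=2, ensure_ascii=False))
```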
Phase 4: Consolidation
Run:
uv run python -B "$SKILL_DIR/scripts/consolidate_review_findings.py" <review_dir>
uv run python -B "$SKILL_DIR/scripts/verify_quotes.py" <review_dir> --write-back
uv run python -B "$SKILL_DIR/scripts/render_deep_review_report.py" <review_dir>
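What the quote-verification step amounts to can be sketched as a verbatim substring check; this is a sketch, not the script's actual implementation.

```python
# Sketch of the verify_quotes step: each finding's "quote" must appear
# verbatim in the extracted full_text.md, and the result is written back
# as the quote_verified flag.
def verify_quotes(issues: list, full_text: str) -> list:
    for issue in issues:
        issue["quote_verified"] = issue["quote"] in full_text
    return issues
```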
Consolidation rules:
- merge exact duplicates
- keep distinct paper-level consequences separate even if they share a root cause
- preserve singleton findings unless clearly false positive
- assign `comment_type`, `severity`, `confidence`, and `root_cause_key`
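A minimal sketch of the merge policy above: drop exact duplicates, but keep findings that share a root cause while describing distinct consequences. The real policy lives in `references/CONSOLIDATION_RULES.md` and `consolidate_review_findings.py`; the dedup key here is an assumption.

```python
# Assumed dedup key: same root cause, title, and quote means an exact
# duplicate; a shared root cause alone is not enough to merge.
def consolidate(findings: list) -> list:
    seen, merged = set(), []
    for finding in findings:
        key = (finding["root_cause_key"], finding["title"], finding["quote"])
        if key in seen:
            continue  # exact duplicate reported by another lane
        seen.add(key)
        merged.append(finding)
    return merged
```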
Phase 5: Present result
Summarize:
- 1 short paragraph overall assessment
- counts of major / moderate / minor issues
- 3 highest-priority revision items
- paths to `review_report.md` and `final_issues.json`
gate
- Run: `uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode gate ...`
- EIC Screening (Phase 0.5): read `agents/editor_in_chief_agent.md` and perform the editor-in-chief desk-reject screening on the paper's title, abstract, and introduction. This evaluates pitch quality, venue fit, fatal flaws, and presentation baseline. A desk-reject verdict is a gate blocker.
- Report PASS/FAIL.
- Present EIC screening results first (verdict + score + justification).
- List blockers next.
- Keep advisory items separate from blockers.
- For IEEE pseudocode checks, make it explicit which issues are mandatory and which are only IEEE-safe recommendations.
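The gate decision implied above reduces to a simple rule; the real `audit.py` logic and blocker taxonomy are richer than this sketch.

```python
def gate_verdict(desk_reject: bool, blockers: list) -> str:
    """An EIC desk-reject verdict or any open blocker fails the gate."""
    return "FAIL" if desk_reject or blockers else "PASS"
```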
re-audit
- Requires `--previous-report PATH`.
- Run: `uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode re-audit --previous-report <path> ...`
- If both old and new `final_issues.json` bundles are available, also run: `uv run python -B "$SKILL_DIR/scripts/diff_review_issues.py" <old_final_issues.json> <new_final_issues.json>`
- Present:
  - root-cause-aware status labels: `FULLY_ADDRESSED`, `PARTIALLY_ADDRESSED`, `NOT_ADDRESSED`, `NEW`
  - use structured prior issue bundles when available, but still accept Markdown previous reports
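One possible shape for the root-cause-aware diff, assuming issues are matched by `root_cause_key`. The partial-vs-not distinction below (a remaining major counts as NOT_ADDRESSED) is an assumed heuristic, not `diff_review_issues.py`'s actual rule.

```python
# Assumed matching: issues correspond across audits when they share a
# root_cause_key. The severity heuristic is illustrative only.
def diff_status(old_keys: set, new_issues: list) -> dict:
    """Label each root cause with a root-cause-aware status."""
    new_keys = {issue["root_cause_key"] for issue in new_issues}
    labels = {}
    for key in old_keys:
        if key not in new_keys:
            labels[key] = "FULLY_ADDRESSED"  # no longer reported
        else:
            severities = [issue["severity"] for issue in new_issues
                          if issue["root_cause_key"] == key]
            labels[key] = ("NOT_ADDRESSED" if "major" in severities
                           else "PARTIALLY_ADDRESSED")
    for key in new_keys - old_keys:
        labels[key] = "NEW"                  # introduced by the revision
    return labels
```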
polish
- Run the audit precheck: `uv run python -B "$SKILL_DIR/scripts/audit.py" <paper> --mode polish ...`
- If blockers exist, stop and report them.
- Only proceed into polishing if the precheck is safe.
Output Contract
For deep-review, the final issue schema is:
```json
{
  "title": "short issue title",
  "quote": "exact quote from paper",
  "explanation": "why this matters and what remains problematic",
  "comment_type": "methodology|claim_accuracy|presentation|missing_information",
  "severity": "major|moderate|minor",
  "confidence": "high|medium|low",
  "source_kind": "script|llm",
  "source_section": "methods",
  "related_sections": ["results", "appendix"],
  "root_cause_key": "shared-normalized-key",
  "review_lane": "claims_vs_evidence",
  "gate_blocker": false,
  "quote_verified": true
}
```
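A lightweight well-formedness check for one issue object against the contract above; this is a sketch, not the canonical `ISSUE_SCHEMA.md` validator.

```python
# Enumerated values copied from the contract above; the choice of which
# fields are treated as required is an assumption.
ALLOWED = {
    "comment_type": {"methodology", "claim_accuracy", "presentation",
                     "missing_information"},
    "severity": {"major", "moderate", "minor"},
    "confidence": {"high", "medium", "low"},
    "source_kind": {"script", "llm"},
}

def validate_issue(issue: dict) -> list:
    """Return a list of problems; an empty list means the issue is valid."""
    problems = [f"missing field: {field}"
                for field in ("title", "quote", "explanation", "root_cause_key")
                if not issue.get(field)]
    for field, allowed in ALLOWED.items():
        if issue.get(field) not in allowed:
            problems.append(f"bad value for {field}: {issue.get(field)!r}")
    if not isinstance(issue.get("gate_blocker"), bool):
        problems.append("gate_blocker must be a boolean")
    return problems
```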
Always prefer:
- exact quotes over vague paraphrase
- evidence-backed findings over style commentary
- issue bundle + roadmap over raw script dumps
References
| File | Purpose |
|---|---|
| `references/REVIEW_CRITERIA.md` | top-level audit scoring and mapping |
| `references/DEEP_REVIEW_CRITERIA.md` | deep-review-specific issue taxonomy (16 dimensions) and leniency rules |
| `references/CONSOLIDATION_RULES.md` | deduplication and root-cause merge policy |
| `references/ISSUE_SCHEMA.md` | canonical JSON schema |
| `references/REVIEW_LANE_GUIDE.md` | section lanes and cross-cutting lanes |
| `references/SUBAGENT_TEMPLATES.md` | reviewer task templates |
| `references/QUICK_REFERENCE.md` | CLI and mode cheat sheet |
Scripts
| Script | Purpose |
|---|---|
| `scripts/audit.py` | Phase 0 audit and mode entrypoint |
| `scripts/prepare_review_workspace.py` | create deep-review workspace |
| `scripts/build_claim_map.py` | extract headline claims and closure targets |
| `scripts/consolidate_review_findings.py` | deduplicate comment JSONs |
| `scripts/verify_quotes.py` | verify exact quote presence |
| `scripts/render_deep_review_report.py` | render final Markdown report |
| `scripts/diff_review_issues.py` | compare old vs new issue bundles |
Reviewer Lanes
Committee agents (deep-review default):
- `committee_editor_agent.md`
- `committee_theory_agent.md`
- `committee_literature_agent.md`
- `committee_methodology_agent.md`
- `committee_logic_agent.md`
Default deep-review lanes live in agents/:
- `section_reviewer_agent.md`
- `claims_evidence_reviewer_agent.md`
- `notation_consistency_reviewer_agent.md`
- `evaluation_fairness_reviewer_agent.md`
- `self_consistency_reviewer_agent.md`
- `prior_art_reviewer_agent.md`
- `synthesis_agent.md`
- `editor_in_chief_agent.md`: EIC desk-reject screener (used in `gate` mode)
Specialized deep-review agents (read their files for activation criteria):
- `critical_reviewer_agent.md`: devil's advocate with C3-C5 checks
- `domain_reviewer_agent.md`: domain expertise with A1-A7 assessments
- `methodology_reviewer_agent.md`: methodology rigor with B3-B10 checks
- `literature_reviewer_agent.md`: evidence-based literature verification (optional, `--literature-search`)
Examples
- “Review this manuscript like a serious conference reviewer and tell me the biggest validity risks.”
- “Run a quick audit on `paper.tex` and tell me what blocks submission.”
- “Gate this IEEE submission and separate blockers from recommendations.”
- “Re-audit this revision against my previous report.”