single-cell-rna-qc

anthropics/knowledge-work-plugins · updated Apr 8, 2026

$ npx skills add https://github.com/anthropics/knowledge-work-plugins --skill single-cell-rna-qc
summary

Automated QC workflow for single-cell RNA-seq data following scverse best practices.

skill.md

Single-Cell RNA-seq Quality Control

When to Use This Skill

Use when users:

  • Request quality control or QC on single-cell RNA-seq data
  • Want to filter low-quality cells or assess data quality
  • Need QC visualizations or metrics
  • Ask to follow scverse/scanpy best practices
  • Request MAD-based filtering or outlier detection

Supported input formats:

  • .h5ad files (AnnData format from scanpy/Python workflows)
  • .h5 files (10X Genomics Cell Ranger output)

Default recommendation: Use Approach 1 (complete pipeline) unless the user has specific custom requirements or explicitly requests non-standard filtering logic.

Approach 1: Complete QC Pipeline (Recommended for Standard Workflows)

For standard QC following scverse best practices, use the convenience script scripts/qc_analysis.py:

python3 scripts/qc_analysis.py input.h5ad
# or for 10X Genomics .h5 files:
python3 scripts/qc_analysis.py raw_feature_bc_matrix.h5

The script automatically detects the file format and loads it appropriately.
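
The detection logic inside qc_analysis.py is not shown here, so the following is only a sketch of how extension-based detection could look. The helper names detect_format and load_any are hypothetical; sc.read_h5ad and sc.read_10x_h5 are the standard scanpy readers for the two supported formats.

```python
from pathlib import Path


def detect_format(path):
    """Classify an input file by its extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".h5ad":
        return "h5ad"
    if suffix == ".h5":
        return "10x_h5"
    raise ValueError(f"Unsupported input format: {suffix or path}")


def load_any(path):
    """Load either supported format into an AnnData object."""
    import scanpy as sc  # imported lazily so detect_format works without scanpy

    if detect_format(path) == "h5ad":
        return sc.read_h5ad(path)
    adata = sc.read_10x_h5(path)
    adata.var_names_make_unique()  # 10x gene symbols can contain duplicates
    return adata
```

Dispatching on the extension keeps the check cheap, but it assumes files are named conventionally; the actual script may inspect file contents instead.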

When to use this approach:

  • Standard QC workflow with adjustable thresholds (all cells filtered the same way)
  • Batch processing multiple datasets
  • Quick exploratory analysis
  • User wants the "just works" solution

Requirements: anndata, scanpy, scipy, matplotlib, seaborn, numpy

Parameters:

Customize filtering thresholds and gene patterns using command-line parameters:

  • --output-dir - Output directory
  • --mad-counts, --mad-genes, --mad-mt - MAD thresholds for counts/genes/MT%
  • --mt-threshold - Hard mitochondrial % cutoff
  • --min-cells - Gene filtering threshold
  • --mt-pattern, --ribo-pattern, --hb-pattern - Gene name patterns for different species

Use --help to see current default values.
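
For batch processing, the flags listed above can be assembled programmatically. This is a minimal sketch, assuming the script is invoked from the skill's root directory; build_qc_command and run_qc are hypothetical helpers, and any values passed in are illustrative rather than the script's defaults. Whether the pattern flags expect a plain prefix or a regular expression is not stated here, so check --help first.

```python
import subprocess
import sys


def build_qc_command(input_path, output_dir=None, mt_threshold=None,
                     min_cells=None, mt_pattern=None):
    """Assemble the argument list for scripts/qc_analysis.py."""
    cmd = [sys.executable, "scripts/qc_analysis.py", str(input_path)]
    options = {
        "--output-dir": output_dir,
        "--mt-threshold": mt_threshold,
        "--min-cells": min_cells,
        "--mt-pattern": mt_pattern,
    }
    for flag, value in options.items():
        if value is not None:  # omitted options fall back to the script defaults
            cmd += [flag, str(value)]
    return cmd


def run_qc(input_path, **kwargs):
    """Run the QC script and raise if it exits with a non-zero status."""
    subprocess.run(build_qc_command(input_path, **kwargs), check=True)
```

A mouse dataset might then be processed with run_qc("mouse.h5ad", output_dir="qc_out", mt_pattern="mt-"), where the file name and output directory are placeholders.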

Outputs:

All files are saved to the <input_basename>_qc_results/ directory by default (or to the directory specified by --output-dir):

  • qc_metrics_before_filtering.png - Pre-filtering visualizations
  • qc_filtering_thresholds.png - MAD-based threshold overlays
  • qc_metrics_after_filtering.png - Post-filtering quality metrics
  • <input_basename>_filtered.h5ad - Clean, filtered dataset ready for downstream analysis
  • <input_basename>_with_qc.h5ad - Original data with QC annotations preserved

If copying outputs for user access, copy individual files (not the entire directory) so users can preview them directly.
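
The naming scheme above can be turned into a small helper for locating and copying the individual files. This sketch assumes that <input_basename> is the input file name without its extension and that the default directory is created relative to the working directory; neither detail is confirmed here. expected_outputs and copy_outputs are hypothetical names.

```python
from pathlib import Path
import shutil

PLOT_NAMES = [
    "qc_metrics_before_filtering.png",
    "qc_filtering_thresholds.png",
    "qc_metrics_after_filtering.png",
]


def expected_outputs(input_path, output_dir=None):
    """List the output paths the QC script is documented to produce."""
    stem = Path(input_path).stem  # assumption: basename means name without extension
    out_dir = Path(output_dir) if output_dir else Path(f"{stem}_qc_results")
    names = PLOT_NAMES + [f"{stem}_filtered.h5ad", f"{stem}_with_qc.h5ad"]
    return [out_dir / name for name in names]


def copy_outputs(input_path, destination, output_dir=None):
    """Copy each existing output file individually into destination."""
    destination = Path(destination)
    destination.mkdir(parents=True, exist_ok=True)
    copied = []
    for src in expected_outputs(input_path, output_dir):
        if src.exists():  # skip anything the run did not produce
            copied.append(Path(shutil.copy2(src, destination / src.name)))
    return copied
```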

Workflow Steps

The script performs the following steps:

  1. Calculate QC metrics - Count depth, gene detection, mitochondrial/ribosomal/hemoglobin content
  2. Apply MAD-based filtering - Permissive outlier detection using MAD thresholds for counts/genes/MT%
  3. Filter genes - Remove genes detected in few cells
  4. Generate visualizations - Comprehensive before/after plots with threshold overlays
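
The MAD-based step can be illustrated in plain Python. This is a sketch of the general technique, not the code in qc_core.py: a value is flagged when it lies more than n_mads median absolute deviations from the median. Applying the test to log-transformed counts is a common convention and an assumption here, as is the n_mads value of 5.

```python
import math
import statistics


def mad_outliers(values, n_mads):
    """Return one boolean per value: True if it is a MAD outlier."""
    median = statistics.median(values)
    # Median absolute deviation: the median distance from the median
    mad = statistics.median(abs(v - median) for v in values)
    lower = median - n_mads * mad
    upper = median + n_mads * mad
    return [v < lower or v > upper for v in values]


total_counts = [1200, 1500, 1800, 2100, 2400, 60000]
log_counts = [math.log1p(c) for c in total_counts]
mask = mad_outliers(log_counts, n_mads=5)
print(mask)  # [False, False, False, False, False, True]
```

Because the bounds come from the data's own median and spread, the same n_mads setting adapts to datasets with very different sequencing depths, which a fixed cutoff cannot do.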

Approach 2: Modular Building Blocks (For Custom Workflows)

For custom analysis workflows or non-standard requirements, use the modular utility functions from scripts/qc_core.py and scripts/qc_plotting.py:

# Run from scripts/ directory, or add scripts/ to sys.path if needed
import anndata as ad
from qc_core import calculate_qc_metrics, detect_outliers_mad, filter_cells
from qc_plotting import plot_qc_distributions  # Only if visualization needed

adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
# ... custom analysis logic here

When to use this approach:

  • Different workflow needed (skip steps, change order, apply different thresholds to subsets)
  • Conditional logic (e.g., filter neurons differently than other cells)
  • Partial execution (only metrics/visualization, no filtering)
  • Integration with other analysis steps in a larger pipeline
  • Custom filtering criteria beyond what command-line params support

Available utility functions:

From qc_core.py (core QC operations):

  • calculate_qc_metrics(adata, mt_pattern, ribo_pattern, hb_pattern, inplace=True) - Calculate QC metrics and annotate adata
  • detect_outliers_mad(adata, metric, n_mads, verbose=True) - MAD-based outlier detection, returns boolean mask
  • apply_hard_threshold(adata, metric, threshold, operator='>', verbose=True) - Apply hard cutoffs, returns boolean mask
  • filter_cells(adata, mask, inplace=False) - Apply boolean mask to filter cells
  • filter_genes(adata, min_cells=20, min_counts=None, inplace=True) - Filter genes by detection
  • print_qc_summary(adata, label='') - Print summary statistics

From qc_plotting.py (visualization):

  • plot_qc_distributions(adata, output_path, title) - Generate comprehensive QC plots
  • plot_filtering_thresholds(adata, outlier_masks, thresholds, output_path) - Visualize filtering thresholds
  • plot_qc_after_filtering(adata, output_path) - Generate post-filtering plots
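
To make the "filter genes by detection" idea behind filter_genes concrete, the sketch below counts, for each gene, the cells in which it has a non-zero count. It is written in plain Python on a small list-of-lists matrix for readability; genes_to_keep is a hypothetical name, and treating min_cells as an inclusive bound is an assumption.

```python
def genes_to_keep(counts, min_cells=20):
    """counts: one row per cell, one column per gene.

    Returns one boolean per gene: True if the gene is detected
    (count > 0) in at least min_cells cells.
    """
    n_genes = len(counts[0]) if counts else 0
    detected = [0] * n_genes
    for row in counts:
        for j, value in enumerate(row):
            if value > 0:
                detected[j] += 1
    return [n >= min_cells for n in detected]


# Three cells, three genes; the middle gene is never detected
toy_counts = [
    [1, 0, 3],
    [2, 0, 0],
    [0, 0, 5],
]
print(genes_to_keep(toy_counts, min_cells=2))  # [True, False, True]
```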

Example custom workflows:

Example 1: Only calculate metrics and visualize, don't filter yet

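import anndata as ad
from qc_core import calculate_qc_metrics, print_qc_summary
from qc_plotting import plot_qc_distributions
adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)
plot_qc_distributions(adata, 'qc_before.png', title='Initial QC')
print_qc_summary(adata, label='Before filtering')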

Example 2: Apply only MT% filtering, keep other metrics permissive

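import anndata as ad
from qc_core import calculate_qc_metrics, apply_hard_threshold, filter_cells

adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)

# Only filter high MT% cells (mask is True for cells above the cutoff)
high_mt = apply_hard_threshold(adata, 'pct_counts_mt', 10, operator='>')
adata_filtered = filter_cells(adata, ~high_mt)
adata_filtered.write('filtered.h5ad')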

Example 3: Different thresholds for different subsets

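import anndata as ad
import numpy as np
from qc_core import calculate_qc_metrics, apply_hard_threshold, filter_cells

adata = ad.read_h5ad('input.h5ad')
calculate_qc_metrics(adata, inplace=True)

# Apply type-specific QC (assumes cell_type metadata exists)
neurons = (adata.obs['cell_type'] == 'neuron').to_numpy()
other_cells = ~neurons

# Neurons tolerate higher MT%, other cells use stricter threshold
neuron_qc = apply_hard_threshold(adata[neurons], 'pct_counts_mt', 15, operator='>')
other_qc = apply_hard_threshold(adata[other_cells], 'pct_counts_mt', 8, operator='>')

# Merge per-subset masks into one full-length mask (assumes each mask aligns with its subset)
high_mt = np.zeros(adata.n_obs, dtype=bool)
high_mt[neurons] = np.asarray(neuron_qc)
high_mt[other_cells] = np.asarray(other_qc)
adata_filtered = filter_cells(adata, ~high_mt)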

Best Practices

  1. Be permissive with filtering - Default thresholds intentionally retain most cells to avoid losing rare populations
  2. Inspect visualizations - Always review before/after plots to ensure filtering makes biological sense
  3. Consider dataset-specific factors - Some tissues naturally have higher mitochondrial content (e.g., neurons, cardiomyocytes)
  4. Check gene annotations - Mitochondrial gene prefixes vary by species (mt- for mouse, MT- for human)
  5. Iterate if needed - QC parameters may need adjustment based on the specific experiment or tissue type
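
The species-prefix point is easy to get wrong silently, because a mismatched prefix simply yields 0% mitochondrial content instead of an error. The sketch below shows this with a hypothetical mito_percent helper and made-up gene counts; it uses a case-sensitive prefix match, which may differ from how the --mt-pattern option is interpreted.

```python
def mito_percent(gene_names, cell_counts, prefix="MT-"):
    """Percentage of one cell's counts from genes starting with prefix."""
    total = sum(cell_counts)
    if total == 0:
        return 0.0
    mito = sum(c for g, c in zip(gene_names, cell_counts) if g.startswith(prefix))
    return 100.0 * mito / total


human_genes = ["MT-CO1", "MT-ND1", "ACTB", "GAPDH"]
mouse_genes = ["mt-Co1", "mt-Nd1", "Actb", "Gapdh"]
counts = [30, 20, 100, 50]  # illustrative counts for a single cell

print(mito_percent(human_genes, counts, prefix="MT-"))  # 25.0
print(mito_percent(mouse_genes, counts, prefix="MT-"))  # 0.0 (wrong prefix for mouse)
print(mito_percent(mouse_genes, counts, prefix="mt-"))  # 25.0
```

If every cell in a dataset reports 0% mitochondrial content, a prefix mismatch is a likely cause and worth checking before any filtering.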

Reference Materials

For detailed QC methodology, parameter rationale, and troubleshooting guidance, see references/scverse_qc_guidelines.md. This reference provides:

  • Detailed explanations of each QC metric and why it matters
  • Rationale for MAD-based thresholds and why they're better than fixed cutoffs
  • Guidelines for interpreting QC visualizations (histograms, violin plots, scatter plots)
  • Species-specific considerations for gene annotations
  • When and how to adjust filtering parameters
  • Advanced QC considerations (ambient RNA correction, doublet detection)

Load this reference when users need deeper understanding of the methodology or when troubleshooting QC issues.

Next Steps After QC

Typical downstream analysis steps:

  • Ambient RNA correction (SoupX, CellBender)
  • Doublet detection (scDblFinder)
  • Normalization (log-normalize, scran)
  • Feature selection and dimensionality reduction
  • Clustering and cell type annotation
How to use single-cell-rna-qc on Cursor

1. Prerequisites

Before installing skills in Cursor, ensure your development environment meets these requirements:

  • Cursor installed and configured on your development machine
  • Node.js version 16.0+ with npm package manager (verify with node --version)
  • Active project directory or workspace where you want to add single-cell-rna-qc

2. Execute installation command

Run the skills CLI command in your project's root directory to begin installation:

$ npx skills add https://github.com/anthropics/knowledge-work-plugins --skill single-cell-rna-qc

The skills CLI fetches single-cell-rna-qc from the GitHub repository anthropics/knowledge-work-plugins and configures it for Cursor.

3. Select Cursor when prompted

The CLI shows a list of available agents. Use the arrow keys to navigate and space to select Cursor:

◆ Which agents do you want to install to?
│ ── Universal (.agents/skills) ── always included ────
│ • Amp
│ • Antigravity
│ • Cline
│ • Codex
│ ● Cursor (selected)
│ • Cursor
│ • Windsurf

4. Verify installation

Confirm successful installation by checking the skill directory location:

.cursor/skills/single-cell-rna-qc

Reload or restart Cursor to activate single-cell-rna-qc. Access the skill through slash commands (e.g., /single-cell-rna-qc) or your agent's skill management interface.

Security & Verification Notice

We perform automated surface-level scans (Gen AI Scanner, Socket, Snyk) during installation. These checks detect common vulnerabilities but do not guarantee complete security. Always review skill source code and verify the publisher's reputation before production use.

Skills execute code in your development environment. Always verify the publisher's identity, review recent commits, and test in isolated environments before production deployment.
