dbsnp-database

google-deepmind/science-skills · updated Jun 4, 2026

$ npx skills add https://github.com/google-deepmind/science-skills --skill dbsnp-database

name: dbsnp-database
description: Use when you want to look up, map, and search for short genetic variants (SNPs, indels) in NCBI's dbSNP database. Resolves between rsIDs, genomic coordinates in VCF format, and HGVS strings. For an rsID, returns variant type, gene associations, clinical significance, allele frequencies, and genomic coordinates (GRCh38).

dbSNP Database Integration

Prerequisites

  1. uv: Read the uv skill and follow its Setup instructions to ensure uv is installed and on PATH.

  2. User Notification: If LICENSE_NOTIFICATION.txt does not already exist in this skill directory then (1) prominently notify the user to check the terms at https://www.ncbi.nlm.nih.gov/snp/, then (2) create the file recording the notification text and timestamp.

  3. .env file: Make sure the .env file exists in your home directory. Create one if it does not exist.

  4. NCBI_API_KEY (optional): Raises the NCBI rate limit from 3 to 10 requests/second. The skill works without it, but a key is recommended if the user plans many queries or encounters a 429 error. The user can obtain one for free by registering at https://www.ncbi.nlm.nih.gov/account/settings/. If the variable is missing from .env, do NOT ask the user to paste it into the chat (this would leak the key into the agent's context). Instead, give the user this command — substituting ENV_FILE with the resolved literal path to the .env file:

    printf "Enter NCBI API key (typing hidden): " && read -s key && echo && echo "NCBI_API_KEY=$key" >> "ENV_FILE" && echo "Saved."
    

    The scripts load credentials automatically via dotenv. NEVER read, print, or inspect the .env file or its variables (e.g. no cat, grep, echo, printenv, or os.environ.get on keys). Credentials must stay out of the agent's context. See the NCBI API Key and Rate Limiting section for more details.
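
The one-time notification in prerequisite 2 can be sketched in Python as follows. This is a minimal illustration, not part of the skill: the function name `ensure_license_notification` and the wording of `NOTICE` are assumptions; only the file name and the URL come from the text above.

```python
from datetime import datetime, timezone
from pathlib import Path

# Hypothetical wording; the skill only requires that the user is pointed at this URL.
NOTICE = "Please check the dbSNP terms of use at https://www.ncbi.nlm.nih.gov/snp/"


def ensure_license_notification(skill_dir: Path) -> bool:
    """Create LICENSE_NOTIFICATION.txt on first use.

    Returns True when the file was just created, meaning the caller should
    show NOTICE to the user now. Returns False when the file already existed.
    """
    marker = skill_dir / "LICENSE_NOTIFICATION.txt"
    if marker.exists():
        return False  # the user was notified on an earlier run
    stamp = datetime.now(timezone.utc).isoformat()
    # Record both the timestamp and the notification text, as the prerequisite asks.
    marker.write_text(f"{stamp}\n{NOTICE}\n", encoding="utf-8")
    return True
```

The return value lets the caller decide how prominently to display the notice; the file itself only records that it happened.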

Core Rules

  • Use the Wrapper: ALWAYS execute the provided wrapper script scripts/dbsnp_cli.py to query the database rather than constructing custom HTTP or curl requests. The script automatically handles rate limiting, retries, and JSON parsing.
  • Command Choice: Do NOT use search-region to find the rsID of a specific variant; use resolve-variant instead.
  • Output Size: Avoid using --full on get-variant unless specifically needed, as raw payloads can exceed 1 MB.
  • Shell Safety: Always wrap HGVS strings in single quotes to prevent shell expansion errors.
  • Notification: Whenever this skill is used, say so in the output.

When to Use

Use this skill when you need to:

  • Map a genomic variant to its canonical rsID (from VCF coordinates or HGVS notation).
  • Retrieve summary data for an rsID: variant type, gene associations, clinical significance, and population allele frequencies.
  • Convert an rsID back to genomic coordinates on a specific assembly.
  • Find all known variants within a chromosomal region.

Do NOT use when you need to:

  • Obtain clinical pathogenicity classifications with submitter rationales (use clinvar-database).
  • Get precise population-level allele frequencies stratified by ancestry (use gnomad-database).
  • Predict the functional effect of a novel mutation (use alphagenome-single-variant-analysis).
  • View 3D protein structures affected by a variant (use alphafold-database-fetch-and-analyze / pdb-database).

Command Selection Guide

Pick the right command on the first try. Match the user's input to the correct subcommand below — one command call is almost always sufficient.

| User gives you… | Run this command |
| --- | --- |
| An rsID (e.g. rs7412, rs268) | get-variant |
| Genomic coordinates: chrom pos ref alt (e.g. 8 19962213 C T) | resolve-variant |
| An HGVS string (e.g. NC_000008.11:g.19962213del) | resolve-hgvs |
| An rsID and they want coordinates back | resolve-rsid |
| A chromosomal region (chrom start end) | search-region |

[!CAUTION] Do NOT use search-region to find the rsID of a specific variant. If the user provides a chromosome, position, reference allele, and alternate allele (four values), use resolve-variant — it is a direct, single-API-call lookup. search-region is only for surveying all variants within a positional range and returns hundreds/thousands of results.
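
The selection guide can be expressed as a small dispatcher. The sketch below is illustrative only: `choose_subcommand` is a hypothetical helper, and its input heuristics (a single token containing a colon is treated as HGVS, token counts decide between coordinates and regions) are assumptions rather than rules the CLI enforces.

```python
import re

# Accepts both "rs268" and "268", matching what get-variant accepts.
RSID = re.compile(r"^(rs)?\d+$", re.IGNORECASE)


def choose_subcommand(user_input: str, want_coordinates: bool = False) -> str:
    """Map the shape of the user's input to a dbsnp_cli.py subcommand."""
    tokens = user_input.split()
    if len(tokens) == 1:
        token = tokens[0]
        if RSID.match(token):
            return "resolve-rsid" if want_coordinates else "get-variant"
        if ":" in token:  # assumed HGVS shape, e.g. NC_000008.11:g.19962213del
            return "resolve-hgvs"
    if len(tokens) == 4 and tokens[1].isdigit():
        return "resolve-variant"  # chrom pos ref alt
    if len(tokens) == 3 and tokens[1].isdigit() and tokens[2].isdigit():
        return "search-region"  # chrom start end
    raise ValueError(f"Cannot classify input: {user_input!r}")


if __name__ == "__main__":
    for example in ("rs7412", "8 19962213 C T", "7 117100000 117300000"):
        print(example, "->", choose_subcommand(example))
```

Four values always route to resolve-variant, never to search-region, which mirrors the caution above.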

Quick Start

# Look up variant rs7412: type, gene, clinical significance, MAF
uv run scripts/dbsnp_cli.py get-variant rs7412 --output /tmp/rs7412.json

# Find the rsID for a variant at chr8:19962213 C>T
uv run scripts/dbsnp_cli.py resolve-variant 8 19962213 C T \
  --output /tmp/resolve.json

All subcommands write JSON to disk. Always pass --output explicitly and save under /tmp/; if the flag is omitted, the documented default is /tmp/dbsnp_output.json.

Commands

1. get-variant — Fetch Variant Record

Retrieve the RefSNP record for one rsID. By default the output is abbreviated to the most useful fields. Both rs268 and 268 are accepted.

uv run scripts/dbsnp_cli.py get-variant rs268 --output /tmp/rs268.json
uv run scripts/dbsnp_cli.py get-variant 268 --assembly GCF_000001405.40 \
  --output /tmp/rs268.json

Arguments:

  • rsid (positional, required): The RefSNP identifier.
  • --assembly: RefSeq assembly accession (default: GCF_000001405.40 = GRCh38).
  • --full: Return the complete raw JSON payload — see warning below.
  • --output: Output file path (default: /tmp/dbsnp_output.json).

Abbreviated output fields:

  • refsnp_id: Numeric rsID
  • variant_type: e.g. snv, ins, del, delins
  • genes: Sorted list of gene symbols (locus names)
  • clinical_significances: List of clinical significance labels
  • minor_allele_frequencies: Study name, allele count, total count
  • placements: Genomic placements for the requested assembly
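
Once get-variant has written its file, the abbreviated record can be condensed further. The following is a sketch under stated assumptions: the top-level field names come from the list above, but the keys inside each minor_allele_frequencies entry (`study`, `allele_count`, `total_count`) are hypothetical, since only their meaning is documented here. Check a real output file before relying on them.

```python
import json
from pathlib import Path


def summarize(record: dict) -> dict:
    """Condense an abbreviated get-variant record into a few headline values."""
    freqs = {}
    for entry in record.get("minor_allele_frequencies", []):
        # Assumed key names: "study", "allele_count", "total_count".
        total = entry.get("total_count") or 0
        if total:  # skip studies with no observations to avoid dividing by zero
            freqs[entry.get("study", "unknown")] = entry.get("allele_count", 0) / total
    return {
        "rsid": f"rs{record['refsnp_id']}",
        "type": record.get("variant_type"),
        "genes": record.get("genes", []),
        "clinical": record.get("clinical_significances", []),
        "maf_by_study": freqs,
    }


def load_and_summarize(path: str) -> dict:
    """Read a file written via --output and summarize it."""
    return summarize(json.loads(Path(path).read_text(encoding="utf-8")))
```

For example, `load_and_summarize("/tmp/rs268.json")` would return a dictionary with the rsID, variant type, gene list, clinical labels, and one frequency per study.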

[!WARNING] About --full: The raw RefSNP payload is typically 50–500 KB and can exceed 1 MB for clinically significant variants with many submissions. Only use --full when you specifically need data absent from the abbreviated output — for example:

  • The complete HGVS nomenclature across every transcript and protein isoform.
  • Full submission history with individual submitter details and timestamps.
  • Population-level allele frequency breakdowns by sub-population within a study (e.g. per-population gnomAD counts).
  • The full set of genomic placements across multiple assemblies (GRCh37 and GRCh38 simultaneously).
  • Merge history showing which older rsIDs were merged into this one.

2. resolve-variant — Genomic Coordinates → rsID

Determine the rsID(s) for a variant given its genomic coordinates (chromosome, position, reference allele, alternate allele). This is the command to use when the user provides a variant as space-separated coordinates like 8 19962213 C T.

uv run scripts/dbsnp_cli.py resolve-variant 8 19962213 C T \
  --output /tmp/resolve.json

Arguments:

  • chrom (positional): Chromosome number (e.g. 8) or RefSeq sequence accession (e.g. NC_000008.11). Chromosomes X and Y must be passed as their numeric equivalents: 23 for X and 24 for Y.
  • pos (positional): 1-based genomic position.
  • ref (positional): Reference allele (e.g. C).
  • alts (positional): Alternate allele(s), comma-separated (e.g. T, or T,G for two alternates).
  • --assembly: RefSeq assembly accession (default: GCF_000001405.40).
  • --output: Output file path (default: /tmp/dbsnp_output.json).

Output: {"rsids": ["12345", "67890"]}
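
When resolve-variant is driven from Python, the argument rules above can be applied while building the command. This is a minimal sketch: `resolve_variant_argv` is a hypothetical helper, and it only encodes what the argument list states (X and Y become 23 and 24, multiple alternates are comma-joined).

```python
from typing import List, Sequence, Union

# The CLI expects numeric equivalents for the sex chromosomes.
SEX_CHROMOSOMES = {"X": "23", "Y": "24"}


def resolve_variant_argv(
    chrom: Union[str, int],
    pos: int,
    ref: str,
    alts: Sequence[str],
    output: str,
) -> List[str]:
    """Build the argument list for a resolve-variant call."""
    chrom_arg = SEX_CHROMOSOMES.get(str(chrom).upper(), str(chrom))
    return [
        "uv", "run", "scripts/dbsnp_cli.py", "resolve-variant",
        chrom_arg, str(pos), ref, ",".join(alts),
        "--output", output,
    ]


if __name__ == "__main__":
    print(resolve_variant_argv(8, 19962213, "C", ["T"], "/tmp/resolve.json"))
```

The resulting list can be handed to `subprocess.run(argv, check=True)`, which bypasses the shell and so needs no quoting.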

3. resolve-rsid — rsID → Genomic Coordinates

Get the genomic placement (sequence ID and allele details) for a known rsID on a specific assembly.

uv run scripts/dbsnp_cli.py resolve-rsid rs7412 --output /tmp/coords.json

Arguments:

  • rsid (positional): The RefSNP identifier.
  • --assembly: RefSeq assembly accession (default: GCF_000001405.40).
  • --output: Output file path (default: /tmp/dbsnp_output.json).

Output: {"rsid": "7412", "assembly": "...", "placements": [...]}

4. resolve-hgvs — HGVS → rsID

Find the rsID(s) corresponding to an HGVS expression.

uv run scripts/dbsnp_cli.py resolve-hgvs 'NC_000008.11:g.19962213del' \
  --output /tmp/hgvs.json

Arguments:

  • hgvs (positional): The HGVS string.
  • --assembly: RefSeq assembly accession (default: GCF_000001405.40).
  • --output: Output file path (default: /tmp/dbsnp_output.json).

Output: {"rsids": ["12345"]}

[!TIP] HGVS strings often contain characters that shells interpret, such as the greater-than sign in substitutions like T>C (treated as output redirection) or parentheses. Always wrap them in single quotes so the shell passes them through unchanged.
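
If the command line is assembled in Python, the standard library can apply the quoting. The sketch below uses `shlex.join`, which quotes only the arguments that contain characters it considers unsafe, such as `>`; the helper name `hgvs_command` is hypothetical.

```python
import shlex


def hgvs_command(hgvs: str, output: str) -> str:
    """Return a shell-safe resolve-hgvs command line as a single string."""
    argv = [
        "uv", "run", "scripts/dbsnp_cli.py", "resolve-hgvs",
        hgvs, "--output", output,
    ]
    # shlex.join applies shlex.quote to every argument.
    return shlex.join(argv)


if __name__ == "__main__":
    print(hgvs_command("NC_000019.10:g.44908684T>C", "/tmp/hgvs.json"))
    # prints: uv run scripts/dbsnp_cli.py resolve-hgvs 'NC_000019.10:g.44908684T>C' --output /tmp/hgvs.json
```

Quoting by hand remains the simpler rule when typing commands directly, since always adding single quotes is harmless.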

5. search-region — Regional Variant Search

Find all rsIDs within a bounded chromosomal region.

uv run scripts/dbsnp_cli.py search-region 7 117100000 117300000 \
  --output /tmp/region.json

Arguments:

  • chrom (positional): Chromosome (e.g. 7). Use 23 for chromosome X and 24 for chromosome Y.
  • start (positional): Start position.
  • end (positional): End position.
  • --retmax: Maximum rsIDs to return (default: 500, ceiling: 5000).
  • --output: Output file path (default: /tmp/dbsnp_output.json).

Output:

{
  "rsids": ["12345", "67890", "..."],
  "returned": 500,
  "total_available": 1423,
  "truncated": true,
  "note": "Only 500 of 1423 variants returned.  Increase --retmax ..."
}

When total_available exceeds the returned count, the output includes a truncated flag and a note. Increase --retmax to retrieve more (up to 5000).
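
The truncation check can be automated when deciding whether to rerun the search. This is a minimal sketch: `next_retmax` is a hypothetical helper that reads the fields shown in the output above and respects the documented ceiling.

```python
from typing import Optional

RETMAX_CEILING = 5000  # documented upper bound for --retmax


def next_retmax(result: dict) -> Optional[int]:
    """Suggest a --retmax for a rerun, or None if nothing was truncated."""
    if not result.get("truncated"):
        return None
    # Ask for everything available, but never more than the ceiling allows.
    return min(result["total_available"], RETMAX_CEILING)


if __name__ == "__main__":
    example = {"returned": 500, "total_available": 1423, "truncated": True}
    print(next_retmax(example))  # prints: 1423
```

If total_available is above the ceiling, a single call cannot return everything; splitting the region into smaller ranges is one option in that case.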

Typical Workflows

Identify a known variant from coordinates

# Step 1: Map VCF coordinates to rsID
uv run scripts/dbsnp_cli.py resolve-variant 19 44908684 T C \
  --output /tmp/step1.json

# Step 2: Fetch the abbreviated record for the resolved rsID
uv run scripts/dbsnp_cli.py get-variant <rsid_from_step1> \
  --output /tmp/step2.json
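
The hand-off between the two steps can be sketched in Python. The helper `get_variant_argvs` is hypothetical; it relies only on the documented `{"rsids": [...]}` output shape and on get-variant accepting rsIDs with or without the rs prefix.

```python
import json
from pathlib import Path
from typing import List


def get_variant_argvs(resolve_output: str, out_dir: str = "/tmp") -> List[List[str]]:
    """Build one get-variant command per rsID found in a resolve-variant result."""
    payload = json.loads(Path(resolve_output).read_text(encoding="utf-8"))
    argvs = []
    for rsid in payload.get("rsids", []):
        rsid = str(rsid)
        if not rsid.lower().startswith("rs"):
            rsid = f"rs{rsid}"  # normalize so output file names are consistent
        argvs.append([
            "uv", "run", "scripts/dbsnp_cli.py", "get-variant",
            rsid, "--output", f"{out_dir}/{rsid}.json",
        ])
    return argvs
```

Each returned list can be passed to `subprocess.run`. An empty list means step 1 found no rsID, so there is nothing to fetch.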

Survey variants in a gene region

# Step 1: Find all variants in a region spanning the CFTR gene
uv run scripts/dbsnp_cli.py search-region 7 117100000 117300000 \
  --retmax 1000 --output /tmp/region.json

# Step 2: Retrieve details on individual rsIDs of interest
uv run scripts/dbsnp_cli.py get-variant <rsid> --output /tmp/detail.json

Translate HGVS notation to genomic coordinates

# Step 1: Get the rsID for an HGVS expression
uv run scripts/dbsnp_cli.py resolve-hgvs 'NC_000019.10:g.44908684T>C' \
  --output /tmp/hgvs.json

# Step 2: Resolve that rsID to VCF-style coordinates
uv run scripts/dbsnp_cli.py resolve-rsid <rsid> --output /tmp/coords.json

Assembly Defaults and Automatic Fallback

The Variation Services endpoints (used by get-variant, resolve-variant, resolve-rsid, resolve-hgvs) expect a RefSeq assembly accession. The RefSeq accession for GRCh38 is GCF_000001405.40, and for GRCh37 it is GCF_000001405.25.

The search-region subcommand always searches GRCh38 positions.

[!IMPORTANT] Automatic assembly fallback: The resolve-variant and resolve-hgvs commands automatically try GRCh38 first. If no rsIDs are found, they retry with GRCh37 before reporting failure. When a fallback occurs the output JSON includes a "note" field explaining which assembly succeeded. You do NOT need to manually retry with a different assembly — the script handles this transparently.

You only need to override --assembly when you specifically want to restrict the lookup to one assembly (e.g. because the user's coordinates are known to be GRCh37).
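
To make the fallback order concrete, here is a sketch of the logic as described above. It is not the script's actual code: `resolve_with_fallback`, the `lookup` callable, and the wording of the note are all assumptions made for illustration.

```python
from typing import Callable, List

GRCH38 = "GCF_000001405.40"
GRCH37 = "GCF_000001405.25"


def resolve_with_fallback(lookup: Callable[[str], List[str]]) -> dict:
    """Try GRCh38 first, then GRCh37, and report which assembly matched.

    `lookup` takes an assembly accession and returns the rsIDs found on it.
    """
    for assembly in (GRCH38, GRCH37):
        rsids = lookup(assembly)
        if rsids:
            result = {"rsids": rsids}
            if assembly != GRCH38:
                # Hypothetical note text; the real script's wording may differ.
                result["note"] = f"No match on {GRCH38}; resolved on {assembly}."
            return result
    return {"rsids": []}
```

Passing `--assembly` explicitly corresponds to calling `lookup` once with that accession and skipping the loop.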

NCBI API Key and Rate Limiting

Without an API key the script is limited to 3 requests per second. With a key this increases to 10 requests per second.

uv run scripts/dbsnp_cli.py get-variant rs268 --output /tmp/out.json

If a RateLimitError is raised, pause execution and follow the prerequisite instructions to help the user add NCBI_API_KEY to the .env file. See references/api-notes.md for details.
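
The two rate limits translate directly into a lower bound on batch duration, which can help decide whether a key is worth requesting. This is a back-of-the-envelope sketch; `min_batch_seconds` is a hypothetical helper and ignores network latency and retries.

```python
def min_batch_seconds(n_requests: int, has_api_key: bool) -> float:
    """Lower bound on wall-clock time for a batch under the NCBI rate limit."""
    rate = 10 if has_api_key else 3  # requests per second, as stated above
    return n_requests / rate


if __name__ == "__main__":
    print(min_batch_seconds(300, has_api_key=False))  # prints: 100.0
    print(min_batch_seconds(300, has_api_key=True))   # prints: 30.0
```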

Troubleshooting HTTP 500 Errors

Reference Allele Mismatch

If you receive an HTTP 500 error with a message detailing that the asserted reference allele is not equal to the reference sequence:

What it means: The coordinate position is likely valid, but the reference allele (ref) you provided does not match the base at that position in the requested assembly.

Action:

  1. DO NOT RETRY the exact same query mechanically.
  2. Check the assembly: Coordinates are assembly-specific.
  3. Switch assembly: If you were querying GRCh37, try GRCh38 (using --assembly GCF_000001405.40), or if querying GRCh38, try GRCh37 (using --assembly GCF_000001405.25).
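
Switching assembly for the retry can be sketched as a small transformation of the failed command. The helper `with_other_assembly` is hypothetical; the two accessions are the ones listed in this document.

```python
from typing import List

GRCH38 = "GCF_000001405.40"
GRCH37 = "GCF_000001405.25"
OTHER = {GRCH38: GRCH37, GRCH37: GRCH38}


def with_other_assembly(argv: List[str], current: str = GRCH38) -> List[str]:
    """Return a copy of argv that targets the assembly not yet tried."""
    if current not in OTHER:
        raise ValueError(f"Unknown assembly accession: {current}")
    cleaned = []
    skip_next = False
    for arg in argv:
        if skip_next:  # drop the value that followed an old --assembly flag
            skip_next = False
            continue
        if arg == "--assembly":
            skip_next = True
            continue
        cleaned.append(arg)
    return cleaned + ["--assembly", OTHER[current]]
```

The default for `current` matches the CLI default, so a command that carried no `--assembly` flag is retried on GRCh37.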

Common Mistakes

  • Mistake: Forgetting to quote HGVS strings
    Fix: Wrap in single quotes: 'NC_000008.11:g.19962213del'

  • Mistake: Passing a chromosome name to resolve-variant instead of a sequence accession
    Fix: Use the numeric chromosome ID (e.g. 8) or a RefSeq accession like NC_000008.11

  • Mistake: Using --full on get-variant without needing it
    Fix: The abbreviated output covers most use cases; --full returns 50–500 KB+ of JSON

  • Mistake: Expecting search-region to return all results by default
    Fix: The default --retmax is 500; check total_available in the output to see if results were truncated

  • Mistake: Using GRCh37 coordinates with search-region
    Fix: search-region always uses GRCh38 positions; lift over coordinates first if starting from GRCh37

  • Mistake: Manually retrying resolve-variant or resolve-hgvs with a different --assembly when the first call fails
    Fix: The script automatically tries GRCh38 then GRCh37; a single call is sufficient

  • Mistake: Passing X or Y as the chromosome value
    Fix: Use the numeric equivalents: 23 for chromosome X and 24 for chromosome Y. The CLI treats chromosomes numerically by default.

How to use dbsnp-database on Cursor

1. Prerequisites

Before installing skills in Cursor, ensure your development environment meets these requirements:

  • Cursor installed and configured on your development machine
  • Node.js version 16.0+ with npm package manager (verify with node --version)
  • Active project directory or workspace where you want to add dbsnp-database

2. Execute installation command

Execute the skills CLI command in your project's root directory to begin installation:

$ npx skills add https://github.com/google-deepmind/science-skills --skill dbsnp-database

The skills CLI fetches dbsnp-database from the GitHub repository google-deepmind/science-skills and configures it for Cursor.

3. Select Cursor when prompted

The CLI will show a list of available agents. Use arrow keys to navigate and space to select Cursor:

◆ Which agents do you want to install to?
│ ── Universal (.agents/skills) ── always included ────
│ • Amp
│ • Antigravity
│ • Cline
│ • Codex
│ ● Cursor (selected)
│ • Windsurf

4. Verify installation

Confirm successful installation by checking the skill directory location:

.cursor/skills/dbsnp-database

Reload or restart Cursor to activate dbsnp-database. Access the skill through slash commands (e.g., /dbsnp-database) or your agent's skill management interface.
