DINO-X

by idea-research

Empower LLMs with fine-grained visual understanding — detect, localize, and describe anything in images with natural language prompts.

GitHub stars: 112

  • Fine-grained object detection and localization
  • Structured JSON outputs with coordinates
  • Multiple transport modes (local/cloud)

best for

  • Building visual AI applications and chatbots
  • Automating visual inspection workflows
  • Creating multimodal reasoning systems

capabilities

  • Detect objects in images using natural language queries
  • Generate region-level descriptions of image areas
  • Count and locate specific objects with coordinates
  • Analyze full images for detailed understanding
  • Create annotated visualizations with bounding boxes
  • Process images from local files or web URLs

what it does

Provides AI-powered object detection and visual analysis in images using natural language prompts. Works with local files or web URLs to find, locate, and describe specific objects or regions.

about

DINO-X is a community-built MCP server published by idea-research that provides AI assistants with tools and capabilities via the Model Context Protocol. It lets you detect, localize, and describe anything in images with natural language prompts. It is categorized under ai-ml.

how to install

You can install DINO-X in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. By default, this server runs locally on your machine via the stdio transport; the readme also describes a hosted Streamable HTTP option.

license

Apache-2.0

DINO-X is released under the Apache-2.0 license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

readme

DINO-X MCP Server

DINO-X Official MCP Server — powered by the DINO-X and Grounding DINO models — brings fine-grained object detection and image understanding to your multimodal applications.

Demo video: https://dds-frontend.oss-cn-shenzhen.aliyuncs.com/dinox-mcp/dinox-mcp-en-overveiw.mp4

Why DINO-X MCP?

With DINO-X MCP, you can:

  • Fine-Grained Understanding: Full image detection, object detection, and region-level descriptions.

  • Structured Outputs: Get object categories, counts, locations, and attributes for VQA and multi-step reasoning tasks.

  • Composable: Works seamlessly with other MCP servers to build end-to-end visual agents or automation pipelines.

Transport Modes

DINO-X MCP supports two transport modes:

| Feature | STDIO (default) | Streamable HTTP |
|---|---|---|
| Runtime | Local | Local or Cloud |
| Transport | Standard I/O | HTTP (streaming responses) |
| Input source | file:// and https:// | https:// only |
| Visualization | Supported (saves annotated images locally) | Not supported (for now) |

Quick Start

1. Prepare an MCP client

Any MCP-compatible client works.

2. Get your API key

Apply on the DINO-X platform: Request API Key (new users get free quota).

3. Configure MCP

Option A: Official Hosted Streamable HTTP (Recommended)

Add the following to your MCP client config, replacing your-api-key with your own API key:

{
  "mcpServers": {
    "dinox-mcp": {
      "url": "https://mcp.deepdataspace.com/mcp?key=your-api-key"
    }
  }
}

Option B: Use the NPM package locally (STDIO)

Install Node.js first

  • Download the installer from nodejs.org

  • Or use command:

# macOS / Linux
curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash
# or
wget -qO- https://raw.githubusercontent.com/nvm-sh/nvm/v0.40.1/install.sh | bash

# load nvm into current shell (choose the one you use)
source ~/.bashrc || true
source ~/.zshrc  || true

# install and use LTS Node.js
nvm install --lts
nvm use --lts

# Windows (one of the following)
winget install OpenJS.NodeJS.LTS
# or with Chocolatey (in admin PowerShell)
iwr -useb https://raw.githubusercontent.com/chocolatey/chocolatey/master/chocolateyInstall/InstallChocolatey.ps1 | iex
choco install nodejs-lts -y

Configure your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": ["-y", "@deepdataspace/dinox-mcp"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}

Note: Replace your-api-key-here with your real key.
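
If the client config already lists other servers, the entry above has to be merged in rather than pasted over the file. The following is a minimal Python sketch of that merge, assuming the config is a plain JSON object with a top-level mcpServers key as shown above. The function name add_dinox_server and the other-server entry are hypothetical and only there for illustration.

```python
import json


def add_dinox_server(config: dict, api_key: str, image_dir: str) -> dict:
    """Return a copy of an MCP client config with a dinox-mcp STDIO entry added."""
    # Copy the existing server map so already configured servers are preserved.
    servers = dict(config.get("mcpServers", {}))
    servers["dinox-mcp"] = {
        "command": "npx",
        "args": ["-y", "@deepdataspace/dinox-mcp"],
        "env": {
            "DINOX_API_KEY": api_key,
            "IMAGE_STORAGE_DIRECTORY": image_dir,
        },
    }
    return {**config, "mcpServers": servers}


# Hypothetical existing config with one unrelated server.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["other.js"]}}}
merged = add_dinox_server(existing, "your-api-key-here", "/path/to/your/image/directory")
print(json.dumps(merged, indent=2))
```

The function returns a new dictionary and leaves the input untouched, which makes it easy to diff the result before writing it back to disk.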

Option C: Run from source locally

Make sure Node.js is installed (see Option B), then:

# clone
git clone https://github.com/IDEA-Research/DINO-X-MCP.git
cd DINO-X-MCP

# install deps
npm install

# build
npm run build

Configure your MCP client:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "node",
      "args": ["/path/to/DINO-X-MCP/build/index.js"],
      "env": {
        "DINOX_API_KEY": "your-api-key-here",
        "IMAGE_STORAGE_DIRECTORY": "/path/to/your/image/directory"
      }
    }
  }
}

CLI Flags & Environment Variables

  • Common flags

    • --http: start in Streamable HTTP mode (otherwise STDIO by default)
    • --stdio: force STDIO mode
    • --dinox-api-key=...: set API key
    • --enable-client-key: allow API key via URL ?key= (Streamable HTTP only)
    • --port=<number>: HTTP port (default 3020), e.g. --port=8080
  • Environment variables

    • DINOX_API_KEY (required unless the key is passed with --dinox-api-key or supplied by clients via ?key= under --enable-client-key): DINO-X platform API key
    • IMAGE_STORAGE_DIRECTORY (optional, STDIO): directory to save annotated images
    • AUTH_TOKEN (optional, HTTP): if set, client must send Authorization: Bearer <token>

    Examples:

# STDIO (local)
node build/index.js --dinox-api-key=your-api-key

# Streamable HTTP (server provides a shared API key)
node build/index.js --http --dinox-api-key=your-api-key

# Streamable HTTP (custom port)
node build/index.js --http --dinox-api-key=your-api-key --port=8080

# Streamable HTTP (require client-provided API key via URL)
node build/index.js --http --enable-client-key
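
When the server is started from a script or process manager, the same flag combinations can be assembled programmatically. This is a rough Python sketch that only builds the argument list from the flags documented above. The helper name build_launch_args is hypothetical, and rejecting HTTP-only flags in STDIO mode is a choice made for this sketch, not documented server behavior.

```python
from typing import List, Optional


def build_launch_args(
    http: bool = False,
    api_key: Optional[str] = None,
    enable_client_key: bool = False,
    port: Optional[int] = None,
) -> List[str]:
    """Assemble a command line for build/index.js from the documented flags."""
    args = ["node", "build/index.js"]
    if http:
        args.append("--http")
    if api_key is not None:
        args.append(f"--dinox-api-key={api_key}")
    if enable_client_key:
        # The flag list describes this option as Streamable HTTP only.
        if not http:
            raise ValueError("--enable-client-key requires --http")
        args.append("--enable-client-key")
    if port is not None:
        # The port is described as the HTTP port, so it is meaningless for STDIO.
        if not http:
            raise ValueError("--port requires --http")
        args.append(f"--port={port}")
    return args


print(" ".join(build_launch_args(http=True, api_key="your-api-key", port=8080)))
```

The returned list can be passed directly to subprocess.run without going through a shell.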

Client config when using ?key=:

{
  "mcpServers": {
    "dinox-mcp": {
      "url": "http://localhost:3020/mcp?key=your-api-key"
    }
  }
}

Using AUTH_TOKEN with a gateway that injects Authorization: Bearer <token>:

AUTH_TOKEN=my-token node build/index.js --http --enable-client-key

Client example with supergateway:

{
  "mcpServers": {
    "dinox-mcp": {
      "command": "npx",
      "args": [
        "-y",
        "supergateway",
        "--streamableHttp",
        "http://localhost:3020/mcp?key=your-api-key",
        "--oauth2Bearer",
        "my-token"
      ]
    }
  }
}
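
For a client that talks to the HTTP endpoint directly instead of going through supergateway, the two credentials above end up in different places: the API key in the ?key= query parameter and the AUTH_TOKEN value in an Authorization header. A minimal Python sketch of that mapping follows. The helper names build_endpoint and build_headers are hypothetical, and the sketch builds the URL and headers only; it does not send a request.

```python
from typing import Dict, Optional
from urllib.parse import urlencode, urlsplit, urlunsplit


def build_endpoint(base_url: str, api_key: str) -> str:
    """Attach the API key as a URL-encoded ?key= query parameter."""
    parts = urlsplit(base_url)
    query = urlencode({"key": api_key})
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, ""))


def build_headers(auth_token: Optional[str] = None) -> Dict[str, str]:
    """Add the bearer token only when the server was started with AUTH_TOKEN."""
    headers: Dict[str, str] = {}
    if auth_token:
        headers["Authorization"] = f"Bearer {auth_token}"
    return headers


url = build_endpoint("http://localhost:3020/mcp", "your-api-key")
headers = build_headers("my-token")
print(url)
print(headers)
```

URL-encoding the key matters if it ever contains characters such as & or +, which would otherwise be misread as query syntax.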

Tools

| Capability | Tool ID | Transport | Input | Output |
|---|---|---|---|---|
| Full-scene object detection | detect-all-objects | STDIO / HTTP | Image URL | Category + bbox + (optional) captions |
| Text-prompted object detection | detect-objects-by-text | STDIO / HTTP | Image URL + English nouns (dot-separated for multiple, e.g., person.car) | Target object bbox + (optional) captions |
| Human pose estimation | detect-human-pose-keypoints | STDIO / HTTP | Image URL | 17 keypoints + bbox + (optional) captions |
| Visualization | visualize-detection-result | STDIO only | Image URL + detection results array | Local path to annotated image |
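
MCP clients invoke these tools with a JSON-RPC tools/call request. The Python sketch below shows roughly what such a request could look like for detect-objects-by-text, including the dot-separated prompt format. The argument names imageFileUri and textPrompt are assumptions made for illustration; the real names come from the tool schema the server reports, so check that before relying on them. The helper names and the example image URL are also hypothetical.

```python
import json
from typing import Iterable


def build_text_prompt(nouns: Iterable[str]) -> str:
    """Join English nouns with dots, e.g. ["person", "car"] -> "person.car"."""
    cleaned = [noun.strip() for noun in nouns if noun.strip()]
    return ".".join(cleaned)


def build_tool_call(request_id: int, image_url: str, nouns: Iterable[str]) -> dict:
    """Build a JSON-RPC 2.0 tools/call request for detect-objects-by-text."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "detect-objects-by-text",
            "arguments": {
                "imageFileUri": image_url,  # assumed argument name
                "textPrompt": build_text_prompt(nouns),  # assumed argument name
            },
        },
    }


request = build_tool_call(1, "https://example.com/street.jpg", ["person", " car "])
print(json.dumps(request, indent=2))
```

In normal use the MCP client builds this request itself; the sketch is mainly useful for understanding what the model is asking the server to do.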

🎬 Use Cases

| Scenario | Prompt | Output |
|---|---|---|
| Detection & Localization | Detect and visualize the fire areas in the forest | (image) |
| Object Counting | Please analyze this warehouse image, detect all the cardboard boxes, count the total number | (image) |
| Feature Detection | Find all red cars in the image | (image) |
| Attribute Reasoning | Find the tallest person in the image, describe their clothing | (image) |
| Full Scene Detection | Find the fruit with the highest vitamin C content in the image | (image) Answer: Kiwi fruit (93mg/100g) |
| Pose Analysis | Please analyze what yoga pose this is | (image) |
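
Scenarios such as object counting and "find the tallest person" reduce to simple post-processing of the structured detections. Here is a small Python sketch of that step, assuming each detection is a dictionary with a category string and a bbox given as [x1, y1, x2, y2] in pixels. That shape and the sample values are assumptions for illustration, not the server's documented output format.

```python
from collections import Counter
from typing import List, Optional

# Hypothetical detections in an assumed shape: category plus [x1, y1, x2, y2].
detections = [
    {"category": "cardboard box", "bbox": [10, 20, 110, 120]},
    {"category": "cardboard box", "bbox": [130, 25, 220, 118]},
    {"category": "person", "bbox": [300, 40, 360, 240]},
    {"category": "person", "bbox": [400, 80, 450, 230]},
]


def count_by_category(items: List[dict]) -> Counter:
    """Count how many detections fall into each category."""
    return Counter(item["category"] for item in items)


def tallest(items: List[dict], category: str) -> Optional[dict]:
    """Return the detection of the given category with the largest box height."""
    candidates = [item for item in items if item["category"] == category]
    if not candidates:
        return None
    return max(candidates, key=lambda item: item["bbox"][3] - item["bbox"][1])


counts = count_by_category(detections)
print(counts["cardboard box"])
print(tallest(detections, "person"))
```

Box height in pixels is only a proxy for real-world height, since it ignores how far each person stands from the camera.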

FAQ

  • Supported image sources?
    • STDIO: file:// and https://
    • Streamable HTTP: https:// only
  • Supported image formats?
    • jpg, jpeg, webp, png
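
These constraints can be checked on the client side before an image is sent to the server. Below is a minimal Python sketch under the assumption that the format can be judged from the file extension in the URI. Whether the server itself inspects extensions or file contents is not stated here, so treat this as a heuristic pre-check. The function name check_image_source is hypothetical.

```python
from pathlib import PurePosixPath
from typing import List
from urllib.parse import urlsplit

# Schemes per transport mode and formats, taken from the FAQ above.
ALLOWED_SCHEMES = {"stdio": {"file", "https"}, "http": {"https"}}
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".webp", ".png"}


def check_image_source(uri: str, transport: str) -> List[str]:
    """Return a list of problems; an empty list means the URI looks acceptable."""
    problems = []
    parts = urlsplit(uri)
    if parts.scheme not in ALLOWED_SCHEMES[transport]:
        problems.append(f"scheme {parts.scheme!r} is not allowed for {transport}")
    suffix = PurePosixPath(parts.path).suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"extension {suffix!r} is not a supported image format")
    return problems


print(check_image_source("file:///tmp/photo.PNG", "stdio"))
print(check_image_source("file:///tmp/photo.png", "http"))
```

Image URLs that carry no extension at all would be flagged by this check even if they serve a valid PNG, which is the main limitation of judging by extension.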

Development & Debugging

Use watch mode to auto-rebuild during development:

npm run watch

Use MCP Inspector for debugging:

npm run inspector

License

Apache License 2.0

FAQ

What is the DINO-X MCP server?
DINO-X is a Model Context Protocol (MCP) server profile on explainx.ai. MCP lets AI hosts (e.g. Claude Desktop, Cursor) call tools and resources through a standard interface; this page summarizes categories, install hints, and community ratings.
How do MCP servers relate to agent skills?
Skills are reusable instruction packages (often SKILL.md); MCP servers expose live capabilities. Teams frequently combine both—skills for workflows, MCP for APIs and data. See explainx.ai/skills and explainx.ai/mcp-servers for parallel directories.
How are reviews shown for DINO-X?
This profile displays 72 aggregated ratings (sample rows for discoverability plus signed-in user reviews). Average score is about 4.6 out of 5—verify behavior in your own environment before production use.

Use Cases

Extended AI Capabilities

Add new capabilities to Claude beyond text generation

Example

Access external data sources, execute code, interact with tools and services

Transform Claude from chatbot to action-taking agent

Context Enhancement

Provide Claude with access to relevant context and data

Example

Load project documentation, access knowledge bases, query databases

Get more accurate, context-aware responses

Workflow Automation

Automate multi-step workflows combining AI and external tools

Example

Research → Summarize → Create document → Send notification

Complete complex tasks end-to-end without manual steps

Implementation Guide

Prerequisites

  • Claude Desktop 0.7.0+ or Cursor IDE with MCP support
  • Basic understanding of MCP architecture and capabilities
  • Access credentials for integrated services (if required)
  • Willingness to experiment and iterate on configuration

Time Estimate

15-60 minutes depending on server complexity

Installation Steps

  1. Install the MCP server: npm install -g [package-name] or via GitHub
  2. Add the server configuration to your client's MCP config file (for Claude Desktop, claude_desktop_config.json)
  3. Provide the required credentials and configuration
  4. Restart Claude Desktop to load the new server
  5. Test basic functionality with simple prompts
  6. Explore capabilities and experiment with use cases
  7. Document successful patterns for reuse

Troubleshooting

  • MCP server not loading: Check config syntax, verify installation
  • Connection errors: Check network, firewall, credentials
  • Feature not working: Read server docs, check required parameters
  • Performance issues: Monitor resource usage, check for network latency
  • Conflicts with other servers: Check port assignments, namespace collisions
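
The first item, checking config syntax, is easy to automate. The Python sketch below reports JSON syntax errors and entries that have neither a command nor a url field, the two shapes used by the configs in the readme above. The function name lint_mcp_config is hypothetical, and real clients may accept additional fields or layouts that this check does not know about.

```python
import json
from typing import List


def lint_mcp_config(text: str) -> List[str]:
    """Return a list of problems found in an MCP client config given as JSON text."""
    try:
        config = json.loads(text)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON at line {exc.lineno}, column {exc.colno}: {exc.msg}"]
    if not isinstance(config, dict) or not isinstance(config.get("mcpServers"), dict):
        return ['missing top-level "mcpServers" object']
    problems = []
    for name, entry in config["mcpServers"].items():
        # Each entry is expected to launch a command (STDIO) or point at a URL (HTTP).
        if not isinstance(entry, dict) or not ("command" in entry or "url" in entry):
            problems.append(f'{name}: expected a "command" or "url" field')
    return problems


good = '{"mcpServers": {"dinox-mcp": {"url": "http://localhost:3020/mcp?key=your-api-key"}}}'
bad = '{"mcpServers": {"dinox-mcp": {"command": "npx",}}}'
print(lint_mcp_config(good))
print(lint_mcp_config(bad))
```

A trailing comma, as in the second example, is a common cause of a server silently failing to load.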

Best Practices

✓ Do

  • Read server documentation thoroughly before setup
  • Start with simple use cases to validate functionality
  • Test in non-production environment first
  • Monitor resource usage and performance
  • Keep servers updated for bug fixes and new features
  • Document configuration for team members
  • Use environment variables for sensitive configuration

✗ Don't

  • Don't grant overly permissive access to MCP servers
  • Don't skip reading security considerations in docs
  • Don't expose sensitive data without proper controls
  • Don't run untrusted MCP servers without code review
  • Don't ignore error messages—investigate root cause

💡 Pro Tips

  • Combine multiple MCP servers for powerful workflows
  • Create custom MCP servers for your specific needs
  • Share successful configurations with team
  • Use MCP inspector for debugging
  • Join MCP community for tips and troubleshooting

Technical Details

Architecture

Model Context Protocol standardizes how AI hosts (Claude, Cursor) communicate with external tools and data sources through server implementations.

Protocols

  • Model Context Protocol (MCP)
  • JSON-RPC 2.0
  • stdio or HTTP transport
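
On the stdio transport, MCP exchanges JSON-RPC messages as newline-delimited JSON: one message per line, with no raw newlines inside a message. A minimal Python sketch of that framing follows. The helper names encode_message and decode_messages are hypothetical, and a real client would normally rely on an MCP SDK rather than hand-rolled framing.

```python
import json
from typing import List


def encode_message(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes newlines inside strings, so the payload stays on one line.
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def decode_messages(stream: bytes) -> List[dict]:
    """Split a byte stream on newlines and parse each non-empty line as JSON."""
    lines = stream.decode("utf-8").split("\n")
    return [json.loads(line) for line in lines if line.strip()]


list_tools = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}
note = {"jsonrpc": "2.0", "id": 2, "method": "ping", "params": {"note": "line1\nline2"}}
wire = encode_message(list_tools) + encode_message(note)
print(decode_messages(wire))
```

The second message carries a newline inside a string value to show that escaping keeps the frame boundaries intact.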

Compatibility

  • Claude Desktop
  • Cursor IDE
  • Custom MCP clients

When to Use This

✓ Use When

Use when you need Claude to access external data, execute actions, or integrate with tools. Best for extending AI capabilities beyond conversation.

✗ Avoid When

Avoid when native integrations exist (use official APIs directly), for real-time critical systems, or when security/compliance requires zero external dependencies.

Integration

  • Tool composition: Chain multiple MCP tools in workflows
  • Context augmentation: Provide AI with relevant external data
  • Action delegation: Let AI execute tasks on external systems
  • Bidirectional sync: Keep AI context and external systems in sync

MCP server reviews

Ratings

4.6 · 72 reviews
  • Omar Khan· Dec 28, 2024

    DINO-X is among the better-indexed MCP projects we tried; the explainx.ai summary tracks the official description.

  • Dhruvi Jain· Dec 12, 2024

    DINO-X is among the better-indexed MCP projects we tried; the explainx.ai summary tracks the official description.

  • Carlos Kim· Dec 12, 2024

    DINO-X reduced integration guesswork — categories and install configs on the listing matched the upstream repo.

  • Nia Diallo· Dec 12, 2024

    Useful MCP listing: DINO-X is the kind of server we cite when onboarding engineers to host + tool permissions.

  • Hana Verma· Dec 8, 2024

    We wired DINO-X into a staging workspace; the listing’s GitHub and npm pointers saved time versus hunting across READMEs.

  • Mei Chawla· Nov 27, 2024

    DINO-X is a well-scoped MCP server in the explainx.ai directory — install snippets and categories matched our Claude Code setup.

  • Soo Diallo· Nov 19, 2024

    Strong directory entry: DINO-X surfaces stars and publisher context so we could sanity-check maintenance before adopting.

  • Oshnikdeep· Nov 3, 2024

    Strong directory entry: DINO-X surfaces stars and publisher context so we could sanity-check maintenance before adopting.

  • Carlos Li· Nov 3, 2024

    Useful MCP listing: DINO-X is the kind of server we cite when onboarding engineers to host + tool permissions.

  • Hiroshi Tandon· Nov 3, 2024

    DINO-X reduced integration guesswork — categories and install configs on the listing matched the upstream repo.
