productivitycommunication

Voice MCP

by mbailey

Voice MCP powers two-way voice apps with Google Cloud Speech to Text, Speech Recognition, and Text to Speech API for acc

Enables two-way voice conversations through multiple transport methods including local microphone recording and LiveKit room-based communication, with configurable STT/TTS services and automatic transport fallback for creating voice-enabled applications.

github stars

875

Multiple transport methods with fallbackWorks with existing Claude setupLocal and cloud STT/TTS options

best for

  • / Developers who need hands-free coding assistance
  • / Working while multitasking or away from keyboard
  • / Accessibility for users who prefer voice interaction

capabilities

  • / Record voice through local microphone
  • / Convert speech to text with multiple STT services
  • / Convert text to speech with configurable TTS services
  • / Connect through LiveKit rooms for remote voice chat
  • / Handle automatic transport fallback
  • / Maintain continuous voice conversations

what it does

Enables voice conversations with Claude by converting speech to text and text back to speech. Works through local microphone or remote room connections with automatic fallback options.

about

Voice MCP is a community-built MCP server published by mbailey that provides AI assistants with tools and capabilities via the Model Context Protocol. Voice MCP powers two-way voice apps with Google Cloud Speech to Text, Speech Recognition, and Text to Speech API for acc It is categorized under productivity, communication.

how to install

You can install Voice MCP in your AI client of choice. Use the install panel on this page to get one-click setup for Cursor, Claude Desktop, VS Code, and other MCP-compatible clients. This server runs locally on your machine via the stdio transport.

license

MIT

Voice MCP is released under the MIT license. This is a permissive open-source license, meaning you can freely use, modify, and distribute the software.

readme

VoiceMode

Natural voice conversations with Claude Code (and other MCP capable agents)

PyPI Downloads PyPI Downloads PyPI Downloads

VoiceMode enables natural voice conversations with Claude Code. Voice isn't about replacing typing - it's about being available when typing isn't.

Perfect for:

  • Walking to your next meeting
  • Cooking while debugging
  • Giving your eyes a break after hours of screen time
  • Holding a coffee (or a dog)
  • Any moment when your hands or eyes are busy

See It In Action

VoiceMode Demo

Quick Start

Requirements: Computer with microphone and speakers

Option 1: Claude Code Plugin (Recommended)

The fastest way for Claude Code users to get started:

# Add the VoiceMode marketplace
claude plugin marketplace add mbailey/voicemode

# Install VoiceMode plugin
claude plugin install voicemode@voicemode

## Install dependencies (CLI, Local Voice Services)

/voicemode:install

# Start talking!
/voicemode:converse

Option 2: Python installer package

Installs dependencies and the VoiceMode Python package.

# Install UV package manager (if needed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Run the installer (sets up dependencies and local voice services)
uvx voice-mode-install

# Add to Claude Code
claude mcp add --scope user voicemode -- uvx --refresh voice-mode

# Optional: Add OpenAI API key as fallback for local services
export OPENAI_API_KEY=your-openai-key

# Start a conversation
claude converse

For manual setup, see the Getting Started Guide.

Features

  • Natural conversations - speak naturally, hear responses immediately
  • Works offline - optional local voice services (Whisper STT, Kokoro TTS)
  • Low latency - fast enough to feel like a real conversation
  • Smart silence detection - stops recording when you stop speaking
  • Privacy options - run entirely locally or use cloud services

Compatibility

Platforms: Linux, macOS, Windows (WSL), NixOS Python: 3.10-3.14

Configuration

VoiceMode works out of the box. For customization:

# Set OpenAI API key (if using cloud services)
export OPENAI_API_KEY="your-key"

# Or configure via file
voicemode config edit

See the Configuration Guide for all options.

Permissions Setup (Optional)

To use VoiceMode without permission prompts, add to ~/.claude/settings.json:

{
  "permissions": {
    "allow": [
      "mcp__voicemode__converse",
      "mcp__voicemode__service"
    ]
  }
}

See the Permissions Guide for more options.

Local Voice Services

For privacy or offline use, install local speech services:

  • Whisper.cpp - Local speech-to-text
  • Kokoro - Local text-to-speech with multiple voices

These provide the same API as OpenAI, so VoiceMode switches seamlessly between them.

Installation Details

<details> <summary><strong>System Dependencies by Platform</strong></summary>

Ubuntu/Debian

sudo apt update
sudo apt install -y ffmpeg gcc libasound2-dev libasound2-plugins libportaudio2 portaudio19-dev pulseaudio pulseaudio-utils python3-dev

WSL2 users: The pulseaudio packages above are required for microphone access.

Fedora/RHEL

sudo dnf install alsa-lib-devel ffmpeg gcc portaudio portaudio-devel python3-devel

macOS

brew install ffmpeg node portaudio

NixOS

# Use development shell
nix develop github:mbailey/voicemode

# Or install system-wide
nix profile install github:mbailey/voicemode
</details> <details> <summary><strong>Alternative Installation Methods</strong></summary>

From source

git clone https://github.com/mbailey/voicemode.git
cd voicemode
uv tool install -e .

NixOS system-wide

# In /etc/nixos/configuration.nix
environment.systemPackages = [
  (builtins.getFlake "github:mbailey/voicemode").packages.${pkgs.system}.default
];
</details>

Troubleshooting

ProblemSolution
No microphone accessCheck terminal/app permissions. WSL2 needs pulseaudio packages.
UV not foundRun curl -LsSf https://astral.sh/uv/install.sh | sh
OpenAI API errorVerify OPENAI_API_KEY is set correctly
No audio outputCheck system audio settings and available devices

Save Audio for Debugging

export VOICEMODE_SAVE_AUDIO=true
# Files saved to ~/.voicemode/audio/YYYY/MM/

Documentation

Full documentation: voice-mode.readthedocs.io

Links

License

MIT - A Failmode Project


mcp-name: com.failmode/voicemode