nano-claw voice loop · davidbmar.com

What it is

nano-claw is a personal AI agent that operates entirely through voice in the browser. It combines a TypeScript-based agent core with native macOS services for Speech-to-Text (Whisper) and Text-to-Speech (Kokoro/Piper), leveraging Metal acceleration to bypass Docker's GPU limitations on Apple Silicon.

Features

Real-time voice interaction via WebRTC and WebSockets in the browser
Native Metal-accelerated Whisper STT and Kokoro TTS for low latency
Secure tool execution with manual user approval for shell and file operations
Streaming LLM responses with sentence-level audio synthesis
Support for multiple LLM providers including Anthropic, Google, and OpenAI
Live pipeline configuration for switching models and voices without restart
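
The sentence-level synthesis listed above can be pictured as a small buffer that sits between the streamed LLM output and the TTS service. The sketch below is an illustration only, not the project's code: the function name `sentences_from_deltas` and the punctuation-based boundary rule are assumptions.

```python
import re
from typing import Iterable, Iterator

# Assumed boundary rule: terminal punctuation followed by whitespace.
# This is a simplification; abbreviations such as "e.g. " will split early.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+")


def sentences_from_deltas(deltas: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a stream of text deltas."""
    buffer = ""
    for delta in deltas:
        buffer += delta
        parts = _BOUNDARY.split(buffer)
        # Every part except the last is a finished sentence.
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        # The last part may still be growing, so keep it buffered.
        buffer = parts[-1]
    # Flush whatever remains when the stream ends.
    if buffer.strip():
        yield buffer.strip()
```

Each yielded sentence could be sent for synthesis immediately, so audio for the first sentence can start while later tokens are still arriving.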

Quickstart

git clone https://github.com/davidbmar/2026-nano-claw-voice-loop-tts-stt.git
cd 2026-nano-claw-voice-loop-tts-stt
pip install -r requirements.txt
python stt_service.py &
python tts_service.py &
docker compose up --build
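
Because the two native services above are started in the background, it can help to confirm they are accepting connections before bringing up the container. The helper below is a generic sketch using only the standard library; the name `wait_for_port` is hypothetical, and the ports in the usage note are placeholders, since the source does not state which ports the services use.

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0, interval: float = 0.5) -> bool:
    """Return True once a TCP connection to host:port succeeds, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable): wait and retry.
            time.sleep(interval)
    return False
```

For example, `wait_for_port("127.0.0.1", 8001)` and `wait_for_port("127.0.0.1", 8002)` would check two assumed ports; substitute whatever ports `stt_service.py` and `tts_service.py` actually bind to.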

Architecture

flowchart TD
    Browser[Browser Client]
    VoiceServer[Voice Server Python]
    NanoClawAPI[nano claw API TypeScript]
    STTService[STT Service Native]
    TTSService[TTS Service Native]
    LLMProvider[LLM Provider Cloud]
    LocalTools[Local Tools Shell File]
    Browser -- WebRTC Audio --> VoiceServer
    VoiceServer -- HTTP Transcribe --> STTService
    VoiceServer -- SSE Chat --> NanoClawAPI
    NanoClawAPI -- API Request --> LLMProvider
    NanoClawAPI -- Execute --> LocalTools
    VoiceServer -- HTTP Synthesize --> TTSService
    VoiceServer -- WebRTC Audio --> Browser
    NanoClawAPI -- Tool Approval --> Browser

How it's built

The system uses a hybrid architecture. A Docker container hosts the TypeScript API, the agent loop, and the WebSocket voice server, while two standalone Python HTTP services run natively on the host Mac to handle Whisper STT and Kokoro TTS via Metal. The browser uses WebRTC for audio and WebSockets for control; the agent talks to cloud LLMs (Claude, Gemini) and to local tools.
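
In a split like this, code inside the container needs an address for the services running on the host. Docker Desktop exposes the host under the DNS name `host.docker.internal`, so one plausible arrangement is to build the service URLs from environment variables with that name as the default. The sketch below is an assumption about how this could be wired, not the project's configuration: the variable names, ports, and URL paths are all hypothetical.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServiceEndpoints:
    stt_url: str
    tts_url: str


def load_endpoints(env: Optional[Mapping[str, str]] = None) -> ServiceEndpoints:
    """Build STT/TTS URLs for the native host services from environment variables."""
    env = os.environ if env is None else env
    # host.docker.internal resolves to the host machine under Docker Desktop.
    host = env.get("NATIVE_HOST", "host.docker.internal")  # hypothetical variable name
    stt_port = int(env.get("STT_PORT", "8001"))  # hypothetical default port
    tts_port = int(env.get("TTS_PORT", "8002"))  # hypothetical default port
    return ServiceEndpoints(
        stt_url=f"http://{host}:{stt_port}/transcribe",  # assumed path
        tts_url=f"http://{host}:{tts_port}/synthesize",  # assumed path
    )
```

Passing a plain dict instead of `os.environ` keeps the function easy to test and lets the same code run outside Docker by setting `NATIVE_HOST=127.0.0.1`.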

How it runs

sequenceDiagram
    participant User
    participant Browser
    participant VoiceServer
    participant STTService
    participant NanoClawAPI
    participant LLMProvider
    participant TTSService
    User->>Browser: Speak into microphone
    Browser->>VoiceServer: Stream audio via WebRTC
    VoiceServer->>STTService: POST transcribe audio bytes
    STTService-->>VoiceServer: Return transcribed text
    VoiceServer->>NanoClawAPI: POST chat request with text
    NanoClawAPI->>LLMProvider: Stream prompt to model
    LLMProvider-->>NanoClawAPI: Stream response tokens
    NanoClawAPI-->>VoiceServer: Stream sentence deltas
    VoiceServer->>TTSService: POST synthesize sentence
    TTSService-->>VoiceServer: Return audio bytes
    VoiceServer->>Browser: Stream audio and text delta
    Browser->>User: Play audio and show text
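
One turn of the sequence above can be sketched as a generator that chains the three calls. This is a simplified, synchronous illustration with injected callables standing in for the STT service, the nano-claw API, and the TTS service; the name `run_turn` and the callable signatures are assumptions, and the real server presumably does this asynchronously over HTTP and SSE.

```python
from typing import Callable, Iterable, Iterator, Tuple

Transcribe = Callable[[bytes], str]    # audio bytes -> transcript
Chat = Callable[[str], Iterable[str]]  # transcript -> stream of sentences
Synthesize = Callable[[str], bytes]    # sentence -> audio bytes


def run_turn(
    audio: bytes,
    transcribe: Transcribe,
    chat: Chat,
    synthesize: Synthesize,
) -> Iterator[Tuple[str, bytes]]:
    """Run one voice turn, yielding (sentence, audio) pairs as they become ready."""
    text = transcribe(audio)
    if not text.strip():
        # Nothing was recognized, so skip the LLM call entirely.
        return
    for sentence in chat(text):
        # Synthesize each sentence as it arrives instead of waiting for the full reply.
        yield sentence, synthesize(sentence)
```

Because `run_turn` is a generator, a caller can forward the first pair to the browser while later sentences are still being produced, which matches the "Stream audio and text delta" step in the diagram.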

How to apply & reuse

Use nano-claw as a hands-free coding companion or desktop assistant. It can execute shell commands, read and write files, and manage tasks from voice instructions. Every tool execution requires explicit user approval, so nothing runs on the machine without confirmation.
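
The approval requirement can be thought of as a gate that sits in front of the tool registry. The sketch below shows the general pattern under stated assumptions: `ToolCall`, `ApprovalDenied`, and `execute_with_approval` are hypothetical names, and the `approve` callback stands in for whatever prompt the browser shows the user.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class ToolCall:
    name: str
    args: Dict[str, Any] = field(default_factory=dict)


class ApprovalDenied(Exception):
    """Raised when the user rejects a proposed tool call."""


def execute_with_approval(
    call: ToolCall,
    tools: Dict[str, Callable[..., Any]],
    approve: Callable[[ToolCall], bool],
) -> Any:
    """Run a registered tool only after the approve callback returns True."""
    if call.name not in tools:
        raise KeyError(f"unknown tool: {call.name}")
    # Ask before touching the tool, so a denied call has no side effects.
    if not approve(call):
        raise ApprovalDenied(call.name)
    return tools[call.name](**call.args)
```

Checking approval before the tool function is invoked means a rejected shell or file operation never starts, rather than being rolled back afterwards.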

At a glance

Capabilities: Voice-driven interaction, Local tool execution, Streaming LLM inference, GPU-accelerated audio, Session memory persistence

Components: Agent Loop, Context Builder, Memory Storage, Tool Registry, Skills Loader, Voice Server, STT Service, TTS Service

Tech: TypeScript, Python, Docker, WebRTC, WebSockets, Whisper, Kokoro, Piper

Depends on: Node.js, Python 3.12, Docker Desktop, Metal GPU, FFmpeg

Integrates with: Anthropic Claude, Google Gemini, OpenAI GPT, DeepSeek, Groq, Alibaba Qwen

Patterns: Agent Loop, Event Streaming, Hybrid Compute, Tool Use, WebSocket Communication

Reuse tags: ai-agent, voice-interface, local-llm, typescript, docker, macos

⚠ Needs attention

unmerged_branch: copilot/optimize-typescript-files is 1 commit ahead of the default branch
unmerged_branch: dependabot/npm_and_yarn/npm_and_yarn-4825ac1e2e is 1 commit ahead of the default branch
unmerged_branch: spacechannel-persona is 26 commits ahead of the default branch
open_pr: PR #1: Bump vitest from 1.6.1 to 4.1.8 in the npm_and_yarn group across 1 directory