iPhone Streaming Plus Finite State Machine

What it is

A local-first voice agent that runs on macOS and is used from iPhone Safari. It transcribes speech with Whisper, routes each transcribed query with a regex keyword router to either a fast single-call LLM path or a multi-step Finite State Machine workflow (Research, Deep Dive, or Fact Check), and streams Piper TTS audio back to the phone over WebRTC.
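The routing step can be pictured with a minimal sketch. The patterns, the workflow keys, and the `route` function below are hypothetical illustrations, not the project's actual keyword list or API; the only assumption carried over from the description is that a regex match selects a workflow and anything else falls through to the fast path.

```python
import re
from typing import Optional

# Hypothetical keyword patterns, one per workflow named in the description.
WORKFLOW_PATTERNS = {
    "research": re.compile(r"\b(research|look up|find out about)\b", re.IGNORECASE),
    "deep_dive": re.compile(r"\b(deep dive|in depth|explain in detail)\b", re.IGNORECASE),
    "fact_check": re.compile(r"\b(fact[- ]check|is it true|verify)\b", re.IGNORECASE),
}


def route(transcript: str) -> Optional[str]:
    """Return the name of the first matching workflow, or None for the fast path."""
    for name, pattern in WORKFLOW_PATTERNS.items():
        if pattern.search(transcript):
            return name
    return None  # no keyword matched: answer with a single direct LLM call
```

Because the patterns are compiled once and matched against a short transcript, the routing decision costs far less than an extra LLM call made only to classify intent.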

Features

WebRTC audio streaming from Mac to iPhone Safari with low latency
Hybrid FSM engine for Research, Deep Dive, and Fact Check workflows
Sub-millisecond regex keyword router for intent dispatching
Supports Ollama (local), Claude, and OpenAI with runtime switching
Input quality filter that rejects noise before it reaches the LLM, reducing LLM costs
Mobile UI with workflow debugger and step-by-step progress visualization
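As an illustration of the input quality filter, here is a minimal sketch that drops transcripts too short or too filler-heavy to be worth an LLM call. The threshold, the filler list, and the `is_usable` name are assumptions made for this example; the project's real filter in engine/input_filter.py may use different criteria.

```python
import re

MIN_MEANINGFUL_WORDS = 2  # assumed threshold, not taken from the project
FILLER_WORDS = {"um", "uh", "hmm", "er", "ah"}  # assumed filler list


def is_usable(transcript: str) -> bool:
    """Return True when the transcript has enough non-filler words to send on."""
    words = re.findall(r"[a-z']+", transcript.lower())
    meaningful = [w for w in words if w not in FILLER_WORDS]
    return len(meaningful) >= MIN_MEANINGFUL_WORDS
```

Running a check like this right after transcription means background noise and accidental taps never trigger a paid API request.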

Quickstart

git clone https://github.com/davidbmar/iphone-streaming-plus-Finite-State-Machine.git
cd iphone-streaming-plus-Finite-State-Machine
pip install -r requirements.txt
python main.py

Architecture

flowchart TD
    subgraph Mac_Host["Mac Host"]
        Engine["Engine\nWorkflowRunner\nKeyword Router\nFSM Executor"]
        Gateway["Gateway\naiohttp Server :8080\nWebSocket Signaling\nRTCPeerConnection"]
        TTS["Piper TTS (ONNX)"]
        STT["Whisper STT"]
        LLM["LLM Provider\n(Ollama/Claude/OpenAI)"]
    end
    subgraph iPhone["iPhone Safari"]
        UI["Voice Agent UI\nHold-to-Talk\nWorkflow Debugger"]
    end
    UI -->|Audio/Mic Input| Gateway
    Gateway -->|Audio Data| STT
    STT -->|Text| Engine
    Engine -->|Query| LLM
    LLM -->|Response Text| Engine
    Engine -->|Text| TTS
    TTS -->|Audio Chunks| Gateway
    Gateway -->|WebRTC Audio Track| UI
    Engine -->|State Updates| Gateway
    Gateway -->|Debug Info| UI

How it's built

Python backend using aiohttp for signaling and WebRTC peer connections. The core logic includes a keyword router, an FSM executor for complex workflows, and adapters for Whisper (STT), Piper (TTS), and multiple LLM providers (Ollama, Claude, OpenAI). The frontend is a mobile-optimized web UI with hold-to-talk controls and a workflow debugger.
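Runtime switching between LLM providers can be sketched as a small registry of interchangeable callables. Everything here is hypothetical: the `LLMSwitch` class, the stub provider functions, and their return values stand in for real Ollama, Claude, and OpenAI adapters, which would make network calls instead.

```python
from typing import Callable, Dict

Provider = Callable[[str], str]


# Stub providers for illustration only; real adapters would call each service.
def _ollama(prompt: str) -> str:
    return f"[ollama] {prompt}"


def _claude(prompt: str) -> str:
    return f"[claude] {prompt}"


def _openai(prompt: str) -> str:
    return f"[openai] {prompt}"


PROVIDERS: Dict[str, Provider] = {"ollama": _ollama, "claude": _claude, "openai": _openai}


class LLMSwitch:
    """Holds the active provider and lets callers change it without a restart."""

    def __init__(self, providers: Dict[str, Provider], active: str) -> None:
        self._providers = providers
        self._active = ""
        self.use(active)

    def use(self, name: str) -> None:
        """Select a provider by name, rejecting names that are not registered."""
        if name not in self._providers:
            raise ValueError(f"unknown provider: {name}")
        self._active = name

    def complete(self, prompt: str) -> str:
        """Send the prompt to whichever provider is currently active."""
        return self._providers[self._active](prompt)
```

Keeping every provider behind the same `str -> str` signature is what lets the router and FSM stay unaware of which backend is answering.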

How it runs

sequenceDiagram
    participant User as iPhone User
    participant UI as Safari UI
    participant GW as Mac Gateway (aiohttp)
    participant Eng as Engine (FSM/Router)
    participant LLM as LLM Provider
    participant TTS as Piper TTS
    
    User->>UI: Hold to Talk (Mic Input)
    UI->>GW: Send Audio via WebSocket
    GW->>Eng: Forward Audio Data
    Eng->>Eng: Whisper STT Transcription
    Eng->>Eng: Keyword Router Decision
    alt Complex Query
        Eng->>LLM: FSM State Prompt (e.g., Initial Lookup)
        LLM-->>Eng: Search Query/Reasoning
        Eng->>Eng: Execute Tool (Web Search)
        Eng->>LLM: Next FSM State (Synthesize)
        LLM-->>Eng: Final Answer Text
    else Simple Query
        Eng->>LLM: Direct Chat Completion
        LLM-->>Eng: Response Text
    end
    Eng->>TTS: Generate Audio from Text
    TTS-->>Eng: Audio Chunks (PCM)
    Eng->>GW: Stream Audio Chunks
    GW->>UI: WebRTC Audio Track
    UI->>User: Play Response Audio
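
The complex-query branch of the diagram (initial lookup, tool call, synthesis) can be sketched as a tiny state machine. This is an illustrative sketch, not the project's WorkflowRunner: the state names, `run_fsm`, `make_research_workflow`, and the `llm` and `search` callables are all hypothetical, and both callables are injected so the flow can run without any network access.

```python
from typing import Callable, Dict, List, Optional

Context = Dict[str, str]
# A state handler mutates the shared context and returns the next state name,
# or None when the workflow is finished.
Handler = Callable[[Context], Optional[str]]


def run_fsm(states: Dict[str, Handler], start: str, ctx: Context,
            max_steps: int = 10) -> List[str]:
    """Run handlers until one returns None; return the list of visited states."""
    trace: List[str] = []
    state: Optional[str] = start
    while state is not None:
        if len(trace) >= max_steps:
            raise RuntimeError("workflow exceeded max_steps")
        trace.append(state)
        state = states[state](ctx)
    return trace


def make_research_workflow(llm: Callable[[str], str],
                           search: Callable[[str], str]) -> Dict[str, Handler]:
    """Build a three-state workflow: initial_lookup -> web_search -> synthesize."""

    def initial_lookup(ctx: Context) -> Optional[str]:
        ctx["search_query"] = llm(f"Write a web search query for: {ctx['question']}")
        return "web_search"

    def web_search(ctx: Context) -> Optional[str]:
        ctx["results"] = search(ctx["search_query"])
        return "synthesize"

    def synthesize(ctx: Context) -> Optional[str]:
        ctx["answer"] = llm(f"Answer '{ctx['question']}' using: {ctx['results']}")
        return None

    return {"initial_lookup": initial_lookup, "web_search": web_search,
            "synthesize": synthesize}
```

The returned trace of visited states is the kind of data a step-by-step workflow debugger could display, and the `max_steps` guard keeps a misconfigured transition from looping forever.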

How to apply & reuse

Deploy on a Mac connected to the same network as your iPhone. Use it for privacy-focused voice assistance, local LLM experimentation, or as a template for building stateful voice agents with WebRTC audio streaming.

At a glance

Capabilities: Voice Interaction, WebRTC Streaming, Finite State Machine Execution, Local LLM Integration, Real-time Transcription, Text-to-Speech Synthesis

Components: engine/workflow.py, engine/adapter.py, engine/conversation.py, engine/fast_path.py, engine/input_filter.py, engine/llm.py, gateway/server.py

Tech: Python, WebRTC, aiohttp, Whisper, Piper TTS, Ollama, Mermaid

Depends on: macOS Host, Python 3.10+, Ollama (optional), Anthropic API Key (optional), OpenAI API Key (optional)

Integrates with: iPhone Safari, Ollama, Claude API, OpenAI API, Web Search Tools

Patterns: Finite State Machine, Keyword Routing, Client-Server WebRTC, Sliding Window Context, Fast Path Optimization

Reuse tags: voice-agent, webrtc-audio, finite-state-machine, local-llm, python-backend, ios-web-app

⚠ Needs attention

unmerged_branch: dependabot/npm_and_yarn/web-app/npm_and_yarn-d1f9cb5775 is 1 commit ahead of the default branch
unmerged_branch: feature/admin-dashboard is 16 commits ahead of the default branch
open_pr: PR #1: Bump the npm_and_yarn group across 1 directory with 5 updates