Voice Calendar Scheduler FSM

A voice-driven assistant that schedules apartment viewings 24/7, built on Twilio/WebRTC, an 8-step FSM, and Google Calendar integration.

https://github.com/davidbmar/voice-calendar-scheduler-FSM  ·  public  ·  shipped

What it is

An automated voice agent that handles inbound phone calls and browser-based WebRTC connections to schedule apartment viewings. A Finite State Machine (FSM) guides each caller through preference gathering, apartment search via retrieval-augmented generation (RAG), availability checking, and final booking on Google Calendar. Speech recognition uses Faster-Whisper and speech synthesis uses Piper TTS.

Quickstart

git clone --recursive https://github.com/davidbmar/voice-calendar-scheduler-FSM
cd voice-calendar-scheduler-FSM
./scripts/setup.sh
cp .env.example .env
$EDITOR .env
./scripts/start.sh

Architecture

flowchart TD
    Caller[Caller Phone/Browser] -->|Twilio PSTN| Twilio[Twilio Media Streams]
    Caller -->|WebRTC| Gateway[Gateway Signaling WS]
    Twilio --> Channel[TwilioMediaStreamChannel]
    Gateway --> Channel2[WebRTCChannel]
    Channel --> Session[SchedulingSession]
    Channel2 --> Session
    Session --> STT[STT: Faster-Whisper]
    Session --> FSM[FSM Orchestrator]
    Session --> TTS[TTS: Piper]
    FSM --> LLM[LLM: Claude/Ollama]
    FSM --> Tools[Tool Framework]
    Tools --> RAG[RAG Service: LanceDB]
    Tools --> GCal[Google Calendar API]
    STT --> FSM
    FSM --> TTS
    TTS --> Channel
    TTS --> Channel2
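
In the diagram, both transports converge on the same SchedulingSession. One way to picture this is a small transport interface that the session depends on, so the session never needs to know whether audio arrived over PSTN or WebRTC. The sketch below is an illustration only: `AudioChannel`, `FakeChannel`, and `run_session` are hypothetical names, not the engine's actual classes, and the `respond` callback stands in for the whole STT, FSM, and TTS chain.

```python
from typing import Callable, Optional, Protocol


class AudioChannel(Protocol):
    """Hypothetical transport interface; the real channel classes may differ."""

    def receive(self) -> Optional[bytes]: ...

    def send(self, audio: bytes) -> None: ...


class FakeChannel:
    """In-memory stand-in for a Twilio or WebRTC channel, useful in tests."""

    def __init__(self, inbound: list[bytes]) -> None:
        self._inbound = list(inbound)
        self.sent: list[bytes] = []

    def receive(self) -> Optional[bytes]:
        # Return the next inbound chunk, or None once the caller hangs up.
        return self._inbound.pop(0) if self._inbound else None

    def send(self, audio: bytes) -> None:
        self.sent.append(audio)


def run_session(channel: AudioChannel, respond: Callable[[bytes], bytes]) -> int:
    """Reply to every inbound chunk and return the number of turns handled."""
    turns = 0
    while (chunk := channel.receive()) is not None:
        channel.send(respond(chunk))
        turns += 1
    return turns
```

Because `run_session` only relies on `receive` and `send`, any transport that offers those two methods could be plugged in without changing the session logic.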

How it's built

Built with Python 3.11+ and FastAPI. The core logic resides in a git submodule (`engine-repo`) providing the FSM orchestrator, STT/TTS pipelines, and LLM abstraction. The application layer (`scheduling/`) implements domain-specific tools (Google Calendar, RAG search). A separate `gateway/` module handles WebRTC signaling and TURN credentials. Apartment data is indexed in LanceDB for RAG retrieval.
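
To illustrate how domain-specific tools might be exposed to an orchestrator, here is a minimal registry sketch. It is not the engine's actual tool framework: `Tool`, `ToolRegistry`, and `search_listings` are assumed names, and a small in-memory list stands in for the LanceDB-backed RAG search.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Tool:
    name: str
    description: str
    handler: Callable[..., Any]


class ToolRegistry:
    """Hypothetical name-to-handler registry an FSM step could dispatch through."""

    def __init__(self) -> None:
        self._tools: dict[str, Tool] = {}

    def register(self, name: str, description: str) -> Callable:
        def decorator(fn: Callable[..., Any]) -> Callable[..., Any]:
            self._tools[name] = Tool(name, description, fn)
            return fn

        return decorator

    def call(self, name: str, **kwargs: Any) -> Any:
        if name not in self._tools:
            raise KeyError(f"unknown tool: {name}")
        return self._tools[name].handler(**kwargs)


registry = ToolRegistry()

# Made-up sample data; a real deployment would query the vector store instead.
SAMPLE_LISTINGS = [
    {"id": "apt-1", "rent": 1500, "bedrooms": 1},
    {"id": "apt-2", "rent": 2100, "bedrooms": 2},
    {"id": "apt-3", "rent": 1900, "bedrooms": 2},
]


@registry.register("search_listings", "Filter listings by rent and bedrooms")
def search_listings(max_rent: int, bedrooms: int) -> list[dict]:
    return [
        item
        for item in SAMPLE_LISTINGS
        if item["rent"] <= max_rent and item["bedrooms"] == bedrooms
    ]
```

A step would then invoke `registry.call("search_listings", max_rent=2000, bedrooms=2)` and pass the matches back into the conversation context.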

How it runs

sequenceDiagram
    participant C as Caller
    participant G as Gateway/Channel
    participant S as SchedulingSession
    participant F as FSM Orchestrator
    participant T as Tools (RAG/GCal)
    
    C->>G: Connect (PSTN/WebRTC)
    G->>S: Initialize Session
    S->>F: Start FSM
    F->>S: Step 1: Greet & Gather Preferences
    S->>C: Play Greeting (TTS)
    C->>S: Speak Preferences (STT)
    S->>F: Update Context
    F->>T: Step 2: Search Listings
    T-->>F: Return Matches
    F->>S: Step 3: Present Options
    S->>C: Narrate Options (TTS)
    C->>S: Select Option (STT)
    F->>T: Step 4: Check Availability
    T-->>F: Return Free Slots
    F->>S: Step 5: Propose Times
    S->>C: Offer Slots (TTS)
    C->>S: Confirm Slot (STT)
    F->>S: Step 6: Collect Details
    S->>C: Ask Name/Email (TTS)
    C->>S: Provide Details (STT)
    F->>T: Step 7: Create Booking
    T-->>F: Event Created
    F->>S: Step 8: Confirm & Done
    S->>C: Confirmation (TTS)
    S->>G: Close Session
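
The eight steps in the diagram can be sketched as a linear state machine. This is a simplified happy-path illustration, not the engine's orchestrator: `Step` and `SchedulingFSM` are hypothetical names, and a real flow would likely need extra transitions (for example, returning to preference gathering when the search finds no matches).

```python
from enum import Enum


class Step(Enum):
    GATHER_PREFERENCES = 1
    SEARCH_LISTINGS = 2
    PRESENT_OPTIONS = 3
    CHECK_AVAILABILITY = 4
    PROPOSE_TIMES = 5
    COLLECT_DETAILS = 6
    CREATE_BOOKING = 7
    CONFIRM_DONE = 8


class SchedulingFSM:
    """Happy-path sketch: each advance() moves to the next numbered step."""

    def __init__(self) -> None:
        self.step = Step.GATHER_PREFERENCES
        self.context: dict[str, object] = {}

    @property
    def done(self) -> bool:
        return self.step is Step.CONFIRM_DONE

    def advance(self, **updates: object) -> Step:
        if self.done:
            raise RuntimeError("conversation already finished")
        # Merge whatever the caller or a tool supplied during this step.
        self.context.update(updates)
        self.step = Step(self.step.value + 1)
        return self.step


fsm = SchedulingFSM()
fsm.advance(bedrooms=2, max_rent=2000)  # caller states preferences
fsm.advance(matches=["apt-3"])          # search tool returns listings
```

Keeping the accumulated answers in a single `context` dictionary means later steps, such as creating the booking, can read everything gathered earlier in the call.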

How to apply & reuse

Clone the repository with `--recursive` so the engine submodule is included. Run `./scripts/setup.sh` to create a virtual environment and install dependencies. Copy `.env.example` to `.env` and set the API keys for Twilio, Google Calendar, and your chosen LLM provider (Claude or Ollama). Then run `./scripts/start.sh` to start the backend, the RAG service, and the optional visual editor.
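
Before starting the services, it can help to check that the required settings are present. The sketch below shows one way to do that. Every variable name in it is an assumption made for illustration; the authoritative list is whatever `.env.example` in the repository defines.

```python
import os
from collections.abc import Mapping

# Assumed variable names, for illustration only; check .env.example for the real ones.
COMMON_KEYS = (
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
    "GOOGLE_SERVICE_ACCOUNT_FILE",
)
PROVIDER_KEYS = {
    "claude": ("ANTHROPIC_API_KEY",),
    "ollama": ("OLLAMA_HOST",),
}


def missing_settings(env: Mapping[str, str], provider: str) -> list[str]:
    """Return the required keys that are absent or blank for the given provider."""
    if provider not in PROVIDER_KEYS:
        raise ValueError(f"unsupported LLM provider: {provider}")
    required = COMMON_KEYS + PROVIDER_KEYS[provider]
    return [key for key in required if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_settings(os.environ, "claude")
    if missing:
        print("Missing settings:", ", ".join(missing))
```

Taking the environment as a plain mapping rather than reading `os.environ` directly keeps the check easy to test with an ordinary dictionary.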

At a glance

Capabilities: Voice Interaction, Finite State Machine Orchestration, Retrieval-Augmented Generation (RAG), Calendar Integration, WebRTC Signaling, Telephony Integration
Components: Engine Submodule, Scheduling App, Gateway Server, RAG Service, Visual Editor
Tech: Python 3.11+, FastAPI, Twilio, WebRTC, Faster-Whisper, Piper TTS, LanceDB, Google Calendar API, Docker
Depends on: Anthropic Claude API or Ollama, Twilio Account, Google Cloud Service Account, Node.js (optional), Docker Compose
Integrates with: Twilio PSTN, Google Calendar, LanceDB Vector Store, Browser WebRTC Clients
Patterns: Finite State Machine, Tool Use / Function Calling, Retrieval-Augmented Generation, Client-Server Signaling, Microservices (RAG)
Reuse tags: voice-agent, scheduling-bot, fsm-conversation, twilio-integration, webrtc-audio, calendar-automation

Repo hygiene

✓ all on main — nothing unmerged.