recording_app · davidbmar.com

What it is

A privacy-focused voice recording system that captures audio in the browser, streams it in chunks to a private AWS S3 bucket, and transcribes it with faster-whisper in a worker running locally on your Mac. It includes a web interface for live status tracking and supports intent recognition ('hey riff') and transcript hydration.

Features

Chunked audio streaming from browser to private S3 bucket
Local transcription worker using faster-whisper for privacy and speed
Live status updates for recordings in the web interface
HMAC-based stateless authentication and scoped tokens
Deterministic intent recognition layer for voice commands
Transcript hydration and idea generation via LLM integration
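
The stateless HMAC authentication listed above can be illustrated with a minimal sketch. This is not the project's `src/auth.py`: the token layout (base64url payload, a dot, base64url signature), the claim names `scope` and `exp`, and the function names `issue_token` and `verify_token` are all assumptions made for illustration. The idea it shows is that the server needs only the shared secret to check a token, so no session store is required.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64(data: bytes) -> str:
    # URL-safe base64 without padding, so the token is safe in headers and URLs.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    # Restore the padding stripped by _b64 before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(secret: bytes, scope: str, ttl_seconds: int,
                now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>' (hypothetical format) for one scope."""
    now = time.time() if now is None else now
    payload = json.dumps(
        {"scope": scope, "exp": int(now) + ttl_seconds},
        sort_keys=True, separators=(",", ":"),
    ).encode("utf-8")
    signature = hmac.new(secret, payload, hashlib.sha256).digest()
    return f"{_b64(payload)}.{_b64(signature)}"


def verify_token(secret: bytes, token: str, required_scope: str,
                 now: Optional[float] = None) -> bool:
    """Check signature first, then scope and expiry. No server-side state."""
    now = time.time() if now is None else now
    try:
        payload_b64, signature_b64 = token.split(".")
        payload = _unb64(payload_b64)
        signature = _unb64(signature_b64)
    except ValueError:
        return False  # malformed token
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the match position through timing.
    if not hmac.compare_digest(signature, expected):
        return False
    claims = json.loads(payload)
    return claims.get("scope") == required_scope and claims.get("exp", 0) > now
```

In this sketch a token issued with scope `"upload"` is rejected for any other scope and after `ttl_seconds` have passed. How the real application derives the key from `TOKEN_SECRET` is not stated in the source.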

Quickstart

# 1. Configure and deploy the web backend
cp env.sample .env
export TOKEN_SECRET="$(openssl rand -hex 32)"
./scripts/deploy.sh

# 2. Set up and start the local worker
cd /path/to/recording_app
python3 -m venv .wenv
.wenv/bin/pip install boto3
.wenv/bin/python -m worker.worker --bucket <bucket> --region us-east-2 --stub

Architecture

flowchart TD
    User[User Browser] -->|MediaStream API| Client[Web App Client]
    Client -->|Chunked Audio POST| Lambda[AWS Lambda Web App]
    Lambda -->|Auth Check| Auth[src/auth.py]
    Lambda -->|PutObject| S3[(AWS S3 Bucket)]
    Worker -->|Poll/ListObjects| S3
    Worker -->|Download Audio| S3
    Worker -->|Transcribe| Whisper[faster-whisper]
    Worker -->|Upload Transcript| S3
    Worker -->|Hydrate| LLM[DashScope/Qwen API]
    Lambda -->|Get Status| S3
    Client -->|Poll Status| Lambda

How it's built

The system has two parts: a Python AWS Lambda web application, served via a Function URL, that handles uploads and auth; and a separate Python worker process that polls S3 for new recordings. The worker uses `faster-whisper` for transcription and `boto3` for S3 access. The frontend handles media streaming and intent resolution in vanilla JavaScript.
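
As a rough sketch of the polling step, the worker can list the bucket and treat any audio object without a matching transcript as pending. The key layout here (`recordings/<id>/audio.webm` and `recordings/<id>/transcript.txt`) and the function names are hypothetical; the source does not describe the real layout. The only S3 call used is `list_objects_v2`, which exists on a `boto3` S3 client, so the `s3` argument can be either that client or a test double.

```python
from typing import Any, Dict, Iterator, List

# Hypothetical key layout, chosen only for this illustration.
AUDIO_SUFFIX = "/audio.webm"
TRANSCRIPT_SUFFIX = "/transcript.txt"


def list_keys(s3: Any, bucket: str, prefix: str) -> Iterator[str]:
    """Yield every key under prefix, following continuation tokens."""
    kwargs: Dict[str, str] = {"Bucket": bucket, "Prefix": prefix}
    while True:
        page = s3.list_objects_v2(**kwargs)
        # "Contents" is absent when a page has no objects.
        for obj in page.get("Contents", []):
            yield obj["Key"]
        if not page.get("IsTruncated"):
            return
        kwargs["ContinuationToken"] = page["NextContinuationToken"]


def pending_recordings(s3: Any, bucket: str,
                       prefix: str = "recordings/") -> List[str]:
    """Return audio keys that have no transcript next to them yet."""
    keys = set(list_keys(s3, bucket, prefix))
    pending = []
    for key in sorted(keys):
        if key.endswith(AUDIO_SUFFIX):
            base = key[: -len(AUDIO_SUFFIX)]
            if base + TRANSCRIPT_SUFFIX not in keys:
                pending.append(key)
    return pending
```

With a real client this would be called as `pending_recordings(boto3.client("s3", region_name="us-east-2"), "<bucket>")` inside a sleep loop. Because the check depends only on what is in the bucket, the worker keeps no local state and can be restarted safely.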

How it runs

sequenceDiagram
    participant U as User
    participant B as Browser
    participant L as Lambda API
    participant S as S3 Bucket
    participant W as Local Worker
    participant M as Whisper Model

    U->>B: Start Recording
    B->>B: Capture MediaStream
    loop Every Chunk
        B->>L: POST audio chunk
        L->>L: Verify HMAC Token
        L->>S: Upload Chunk
    end
    B->>L: Finalize Recording
    W->>S: List New Objects
    S-->>W: Return Recording Key
    W->>S: Download Full Audio
    W->>M: Transcribe Audio
    M-->>W: Return Text
    W->>S: Upload Transcript.txt
    W->>S: Upload Hydrated.json
    U->>B: Refresh Page
    B->>L: Get Recording Status
    L->>S: Check Metadata
    S-->>L: Return Status
    L-->>B: JSON Status
    B-->>U: Display Transcript
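
One detail the chunk loop depends on is ordering: S3 lists keys in lexicographic order, so chunk numbers need zero padding for a listing to match upload order. The sketch below shows that idea with a hypothetical key layout (`recordings/<id>/chunks/<index>.webm`) and hypothetical helper names. It also assumes the chunks are consecutive slices of a single recorded stream, so that joining the bytes in order rebuilds the audio; the source does not say where or how the real system assembles chunks.

```python
from typing import Dict, Iterable, List


def chunk_key(recording_id: str, index: int) -> str:
    """Zero-padded index so lexicographic key order equals numeric order."""
    return f"recordings/{recording_id}/chunks/{index:06d}.webm"


def missing_indices(keys: Iterable[str]) -> List[int]:
    """Return chunk numbers absent from 0..max, e.g. after a failed POST."""
    present = {int(key.rsplit("/", 1)[1].split(".")[0]) for key in keys}
    if not present:
        return []
    return sorted(set(range(max(present) + 1)) - present)


def assemble(chunks: Dict[str, bytes]) -> bytes:
    """Join chunk bodies in key order (assumes slices of one stream)."""
    return b"".join(chunks[key] for key in sorted(chunks))
```

Without the padding, a key ending in `10.webm` would sort before one ending in `2.webm`, and the assembled audio would be out of order. Checking `missing_indices` before finalizing gives a simple way to detect a dropped chunk.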

How to apply & reuse

Deploy the web backend to AWS using the provided shell scripts and IAM policies. Run the transcription worker locally on a machine with enough CPU or GPU capacity for faster-whisper, configured with least-privilege AWS credentials. Memos recorded in the web app are then transcribed automatically by the local worker.

At a glance

Capabilities: Voice Recording, Audio Streaming, Speech-to-Text, Cloud Storage Integration, Local Processing, Intent Recognition

Components: Web Client (JS/HTML), AWS Lambda Backend, S3 Storage, Local Python Worker, Auth Module, Intent Resolver

Tech: Python, JavaScript, AWS Lambda, AWS S3, faster-whisper, boto3, Pygments, Mermaid

Depends on: boto3, faster-whisper, openai, markdown, pytest

Integrates with: AWS S3, AWS Lambda, DashScope (Qwen), Alibaba Cloud Model Studio

Patterns: Chunked Upload, Worker Pool, Stateless Auth, Polling, Least Privilege Access

Reuse tags: voice-memos, transcription, aws-serverless, local-ai, privacy-first

⚠ Needs attention

unmerged_branch: feat/intent-hydrate-from-server is 8 commits ahead of the default branch
unmerged_branch: feat/l1-intent-routing is 1 commit ahead of the default branch
unmerged_branch: feat/l2-intent-catalog is 2 commits ahead of the default branch
unmerged_branch: fix/deploy-pin-recording-deployer is 2 commits ahead of the default branch
open_pr: PR #15: fix(deploy): pin recording-deployer profile + fail-fast identity guard