audio_client_server · davidbmar.com

What it is

This project implements a client-server architecture where a web browser captures audio input via the MediaRecorder API and sends it to a backend server. The server, built with Flask, receives the audio stream, optionally stores it (e.g., in S3), and performs real-time speech-to-text transcription. It includes authentication mechanisms using Auth0 for protected endpoints and supports both lightweight local processing and cloud-integrated workflows.

Features

Browser-based audio recording using Web APIs
Real-time audio streaming to a Python Flask backend
Speech-to-text transcription capabilities
Auth0 JWT-based authentication and authorization guards
Modular API structure with public, protected, and admin endpoints
Optional S3 integration for audio storage
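
When MediaRecorder is started with a timeslice it emits a recording as a series of chunks, and those chunks generally need to be concatenated in order before they can be decoded. The sketch below shows one way a server could reorder chunks that arrive out of sequence. It is illustrative only: the `ChunkBuffer` class and the per-chunk sequence number are assumptions, not something the project is known to implement.

```python
class ChunkBuffer:
    """Collects numbered audio chunks and releases them in order (hypothetical helper)."""

    def __init__(self):
        self._pending = {}            # seq -> bytes, waiting for earlier chunks
        self._next_seq = 0            # next sequence number that can be released
        self._assembled = bytearray() # everything released so far, in order

    def add(self, seq: int, data: bytes) -> bytes:
        """Store a chunk and return any bytes that just became contiguous."""
        if seq < self._next_seq or seq in self._pending:
            return b""                # duplicate or stale chunk, ignore it
        self._pending[seq] = data
        released = bytearray()
        while self._next_seq in self._pending:
            released += self._pending.pop(self._next_seq)
            self._next_seq += 1
        self._assembled += released
        return bytes(released)

    def assembled(self) -> bytes:
        """Return all audio bytes released so far."""
        return bytes(self._assembled)
```

With a buffer like this, a transcriber can be fed only the newly contiguous bytes returned by `add`, while `assembled` gives the full recording for optional storage.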

Architecture

flowchart TD
    Client[Web Browser] -->|Audio Stream| Server[Flask Server]
    Server -->|Validate Token| Auth0[Auth0 Service]
    Server -->|Transcribe| STT[Speech Engine]
    Server -->|Store| S3[(S3 Bucket)]
    Server -->|Response| Client

How it's built

The backend is a Python Flask application organized into blueprints for modularity. It validates Auth0 access tokens with PyJWT and guards routes according to the permissions those tokens carry. The frontend captures audio with standard HTML5 JavaScript APIs (the MediaRecorder API, per the description above). Services such as message handling and security validation are supplied through dependency injection.
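
The decorator-based guard pattern described here can be sketched without any framework. The example below assumes the token has already been validated and decoded into a `claims` dictionary, and that the Auth0 API has RBAC configured to add a `permissions` list to access tokens. The names `requires_permission`, `PermissionDenied`, and the `read:admin-messages` permission are hypothetical and not taken from the project.

```python
from functools import wraps


class PermissionDenied(Exception):
    """Raised when the token's claims lack a required permission."""


def requires_permission(permission):
    """Guard a handler so it runs only when `permission` is in the claims.

    `claims` stands in for an already-validated JWT payload; signature and
    expiry checks are assumed to have happened before the guard runs.
    """
    def decorator(handler):
        @wraps(handler)
        def wrapper(claims, *args, **kwargs):
            granted = claims.get("permissions", [])
            if permission not in granted:
                raise PermissionDenied(f"missing permission: {permission}")
            return handler(claims, *args, **kwargs)
        return wrapper
    return decorator


@requires_permission("read:admin-messages")  # hypothetical permission name
def admin_message(claims):
    """Example admin-only handler."""
    return {"text": f"admin message for {claims['sub']}"}
```

In a Flask app, the raised exception would typically be mapped to a 403 response by an error handler, which keeps the route functions free of authorization logic.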

How it runs

sequenceDiagram
    participant U as User
    participant B as Browser
    participant F as Flask Server
    participant A as Auth0
    participant T as Transcriber
    U->>B: Start Recording
    B->>F: POST Audio Chunk
    F->>A: Validate JWT
    A-->>F: Valid Token
    F->>T: Process Audio
    T-->>F: Text Result
    F-->>B: Return Transcription
    B-->>U: Display Text
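
The request flow in the diagram (validate the token, optionally store the audio, transcribe, respond) can be sketched as a small handler with injected collaborators. This is a minimal illustration under stated assumptions: `ChunkHandler`, `InvalidToken`, the `chunk_id` parameter, and the storage key layout are all hypothetical, and the fakes at the bottom stand in for Auth0 validation, the speech engine, and S3.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


class InvalidToken(Exception):
    """Raised by a validator when the bearer token is rejected."""


@dataclass
class ChunkHandler:
    # Collaborators are injected so tests can swap in fakes.
    validate_token: Callable[[str], dict]   # returns claims or raises InvalidToken
    transcribe: Callable[[bytes], str]      # audio bytes -> text
    store: Optional[Callable[[str, bytes], None]] = None  # optional, e.g. an S3 upload

    def handle(self, token: str, chunk_id: str, audio: bytes) -> Tuple[int, dict]:
        """Return an (HTTP status, JSON body) pair for one audio chunk."""
        try:
            claims = self.validate_token(token)
        except InvalidToken as exc:
            return 401, {"error": str(exc)}
        if not audio:
            return 400, {"error": "empty audio chunk"}
        if self.store is not None:
            # Hypothetical key layout: one prefix per authenticated user.
            self.store(f"{claims['sub']}/{chunk_id}", audio)
        return 200, {"chunk_id": chunk_id, "text": self.transcribe(audio)}


def fake_validate(token):
    """Stand-in for real JWT validation."""
    if token != "good-token":
        raise InvalidToken("invalid token")
    return {"sub": "user-1"}


saved = {}  # stand-in for an S3 bucket
handler = ChunkHandler(
    validate_token=fake_validate,
    transcribe=lambda audio: f"{len(audio)} bytes heard",
    store=saved.__setitem__,
)
print(handler.handle("good-token", "chunk-0", b"\x00\x01"))
```

Because each collaborator is a plain callable, the same handler could be wired to real services in production and to fakes in unit tests without changing its logic.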

How to apply & reuse

Deploy this system to create secure, voice-enabled web applications. Use cases include voice notes, real-time meeting transcription, or accessibility tools where users speak into their browser and receive immediate text feedback. The Auth0 integration makes it suitable for multi-user environments requiring role-based access control.

At a glance

Capabilities: Audio Capture, Real-time Streaming, Speech Recognition, JWT Authentication, API Security

Components: Flask Blueprint API, Auth0 Service, Message Service, Security Guards, Exception Handlers

Tech: Python, Flask, JavaScript, PyJWT, HTML5 MediaRecorder

Depends on: Flask, PyJWT, Werkzeug, Requests

Integrates with: Auth0, AWS S3, Web Browsers

Patterns: Client-Server, Blueprint Modularity, Decorator-based Security, Service Layer

Reuse tags: audio-processing, flask-api, auth0-integration, real-time-transcription, web-audio

⚠ Needs attention

unmerged_branch: dependabot/npm_and_yarn/auth0/spa_react_javascript/npm_and_yarn-2be76d4b41 is 1 commit ahead of the default branch
unmerged_branch: dependabot/pip/auth0/api_flask_python/pip-31294bf0f8 is 1 commit ahead of the default branch
open_pr: PR #2: Bump the npm_and_yarn group across 1 directory with 30 updates
open_pr: PR #1: Bump the pip group across 2 directories with 9 updates