audio_client_server

A web-based audio recording client that streams sound to a Python Flask server for real-time transcription and processing.

https://github.com/davidbmar/audio_client_server  ·  public  ·  shipped

What it is

This project implements a client-server architecture where a web browser captures audio input via the MediaRecorder API and sends it to a backend server. The server, built with Flask, receives the audio stream, optionally stores it (e.g., in S3), and performs real-time speech-to-text transcription. It includes authentication mechanisms using Auth0 for protected endpoints and supports both lightweight local processing and cloud-integrated workflows.
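
A browser recorder that emits audio in timed slices produces chunks that are only playable once they are joined in order, so the server needs some way to reassemble them before storing a recording. The sketch below shows one way to do that. The class name `ChunkAssembler`, the `recording_id` key, and the `seq` counter are hypothetical and do not come from the repository.

```python
from collections import defaultdict
from typing import DefaultDict, Dict


class ChunkAssembler:
    """Collect numbered audio chunks per recording and join them in order.

    Hypothetical helper: the repository may buffer or store chunks differently.
    """

    def __init__(self) -> None:
        # recording_id -> {sequence number -> raw bytes}
        self._chunks: DefaultDict[str, Dict[int, bytes]] = defaultdict(dict)

    def add(self, recording_id: str, seq: int, data: bytes) -> None:
        """Remember one chunk; chunks may arrive out of order."""
        self._chunks[recording_id][seq] = data

    def finish(self, recording_id: str) -> bytes:
        """Return the full recording, or raise if a chunk is missing."""
        parts = self._chunks.get(recording_id, {})
        missing = [i for i in range(len(parts)) if i not in parts]
        if missing:
            raise ValueError(f"missing chunks: {missing}")
        self._chunks.pop(recording_id, None)
        return b"".join(parts[i] for i in range(len(parts)))


assembler = ChunkAssembler()
assembler.add("rec-1", 1, b"world")   # arrives first
assembler.add("rec-1", 0, b"hello ")  # arrives second
recording = assembler.finish("rec-1")
```

The joined bytes could then be handed to whatever storage step is configured, for example an S3 upload.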

Features

Architecture

flowchart TD
    Client[Web Browser] -->|Audio Stream| Server[Flask Server]
    Server -->|Validate Token| Auth0[Auth0 Service]
    Server -->|Transcribe| STT[Speech Engine]
    Server -->|Store| S3[(S3 Bucket)]
    Server -->|Response| Client

How it's built

The backend is a Python Flask application structured with blueprints for modularity. It uses PyJWT to validate Auth0 access tokens and guards routes based on permissions. The frontend likely relies on the browser's MediaRecorder API for audio capture. Services such as message handling and security validation are supplied through dependency injection.
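
The permission guard described above can be sketched as a decorator with an injected token decoder. This is a minimal illustration, not the repository's code: the request is modelled as a plain dict of headers so the example runs without Flask, `require_permission` and `fake_decode` are hypothetical names, and the `permissions` claim assumes Auth0 is configured to include permissions in the access token. In a real deployment the decoder would verify the token signature, for instance with PyJWT.

```python
from functools import wraps
from typing import Any, Callable, Dict, Tuple

# A handler returns (JSON-like body, HTTP status code).
Response = Tuple[Dict[str, Any], int]
# Injected dependency: turns a raw token into its claims or raises ValueError.
TokenDecoder = Callable[[str], Dict[str, Any]]


def require_permission(permission: str, decode_token: TokenDecoder):
    """Build a decorator that rejects calls whose token lacks `permission`."""

    def decorator(handler: Callable[..., Response]) -> Callable[..., Response]:
        @wraps(handler)
        def wrapper(headers: Dict[str, str], *args: Any, **kwargs: Any) -> Response:
            scheme, _, token = headers.get("Authorization", "").partition(" ")
            if scheme.lower() != "bearer" or not token:
                return {"error": "missing bearer token"}, 401
            try:
                claims = decode_token(token)
            except ValueError:
                return {"error": "invalid token"}, 401
            # Assumption: permissions are carried in a "permissions" claim.
            if permission not in claims.get("permissions", []):
                return {"error": "permission denied"}, 403
            return handler(headers, *args, claims=claims, **kwargs)

        return wrapper

    return decorator


def fake_decode(token: str) -> Dict[str, Any]:
    """Stand-in decoder for the demo; it performs NO real JWT verification."""
    known = {
        "good": {"sub": "user-1", "permissions": ["upload:audio"]},
        "weak": {"sub": "user-2", "permissions": []},
    }
    if token not in known:
        raise ValueError("unknown token")
    return known[token]


@require_permission("upload:audio", fake_decode)
def upload_audio(headers: Dict[str, str], claims: Dict[str, Any]) -> Response:
    """Hypothetical protected handler that reports who uploaded."""
    return {"accepted_for": claims["sub"]}, 200
```

Passing the decoder in as an argument keeps the guard testable: the fake decoder can be swapped for a real validator without touching the handlers.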

How it runs

sequenceDiagram
    participant U as User
    participant B as Browser
    participant F as Flask Server
    participant A as Auth0
    participant T as Transcriber
    U->>B: Start Recording
    B->>F: POST Audio Chunk
    F->>A: Validate JWT
    A-->>F: Valid Token
    F->>T: Process Audio
    T-->>F: Text Result
    F-->>B: Return Transcription
    B-->>U: Display Text
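
The sequence above can be sketched as a small pipeline whose collaborators are injected, which also makes the optional storage step explicit. Everything here is an assumption for illustration: `ChunkPipeline`, its field names, and the response shape are hypothetical, and the lambdas stand in for the Auth0 check, the speech engine, and an S3 upload.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class ChunkPipeline:
    """Handle one audio chunk: validate, optionally store, then transcribe."""

    validate_token: Callable[[str], bool]            # stands in for the JWT check
    transcribe: Callable[[bytes], str]               # stands in for the speech engine
    store: Optional[Callable[[bytes], None]] = None  # optional, e.g. an S3 upload

    def handle_chunk(self, token: str, audio: bytes) -> Dict[str, object]:
        # Step 1: reject the request before doing any work on the audio.
        if not self.validate_token(token):
            return {"status": 401, "error": "invalid token"}
        if not audio:
            return {"status": 400, "error": "empty audio chunk"}
        # Step 2: persist the chunk only when a store was configured.
        if self.store is not None:
            self.store(audio)
        # Step 3: return the transcription to the caller.
        return {"status": 200, "text": self.transcribe(audio)}


# Demo wiring with fakes; a real deployment would inject real services.
stored: List[bytes] = []
pipeline = ChunkPipeline(
    validate_token=lambda token: token == "good",
    transcribe=lambda audio: f"{len(audio)} bytes heard",
    store=stored.append,
)
accepted = pipeline.handle_chunk("good", b"abc")
rejected = pipeline.handle_chunk("bad", b"abc")
```

Leaving `store` unset gives the lightweight local mode, while supplying an uploader gives the cloud-integrated one, without changing the handler.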

How to apply & reuse

Deploy this system to create secure, voice-enabled web applications. Use cases include voice notes, real-time meeting transcription, or accessibility tools where users speak into their browser and receive immediate text feedback. The Auth0 integration makes it suitable for multi-user environments requiring role-based access control.

At a glance

Capabilities: Audio Capture, Real-time Streaming, Speech Recognition, JWT Authentication, API Security
Components: Flask Blueprint API, Auth0 Service, Message Service, Security Guards, Exception Handlers
Tech: Python, Flask, JavaScript, PyJWT, HTML5 MediaRecorder
Depends on: Flask, PyJWT, Werkzeug, Requests
Integrates with: Auth0, AWS S3, Web Browsers
Patterns: Client-Server, Blueprint Modularity, Decorator-based Security, Service Layer
Reuse tags: audio-processing, flask-api, auth0-integration, real-time-transcription, web-audio
