whisper-runpod

Containerized Faster Whisper transcription service optimized for RunPod with S3 integration.

https://github.com/davidbmar/whisper-runpod  ·  public  ·  shipped

What it is

A Docker-based service that runs a Faster Whisper speech-to-text API server and is designed for deployment on RunPod.io. It includes utility scripts that fetch audio files from AWS S3 and submit them to the local API endpoint for transcription.

Features

Quickstart

docker build -t yourusername/whisper-runpod:latest .
docker push yourusername/whisper-runpod:latest
docker run -p 8000:8000 yourusername/whisper-runpod:latest
curl http://localhost:8000/v1/audio/transcriptions -F "file=@your-audio-file.mp3" -F "language=en"
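
For clients that cannot shell out to curl, the same request can be sketched in Python using only the standard library. This is an illustrative sketch, not code from the repository: the function names build_transcription_request and transcribe are hypothetical, and it assumes the endpoint accepts the same file and language form fields as the curl call above and answers with JSON.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_transcription_request(url, audio_path, language="en"):
    """Build a multipart/form-data POST mirroring `curl -F file=@... -F language=...`."""
    audio = Path(audio_path)
    boundary = uuid.uuid4().hex  # random separator between form parts
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="language"\r\n\r\n'
        f"{language}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{audio.name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return urllib.request.Request(
        url,
        data=head + audio.read_bytes() + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


def transcribe(url, audio_path, language="en", timeout=300):
    """Send the request and return the decoded JSON body (assumed to be JSON)."""
    request = build_transcription_request(url, audio_path, language)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# Usage, with the container from the Quickstart running locally:
# result = transcribe("http://localhost:8000/v1/audio/transcriptions", "your-audio-file.mp3")
```

Splitting request construction from sending keeps the multipart encoding checkable without a running server.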

Architecture

flowchart TD
    User[User/Client] -->|HTTP POST| API[Faster Whisper API :8000]
    Script[S3 Transcribe Script] -->|aws s3 cp| S3[(AWS S3 Bucket)]
    S3 -->|Audio File| Script
    Script -->|curl POST| API
    API -->|Process| Model[Faster Whisper Engine]
    Model -->|Text Output| API
    API -->|JSON Response| User
    API -->|JSON Response| Script
    Entrypoint[entrypoint.sh] -->|Manages| API

How it's built

The project uses a Dockerfile to build an image containing the Faster Whisper engine and its dependencies. An entrypoint script (entrypoint.sh) manages the server lifecycle so that the API stays up inside the container. Shell scripts handle external interactions: they download audio files from S3 with the AWS CLI (aws s3 cp) and send HTTP POST requests to the transcription endpoint.
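
One common way to keep a server process alive is a restart loop around the child process. The sketch below illustrates that general idea in Python; it is an assumption about the kind of logic involved, not a transcription of entrypoint.sh, whose contents are not shown here. The name keep_alive and its parameters are hypothetical.

```python
import subprocess
import time


def keep_alive(command, max_restarts=3, delay_seconds=1.0):
    """Run `command`, restarting it whenever it exits.

    Returns the list of exit codes observed once the restart budget
    (`max_restarts` restarts after the initial run) is used up.
    """
    exit_codes = []
    while True:
        completed = subprocess.run(command)  # blocks until the child exits
        exit_codes.append(completed.returncode)
        if len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay_seconds)  # brief pause before restarting


# Usage (the server command is a placeholder, not the real one):
# keep_alive(["<server-start-command>"], max_restarts=5)
```

A bounded restart budget is a design choice in this sketch; a real entrypoint might instead loop indefinitely or rely on the container runtime's restart policy.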

How it runs

sequenceDiagram
    participant U as User/Script
    participant S as S3 Bucket
    participant A as API Server (Port 8000)
    participant M as Faster Whisper Model
    
    alt S3 Workflow
        U->>S: aws s3 cp (Download Audio)
        S-->>U: Return Audio File
        U->>A: POST /v1/audio/transcriptions (File + Language)
    else Direct Workflow
        U->>A: POST /v1/audio/transcriptions (Local File)
    end
    
    A->>M: Process Audio Stream
    M-->>A: Return Transcribed Text
    A-->>U: JSON Response with Text

How to apply & reuse

Deploy the built Docker image to a RunPod instance or run it locally. Use the provided shell scripts to automate the workflow of fetching remote audio files from S3 buckets and posting them to the running container's API for text extraction.
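
The S3 workflow described above (download with aws s3 cp, then POST to the API) can be sketched as follows. This is a hedged illustration rather than a port of the repository's shell script: the names s3_copy_command and fetch_and_transcribe are hypothetical, the bucket and key in the usage note are placeholders, and post_audio stands for any callable that submits a local file to the transcription endpoint. It assumes the AWS CLI is installed and credentials are configured.

```python
import subprocess
import tempfile
from pathlib import Path


def s3_copy_command(bucket, key, destination):
    """Return the `aws s3 cp` argument list for downloading one object."""
    return ["aws", "s3", "cp", f"s3://{bucket}/{key}", str(destination)]


def fetch_and_transcribe(bucket, key, post_audio, run=subprocess.run):
    """Download an audio object to a temporary directory, then submit it.

    `post_audio` receives the local Path and returns the API response.
    `run` is injectable so the download step can be replaced in tests.
    """
    with tempfile.TemporaryDirectory() as workdir:
        local_path = Path(workdir) / Path(key).name
        run(s3_copy_command(bucket, key, local_path), check=True)
        # The temporary file only exists inside this block, so post here.
        return post_audio(local_path)


# Usage (placeholder bucket/key; `submit` is any function that POSTs the file):
# response = fetch_and_transcribe("my-bucket", "recordings/a.mp3", submit)
```

Passing check=True makes a failed download raise immediately instead of posting a missing file to the API.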

At a glance

Capabilities: Speech-to-Text Transcription, S3 Integration, API Service Hosting, Containerized Deployment
Components: entrypoint.sh, test_transcribe_by_fasterWhisperAPI_fromS3.sh, Dockerfile
Tech: Docker, Shell Scripting, Faster Whisper, Python, AWS CLI
Depends on: Docker Runtime, AWS Credentials, RunPod Account (Optional)
Integrates with: AWS S3, RunPod.io, OpenAI-compatible Clients
Patterns: Microservice, Worker Pattern, Sidecar Scripting
Reuse tags: speech-to-text, whisper, runpod, docker, s3, transcription

Repo hygiene

✓ all on main — nothing unmerged.