Smart Transcription Router

Hybrid AWS architecture that routes audio transcription requests either to a real-time GPU-backed FastAPI server or to batch SQS processing, depending on server health.

https://github.com/davidbmar/smart-transcription-router  ·  public  ·  shipped

What it is

A serverless routing layer for audio transcription requests. It checks whether a GPU-backed FastAPI server is available; if so, it sends the request there directly for low-latency results. If the server is down or busy, the request falls back to an SQS queue for deferred batch processing. The design aims at both low cost and reliability.
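
The routing decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the repository's code: the function names, the idea of a dedicated health URL, and the 2-second timeout are all assumptions.

```python
import urllib.error
import urllib.request


def fastapi_is_healthy(health_url: str, timeout_s: float = 2.0) -> bool:
    """Return True only if the (assumed) health endpoint answers HTTP 200 in time."""
    try:
        with urllib.request.urlopen(health_url, timeout=timeout_s) as resp:
            return resp.status == 200
    except (urllib.error.URLError, TimeoutError, OSError):
        # Connection refused, DNS failure, HTTP error, timeout: all count as unhealthy.
        return False


def choose_route(health_url: str, probe=fastapi_is_healthy) -> str:
    """Pick 'realtime' when the GPU server answers the probe, otherwise 'batch'."""
    return "realtime" if probe(health_url) else "batch"
```

Passing the probe as a parameter keeps the decision testable without a live server, for example `choose_route(url, probe=lambda _: False)` always selects the batch path.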

Features

Quickstart

./scripts/step-000-setup-configuration.sh
./scripts/step-001-validate-configuration.sh
./scripts/step-010-setup-iam-permissions.sh
./scripts/step-011-validate-iam-permissions.sh
./scripts/step-020-create-sqs-resources.sh
./scripts/step-021-validate-sqs-resources.sh
./scripts/step-340-deploy-lambda-router.sh
./scripts/step-341-configure-eventbridge-trigger.sh
./scripts/step-342-test-lambda-router.sh

Architecture

flowchart TD
    A[Audio Upload] -->|EventBridge| B(Lambda Router)
    B -->|Health Check| C{FastAPI Healthy?}
    C -->|Yes| D[FastAPI Server GPU]
    C -->|No| E[SQS Queue]
    E -->|Scheduled Trigger| F[Batch Worker GPU]
    D --> G[(S3 Storage)]
    F --> G
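
The batch branch of the flowchart (SQS Queue to Batch Worker) can be sketched as a drain loop. The sketch assumes a client object with `receive_message` and `delete_message` methods shaped like the boto3 SQS client; `drain_queue`, `transcribe`, and `max_batches` are hypothetical names, and the real worker may be structured differently.

```python
def drain_queue(sqs, queue_url: str, transcribe, max_batches: int = 10) -> int:
    """Process queued messages until the queue looks empty; return how many succeeded."""
    handled = 0
    for _ in range(max_batches):
        resp = sqs.receive_message(
            QueueUrl=queue_url, MaxNumberOfMessages=10, WaitTimeSeconds=5
        )
        messages = resp.get("Messages", [])
        if not messages:
            break  # Nothing left to process in this run.
        for msg in messages:
            try:
                transcribe(msg["Body"])
            except Exception:
                # Leave the message undeleted so SQS can redeliver it later.
                continue
            # Delete only after success, so a crash never loses work.
            sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
            handled += 1
    return handled
```

Deleting a message only after a successful transcription means a failed job reappears once its visibility timeout expires instead of being dropped.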

How it's built

The routing logic runs on AWS Lambda (Python/Shell), EventBridge handles event ingestion, and SQS handles queuing. The compute layer consists of Dockerized FastAPI servers running WhisperX or Voxtral models on GPU instances. Infrastructure is managed with shell scripts that call the AWS CLI.
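
As an illustration of the Lambda entry point, the sketch below pulls the bucket and key out of an incoming event. It assumes the audio upload reaches EventBridge as an S3 "Object Created" event, whose `detail` field carries `bucket.name` and `object.key`; the source does not state the event shape, and `extract_audio_object` is a hypothetical helper.

```python
def extract_audio_object(event: dict) -> tuple:
    """Return (bucket, key) from an S3 'Object Created' event delivered via EventBridge."""
    detail = event.get("detail", {})
    bucket = detail.get("bucket", {}).get("name")
    key = detail.get("object", {}).get("key")
    if not bucket or not key:
        raise ValueError("event does not describe an S3 object")
    return bucket, key


def lambda_handler(event, context):
    """Hypothetical handler: identify the uploaded audio file, then hand off to routing."""
    bucket, key = extract_audio_object(event)
    # The routing step (health check, real-time call, or SQS fallback) would go here.
    return {"bucket": bucket, "key": key}
```

Rejecting malformed events early keeps the routing code from ever seeing a missing bucket or key.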

How it runs

sequenceDiagram
    participant User as Audio Source
    participant EB as EventBridge
    participant Lambda as Lambda Router
    participant API as FastAPI Server
    participant SQS as SQS Queue
    participant Worker as Batch Worker
    
    User->>EB: Upload Audio File
    EB->>Lambda: Trigger Event
    Lambda->>Lambda: Check Idempotency
    alt Already Transcribed
        Lambda-->>User: Skip Processing
    else Not Transcribed
        Lambda->>API: Health Check / Transcribe Request
        alt Server Healthy
            API->>API: Process with WhisperX/Voxtral
            API-->>Lambda: Return Transcript
            Lambda-->>User: Success
        else Server Unhealthy/Fail
            Lambda->>Lambda: Retry with Backoff
            alt Retries Exhausted
                Lambda->>SQS: Send Message
                SQS-->>Lambda: Acknowledge
                Note over Worker: Scheduled Trigger
                Worker->>SQS: Receive Message
                Worker->>Worker: Spin up GPU & Process
                Worker->>SQS: Delete Message
            end
        end
    end
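
The sequence above (idempotency check, real-time attempt, retry with backoff, SQS fallback) can be sketched as one function. All names and defaults here are assumptions for illustration: `already_done`, `transcribe`, and `enqueue` stand in for whatever the router actually calls, and the three attempts with a doubling delay are not taken from the repository.

```python
import time


def route_with_retry(key, already_done, transcribe, enqueue,
                     max_attempts=3, base_delay_s=0.5, sleep=time.sleep):
    """Return 'skipped', 'realtime', or 'queued' for one audio object."""
    if already_done(key):
        return "skipped"  # Idempotency: a transcript already exists.
    for attempt in range(max_attempts):
        try:
            transcribe(key)  # Real-time path through the FastAPI server.
            return "realtime"
        except Exception:
            if attempt < max_attempts - 1:
                # Exponential backoff: base, 2x base, 4x base, ...
                sleep(base_delay_s * (2 ** attempt))
    enqueue(key)  # Retries exhausted: defer to the batch path.
    return "queued"
```

Injecting `sleep` lets a test record the delays instead of waiting, and the function never sleeps after the final failed attempt since the next step is the queue.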

How to apply & reuse

Configure environment variables for the AWS region, ECR URIs, and SQS queue URLs, then run the provided setup scripts in sequence. Deploy the core SQS-only router first to establish reliable batch processing; the FastAPI GPU instances can be added afterwards for real-time capabilities.
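
A small pre-flight check can catch missing settings before any script runs. The sketch below is hypothetical: the variable names `AWS_REGION`, `ECR_REPOSITORY_URI`, and `SQS_QUEUE_URL` are placeholders, and the names the setup scripts actually expect may differ.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class RouterConfig:
    aws_region: str
    ecr_uri: str
    sqs_queue_url: str


# Maps config fields to assumed environment variable names.
REQUIRED = {
    "aws_region": "AWS_REGION",
    "ecr_uri": "ECR_REPOSITORY_URI",
    "sqs_queue_url": "SQS_QUEUE_URL",
}


def load_config(env=None) -> RouterConfig:
    """Build a RouterConfig, failing loudly if any required variable is unset or empty."""
    env = os.environ if env is None else env
    missing = sorted(name for name in REQUIRED.values() if not env.get(name))
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return RouterConfig(**{field: env[name] for field, name in REQUIRED.items()})
```

Reporting every missing variable at once avoids fixing them one failed run at a time.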

At a glance

Capabilities: Real-time transcription, Batch processing, Health-based routing, Retry management, Session assembly
Components: Lambda Router, FastAPI Server, SQS Queue, Batch Worker, EventBridge Trigger
Tech: Python, Shell, FastAPI, AWS Lambda, Docker, WhisperX, Voxtral
Depends on: AWS CLI, Docker, GPU Instances, S3 Bucket, ECR Repository
Integrates with: Amazon S3, Amazon SQS, Amazon EventBridge, Amazon CloudWatch
Patterns: Circuit Breaker, Dead Letter Queue, Batch Processing, Serverless Routing, Idempotency Key
Reuse tags: aws-serverless, hybrid-cloud, audio-processing, gpu-optimization, event-driven

Repo hygiene

✓ all on main — nothing unmerged.