Serverless pipeline for downloading, transcribing, and scanning YouTube videos for specific phrases using WhisperX and AWS infrastructure.
https://github.com/davidbmar/youtube_transcriber_3 · public · shipped
A distributed system that processes YouTube video URLs from an SQS queue. It downloads audio, generates high-accuracy transcripts using WhisperX on GPU instances (managed via RunPod), scans for user-defined phrases, and stores results in S3. It includes a Lambda-based controller for managing ephemeral GPU resources.
pip install -r docker/requirements.txt
export RUNPOD_API_KEY=your_api_key
python run.py
flowchart TD
A[Client] -->|Push URL| B(AWS SQS Queue)
B -->|Trigger| C[Worker Container]
C -->|Download Audio| D[YouTube]
C -->|Request GPU| E[AWS Lambda]
E -->|Manage Lifecycle| F[RunPod API]
F -->|Provision Pod| G[GPU Instance]
C -->|Send Audio| G
G -->|Run WhisperX| H[Transcription]
H -->|Scan Phrases| I[Scanner Module]
I -->|Store Results| J(AWS S3 Bucket)
Python-based worker architecture containerized with Docker. Uses `yt-dlp` for downloading, `WhisperX` for transcription, and `boto3` for AWS integration (SQS/S3). GPU compute is abstracted via RunPod, controlled by an AWS Lambda function that handles pod lifecycle events.
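The worker's SQS and S3 interaction described above might look roughly like the following. This is a minimal sketch, not the repository's code: the function name `process_one_message`, the `{"url": ...}` message shape, and the `results/<MessageId>.json` key layout are all assumptions made for illustration. The clients are passed in, so the function works with `boto3.client("sqs")` and `boto3.client("s3")` or with test doubles, and it only uses the standard `receive_message`, `delete_message`, and `put_object` calls.

```python
import json
from typing import Any, Callable, Optional


def process_one_message(
    sqs: Any,
    s3: Any,
    queue_url: str,
    bucket: str,
    handle_url: Callable[[str], dict],
) -> Optional[str]:
    """Receive at most one job, process it, store the result, then delete the job.

    Returns the S3 key that was written, or None if the queue was empty.
    `handle_url` stands in for the download/transcribe/scan steps (hypothetical).
    """
    resp = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20,  # long polling, to avoid busy-looping on an empty queue
    )
    messages = resp.get("Messages", [])
    if not messages:
        return None

    msg = messages[0]
    body = json.loads(msg["Body"])  # assumed message shape: {"url": "..."}
    result = handle_url(body["url"])

    key = f"results/{msg['MessageId']}.json"  # hypothetical key layout
    s3.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(result).encode("utf-8"),
        ContentType="application/json",
    )

    # Delete only after the result is stored, so a crash before this point
    # leaves the message in the queue for redelivery.
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=msg["ReceiptHandle"])
    return key
```

Deleting the message last trades possible duplicate processing for not losing jobs, which is usually the safer default for a queue-driven worker.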
sequenceDiagram
participant Client
participant SQS
participant Worker
participant Lambda
participant RunPod
participant S3
Client->>SQS: Send Video URL
SQS->>Worker: Trigger Job
Worker->>Lambda: Request GPU Resource
Lambda->>RunPod: Create/Get Pod
RunPod-->>Lambda: Pod Endpoint
Lambda-->>Worker: Return Endpoint
Worker->>Worker: Download Audio (yt-dlp)
Worker->>RunPod: Upload Audio & Start Transcription
RunPod->>RunPod: Process with WhisperX
RunPod-->>Worker: Return Transcript
Worker->>Worker: Scan for Phrases
Worker->>S3: Save Results
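The "Scan for Phrases" step in the sequence above can be sketched as a plain function over transcript segments. This is an illustrative sketch, not the project's scanner module: `PhraseHit` and `scan_segments` are hypothetical names, and the input is assumed to be a list of dicts with `start`, `end`, and `text` keys, which is the shape of the `segments` list in a WhisperX transcription result.

```python
import re
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class PhraseHit:
    phrase: str   # the user-defined phrase that matched
    start: float  # segment start time in seconds
    end: float    # segment end time in seconds
    text: str     # the segment text containing the phrase


def scan_segments(segments: Iterable[dict], phrases: Iterable[str]) -> List[PhraseHit]:
    """Return one hit per (segment, phrase) pair where the phrase occurs.

    Matching is case-insensitive and respects word boundaries, so "not"
    does not match inside "nothing".
    """
    # Lookarounds instead of \b so phrases that start or end with
    # punctuation (e.g. "C++") still match correctly.
    patterns = {
        p: re.compile(r"(?<!\w)" + re.escape(p) + r"(?!\w)", re.IGNORECASE)
        for p in phrases
    }
    hits: List[PhraseHit] = []
    for seg in segments:
        text = seg.get("text", "")
        for phrase, pattern in patterns.items():
            if pattern.search(text):
                hits.append(
                    PhraseHit(phrase, float(seg["start"]), float(seg["end"]), text.strip())
                )
    return hits
```

One limitation of this per-segment approach is that a phrase split across two adjacent segments is missed; joining neighboring segments before matching would be one way to handle that.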
Deploy the Lambda function, which holds the RunPod credentials and permissions and manages the pod lifecycle. Configure the SQS queue to trigger the worker container, then push YouTube URLs to the queue to start processing. Results are written to the S3 bucket for downstream analysis.
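Pushing a URL to the queue could look like the sketch below. The JSON message shape `{"url": ...}`, the host allow-list, and the names `build_job_message` and `enqueue_video` are assumptions for illustration; the repository may expect a different message format. The SQS client is passed in (for example `boto3.client("sqs")`), and only the standard `send_message` call is used.

```python
import json
from typing import Any
from urllib.parse import urlparse

# Hosts accepted as YouTube URLs in this sketch (an assumption, not exhaustive).
ALLOWED_HOSTS = {"www.youtube.com", "youtube.com", "m.youtube.com", "youtu.be"}


def build_job_message(url: str) -> str:
    """Validate the URL and serialize it as the (assumed) JSON job body."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"not a YouTube URL: {url!r}")
    return json.dumps({"url": url})


def enqueue_video(sqs: Any, queue_url: str, url: str) -> str:
    """Send one job to the queue and return the SQS message ID."""
    resp = sqs.send_message(QueueUrl=queue_url, MessageBody=build_job_message(url))
    return resp["MessageId"]
```

Rejecting non-YouTube URLs before they reach the queue keeps obviously bad jobs from consuming GPU time later in the pipeline.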
✓ all on main — nothing unmerged.