TranscriptionAPI-S3-backend

What it is

This project is a Python backend component that lets clients upload audio files for transcription. It sits between a client application and an AWS S3 bucket: instead of relaying large binary payloads itself, the service authenticates each request and returns a short-lived presigned URL. Clients use that URL to upload audio directly to S3, which reduces server load and bandwidth usage while restricting each URL to one object key and a limited lifetime.

Features

Generates AWS S3 presigned URLs for direct client-side uploads
Configurable URL expiration time for security
Automatic UUID generation for unique file naming
Environment-based configuration for bucket and region
Lightweight Flask API with JSON responses

Quickstart

pip install flask boto3 python-dotenv
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export S3_BUCKET_NAME=your-bucket-name
python src/AudioTranscriptionAPI.py
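
The Quickstart only names the bucket variable, so the following is a minimal sketch of how the environment-based configuration could be read and validated. `S3_BUCKET_NAME` comes from the commands above; `AWS_REGION`, `PRESIGNED_URL_EXPIRATION`, their defaults, and the `UploadConfig` and `load_config` names are assumptions for illustration, not names taken from the project.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadConfig:
    bucket: str
    region: str
    expires_in: int  # presigned URL lifetime in seconds


def load_config(env=os.environ):
    """Read upload settings from a mapping of environment variables.

    S3_BUCKET_NAME appears in the Quickstart; AWS_REGION and
    PRESIGNED_URL_EXPIRATION are assumed names with assumed defaults.
    """
    bucket = env.get("S3_BUCKET_NAME")
    if not bucket:
        raise RuntimeError("S3_BUCKET_NAME must be set")
    region = env.get("AWS_REGION", "us-east-1")
    expires_in = int(env.get("PRESIGNED_URL_EXPIRATION", "300"))
    if expires_in <= 0:
        raise RuntimeError("PRESIGNED_URL_EXPIRATION must be a positive number of seconds")
    return UploadConfig(bucket=bucket, region=region, expires_in=expires_in)
```

Failing fast on a missing bucket name surfaces a misconfigured deployment at startup rather than on the first upload request.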

Architecture

flowchart TD
    Client[Client App] -->|HTTP GET /generate-presigned-url| FlaskApp[Flask Service]
    FlaskApp -->|Generate Presigned URL| Boto3[Boto3 SDK]
    Boto3 -->|Presigned URL, signed locally| FlaskApp
    Client -->|HTTP PUT Audio File| AWS_S3[AWS S3 Bucket]
    subgraph Configuration
        Env[Environment Variables]
    end
    Env --> FlaskApp

How it's built

The application uses Flask for the HTTP server layer and Boto3 for AWS SDK interactions. Configuration comes from environment variables: the target S3 bucket name, the AWS region, and the presigned URL expiration time. The core logic validates each incoming request, generates a unique key (a UUID) for the upload, and uses the Boto3 S3 client to create a presigned PUT URL that expires after the configured interval.

How it runs

sequenceDiagram
    participant C as Client
    participant F as Flask App
    participant B as Boto3
    participant S as AWS S3

    C->>F: GET /generate-presigned-url
    F->>F: Generate UUID for filename
    F->>B: generate_presigned_url('put_object')
    B-->>F: Return Presigned URL
    F-->>C: JSON { url: '...', key: '...' }
    C->>S: PUT audio/file.mp3 (using Presigned URL)
    S-->>C: 200 OK
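
From the client's side, the sequence above might look like the sketch below, using only the Python standard library. The endpoint path and the `url` and `key` fields are taken from the diagram; the function names and the `api_base` parameter are hypothetical, and an HTTPS presigned URL is assumed.

```python
import http.client
import json
import urllib.request
from urllib.parse import urlsplit


def request_upload_target(api_base):
    """Ask the service for a presigned URL and return the parsed JSON body."""
    with urllib.request.urlopen(f"{api_base}/generate-presigned-url") as resp:
        return json.load(resp)


def split_presigned_url(presigned_url):
    """Return (host, path_with_query) so the signed query string is sent unchanged."""
    parts = urlsplit(presigned_url)
    return parts.netloc, f"{parts.path}?{parts.query}"


def upload_audio(api_base, audio_path):
    """Fetch a presigned URL, PUT the file to S3, and return the object key."""
    target = request_upload_target(api_base)
    host, path_with_query = split_presigned_url(target["url"])
    with open(audio_path, "rb") as f:
        body = f.read()
    conn = http.client.HTTPSConnection(host)
    try:
        # http.client sets Content-Length but does not add a Content-Type header.
        conn.request("PUT", path_with_query, body=body)
        status = conn.getresponse().status
    finally:
        conn.close()
    if status != 200:
        raise RuntimeError(f"upload failed with status {status}")
    return target["key"]
```

A caller would run something like `upload_audio("http://localhost:5000", "recording.mp3")` and keep the returned key for later processing; both arguments here are placeholders.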

How to apply & reuse

Deploy this service in environments where you need to accept audio uploads without managing the storage infrastructure yourself. It is suitable for integration into larger transcription pipelines where the frontend or mobile app needs a secure way to push raw audio data to cloud storage before triggering processing jobs. Configure it with your specific AWS credentials and bucket details via environment variables.

At a glance

Capabilities: Presigned URL Generation, S3 Integration, REST API Endpoint

Components: Flask Application, Boto3 Client, Environment Config Loader

Tech: Python, Flask, Boto3, AWS S3

Depends on: flask, boto3, python-dotenv

Integrates with: AWS S3, Frontend Uploaders, Mobile Applications

Patterns: Presigned URL Pattern, Direct-to-Storage Upload, Microservice

Reuse tags: audio-upload, s3-backend, flask-api, presigned-urls