Audio2ScriptViewer · davidbmar.com

What it is

Audio2ScriptViewer is a backend worker that consumes transcription output, likely produced by an upstream speech-to-text service. It polls an AWS SQS queue for JSON messages containing a filename and transcribed text, normalizes newlines and carriage returns in the text, sorts entries by a numeric key extracted from the filename, and appends them to a local CSV file. It is part of a larger media processing pipeline but can also run on its own as a worker service.

Features

Consumes transcription results from AWS SQS queues
Cleans and normalizes transcribed text (handling newlines/carriage returns)
Sorts output entries chronologically based on numeric filename keys
Appends processed data to a structured CSV file
Supports both continuous polling and single-execution modes
Infrastructure-as-Code setup via Terraform
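
The cleaning and sorting features listed above can be illustrated with a minimal sketch. The function names clean_text and numeric_key, the "first run of digits" filename convention, and the sample filenames are assumptions made for illustration; the actual script may extract its key differently.

```python
import re


def clean_text(text: str) -> str:
    """Collapse runs of newlines and carriage returns into single spaces."""
    return re.sub(r"[\r\n]+", " ", text).strip()


def numeric_key(filename: str) -> int:
    """Return the first run of digits in the filename as an int.

    Assumed convention: filenames carry a sequence number such as
    chunk_12.flac. Filenames without digits sort first (key -1).
    """
    match = re.search(r"\d+", filename)
    return int(match.group()) if match else -1


# Hypothetical entries, as they might look after JSON parsing.
entries = [
    {"filename": "chunk_10.flac", "text": "third\r\nline"},
    {"filename": "chunk_2.flac", "text": "second\nline\n"},
    {"filename": "chunk_1.flac", "text": "first line"},
]

# Sort numerically, then clean the text of each entry.
ordered = sorted(entries, key=lambda e: numeric_key(e["filename"]))
rows = [(e["filename"], clean_text(e["text"])) for e in ordered]
```

Sorting on the integer key rather than the raw filename matters here: a plain string sort would place chunk_10 before chunk_2.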

Quickstart

terraform init
terraform apply
python3 audio2Script.py --run-once

Architecture

flowchart TD
    A[Upstream Transcriber] -->|JSON Message| B(AWS SQS Queue)
    B -->|Poll| C[Audio2ScriptViewer]
    C -->|Process & Clean| D[Local Storage]
    D -->|Write| E[(output.csv)]
    subgraph Infrastructure
    B
    C
    end

How it's built

The core logic is written in Python and uses the boto3 library to talk to AWS. argparse handles CLI configuration, so the script can run as a continuous polling loop or make a single pass (--run-once). Regular expressions normalize newlines and carriage returns in the transcribed text. The infrastructure is defined in Terraform, which suggests an AWS-native deployment built around SQS queues, potentially with S3 or EC2 involved in hosting the script.
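
As a rough sketch of how argparse and boto3 could fit together in such a worker: the --run-once flag comes from the Quickstart, while poll, build_parser, main, the --queue-url and --interval options, and the choice to delete messages after handling them are assumptions for illustration, not the script's confirmed behavior. receive_message and delete_message are standard boto3 SQS client calls.

```python
import argparse
import time


def poll(sqs, queue_url, handle, run_once=False, interval=5.0):
    """Receive batches from SQS and pass their bodies to `handle`.

    `sqs` is any object exposing the boto3 SQS client methods used below,
    which keeps the loop testable without AWS credentials.
    """
    while True:
        response = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,  # SQS allows at most 10 per call
            WaitTimeSeconds=10,      # long polling to reduce empty responses
        )
        messages = response.get("Messages", [])
        if messages:
            handle([m["Body"] for m in messages])
            # Assumption: messages are deleted once they have been handled.
            for m in messages:
                sqs.delete_message(
                    QueueUrl=queue_url, ReceiptHandle=m["ReceiptHandle"]
                )
        if run_once:
            return len(messages)
        time.sleep(interval)


def build_parser():
    parser = argparse.ArgumentParser(description="Consume transcriptions from SQS")
    parser.add_argument("--run-once", action="store_true",
                        help="make a single pass instead of looping")
    # Hypothetical options, not confirmed by the project page:
    parser.add_argument("--queue-url", required=True)
    parser.add_argument("--interval", type=float, default=5.0)
    return parser


def main(argv=None):
    import boto3  # imported lazily so the helpers above work without boto3
    args = build_parser().parse_args(argv)
    poll(boto3.client("sqs"), args.queue_url, print,
         run_once=args.run_once, interval=args.interval)
```

Passing the client in as a parameter is a design choice of this sketch: it lets the loop be exercised with a stand-in object, and main() would be called from the script's entry point.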

How it runs

sequenceDiagram
    participant T as Transcriber
    participant Q as AWS SQS
    participant S as Audio2ScriptViewer
    participant F as Local CSV File

    T->>Q: Send Message (filename, text)
    loop Every X Seconds or Once
        S->>Q: Receive Messages
        Q-->>S: Return Message Batch
        S->>S: Parse JSON & Extract Key
        S->>S: Clean Text (regex)
        S->>S: Sort Messages by Key
        S->>F: Append Rows to CSV
    end
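
The per-batch steps in the diagram above (parse, extract key, clean, sort, append) might look roughly like the following. The function name append_batch, the JSON field names filename and text, the "first run of digits" key rule, and the three-column row layout are assumptions for illustration only.

```python
import csv
import json
import re


def append_batch(bodies, csv_path):
    """Parse JSON message bodies, clean and sort them, append rows to a CSV."""
    rows = []
    for body in bodies:
        payload = json.loads(body)
        filename = payload["filename"]  # assumed field name
        # Normalize newlines and carriage returns (assumed field name "text").
        text = re.sub(r"[\r\n]+", " ", payload["text"]).strip()
        digits = re.search(r"\d+", filename)
        key = int(digits.group()) if digits else -1
        rows.append((key, filename, text))

    rows.sort(key=lambda row: row[0])

    # Mode "a" appends; newline="" lets the csv module control line endings.
    with open(csv_path, "a", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows(rows)
    return len(rows)
```

Because each batch is sorted and then appended, a sketch like this only guarantees ordering within a batch; if messages for earlier files arrive in a later batch, the CSV as a whole would need a separate sort.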

How to apply & reuse

This tool fits media production workflows in which audio files are transcribed asynchronously. It acts as the aggregation layer that turns scattered transcription events into a single, chronologically ordered script document (CSV) for review by editors or for processing by downstream NLP tasks.

At a glance

Capabilities: SQS Message Consumption, Text Normalization, CSV Generation, Filename Parsing, Batch Sorting

Components: audio2Script.py, utility.mock_transcribe.py, Terraform Configs

Tech: Python, boto3, AWS SQS, Terraform, CSV, Regex

Depends on: AWS Account, SQS Queue Access, Python 3 Environment

Integrates with: AWS Speech Services, Third-party Transcribers, Media Asset Management Systems

Patterns: Worker Pattern, Polling Consumer, ETL (Extract, Transform, Load)

Reuse tags: aws-integration, transcription-pipeline, csv-export, microservice, python-script