Serverless platform for browser-based chunked audio recording, S3 storage, and file management with planned real-time transcription.
https://github.com/davidbmar/audio-ui-realtime-transcribe · public · shipped
A web application that records audio in the browser, splits it into chunks of a configurable length (5 s to 5 min), and uploads each chunk to AWS S3 via pre-signed URLs. It provides a mobile-optimized UI for managing recording sessions, playing back .webm files, and organizing recordings by human-readable timestamps. The backend is entirely serverless, using AWS Lambda for API logic and Cognito for user isolation.
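As a rough illustration of the chunk-length bounds and timestamp-based naming described above, here is a minimal sketch. The helper names (`clampChunkSeconds`, `sessionName`, `chunkCount`) and the exact timestamp format are assumptions made for this example and are not taken from the repository.

```javascript
// Bounds taken from the description above: chunks last between 5 s and 5 min.
const MIN_CHUNK_SECONDS = 5;
const MAX_CHUNK_SECONDS = 5 * 60;

// Hypothetical helper: clamp a requested chunk length into the supported range.
// Non-finite input (NaN, Infinity) falls back to the minimum.
function clampChunkSeconds(requested) {
  if (!Number.isFinite(requested)) return MIN_CHUNK_SECONDS;
  const rounded = Math.round(requested);
  return Math.min(MAX_CHUNK_SECONDS, Math.max(MIN_CHUNK_SECONDS, rounded));
}

// Hypothetical helper: build a human-readable, sortable session name in UTC.
// The "YYYY-MM-DD_HH-MM-SS" format is an assumption.
function sessionName(date) {
  const pad = (n) => String(n).padStart(2, "0");
  const day = [
    date.getUTCFullYear(),
    pad(date.getUTCMonth() + 1),
    pad(date.getUTCDate()),
  ].join("-");
  const time = [
    pad(date.getUTCHours()),
    pad(date.getUTCMinutes()),
    pad(date.getUTCSeconds()),
  ].join("-");
  return `${day}_${time}`;
}

// Number of chunks a recording of the given length would produce.
function chunkCount(totalSeconds, chunkSeconds) {
  return Math.ceil(totalSeconds / clampChunkSeconds(chunkSeconds));
}
```

Names in this format sort lexicographically in chronological order, which would keep a session list ordered without parsing dates.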
git clone https://github.com/davidbmar/audio-ui-realtime-transcribe.git
cd audio-ui-realtime-transcribe
chmod +x step-*.sh
./step-10-setup.sh
./step-20-deploy-lambda.sh
./step-25-update-web-files.sh
./step-45-validation.sh
flowchart TD
User[User Browser] -->|HTTPS| CF[CloudFront Distribution]
CF -->|Static Assets| S3Web[S3 Web Bucket]
User -->|API Requests| AG[API Gateway]
AG -->|Invoke| Lambda[Lambda Functions]
Lambda -->|Auth Check| Cognito[Cognito User Pool]
Lambda -->|Generate Pre-signed URL| S3Audio[S3 Audio Bucket]
User -->|Direct Upload Chunk| S3Audio
Lambda -->|Read/Write Metadata| S3Audio
The frontend uses vanilla HTML/CSS/JS (with React templates) for the recorder and file manager. The backend consists of Node.js Lambda functions behind API Gateway. Audio chunks are uploaded directly to S3 using pre-signed URLs generated by Lambda. Session metadata and file listings are managed via S3 object operations. Authentication is handled by Amazon Cognito.
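The pre-signed upload step could look roughly like the sketch below. It is not the repository's `audio.js`. The signer is injected as a hypothetical `signPutUrl(key)` function so the example runs without AWS credentials, and the object key layout beyond the `users/{userId}/audio/sessions/` prefix is an assumption. Reading claims from `event.requestContext.authorizer.claims` assumes a REST API with a Cognito user pool authorizer and Lambda proxy integration.

```javascript
// Assumed key layout: only the "users/{userId}/audio/sessions/" prefix
// appears in this document; the rest is illustrative.
function chunkKey(userId, sessionId, chunkNumber) {
  const padded = String(chunkNumber).padStart(3, "0");
  return `users/${userId}/audio/sessions/${sessionId}/chunk-${padded}.webm`;
}

function jsonResponse(statusCode, payload) {
  return { statusCode, body: JSON.stringify(payload) };
}

// Build a handler around an injected signer (hypothetical interface):
// signPutUrl(key) returns a promise of a pre-signed PUT URL for that key.
function makeUploadChunkHandler(signPutUrl) {
  return async function uploadChunk(event) {
    // Assumption: a Cognito user pool authorizer places the token claims here.
    const claims = event?.requestContext?.authorizer?.claims;
    if (!claims || !claims.sub) {
      return jsonResponse(401, { error: "Unauthorized" });
    }

    let body;
    try {
      body = JSON.parse(event.body || "{}");
    } catch {
      return jsonResponse(400, { error: "Body must be valid JSON" });
    }

    const { sessionId, chunkNumber } = body || {};
    // Restrict sessionId so a request cannot escape the user's own prefix.
    const validSession =
      typeof sessionId === "string" && /^[A-Za-z0-9_-]+$/.test(sessionId);
    const validChunk = Number.isInteger(chunkNumber) && chunkNumber >= 0;
    if (!validSession || !validChunk) {
      return jsonResponse(400, { error: "Invalid sessionId or chunkNumber" });
    }

    const uploadUrl = await signPutUrl(chunkKey(claims.sub, sessionId, chunkNumber));
    return jsonResponse(200, { uploadUrl });
  };
}
```

Deriving the key from the authenticated `sub` claim rather than from anything in the request body is one way to get the per-user isolation described above. In a deployed function the injected signer would wrap whatever pre-signing call the AWS SDK in use provides.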
sequenceDiagram
participant U as User Browser
participant AG as API Gateway
participant L as Lambda (audio.js)
participant S3 as S3 Bucket
U->>AG: POST /upload-chunk (sessionId, chunkNumber)
AG->>L: Invoke uploadChunk
L->>L: Validate User Claims (Cognito)
L->>S3: Generate Pre-signed PUT URL
S3-->>L: Return Signed URL
L-->>AG: 200 OK { uploadUrl }
AG-->>U: Return Signed URL
U->>S3: PUT Audio Chunk (Binary)
S3-->>U: 200 OK
U->>AG: GET /sessions
AG->>L: Invoke listSessions
L->>S3: List Objects (users/{userId}/audio/sessions/)
S3-->>L: Object Keys
L-->>AG: 200 OK { sessions[] }
AG-->>U: Return Session List
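The session-listing half of the flow above amounts to turning a flat list of object keys into per-session summaries. The sketch below shows one way to do that grouping. It assumes keys of the form `users/{userId}/audio/sessions/{sessionId}/{fileName}`; only the prefix comes from the diagram, and the function name `groupSessions` and the returned shape are illustrative.

```javascript
// Group S3 object keys under one user's sessions prefix into summaries.
// Assumed layout: "<prefix><sessionId>/<fileName>".
function groupSessions(keys, userId) {
  const prefix = `users/${userId}/audio/sessions/`;
  const bySession = new Map();

  for (const key of keys) {
    if (!key.startsWith(prefix)) continue; // ignore other users and paths
    const rest = key.slice(prefix.length);
    const slash = rest.indexOf("/");
    // Skip keys with no session folder and bare "folder/" placeholder objects.
    if (slash <= 0 || slash === rest.length - 1) continue;

    const sessionId = rest.slice(0, slash);
    const fileName = rest.slice(slash + 1);
    if (!bySession.has(sessionId)) bySession.set(sessionId, []);
    bySession.get(sessionId).push(fileName);
  }

  return [...bySession.entries()]
    .map(([sessionId, files]) => ({
      sessionId,
      chunkCount: files.length,
      files: files.sort(),
    }))
    // Descending by ID, which puts the newest first if IDs are timestamps.
    .sort((a, b) => (a.sessionId < b.sessionId ? 1 : a.sessionId > b.sessionId ? -1 : 0));
}
```

With timestamp-style session IDs, the descending sort returns the most recent recording first, which suits a file manager view.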
Deploy the infrastructure with the provided shell scripts, which set up the AWS resources (S3, Lambda, Cognito, CloudFront). Open the web interface at the CloudFront distribution URL to record meetings or conversations, then use the file manager to review, play, and organize recorded sessions. Planned future phases add semantic search and live transcription over the stored data.
✓ all on main — nothing unmerged.