
How a mobile-first platform split media upload, processing, moderation, and realtime status delivery into a resilient event-driven pipeline.
Context
A mobile-first platform needed to support image and video uploads in chats and albums without turning every upload into a long synchronous API request. On mobile networks, large files and inconsistent connectivity are normal, so the backend had to do heavy processing out of band while the app still gave clear user feedback.
The codebase already had signed-upload patterns, WebSocket infrastructure, and media records with lifecycle states. The goal was to turn those ingredients into a clean pipeline where API endpoints stayed lightweight and processing workers could scale independently.
Challenge
Media ingestion is rarely as simple as "upload and done". Images needed multiple derivative sizes. Videos needed transcoding, thumbnail extraction, and showreel generation. Some media also required moderation checks before being surfaced publicly. At the same time, users needed status updates they could trust, especially when processing took longer than a normal request window.
Another challenge was avoiding tightly coupled handlers. If one Lambda handled database checks, media transforms, and broadcasts all together, it would be harder to test, harder to scale, and riskier to change. The platform needed clear ownership boundaries between preflight validation, heavy processing, state persistence, and realtime fan-out.
Approach
The pipeline was separated into focused stages. The API created the media row and returned a signed upload URL. Once the object landed in storage, an upload event triggered preflight logic that validated key metadata and dispatch rules, then emitted a canonical ready event.
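As a rough illustration of the preflight stage, the sketch below validates an uploaded object's key and size and, if it passes, builds a canonical ready event. The key layout (uploads/<context>/<media_id>/original.<ext>), the extension table, and the MediaReady field names are all assumptions made for this example, not the platform's actual schema.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Hypothetical key layout: uploads/<context>/<media_id>/original.<ext>
KEY_PATTERN = re.compile(
    r"^uploads/(?P<context>chat|album)/(?P<media_id>[A-Za-z0-9-]+)"
    r"/original\.(?P<ext>[a-z0-9]+)$"
)

# Assumed dispatch rule: the file extension decides which processor runs.
KIND_BY_EXT = {
    "jpg": "image",
    "jpeg": "image",
    "png": "image",
    "mp4": "video",
    "mov": "video",
}


@dataclass(frozen=True)
class MediaReady:
    """Canonical event handed to the processors (field names are assumed)."""
    media_id: str
    context: str
    kind: str
    bucket: str
    key: str


def preflight(bucket: str, key: str, size_bytes: int) -> Optional[MediaReady]:
    """Return a MediaReady event, or None when the upload must not be dispatched."""
    match = KEY_PATTERN.match(key)
    if match is None or size_bytes <= 0:
        return None  # unexpected key shape or empty object
    kind = KIND_BY_EXT.get(match.group("ext"))
    if kind is None:
        return None  # unsupported file type
    return MediaReady(
        media_id=match.group("media_id"),
        context=match.group("context"),
        kind=kind,
        bucket=bucket,
        key=key,
    )
```

A real preflight step would likely also confirm that the media row exists before emitting the event; that lookup is left out here so the example stays self-contained.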
From there, image and video processors handled heavy work in stateless workers. Image processing generated multiple output sizes and optionally ran moderation against the primary output. Video processing normalized files to MP4, generated thumbnail and showreel assets, captured duration and dimension metadata, and ran moderation against the showreel path when required.
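To make the "multiple output sizes" step concrete, here is a minimal sketch of how a stateless image worker might plan its derivatives before doing any pixel work. The size names and maximum edge lengths are invented for illustration, and the choice never to upscale is an assumption rather than a documented rule of the platform.

```python
from typing import Dict, Tuple

# Hypothetical derivative names and longest-edge limits in pixels.
DERIVATIVE_MAX_EDGES: Dict[str, int] = {
    "thumb": 256,
    "medium": 1024,
    "large": 2048,
}


def derivative_dimensions(width: int, height: int) -> Dict[str, Tuple[int, int]]:
    """Plan output sizes that keep the aspect ratio and never upscale."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    longest = max(width, height)
    plan: Dict[str, Tuple[int, int]] = {}
    for name, max_edge in DERIVATIVE_MAX_EDGES.items():
        if longest <= max_edge:
            # Source is already small enough: keep the original dimensions.
            plan[name] = (width, height)
            continue
        # Scale both edges by the same factor so the longest edge hits max_edge.
        plan[name] = (
            max(1, round(width * max_edge / longest)),
            max(1, round(height * max_edge / longest)),
        )
    return plan
```

For a 4000 x 3000 source this plans 256 x 192, 1024 x 768, and 2048 x 1536 outputs. Keeping the planning step pure makes it easy to unit test separately from the actual resize and moderation calls.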
Lifecycle events represented the contract between stages. Processing workers emitted started, progress, completed, and failed events. Separate consumers then handled database status updates and realtime broadcast behaviour. This split let processing code stay focused on media work while client-facing state and messaging stayed consistent through dedicated consumers.
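The split between emitting lifecycle events and consuming them can be sketched with an in-memory stand-in for the event bus. Everything named here (LifecycleEvent, EventBus, the status mapping, the packet fields) is hypothetical; the point is only that the processor publishes events while the status writer and the broadcaster subscribe independently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass(frozen=True)
class LifecycleEvent:
    media_id: str
    type: str  # one of: started, progress, completed, failed
    percent: Optional[int] = None


Consumer = Callable[[LifecycleEvent], None]

# Assumed mapping from lifecycle event type to the persisted media status.
STATUS_BY_TYPE: Dict[str, str] = {
    "started": "processing",
    "progress": "processing",
    "completed": "ready",
    "failed": "error",
}


class EventBus:
    """In-memory stand-in for whatever event transport the pipeline uses."""

    def __init__(self) -> None:
        self._consumers: List[Consumer] = []

    def subscribe(self, consumer: Consumer) -> None:
        self._consumers.append(consumer)

    def publish(self, event: LifecycleEvent) -> None:
        # Enforce the contract at the boundary: unknown types never fan out.
        if event.type not in STATUS_BY_TYPE:
            raise ValueError(f"unknown lifecycle event type: {event.type!r}")
        for consumer in self._consumers:
            consumer(event)


def make_status_writer(store: Dict[str, str]) -> Consumer:
    """Consumer that persists status; a dict stands in for the database."""
    def write(event: LifecycleEvent) -> None:
        store[event.media_id] = STATUS_BY_TYPE[event.type]
    return write


def make_broadcaster(outbox: List[dict]) -> Consumer:
    """Consumer that fans out to clients; a list stands in for the socket layer."""
    def broadcast(event: LifecycleEvent) -> None:
        outbox.append(
            {"mediaId": event.media_id, "type": event.type, "percent": event.percent}
        )
    return broadcast
```

Because the processor only calls publish, either consumer can be changed, retried, or tested on its own without touching the media code.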
On the app side, upload code validated file types and size limits before transfer and handled signed-header requirements. Status mapping translated WebSocket packets into client states such as uploading, processing, ready, and error. Media library cache logic also used context-based retention to keep local storage under control. The test suite covers processors and consumers, including edge cases such as malformed events and stale progress regressions.
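The status mapping and the stale-progress edge case can be illustrated with a small reducer. This is a sketch under assumptions: the packet shape and the type-to-state mapping are hypothetical, and "stale progress" is read here as a progress packet that reports a lower percentage than one already applied. It is written in Python for consistency, although a mobile client would express the same logic in its own language.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Once a media item reaches one of these states, later packets are ignored.
TERMINAL_STATES = {"ready", "error"}


@dataclass(frozen=True)
class ClientMedia:
    state: str = "uploading"
    percent: int = 0


def apply_packet(current: ClientMedia, packet: Mapping[str, Any]) -> ClientMedia:
    """Fold one status packet into client state, ignoring bad or stale input."""
    if current.state in TERMINAL_STATES:
        return current  # late packets must not reopen a finished item

    kind = packet.get("type")
    if kind == "started":
        return ClientMedia("processing", current.percent)
    if kind == "progress":
        percent = packet.get("percent")
        is_valid = (
            isinstance(percent, int)
            and not isinstance(percent, bool)
            and 0 <= percent <= 100
        )
        if not is_valid:
            return current  # malformed packet: keep the last known state
        if percent < current.percent:
            return current  # stale packet: never move progress backwards
        return ClientMedia("processing", percent)
    if kind == "completed":
        return ClientMedia("ready", 100)
    if kind == "failed":
        return ClientMedia("error", current.percent)
    return current  # unknown packet types are ignored
```

Keeping the reducer pure means the same sequence of packets always yields the same client state, which is what makes out-of-order delivery straightforward to test.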
Outcome
The platform gained a more resilient upload experience without overloading synchronous API paths. Users can start an upload quickly, then receive clear progress and completion signals while transforms and moderation continue in background workers.
Operationally, the architecture reduced coupling and made failures easier to reason about. Preflight, processing, status writes, and broadcasts each have explicit responsibilities and test coverage. That improved release confidence when changing media behaviour because one stage can evolve without rewriting the whole flow. It also lowered the risk of client confusion by keeping status updates consistent across chat and album contexts.
Key takeaway
When mobile media is central to the product, event contracts and strict stage ownership matter more than trying to do everything in one request lifecycle.