
Event-driven retail backend — clearing 10K+ transactions/day

A multi-channel commerce backend where every state change is idempotent and auditable. Go + TypeScript on Kafka, outbox pattern, exactly-once effects.

Go · TypeScript · Kafka · PostgreSQL · Outbox · idempotency
Throughput: 10K+ tx/day · ~250 orders/min peak
Checkout: p99 ≤500ms under load
Migration: 1M+ LOC monolith · 18-month strangler-fig
Audit: BACEN 3.978 trace · outbox exactly-once

The problem

In multi-channel commerce the same order can arrive twice, a partner webhook can fire three times, and a network blip can drop an event mid-flight. The ledger still has to stay exactly right and auditable under load.

The solution

BFF + Broker + Dispatcher. The broker decouples producers from consumers. The transactional outbox publishes each event exactly once per committed write, and idempotency keys dedupe at every effect boundary. Kafka carries the event log; Go and TypeScript services consume it. Retries are safe by construction, so partner flakiness degrades gracefully instead of corrupting state.
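A minimal sketch of the idempotency-key dedupe described above, assuming an in-memory store in place of the real database. The names `LedgerEvent`, `IdempotentLedger`, and `applyOnce`, and the key format, are hypothetical and only illustrate the idea: a redelivered event carries the same key, so its effect is applied at most once.

```typescript
// Hypothetical shape of an event arriving at an effect boundary.
type LedgerEvent = {
  idempotencyKey: string; // assumed stable across retries of the same logical event
  orderId: string;
  amountCents: number;
};

class IdempotentLedger {
  private seen = new Set<string>();
  private balances = new Map<string, number>();

  // Returns true if the effect was applied, false if the key was already seen.
  applyOnce(event: LedgerEvent): boolean {
    if (this.seen.has(event.idempotencyKey)) {
      return false; // duplicate delivery: skip the effect
    }
    // In a real store, recording the key and updating the balance
    // would need to happen in the same transaction.
    this.seen.add(event.idempotencyKey);
    const current = this.balances.get(event.orderId) ?? 0;
    this.balances.set(event.orderId, current + event.amountCents);
    return true;
  }

  balanceOf(orderId: string): number {
    return this.balances.get(orderId) ?? 0;
  }
}

// A partner webhook fires three times for the same charge.
const ledger = new IdempotentLedger();
const charge: LedgerEvent = {
  idempotencyKey: "order-42:charge",
  orderId: "order-42",
  amountCents: 1999,
};
const results = [charge, charge, charge].map((e) => ledger.applyOnce(e));
```

Only the first delivery changes the balance; the two retries are acknowledged and ignored, which is what makes blind retries safe.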

fig. 01 · decision record
Constraint
A 1M+ LOC monolith couldn't scale checkout under load and lacked a regulator-traceable audit path. The cutover couldn't halt sales. The same order can arrive twice and a partner webhook can fire repeatedly, yet across an 18-month migration the ledger had to stay exactly right, checkout p99 had to hold at ≤500ms, and the audit trail had to remain traceable to BACEN 3.978.
Decision
Strangler-fig the monolith. Carve domain capabilities into a BFF + Broker + Dispatcher. Decouple producers from consumers, publish via a transactional outbox (exactly-once per committed write), and make every downstream effect idempotent so retries are safe by construction. Rejected: a big-bang rewrite (the business can't stop selling), synchronous point-to-point calls (cascading failure), and at-least-once delivery without dedupe (double effects).
Outcome
A 1M+ LOC monolith migrated over 18 months without halting sales. 10K+ transactions/day (~250 orders/min peak), checkout p99 ≤500ms, BACEN 3.978-traceable. Retry-safe exactly-once effects mean partner flakiness degrades gracefully instead of corrupting the ledger.
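The strangler-fig cutover in this record can be pictured as a routing decision at the edge. The sketch below is an assumption about how such a router might look, not the project's actual code: the path prefixes, the `Backend` type, and `routeFor` are hypothetical. Capabilities that have been carved out go to the new services, and everything else keeps hitting the monolith.

```typescript
type Backend = "monolith" | "new-services";

// Hypothetical list of capabilities already carved out of the monolith.
// During a migration this list grows one capability at a time.
const carvedOutPrefixes: readonly string[] = ["/checkout", "/orders"];

function routeFor(
  path: string,
  carved: readonly string[] = carvedOutPrefixes,
): Backend {
  // Match the prefix itself or any sub-path, but not look-alike paths
  // such as "/checkouts".
  const migrated = carved.some(
    (prefix) => path === prefix || path.startsWith(prefix + "/"),
  );
  return migrated ? "new-services" : "monolith";
}

const checkoutTarget = routeFor("/checkout/confirm");
const catalogTarget = routeFor("/catalog/items");
```

Because the default is the monolith, an unmigrated path never breaks; moving a capability over is a one-line change to the prefix list, and so is rolling it back.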

Overview

An event-driven backend clearing 10K+ transactions/day (~250 orders/min at peak) across multiple commerce surfaces, carved out of a 1M+ LOC Rails monolith over an 18-month strangler-fig migration. A BFF fronts the channels. A broker fans domain events out to dispatchers. The outbox publishes each event exactly once per committed transaction, and idempotency keys make every downstream effect safe to retry. Checkout p99 stayed ≤500ms under load. The audit path traces to BACEN Circular 3.978/2020 throughout.
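To make the outbox step concrete, here is a small in-memory sketch, assuming arrays stand in for PostgreSQL tables and a callback stands in for the Kafka producer. `InMemoryDb`, `placeOrder`, `relayOnce`, and the topic name are hypothetical. The point it illustrates: the order row and its event row commit together, and a separate relay publishes pending rows afterwards.

```typescript
type Order = { id: string; totalCents: number };
type OutboxRow = {
  eventId: string;
  topic: string;
  payload: string;
  publishedAt: number | null; // null means not yet published
};
type Tables = { orders: Order[]; outbox: OutboxRow[] };

class InMemoryDb {
  orders: Order[] = [];
  outbox: OutboxRow[] = [];

  // Stand-in for a database transaction: all writes land, or none do.
  transaction(fn: (tx: Tables) => void): void {
    const tx: Tables = { orders: [...this.orders], outbox: [...this.outbox] };
    fn(tx); // if fn throws, the live tables are left untouched
    this.orders = tx.orders;
    this.outbox = tx.outbox;
  }
}

// Business write and event write share one transaction.
function placeOrder(db: InMemoryDb, order: Order): void {
  db.transaction((tx) => {
    tx.orders.push(order);
    tx.outbox.push({
      eventId: `order-placed:${order.id}`,
      topic: "orders.placed", // hypothetical topic name
      payload: JSON.stringify(order),
      publishedAt: null,
    });
  });
}

// One relay pass: publish every pending row, then mark it as published.
function relayOnce(db: InMemoryDb, publish: (row: OutboxRow) => void): number {
  let sent = 0;
  for (const row of db.outbox) {
    if (row.publishedAt !== null) continue;
    publish(row); // if this throws, the row stays pending for the next pass
    row.publishedAt = Date.now();
    sent++;
  }
  return sent;
}

const db = new InMemoryDb();
placeOrder(db, { id: "o-1", totalCents: 5000 });

// A failed transaction leaves neither an order nor an event behind.
try {
  db.transaction((tx) => {
    tx.orders.push({ id: "o-2", totalCents: 100 });
    throw new Error("simulated failure before commit");
  });
} catch {
  // expected in this sketch
}

const published: string[] = [];
const firstPass = relayOnce(db, (row) => {
  published.push(row.eventId);
});
const secondPass = relayOnce(db, (row) => {
  published.push(row.eventId);
});
```

In this sketch a crash between `publish` and the `publishedAt` update would resend the row on the next pass, which is why the downstream idempotency keys matter: a resent event is deduped at the consumer and the effect still lands once.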