Zhivko Todorov
CASE 56 · TEMPO · 2025

KAFKA · MSK · CONNECT · SCHEMA REGISTRY

Self-hosted Kafka, retired without losing a partition.

A real-time bidding platform ran 28 self-hosted Kafka brokers across three AZs, with a team that had become reluctant ZooKeeper experts. We migrated to MSK Serverless for the variable workloads and MSK Provisioned for the steady ones, retired the EC2 cluster, and reclaimed the team’s attention.

INDUSTRY

Real-time bidding

DOMAIN

MIGRATION

DELIVERED

2025

STACK

MSK PROVISIONED·MSK SERVERLESS·MSK CONNECT·GLUE SCHEMA REGISTRY·MIRRORMAKER 2·EC2 (RETIRED)

RESULTS

What changed, by the numbers.

OPERATIONAL HOURS / WEEK

−92%

TEAM RECLAIMED 38h/wk

BROKER FAILURES

0

90 DAYS POST-CUTOVER

COST

−18%

INCL. SUPPORT TIME

CUTOVER WINDOW

ROLLING

NO PARTITION DOWNTIME

HOW IT WENT

The team had been operating Kafka well, with five nines (99.999%) of partition availability over the previous year, but at significant cost. Brokers needed patching, ZooKeeper needed coddling, and the runbook was 47 pages long. Two engineers had become full-time Kafka operators by accident.

We provisioned MSK alongside the existing cluster and used MirrorMaker 2 to replicate every topic. After two weeks of mirroring, with consumer-group offsets verified on the new cluster, we cut producers over topic by topic. Each consumer group followed once its offset position was confirmed in MSK.
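The case study does not say how the offset check was done, so the following is only a minimal sketch of one way to gate a consumer-group cutover. It assumes you have already collected, per partition, the log-end and committed offsets on both clusters while the mirror is caught up. `PartitionState`, `partition_ready`, `blocking_partitions`, and `max_extra_lag` are hypothetical names, not part of Kafka or MSK. Because mirrored offsets do not line up numerically between clusters, the sketch compares lag rather than raw offsets.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionState:
    """Offsets for one partition of one consumer group (hypothetical inputs)."""
    source_end: int        # log-end offset on the self-hosted cluster
    source_committed: int  # group's committed offset on the self-hosted cluster
    target_end: int        # log-end offset on the MSK cluster
    target_committed: int  # group's translated offset on the MSK cluster


def partition_ready(p: PartitionState, max_extra_lag: int) -> bool:
    """Return True if the group can resume on the target without skipping
    records and without replaying more than max_extra_lag extra records.

    Assumption: the snapshot was taken while the mirror was caught up, so
    lag on each side is comparable.
    """
    # A committed offset outside the target log means translation is not usable.
    if not 0 <= p.target_committed <= p.target_end:
        return False
    source_lag = p.source_end - p.source_committed
    target_lag = p.target_end - p.target_committed
    extra = target_lag - source_lag
    # extra < 0: the group would resume ahead of its source position (possible skip).
    # extra > max_extra_lag: the group would replay more than we are willing to accept.
    return 0 <= extra <= max_extra_lag


def blocking_partitions(states: dict, max_extra_lag: int) -> list:
    """List the (topic, partition) keys that still block the cutover."""
    return [key for key, p in sorted(states.items())
            if not partition_ready(p, max_extra_lag)]
```

An empty list from `blocking_partitions` would be the signal to move that consumer group. A small positive `max_extra_lag` tolerates translated offsets that sit slightly behind the source position, which causes some reprocessing after cutover but no loss.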

No partitions lost. No data lost. The two reluctant Kafka operators went back to platform engineering work — one of them now leads the team. The 47-page runbook got archived. The new runbook is two pages.
