Zhivko Todorov
CASE 142 · OPAL · 2023

REDIS · ELASTICACHE · CLUSTER · OPERATIONAL

Self-hosted Redis, retired without anyone noticing.

A gaming SaaS company ran self-hosted Redis on EC2 — a six-node cluster with the operational responsibility quietly resting on one engineer. We migrated to ElastiCache for Redis with no application code changes and no observable downtime.

INDUSTRY

Gaming SaaS

DOMAIN

MIGRATION

DELIVERED

2023

STACK

AMAZON ELASTICACHE · REDIS CLUSTER MODE · AWS DMS (REDIS) · CLOUDWATCH METRICS

RESULTS

What changed, by the numbers.

OPERATIONAL HOURS

−92%

WEEKLY, ON-CALL ENGINEER

OBSERVED DOWNTIME

0

DURING MIGRATION

FAILOVER TIME

< 30s

MANAGED

CODE CHANGES

CONN STRING

NOTHING ELSE

HOW IT WENT

The six-node cluster on EC2 worked until one of the nodes failed and the team realised that the runbook for "what to do when a node fails" had drifted out of date. They got through it, but the verdict afterward was unanimous: the move to ElastiCache should have happened a year earlier.

We provisioned an ElastiCache cluster with cluster mode enabled, matching the partitioning of the existing setup. Migration ran via REPLICAOF replication: the ElastiCache nodes replicated from the self-hosted primaries until they had caught up, then took over after a connection-string swap.
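The "caught up" condition can be made concrete before the swap. The sketch below is illustrative only, not the tooling used on this project: it assumes the text of `INFO replication` has been captured from one self-hosted primary and from the node replicating from it, and that the standard Redis fields (`role`, `master_link_status`, `master_sync_in_progress`, `master_repl_offset`, `slave_repl_offset`) are reported on both sides. The function names and the sample offsets are hypothetical.

```python
def parse_info(text: str) -> dict[str, str]:
    """Parse the key:value lines of a Redis INFO section into a dict."""
    fields: dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and section headers such as "# Replication".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, _, value = line.partition(":")
        fields[key] = value
    return fields


def replica_caught_up(source_info: str, target_info: str,
                      max_lag_bytes: int = 0) -> bool:
    """Return True when the target looks like a healthy, caught-up replica.

    source_info / target_info are raw `INFO replication` outputs from the
    primary and from the node replicating from it (assumed inputs).
    """
    source = parse_info(source_info)
    target = parse_info(target_info)
    if target.get("role") != "slave":
        return False
    if target.get("master_link_status") != "up":
        return False
    if target.get("master_sync_in_progress") != "0":
        return False
    # Bytes of the replication stream the target has not yet applied.
    lag = int(source["master_repl_offset"]) - int(target["slave_repl_offset"])
    return lag <= max_lag_bytes


# Hypothetical sample outputs, trimmed to the fields the check reads.
SOURCE_INFO = """# Replication
role:master
connected_slaves:1
master_repl_offset:104857
"""

TARGET_INFO = """# Replication
role:slave
master_link_status:up
master_sync_in_progress:0
slave_repl_offset:104857
"""

if __name__ == "__main__":
    print(replica_caught_up(SOURCE_INFO, TARGET_INFO))
```

In a cluster-mode setup a check like this would have to pass for every shard pair, not just one, before the connection string is swapped. A small non-zero `max_lag_bytes` may be more realistic on a primary that is still taking writes.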

Weekly operational hours for the on-call engineer dropped 92%, freeing that time for other work. Observed downtime during the migration was zero: the connection-string swap happened at a quiet moment and the application reconnected transparently. The self-hosted cluster was decommissioned two weeks later.
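A connection-string-only cutover depends on the application reading its Redis endpoint from configuration instead of hard-coding it. A minimal sketch of that pattern follows, assuming an environment variable named `REDIS_URL` and the common `redis://` / `rediss://` (TLS) URL schemes. The variable name, default, and hostnames are hypothetical and are not taken from the client's codebase.

```python
import os
from urllib.parse import urlsplit

# Hypothetical default, used only when REDIS_URL is not set.
DEFAULT_URL = "redis://localhost:6379"


def redis_endpoint(env=None) -> tuple[str, int, bool]:
    """Resolve (host, port, use_tls) from configuration.

    `env` is any mapping; it defaults to the process environment so the
    endpoint can change without touching application code.
    """
    env = os.environ if env is None else env
    url = urlsplit(env.get("REDIS_URL", DEFAULT_URL))
    if url.scheme not in ("redis", "rediss"):
        raise ValueError(f"unsupported scheme: {url.scheme!r}")
    if not url.hostname:
        raise ValueError("REDIS_URL has no host")
    # "rediss" is the conventional scheme for TLS connections.
    return url.hostname, url.port or 6379, url.scheme == "rediss"


if __name__ == "__main__":
    # Before the swap: a self-hosted node (hypothetical address).
    print(redis_endpoint({"REDIS_URL": "redis://10.0.3.17:6379"}))
    # After the swap: a managed endpoint (hypothetical hostname).
    print(redis_endpoint({"REDIS_URL": "rediss://opal-cache.example.internal:6379"}))
```

With this shape, the swap is a configuration change followed by a reconnect. Whether the reconnect is transparent still depends on the retry behaviour of the client library in use.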
