Zhivko Todorov
CASE 149 · WISP · 2024

AURORA REBOOT MAINTENANCE REHEARSAL

Aurora maintenance windows that the team rehearses.

A fintech had been treating Aurora minor-version upgrades and maintenance windows as a "fingers crossed" event — sometimes they were fine, sometimes a workload broke. We instituted quarterly rehearsals against a clone of production using Aurora’s blue/green deployment feature.

INDUSTRY

Fintech

DOMAIN

RELIABILITY

DELIVERED

2024

STACK

AURORA BLUE/GREEN·AURORA POSTGRES·PERFORMANCE INSIGHTS·CLOUDWATCH SYNTHETICS·GAME DAY RUNBOOKS

RESULTS

What changed, by the numbers.

UNPLANNED INCIDENTS

0

AFTER INSTITUTING REHEARSALS

REHEARSALS / QUARTER

1

AGAINST CLONE

MINOR-VERSION CONFIDENCE

HIGH

WAS LOW

MAINTENANCE WINDOW STRESS

LOW

TEAM SURVEY

HOW IT WENT

In eighteen months, three Aurora minor-version upgrades had surprised the team: a query plan changed, a parameter default flipped, a connection-pool behaviour shifted. Each was small but disruptive. None could have been rehearsed in the production environment itself.
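A flipped parameter default, one of the surprises listed above, is the kind of change a plain diff between the current and the upgraded environment can surface. The following is a minimal sketch, not the team's actual tooling. It assumes the parameters of each environment have already been exported as a list of entries shaped like the `Parameters` items boto3 returns from `describe_db_cluster_parameters` (`ParameterName`, `ParameterValue`). The helper names and the sample values are invented for illustration.

```python
from typing import Dict, List, Optional, Tuple

# Marker for "this parameter does not exist on that side" (hypothetical convention).
ABSENT = "<absent>"

Params = Dict[str, Optional[str]]


def to_param_map(parameters: List[dict]) -> Params:
    """Flatten describe_db_cluster_parameters-style entries into name -> value."""
    return {p["ParameterName"]: p.get("ParameterValue") for p in parameters}


def diff_parameters(blue: Params, green: Params) -> List[Tuple[str, Optional[str], Optional[str]]]:
    """Return (name, blue_value, green_value) for every parameter that differs."""
    changes = []
    for name in sorted(set(blue) | set(green)):
        old = blue.get(name, ABSENT)
        new = green.get(name, ABSENT)
        if old != new:
            changes.append((name, old, new))
    return changes


# Invented sample data: these values do not describe any real Aurora release.
blue_params = to_param_map([
    {"ParameterName": "jit", "ParameterValue": "off"},
    {"ParameterName": "work_mem", "ParameterValue": "4096"},
])
green_params = to_param_map([
    {"ParameterName": "jit", "ParameterValue": "on"},
    {"ParameterName": "work_mem", "ParameterValue": "4096"},
])

for name, old, new in diff_parameters(blue_params, green_params):
    print(f"{name}: {old} -> {new}")
```

A diff like this only catches configuration drift. Query-plan and connection-pool changes still need real traffic against the upgraded copy.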

Aurora Blue/Green Deployments clone the cluster, including data, and let you run the upgrade against the clone first. We instituted quarterly rehearsals: clone production, apply the next planned maintenance, run the synthetic traffic suite against it, document any surprises. Surprises that were still unexplained at the end of a rehearsal became Jira tickets to investigate before the real maintenance.
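The rehearsal loop described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the runbook the team used. `rds` is expected to behave like a boto3 RDS client (`create_blue_green_deployment`, `describe_blue_green_deployments`, `delete_blue_green_deployment`). `run_synthetic_suite` is a hypothetical callable standing in for the synthetic traffic suite, and `RehearsalReport`, the default deployment name, and the polling settings are invented here. Prerequisites such as parameter-group settings are not handled.

```python
import time
from dataclasses import dataclass, field
from typing import Any, Callable, List


@dataclass
class RehearsalReport:
    deployment_id: str
    target_engine_version: str
    surprises: List[str] = field(default_factory=list)


def rehearse_upgrade(
    rds: Any,                      # boto3 RDS client, or a stand-in with the same methods
    source_cluster_arn: str,
    target_engine_version: str,
    run_synthetic_suite: Callable[[str], List[str]],  # hypothetical: returns surprises found
    name: str = "quarterly-rehearsal",
    poll_seconds: float = 60.0,
    max_polls: int = 120,
) -> RehearsalReport:
    """Create a green copy on the target version, test it, then tear it down."""
    # 1. Clone production into a green environment running the planned version.
    created = rds.create_blue_green_deployment(
        BlueGreenDeploymentName=name,
        Source=source_cluster_arn,
        TargetEngineVersion=target_engine_version,
    )
    deployment_id = created["BlueGreenDeployment"]["BlueGreenDeploymentIdentifier"]
    report = RehearsalReport(deployment_id, target_engine_version)

    try:
        # 2. Wait until the green environment reports AVAILABLE.
        for _ in range(max_polls):
            described = rds.describe_blue_green_deployments(
                BlueGreenDeploymentIdentifier=deployment_id
            )
            deployment = described["BlueGreenDeployments"][0]
            if deployment["Status"] == "AVAILABLE":
                break
            time.sleep(poll_seconds)
        else:
            raise TimeoutError(f"{deployment_id} did not become AVAILABLE")

        # 3. Run the synthetic suite against green and record what it reports.
        report.surprises = run_synthetic_suite(deployment["Target"])
    finally:
        # 4. Always tear down. A rehearsal never calls the switchover API.
        #    DeleteTarget=True asks RDS to remove the green resources as well.
        rds.delete_blue_green_deployment(
            BlueGreenDeploymentIdentifier=deployment_id,
            DeleteTarget=True,
        )
    return report
```

Passing the client in as a parameter keeps the sketch testable with a stub and leaves credentials and region out of the function. With real infrastructure, one would pass `boto3.client("rds")` and file each entry in `report.surprises` as a ticket.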

Unplanned incidents from Aurora maintenance dropped to zero after we instituted the rehearsals. The maintenance windows have become routine: the team’s self-reported "stress level during DB maintenance" survey score dropped from 7/10 to 2/10. The blue/green clone is a habit now.
