Zhivko Todorov

CASE 125 · SANDPIPER · 2024

CI/CD · SPOT · GITHUB ACTIONS · AUTO SCALING

CI on Spot, with the wallet to prove it.

An open-source vendor ran GitHub Actions on a fleet of EC2 self-hosted runners — entirely on-demand, sized for peak. Off-peak utilisation was 12%. We rebuilt the runner fleet on Spot with Karpenter-driven scaling, and brought CI compute spend down 82%.

INDUSTRY

Open-source vendor

DOMAIN

COST

DELIVERED

2024

STACK

GITHUB ACTIONS RUNNER CONTROLLER · EC2 SPOT · KARPENTER · EKS · EFS (BUILD CACHE)

RESULTS

What changed, by the numbers.

CI COMPUTE BILL

−82%

$18K → $3.2K / MONTH

BUILD QUEUE TIME

−61%

PEAK HOURS

SPOT INTERRUPT IMPACT

< 0.4%

OF BUILD-MINUTES

PROVISIONING TIME

< 25s

NEW RUNNER UP

HOW IT WENT

The on-demand fleet had been sized for the worst Tuesday of the month. Most of the time, it sat under-utilised; engineers occasionally complained about queue time anyway because the peaks were even sharper than the sizing assumed.

Actions Runner Controller scaled runner pods on EKS via Karpenter, with Spot capacity across multiple instance families. EFS held the build cache so runners weren’t cold every time. Pod disruption budgets handled the rare Spot interruption.
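As a rough illustration of the "Spot capacity across multiple instance families" part, here is a minimal sketch that builds a Karpenter NodePool manifest as a Python dict and prints it as JSON. It is not the configuration used in this engagement: the pool name `ci-runners`, the `default` EC2NodeClass, the listed instance families, and the choice of the `karpenter.sh/v1` API version are all assumptions made for the example.

```python
import json


def spot_nodepool(name, instance_families):
    """Build a Karpenter NodePool manifest (as a dict) limited to Spot capacity.

    All names and values passed in are illustrative, not taken from the case study.
    """
    return {
        "apiVersion": "karpenter.sh/v1",  # assumed API version
        "kind": "NodePool",
        "metadata": {"name": name},
        "spec": {
            "template": {
                "spec": {
                    # Hypothetical EC2NodeClass named "default".
                    "nodeClassRef": {
                        "group": "karpenter.k8s.aws",
                        "kind": "EC2NodeClass",
                        "name": "default",
                    },
                    "requirements": [
                        # Only launch Spot instances for this pool.
                        {
                            "key": "karpenter.sh/capacity-type",
                            "operator": "In",
                            "values": ["spot"],
                        },
                        # Allow several families so one Spot pool drying up
                        # does not block new runners.
                        {
                            "key": "karpenter.k8s.aws/instance-family",
                            "operator": "In",
                            "values": list(instance_families),
                        },
                    ],
                }
            }
        },
    }


# Hypothetical pool name and instance families.
manifest = spot_nodepool("ci-runners", ["m6i", "m5", "c6i", "c5"])
print(json.dumps(manifest, indent=2))
```

Listing more than one family is the point of the sketch: the wider the set of acceptable instance types, the more Spot pools the scheduler can draw from when one is reclaimed.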

The CI bill dropped 82%, from $18K to $3.2K a month. Peak-hour queue time dropped 61% because the elastic fleet could grow beyond what the on-demand sizing had supported. Spot interruptions affected under 0.4% of build-minutes; the affected builds retried automatically and finished only a couple of minutes later.
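The headline percentage follows directly from the monthly figures in the results above ($18K before, $3.2K after). A minimal sketch of the arithmetic; the annualised figure is an extrapolation that assumes both monthly numbers hold steady for a full year, which the case study does not claim.

```python
def percent_reduction(before, after):
    """Percentage drop from `before` to `after`."""
    return (before - after) / before * 100


before_usd = 18_000  # monthly CI compute bill on the on-demand fleet
after_usd = 3_200    # monthly CI compute bill on the Spot fleet

monthly_saving = before_usd - after_usd
reduction = percent_reduction(before_usd, after_usd)

# Assumption: the monthly saving stays constant for twelve months.
annual_saving = monthly_saving * 12

print(f"Monthly saving: ${monthly_saving:,}")
print(f"Reduction: {reduction:.1f}%")
print(f"Annualised saving (assumed steady): ${annual_saving:,}")
```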
