Zhivko Todorov
ALL CASE STUDIES

CASE 73 · GLADE · 2026

EKSKARMADAMULTI-CLUSTERFEDERATION

Twelve EKS clusters, one application surface.

An AI infrastructure company ran 12 EKS clusters across regions and GPU classes. Deploying anything across the whole estate was 12 separate kubectl applies plus a spreadsheet to track them. We adopted Karmada for cluster federation and turned cross-cluster operations into a single declaration.

INDUSTRY

AI infrastructure

DOMAIN

PLATFORM

DELIVERED

2026

STACK

EKS·KARMADA·ARGOCD·CILIUM CLUSTER MESH·EXTERNAL DNS·CERT-MANAGER

RESULTS

What changed, by the numbers.

CROSS-CLUSTER DEPLOY TIME

−83%

90m → 15m

CLUSTERS FEDERATED

12

CROSS-REGION, MIXED GPU

DRIFT INCIDENTS

−100%

SINGLE-SOURCE TRUTH

NEW-CLUSTER ONBOARDING

2h

WAS 1–2 DAYS

HOW IT WENT

The cross-cluster spreadsheet was the smoking gun. Each cluster had drifted slightly from the others — different versions of a CRD here, a missed NetworkPolicy there. Deploys involved checking the spreadsheet, then apologising when a deploy failed because cluster #7 still had v1beta1.

Karmada became the federation control plane. ArgoCD synced application manifests to Karmada; Karmada propagated them to the member clusters with override policies for cluster-specific config (GPU type, region-specific secrets). Cilium Cluster Mesh handled cross-cluster networking for the few cross-cluster service calls.

Drift incidents dropped to zero — the single source of truth made it impossible for clusters to differ unintentionally. A cross-cluster deploy that used to be 90 minutes of careful coordination now takes 15 minutes and is initiated by a single git push.

READY WHEN YOU ARE

Let's get your AWS bill (and architecture) in order.

The discovery call is free. You walk away with at least one concrete idea — even if we never work together.

Or email directly →