Forty terabytes of Spark, off GCP in nine weeks.

A marketing analytics company ran a 40TB nightly Spark pipeline on GCP Dataproc with BigQuery storage. Their largest customer’s preferred-cloud clause triggered a forced migration. We rebuilt the pipeline on EMR + Redshift Spectrum + S3 without rewriting a single transformation.

INDUSTRY

Marketing analytics

DOMAIN

MIGRATION

DELIVERED

2025

STACK

EMR·REDSHIFT·REDSHIFT SPECTRUM·S3·GLUE CATALOG·STEP FUNCTIONS·DMS

RESULTS

What changed, by the numbers.

TIMELINE

9 WEEKS

KICKOFF → CUTOVER

CODE CHANGES

< 5%

CONFIG-DRIVEN ABSTRACTION

PIPELINE RUNTIME

−12%

EMR SPOT FLEET

STORAGE COST

−34%

S3 + ICEBERG

HOW IT WENT

The migration brief was unusual: the data had to move, the schedules had to keep running, and the SQL had to stay nearly unchanged because the analyst team didn’t have the bandwidth to rewrite it. The goal was a migration that amounted to reconfiguring paths and credentials, not rewriting logic.
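To illustrate what a "paths and credentials" migration can look like, here is a minimal sketch in Python, assuming jobs refer to datasets by logical name and a small config layer resolves the physical location. The bucket names, the `StorageTarget` type, and the `dataset_uri` helper are hypothetical and not taken from the actual engagement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageTarget:
    """Where a pipeline reads and writes. All values here are hypothetical."""
    scheme: str  # "gs" for Google Cloud Storage, "s3" for Amazon S3
    bucket: str
    prefix: str


# One entry per cloud; switching clouds means selecting a different key,
# while the transformation code keeps using logical dataset names.
TARGETS = {
    "gcp": StorageTarget(scheme="gs", bucket="example-analytics-gcp", prefix="nightly"),
    "aws": StorageTarget(scheme="s3", bucket="example-analytics-aws", prefix="nightly"),
}


def dataset_uri(target: StorageTarget, dataset: str, run_date: str) -> str:
    """Build the physical URI for a logical dataset and a run date."""
    return f"{target.scheme}://{target.bucket}/{target.prefix}/{dataset}/dt={run_date}/"


if __name__ == "__main__":
    for cloud in ("gcp", "aws"):
        print(cloud, dataset_uri(TARGETS[cloud], "campaign_daily", "2025-01-15"))
```

With this shape, a job asks for `campaign_daily` and never hard-codes a `gs://` or `s3://` path, which is one way to keep code changes small during a cloud move.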

We staged the migration in three phases. First, parallel infra in AWS (EMR cluster, Redshift, S3 buckets, Glue Catalog seeded from BigQuery schemas). Second, dual-write of the nightly outputs so analysts could compare. Third, cutover after two weeks of identical outputs.
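The dual-write comparison in the second phase can be sketched as an order-independent fingerprint of each output, so that two engines producing the same rows in a different order still match. This is an illustrative approach under stated assumptions, not the check used in the engagement; `fingerprint` and `outputs_match` are hypothetical helpers, and rows are assumed to be already normalized (for example, floats rounded consistently) before comparison.

```python
import hashlib
from typing import Iterable, Tuple

_MODULUS = 2 ** 256  # keeps the running sum the same width as a SHA-256 digest


def fingerprint(rows: Iterable[tuple]) -> Tuple[int, str]:
    """Return (row_count, checksum) for a set of rows, ignoring row order.

    Each row is hashed on its own and the hashes are summed modulo 2**256.
    Addition is commutative, so ordering does not matter, and unlike XOR a
    duplicated row changes the result instead of cancelling out.
    """
    count = 0
    total = 0
    for row in rows:
        row_hash = hashlib.sha256(repr(row).encode("utf-8")).digest()
        total = (total + int.from_bytes(row_hash, "big")) % _MODULUS
        count += 1
    return count, f"{total:064x}"


def outputs_match(old_rows: Iterable[tuple], new_rows: Iterable[tuple]) -> bool:
    """True when both outputs contain the same multiset of rows."""
    return fingerprint(old_rows) == fingerprint(new_rows)


if __name__ == "__main__":
    gcp_output = [(1, "search", 120), (2, "social", 75)]
    aws_output = [(2, "social", 75), (1, "search", 120)]  # same rows, new order
    print(outputs_match(gcp_output, aws_output))
```

In a real pipeline the same idea would usually run inside the query engine rather than in driver-side Python, but the comparison logic is the same: equal counts and equal checksums on every nightly table before cutover.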

DMS handled the historical BigQuery export. EMR with Spot capacity ran the nightly Spark jobs in 12% less wall-clock time. Redshift Spectrum served the long-tail historical queries directly against S3. The analyst team noticed nothing.

RELATED · SAME DOMAIN

Other engagements in this space.

TEMPO · 2025

Self-hosted Kafka, retired without losing a partition.

−92% OPERATIONAL HOURS / WEEK

VELVET · 2024

Auth0 to Cognito, with social logins and password resets intact.

−96% AUTH BILL

