Pulumi · AWS · TypeScript · DevOps · IaC

Pulumi in Production: Lessons from Standardizing AWS Infrastructure for a 20-Engineer Team

December 5, 2025 · 7 min read

When we decided to consolidate three years of ad-hoc AWS configurations into proper IaC, we chose Pulumi over Terraform. A year and a half later, here's an honest look at what worked, what didn't, and what I'd do differently.

The state of our AWS infrastructure three years into ProductBox was... honest. Each project had accumulated its own deployment scripts, half-migrated CloudFormation stacks, and a handful of things that were genuinely hand-configured in the console. Standard startup entropy. When we hit 20 engineers across 10+ production services, the entropy became a real cost: onboarding took too long, deployments were inconsistent, and debugging infra issues required tribal knowledge.

## Why Pulumi Over Terraform

The honest answer: because our team writes TypeScript all day, and Pulumi lets you write TypeScript for infrastructure. No new DSL, no fighting HCL's limited type system.

Concretely, the things Pulumi handles better than Terraform for a TypeScript shop:

**Real programming constructs.** When you need to loop over a variable number of ECS task definitions, you just... write a for loop. In HCL you're wrestling with `count`, `for_each`, and dynamic blocks.

**Type safety.** Pulumi's TypeScript types catch a large class of errors before `pulumi up`. I've seen teams spend hours debugging Terraform plan errors that would have been a compile-time type error in Pulumi.

**Component abstraction.** We built a small internal library of reusable Pulumi components, such as a `StandardService` that bundles an ECS task definition, service, ALB target group, and CloudWatch log group. New services get production-ready infra in ~20 lines.

## What the Standardization Actually Looked Like

We structured the codebase as a monorepo with a `/infra` directory at root:

```
/infra
  /shared       ← VPC, RDS clusters, Redis, shared ALB
  /services
    /api        ← ECS service for the main API
    /worker     ← ECS service for background workers
    /frontend   ← CloudFront + S3
  /components   ← reusable Pulumi component library
```

Shared infrastructure gets deployed once per environment. Service stacks reference shared stack outputs via `StackReference`.
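
As a rough sketch of how a service stack might locate its shared stack: on Pulumi Cloud, `StackReference` takes a fully qualified name of the form `<org>/<project>/<stack>`. The helper below builds that name per environment. The organization name `productbox`, the `shared` project name, the environment names, and the output keys `vpcId` and `albListenerArn` are illustrative placeholders, not taken from the actual setup. The Pulumi calls appear as comments so the snippet runs without the SDK installed.

```typescript
// Environment names are assumed for illustration.
type Environment = "dev" | "staging" | "prod";

/**
 * Builds the fully qualified stack name a StackReference expects on
 * Pulumi Cloud: <org>/<project>/<stack>. The "shared" default project
 * name is a hypothetical placeholder.
 */
function sharedStackName(
  org: string,
  env: Environment,
  project: string = "shared",
): string {
  // A slash inside a segment would silently change which stack is referenced.
  if (org.length === 0 || org.includes("/")) {
    throw new Error(`invalid organization name: "${org}"`);
  }
  if (project.length === 0 || project.includes("/")) {
    throw new Error(`invalid project name: "${project}"`);
  }
  return `${org}/${project}/${env}`;
}

// In a service stack this would be used roughly as follows
// (requires @pulumi/pulumi; output keys are assumed names):
//
//   import * as pulumi from "@pulumi/pulumi";
//
//   const shared = new pulumi.StackReference(sharedStackName("productbox", "prod"));
//   const vpcId = shared.requireOutput("vpcId");
//   const albListenerArn = shared.requireOutput("albListenerArn");

console.log(sharedStackName("productbox", "prod"));
```

Deriving the name from the environment in one place keeps each service stack pointed at the shared stack for the same environment. `requireOutput` fails the deployment if the shared stack does not export the key, which tends to surface a missing export earlier than `getOutput` would.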

## The Parts That Didn't Go Smoothly

**Importing existing resources.** Moving existing AWS resources under Pulumi management (`pulumi import`) is tedious. Each resource has to be imported explicitly, and some resource types have quirks that require manual state edits. Plan for this to take longer than you expect: we allocated one sprint for the migration and it took two.

**Pulumi Cloud vs. self-hosted state.** We used Pulumi Cloud for the state backend. It's convenient but adds a dependency. For a team with strict data residency requirements (relevant if you're operating in Germany under GDPR), self-hosting state in S3 is worth the setup cost.

**Drift detection.** Pulumi's drift detection (refreshing state and running a preview to see whether reality still matches the program) is good but not automatic. Note that a plain `pulumi preview` compares the program against the stored state only, so the refresh step is what surfaces out-of-band changes. We added a weekly CI job that runs `pulumi preview` with `--refresh` across all stacks and alerts on drift. This caught several cases where someone had manually changed something in the console.

## The Outcome

After standardization, taking a new service from zero to a production ECS deployment takes about 45 minutes, including review and merge. Before, it was a half-day minimum and often blocked on someone with AWS console access. Deployment errors caused by configuration inconsistencies dropped to near zero.

Our figure of a 40% reduction in deployment complexity is a rough estimate; the more tangible metric is that infra questions stopped being a common interruption in engineering stand-ups.

## Would I Choose Pulumi Again?

Yes, for a TypeScript team. The trade-off is real though: Terraform has a larger ecosystem of providers and more community examples. If your team isn't already comfortable with TypeScript or another Pulumi-supported language, the HCL learning curve might be more predictable than a new programming model for infrastructure.

For a team that's already all-in on TypeScript, and especially one building rapidly across many services, Pulumi's composability wins.
