I run a company that does cloud infrastructure, which means I spend a lot of time on calls when I'd rather be coding. Sales calls, scoping calls, architecture reviews. I talk with engineering leaders every week about the state of their infrastructure.
And here's what I've noticed: the story is almost always the same.
Infrastructure grew organically. Each decision made sense at the time. The team was moving fast, shipping product, solving real problems. Nobody sat down and said "let's build something unmaintainable." It just happened — the way it always happens when smart engineers are focused on delivery.
Then they hit a wall. And the instinct — the very natural, very engineering instinct — is to rebuild.
Here's where it gets interesting. We have AI now. The cost of writing code has never been lower. You can spin up a Terraform module in an afternoon that used to take a week. But the cost of making the wrong decisions hasn't changed at all. Bad architecture is just as expensive to unwind whether a human wrote it or an AI did — maybe more expensive, because the speed means you get further down the wrong path before you notice.
Engineers are builders. It's our superpower. Give us a problem and we'll build a solution from scratch. But that same instinct is often what created the situation in the first place. The infrastructure grew organically because the team kept building, one thing at a time, without stepping back to question the approach. AI makes building faster, but it doesn't make the approach any better.
So when the answer to "our infrastructure is a mess" is "let's rebuild it ourselves" — using the same approach, the same assumptions, just with cleaner code and better tools — that's not a fix. That's a cycle. You end up in the same place, just with newer tech debt. And often with two problems now instead of one: maintaining the old world while continuing to invest in the new.
The teams that break the cycle don't just rebuild. They change how they build. Different assumptions. Different foundations. That's the real shift.
Which brings me to the four assumptions I see most often — the ones that keep teams stuck in that cycle. I wrote a full breakdown here: The Most Expensive Lie in Cloud Engineering. Here's the short version.
Lie #1: "We run vanilla Terraform."
Nobody runs vanilla Terraform. If you're using GitHub Actions, Makefiles, or any kind of wrapper to authenticate, manage state, or deploy — that's a framework. You just built it yourself instead of adopting one. Terraform might be the only language ecosystem where there's a purity test against using frameworks. Nobody in JavaScript brags about avoiding React.
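To make that concrete, here's a minimal sketch of the kind of wrapper teams tend to write without calling it a framework. Everything in it is hypothetical: the function names, the `vars/<stack>.tfvars` layout, and the S3-style backend keys are assumptions for illustration, not anyone's actual setup. It only builds the command lines; it doesn't run Terraform.

```python
from typing import Dict, List


def init_command(backend: Dict[str, str]) -> List[str]:
    """Build a `terraform init` call with backend settings injected by the wrapper."""
    cmd = ["terraform", "init", "-input=false"]
    for key in sorted(backend):
        # -backend-config=KEY=VALUE is how the wrapper decides where state lives.
        cmd.append(f"-backend-config={key}={backend[key]}")
    return cmd


def plan_command(stack: str, var_dir: str = "vars") -> List[str]:
    """Build a `terraform plan` call; the var-file naming is a homegrown convention."""
    return [
        "terraform",
        "plan",
        "-input=false",
        f"-var-file={var_dir}/{stack}.tfvars",
        f"-out={stack}.tfplan",
    ]


def commands_for(stack: str, bucket: str, region: str) -> List[List[str]]:
    """Hypothetical entry point: one stack name in, a full init + plan sequence out."""
    backend = {
        "bucket": bucket,
        "key": f"{stack}/terraform.tfstate",  # state path convention (assumed)
        "region": region,
    }
    return [init_command(backend), plan_command(stack)]


if __name__ == "__main__":
    for cmd in commands_for("prod", "example-state-bucket", "us-east-1"):
        print(" ".join(cmd))
```

The state path convention, the variable file layout, and the command ordering are all decisions someone on the team made and now has to maintain. That's the framework.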
Lie #2: "It's just Terraform. How hard can it be?"
Terraform handles the what. Architecture handles the why and how. Multi-account patterns, CI/CD standardization, identity providers, compliance: none of these are Terraform problems. You can be fluent in HCL and still spend a year building something that doesn't pass security review.
Lie #3: "We can just fix it later."
This is the one nobody says out loud, but everyone acts on. The plan is always the same: ship now, clean up next quarter. And when next quarter comes, you hire a contractor to untangle it. The contractor does good work — the code gets cleaner, the patterns get tighter. Then they leave. And so does the context. Six months later, nobody fully understands the architecture, changes require knowledge that walked out the door, and the platform works but can't evolve. You didn't fix the problem — you rented a fix. Infrastructure isn't a deliverable. It's a living system that needs to evolve with your team, not just survive a handoff.
Lie #4: "We'll just copy modules from GitHub."
Nobody actually says this — but it's what teams end up doing. You need a VPC module, you find one on GitHub, you fork it. Then another. Then another. Before long you've assembled your infrastructure from a dozen different authors with different conventions, different testing standards, and different opinions about how AWS should work. This isn't about using our modules versus someone else's. It's about the difference between adopting a coherent system and assembling one from parts that were never designed to work together. Available and production-ready are very different things.
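One rough way to see how far this has gone is to count how many different GitHub owners your module sources point at. The sketch below is an assumption-laden illustration, not a real audit tool: it only recognizes `github.com/...` and `git::https://github.com/...` style `source` strings, and the module names and owners in the example are made up.

```python
import re
from collections import Counter

# Matches source = "github.com/OWNER/..." and source = "git::https://github.com/OWNER/..."
SOURCE_RE = re.compile(
    r'source\s*=\s*"(?:git::)?(?:https://)?github\.com/([^/"]+)/'
)


def module_owners(hcl_text: str) -> Counter:
    """Count module sources per GitHub owner found in a blob of Terraform code."""
    return Counter(m.group(1).lower() for m in SOURCE_RE.finditer(hcl_text))


# Hypothetical configuration, for illustration only.
EXAMPLE = '''
module "vpc" { source = "github.com/example-org-a/terraform-aws-vpc" }
module "eks" { source = "git::https://github.com/example-org-b/terraform-aws-eks.git?ref=v1.0.0" }
module "rds" { source = "github.com/example-org-a/terraform-aws-rds" }
'''

if __name__ == "__main__":
    owners = module_owners(EXAMPLE)
    print(f"{len(owners)} distinct owners across {sum(owners.values())} modules")
```

A high owner count isn't a problem by itself, but it's a quick signal of how many separate sets of conventions you've inherited.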
Every one of these beliefs does the same thing: it reaches for the builder's instinct. Write it ourselves. Fix it ourselves. Rebuild it ourselves. And every one leads back to the same place — because the approach didn't change, just the code.
The teams that break the cycle do something different. They stop treating infrastructure as a build problem and start treating it as an adoption problem. They adopt a framework instead of inventing one. They invest in ownership transfer, not just deliverables. They build on battle-tested foundations instead of reinventing what the community has already solved.
The full post goes deep on each one, with the specific patterns I see and what to do instead: Read the full breakdown.
If you're curious what an IaC framework actually looks like — and what CI-native tooling should do instead of duct-taping GitHub Actions together — check out the Atmos project. Our recent native CI integration and Atmos Auth show what these capabilities look like when they're built into the framework instead of bolted on.
Ready to skip the expensive lessons?
Talk to an engineer — no sales theater, just a real conversation about what you're building.
Erik Osterman, Founder & CEO, Cloud Posse
