There's a pattern I see over and over again.
A team sets out to build their cloud infrastructure. They pick Terraform because it's the industry standard. They sketch out some modules, wire up a CI pipeline, and start provisioning resources. The early days feel productive.
Then six months pass. The backlog of infrastructure work keeps growing. Security review flags gaps nobody anticipated. The "temporary" workarounds become permanent. And someone in a planning meeting says the thing everyone's been thinking: this is taking way longer than we expected.
It always does. Because the assumptions teams start with are almost always wrong.
Not wrong in a way that's obvious. Wrong in a way that sounds reasonable. That's what makes these beliefs so expensive — they survive scrutiny right up until reality catches up.
Here are the four most common ones I see.
It's time to ruffle some feathers.
Nobody runs vanilla Terraform.
Think about it. If the deployment involves GitHub Actions, it's not just Terraform anymore. Terraform Cloud, Spacelift, env0, Terramate, Terragrunt, Atmos — none of that ships with Terraform. Makefiles, Taskfiles, shell scripts, a little Python glue — all additions.
Vanilla Terraform is the binary and the code. That's it.
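To make the "little Python glue" point concrete, here is a minimal sketch of how a wrapper starts. Everything about the layout is an assumption for illustration: the `Stack` type, the `components/` directory, and the `envs/` and `backends/` file names are hypothetical conventions, not anything Terraform defines. Only the CLI flags (`-chdir`, `-backend-config`, `-var-file`, `-out`, `-input`) are real Terraform options.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Stack:
    """Hypothetical description of one deployable unit (not a Terraform concept)."""
    component: str    # directory holding the root module, e.g. "vpc"
    environment: str  # e.g. "dev" or "prod"


def plan_commands(stack: Stack, root: Path = Path("components")) -> list[list[str]]:
    """Build the init and plan invocations for one stack.

    The file layout (envs/<env>.tfvars, backends/<env>.hcl) is an assumed
    team convention. Paths are relative to the -chdir directory.
    """
    chdir = f"-chdir={(root / stack.component).as_posix()}"
    backend = f"backends/{stack.environment}.hcl"
    var_file = f"envs/{stack.environment}.tfvars"
    return [
        ["terraform", chdir, "init", "-input=false", f"-backend-config={backend}"],
        ["terraform", chdir, "plan", "-input=false",
         f"-var-file={var_file}", f"-out={stack.environment}.tfplan"],
    ]


if __name__ == "__main__":
    for cmd in plan_commands(Stack(component="vpc", environment="dev")):
        print(" ".join(cmd))
```

Each command list could be handed to `subprocess.run(cmd, check=True)`. The moment a script like this encodes which variables and which backend belong to which environment, the team is maintaining a wrapper, whatever it chooses to call it.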
And here's what's interesting: Terraform might be the only language ecosystem where there's a purity test around not using frameworks. Nobody in the JavaScript world brags about avoiding React. Nobody badges "vanilla Python — no pip packages." Rails, Django, Spring — these are how professionals build things. Nobody questions it.
But in the Terraform world, "vanilla" became an identity. Which is strange, because vanilla Terraform doesn't actually solve most of the operational problems teams face:
Every team solves these problems eventually. And every team reaches for something beyond the binary to do it.
Once that realization sinks in, the conversation gets more productive. It stops being about whether a wrapper is acceptable and starts being about whether to build one from scratch or adopt one that's already been battle-tested.
That's the conversation worth having.
Treating the work as "just Terraform" is one of the most expensive lies in cloud engineering.
I get where it comes from. Terraform is familiar. The syntax is learnable. The docs are good. You can get a resource provisioned in an afternoon and feel like you've got the whole thing figured out.
But here's what that framing misses:
Terraform handles the what of infrastructure. Architecture handles the why and how.
What makes AWS infrastructure genuinely hard isn't HCL. It's everything around it:
None of these are Terraform problems. They're coordination problems. Design problems. Organizational problems.
You can be fluent in HCL and still spend a year building something that doesn't pass security review. Because syntax doesn't solve architecture. And the teams that treat "it's just Terraform" as a project estimate instead of a technical observation are the ones that blow their timelines.
What teams think will take three months takes a year. And the result still doesn't cover drift detection, doesn't handle secrets properly, and depends on one specific engineer to make changes because nobody else understands the layout.
The gap between "I can write Terraform" and "we have a production-grade platform" is where budgets go to die.
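Drift detection is a good example of work that lives outside the HCL. One common approach, sketched here under assumptions, is to run `terraform plan -detailed-exitcode` on a schedule and interpret the exit code: 0 means no changes, 1 means the plan failed, and 2 means the plan succeeded with changes pending. The function names and the injectable `runner` parameter are hypothetical. Note that exit code 2 also fires for configuration changes that simply haven't been applied yet, so treat it as a signal to investigate rather than proof of drift.

```python
import subprocess


def classify_plan_exit(code: int) -> str:
    """Map a `terraform plan -detailed-exitcode` exit code to a status label."""
    # 0: plan succeeded with an empty diff; 2: plan succeeded with changes.
    # Anything else (normally 1) is treated as a failed plan.
    return {0: "in_sync", 2: "drift"}.get(code, "error")


def check_drift(workdir: str, runner=subprocess.run) -> str:
    """Run a plan in `workdir` and report whether changes are pending.

    `runner` is injectable so the logic can be exercised without Terraform
    installed; by default it shells out to the real binary.
    """
    result = runner(
        ["terraform", f"-chdir={workdir}", "plan",
         "-detailed-exitcode", "-input=false"],
        capture_output=True,
        text=True,
    )
    return classify_plan_exit(result.returncode)
```

Even this much raises questions Terraform won't answer for you: where the job runs, which credentials it uses, and who gets paged when it reports changes.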
Hiring a contractor to "clean up AWS" usually improves implementation quality.
It rarely fixes structural accountability gaps.
Here's what happens. The contractor comes in. The Terraform gets cleaner. The modules get organized. Maybe some tagging and a few guardrails get added. The deliverable looks professional. Everyone feels good about the engagement.
But six months later:
This isn't a skills problem. It's an ownership problem.
Contractors optimize for delivery. That's their job. They're incentivized to produce clean, well-organized code and hand it over. But infrastructure isn't a deliverable — it's a living system that needs continuous ownership, context, and evolution.
The question isn't "can someone clean this up?" It's "who owns this after they leave?"
If the answer isn't clear, the cleanup is temporary. The accountability gap remains. And the next time requirements change — which they will — you're back where you started, except now you're modifying someone else's design instead of your own.
What teams actually need isn't a contractor who builds for them. It's a guide who transfers capability to them. Someone who builds alongside the team so that when the engagement ends, the team owns everything: the code, the architecture, the decisions, and the context behind them.
Good luck with that.
The internet is full of Terraform code. That's both a blessing and a problem. Because "available" and "production-ready" are very different things.
What you find on GitHub:
terraform validate in CI. Maybe not.
It's like trying to build a car by stitching together random parts from different manufacturers. Each part might work fine on its own. Together, they don't fit.
Battle-tested Terraform modules look different:
Most DIY module efforts are unproven beyond the team that built them, dependent on one engineer's tribal knowledge, and quickly out of date. The first version works. The question is what happens twelve months later when the engineer who wrote it has moved on and AWS has deprecated two of the services it depends on.
The real question isn't "can we build it?" It's "should we?" No one earns a competitive edge by reinventing IAM patterns or multi-account governance. Mature teams focus engineering effort on the product.
These four beliefs aren't random. They share a structure.
Each one treats infrastructure as simpler than it actually is. Each one feels reasonable in the moment. And each one optimizes for short-term comfort — avoiding a framework, underestimating scope, outsourcing the work, copying code — over long-term capability.
The common thread is underestimation. Teams underestimate what vanilla Terraform doesn't cover. They underestimate the gap between syntax and architecture. They underestimate how much context walks out the door with a contractor. They underestimate the difference between found code and production-grade foundations.
And the cost isn't a single bad quarter. It's compounding. Each shortcut creates a dependency on the next shortcut. The team that skips the framework builds their own ad hoc one. The team that hires a contractor to clean up inherits code they can't maintain. The team that copies modules from GitHub spends months gluing them together and years maintaining the glue.
The teams that avoid these traps do something different. Not something harder — something more intentional.
They adopt a framework early, because they know they'll need one eventually and building from scratch is the most expensive option. They invest in ownership, not just deliverables. They build on battle-tested foundations instead of reinventing what the community has already solved. And they treat infrastructure as what it is: a living system that needs continuous investment, not a one-time project.
The conversation I want engineering leaders to have isn't "should we use Terraform?" That question was settled years ago.
The real questions are harder:
These are the questions that determine whether your infrastructure becomes a strategic advantage or a slow-moving liability. The teams that answer them honestly — even when the answers are uncomfortable — are the ones that ship.
The lies are comfortable. The truth is faster.
If you're curious what an IaC framework actually looks like — and what it means to have tooling that was built for Terraform workflows instead of bolted on — check out the Atmos project. Take a look at our native CI integration and Atmos Auth to see what CI-native tooling and authentication should look like when they're not afterthoughts.