The GitHub Actions Tax
Issue #3 · workflow sprawl

by Erik Osterman, CEO & Founder of Cloud Posse

I love GitHub Actions. I've said it on LinkedIn, on recent podcasts, in customer calls. It's the easiest CI in the world to start with. The marketplace is enormous. Triggers are obvious. The free tier is generous. If you're shipping software in 2026, there's a good chance the first thing your developers reach for is .github/workflows/.

That's exactly why I keep watching teams pay for it the wrong way.

Not in dollars. The bill I'm talking about doesn't show up in the GitHub invoice. It shows up in maintenance time, debugging hours, the cost of onboarding a new engineer onto an undocumented YAML graph, and the slow accumulation of vendor lock-in dressed up as portability. I've started calling it the GitHub Actions tax, and most teams don't realize they're paying it until they try to leave or scale.


What "the tax" actually means

Walk into a typical infrastructure repo and look at .github/workflows/. You'll find a workflow that does twelve things. It checks out the code, configures AWS credentials, installs Terraform, runs terraform fmt, awks the output, posts a PR comment, runs terraform plan, parses the plan output, generates a custom summary, conditionally applies based on a label, comments again, updates a status check, and uploads an artifact.

Each of those steps was added on a Tuesday afternoon by someone solving a real problem. None of them are wrong individually. The workflow grew organically.

Now the workflow is the platform. Which means:

  • The platform only runs in CI. Want to test that flow on your laptop? You can't. act gets you 70% of the way there for the simple cases and silently diverges everywhere else.
  • The platform is debugged through echo. Something broke at step 9 of 12, and the only telemetry you have is whatever the previous engineer remembered to print to stdout. There's no --debug you can flip locally because the platform doesn't run locally.
  • The "declarative" YAML is decorated bash. What started as a config file now encodes your system architecture in a thousand lines that are declarative in name only. Half the steps shell out to inline bash in run: blocks. The other half use third-party actions whose internals are often bash too. It looks declarative because it's indented, but it's procedural code in a language without functions, types, or tests, and it happens to be the canonical description of how your platform works.
  • The platform's security surface is enormous. Every action you reference is part of your supply chain. Most of them aren't pinned by SHA. Most of them have transitive dependencies you've never audited. The pull_request_target patterns in your repo are exactly as safe as the third-party action you trusted last quarter.
  • The platform is invisible to onboarding. New engineers can't read 800 lines of YAML and understand what your deployment does. The workflow encodes a hundred decisions but documents none of them.
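The pinning claim in the list above is one you can check on a laptop. What follows is a minimal sketch, assuming workflow files use the usual `uses: owner/repo@ref` form. The function name `unpinned_actions`, the sample workflow, and the 40-character SHA in it are made up for illustration. It flags any reference whose ref is not a full commit SHA and skips local (`./`) and `docker://` references.

```python
import re

# Matches `uses: owner/repo[/path]@ref`, optionally preceded by a list dash.
USES_RE = re.compile(r"^\s*(?:-\s*)?uses:\s*['\"]?([^\s'\"#]+)")
# A full commit SHA is exactly 40 lowercase hex characters.
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def unpinned_actions(workflow_text: str) -> list[str]:
    """Return action references that are not pinned to a full commit SHA."""
    unpinned = []
    for line in workflow_text.splitlines():
        match = USES_RE.match(line)
        if not match:
            continue
        ref = match.group(1)
        # Local actions and container images are not repository references.
        if ref.startswith("./") or ref.startswith("docker://"):
            continue
        _, _, version = ref.partition("@")
        if not FULL_SHA_RE.match(version):
            unpinned.append(ref)
    return unpinned


# Hypothetical workflow fragment; the SHA is a placeholder, not a real commit.
SAMPLE = """\
steps:
  - uses: actions/checkout@v4
  - uses: hashicorp/setup-terraform@0123456789abcdef0123456789abcdef01234567
  - name: Local helper
    uses: ./.github/actions/plan-summary
"""

if __name__ == "__main__":
    for ref in unpinned_actions(SAMPLE):
        print(f"not pinned by SHA: {ref}")
```

Because it is an ordinary program, the same check runs against .github/workflows/ on a laptop or as a single workflow step.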

This isn't a GitHub Actions problem. Switching to GitLab CI, Bitbucket Pipelines, or Azure DevOps doesn't fix it — you'll just rebuild the same wall of YAML somewhere else. The problem lives one layer down, in the tooling. When tooling is doing its job, your pipelines are boring — short, readable, almost identical regardless of which CI vendor is running them. Boring pipelines are the goal. Workflow sprawl is what happens when CI absorbs responsibilities that don't belong in CI.


The Unix philosophy still applies

Here's the heuristic I keep coming back to: workflows should orchestrate, not implement.

The Unix philosophy says: write programs that do one thing and do it well. Compose them with pipes. The compose layer is dumb on purpose — it doesn't know how to do anything, it knows how to wire things together. That's its job.

A workflow should be the compose layer. Each step should be a real program — testable on a laptop, understandable in isolation, debuggable with the same tooling you'd use for anything else. The workflow's only job is to say "run this, then run that, in this order, with these inputs."

The principle has a name worth saying out loud: local reproducibility. Anything CI does should run the same way on a laptop. If a step only works on a GitHub-hosted runner, that's not a feature — it's a bug in your pipeline. When CI and the laptop run the same programs, you debug locally, you onboard new engineers in minutes, and the conversation about "which CI vendor" stops mattering. Same commands, same outputs, same exit codes — wherever they run.
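To make "each step should be a real program" concrete, here is a hedged sketch of one of the steps from the twelve-step workflow, the plan summary, written as a standalone script. It assumes the JSON plan representation that `terraform show -json` produces, where each entry in `resource_changes` carries a `change.actions` list. The names `summarize_plan`, `format_summary`, and `SAMPLE_PLAN` are hypothetical, not part of any tool.

```python
import json
import sys


def summarize_plan(plan: dict) -> dict:
    """Count resources to add, change, and destroy in a JSON plan."""
    counts = {"add": 0, "change": 0, "destroy": 0}
    for resource_change in plan.get("resource_changes", []):
        actions = set(resource_change.get("change", {}).get("actions", []))
        # A replacement carries both "delete" and "create", so this sketch
        # counts it once under add and once under destroy.
        if "create" in actions:
            counts["add"] += 1
        if "update" in actions:
            counts["change"] += 1
        if "delete" in actions:
            counts["destroy"] += 1
    return counts


def format_summary(counts: dict) -> str:
    """Render the counts as a single human-readable line."""
    return (
        f"Plan: {counts['add']} to add, "
        f"{counts['change']} to change, "
        f"{counts['destroy']} to destroy."
    )


# Hypothetical plan fragment used when no file is given.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["create"]}},
        {"address": "aws_iam_role.ci", "change": {"actions": ["update"]}},
        {"address": "aws_instance.web", "change": {"actions": ["delete", "create"]}},
    ]
}

if __name__ == "__main__":
    # With a path argument, summarize that file; otherwise use the sample.
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8") as fh:
            plan = json.load(fh)
    else:
        plan = SAMPLE_PLAN
    print(format_summary(summarize_plan(plan)))
```

Saved under a name of your choosing (say plan_summary.py), it takes the file written by terraform show -json as its argument. The workflow step runs that same command, and the function can be unit tested without a runner.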

The moment you start writing 200-line bash blocks inside run: keys, you've inverted the relationship. The workflow is no longer the compose layer; it's the implementation, and bash is the runtime. You've turned your CI into a programming environment, and a particularly bad one: no debugger, no local testing, no proper modules, a slow feedback loop, and an expensive iteration cycle.

The teams that get this right aren't writing fewer workflows. They're writing simpler ones, because the work moved to programs that exist independently of CI.


What this looks like in Terraform-land

This pattern is most painful in Terraform. The community has been arguing for years about how much glue you should write to make Terraform work in production, and the answer keeps coming back as "a lot." I just wrote two posts on this — they're meant as a pair:

Terraform the Hard Way is borrowed in spirit from Kelsey Hightower's Kubernetes the Hard Way. It walks through every implicit decision native Terraform leaves on your plate — twenty-one of them, grouped into design, build, and operate. Every one of those decisions is a place where teams reach for a workflow YAML. Layout. Toolchain. Auth. Registry credentials. State backend provisioning. Change detection. Module sourcing. Drift detection. CI ergonomics. The hard way is to solve each one with a different third-party action or a hand-rolled bash block. Most teams do exactly that, and over years it accumulates into the platform-inside-CI I described above.

Terraform the Easy Way is the companion. Same crossroads, different answers: the kind that a framework which has already made the decisions can give you. Each decision becomes a few lines of YAML in atmos.yaml instead of a new workflow you maintain. The compose layer in CI gets dumber. The work moves to programs that run identically on a laptop and in CI. That's not a coincidence. It's what good frameworks do.

I'm not making the framework argument here for marketing reasons. I'm making it because the alternative is the GitHub Actions tax, and most teams have been paying it for years without naming it.


And no, your IDP isn't the answer

A predictable response when teams realize they have workflow sprawl is to build an Internal Developer Platform on top. Slap a Backstage portal in front of the YAML, expose buttons for the common operations, declare victory.

I wrote about why that's the wrong order: Build Your IDP Last. The portal is the icing. The framework underneath is the cake. If your platform is your workflows, your IDP is just a UI layer over a YAML graph nobody understands. It exposes the inconsistencies — different naming, different auth flows, different drift behaviors per service — instead of smoothing them over. It makes the underlying mess more visible, not less.

The order has to be: framework first, conventions next, then the portal. If you skip step one, the portal doesn't fix anything. It just adds a layer.


The actual question

The question isn't "should we use GitHub Actions?" Of course you should. It's the best CI runner most teams will ever have access to.

The question is: how much of your platform should live inside it?

Answer: as little as possible. Keep it boring.

CI is a runner, not a runtime. The work that belongs in real programs (auth, toolchain management, state backend provisioning, change detection, plan output formatting) should run identically on a laptop and in CI. Your workflow YAML should be the dumb compose layer that wires those programs together. Same commands locally, same commands in CI, same outputs.
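One way to keep "same commands, same outputs" honest is to let the program treat CI as an optional output sink instead of a different code path. Below is a minimal sketch of that idea. The function name `publish_summary` is hypothetical. `GITHUB_STEP_SUMMARY` is the environment variable GitHub Actions sets to a file path for the job summary. On a laptop the variable is unset and the text only goes to stdout.

```python
import os
import sys
from typing import Mapping, Optional, TextIO


def publish_summary(
    text: str,
    env: Optional[Mapping[str, str]] = None,
    out: Optional[TextIO] = None,
) -> None:
    """Write the summary to stdout, and to the job summary file if CI offers one."""
    env = os.environ if env is None else env
    out = sys.stdout if out is None else out

    # Always print: this is the behavior a laptop and a CI log share.
    out.write(text + "\n")

    # In GitHub Actions, additionally append to the job summary file.
    summary_path = env.get("GITHUB_STEP_SUMMARY")
    if summary_path:
        with open(summary_path, "a", encoding="utf-8") as fh:
            fh.write(text + "\n")


if __name__ == "__main__":
    publish_summary("Plan: 1 to add, 0 to change, 0 to destroy.")
```

The workflow step stays a one-line invocation. Everything vendor-specific sits behind a single environment lookup that can be exercised locally by setting the variable to a temporary file.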

The second-order problems are different. Drift detection, reconciliation, observability across an entire fleet — those don't live inside your CI, but they don't live inside your tooling either. They need a platform layer on top. And here's the thing: by the time those problems are real for your team, you're also probably ready to start thinking about an internal developer platform. That timing isn't a coincidence — the same maturity curve gets you to both. The mistake is reaching for the platform layer before the tooling layer is in shape, which is what most teams do.

That's not a GitHub Actions critique. It's an architecture argument that happens to land especially hard on the most popular CI in the world, because most teams have built their platform inside it without noticing.

If your CI minutes are growing faster than your headcount, your bash blocks are growing faster than your tests, and your onboarding doc opens with "first, you'll need to understand how deploy.yml works" — that's the tax. You're paying it. The way out isn't a different CI vendor. It's moving the work out of CI entirely.


If your team is hitting the workflow sprawl wall — or you're building the framework argument internally and want a sparring partner — talk to an engineer. No sales theater, just a real conversation about what you're paying for.

Erik Osterman, Founder & CEO, Cloud Posse
