fredsmith.org

AI Makes Code Cheap. It Doesn't Make Ownership Cheap.

ai, software-engineering, maintenance

If you’ve used Claude Code or Copilot for any real work in the last year, you know the velocity is real. I watched an Opus model produce a PR last month that would’ve taken me a couple days — tested, documented, ready for review. I’m not going to argue about whether AI-assisted development is a big deal. It is. I’ve been doing this for 25 years and this is the real thing.

But I keep having the same conversation with other engineering leads, and it goes like this: the code is getting written faster, but nobody’s infrastructure bill went down. Nobody’s on-call rotation got shorter. Nobody’s Dependabot queue got smaller.

The cost of generating software collapsed. The cost of owning it didn’t.

Every Line of Code Is Carrying Cost

We’ve beaten the “technical debt” metaphor to death, but Ward Cunningham’s original point still holds: even good code costs money to keep alive. Dependencies drift. CVEs show up. Infrastructure has to be patched. APIs have to stay running. Tests have to keep passing. None of that is free, and none of it creates new value — it just protects the value you already have.

This was manageable when building was expensive. When a new service took weeks of engineering time, you thought hard about whether to build it. The cost of creation was a natural brake on accumulation.

AI removed that brake.

What This Looks Like in Practice

At my company, we have decades of code across hundreds of repos. We get several hundred Dependabot PRs a day. Merging them is basically a full-time job, so we’re automating it — which means more code, more test runs, more build minutes, more deploy pipelines. The automation doesn’t eliminate the carrying cost. It moves it around and adds some of its own.
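The merge automation itself boils down to a small policy decision. A minimal sketch of that predicate — the PR fields here are illustrative stand-ins, not a real GitHub API schema, and a real version would also check for required reviews and branch protection:

```python
def can_auto_merge(pr: dict) -> bool:
    """Merge a dependency bump only when it's bot-authored, green, and non-major."""
    if pr["author"] != "dependabot[bot]":
        return False  # never auto-merge human-authored PRs
    if pr["ci_status"] != "success":
        return False  # require a passing build
    return pr["bump"] in ("patch", "minor")  # major bumps get human review

queue = [
    {"author": "dependabot[bot]", "ci_status": "success", "bump": "patch"},
    {"author": "dependabot[bot]", "ci_status": "failure", "bump": "patch"},
    {"author": "dependabot[bot]", "ci_status": "success", "bump": "major"},
]
mergeable = [pr for pr in queue if can_auto_merge(pr)]
```

Notice that even this tiny policy is more code to own — which is exactly the point.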

I spent a week last year tracking down every service that still referenced an old hostname we were decommissioning. I had to query Cloudflare logs and Datadog metrics to prove zero traffic before I could delete anything. Some of those services were running fine. They just didn’t need to exist anymore, and nobody had built a process to figure that out.
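The "prove zero traffic" step is really a decision rule you can write down. A sketch, assuming you've already exported a per-day request count for the hostname from your edge logs — the function name and the quiet-window threshold are mine, not anything from the Cloudflare or Datadog APIs:

```python
def safe_to_decommission(daily_requests: list[int], quiet_days: int = 30) -> bool:
    """True only if we have a full quiet window and it contains zero requests."""
    window = daily_requests[-quiet_days:]
    if len(window) < quiet_days:
        return False  # not enough history to make the call
    return sum(window) == 0

# A host with 30 silent days qualifies; one stray request resets the clock,
# and a host with too little history never qualifies by default.
```

The conservative defaults matter more than the code: deletion is the one operation you can't easily roll back, so the rule errs toward "keep watching."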

Your AWS bill doesn’t care whether Opus wrote the service or a senior engineer did. Your Dependabot queue doesn’t care that the code it’s flagging took ten minutes to generate instead of two weeks. The compute, the security patching, the on-call rotation, the compliance checks — none of those are on an AI improvement curve.

We Borrowed From Manufacturing. We Forgot the Disposal Side.

Software engineering took a lot from manufacturing — CI/CD, kanban, sprint cadences, deployment pipelines. All of that made us better at building things.

What we didn’t take was the disposal discipline.

In lean manufacturing, there’s a concept: the best part is no part. The cheapest component is the one you engineered out of existence. Not optimized — eliminated. Unsold inventory isn’t neutral. It takes up space, requires tracking, can spoil, and ties up capital. Lean isn’t just about making things efficiently. It’s about not making things you don’t need to own.

Software has YAGNI as a philosophical nod to this, but we never built the institutional habits around retirement that manufacturing takes for granted. We have CI/CD pipelines. We don’t have decommissioning pipelines.

That gap was fine when building was expensive. It’s becoming a real problem now that building is cheap.

The Hardest Debt: Things That Work Fine

The code that hurts you isn’t the stuff that’s obviously broken. It’s the stuff that’s running fine and might still be needed.

Observability is my favorite example, because I manage our Datadog spend. Most engineering orgs spend more on their observability stack than on production infrastructure. Logging, tracing, metrics, alerting, dashboards — all valuable, and all nearly impossible to turn off. You can’t easily prove that something you’ve been measuring is safe to stop measuring. Maybe the system is healthy because you’re watching it. Maybe nobody’s looked at that dashboard in two years. Good luck figuring out which.

So the tools stay on. Someone has to keep upgrading them because they’re connected to everything else, and everything else keeps changing. They become permanent not because anyone decided they should be, but because nobody built a process to ask “do we still need this?”
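You can at least surface the "do we still need this?" question mechanically. A hedged sketch — the dashboard records below are invented, and in practice you'd pull last-view timestamps from your observability vendor's usage metadata rather than construct them by hand:

```python
from datetime import date, timedelta

def review_candidates(dashboards: list[dict], today: date,
                      max_idle_days: int = 180) -> list[str]:
    """Names of dashboards nobody has opened in max_idle_days.
    Candidates for a human review, not automatic deletion."""
    cutoff = today - timedelta(days=max_idle_days)
    return [d["name"] for d in dashboards if d["last_viewed"] < cutoff]

today = date(2025, 6, 1)
dashboards = [
    {"name": "checkout-latency", "last_viewed": date(2025, 5, 28)},
    {"name": "legacy-queue-depth", "last_viewed": date(2023, 1, 4)},
]
stale = review_candidates(dashboards, today)
```

The output is a review list, not a delete list — the whole problem with observability is that "unviewed" doesn't prove "unneeded."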

I spent a chunk of last year deleting references to Rancher, Istio, Telegraf, envdir, and a bunch of other systems we’d migrated off of. Config files still referenced them. Helm charts still had their labels. Docs still described them as current. All of it worked fine — it just didn’t do anything anymore, and it made everything harder to understand.
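Finding that residue is mostly mechanical. A sketch of the scan — in reality this was variations on grep across hundreds of repos; here the file contents are inlined so the shape is visible:

```python
DEAD_SYSTEMS = ("rancher", "istio", "telegraf", "envdir")  # already migrated off

def stale_references(files: dict[str, str]) -> dict[str, list[str]]:
    """Map each file path to the retired systems it still mentions."""
    hits = {}
    for path, text in files.items():
        lowered = text.lower()
        found = [name for name in DEAD_SYSTEMS if name in lowered]
        if found:
            hits[path] = found
    return hits

repo = {
    "charts/app/values.yaml": "sidecar.istio.io/inject: 'false'",
    "docs/runbook.md": "Deploys are managed through GitHub Actions.",
}
```

The scan is the easy half. The hard half is the judgment call that follows each hit: is this reference dead, or is something still reading it?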

Now multiply that by the rate AI lets you create new things. A graveyard of services that still twitch.

So What Do You Actually Do About It

I don’t think the answer is more process — retirement checklists, ownership review cadences, decommissioning pipelines. That stuff sounds great in a blog post and never actually happens.

The real answer is simpler: be thoughtful about what you create, and think about your systems as a whole, not in isolation.

You just built a new service. Great. Can something else shut off now? What would it take to actually turn it off instead of running both? Can you build one thing that correctly replaces three things? Can you delete the old code entirely and just leave it in git history where it belongs?

These aren’t process questions. They’re engineering judgment. And they need to happen at creation time, not six months later when someone notices the AWS bill went up.

The fastest way I’ve found to keep things clean is to make deletion part of the same work as creation. When we migrated from Jenkins to GitHub Actions, decommissioning the Jenkins jobs was the same project — not a follow-up ticket that sat in the backlog for a year. When we moved config from a shared repo to in-repo Helm values, removing the old config references was part of the PR, not a TODO comment. If you treat the old thing as still someone else’s problem, it will be everyone’s problem forever.

AI makes this more urgent because it makes creation so cheap that you stop thinking about it. Claude will happily generate a new service for you without asking “does this need to be a new service, or should it be a feature in something that already exists?” That’s your job. The model doesn’t know what you already own. You do.

The Actual Constraint Is Shifting

The bottleneck in software development used to be “how fast can we build?” It’s moving to “how much can we responsibly own?” I think the teams that figure this out early — that pair fast creation with actual retirement discipline — are the ones that’ll keep the velocity. Everyone else will plateau as the carrying cost of all that accumulated code catches up with them.

The best part is no part. The best service is the one you don’t have to run. The best code is the code that solved the problem so well you were able to delete it.