
This is Bruce Wang from Netflix with his inspiring talk about Technical Debt. I enjoyed his talk and wanted to write about it, but as it turns out, I’m opinionated on the subject and would rather share my thoughts than his. Bruce says Technical Debt is the delta between the current code and the ideal code. I think Technical Debt is what you get when a coding team takes a shortcut and comes up with a solution that mostly works but needs future changes. The work that’s needed, and may never be done, is the debt.
Samples of technical debt from my experience:
- Copy/paste, quick and incomplete fixes
- Slowly building obscure God classes or long chains of ifs in long functions. All changes are tiny but the final result is bad
- Quickly building large and complex pattern-based solutions to simple problems
- Using old and unmaintained libraries
- Insufficient test coverage, or tests that are guaranteed to break on legitimate changes
- Agreeing on any kind of a second system and then insisting the old system is the technical debt until either the old or the new system is removed
Out of these, the second system has presented the biggest and most time-consuming challenges. All the other problems can be improved in small iterations, but when you have two competing systems, you keep having two until the very end, whatever the eventual solution turns out to be.
"This code is bad, it will be better to rewrite it from scratch"
— An engineer getting in trouble
To my understanding, this is the worst type of technical debt: it is hard to repay because the best possible outcome of the work is that nothing visibly changes. Some issues with rewriting:
- Old code that’s in heavy use tends to cover many cases. It’s easy to miss some of them and introduce regressions
- The new code tends to focus on the areas the old code doesn’t support, so it quickly drifts away from compatibility
- While the new system is in progress, the old code keeps accumulating changes that the new system doesn’t support
- The amount of work is usually much larger than the wildest estimates
- Migrating from one system to another is hard, and may even be cost-prohibitive
- It never gets easier to complete, and it keeps draining the life out of engineers who are always working on one piece or another but never reach the final result (one system)
Before I continue, note that sometimes rewriting is the only option. About 20 years ago, I saw a complete ASP/MS SQL website whose tables grew horizontally: every new record needed a new column. I was contacted as a freelancer to fix it because the owner had run out of columns. The whole thing felt bad beyond repair. It was not, however, a high-traffic or high-responsibility service.
In many cases, though, a rewrite is initiated even when other solutions exist.
Here’s what I’d do if a rewrite of heavily used code is suggested. First, I’d come up with a vision for what the final result needs to be, then challenge that vision with honest questions like “Do I need this?”. Most of the time the answer is no: You Aren’t Gonna Need It (YAGNI). With the vision in mind, I’d work toward the simplest version of the future code in small iterations that each go to production, so that there are never two systems and there is no migration between them, for example, by using the Strangler Fig pattern. I’ve done this, and it works well enough that nobody notices. If, on the other hand, a second system is started and never gets completed, it soon becomes everyone’s problem.
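To make the Strangler Fig idea a bit more concrete, here is a minimal sketch in Python. It is not taken from any real project: `legacy_price`, `new_price`, `can_use_new`, the order fields, and the `HALF` coupon are all hypothetical names invented for illustration. The point is only that callers keep using one entry point while the new code takes over one case at a time.

```python
# A minimal Strangler Fig sketch. All names and rules here are hypothetical.

def legacy_price(order: dict) -> float:
    """Stand-in for the old, heavily used code that covers many cases."""
    total = order["quantity"] * order["unit_price"]
    if order.get("coupon") == "HALF":
        total *= 0.5
    return total


def new_price(order: dict) -> float:
    """Stand-in for the new code; so far it only handles orders without a coupon."""
    return order["quantity"] * order["unit_price"]


def can_use_new(order: dict) -> bool:
    """Decide whether the new code already supports this case."""
    return "coupon" not in order


def price(order: dict) -> float:
    """Single entry point for callers.

    Cases the new code supports go to it; everything else still goes to the
    legacy code, so behavior stays the same from the outside.
    """
    if can_use_new(order):
        return new_price(order)
    return legacy_price(order)
```

Under these assumptions, each iteration would widen `can_use_new` a little and ship it to production. Once it covers every case, `legacy_price` has no callers left and can be deleted, without there ever having been a separate second system or a migration step.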