I’ve been thinking a lot about tech debt lately. I’ve always felt like there are two kinds of tech debt: the sort that you can comfortably ignore, and the kind that slowly strangles you.
Having built and maintained a few systems, I’ve now crystalised my thinking. All technical debt starts life as a small pile. Over time you might build on top of that pile and it eventually becomes a hill. It might even become a mountain. That’s OK though. We can slowly climb a mountain if we can break it down into manageable steps and take a breath on the way. We can slowly fix this kind of technical debt.
Sometimes though, we build impossible mountains. Mountains that you can’t break into small segments as you’ll just fall down. The kind of tech debt that has to be dealt with in one go and it’s built up enough that it becomes infeasible to deal with - the business simply can’t afford you to down tools for six months whilst you resolve it.
Below are a few real world examples that I’ve dealt with.
I love tests. You should too. Given enough time and not enough attention, test suites may eventually become slow and you lose the ability to rapidly execute them as you are building software. This is often an impossible mountain. Unless you’re fortunate you’re unlikely to be able to find a small change that will reduce a test suite run time from twenty minutes down to two. Instead, you have to break it down into a series of small changes going from twenty minutes down to nineteen, down to eighteen… all while the rest of the team are continuing to build more tests. At this point you find it hard to gain traction to take the time to fix the problem.
Fortunately, you can build an escape valve to prevent this kind of problem. Most people will agree that a fast test suite is valuable. Most people will also be happy to agree that if the suite goes over a certain time threshold then it needs to be dealt with before it becomes too big a problem and the value of the test suite diminishes.
This pattern is more sinister. Leaky abstractions make refactoring hard. My favourite example is using an ORM. Often we use an ORM to make dealing with a database easy and safe. We write code that looks like this:
def send_daily_emails(): for user in User.objects.filter(notifications_enabled=True): # code to generate the email goes here send_email(user.email, report)
At first blush, this seems fine. Then we decide that, actually, we need to
refactor the User model. Supposed now, notifications should be a series of
different flags, or, the
User objects are being split into two different
types. At this point we discover we have hundreds of slightly different ways of
User model. Now, it’s hard to refactor. How do we check that all
the uses of the ORM are going to be OK? How do we check that all the queries are
going to be performant when we change our indexes? Now, it might just be another
impossible mountain to climb and instead of tidying things… you add another
type of query.
Fortunately, this can be avoided with a bit of forward planning. Building narrow APIs in front of your leaky abstractions (in this case the ORM) can make it vastly easier to refactor and analyse what is going on:
def send_daily_emails(): for user in UserRepo.find_all_with_notifications_enabled(): # code to generate the email goes here send_email(user.email, report)
When we next come to make a change to the
User model we can easily find
everything that uses that method and change it… if needed. It’s likely you
will be able to just change the internals of that method and never change
anything that calls it.
It’s hard to spot the impossible mountains as they’re getting built. I rely on a few smells:
These are the bits of tech debt that I focus my energy on and push for time to solve / prevent. The rest, I can live with in the interest of moving the product forward.