The Exponential Nature of Lines of Code

The faster your codebase grows, the less of it people will understand. When people don’t understand all of the code, they don’t see global patterns, and so they will reinvent little wheels all over the place. In theory, the development leads and the architects are supposed to watch out for these issues, but there are always places where redundant code can hide, and as the code continues to grow, even this watchdog function breaks down. Soon you have people who are intimately familiar with only a couple of modules in the system, and so replication across modules becomes difficult to spot. As the line-count continues to rise, the percentage of the code that each person really knows decreases, compounding the problem. Welcome to exponential code growth.

From Jason Marshall’s Software Weblog via Ned Batchelder

Unfortunately this is very true…

It’s hard work to keep code under control, and it’s easy to miss the point at which things start to slide, decay sets in and entropy takes over. A shortcut taken for the best of reasons here, a little copy and paste there, and suddenly you have duplication; add a splash of schedule pressure and a handful of inexperience and complexity starts to flourish…

I find it useful to use slices of otherwise dead time for ‘weeding’ code: reading a random block of code, looking for simple cleanups, small refactorings or reductions, spotting duplication, remembering what this bit does and the patterns we used when we wrote it, and searching for and removing potential rot… It’s a good way to use those 20 minutes between meetings, or the half an hour at the end of the day when it’s too late to start something new. Pull a fresh copy of the code from the source control system, drop into a random area and weed for a bit. If you’ve made changes, you’re happy with them and the tests pass, check them in; if not, either throw them away or leave them for later…
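The ‘drop into a random area’ step is easy to automate. What follows is only a rough sketch of one way to do it, not a tool I’m describing above: the function name pick_weeding_spot, the list of file suffixes and the 40-line block size are all assumptions made for illustration, and it presumes you have already pulled a fresh working copy.

```python
import random
from pathlib import Path

# Assumption: which file types count as "code" worth weeding in your tree.
SOURCE_SUFFIXES = {".py", ".c", ".cpp", ".h", ".java", ".cs"}


def pick_weeding_spot(root, block_size=40, rng=None):
    """Pick a random source file under root and a random block of lines in it.

    Returns (path, first_line, last_line) with 1-based inclusive line numbers,
    or None if no source files are found. Hypothetical helper for illustration.
    """
    if rng is None:
        rng = random.Random()

    # Sort so that a seeded rng gives a repeatable choice.
    files = sorted(
        p for p in Path(root).rglob("*")
        if p.is_file() and p.suffix in SOURCE_SUFFIXES
    )
    if not files:
        return None

    path = rng.choice(files)
    lines = path.read_text(encoding="utf-8", errors="replace").splitlines()

    # Choose a start so the block fits; short files yield the whole file.
    last_start = max(len(lines) - block_size, 0)
    start = rng.randint(0, last_start)
    end = min(start + block_size, len(lines))
    return path, start + 1, end


if __name__ == "__main__":
    spot = pick_weeding_spot(".")
    if spot is None:
        print("No source files found to weed.")
    else:
        path, first, last = spot
        print(f"Weed {path}, lines {first}-{last}")
```

Run it from the top of the working copy and open whatever it suggests. Every file is equally likely to be picked regardless of size, which is a deliberate simplification; weighting by line count would sample the codebase more evenly.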

Unfortunately, it’s far easier to write lots of complex code than to write a small amount of simple code that solves the same problem. Simplicity is hard and only seems to come after multiple iterations. Sometimes clarity only comes when you walk away from a piece of code and come back with fresh eyes…

As ever, refactoring is risk management…