Firm Foundations

2004-06-21

As I mentioned last week, I’m writing a new component for one of my clients. I also mentioned that ‘beyond the interface, I can do what I like’; that’s actually a surprisingly important part of the specification due to the situation that the client finds itself in…

The client’s current codebase is in a fairly bad way. A mixture of staff turnover, evolving requirements, tight deadlines, inexperience, lack of care, and lack of strong technical leadership has allowed the code to develop a nice selection of problems. It uses lots of in-house libraries, developed by other teams and of varying quality. The dependencies are complex and unplanned; there are several BSTR wrapper classes in use, multiple string types, and at least two coding standards per developer. The code is messy, complicated, tightly coupled, relatively buggy, and has few tests.

When we started the new component we had a choice. We could attempt to reuse some of the existing code and try to pull the rest of the codebase along with us towards our quality targets, or we could cut it loose, start afresh and then, once this component was done, build another new piece based on the new foundations. We chose the latter. Rather than building on sand (or worse) and continually patching up the problems, we’re building on firm foundations of testable, consistent code.

The idea is not to write it all from scratch; it’s to reuse the right pieces of code and ditch the rest. Our code is shy; it doesn’t trust other code that much. When we interface with other teams’ components we isolate them so that, should we need to replace them, we can. We don’t allow a potentially buggy component to infect the whole codebase with its types and interfaces; we use compilation firewalls (the pimpl idiom, or ‘pimples’) to help us layer the code in such a way that we can replace a component by writing something that provides the same services. We’re not setting out to rewrite the world, we’re just making sure we can if we need to. Be prepared: it may be easier to rewrite a component than to get the team that owns it to fix it. If your deadline depends on someone else fixing their bugs then you’re in trouble if you don’t have alternatives.
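To make the compilation firewall idea concrete, here is a minimal sketch rather than our actual code. The names `TextStore` and `legacy::Table` are hypothetical; `legacy::Table` stands in for another team's component. The sketch also assumes a modern compiler with `std::unique_ptr`; an older compiler would need a raw pointer and a hand-written destructor instead. Both "files" are shown in one block so that it compiles on its own.

```cpp
#include <map>
#include <memory>
#include <string>

// ---- text_store.h: all that the rest of our code ever sees ----
// Only standard types appear here; nothing from the other team's
// library leaks through this interface.
class TextStore {
public:
    TextStore();
    ~TextStore();  // defined in the .cpp, where Impl is a complete type
    TextStore(const TextStore&) = delete;
    TextStore& operator=(const TextStore&) = delete;

    void Put(const std::string& key, const std::string& value);
    std::string Get(const std::string& key) const;  // empty if missing

private:
    class Impl;                   // forward declaration: the firewall
    std::unique_ptr<Impl> impl_;  // the only link to the hidden details
};

// ---- text_store.cpp: the only file that knows about the library ----
namespace legacy {
// Hypothetical stand-in for a type owned by another team.
struct Table {
    std::map<std::string, std::string> rows;
};
}  // namespace legacy

class TextStore::Impl {
public:
    legacy::Table table;
};

TextStore::TextStore() : impl_(new Impl) {}
TextStore::~TextStore() = default;

void TextStore::Put(const std::string& key, const std::string& value) {
    impl_->table.rows[key] = value;
}

std::string TextStore::Get(const std::string& key) const {
    const auto it = impl_->table.rows.find(key);
    return it == impl_->table.rows.end() ? std::string() : it->second;
}
```

If `legacy::Table` ever had to go, only the implementation half would change; callers of `TextStore` would not even need recompiling, because the header never mentions the legacy type.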

Although we want to build on firm foundations, we realised that many of the problems we were addressing had already been explored in the old codebase. There’s value in there; it’s poorly realised, but some of it can be salvaged. Every piece of code that we reuse from the old codebase gets looked at, cleaned up, tested, and made internally consistent. It’s like knocking down a wall and rebuilding it with the same bricks; you don’t use the bricks until you’ve cleaned them up and removed all the old mortar. The point is that before we accept some code into the new codebase we ask ourselves: is the time saved by ‘reuse’ going to cost us more in the future due to unexpected and buggy behavior? We want to move forward as quickly as possible, but we’d rather spend longer on the foundations and build with confidence than cobble together something that looks OK at the start and then find that it’s unreliable later on.

We also follow one strict rule which helps with all of this: no known bugs. If we suddenly discover some unexpected behavior we stop what we’re doing, there and then, investigate it, and resolve the issue. We don’t add it to a list, we don’t ignore it, and we don’t pretend it didn’t happen. We fix it.

So far this has all been going well. We’re very close to integrating version 1 of the new component with the existing system. The new code is rock solid and has lots of tests. The fact that the code is so testable is now starting to help us in all sorts of unexpected ways. We can quickly write a test to measure the performance of key areas of the code. We can run the test harnesses under program validation and coverage tools. We can also mock up or instrument almost any part of the system, which has been useful during testing. It promises to be useful in the future too, by helping us to write a mock version of the entire component that we’re currently writing: one that will allow us to spread the testing into the remaining code in the system by providing a repeatable, controllable source of data that we can pass to the rest of the system.
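As a rough illustration of what a ‘repeatable, controllable source of data’ might look like, here is a small sketch. Everything in it is hypothetical: `IDataSource` stands in for whatever interface the rest of the system consumes, `ScriptedDataSource` is the mock, and `SumAll` is a placeholder for downstream code under test. It is not the real component’s interface.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical interface that the rest of the system depends on.
class IDataSource {
public:
    virtual ~IDataSource() {}
    // Writes the next value and returns true, or returns false
    // once the source is exhausted.
    virtual bool Next(double& value) = 0;
};

// Mock implementation: replays a fixed script of values, so every
// test run sees exactly the same data in exactly the same order.
class ScriptedDataSource : public IDataSource {
public:
    explicit ScriptedDataSource(const std::vector<double>& script)
        : script_(script), next_(0) {}

    bool Next(double& value) override {
        if (next_ >= script_.size()) {
            return false;
        }
        value = script_[next_++];
        return true;
    }

    // Rewinds to the start so the same script can be replayed.
    void Reset() { next_ = 0; }

private:
    std::vector<double> script_;
    std::size_t next_;
};

// Placeholder for downstream code: it sees only the interface, so
// it cannot tell the mock from the real component.
double SumAll(IDataSource& source) {
    double total = 0.0;
    double value = 0.0;
    while (source.Next(value)) {
        total += value;
    }
    return total;
}
```

Because the downstream code depends only on the interface, a test can feed it a known script, check the result, then reset and replay the same script to confirm that the behaviour is repeatable.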

Of course, all is not perfect. The code is new and different, and this doesn’t necessarily sit well with existing team members who have invested a lot of time in learning the tricks and workarounds needed to get the existing codebase to do things. Hopefully, once they see how much more productive they can be with a consistent and solid foundation, they’ll come around. I’m hoping that during the next stage of the work management will be pulling more and more of the existing team onto the new codebase. But before that we have integration, documentation and evangelism phases to go through…