Unit testing and accessing external systems

There’s a lot of talk about how unit tests shouldn’t touch the network or the file system or databases whilst they’re running. Michael Feathers even has a set of unit test “rules” (A Set of Unit Testing Rules) which go so far as to suggest that:

“A test is not a unit test if:

  • It talks to the database

  • It communicates across the network

  • It touches the file system

  • It can’t run at the same time as any of your other unit tests

  • You have to do special things to your environment (such as editing config files) to run it.”

Which pretty much boils down to “a test is not a unit test if it relies on an external system”. This may, at first, seem to be a little harsh until you consider what abiding by this rule actually means in practice.

The people who feel that their unit tests must access external systems such as databases may have a point. They seem to feel that it’s just nitpicking to suggest that if a class accesses an external system then the tests for it aren’t unit tests. Cedric, over at Otaku, suggests grouping unit tests by type so you can simply say “these tests access a database and are slow” and then be able to run just your non-database tests or just your fast tests, etc. Phil Haack, at Haacked, suggests scripting your database so that the test creates and populates the database before it runs. Both of these suggestions are reasonable positions to take and, in fact, I often do both. However, once again it’s Jeremy D. Miller, at The Shade Tree Developer, who hits the nail on the head for me, with his First Rule of TDD: “Isolate the ugly stuff”.

Whilst both Cedric and Phil deal with how you can make it possible to test external systems when you have to, Jeremy points out that what you should really be doing is ensuring that you have to jump through these extra hoops for as few tests as possible. In summary: don’t litter your code with access to external systems; define an interface for communicating with each system and have the rest of your code use that interface. Once you’ve done this you can use Dependency Injection (parameterize from above) to provide this interface to the parts of the code that need it. This usually isolates much of the code from the external systems and allows you to mock up the external system for most of your tests.
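As a rough illustration of that idea, here is a minimal sketch in Python. Every name in it (CustomerStore, Greeter, InMemoryCustomerStore) is hypothetical and invented for this example; none of them come from the posts mentioned above. The interface is the only route to the external system, the business logic receives it from above, and the tests hand in an in-memory stand-in.

```python
from typing import Dict, Optional, Protocol


class CustomerStore(Protocol):
    """Hypothetical interface: the only way the rest of the code reaches the database."""

    def email_for(self, customer_id: int) -> Optional[str]:
        ...


class Greeter:
    """Business logic. It is given its store from above rather than creating one."""

    def __init__(self, store: CustomerStore) -> None:
        self._store = store

    def greeting_for(self, customer_id: int) -> str:
        email = self._store.email_for(customer_id)
        if email is None:
            return "Hello, guest"
        return f"Hello, {email}"


class InMemoryCustomerStore:
    """Test double: satisfies CustomerStore without touching any external system."""

    def __init__(self, emails: Dict[int, str]) -> None:
        self._emails = emails

    def email_for(self, customer_id: int) -> Optional[str]:
        return self._emails.get(customer_id)


def test_greets_known_customer() -> None:
    greeter = Greeter(InMemoryCustomerStore({1: "ann@example.com"}))
    assert greeter.greeting_for(1) == "Hello, ann@example.com"


def test_greets_unknown_customer_as_guest() -> None:
    greeter = Greeter(InMemoryCustomerStore({}))
    assert greeter.greeting_for(2) == "Hello, guest"
```

Neither test needs a database, a network or any special environment, so both stay within the rules. Only the real implementation of the interface, which is not shown here, would need the slower kind of test.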

If you try very hard to stick to Michael Feathers’ Unit Testing Rules then you will find that you become very keen to restrict the amount of code that accesses external systems of any kind. Soon you’ll find that you routinely wrap access to these systems up in a single place and provide an interface that the rest of the code can use to interact with the system. Suddenly your code is no longer coupled to a concrete implementation of the external system. Your code is more flexible. Not only can you test it easily and without “breaking the rules” but you have isolated the areas of code that need to change if you decide to use a different external system. For me this is one of the most important aspects of TDD; it drives the design in the right direction. The rules may seem arbitrary but all of them drive you in the same direction, towards flexible, loosely coupled code that’s easy to test and easy to change. Like most rules, you sometimes need to break them, but, if you try and stick with them you may find that you actually have to break them far less often than you think.

Obviously this still leaves you with the need to test the code that does have to access the external systems. This is when Cedric and Phil’s suggestions come in very handy. Separate out your “slow”, external system tests and, if you can, make sure that you can still just pull the system out of your revision control system and build and run the tests with a single command… Be pragmatic: sometimes it may not be worth the effort to script an entire database. In that case, accept that you’re taking a risk and move on. At least it’s only one small piece of untested code rather than a whole system that is hard or “impossible” to test.
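To make those two suggestions a little more concrete, here is a hedged sketch using Python’s standard unittest and sqlite3 modules. The in-memory SQLite database merely stands in for whatever real database you use. The SqliteCustomerStore class, the schema and the SKIP_SLOW_TESTS environment variable are all assumptions made up for this example, not anything Cedric or Phil prescribe.

```python
import os
import sqlite3
import unittest

# Assumed schema and seed data, scripted so the test can build the database itself.
SCHEMA_AND_DATA = """
CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT NOT NULL);
INSERT INTO customers (id, email) VALUES (1, 'ann@example.com');
"""


class SqliteCustomerStore:
    """Hypothetical wrapper: the one place in the code that talks to the database."""

    def __init__(self, connection: sqlite3.Connection) -> None:
        self._connection = connection

    def email_for(self, customer_id: int):
        row = self._connection.execute(
            "SELECT email FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return row[0] if row else None


# Assumed convention: set SKIP_SLOW_TESTS=1 to run only the fast tests.
@unittest.skipIf(os.environ.get("SKIP_SLOW_TESTS") == "1", "slow: touches a database")
class SqliteCustomerStoreTest(unittest.TestCase):
    def setUp(self) -> None:
        # Create and populate a fresh database before every test.
        self.connection = sqlite3.connect(":memory:")
        self.connection.executescript(SCHEMA_AND_DATA)

    def tearDown(self) -> None:
        self.connection.close()

    def test_finds_existing_customer(self) -> None:
        store = SqliteCustomerStore(self.connection)
        self.assertEqual(store.email_for(1), "ann@example.com")

    def test_returns_none_for_missing_customer(self) -> None:
        store = SqliteCustomerStore(self.connection)
        self.assertIsNone(store.email_for(99))
```

With the class saved in a test module, running python -m unittest executes everything, and setting the assumed SKIP_SLOW_TESTS variable to 1 first leaves out the database tests.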

This technique works for all external systems: databases, file systems, network access, message queues, Windows event logs, etc. Follow the rules, accept that it will change how you design code, and go with it. The results are well worth the journey.