Writing tests for new code vs writing tests for legacy code

| 2 Comments
I was working for my poker game client yesterday. This project now seems to be firmly test first. What was interesting with yesterday's work was how the tests drove the design and how when I finally came to integrate the tested code into the main body of code the required design changes weren't an issue as I had a whole load of tests to support the changes.

I was adding some code to manage the pot during play. The rules for Holdem Poker seem to be a bit complex in the area of who wins and how much they can win. Anyway, I started out test first because the other test first stuff I'd done with this project had gone well; and because I didn't know where to start.

Write a test.

First a use case: At the end of each round of play all bets that have been made need to be allocated to one pot or another. The pot that the money is allocated to depends on whether anyone went 'all in' during the round and if so for how much. Who's eligable to win each pot also depends on if they went 'all in' or if they folded (and are no longer eligable to win at all). Some players (actually, only one, I think) may get money back if they raise and everyone else folds - I'm not sure we actually need this special case, they'd win the pot anyway in this situation, but it's in the client's test spec.

That was a bit of a big use case so I wrote a test for part of it: At the end of the round of play all bets need to be allocated to a pot. Ok, so we have an object to test, our PotManager. This has a RoundComplete function which takes a list of bet amounts, player identifiers and some flags to say if the player is all in or folded his hand, it creates one or more pots and returns a list of players (probably zero, perhaps one) that are due a refund.

Write the test, stub out the object under test in the simplest way possible and watch the test pass. Wheee. We're rolling. Starting wasn't so hard after all.

Next: A test case for a certain set of bets from a certain set of players. None of them all in, none of them folded. This should result in a single pot of a specific value. So we add methods to access the pots, simplest thing possible, so we don't bother with a full blown iterator just yet. A GetFirstPot() method can return a pointer to the first pot or 0 if there isn't one. A GetNextPot() method can increment the internal iterator and return the next pot. I'd normally have provided 'proper' iteration functionality, but for now this will let the test pass. I can rewrite it when I find it breaks a test that needs to iterate the pots more than once at the same time... I skip the 'just return a pot with a hard coded value' stage and go directly to adding the player's bets into the pot. That's two tests and a chunk of design.

Now I'm in the swing of things I decide to mix it up a bit; a player goes all in for x and everyone else bets and calls for y; two pots, of differing values. We adjust the pot calculation logic to account to multiple pots and teach it how and when to create a new one.

Now we need to know who's eligible for each pot. The pot needs to know who can win it. We add a list of player ids to each pot. Everyone who has contributed to the pot and hasn't folded can win it. Add to each of the last two tests so that they also test the list of eligible players for each pot. Add the methods that return the list so that the tests compile again, watch them fail; implement it.

People who fold contribute to a pot, if they fold after they've bet, but aren't eligible. Add a test, watch it fail, fix the code.

Add a few more complicated tests; ones that involve multiple rounds, different people being all in, people folding, etc. Work out the values of the pots and the eligibility on paper, code the test watch it pass... :0

All of this had been done without even opening the main server project. Now I had some working code I decided to integrate. I hadn't really had much of a clue how the code would look when I started; I didn't know where to start, remember. The code I had at this point seemed pretty good, it did the job and passed the tests. The iteration wasn't quite as polished as I'd normally make it, but I could live with the GetFirst(), GetNext() style for now.

Integration was a pain. I'd been working with a half remembered idea of the information that the server had available at the end of a round. Unfortunately it didn't have the players who folded during the round easily available. My design was broken. Ooops.

Never mind. Rather than having a single RoundComplete() function that we call at the end of every round we can have an UpdatePlayerBet() function that we call each time a player's bet changes. We can also have a PlayerIsIneligible() function that gets called when people fold. Now we call these two throughout the round and then call RoundComplete() when the round is, well, complete. Sounds good so far; I adjust the tests. Nothing compiles, let alone runs. :(

I adjust the PotManager code and the tests run and pass. I integrate the code into the server and play a hand or two. Looking good...

The interesting thing is that the code I've ended up with is very simple. There's no unnecessary complexity and it does just what I need and no more. It doesn't look much like I expected it to look like, but that's not necessarily bad. I didn't really understand the problem when I started to write the first test. Now I do and I can refactor, if I need to (and I don't actually think I do). If I do decide to refactor then I have my tests to guide me; well, to shout at me when I screw things up anyway...

Writing tests this way is much easier than retro-fitting them to legacy code. I seem to end up with more code and more of it is in the tests than I would expect. The actual production code is, it seems, so far, smaller than I'd normally expect...

I was about to say "less to maintain", but, of course, that's not true. The tests need as much maintenance as the production code. The thing is though; the tests and the production code will help keep each other in line. If one breaks during maintenance then you'll know about it...

When I started the day I didn't know how I was going to implement the feature. By the end of the day I had done the work, had tests to back it up, understood more about the problem domain because I'd thought about the edge cases, and had thoroughly enjoyed myself in the process.

Good stuff!

2 Comments

"the tests and the production code will help keep each other in line" -- interesting. It makes me think of double-entry bookkeeping.

Yeah, I guess it is a bit like that. Whats good about the testing we're doing for the FX project is that the way the test code works out the expected results is completely different to how the existing FX code does it. The tests are using the methods shown in the text books - with appropriate comments to the pages in question. The actual library code is a bit more opaque at present. I guess that once we refactor the library code we'll end up using the current test harness calcs in the library and move on to using manually calculated and checked values in the test... Either way the numbers come from two different sources, which should help protect us from errors.

Leave a comment