James Antill doesn't like TDD

And this is why I hate TDD, testing is a great thing. But testing too early is bad, and you are obviously doing that. First you need to know what your code has to do in full. For instance even if you wanted to have both sync. and async. APIs (I personally abhor sync. APIs due to non-scalability) the obvious implementation is to have something like…

James Antill - in a comment on my Tangled Testing entry.

James, I disagree…

In Tangled Testing I reported that I’d started to write some POP3 client code and that I was using the server code and much of the server test harness with some shims in place of the input and output code to link the client to the server. I mentioned that I had originally thought to make the client async but decided to start off with a synchronous version as it was easier to write a test for.

James seems to be concerned that TDD was sending me in the wrong direction. He suggests that I’ll end up throwing lots of code away if I move from a synchronous solution to an asynchronous solution. He expects the results of incrementally working towards an asynchronous solution via TDD will be inefficient and poorly designed. He also suggests that it would be better to craft my synchronous solution in terms of an underlying asynchronous implementation.

I disagree and I actually think that James may have misunderstood what I’m trying to do; but first a little background.

I’m currently writing an email application as a way of experimenting with TDD. The requirements are to be able to retrieve mail from a POP3 server, process it and then make the mail available to email clients via POP3. Think of it as a filter. I decided to start by writing the server side POP3 protocol handler and integrating it into an existing server framework; this went pretty well. Now I’m working on the client. I need to be able to send POP3 commands to a server and receive the responses and parse them.

I’m writing the client code using a synchronous, method-call style API; I’ll call a method and it will block until it has the results, at which point it will return them. This is easy to test but means the thread making the call can’t do anything else whilst it’s waiting for the network IO to complete. This is adequate for my current requirement: to retrieve messages from a server so that I can make them available to the filter code. If the filtering part of the requirements works well then I can see that I might want to retrieve messages from multiple POP3 servers, and I wouldn’t want to have to spawn a thread or a process for each connection; hence my comment about having the client operate in an async manner. Right now the important thing is to get to the point where I can grab the messages and make them available to the filter.
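As a rough sketch of that blocking style, here is a minimal example in Python rather than the Win32 code discussed in this post. The class name `SyncPop3Client` and its `command` method are hypothetical, and it only handles single-line replies (a status line ending in CRLF); multi-line replies such as RETR would need more work.

```python
import socket


class SyncPop3Client:
    """Hypothetical blocking POP3 client; all names here are illustrative."""

    def __init__(self, sock):
        self._sock = sock      # any connected, blocking socket
        self._buffer = b""     # bytes received but not yet consumed

    def command(self, line):
        """Send one command and block until a full single-line reply arrives."""
        self._sock.sendall(line.encode("ascii") + b"\r\n")
        while b"\r\n" not in self._buffer:
            chunk = self._sock.recv(4096)  # the calling thread waits here
            if not chunk:
                raise ConnectionError("server closed the connection")
            self._buffer += chunk
        reply, _, self._buffer = self._buffer.partition(b"\r\n")
        text = reply.decode("ascii")
        # POP3 status lines start with "+OK" on success and "-ERR" on failure.
        return text.startswith("+OK"), text
```

Because the client only needs something socket-like, a test can hand it one end of `socket.socketpair()` and play the server on the other end, which is roughly the role the shims play in my own tests.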

James suggests that an obvious way to build the API is to write a pair of functions: one to initiate the remote call and a second to wait for and retrieve the results when available. A sync call is simply calling these two parts one directly after the other. That’s one way to do it, but it would mean making all kinds of interesting design decisions at this point; given James’ Unix background, we’d probably need a thread to manage the outstanding async calls, perhaps using poll or select or whatever to multiplex reads on all the sockets that we are expecting to receive data on. At this point in the development I don’t want to worry about any of that, and we’re doing this using Winsock on Windows so things could be done differently… So, for now I intend to just push data to the server using a call to WriteFile and then loop on a blocking call to ReadFile whilst accumulating the response. Then I need to parse the results and present them in an appropriate way. It seems to me that the code I’m actually interested in is pretty much the same in both circumstances: I’ll need to send data, receive data and parse responses. If I take James’ approach I still need to do all of this; it’s been a while since I’ve worked at this level on Unix, but from his web page it seems that doing the network IO properly can be non-trivial over there.
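To make the pair-of-functions idea concrete, here is a minimal Python sketch of how I read it. This is my interpretation and not James’ code; the names `SplitCallClient`, `begin`, `wait` and `call` are all hypothetical, and it deliberately leaves out the poll/select multiplexing that a real version would need for many sockets.

```python
import socket


class SplitCallClient:
    """Hypothetical client that splits each call into 'initiate' and 'wait'."""

    def __init__(self, sock):
        self._sock = sock
        self._buffer = b""
        self._in_flight = False   # at most one outstanding call in this sketch

    def begin(self, line):
        """Initiate the remote call: send the command, then return at once."""
        if self._in_flight:
            raise RuntimeError("a call is already outstanding")
        self._sock.sendall(line.encode("ascii") + b"\r\n")
        self._in_flight = True

    def wait(self):
        """Block until the outstanding single-line reply is complete."""
        if not self._in_flight:
            raise RuntimeError("no outstanding call")
        while b"\r\n" not in self._buffer:
            chunk = self._sock.recv(4096)
            if not chunk:
                raise ConnectionError("server closed the connection")
            self._buffer += chunk
        reply, _, self._buffer = self._buffer.partition(b"\r\n")
        self._in_flight = False
        return reply.decode("ascii")

    def call(self, line):
        """The sync API is just the two halves, one directly after the other."""
        self.begin(line)
        return self.wait()
```

Note that the sending, receiving and reply-splitting code is the same as it would be in a purely blocking client; only the point at which the caller gets control back differs.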

The actual async API that I’ll require is likely to look a little different to James’ proposal. If all goes well, I’ll be integrating this client code into the POP3 server. This server is built on the Win32 concept of overlapped IO using IO completion ports. We have some threads that get notified when socket IO completes; we associate some state with the socket and, in effect, build an event-driven state machine which operates as reads and writes complete on the socket. This means that my async requests will be given to the OS in a non-blocking manner, and the OS will then notify me when they complete. Once in overlapped IO mode it will appear to my code that a WriteFile will succeed immediately; the OS will handle writing data to the network and my thread will be free to do whatever it likes. A ReadFile will also complete immediately but it won’t return any data; once the data has been read from the network the OS will let me know. Thus, the resulting async API is likely to be ‘push’ rather than ‘pull’. It will send its request to the server and issue a read, and we’ll keep track of our state and give up the shared thread that we’re using at the time. When the read completes a thread will be woken up to deal with the IO completion and we’ll look at our state, work out what we were doing and, most probably, push the data that just arrived into the POP3 client. We can then check to see if the client has a complete response; if it does we can act on it, and if it doesn’t we will simply stay in this state, issue another read, and let go of our thread.
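The push-style flow described above can be sketched without any real overlapped IO. In the Python example below the three callbacks are stand-ins for what the server framework and the OS would provide; `PushPop3Client`, `send_command` and `on_read_completed` are hypothetical names, and nothing here touches IO completion ports.

```python
class PushPop3Client:
    """Hypothetical push-style client: it never reads, data is handed to it."""

    def __init__(self, write, issue_read, on_reply):
        self._write = write            # callable(bytes): start an async write
        self._issue_read = issue_read  # callable(): ask for one more async read
        self._on_reply = on_reply      # callable(str): handle a complete reply
        self._buffer = b""             # per-connection state kept between events

    def send_command(self, line):
        """Queue the request and a read, then return so the thread is free."""
        self._write(line.encode("ascii") + b"\r\n")
        self._issue_read()

    def on_read_completed(self, data):
        """Called by whichever thread handles the read completion."""
        self._buffer += data
        if b"\r\n" not in self._buffer:
            # Incomplete reply: stay in this state and ask for more data.
            self._issue_read()
            return
        reply, _, self._buffer = self._buffer.partition(b"\r\n")
        self._on_reply(reply.decode("ascii"))
```

In a test the callbacks can simply append to lists, so the state machine can be driven one completion at a time with no sockets or threads involved.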

So, at present I don’t know if I’ll need the async version; I want to get something working quickly and a sync call version of the API will do just fine. It won’t be perfect but much of the code will remain the same even if we do need to do the async thing in a future iteration. I think many of James’ concerns are based on a culture clash. With the Windows-specific IO method that I intend to use, I don’t have the same problems as James has over on Unix. My input and output calls will stay completely isolated from each other and operate asynchronously thanks to the support of the OS. The design change will involve removing code from the sync API; rather than allowing it to read and process data by pulling that data, we’ll let it issue a read and then return. When the data is available it will be pushed into it and the work will continue from where it left off.
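One way to picture why so much of the code survives the change is to keep the reply handling in a single object and vary only who drives it. The Python sketch below assumes single-line replies, and the names `ReplyAccumulator`, `pull_reply` and `push_data` are hypothetical.

```python
class ReplyAccumulator:
    """Shared core: collects bytes and yields one complete reply line."""

    def __init__(self):
        self._buffer = b""

    def feed(self, data):
        """Add received bytes; return the reply text once complete, else None."""
        self._buffer += data
        if b"\r\n" not in self._buffer:
            return None
        reply, _, self._buffer = self._buffer.partition(b"\r\n")
        return reply.decode("ascii")


def pull_reply(read, accumulator):
    """Sync version: loop on a blocking read() until a reply is complete."""
    while True:
        data = read()
        if not data:
            raise ConnectionError("server closed the connection")
        reply = accumulator.feed(data)
        if reply is not None:
            return reply


def push_data(data, accumulator, issue_read, on_reply):
    """Async version: no loop; handle one completed read and return."""
    reply = accumulator.feed(data)
    if reply is None:
        issue_read()       # not enough data yet, so request another read
    else:
        on_reply(reply)
```

Moving from `pull_reply` to `push_data` removes the loop and the blocking read but leaves `ReplyAccumulator` untouched, which is the kind of change I expect to be making.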

In this particular situation I’m quite confident that TDD is taking me in the right direction, but I’ll keep an eye out for the issues James has raised.