January 25, 2010
Happy Zeroth birthday Scott Holgate!
My baby son Scott was born today, Jan 25th 2010, at 1.58am. He weighed 6lb 15oz and both he and his wonderful mother are doing fine.
January 14, 2010
Performance improvements in the Socket Server Framework
I'm just finishing a batch of work that will be included in the 6.2 release of my IOCP based, high performance Windows client and server framework and which improves the performance of the framework. There are two main improvements; the first is that the filtering API is now completely optional. There are now filtering and non-filtering base classes so that you only need to include filtering support if you need it. This removes a considerable amount of code that was not required for the non-filtering servers and improves performance by a small amount. The framework base classes are now templatised so that filtering is included only if you want it. There are no changes to user code unless you are actually using filtering, in which case you need to switch your base classes to the new filtering ones.
Whilst making these filtering changes the filtering API has been brought to the datagram side of the framework and I'm currently working on various filters and examples of their use; as with the stream socket filtering API you need to select the correct base class to include the functionality.
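As a rough illustration of how templatised base classes can make filtering opt-in, here is a minimal sketch. The names (TConnectionManager, NoFiltering, UpperCaseFiltering) are hypothetical and are not the framework's real classes; the point is only that the filtering code is compiled in when a filtering policy is selected and is absent otherwise.

```cpp
#include <cassert>
#include <string>

// Hypothetical policy: no filtering, data passes straight through.
struct NoFiltering
{
    static std::string OnRead(const std::string &data) { return data; }
};

// Hypothetical policy: a toy filter that upper-cases ASCII data.
struct UpperCaseFiltering
{
    static std::string OnRead(std::string data)
    {
        for (char &c : data)
        {
            if (c >= 'a' && c <= 'z')
            {
                c = static_cast<char>(c - 'a' + 'A');
            }
        }
        return data;
    }
};

// Hypothetical base class: the filter code is only instantiated for the
// policy that the derived server actually selects.
template <typename FilterPolicy>
class TConnectionManager
{
public:
    std::string HandleRead(const std::string &data) const
    {
        return FilterPolicy::OnRead(data);
    }
};

// The two "base classes" a user would choose between.
using CNonFilteringConnectionManager = TConnectionManager<NoFiltering>;
using CFilteringConnectionManager = TConnectionManager<UpperCaseFiltering>;
```

A server that never filters derives from the non-filtering alias and pays nothing for the filtering path; one that does filter switches its base to the filtering alias.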
The second performance improvement is only available for code running on Vista/Server 2008 and later operating systems and involves incorporating the changes that I spoke of here and here to enable FILE_SKIP_COMPLETION_PORT_ON_SUCCESS and FILE_SKIP_SET_EVENT_ON_HANDLE. This is a configurable option that can be set in Config.h and which can be set differently for stream and datagram sockets (mainly because there appear to be some issues with setting these options on datagram sockets; see here and here). In testing, these options make quite a considerable difference to performance and I expect that they'll be even more effective on heavily loaded servers, as they reduce the amount of thread context switching: many I/O completions can be handled on the thread that issued the I/O request rather than by pushing the completion through the I/O completion port.
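The change in completion dispatch can be modelled portably. The sketch below is not framework code and all of its names are hypothetical. On Windows the flag is set with SetFileCompletionNotificationModes(), and the three outcomes correspond to WSARecv() returning 0, failing with WSA_IO_PENDING, or failing with some other error.

```cpp
#include <cassert>

// Outcome of issuing an overlapped I/O call (hypothetical model type).
enum class IssueResult { CompletedImmediately, Pending, Failed };

// Where the result of the call ends up being processed.
enum class Dispatch { HandleOnIssuingThread, WaitForCompletionPort, ReportError };

Dispatch DecideDispatch(IssueResult result, bool skipCompletionPortOnSuccess)
{
    switch (result)
    {
        case IssueResult::CompletedImmediately:
            // With the flag set, no completion packet is queued for an
            // immediate success, so the issuing thread must process the
            // result itself. Without the flag a packet is still queued.
            return skipCompletionPortOnSuccess ? Dispatch::HandleOnIssuingThread
                                               : Dispatch::WaitForCompletionPort;

        case IssueResult::Pending:
            // The operation will complete later via the completion port.
            return Dispatch::WaitForCompletionPort;

        case IssueResult::Failed:
            // The call failed outright; no completion will be queued.
            return Dispatch::ReportError;
    }

    return Dispatch::ReportError;
}
```

The saving comes from the first case: the completion is handled without a trip through the port and without a potential context switch to another I/O thread.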
These changes will be available in release 6.2 which currently doesn't have a release date but which should be available in the first quarter of 2010.
January 06, 2010
FILE_SKIP_COMPLETION_PORT_ON_SUCCESS and datagram socket read errors
As I mentioned a while back I've been looking at incorporating some simple performance gains in the framework by following the advice given over at the Windows Server Performance Team Blog. Specifically the advice from part three of the "Designing Applications for High Performance" series of postings.
Whilst I'd done some quick and dirty tests with the FILE_SKIP_COMPLETION_PORT_ON_SUCCESS flag of SetFileCompletionNotificationModes() and all looked good I'm now working my way through my unit tests and adjusting them to work both when this option is enabled and when it's not (completion dispatch changes somewhat when it's enabled and my tests are invasive enough to notice this).
I've come across a strangeness. I have a test which tests how the framework behaves when you attempt to read a datagram but you don't supply a big enough buffer. When FILE_SKIP_COMPLETION_PORT_ON_SUCCESS is NOT enabled then I get an error from the I/O completion port; "More data is available". However, when FILE_SKIP_COMPLETION_PORT_ON_SUCCESS IS enabled the read simply never completes; not even after the socket is closed...
It seems that everything works as expected if there is a read pending before the data arrives; I get the "More data is available" error as expected both when FILE_SKIP_COMPLETION_PORT_ON_SUCCESS is enabled and when it is not. However if there isn't a read pending when the data arrives then if FILE_SKIP_COMPLETION_PORT_ON_SUCCESS is not enabled I'll get the "More data" error when I post a read (the error completes through the IOCP and not directly from the call to WSARecv()) and this is as I'd expect, BUT if FILE_SKIP_COMPLETION_PORT_ON_SUCCESS is enabled then the read never completes, not even after the socket is closed...
I need to do some more investigation into this but right now it looks like a fairly serious issue for me and may mean that this performance improvement will NOT be enabled for datagram sockets in 6.2.
Updated: 14/01/2010 - the guys over at the Windows Server Performance Blog have confirmed that this behaviour is their bug and not mine.
December 24, 2009
DevPartner Studio 9.1
I've complained about DevPartner Studio enough in the past (here, here, here, etc.) that I thought I should write a positive blog posting since my recent experiences have been very positive.
Some time ago I reported a bug in the BoundsChecker part of the product which meant that it hung sometimes in multi-threaded code. I managed to get a reasonably straightforward reproduction and raised an issue with them. It took a while but this is now fixed in 9.1 and this alone makes the product much more usable for me - having your diagnostic tool hang is never that useful!
However 9.1 is good for me in several other ways; it runs on x64, it runs on Windows 7 and it seems considerably faster than previous versions.
I'm not sure if the last one is just my perception due to the fact that I'm finally running it on my main development box rather than on an older box. Either way it's now something that I CAN run on the servers that I'm developing and get reasonably real results out of any tests that I care to run to exercise the server. In the past it's always been too slow to run my servers under these kinds of tools as the overhead of the API interception and whatever made the servers run unreasonably slowly.
Being able to run the tool all the time rather than just when I decide to do a special run of it on my test harnesses is much better and makes it much more useful to me. Now, if only there were a cost effective way to have it run on my build machines and report failures to me automatically - right now there's no point in working out a solution to the second problem as I'd have to purchase full licenses for each of my build machines and that isn't going to happen... This is unfortunate as I think the place for these kinds of tools is in the automatic build tool chain. If it were to run as part of my test suite then I would catch issues with it that much faster; the usage would switch from being reactive to being proactive...
Now, of course, I'll find some really annoying bugs as soon as I start to turn on more of the features (the API validation part always seems to be out of date or incorrect in some key API that I use all the time - (it's a real shame that the config for this isn't supplied as user editable XML files or something!)).
December 23, 2009
Bug fix for obscure shutdown sequence race conditions
Charles Dickens said it best in A Tale of Two Cities; "It was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness, it was the epoch of belief, it was the epoch of incredulity, it was the season of Light, it was the season of Darkness, it was the spring of hope, it was the winter of despair, we had everything before us, we had nothing before us, we were all going direct to heaven, we were all going direct the other way".
In short - debugging complex race conditions can be an emotional roller-coaster...
Last week some developers at a client of mine reported a shutdown problem during stress testing their IVR system. The problem was in the API I'd developed for them using the server framework. In certain situations they were getting a pure virtual function call error from the connection manager during shutdown. Since I have a policy of 'no known bugs' where fixing bugs is the most important work that I can be doing (see The Joel Test, item 5) I immediately switched from what I was doing to looking at this problem. At this point I was convinced that it was a simple issue and that it wouldn't take more than a couple of hours at most to resolve.
At first I thought it was a simple case of failing to wait for the connection manager to stop before allowing its destruction to complete. The normal model for developing a custom asynchronous client is to derive from the standard CStreamSocketConnectionManager class and also from the IStreamSocketConnectionManagerCallback interface and pass yourself into your connection manager base as the callback interface. Given that, you must be certain that you wait for the connection manager to shut down completely in your destructor; otherwise the destructor will allow you (and the callback interface) to be destroyed before the connection manager stops during its destruction, and in this case there's a race condition between your destruction and that of your base class...
Anyway, it wasn't that. But luckily my client's developers had managed to work out a sequence of breakpoints that made reproduction of the problem easier. I managed to reproduce the problem and the purecall WAS due to the race condition that I detailed above but only because the call that was trying to wait for the connection manager base class to cleanly shut down was throwing an exception during its operation and this meant that we weren't waiting for the shutdown to complete before the destructor was called... The exception that was being thrown was due to us attempting to increment a socket's reference count from 0 so that we could dispatch the socket to the I/O pool to provide an asynchronous 'connection closed' notification when we aborted the connection.
During the shutdown of a connection manager object we have to abort all outstanding connections. This is essential as we cannot allow connections to outlive the connection manager that is responsible for them or we end up with connections generating events and calling back into a connection manager that no longer exists. The abort code simply asks the socket allocator to walk the list of active sockets and abort each one. It was during this traversal of the active sockets that the exception was being generated.
The client didn't have any active socket connections at the time of shutdown; at the point where the shutdown code was being called we had completed all of our connections and released all of our sockets. Unfortunately, the race condition that was causing the problem was actually between the socket's final call to release reducing the socket's reference count to zero and removing it from the socket allocator's active list and our abort code iterating that list. This final release was happening on a thread on the I/O pool as a read was completed in error due to the socket itself being shut down; whilst this was happening the main thread was happily walking along the socket allocator's active list and aborting each connection.
Of course my first reaction was to jump into the middle of the Kübler-Ross model and try and bargain my way out of the problem. I had skipped through stage 1; denial as the client clearly had a reproducible bug. I'm getting better at skipping stage 2; anger, and I was quite clearly to blame as they'd reduced the issue down to only using code that I'd developed for them... Given my incomplete understanding of the actual problem at that time, I tried to solve the obvious issue. Thus I wasted a day with attempts to allow the abort code to legally increase the reference count from zero in this 'special' situation...
The reason I took this somewhat strange approach is, I think, because there's already a situation where this kind of strangeness goes on and it's directly related to the final release of a socket. During socket destruction we often need to close the socket, closing the socket generates a close notification and this close notification results in the socket being passed back into user code. At the point where we pass the socket back into user code we have a socket with a reference count of zero. Since we're passing the socket into user code and the user might want to do something with it (a common requirement is to dispatch the close event to a thread pool for processing) you MUST be able to manipulate the reference count of the socket inside the notification callback. To enable this the final release of a socket can legally reanimate the socket if it has just closed the socket by incrementing its reference count from zero to one before calling the notification callback. The socket destruction process then puts itself on hold, and when the socket is released to zero next time the socket is already closed and so can't be reanimated again and finally gets destroyed. This has been in place for a long time now and works well. Unfortunately no other code can work this magic.
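The reanimation trick described above can be sketched in a few lines. This is a simplified, single-socket model and not the framework's code: the class and member names are hypothetical, std::atomic stands in for InterlockedIncrement()/InterlockedDecrement(), and a flag stands in for the real destruction.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a framework socket.
class Socket
{
public:
    explicit Socket(std::function<void(Socket &)> onClosed)
        : m_ref(1), m_open(true), m_destroyed(false), m_onClosed(std::move(onClosed))
    {
    }

    void AddRef() { ++m_ref; }

    void Release()
    {
        if (--m_ref != 0)
        {
            return; // someone else still owns a reference
        }

        if (m_open)
        {
            // The final release closes the socket and reanimates it
            // (0 -> 1) so that user code can legally AddRef()/Release()
            // inside the close notification.
            m_open = false;
            ++m_ref;
            m_onClosed(*this);
            Release(); // drop the reanimation reference
            return;
        }

        // Already closed, so it cannot be reanimated again.
        m_destroyed = true; // stand-in for the real destruction
    }

    bool Destroyed() const { return m_destroyed; }

private:
    std::atomic<long> m_ref;
    bool m_open;
    bool m_destroyed;
    std::function<void(Socket &)> m_onClosed;
};
```

If the close callback takes its own reference (for example, to hand the socket to a thread pool) destruction is put on hold until that reference is released; the second time the count hits zero the socket is already closed and is destroyed.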
I hadn't fully understood and accepted that at the time; or I had considered it but the changes always seemed so simple that they would obviously solve my problem... I cycled through stages 1-4 with a detour into "elation" every time I was convinced that I'd solved the problem.
Unfortunately my periods of elation were short lived; they were merely small periods of "The best of times" during what was proving to be the age of foolishness... Luckily for me the client's hardware reproduced the problems much more reliably than my main development machine and so none of this nonsense got into production.
The problem is that the change DID solve the problem of an exception being thrown when the socket had its reference count increased during the dispatch of the close notification but that simply shifted the problem to an access violation when the socket was then accessed again after the dispatch. I'd spent almost a day stuck in a rut trying to solve the symptom rather than discovering the actual problem.
At this point I decided to call it a day. I'd been thrashing around the problem for most of the day and into the night; the client was under pressure because they only had a small window on Saturday when they could put a fix to the problem live at their client's site. I was tired, they were tired and it was clear to everyone that I wasn't actually getting anywhere. Before going to bed I sat down once more with the code and tried to look at it with fresh eyes; I didn't expect to find anything but I knew that if I didn't at least try then I would be waking up in the night trying to solve the problem. I convinced myself that I'd worked out the (startlingly obvious) underlying reason for the problem and went to bed.
Looking at the problem with fresh eyes in the morning finally brought me the realisation of what was actually wrong (the analysis I'd come up with before bed was correct although slightly incomplete), I also realised that I'd seen the symptoms before...
Whilst 'no known bugs' is a very important thing to me, and the approach has worked very well over the years, occasionally there are things that show up once in a blue moon and that are nigh on impossible to get a reproducible bug from. I'd recently seen a couple of instances of the "incrementing a reference count from 0 during shutdown" problem in automated test runs on one of my build machines (but only one). The problem showed up during the shutdown of a multiple client test harness program that I run as part of my build tests for the server framework examples. I have around 55 example servers that ship as part of the licensed framework and most of these have the client test run against them during the build, each of these is built on between two and three build machines, with different operating systems, different numbers of CPU cores, using 5 different compilers and 5 different build configurations. All of the source is rebuilt at least 3 or 4 times on the build machines prior to a new release (mainly as last minute changes sneak in) and considerably more often during normal development. In all of that time I've seen this issue only a couple of times. It was hard to reproduce and only one build machine seemed to cause it to occur at all. I'd previously thought that it was due to another bug in the multiple client application but that had been fixed a couple of releases ago...
Anyway. The root cause of the problem was the rather obvious fact that the code that iterates the socket allocator's list of active sockets and calls abort on them is doing so without owning a reference to the socket that it's working with. The socket allocator or connection manager can't own a reference to the socket as that would screw with the whole way that the reference counting scheme works. Although I had locked the socket allocator's active socket list, so the destruction of a socket couldn't actually complete whilst the abort was working, and although I had locked the socket's own lock before manipulating it, there was nothing to stop a socket getting to the point where it was very very nearly destroyed (a reference count of zero with the thread blocked in the final stages of destruction on the socket's lock). As soon as the abort completed the socket would be destroyed. Trying to be clever and increment the reference count from zero so that I could post a notification in this situation was doomed from the very beginning.
At this point I should, perhaps, point out that there are race conditions and there are race conditions... Sometimes it doesn't matter which order things complete in and it's quite OK not to know. There are ways of manipulating some parts of an object's state without needing to lock the entire object; a subset of state is manipulated atomically in one thread and another subset is manipulated atomically in another. In these situations it's often unknown which thread might complete its work first; it doesn't matter as long as all combinations of completion order result in an object that is in a stable state. Since each thread is manipulating its subset of state in an atomic fashion this is the case.
There's deliberately no locking in the socket's reference counting operations; they use InterlockedIncrement() and InterlockedDecrement() to manipulate the reference count atomically and when the reference count hits zero we know that nobody else is holding a reference so nobody else will be manipulating the socket so there's no need to lock it during much of its destruction. Since our abort code doesn't own a reference to the socket there's a race condition between the last reference that's held being released and our abort completing. No amount of locking will help us work around this without locking on entry to Release() and that is something that is just not going to happen and something that would simply change the problem to one of deadlock if it did happen; the final release would lock the socket lock and then the allocator lock, the connection manager shutdown abort would lock the allocator lock and then the socket lock... So there's always going to be a race on a zero reference if you're trying to manipulate a socket that you don't own a reference to... The problem here is not the race itself but the fact that you shouldn't be manipulating a socket that you don't own a reference to.
At this point I designed a solution to the problem, implemented it, tested it to death on my main development machine and the build machine that had shown up the problem in the past and sent it to the client for testing with a note saying that whilst this fixed the race condition problem in the client the knock on effects would need to be looked at with regards to the server; and, indeed, every single system built on the server framework, it was very much a breaking change.
The fix itself was pretty simple. Since a shutdown induced abort was different to an abort that was caused by someone who actually owned a reference to a socket it needed to be treated differently. During the shutdown abort we couldn't do anything that required us to manipulate the reference count. This meant that we couldn't post a close notification. Since lots of code is written on the basis that close notifications ALWAYS occur for any socket that has had an OnConnectionEstablished() notification (see the "lifecycle of a stream socket connection") this could cause resource leaks in user code. Because of that I realised that I could flag the fact that the notification hadn't been sent and send it if we ever attempted to close the socket in the future (as would happen when the final release happened on the socket). Due to how the locking around socket closure worked this was safe with no race conditions. The notification of closure would be delayed but you'd get it... At this point I spotted a situation where I could send the notification safely at a point prior to socket final release in some situations: If a read was pending on the socket when the socket was closed then the read returned with zero bytes read; much like when a client close occurs. With a few simple changes I could spot this situation and issue the correct close notification at this point. Only sockets without reads pending when the socket was aborted would suffer from a close notification that was delayed until just before the socket was destroyed.
This worked fine in the client code in question. The client was happy with the fix and it passed their tests and the pressure was off; at last. We were all back in the "spring of hope". I switched to working out the ramifications of the change for their server software and for other framework examples. I was in stage 5 of the Kübler-Ross model; acceptance, and, interestingly, from this point on I stayed there...
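A minimal model of this "owed notification" idea is shown below. It is only a sketch: the names are hypothetical, the locking that makes this safe in the real code is omitted, and a counter stands in for actually calling into user code.

```cpp
#include <cassert>

// Hypothetical model of a socket that may owe its user a close notification.
class Socket
{
public:
    void OnConnectionEstablished()
    {
        m_established = true;
        m_open = true;
    }

    // Shutdown abort: the caller owns no reference, so nothing here may
    // touch the reference count or dispatch a notification.
    void AbortDuringShutdown()
    {
        if (m_open)
        {
            m_open = false;
            m_closeNotificationOwed = m_established;
        }
    }

    // A read that was pending at the time of the abort completes with zero
    // bytes; the owner of that read holds a reference, so it is safe to
    // notify at this point.
    void OnZeroByteReadCompleted() { DeliverOwedCloseNotification(); }

    // Otherwise the notification is delayed until the final release.
    void OnFinalRelease() { DeliverOwedCloseNotification(); }

    int CloseNotifications() const { return m_closeNotifications; }

private:
    void DeliverOwedCloseNotification()
    {
        if (m_closeNotificationOwed)
        {
            m_closeNotificationOwed = false;
            ++m_closeNotifications; // stand-in for the user callback
        }
    }

    bool m_established = false;
    bool m_open = false;
    bool m_closeNotificationOwed = false;
    int m_closeNotifications = 0;
};
```

Either path delivers the notification exactly once; the only difference is how long user code has to wait for it.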
Unfortunately not sending a close notification directly from the socket abort meant that one particularly common style of server design was now broken. Often you find that you need to hold on to sockets within the server for some reason; usually this is because you need to allow one connection to talk to another or perhaps it's because the server generates data of its own to send on the connections. There's a whole lump of code in the framework that makes this easier to do but unfortunately the change meant that all of this was broken...
Servers that hold references to a socket tend to release those internal references when the socket is closed. Client closure and connection reset errors are handled with the appropriate notifications but a server generated abort is handled by the OnConnectionClosed() event. Since we now send our close notification for shutdown aborts when the socket is finally released we'll never get that event (or release the socket) for sockets which are being held on to by the server and which will be released when OnConnectionClosed() is called. What's more, given the way I did send the notification if a read was pending at the time that the abort occurred I knew that this was a new intermittent, hard to track down, bug in the making; every so often servers that used to work fine would now hang during shutdown as a reference to a socket would never be released.
I spent a while looking at alternatives but most smacked of being a return to denial; the code didn't own a socket reference so I couldn't send a notification... I moved on and added a new callback which was called when an 'abort all connections' event had occurred and which told you of the number of connections that had been aborted and for which a close notification might be delayed. With this in place I could fix up the client's server code to release the socket references that it held. The problem was fixed, for this client at least, but I'd yet to fully appreciate the full ramifications for other common server designs. I merged the client changes into a test branch of my main framework and picked a 'complex server' at random; I deliberately didn't pick one that I knew held on to connections in the same way that the client's server had. The OpenSSL server and its associated test client use most of the important parts of the framework and as I got these working I began to realise that the removal of this notification was a breaking change too far. Firstly there was nothing to cause a compiler error if you were about to fall foul of the problem; sure I could rename the notification so that you had to work out why I'd renamed it and perhaps you'd read the release notes and comments and perhaps you'd understand the problem and perhaps you'd fix your code but more than likely you'd do a search and replace for the old name with the new name and move on with a potentially broken system. Removing or delaying the notification was not acceptable; too much code relied on it.
I went back to the original problem and spent some time thinking. Then I stepped away from it for a while and waited to see if the thinking would result in anything...
As stated above, the root problem is that: During shutdown the connection manager potentially calls abort on a socket for which it doesn't own a reference. Since it doesn't own a reference it can't legally manipulate the reference count though the way we lock the socket allocator and the socket itself allows it to safely manipulate the socket whilst it performs the abort.
The solution that requires the drastic breaking change assumes the worst; we don't have a reference and we can't get one because the reference count is not locked during manipulation and therefore there's no safe way to increment it to "steal" a reference. What I really need is an InterlockedAddIfNotCompare() function that could compare the reference count to zero and add one if the compare failed. Unfortunately such a thing doesn't exist. However due to how both the abort and the socket closure code that needs to be called during a socket's destruction both lock the socket and test it for validity I realised that if I was actually inside the code that needed to steal a reference I could be sure that I could increment the reference count, test it to see if I'd just incremented it from zero and if so decrement it back to zero all without affecting the code that requires the reference to be zero during socket destruction... If I could steal a reference in this way then the whole problem of not owning a reference went away; since stealing the reference meant obtaining a reference to the socket only if the existing reference count was not zero, once I'd stolen the reference the potential race with other reference owners went away as the reference that I'd obtained was just as valid as anyone else's. What's more, in the situations where delaying the close notification would be a problem I was guaranteed to be able to steal a reference since this problem was due to a socket reference being held that needed the close notification to release it... In the situations where I couldn't steal a reference I knew that we were currently in a race with the socket's final destruction and, due to the locking and socket state checks that guaranteed we were ahead of the socket destruction code, that destruction code would issue the delayed notification shortly after we finished. 
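The reference stealing described above might look something like the sketch below. The names are hypothetical and std::atomic stands in for the Interlocked functions. TryStealReference() follows the increment-then-undo approach from the text and is only safe under the assumption stated there, that the caller holds the locks that the destruction path also takes. TryAddRefIfNotZero() is an alternative that is not from the original code: a compare-exchange loop that behaves like the wished-for "add if not zero" operation.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical reference counted object.
class RefCounted
{
public:
    explicit RefCounted(long initial) : m_ref(initial) {}

    void AddRef() { m_ref.fetch_add(1); }

    // Returns the new reference count.
    long Release() { return m_ref.fetch_sub(1) - 1; }

    long Count() const { return m_ref.load(); }

    // Increment, and if we find that we incremented from zero, put it back.
    // ASSUMPTION: the caller holds the lock(s) that the destruction code
    // also takes, so the transient 0 -> 1 -> 0 is never acted upon.
    bool TryStealReference()
    {
        if (m_ref.fetch_add(1) == 0)
        {
            m_ref.fetch_sub(1);
            return false;
        }
        return true;
    }

    // Alternative (not from the original code): only ever increments a
    // non-zero count, so the count never leaves zero once it gets there.
    bool TryAddRefIfNotZero()
    {
        long current = m_ref.load();
        while (current != 0)
        {
            // On failure 'current' is reloaded with the latest value.
            if (m_ref.compare_exchange_weak(current, current + 1))
            {
                return true;
            }
        }
        return false;
    }

private:
    std::atomic<long> m_ref;
};
```

A stolen reference is just as valid as any other and must be released in the usual way once the abort has finished with the socket.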
This left me without the massively breaking change and with a solution that I was happy was correct; though one which had generated copious code comments to warn me against changing anything that the code relied on.
By this stage I had some unit tests in place. These tested the notification generation during the shutdown abort and helped document the behaviour. Unfortunately the original issue was not something that I could get into a unit test as it relied on a race condition that was too hard to manipulate within a test; sometimes you have to accept that some things are just untestable with reasonable resources.
Unfortunately this left me with one outstanding issue... The way that the socket allocator handles walking the list of active sockets deals with the fact that the socket we're currently aborting may end up removed from the list before we step on to the next element in the list. This list doesn't use the STL and is a custom invasive list so that all we need to do to guard against the node that's being processed being deleted during processing is to make sure that we have already retrieved the 'next' node before we process the current node. Unfortunately this doesn't help if processing the current node causes the next node to become invalid. Certain styles of server can have one socket directly holding onto another socket so that when one socket is released it releases the socket it's holding on to (gateway servers, or proxies might be designed like this). A fairly contrived test, whereby socket A holds a reference to socket B and socket A is released first during a shutdown abort causing the release of socket B can clearly demonstrate the potential problem. The list traversal jumps off into hyper-space as the next node is invalidated before we step to it. Locking doesn't help here as everything is happening on the same thread and so we simply re-enter any locks recursively; using non re-entrant locks would simply deadlock us on ourselves.
Moving the reference stealing into the allocator's abort loop and only processing sockets that can have a reference stolen fixes this final problem. By stealing the reference we know that the socket that's being worked on cannot be deleted whilst we're working on it so we don't need to do anything clever to allow for sockets that release themselves and/or other sockets. However we now no longer abort connections that we are unable to steal a reference from. By definition a socket from which we're unable to steal a reference is one that is in the later stages of being closed and released; as such it's probably better to allow it to close naturally rather than aborting it at this point. The revised design appears to be safe from problems caused by the race condition on socket destruction and to perform better than the original design by allowing sockets that are closing anyway to continue to close naturally rather than being aborted at the last minute.
Getting from the original problem report to a satisfactory solution took some time. The journey was interesting but I went through a whole host of emotional states as I initially rushed to fix the symptoms without taking the time to fully understand the underlying problem. Once the pressure was off things went much more smoothly as I started to view each attempt at a fix as just that, an attempt. I didn't feel pressured by the client even though they had a deadline; I was pressuring myself because I knew that they were under pressure and I was embarrassed to have found a bug in my code...
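The following single-threaded model shows why holding stolen references makes the traversal safe even when socket A holds the only reference to socket B. It is a sketch under several assumptions: all names are hypothetical, plain integers replace the atomic reference count, a flag stands in for deletion, and the loop takes the conservative route of stealing every reference before aborting anything, which may not be how the real loop is structured.

```cpp
#include <cassert>
#include <vector>

// Hypothetical socket living in an invasive doubly linked list.
struct Socket
{
    long ref = 1;                // one reference, owned by user code or a peer
    Socket *next = nullptr;
    Socket *prev = nullptr;
    Socket *held = nullptr;      // peer kept alive by this socket (e.g. a proxy)
    bool userOwnsRef = true;     // false if the only owner is another socket
    bool aborted = false;
    bool destroyed = false;      // stand-in for deletion
};

class Allocator
{
public:
    void Add(Socket &s)
    {
        s.prev = nullptr;
        s.next = m_head;
        if (m_head) m_head->prev = &s;
        m_head = &s;
    }

    void Release(Socket &s)
    {
        if (--s.ref != 0) return;
        Unlink(s);
        s.destroyed = true;
        if (s.held) Release(*s.held); // releasing A may destroy B
    }

    void AbortAll()
    {
        // Pass 1: steal a reference from every socket that is still alive.
        std::vector<Socket *> stolen;
        for (Socket *p = m_head; p != nullptr; p = p->next)
        {
            if (p->ref != 0) { ++p->ref; stolen.push_back(p); }
        }

        // Pass 2: nothing in 'stolen' can be destroyed until we release it.
        for (Socket *p : stolen)
        {
            p->aborted = true;
            if (p->userOwnsRef) { p->userOwnsRef = false; Release(*p); } // user reacts to the close
            Release(*p);                                                 // give back the stolen reference
        }
    }

    bool Empty() const { return m_head == nullptr; }

private:
    void Unlink(Socket &s)
    {
        if (s.prev) s.prev->next = s.next; else m_head = s.next;
        if (s.next) s.next->prev = s.prev;
        s.next = s.prev = nullptr;
    }

    Socket *m_head = nullptr;
};
```

In the contrived A-holds-B case, destroying A drops B's count but B survives on its stolen reference until the loop reaches it, so the traversal never touches an invalid node.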
These fixes will be included in release 6.2 which currently doesn't have a release date but which should be available in the first quarter of 2010. If you think you need them sooner then get in touch.
December 11, 2009
Datagram filtering
The filtering API that's built into the stream (i.e. TCP) side of the licensed framework is pretty powerful. It's what the OpenSSL, SChannel and SSPI Negotiate filters use to transform the data as it flows into and out of your server without you needing to do anything special; it's what the flow control filter uses to provide efficient TCP window based flow control; it's what the compression filter uses to compress your data stream; it's what the read timeout filter uses to provide you with asynchronous read timeouts on overlapped I/O and what the connection re-establishment filter uses to automatically reconnect if your connection goes down. All in all it's pretty powerful stuff that makes it easy to build a data stream with just the features that you need by simply pushing filters onto your connection.
The downside is that it's only available on stream sockets. Well, that will soon change. Right now I'm working on adding a datagram version of the filtering API. This will work in a similar way to the stream filters but since the data flowing is, at present, unreliable (UDP being the only datagram protocol that we currently support) the filters will be limited to doing things that don't require state from previous datagrams; so compression using the ZLib compression filter will be fine, but encryption using SSL won't be. The first filters available for the datagram filter API will be compression and a Carmack unreliable delta 'reliable' data transfer protocol.
Datagram filtering will be available in release 6.2 which currently doesn't have a release date but which should be available in the first quarter of 2010.
November 26, 2009
Writing a custom Wireshark dissector plugin
I've been spending a little time recently writing a custom Wireshark dissector plugin for the UDT protocol. This didn't prove to be that difficult once I got over the initial problem of actually getting the Wireshark source to build with VS2008. My problem was that I'd found a CodeProject article which is now slightly out of date and which I followed too closely - setting up the Platform SDK version that the article stated was required. In fact the source code has moved on since then and now needs a later version. Eventually I switched to using the latest Platform SDK and things built fine. I expect they'd also build fine with a straight VS2008 system with no additional platform SDK installed but I haven't tried that yet.
Anyway, once you have the source building the CodeProject article by Ken Thompson on writing a custom dissector is quite useful; it's certainly a great kick start into developing a plugin. Then, of course, there's the source code to the standard plugins and the official developer Readme file.
Building a dissector for a relatively simple protocol such as UDT is pretty easy and didn't take very long and it's well worth doing if you're working with protocols that Wireshark doesn't understand as standard. It's much much easier to look at a packet trace with even a partially completed custom dissector rather than simply looking at raw UDP packets and doing the decode in your head.
Of course, having built my dissector I now realise that the reason I was having trouble doing a decode in my head was that the example UDT client and server speak a slightly different protocol to the one that's documented in the unofficial UDT protocol specification document.
Looks like "check that the documentation is up to date" is becoming a bit of a theme with my UDT work.
November 25, 2009
.Net 4.0 Hosting (again)
As I mentioned a while back the CLR Hosting API has changed with .Net 4.0. The new API allows you much more flexibility in loading CLR runtimes and also allows you to load and run multiple different runtimes in a single process. Whilst this is useful if you need to control which runtime runs your code and indispensable if you need to host code that really requires different versions of the runtime and can't all run in the "latest" runtime, I think the times when a user of my CLRHosting libraries will actually need this functionality will be few and far between. Anyway, I've added some initial support for hosting multiple instances of the CLR and I will be building a new example server over the next few days.
I've found the new hosting API to be a bit of a letdown really. Whilst the extra level of indirection that the 'meta host' gives you is essential for loading multiple different runtimes, the actual runtime hosting hasn't really improved much over what we had in .Net 2.0. There's still no way to cleanly shut down a runtime once you've loaded it (which is a shame and a pain but I can't see it ever changing).
An overview of features of the licensed Server Framework
The licensed version of my freely available I/O Completion Port based, high performance, Windows networking framework provides a whole host of features that the free code doesn't, quite apart from performance improvements, bug fixes and an active support and development process. These features make writing highly scalable TCP and UDP clients and servers very easy and solve many of the problems that you would come across if you were to start from scratch, either with the Winsock API directly or using something such as boost::asio. Whilst my client/server framework deliberately does not offer cross platform functionality, it targets the writing of high performance clients and servers on Windows and doesn't sacrifice performance for ease of portability. It uses I/O Completion Ports to provide highly scalable asynchronous I/O and has been designed from the ground up to be configurable and 'pluggable'; it's easy to replace functionality by simply providing your own implementation of a standard framework interface. The framework comes with full source code, a simple and straightforward licensing arrangement, free support and updates for life and numerous fully working demonstration clients and servers that help to explain how various aspects of the framework work and provide you with a base for building your own clients and servers. Anyway, here are some of the things that you get 'for free' when using the licensed code.
- Connection limiting - strangely enough, one of the most important things that you need when you're writing highly scalable servers is a way to limit the scalability when needed. Since the framework code itself won't impede your ability to deal with tens of thousands of concurrent connections it's sometimes necessary to place an explicit restriction on the number of connections you will actually allow. The main reason for this is that each of your connections will use resources, be it memory, non-paged pool, database connections, bandwidth, whatever. Before moving into production it is useful to profile how your server operates on the hardware that you'll be using and then limit the number of connections that it can process to a number that allows it to operate safely within the hardware limitations. Other clients find that the ability to limit connections is useful for licensing their resulting products on an 'X connections for $Y' basis. As with most aspects of the framework, connection limiting is 'pluggable'; you can either use a standard implementation of a connection limiter or implement your own by creating a class that implements the ILimitConnections interface.
- Asynchronous read timeouts - often you'll find that your client or server needs to act on inactivity. In an interactive server where you have a physical human being at the end of a connection you may need to close down connections when they haven't been used for a while. Unfortunately Winsock doesn't provide a way to set timeouts on asynchronous reads; SO_RCVTIMEO simply doesn't work on a non-blocking socket. The framework provides a very easy way to deal with this Winsock limitation in the form of the CReadTimeoutStreamSocketConnectionFilter class which comes with a complete sample server that demonstrates its usage.
- Flow Control - to be able to send as much data as possible over a connection using asynchronous I/O you need to take an active interest in managing flow control between the peers involved. Failure to do this can result in unconstrained resource usage either on the client or the server as you continue to try and send data whilst the TCP stack's transmission window is full. The CFlowControlStreamSocketConnectionFilter provides a pluggable way to actively manage flow control over a TCP connection and comes with fully functional demonstration clients and servers which implement a high performance data distribution server using TCP.
- Automatic connection re-establishment - some clients and servers need to re-establish failed connections, possibly waiting a predetermined period of time before they retry the connection. The CConnectionMaintainingStreamSocketConnectionFilter class provides this functionality automatically for you with configurable connection re-establishment timeouts and the ability to selectively enable the re-establishment at the individual connection level.
- Compression - it's often desirable to compress a TCP stream to reduce the amount of data that needs to be sent. The CCompressingStreamSocketConnectionFilter (which will be available with release 6.2 of the framework in early 2010) provides ZLib compression for your TCP connections automatically. You don't need to worry about compressing or decompressing the data and the filter can be added to existing servers and clients with minimal code changes.
- Stream Filtering - all of the filters mentioned above are implemented using the filtering API which is part of the framework. This API makes the whole 'stream filtering' process available for your own plugins and the source code for the numerous filters that are supplied as part of the framework makes it easy to implement your own. Filters can be "stacked", so that it's easy to compose your server from existing functionality; you may have an SSL protected compressed TCP stream with explicit flow control, read timeouts and connection re-establishment simply by pushing the appropriate filters onto your stream. All filtering occurs before your own code gets to see the data so that you don't need to alter your business logic if you suddenly need to add compression or security to a connection.
- Server collections - often you may need to build a server which listens on multiple ports to provide its functionality, or one that uses UDP and TCP or that provides both cleartext and encrypted connections on different ports. The CServerCollection and CNamedServerCollection classes provide an easy way to manage groups of server objects, allowing you to start, stop, and pause all or some of the servers easily. Of course, multiple server objects can share socket allocators, buffer allocators and the thread pool used for I/O operations to share the resources required for high performance.
- Efficient data broadcasting and scatter/gather I/O - the framework supports partial buffer reuse, buffer broadcasting and scatter/gather I/O with various buffer allocators. When broadcasting data or when building a response from standard parts and request specific parts the framework is optimised to provide you with the tools you need to be able to send data to multiple clients with the minimum of memory copying.
- Page aligned buffers - when striving for the ultimate performance it's often desirable to use page-aligned buffers for your socket I/O as this can help to restrict the number of memory pages that the operating system needs to lock in memory during data transfers. The standard framework buffer allocator, CBufferAllocator, has options that allow it to produce page-aligned buffers if you need it to.
- Connection collections - clients and servers often need to be able to deal with all, or a subset, of their connections from the context of any other connection. Chat servers may need to allow one client to send data to another, as may data distribution servers. The framework provides several 'connection collection' classes which make it easy to manage a collection of connections. The CStreamSocketConnectionCollection class deals with efficiently collecting connections into a grouping that you can search or iterate. The slightly more advanced CStreamSocketBroadcastableConnectionCollection class adds the ability to efficiently broadcast data to multiple connections without duplicating it. More about these classes can be found here.
- Monitoring - when striving to build a high performance system one of the most important things is the ability to measure the effect that your changes have. The framework contains many monitoring interfaces that allow you to simply plug a monitor into the heart of the code that runs your networking. These monitoring interfaces can then be hooked up to your preferred monitoring solution; I tend to go with 'perfmon' counters. This allows you to track and visualise the effects that your changes have during development and to monitor your product once it's in production.
- Lightweight timers - complex servers can often require several independent timers per connection; the lightweight timer implementation that ships with the framework makes it easy to manage tens of thousands of timers. Since timers and periodic scheduling are such a common issue there's an easy to follow "How to" available that explains how the supplied demo servers use timers.
- User data - it's all very well having a server that can deal with 70,000 concurrent connections but what happens when you need to store data with each connection? Looking up that data in a collection keyed on the connection identifier can ruin performance due to the locking required. Luckily the design of the framework allows you to associate your own data with each connection in such a way that you never need to lock to manipulate it from connection events on the connection; you can read more about it here.
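As an illustration of the 'pluggable' connection limiting described at the top of the list above, here is a minimal sketch of a limiter that caps concurrent connections. The interface IConnectionLimiterSketch and its method names are assumptions made for this example; the real ILimitConnections interface is not reproduced here and may look quite different.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical interface, for illustration only (not ILimitConnections).
class IConnectionLimiterSketch
{
public:
    virtual ~IConnectionLimiterSketch() = default;
    virtual bool TryAcquire() = 0;  // call before accepting a connection
    virtual void Release() = 0;     // call when a connection closes
};

// Allows at most maxConnections concurrent connections, without locks.
class CFixedConnectionLimiter : public IConnectionLimiterSketch
{
public:
    explicit CFixedConnectionLimiter(std::size_t maxConnections)
        : m_max(maxConnections), m_current(0)
    {
    }

    bool TryAcquire() override
    {
        std::size_t current = m_current.load();
        while (current < m_max)
        {
            // On failure 'current' is reloaded and the limit is rechecked.
            if (m_current.compare_exchange_weak(current, current + 1))
            {
                return true;
            }
        }
        return false;  // at the limit: refuse the connection
    }

    void Release() override
    {
        const std::size_t previous = m_current.fetch_sub(1);
        assert(previous > 0);  // Release() without a matching TryAcquire()
        (void)previous;
    }

    std::size_t Current() const { return m_current.load(); }

private:
    const std::size_t m_max;
    std::atomic<std::size_t> m_current;
};
```

The limit itself would come from profiling on the target hardware, or from a licence, as described above; a different policy (per-address limits, for example) would simply be another implementation of the same interface.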
Of course the framework contains much more. There are also several things that you can license separately to speed up your development: high performance, asynchronous SSL; SSPI 'Negotiate' support for single sign-on, Kerberos and NTLM; CLR hosting so that you can implement some of your server or client in managed code; the Performance counter library for easy server monitoring using perfmon; and the Services library for easy service development. See the licensing options page of the latest framework documentation set for full details of what's available.
Of course, you might want to 'try' before you buy. All of the example servers are available for download as source code so that you can take a look at how real servers and clients are built using the framework; and if you want to obtain compiled versions of these examples so that you can perform your own performance tests on them then please get in touch.
November 17, 2009
Stream compression
Compressing TCP (and reliable UDP) streams is one of the things that often comes up in discussion with clients and I've helped several people implement stream compression using the TCP stream filtering API that's part of the server framework. The filtering API is also used to provide SSL security over a stream and for things like asynchronous read timeouts, read sequencing, flow control and automatic reconnects. It's a flexible API and the fact that it's pluggable means that you can pick and choose from the various filters that form part of the framework and add whichever combination you fancy.
I'm just finishing work on a filter that applies compression to the stream. Whilst it's always been pretty straightforward to help clients write their own, it's always useful to fold good ideas into the framework so everyone can take advantage of them. The compressing filter takes a pluggable compressor and at present I have a compressor that uses ZLib for the compression.
As with the SSL filters you don't need to adjust your business logic code at all to take advantage of this functionality. Simply push the compressing filter onto your stream and you're done. Inbound data is uncompressed before you see it and outbound data is compressed once you write it to the socket.
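The "push a filter and your business logic stays the same" idea can be sketched as follows. Everything here is hypothetical: IStreamFilterSketch, CTaggingFilter and CFilterStack are names invented for the example, the tagging filter merely stands in for a real transform such as compression, and the ordering (the most recently pushed filter sits nearest the application) is an assumption rather than a statement about how the framework orders its filters.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical filter interface, for illustration only.
class IStreamFilterSketch
{
public:
    virtual ~IStreamFilterSketch() = default;
    virtual std::string OnWrite(const std::string &data) = 0;  // towards the wire
    virtual std::string OnRead(const std::string &data) = 0;   // towards the application
};

// Stand-in transform: prefixes a tag on write and strips it on read.
class CTaggingFilter : public IStreamFilterSketch
{
public:
    explicit CTaggingFilter(char tag) : m_tag(tag) {}

    std::string OnWrite(const std::string &data) override { return m_tag + data; }

    std::string OnRead(const std::string &data) override
    {
        assert(!data.empty() && data[0] == m_tag);
        return data.substr(1);
    }

private:
    const char m_tag;
};

class CFilterStack
{
public:
    void Push(std::unique_ptr<IStreamFilterSketch> filter)
    {
        m_filters.push_back(std::move(filter));
    }

    // Outbound: last pushed filter runs first (assumed ordering).
    std::string Write(std::string data) const
    {
        for (auto it = m_filters.rbegin(); it != m_filters.rend(); ++it)
        {
            data = (*it)->OnWrite(data);
        }
        return data;
    }

    // Inbound: filters run in the opposite order, undoing each layer.
    std::string Read(std::string data) const
    {
        for (const auto &filter : m_filters)
        {
            data = filter->OnRead(data);
        }
        return data;
    }

private:
    std::vector<std::unique_ptr<IStreamFilterSketch>> m_filters;
};
```

The application only ever calls Write() and Read() with its own data; adding or removing a layer changes what goes over the wire but not the calling code.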
This new filter will be available in release 6.2 of the server framework which currently doesn't have a release date but which shouldn't be expected before early 2010.
November 16, 2009
Latest release of licensed socket server code: 6.1.1
The latest release of the licensed version of the socket server framework is now available. This release includes the following changes.
Note that this is mainly a bug fix release.
The following changes were made to the libraries.
Admin Library - 6.1.1
- New build configuration options. All of these are enabled by defining the option to 1 in Config.h and disabled by defining them to 0; the default state if you do not do anything in Config.h is shown for each option:
  - JETBYTE_USE_CAPTURE_STACK_BACK_TRACE - enabled by default. Define to 0 to prevent the use of CaptureStackBackTrace() when building for platforms later than Windows XP. Normally we assume that CaptureStackBackTrace() is available on Windows XP SP1 and later, however if you're building using the default Platform SDK that came with Visual Studio 2005 this does not include CaptureStackBackTrace() and so our platform version check wrongly includes code that cannot build.
  - JETBYTE_ADDITIONAL_BUFFER_TRACKING_CONTEXT and JETBYTE_ADDITIONAL_SOCKET_TRACKING_CONTEXT - by default the buffer and socket reference tracking code keeps one level of call stack context outside of the actual call within the library code that caused the reference to change. This is often enough context to track down the problem and keeps the amount of memory required for storing call stacks down. Sometimes, however, you need a little more context; in those situations you should define JETBYTE_ADDITIONAL_BUFFER_TRACKING_CONTEXT and/or JETBYTE_ADDITIONAL_SOCKET_TRACKING_CONTEXT to a value that represents the additional number of stack frames that you want saved and displayed. Note that each additional level of call stack that is stored requires a DWORD64 of space per call so these additional stack frames can soon add up.
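By way of example, a Config.h fragment using the options described above might look something like this. The option names are the ones listed above; the frame counts 2 and 1 are arbitrary values picked for illustration.

```cpp
// Config.h (fragment)

// Building with the Platform SDK that shipped with Visual Studio 2005,
// which lacks CaptureStackBackTrace(), so switch its use off.
#define JETBYTE_USE_CAPTURE_STACK_BACK_TRACE 0

// Keep extra call stack frames for reference tracking. Each extra frame
// costs a DWORD64 per tracked call, so keep these values small.
#define JETBYTE_ADDITIONAL_BUFFER_TRACKING_CONTEXT 2  // illustrative value
#define JETBYTE_ADDITIONAL_SOCKET_TRACKING_CONTEXT 1  // illustrative value
```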
Win32 Tools Library - 6.1.1
- Added JetByteTools::Win32::MurMurHash2 by Austin Appleby. See http://murmurhash.googlepages.com/. This is used by JetByteTools::Win32::CCallStack to provide a hash of the stack.
- Updated JetByteTools::Win32::StackWalker to use the v10 (2009-11-01) version of Jochen Kalmbach's code. See http://stackwalker.codeplex.com for details.
- Fixed a 'lack of locking' bug in JetByteTools::Win32::CCallTracker which caused problems during over-release reference dumps.
- Added JetByteTools::Win32::ITrackReferences::TrackAllocationsFrom, JetByteTools::Win32::ITrackReferences::ExcludeAllocationsFrom and JetByteTools::Win32::ITrackReferences::TrackFromHere which allow you to restrict which socket or buffer allocators are tracked by the reference tracking code when it's enabled. This can be very useful in reducing the runtime performance cost of enabling reference tracking.
I/O Tools Library - 6.1.1
- Fixed a bug in JetByteTools::IO::CRotatingAsyncFileLog which could cause the date stamp for a log file to end up with a date of 16010101. See here http://www.lenholgate.com/archives/000880.html for more details.
- Fixed a bug in JetByteTools::IO::CRotatingAsyncFileLog which caused the timestamp in the log to include 4 characters for the milliseconds rather than 3 resulting in 0999 rather than 999.
- Changed how JetByteTools::IO::CBufferReferenceTracker works. You can now get it to include additional call stack context by defining JETBYTE_ADDITIONAL_BUFFER_TRACKING_CONTEXT (in Config.h) to the number of additional frames of context that you require.
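The millisecond width fix noted above is easy to picture with a small formatting helper. This is only a sketch: FormatLogTimestamp is a hypothetical function and the HH:MM:SS.mmm layout is assumed for the example; the log's actual formatting code and layout are not shown in these notes.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper, for illustration only.
std::string FormatLogTimestamp(
    unsigned hour, unsigned minute, unsigned second, unsigned milliseconds)
{
    assert(hour < 24 && minute < 60 && second < 60 && milliseconds < 1000);

    char buffer[16];
    // %03u pads the milliseconds to exactly three digits; a field width
    // of 4 here is the kind of slip that would produce "0999".
    std::snprintf(buffer, sizeof(buffer), "%02u:%02u:%02u.%03u",
                  hour, minute, second, milliseconds);
    return buffer;
}
```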
Socket Tools Library - 6.1.1
- Changed how JetByteTools::Socket::CSocketReferenceTracker works. You can now get it to include additional call stack context by defining JETBYTE_ADDITIONAL_SOCKET_TRACKING_CONTEXT (in Config.h) to the number of additional frames of context that you require.
OpenSSL Tools Library - 6.1.1
- No changes.
SChannel Tools Library - 6.1.1
- Fixed a bug in JetByteTools::SSPI::SChannel::CContext which would cause a small memory leak in derived classes (missing virtual destructor).
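For anyone unfamiliar with this class of bug, the sketch below shows the fixed shape using hypothetical classes (CContextSketch and CDerivedContextSketch are not the framework's types). With a virtual destructor on the base, deleting through a base pointer runs the derived destructor and releases whatever the derived class owns. Without it the behaviour is undefined, and in practice the derived clean-up is typically skipped, which is where the leak comes from.

```cpp
#include <cassert>
#include <memory>

// Hypothetical base class, for illustration only.
class CContextSketch
{
public:
    virtual ~CContextSketch() = default;  // the fix: destructor is virtual
};

// Derived class that "owns" something, modelled here as a live counter.
class CDerivedContextSketch : public CContextSketch
{
public:
    explicit CDerivedContextSketch(int &liveCount) : m_liveCount(liveCount)
    {
        ++m_liveCount;
    }

    ~CDerivedContextSketch() override { --m_liveCount; }

private:
    int &m_liveCount;
};

// Destroys a derived object through a base pointer and reports how many
// derived objects are still alive afterwards.
int LiveAfterBaseDelete()
{
    int live = 0;
    std::unique_ptr<CContextSketch> context(new CDerivedContextSketch(live));
    assert(live == 1);
    context.reset();  // deletes via CContextSketch *
    return live;
}
```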
SSPINegotiate Tools Library - 6.1.1
- No changes.
SSPI Tools Library - 6.1.1
- No changes.
PerfMon Tools Library - 6.1.1
- No changes.
Service Tools Library - 6.1.1
- Fixed a bug with JetByteTools::Service::CServiceStatus::RegisterForDeviceNotification() and JetByteTools::Service::CServiceStatus::RegisterForPowerSettingNotification() so that they work when called in service debug mode. Note that since the service is running as a normal exe in service debug mode we can't actually register for device and power notifications as we don't have a valid handle to our service in the SCM. Instead we simply skip these calls and output some debug.
CLR Hosting Tools Library - 6.1.1
- No changes.
Full details of the licensed version of the code are available here.
Full details of the free version of the code are available here.
No new documentation for this release.
Doxygen documentation for the 6.1 release is available here.
Doxygen documentation for the 6.0 release is available here.
Doxygen documentation for the 5.2 release is available here.
If you're an existing client and you'd like these changes let me know.
November 05, 2009
Talk to the bear
In the great tradition of explaining your problem to someone else as a way of fixing it yourself without their help...
I've nailed the rotating file log 16010101 bug! The problem was to do with the fact that when there was a new log ready for use and a log message switched it to be the active log it also set the 'next log time' to zero. If a time change notification came along just after this, whilst the 'next log time' was zero, then it created a new log due to a bug in the time change handler; a condition was missing a 'next log time != 0' check...
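The shape of the fix can be sketched like this. CRotatingLogSketch is a hypothetical, heavily simplified model rather than the real CRotatingAsyncFileLog; in particular the 'now >= next log time' comparison and the method names are assumptions. The only part taken from the description above is the guard: a 'next log time' of zero means that nothing is scheduled and must not be treated as a rotation that is overdue.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the rotation logic.
class CRotatingLogSketch
{
public:
    explicit CRotatingLogSketch(std::uint64_t nextLogTime)
        : m_nextLogTime(nextLogTime), m_logsCreated(0)
    {
    }

    // A log message has just switched the pre-created log to active;
    // the next rotation has not been scheduled yet.
    void OnNewLogActivated() { m_nextLogTime = 0; }

    void OnNextLogScheduled(std::uint64_t when)
    {
        assert(when != 0);  // zero is reserved for "nothing scheduled"
        m_nextLogTime = when;
    }

    // Called when the system clock is adjusted.
    void OnTimeChange(std::uint64_t now)
    {
        // Without the '!= 0' check, 'now >= 0' is always true and a log
        // stamped with time zero gets created.
        if (m_nextLogTime != 0 && now >= m_nextLogTime)
        {
            CreateNewLog();
        }
    }

    unsigned LogsCreated() const { return m_logsCreated; }

private:
    void CreateNewLog()
    {
        ++m_logsCreated;
        m_nextLogTime = 0;
    }

    std::uint64_t m_nextLogTime;
    unsigned m_logsCreated;
};
```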
Interestingly, for me at least, this was the fix that I initially thought was required when I started investigating the problem yesterday. I didn't put the fix in as I couldn't reproduce the actual sequence of events that would lead to it actually being a fix and not just a random act of hopeful hackery. Anyway, after writing the last blog posting I worked out what was going wrong, knocked up a test that proved it and uncommented my 'fix' and it did indeed fix the problem.
Yay me! ;)
Bugs
It's been a bit of a week for bugs. First I had a bug report for the really rather old CodeProject socket server COM object. This allows people using VB to create a socket server that uses the free version of my server framework and IO Completion Ports. It works well, for what it is, and has been the basis for several custom objects that various clients have needed over the years. The bug involves the 'read string' functionality and either the ATL headers (specifically the narrow to wide character conversion macros) have changed since the code was originally written in 2002 in VC 6.0 or it never worked. Anyway, I've posted a fix here, and there's a compiled version of the COM component available here.
Next up was a strange (and currently unsolved) problem with my rotating async file log. This is a log that uses async file writes to decouple the writing of log messages to disk from the thread that's logging the messages. One of its features is that it can be set up to change the log file every X period (hour, day, week, etc). It can also be told about time changes on the box, so that it doesn't get confused when daylight savings time changes, or when someone adjusts a clock. Anyway, whilst looking for an issue in a client's log file I noticed that the log file name ended in -16010101.log... This was a little unusual as the log file should have been for 20091103.log (or thereabouts). It looks like the file name is being created with an invalid system time, or one that's set to zero. I spent a few hours writing some tests to exercise the areas of code that I expected might cause this problem but couldn't get a reliable reproduction of it. If anyone else sees this happen please let me know. (The 'new file' creation code is somewhat over-engineered as the log uses a timer to create a new file 30 seconds or so before it needs it and then swaps the file in use for the new file when the first log message is written after the time when the new file should become active).
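The 16010101 in that file name is a strong hint about where the bad value came from: a Windows FILETIME counts 100-nanosecond intervals from January 1, 1601, so a FILETIME of zero formats as exactly that date. The sketch below shows the arithmetic portably; DateStampFromFileTimeTicks is a hypothetical helper (the log itself would presumably go through the Windows time conversion APIs) and the day-to-date conversion uses Howard Hinnant's well known civil-from-days algorithm.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Formats a FILETIME-style tick count (100ns units since 1601-01-01)
// as YYYYMMDD. Hypothetical helper, for illustration only.
std::string DateStampFromFileTimeTicks(std::uint64_t ticks)
{
    const std::uint64_t ticksPerDay = 864000000000ULL;  // 24 * 60 * 60 * 10^7

    // Days since 1601-01-01, rebased to days since 1970-01-01
    // (the two epochs are 134774 days apart).
    std::int64_t z = static_cast<std::int64_t>(ticks / ticksPerDay) - 134774;

    // Civil-from-days conversion (proleptic Gregorian calendar).
    z += 719468;
    const std::int64_t era = (z >= 0 ? z : z - 146096) / 146097;
    const std::int64_t doe = z - era * 146097;
    const std::int64_t yoe = (doe - doe / 1460 + doe / 36524 - doe / 146096) / 365;
    const std::int64_t doy = doe - (365 * yoe + yoe / 4 - yoe / 100);
    const std::int64_t mp = (5 * doy + 2) / 153;

    const int day = static_cast<int>(doy - (153 * mp + 2) / 5 + 1);
    const int month = static_cast<int>(mp < 10 ? mp + 3 : mp - 9);
    const int year = static_cast<int>(yoe + era * 400 + (month <= 2 ? 1 : 0));

    char buffer[32];
    std::snprintf(buffer, sizeof(buffer), "%04d%02d%02d", year, month, day);
    return buffer;
}
```

Passing zero gives the date stamp seen in the stray file name, whereas a tick count for early November 2009 gives the stamp that was expected.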
Finally I'm looking into various implementations of 'reliable' UDP for inclusion in the server framework. Lots of people ask me about it (mostly games people) and although I've built an async version of the ENet protocol and integrated it with the framework, it was one of the very few pieces of development where the client didn't want to trade cost for IP rights and so I can't reuse it for other clients; the couple of clients who were willing to pay for an implementation from scratch have gone quiet on me. So I'm looking at various reliable UDP alternatives such as UDT, RUDP, and on a related note DCCP, and possibly even something as simple as the "Carmack unreliable delta transfer" system. I decided that it might be a good idea to first extend the standard UDP server example and its test client so that I'd have a baseline of unreliable data flow against which to measure any improvements, and I discovered that the EchoServerUDPTest program has been broken for several releases without me knowing. Basically it was never ported to the 'remove inappropriate use of pointers' release (which, I think, was pre 5.2). Anyway, it just goes to show what happens when code isn't automatically tested as part of the build and release process. All along the UDP side of things hasn't been automatically tested as well as the TCP side of things simply because of the U in UDP; it's harder to run even a simple black box echo test during a build if some of the packets might not get echoed...
So, I guess this week's theme is 'must try harder'.
October 29, 2009
Concurrency profiling with VS2010
I'm currently looking at the new concurrency profiling tools that come with Visual Studio 2010. It's interesting and useful stuff. The tools provide what I have attempted to develop in the past by using API Hooking but it's vastly superior in speed and functionality.
One of the big problems for me with my API Hooking concurrency tools (Deadlock detector and Lock Explorer) was that the instrumentation often caused the target process to run pretty slowly. There seems to be none of that with the concurrency tools in VS2010. The target process runs very nicely whilst it's being profiled and the tools then take a while to process the data after the profiling is complete.
The contention report is useful and the multi-threaded view of the process execution is also very useful. It's easy to spot major sources of contention and thread synchronisation (luckily so far the easy to spot culprit is my debug message dump; once that's removed there's hardly any contention when running my simple server under test, which is nice). Once again the MS STL is responsible for the hottest locks. I've fixed up my number to string conversion issues (see here) but there are still, IMHO, unnecessary and unexpected locks being held during some operations (iterators seem to all use a common mutex, for example). Obviously this may be something that can either be worked around or explained away once it's understood but I expect it will be interesting to compare the lock contention reports for a server built with MS STL and one built with STLPort.
Now, of course, I need to build a version of STLPort for VS2010...
October 23, 2009
.Net 4.0 Hosting
I've been playing with Visual Studio 2010 Beta 2 and .Net 4.0, building code, running tests, playing with the IDE, etc. The first issue that I've come across with my existing codebase is that the .Net 2.0 hosting APIs (such as CorBindToRuntimeEx) are now deprecated and there's a whole new way of hosting the CLR.
We've been quite successful in hosting the CLR from within our C++ servers, either to provide servers that support a mix of managed/unmanaged plugins as a pluggable high performance windows application server or to provide network protocol support in C++ (such as ENet) with 'business logic' being written in managed code. The .Net 2.0 hosting API works OK but is not without some annoyances. Over the next couple of weeks I hope to take a look at the new hosting interface and report here on my findings. With any luck the server framework will support the new hosting interfaces in release 6.2 which currently has no scheduled release date but which I expect will appear early in 2010.
