Windows TCP/IP Server Performance

For the last week or so I’ve been working on measuring and improving the performance of The Server Framework. The latest release of the free version of my asynchronous, Windows, IOCP-based socket server framework can now be obtained from here at ServerFramework.com.

This week I’ve had several potential new clients asking about performance numbers for various aspects of the system and so I thought it best to do some measuring. To be able to measure I first had to build a tool to help me.

My first potential client was interested in how many concurrent connections our framework could support. I’d never really tested this and, whilst I knew that there were system-wide limits that you hit when creating lots of socket connections, I’d never looked into exactly what the limits were and when they caused problems. So the first testing that I did was to see how many connections I could establish. After a few false starts, I detailed my findings here. In summary, the framework doesn’t impose any limits other than those which the operating system and available memory impose. On a Windows Server 2003 machine with 760MB of RAM I could achieve more than 70,000 concurrent connections from multiple clients.

Of course, just opening connections and doing nothing isn’t likely to be a real world scenario. I added the ability to have the test client send data on each connection and have the server echo it back. This tests connections into a busy server but, at least initially, it was equally unrealistic as the test sent data as fast as it could on all connections. It thrashed the server under test but it didn’t really tell me a great deal.

My next potential client wanted to know about the performance of our UDP code when sending a steady stream of datagrams to multiple clients. They’re writing a media streaming server and they were interested in efficiently sending datagrams every 20ms or so. To get my head around their problem I set up a simple test server that used our timer queue class to associate a timer with each socket and then send data when the timer expired. Whilst doing this I discovered some performance “issues” with the choice of algorithm used by the timer queue, but they were easy to address thanks to the tests I had in place.
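
To give a feel for the shape of that test server, here’s a minimal sketch of per-socket send pacing. Note that it uses the standard Win32 timer queue API rather than our own timer queue class (which isn’t shown in this post), and that the client address, port, payload size and 20ms period are purely illustrative:

    #include <winsock2.h>
    #include <windows.h>

    #pragma comment(lib, "ws2_32.lib")

    // One stream per client; each has its own socket and timer.
    struct Stream
    {
        SOCKET s;
        sockaddr_in addr;
        char payload[512];
    };

    // Timer callback: sends the next datagram for this stream. A real
    // media server would advance through the stream's data here.
    static VOID CALLBACK OnTimer(PVOID param, BOOLEAN /*timerFired*/)
    {
        Stream *stream = static_cast<Stream *>(param);

        sendto(stream->s, stream->payload, sizeof(stream->payload), 0,
               reinterpret_cast<sockaddr *>(&stream->addr),
               sizeof(stream->addr));
    }

    int main()
    {
        WSADATA wsaData;
        WSAStartup(MAKEWORD(2, 2), &wsaData);

        Stream stream = {};
        stream.s = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
        stream.addr.sin_family = AF_INET;
        stream.addr.sin_port = htons(5050);                      // illustrative port
        stream.addr.sin_addr.s_addr = inet_addr("192.168.0.3");  // illustrative client

        // Fire OnTimer for this socket every 20ms; a server would create
        // one timer per client stream.
        HANDLE timerQueue = CreateTimerQueue();
        HANDLE timer = 0;
        CreateTimerQueueTimer(&timer, timerQueue, OnTimer, &stream, 20, 20,
                              WT_EXECUTEDEFAULT);

        Sleep(10000);    // let the stream run for 10 seconds

        // Waits for any in-progress callbacks to complete before returning.
        DeleteTimerQueueEx(timerQueue, INVALID_HANDLE_VALUE);
        closesocket(stream.s);
        WSACleanup();
        return 0;
    }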

Although not directly related to the testing that I was doing for our TCP servers, once I’d written the code to pace the data sending for the UDP server example I decided to use a similar approach with the TCP test tool. Adding the ability to have each connection send data at a specific rate (i.e. a message every Xms) meant that I could test the servers under much more realistic (and tuneable) loads.

Finally, someone over on CodeProject was interested in comparing the performance of an AcceptEx-based server with our standard Accept-based server and with a server that they’d written themselves. I added a few more options to the test tool to make these kinds of tests easier.
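
For anyone unfamiliar with the difference being compared: AcceptEx() lets you create the accepting socket up front and post the accept as an overlapped operation that completes via the IO completion port, whereas a classic accept() blocks a thread for each call. Here’s a minimal sketch of posting one overlapped accept; this is illustrative rather than the framework’s actual code, and error handling is pared right down:

    #include <winsock2.h>
    #include <mswsock.h>

    #pragma comment(lib, "ws2_32.lib")

    // Posts a single overlapped accept on a listening socket that has
    // already been associated with an IO completion port. Unlike accept(),
    // AcceptEx() returns immediately and the completion is delivered via
    // the IOCP, so no thread is blocked per pending accept.
    bool PostOverlappedAccept(SOCKET listenSocket,
                              SOCKET &acceptSocket,
                              char *addressBuffer,   // >= 2 * (sizeof(sockaddr_in) + 16)
                              OVERLAPPED &overlapped)
    {
        // AcceptEx() must be loaded at runtime via WSAIoctl().
        GUID guidAcceptEx = WSAID_ACCEPTEX;
        LPFN_ACCEPTEX pfnAcceptEx = 0;
        DWORD bytes = 0;

        if (0 != WSAIoctl(listenSocket, SIO_GET_EXTENSION_FUNCTION_POINTER,
                          &guidAcceptEx, sizeof(guidAcceptEx),
                          &pfnAcceptEx, sizeof(pfnAcceptEx), &bytes, 0, 0))
        {
            return false;
        }

        // The socket for the inbound connection is created up front...
        acceptSocket = WSASocket(AF_INET, SOCK_STREAM, IPPROTO_TCP, 0, 0,
                                 WSA_FLAG_OVERLAPPED);

        // ...and the accept itself is posted as an overlapped operation.
        if (!pfnAcceptEx(listenSocket, acceptSocket, addressBuffer, 0,
                         sizeof(sockaddr_in) + 16, sizeof(sockaddr_in) + 16,
                         &bytes, &overlapped))
        {
            return (WSAGetLastError() == ERROR_IO_PENDING);
        }

        return true;
    }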

The result is a fairly useful tool, EchoServerTest, which is available for download here. The tool is built using the same “server” framework that the servers use. At the point we added connection establishment support to the framework, I suppose it stopped being purely a server framework… Anyway, EchoServerTest uses IO Completion Ports and overlapped IO. It creates 2 IO threads per processor and another thread to manage the timers that drive the data flow. The main thread sits and waits until everything else completes. You can shut the test down early by using the ServerShutdown tool.
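
The IO thread arrangement is the standard IOCP pattern: a pool of threads all blocking on GetQueuedCompletionStatus() against a single completion port. As a rough sketch of what “2 IO threads per processor” looks like (again, illustrative rather than the framework’s actual code):

    #include <windows.h>

    #include <vector>

    // Each IO thread blocks on the completion port and dispatches
    // whatever completions arrive.
    static DWORD WINAPI IOThread(LPVOID param)
    {
        HANDLE iocp = static_cast<HANDLE>(param);

        for (;;)
        {
            DWORD bytes = 0;
            ULONG_PTR key = 0;
            LPOVERLAPPED overlapped = 0;

            GetQueuedCompletionStatus(iocp, &bytes, &key, &overlapped, INFINITE);

            if (!overlapped)
            {
                break;   // a null completion is our shutdown signal
            }

            // ... dispatch the completed read/write/connect for 'key' here ...
        }

        return 0;
    }

    int main()
    {
        HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, 0, 0, 0);

        SYSTEM_INFO info;
        GetSystemInfo(&info);

        // 2 IO threads per processor, as the test tool does.
        std::vector<HANDLE> threads;

        for (DWORD i = 0; i < info.dwNumberOfProcessors * 2; ++i)
        {
            threads.push_back(CreateThread(0, 0, IOThread, iocp, 0, 0));
        }

        // ... associate sockets with 'iocp' and issue overlapped IO ...

        // Wake each thread with a null completion so they can shut down.
        for (size_t i = 0; i < threads.size(); ++i)
        {
            PostQueuedCompletionStatus(iocp, 0, 0, 0);
        }

        WaitForMultipleObjects(static_cast<DWORD>(threads.size()),
                               &threads[0], TRUE, INFINITE);
        CloseHandle(iocp);
        return 0;
    }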

EchoServerTest has several command line parameters and switches which can be combined to create various kinds of tests.

Only two command line parameters are required; the rest are optional. The required parameters are these:

  • -server - The server address to connect to (dotted IP).
  • -port - The port to connect to.

So you might execute a test like this:

EchoServerTest -server 192.168.0.2 -port 5050

This will use defaults for everything else and attempt to create 1000 connections to the specified server. All of the other parameters are optional and allow you to tune your testing.

  • -connections - The number of connections to create. This defaults to 1000.
  • -connectionBatchSize - Batch connections in groups of X size. This defaults to 0 (no batching). If you specify a value then progress will be displayed as each batch of connections has been initiated.
  • -connectionBatchDelay - Delay for Y milliseconds between each batch. This defaults to 0 (no delay). If you specify a value then there will be a pause between each batch of connections. This lets you simulate a given connection rate; for example, a batch size of 500 with a 1000ms delay gives roughly 500 connection attempts per second.
  • -messages - The number of messages to send on each connection. This defaults to 0 (don’t send any data). If you specify a value for this then data will be sent on each connection once the connection is established.
  • -messageSize - The size of each message. This defaults to 1024 bytes.
  • -messageRate - Delay for Y milliseconds between each message. This defaults to 1000. By changing this value you can vary the rate of data flowing to the server.
  • -sendAfterRecv - Wait for the echo of the previous message before starting the timer for the next send. If you don’t set this then the timer for the next send starts as soon as the current message has been sent. If you do set it then the timer isn’t started until the current message has been echoed back by the server. Setting this option throttles the message flow to a pace that the server can sustain. If you don’t set this flag then you will see why network protocols without explicit flow control built in are a bad idea; there’s a sketch of the two modes after this list.
  • -syncConnect - Use synchronous connect rather than ConnectEx(). Without this switch all connections are established asynchronously using ConnectEx(). This means that you can send connection requests as fast as the machine you’re running the test on and its network connection allows (good for testing how a server deals with lots of near simultaneous connection requests). If you set this option then the test uses synchronous connects (good for testing how many concurrent connections a server can support).
  • -hold - Hold all connections open until the test completes. If this option is not set then the connections are allowed to close when the data flow is complete (or straight away if -messages is not set).
  • -pause - Wait for a key press once data flow has completed before closing connections. Use this with -hold if you want to run a “max concurrent connections” test from multiple clients.
  • -preallocate - Preallocate sockets and buffers before beginning the test. This speeds up the operation of the test very slightly so that it can, potentially, make more connection attempts per second.
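
To make the -sendAfterRecv behaviour concrete, here’s a deliberately simplified, blocking-socket version of the two pacing modes. SendMessages() is a hypothetical helper and the real tool does all of this asynchronously with timers, but the ordering of the operations is the same:

    #include <winsock2.h>

    #include <cstring>

    #pragma comment(lib, "ws2_32.lib")

    // Simplified version of the tool's two pacing modes, using a
    // blocking socket for clarity.
    void SendMessages(SOCKET s, int messages, int messageSize,
                      int messageRate, bool sendAfterRecv)
    {
        char *buffer = new char[messageSize];
        memset(buffer, 'X', messageSize);

        for (int i = 0; i < messages; ++i)
        {
            send(s, buffer, messageSize, 0);

            if (sendAfterRecv)
            {
                // Wait for the complete echo before starting the delay
                // for the next send; the server's speed now paces us.
                int received = 0;

                while (received < messageSize)
                {
                    const int r = recv(s, buffer + received,
                                       messageSize - received, 0);
                    if (r <= 0)
                    {
                        delete [] buffer;
                        return;
                    }
                    received += r;
                }
            }

            // Without -sendAfterRecv the delay starts as soon as the send
            // completes, whether or not the server has kept up; unread
            // data simply stacks up in the TCP/IP stack's buffers.
            Sleep(messageRate);
        }

        delete [] buffer;
    }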

Typical usage scenarios might be:

  • Run from multiple clients against a server to test the maximum number of concurrent connections that the server can support.
EchoServerTest -server 192.168.0.2 -port 5050 -syncConnect -hold -pause -connections 40000
  • As above but on a busy server.
EchoServerTest -server 192.168.0.2 -port 5050 -syncConnect -hold -pause -connections 40000 -messages 100 -messageSize 512 -messageRate 2000 -sendAfterRecv
  • As above but on a busy server with a poorly designed protocol (watch the non-paged memory usage on both client and server machines as the TCP/IP stack is used to buffer all the data).
EchoServerTest -server 192.168.0.2 -port 5050 -syncConnect -hold -pause -connections 40000 -messages 100 -messageSize 1000 -messageRate 100
  • Run from multiple clients against a server to test how well it handles accepting lots of near simultaneous connection attempts.
EchoServerTest -server 192.168.0.2 -port 5050 -connections 40000 -connectionBatchSize 5000 -connectionBatchDelay 500
  • etc…

Timings and details of any errors, etc., are reported when the test completes. Please bear in mind that, given appropriate hardware, you can easily use this test to flood your network, and that once you do that you’re testing all manner of other things rather than just your server’s performance. For example, you may see the test harness taking a long while to complete as it waits for the TCP/IP stacks to deal with retransmissions to get packets through. You may also see your servers showing connections that haven’t closed yet even though you think they should have (again due to retransmission of lost packets).

Note that the test doesn’t check the data in the packets that it receives. I have other tests for that kind of thing so I don’t need to bother in this test program, though I expect I may add validation later on.

Things that are still on my to-do list:

  • Add SSL support
  • Validate the echoed data
  • Add a pluggable architecture to allow for custom message flow, similar to the way our C# test harness works.

The tool is available for download here. Please report bugs or problems via the comments on this article.