Windows TCP/IP Server Performance


For the last week or so I've been working on measuring and improving the performance of The Server Framework. The latest release of the free version of my asynchronous, Windows, IOCP-based socket server framework can now be obtained from here at ServerFramework.com.

This week I've had several potential new clients asking about performance numbers for various aspects of the system and so I thought it best to do some measuring. To be able to measure I first had to build a tool to help me.

My first potential client was interested in how many concurrent connections our framework could support. I'd never really tested this and, whilst I knew that there were system-wide limits that you hit when creating lots of socket connections, I'd never looked into exactly what the limits were and when they caused problems. So the first testing that I did was to see how many connections I could establish. After a few false starts, I detailed my findings here. In summary, the framework doesn't impose any limits other than those which the operating system and available memory impose. On a Windows Server 2003 machine with 760MB of RAM I could achieve more than 70,000 concurrent connections from multiple clients.

Of course, just opening connections and doing nothing isn't likely to be a real world scenario. I added the ability to have the test client make each connection send data to the server and for the server to echo it back. This tests connections into a busy server but, at least initially, this is equally unrealistic as the test sent data as fast as it could on all connections. It thrashed the server that was under test but it didn't really tell me a great deal.

My next potential client wanted to know about the performance of our UDP code when sending a steady stream of datagrams to multiple clients. They're writing a media streaming server and they were interested in efficiently sending datagrams every 20ms or so. To get my head around their problem I set up a simple test server that used our timer queue class to associate a timer with each socket and then send data when the timer expired. Whilst doing this I discovered some performance "issues" with the choice of algorithm used by the timer queue; but they were easy to address thanks to the tests I had in place.

Although not directly related to the testing that I was doing for our TCP servers, once I'd written the code to pace the data sending for the UDP server example I decided to use a similar approach with the TCP test tool. Adding the ability to have each connection send data at a specific rate (i.e. a message every Xms) meant that I could test the servers under much more realistic (and tuneable) loads.
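
To make the pacing idea concrete, here is a minimal sketch of one timer per connection driving sends at a fixed interval. It is not the framework's timer queue class; the names (PacedSend, RunPacedSends) are hypothetical, the clock is simulated rather than real, and the min-heap is just one reasonable choice of algorithm for this kind of queue.

```cpp
#include <cassert>
#include <cstdint>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical names throughout; this is NOT the framework's timer queue class.
struct PacedSend
{
    std::uint64_t dueMs;   // simulated time at which the next send is due
    int connectionId;      // which connection should send
    int remaining;         // messages still to send on this connection
};

// Orders the heap so that the earliest due time is on top.
struct LaterFirst
{
    bool operator()(const PacedSend &a, const PacedSend &b) const
    {
        return a.dueMs > b.dueMs;
    }
};

// Runs the schedule against a simulated clock and returns (time, connection)
// pairs in the order in which the sends would be issued.
std::vector<std::pair<std::uint64_t, int>> RunPacedSends(
    int connections,
    int messagesPerConnection,
    std::uint64_t intervalMs)
{
    assert(connections >= 0 && messagesPerConnection >= 0);

    std::priority_queue<PacedSend, std::vector<PacedSend>, LaterFirst> timers;

    for (int id = 0; id < connections; ++id)
    {
        if (messagesPerConnection > 0)
        {
            timers.push({intervalMs, id, messagesPerConnection});
        }
    }

    std::vector<std::pair<std::uint64_t, int>> sends;

    while (!timers.empty())
    {
        PacedSend next = timers.top();
        timers.pop();

        sends.emplace_back(next.dueMs, next.connectionId);  // the "send" happens here

        if (--next.remaining > 0)
        {
            next.dueMs += intervalMs;  // re-arm the timer for this connection
            timers.push(next);
        }
    }

    return sends;
}
```

Calling RunPacedSends(2, 3, 20) yields six sends, with the first due at 20ms and the last at 60ms. A real tool would wait on the earliest due time instead of stepping a simulated clock.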

Finally someone over on CodeProject was interested in comparing the performance of an AcceptEx based server with our standard Accept based server and with a server that they'd written themselves. I added a few more options to the test tool to make these kinds of tests easier.

The result is a fairly useful tool, EchoServerTest, which is available for download here. The tool is built using the same "server" framework that the servers use. At the point we added connection establishment support to the framework I suppose it stopped being purely a server framework... Anyway, EchoServerTest uses IO Completion Ports and overlapped IO. It creates 2 IO threads per processor and creates another thread to manage the timers that drive the data flow. The main thread sits and waits until everything else completes. You can shut the test down early by using the ServerShutdown tool.

EchoServerTest has several command line parameters and switches which can be combined to create various kinds of tests.

Only two command line parameters are required, the rest are optional. The required parameters are these:

  • -server - The server address to connect to (dotted ip).

  • -port - The port to connect to.

So you might execute a test like this:
EchoServerTest -server 192.168.0.2 -port 5050

Which will use defaults for everything else and attempt to create 1000 connections to the specified server. All of the other parameters are optional and allow you to tune your testing.
  • -connections - The number of connections to create. This defaults to 1000.

  • -connectionBatchSize - Batch connections in groups of X size. This defaults to 0 (no batching). If you specify a value then progress will be displayed as each batch of connections has been initiated.

  • -connectionBatchDelay - Delay for Y milliseconds between each batch. This defaults to 0 (no delay). If you specify a value then there will be a pause between each batch of connections. This can let you simulate X connections per second, etc.

  • -messages - The number of messages to send on each connection. This defaults to 0 (don't send any data). If you specify a value for this then data will be sent on each connection once the connection is established.

  • -messageSize - The size of each message. This defaults to 1024 bytes.

  • -messageRate - Delay for Y milliseconds between each message. This defaults to 1000. By changing this value you can vary the rate of data flowing to the server.

  • -sendAfterRecv - Wait for the echo of the previous message before starting the timer for the next send. If you don't set this then the timer for the next send starts when the current message has been sent. If you do set this then the timer isn't started until the current message has been echoed back by the server. Setting this option reduces the pace of the message flow to one that's controlled by the server speed. If you don't set this flag then you will see why network protocols without explicit flow control built into the protocol are a bad idea.

  • -syncConnect - Use synchronous connect rather than ConnectEx(). Without this switch all connections are established asynchronously using ConnectEx(). This means that you can send connection requests as fast as the machine you're running the test on and its network connection allows (good for testing how a server deals with lots of near simultaneous connection requests). If you set this option then the test uses synchronous connects (good for testing how many concurrent connections a server can support).

  • -hold - Hold all connections open until the test completes. If this option is not set then the connections are allowed to close when the data flow is complete (or straight away if -messages is not set).
  • -pause - Wait for a key press once data flow has completed before closing connections. Use this with -hold if you want to run a "max concurrent connections" test from multiple clients.
  • -preallocate - Preallocate sockets and buffers before beginning the test. This speeds up the operation of the test very slightly so that it can, potentially, make more connection attempts per second.
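
As a rough guide to how -connectionBatchSize and -connectionBatchDelay combine into "X connections per second", the arithmetic can be sketched as below. The function names are hypothetical and the sketch assumes that issuing a batch takes no time at all, so the real rate will be somewhat lower.

```cpp
#include <cassert>

// Approximate upper bound on connection attempts per second for a given
// batch size and inter-batch delay. Assumes (hypothetically) that issuing
// the batch itself takes no time.
double ApproxConnectionsPerSecond(unsigned batchSize, unsigned batchDelayMs)
{
    assert(batchSize > 0 && batchDelayMs > 0);
    return batchSize * 1000.0 / batchDelayMs;
}

// Number of batches needed to initiate all connections; the last batch
// may be a partial one.
unsigned BatchCount(unsigned connections, unsigned batchSize)
{
    assert(batchSize > 0);
    return (connections + batchSize - 1) / batchSize;
}
```

For example, a batch size of 5000 with a delay of 500ms works out at roughly 10,000 connection attempts per second, and 40,000 connections at that batch size take 8 batches.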

Typical usage scenarios might be:
  • Run from multiple clients against a server to test the maximum number of concurrent connections that the server can support.
    EchoServerTest -server 192.168.0.2 -port 5050 -syncConnect -hold -pause -connections 40000

  • As above but on a busy server.
    EchoServerTest -server 192.168.0.2 -port 5050 -syncConnect -hold -pause -connections 40000 -messages 100 -messageSize 512 -messageRate 2000 -sendAfterRecv

  • As above but on a busy server with a poorly designed protocol (watch the non-paged memory usage on both client and server machines as the TCP/IP stack is used to buffer all the data).
    EchoServerTest -server 192.168.0.2 -port 5050 -syncConnect -hold -pause -connections 40000 -messages 100 -messageSize 1000 -messageRate 100

  • Run from multiple clients against a server to test how well it handles accepting lots of near simultaneous connection attempts.
    EchoServerTest -server 192.168.0.2 -port 5050 -connections 40000 -connectionBatchSize 5000 -connectionBatchDelay 500

  • etc...
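
The difference that -sendAfterRecv makes in the "busy server" scenarios can be illustrated with a toy model of a single connection. This is only a sketch under simplifying assumptions: the server is assumed to echo one message at a time with a fixed echo time, and the function name and parameters are hypothetical rather than anything taken from the tool.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model of one connection: returns the peak number of messages that
// have been sent but not yet echoed back. echoMs is an assumed fixed time
// the server needs to echo each message, one message at a time.
std::size_t PeakUnechoedMessages(
    int messages,
    std::uint64_t rateMs,
    std::uint64_t echoMs,
    bool sendAfterRecv)
{
    std::vector<std::uint64_t> echoTimes;
    std::uint64_t sendTime = 0;
    std::uint64_t lastEcho = 0;
    std::size_t peak = 0;

    for (int i = 0; i < messages; ++i)
    {
        // With sendAfterRecv the timer starts when the previous echo arrives;
        // without it the timer starts when the previous message was sent.
        sendTime = (sendAfterRecv ? lastEcho : sendTime) + rateMs;

        // The server cannot start on this message until the previous echo is done.
        lastEcho = std::max(sendTime, lastEcho) + echoMs;
        echoTimes.push_back(lastEcho);

        // Count the echoes still outstanding at the moment of this send.
        const std::size_t pending = static_cast<std::size_t>(std::count_if(
            echoTimes.begin(),
            echoTimes.end(),
            [sendTime](std::uint64_t echoTime) { return echoTime > sendTime; }));

        peak = std::max(peak, pending);
    }

    return peak;
}
```

With a message every 100ms and a server that needs 250ms per echo, the backlog in this model keeps growing for as long as the sender keeps sending, whereas with sendAfterRecv set it never exceeds one message. Something has to buffer that backlog, and in practice it is the TCP/IP stack.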

Timings and details of any errors, etc are reported when the test completes. Please bear in mind that, given appropriate hardware, you can easily use this test to flood your network and that once you do that you're testing all manner of other things rather than just your server's performance. For example, you may see the test harness taking a long while to complete as it waits for the TCP/IP stacks to deal with retransmissions to get packets through, etc. You may see your servers showing connections that haven't closed yet though you think they should have (again due to retransmission of lost packets).
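
To get a feel for how quickly a test can flood a network, it can help to work out the load that a set of parameters asks for before running it. The helper below is a back-of-envelope sketch with a hypothetical name; it counts message payload only, ignores TCP/IP header overhead and retransmissions, and counts one direction only, so the echoes roughly double the traffic.

```cpp
#include <cassert>
#include <cstdint>

// Payload bits per second that one test client tries to send when every
// connection sends messageSizeBytes every messageRateMs milliseconds.
// Headers, retransmissions and the echoed traffic are not included.
std::uint64_t OfferedLoadBitsPerSecond(
    std::uint64_t connections,
    std::uint64_t messageSizeBytes,
    std::uint64_t messageRateMs)
{
    assert(messageRateMs > 0);
    return connections * messageSizeBytes * 8 * 1000 / messageRateMs;
}
```

For 40,000 connections sending 1000 byte messages every 100ms this gives 3,200,000,000 bits per second, which is far more than, say, a 100Mb/s link can carry. The 512 byte, 2000ms example comes to 81,920,000 bits per second at most, and less if -sendAfterRecv slows the flow.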

Note that the test doesn't check the data in the packets that it receives. I have other tests for that kind of thing so I don't need to bother in this test program. I expect I may do so later on.

Things that are still on my list of things to do:

  • Add SSL support

  • Validate the echoed data

  • Add a pluggable architecture to allow for custom message flow in a similar way to the way our C# test harness works.

The tool is available for download here. Please report bugs or problems via the comments on this article.

20 Comments

tried this utility, but it seems i can run it only on the tested server. might be some network issue (don't have firewalls installed though). i get an output:

- the specified network name is no longer available

- an established connection was aborted by the software in your host machine

- an existing connection was forcibly closed by the remote host

Test passed.

btw, what are the guidelines for designing a good protocol? a memory alignment? there must be something more.

thanks.

Can you ping from one machine to another? If not then fix your networking first. Can you telnet to the echo server and get a connection and have data echoed back?

Protocol design, look at existing RFCs for established protocols. I'm not sure what you mean by memory alignment.

I've fixed the broken links that were caused by a last minute decision to zip up the exes.

quite strange ....

i can telnet to my echo server from clients (different machines) and netstat -p tcp on the server shows active connections correctly, but when i try to run this utility from clients i get the same output as when i run just EchoServerTest with no parameters (help how to run the utility). EchoServerTest utility works only in case i run it on the machine that acts as a server. The output is:

Creating 1000 connections
All connections in progress
All connections complete in 509ms
1000 established. 0 failed.
Connections reset
302 - The specified network name is no longer available.
533 - An existing connection was forcibly closed by the remote host.
Test Passed

Alek

What operating system do the client machines run?

What happens if you set the number of connections to 1?

clients run win 2000 and server runs win 2003. i'll perform more tests on monday and i'll let you know ... cheers.

if win 2000 machine acts as a server everything is fine. it must be some win 2003 setting that rejects connections ?? need to investigate ....

Some kind of Firewall?

Seems to work fine on my clean install of 2003 in a VMWare box.

I might be experiencing hardware issues you mentioned in your article. What was your test environment regarding hardware? (bridges, switches 100 Mb/s or ? ..)

2 x windows xp SP2 machines on a 100mb/s switch.

or Windows NT 4 sp (latest) and Win2003 server in VMWare boxes on virtual networks and on 100mb/s to Windows XP SP2.

Describe the problems you're having, email if you'd prefer.

one more thing that puzzles me big time:

after running EchoServerTest utility on the same machine that acts as a server i got the result:

C:\source\Misc\networks\utils>EchoServerTest -server 10.0.2.1 -port 5010

Creating 1000 connections
All connections in progress
All connections complete in 612ms
1000 established. 0 failed.
Connections reset
242 - The specified network name is no longer available.
738 - An existing connection was forcibly closed by the remote host.
Test Passed

So, it says 1000 established and 0 failed (even though some errors have been reported). I have a line of a code in

void CSocketServer::OnConnectionEstablished(
   Socket *pSocket,
   CIOBuffer * /*pAddress*/)
{
   Output(_T("OnConnectionEstablished"));

   pSocket->Write(m_welcomeMessage.c_str(), m_welcomeMessage.length());

   pSocket->Read();

   ::InterlockedIncrement(&m_iCnt);
}

that counts the number of established connections and after the test its value was 20 ?? Am I missing something?

If I use batch options (delay between connections 20 ms) I can get approx. 374 connections even though utility reports close to 700 connections. I'm using EchoServer from the ServerSocket project. It sounds like a network issue but I'm not sure. Network is a mix of 10/100 Mb/s equipment.

I haven't run the test against the last publicly available code. I'll do that today. Have you tried running against the precompiled latest servers that I posted here?

The test results you're seeing show that all of the connection attempts have completed OK (which should mean that your counter gets set to 1000) but that the server then closes the connections with an abortive close rather than a graceful shutdown. If you have some kind of firewall on the server machine then this may be intercepting the connection attempts. Have you tried on a clean installation of the OS? Have you tried setting the batch size to 1 and the delay to 500 ? this would show if it was something to do with the speed of the connection attempts... Have you tried with the -hold -pause options to try and keep the connection open?

Just tried against precompiled code (release) and performance of the AcceptEx release version is superior. 1000 connections in 870 ms, no errors, everything is clean. WSAAccept version has a bit weaker performance 877 established, 123 failed in 2068 ms but with the following error: The remote system refused the network connection.

publicly available code (now in the release mode for the first time), 1000 connections, no errors!, everything completed in 1231 ms. It seems that release mode does make a difference :) .. second test 919 connections established, 81 failed in 1517 ms, now with the "The remote system refused the network connection" error. third one, all connections complete in 619ms, 1000 established, 0 failed, no errors.

hmm, results vary from test to test, apart from AcceptEx which is very consistent. max number of connections i achieved on my system is 3790 for all server versions ....

Haven't tried on a clean machine. Played with switches but in a debug version with results I told you about. I would like to hear results you achieved in your tests today.

thanks a lot.

Alek,

Cool. That's what I would expect. I'm not too surprised that the results vary a little and I'm not surprised that release is considerably better than debug.

The 3790 limit is due to the default limit for outbound connections (MaxUserPort). See here for how to change that; I set it to 65534. The default value is causing the client to run out of ports and therefore fail. If you ran the client on multiple machines they would all be able to establish that many connections...
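
As a rough check on that number: as far as I know, the default MaxUserPort on these versions of Windows is 5000 and ephemeral ports are handed out from 1025 upwards. Both figures are assumptions stated from memory rather than anything measured in these tests, and the function name below is hypothetical.

```cpp
#include <cassert>

// Number of ephemeral ports available for outbound connections, assuming
// they are allocated from firstPort up to and including maxUserPort.
// The default firstPort of 1025 is an assumption, not a measured value.
unsigned EphemeralPortCount(unsigned maxUserPort, unsigned firstPort = 1025)
{
    assert(maxUserPort >= firstPort && maxUserPort <= 65535);
    return maxUserPort - firstPort + 1;
}
```

With the assumed default of 5000 this gives 3976 ports, so a ceiling of 3790 is about what you would expect once the ports already in use on the client machine are taken into account. With MaxUserPort set to 65534 it gives 64510.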

Doesn't happen everyday with me but you helped me fail my server and for **that**, I am so grateful to you :)

My server doesn't seem to keep pace with lots of simultaneous incoming connections. Curious to know, does your server pass this:
EchoServerTest -server 192.168.0.2 -port 5050 -connections 40000 -connectionBatchSize 5000 -connectionBatchDelay 500

My implementation doesn't even pass following set:
-connections 1000 -connectionBatchSize 500 -connectionBatchDelay 5000

Can you advise what I should consider to make accepting connections faster? What do you think about so many simultaneous connections in the real world? Would you try to accept them?

What have you set your listen backlog to? How are you accepting connections? The 40,000 and the 1,000 examples you show are probably failing for the same reason; that is, the rate of 500 connections per batch.

This may or may not be a reasonable real world scenario. Try batches of 100? Or, ideally, fewer; something around your listen backlog size.

Remember this is testing for 500 new simultaneous connections every 5 seconds... But given how the test program uses async connection establishment you're firing all 500 of those off 'at the same time' (ish).

You could try 100/1000 rather than 500/5000 but even then that's a LOT of simultaneous connections.

And yes, my C++ code handles these cases without a problem.

Len, I am interested in the source code to your EchoTestServer test harness. Is it possible to get a copy of the source. I need to adapt it for custom testing such as executing N commands to my business logic server. Could you mail me a response on how I can get a copy? Thanks in advance...

The source to the EchoServerTest program is supplied as part of The Server Framework's Core Framework. The example itself can be downloaded for free from here, but you need a license to The Server Framework to actually build it.
