Simple Echo Servers


A long time ago, when I wrote my first article on high performance TCP/IP servers for Windows using IO Completion Ports over on CodeProject, I complained: "Also the more complicated examples, ones that used IO completion ports for example, tended to stop short of demonstrating real world usage. After all, anyone can write an echo server..."

Yet here I am, three and a half years later, with a huge collection of different kinds of echo server.

Whilst I still agree with my original viewpoint, you can demonstrate a lot of server techniques using a simple echo server. What I tend to do these days when writing example servers for The Server Framework is build servers that each demonstrate one particular aspect of a server and have the actual core of the server just echo the received bytes back. I can then build on each of these example servers, adding more and more functionality (running as a service, using a separate thread pool to do the "work", publishing performance counters, etc.), whilst always being able to simply replace the single-line echo implementation with the real logic that the client requires.
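A minimal sketch of that idea follows. It is not code from The Server Framework: the names ExampleServer, Handler, Echo and OnReadCompleted are hypothetical, and the sketch assumes the server plumbing can be reduced to a single callback that maps the bytes just received to the bytes to send back.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical names throughout; this is not The Server Framework's API.
// The part of a server that varies: what to do with the bytes a read delivered.
using Handler = std::function<std::string(const std::string &received)>;

// The single-line echo implementation: send back exactly what arrived.
inline std::string Echo(const std::string &received)
{
    return received;
}

// Stand-in for the plumbing (sockets, thread pools, service wrapper, counters).
// It never inspects the data; it passes each completed read to the handler
// and returns whatever should be written back to the client.
class ExampleServer
{
public:
    explicit ExampleServer(Handler handler) : m_handler(std::move(handler))
    {
        assert(m_handler && "a handler is required");
    }

    std::string OnReadCompleted(const std::string &received) const
    {
        return m_handler(received);
    }

private:
    Handler m_handler;
};
```

Constructing the server as `ExampleServer server(Echo);` gives the echo behaviour; passing a different function or lambda swaps in other logic without touching the rest of the class.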

Since I've been getting a lot of requests for performance figures, I've decided to upload the latest set of compiled servers so that potential clients can do their own testing, either with my TCP/IP server test tool or with their own tools.

The first two servers that I'll present are the basic EchoServer and the AcceptEx() version, EchoServerEx. Due to the way I structure my code samples these are actually called 01-EchoServer and 01-EchoServerEx. All of the "01" examples are simple servers where all of the work is done on threads from the IO thread pool. There are other example server groupings: 04 are "business logic thread pool" servers, 07 are SSL servers, etc. Since I want to present both the debug and release builds of the servers, the actual executables are named in the following form: D01-EchoServer-1.0.exe for the debug build and R01-EchoServer-1.0.exe for the release build, etc. The debug builds log various server state to the console (where available) and to a log file. The release builds perform considerably better as they don't do any of the logging and therefore do not need to synchronize around the log file.

Since these samples usually ship as source code to clients that have licensed The Server Framework, they are usually configured by changing the code. Since that's not really appropriate for the compiled versions, I've added a simple command line interface to configure these console-based samples.

The only required command line parameter for both of these servers is -port, which configures the server to listen on a particular port. With no other command line arguments supplied, both servers will listen on the specified port on all interfaces in the machine. You can restrict them to a single interface by supplying the dotted IP address of the interface that you want them to listen on via the -server parameter.

So, if you want to run up the debug echo server and have it listen on port 5050 on all of your interfaces you'd do this:

D01-EchoServer-1.0.exe -port 5050
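How -port and -server combine can be sketched roughly as below. The servers' real parsing code isn't shown in this article, so ListenAddress and ParseListenAddress are hypothetical names, and using "0.0.0.0" to stand for "all interfaces" is an assumption of the sketch. It also doesn't validate the dotted address.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical structure; not taken from the servers' source.
struct ListenAddress
{
    std::string address = "0.0.0.0";  // assumed spelling of "all interfaces"
    unsigned short port = 0;
};

// Fills in 'result' from the arguments (excluding the program name).
// Returns false if -port is missing or isn't a number in the range 1 to 65535,
// since -port is the only required parameter.
bool ParseListenAddress(const std::vector<std::string> &args, ListenAddress &result)
{
    bool havePort = false;

    // Each option is followed by its value, so stop one short of the end.
    for (std::size_t i = 0; i + 1 < args.size(); ++i)
    {
        if (args[i] == "-port")
        {
            const std::string &value = args[++i];

            if (value.empty() || value.size() > 5 ||
                value.find_first_not_of("0123456789") != std::string::npos)
            {
                return false;
            }

            const unsigned long port = std::stoul(value);

            if (port == 0 || port > 65535)
            {
                return false;
            }

            result.port = static_cast<unsigned short>(port);
            havePort = true;
        }
        else if (args[i] == "-server")
        {
            // Restrict listening to a single interface; taken as-is.
            result.address = args[++i];
        }
    }

    return havePort;
}
```

With just `-port 5050` the result keeps the all-interfaces default; adding `-server` followed by a dotted IP address replaces it, and leaving out -port makes the parse fail.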

The other command line parameters and switches are as follows (please note that all default values are arbitrary values of no particular significance):

  • -numberOfLocks - The sockets used by the server share CriticalSection objects by using a hash map to provide one of a set of locks to each socket based on the socket structure's address. The number of locks in the pool is set by this parameter, which defaults to 47. It's best if the value is a prime number. The larger the value, the more locks will be allocated and the less contention each socket will cause with other sockets. The locks aren't used very often so there usually isn't a great deal of contention between sockets, but you can tune the system to get better performance if you like.

  • -numberOfIOThreads - Each server has a pool of IO threads which service the IO Completion Port and deal with all of the socket IO. This setting determines how many threads are present in this pool. It defaults to 0, which sets the number of threads to twice the number of processors in the machine.

  • -listenBacklog - This setting sets the listen backlog queue for the servers. In the Accept based server this is actually the real listen backlog that's supplied to the listening socket. In the AcceptEx server it's the number of accepts to "post". This defaults to 5; for servers that need to accept lots of near simultaneous connections you should increase it considerably. For my recent tests of 10000 concurrent near simultaneous connections I set it to 150 for both servers. There are platform specific limits to the physical listen backlog queue on some Windows operating systems (Windows XP Home limits this to 5, for example). Please test on a 'server class' operating system before assuming there's something wrong with my code ;)

  • -socketPoolSize - The server maintains a pool of socket structures for reuse. This means that the server often doesn't need to dynamically allocate a new socket data structure when it creates a new socket, which improves performance. This value determines the size of the pool. It defaults to 10. This only affects the number of pooled sockets, so if you create 11 connections and then close them all, only 1 of the corresponding socket data structures will be released; the other 10 will be pooled for reuse. Note that this will cause the server's memory footprint to grow and then not shrink back as much as you might expect; that's by design.

  • -bufferPoolSize - Each read or write operation requires one (or more) data buffers. This setting allows you to specify the size of the data buffer pool. Like the socket pool this maintains buffers in memory for reuse. It defaults to 10.

  • -bufferSize - For simple echo servers the size of the data buffers used isn't really an issue. This is the maximum amount of data that can be read or written in one operation, so you'll probably get better performance if you increase this value. This defaults to 1024 bytes, which is an arbitrary size of no particular meaning. If you make this value very large then you will start to see how the server deals with the "i/o page lock limit".

  • -displayDebug - This only applies to the debug builds of the servers. If set, the server displays debug output showing buffer and socket life cycles and IO pool events.

  • -useSequenceNumbers - This determines if sockets with knowledge of sequencing are used or not. These simple servers never have more than a single read pending at any one time so they don't require sequence numbers to keep the data stream in order. Not using sequence numbers for these servers should improve performance a little but I haven't profiled it yet.

  • -noWelcomeMessage - If set then the server doesn't display its usual sign on welcome message. Set this if you want to run the EchoServerTest program against the servers; failure to do so will result in the welcome message triggering an additional send from the test client due to how -sendAfterRecv works...
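The shared lock pool behind -numberOfLocks can be sketched roughly as follows. This is an illustration rather than the servers' own code: std::mutex stands in for the CriticalSection objects, LockPool and DistinctLocksUsed are hypothetical names, and picking a lock by taking the socket structure's address modulo the pool size is an assumption about how the hashing works. Under that assumption, the helper shows one plausible reason a prime size helps when structures sit at regular power-of-two spacings in memory.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <set>
#include <vector>

// Hypothetical sketch; std::mutex stands in for a CriticalSection.
class LockPool
{
public:
    explicit LockPool(std::size_t numberOfLocks = 47) : m_locks(numberOfLocks)
    {
        assert(numberOfLocks > 0);
    }

    // Assumed hashing scheme: the address modulo the number of locks.
    std::size_t IndexForAddress(std::uintptr_t address) const
    {
        return static_cast<std::size_t>(address % m_locks.size());
    }

    // The same socket structure always maps to the same lock, and many
    // sockets share each lock, so the lock count stays fixed.
    std::mutex &LockFor(const void *pSocket)
    {
        return m_locks[IndexForAddress(reinterpret_cast<std::uintptr_t>(pSocket))];
    }

    std::size_t Size() const { return m_locks.size(); }

private:
    std::vector<std::mutex> m_locks;
};

// Counts how many different locks are used by 'count' structures laid out
// 'stride' bytes apart starting at 'base'.
std::size_t DistinctLocksUsed(
    const LockPool &pool,
    std::uintptr_t base,
    std::uintptr_t stride,
    std::size_t count)
{
    std::set<std::size_t> used;

    for (std::size_t i = 0; i < count; ++i)
    {
        used.insert(pool.IndexForAddress(base + stride * i));
    }

    return used.size();
}
```

With 47 locks, 47 structures spaced 64 bytes apart each land on a different lock, because 64 and 47 share no common factor. With 32 locks the same 47 structures all land on a single lock, because 64 is a multiple of 32.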

The servers are here:

Note that if these examples interest you then you might also be interested in WASP, see here for more details.

2 Comments

Greetings and thanks for the interesting articles posted on codeproject!

My primary reason for contacting you is to see if you are available to provide some client/server socket logic to me and to hear your ideas about what cost would be involved?

My situation is this:
1) I have a client/server design based on chatter/chatsrv, which seems to work extremely well.
2) Although chatter/chatsrv is a messy example, i have managed to refactor it so that the socket logic is well isolated.
3) if i could replace this socket logic with something that is more scalable (provided by you), i would be very interested to do so.

Here is a link to the portion of my design that has something to do with sockets...
http://www.activemetrics.com/CSocketBasedClientServer.jpg

I think you can guess from the picture how the logic goes...
CLIENT: Client is only used in the client application. It instantiates a MfcSocketLogic object and initializes it (i.e. connects to the server)
SERVER: Server is only used in the server application. it listens for connections and creates an MfcSocketLogic object for each new connection
MYMESSAGE: MyMessage is a generic structure that the client and server pass back and forth to communicate anything at all. It knows nothing about sockets but it can serialize to and from a CArchive object.
MFCSOCKETLOGIC: MfcSocketLogic binds each communication socket to 2 CArchive objects (1 for input and 1 for output). This way, it can easily do ProcessPendingRead() or SendData() by simply calling MyMessage::Serialize()

I am impressed by your designs and articles. However, i am concerned about getting the details right. Especially with regard to managing the buffersize efficiently and getting the client-side logic right. So i prefer to engage you directly rather than rework your code for my purpose.

Please let me know if you are interested to provide me with a set of classes that would have similar functionality to what i show here, but with improved performance/scalability/reliability as described in your articles? And also what costs you might predict?

(by the way, i intend to maintain the mfc socket logic for older pcs. so this means that your high-performance logic need not support older versions of windows)

(also, i am happy to simply get an example project from you that simply matches my needs more closely than what is at codeproject. any simple example that shows a client and server passing some objects back and forth would probably do the trick.)

thanks again for the excellent articles and code examples. and also for responding to so many posts at codeproject. it really impresses me when authors make time to do that.

best regards,
-chris

Thanks Chris, will respond by email.
