High throughput, low latency

As I said in my recent posting about Data Distribution Servers, “Next on the list is writing a more focused server and clients.” Tick.

I started out by writing the data feed. This was a simplified version of the echo server test harness that I’d extended to use a controllable TCP receive window. The data feed is just a client that generates random data in packets with a simple header (length, sequence number and type) and sends them to the server that it’s connected to. Nothing is sent back to it, and it uses the various options that I discussed last time so that it can push data as fast as possible without exhausting the machine’s resources.
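A packet like that might look something like the following sketch; the field names and sizes are my assumptions for illustration, not the actual wire format the data feed uses:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <random>
#include <vector>

// Hypothetical packet header: length, sequence number and type,
// packed so that it can be copied straight onto the wire.
#pragma pack(push, 1)
struct PacketHeader
{
   uint32_t length;         // length of the payload that follows
   uint32_t sequenceNumber; // incremented for each packet sent
   uint16_t type;           // message type
};
#pragma pack(pop)

// Build a single packet with a random payload of the given size.
std::vector<uint8_t> BuildPacket(
   uint32_t sequenceNumber,
   uint16_t type,
   uint32_t payloadSize)
{
   std::vector<uint8_t> packet(sizeof(PacketHeader) + payloadSize);

   const PacketHeader header{ payloadSize, sequenceNumber, type };

   memcpy(packet.data(), &header, sizeof(header));

   static std::mt19937 rng{ std::random_device{}() };

   std::uniform_int_distribution<int> dist(0, 255);

   for (uint32_t i = 0; i < payloadSize; ++i)
   {
      packet[sizeof(header) + i] = static_cast<uint8_t>(dist(rng));
   }

   return packet;
}
```

The data feed would then simply sit in a loop building packets with an incrementing sequence number and handing them to its connection.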

Next was a server. This was pretty easy to put together with The Server Framework. The server listens on two ports: one for data feeds and one for feed clients. Data feeds connect and send data that needs to be distributed. Feed clients connect and receive any data that arrives at the server. The servers listening on the two ports are connected together by a “distributor” (the only piece of custom code in the whole server). The distributor maintains a collection of feed clients and allows the data feeds server to send all of the data that arrives to all of the clients.

Since, at present, this example is a simple broadcast of all of the inbound data to all of the outbound clients, we use the CBufferHandle class that was developed for use in an auction server and that allows you to send the same buffer to multiple connections. The framework’s data buffers contain all of the I/O completion port housekeeping data that they require as well as the data itself, which means that you can’t simultaneously send a single buffer to multiple connections. The CBufferHandle class gets around this by allowing you to attach multiple sets of ‘housekeeping data’ to a buffer so that you can safely send it on multiple connections. The key point is that if you’re sending the same data to each connection then you don’t need to duplicate it.
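The idea behind CBufferHandle can be sketched like this; note that these class and function names are illustrative, not the framework’s actual API, and a simple bytes-sent counter stands in for the real per-operation housekeeping data:

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// The payload lives in one shared, immutable buffer...
class CSharedBuffer
{
public:
   explicit CSharedBuffer(std::vector<uint8_t> data)
      : m_data(std::move(data))
   {
   }

   const uint8_t *Data() const { return m_data.data(); }
   size_t Length() const { return m_data.size(); }

private:
   const std::vector<uint8_t> m_data;
};

// ...and each connection gets its own lightweight handle that carries
// the per-send 'housekeeping' state for that connection.
class CBufferHandleSketch
{
public:
   explicit CBufferHandleSketch(std::shared_ptr<const CSharedBuffer> buffer)
      : m_buffer(std::move(buffer)),
        m_bytesSent(0)
   {
   }

   const uint8_t *Data() const { return m_buffer->Data(); }
   size_t Length() const { return m_buffer->Length(); }

   void OnBytesSent(size_t bytes) { m_bytesSent += bytes; }
   size_t BytesSent() const { return m_bytesSent; }

private:
   std::shared_ptr<const CSharedBuffer> m_buffer; // shared payload
   size_t m_bytesSent;                            // unique per handle
};

// The distributor would hand one handle to each feed client connection.
std::vector<CBufferHandleSketch> MakeHandles(
   const std::shared_ptr<const CSharedBuffer> &buffer,
   size_t connectionCount)
{
   std::vector<CBufferHandleSketch> handles;

   for (size_t i = 0; i != connectionCount; ++i)
   {
      handles.emplace_back(buffer); // copies the shared_ptr, not the data
   }

   return handles;
}
```

However many clients are connected, the inbound data is stored once; only the small per-connection housekeeping state is duplicated.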

Finally, the Feed Client. Again, this is just another hacked-around version of the standard echo server test client. It can make multiple connections to a server and simply accepts whatever data arrives.

With these pieces in place I can run up the server, add several feed clients, then run up one or more data feeds and watch. The server behaves as expected and performance is pretty good, though I’d like to expose some performance counters so that it’s easier to monitor what’s going on. With everything on a 1Gb LAN things are good but, as expected, with the data feed on a machine with a 1Gb connection to the server and a feed client on a machine with a 100Mb connection, the client’s connection gets swamped and the server uses an unrestrained amount of non-paged pool trying to send everything it can.

So, the next steps are to take the write completion driven flow control that’s used in the Data Feed and finally implement a connection filter based version that can be used in the server on its Feed Client connections. This will need to keep track of the number of outstanding write completions and buffer a configurable number of writes when it’s unable to send. Then, when it can send, it should send from the buffered data first before resuming direct sends. If it reaches the point where it has buffered a configurable ‘too much’ then it should apply a user-configurable policy for dealing with the situation (throw away new writes or throw away buffered writes…). Once that’s done, some performance counters on the server to clearly show the data flow, and we’re almost there.
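The flow control logic described above could be sketched along these lines; the class name, the injected send function and the policy names are all my assumptions rather than anything from the framework, and real code would hook the framework’s write completion notifications:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <functional>
#include <utility>
#include <vector>

// What to throw away once 'too much' data has been buffered.
enum class OverflowPolicy { DiscardNewest, DiscardOldest };

class CFlowControlFilter
{
public:
   using Buffer = std::vector<uint8_t>;

   CFlowControlFilter(
      size_t maxOutstanding,
      size_t maxBuffered,
      OverflowPolicy policy,
      std::function<void(const Buffer &)> send)
      :  m_maxOutstanding(maxOutstanding),
         m_maxBuffered(maxBuffered),
         m_policy(policy),
         m_send(std::move(send)),
         m_outstanding(0),
         m_discarded(0)
   {
   }

   void Write(const Buffer &data)
   {
      if (m_outstanding < m_maxOutstanding && m_buffered.empty())
      {
         SendNow(data); // send directly whilst we're under the limit
      }
      else if (m_buffered.size() < m_maxBuffered)
      {
         m_buffered.push_back(data); // buffer when we can't send
      }
      else if (m_policy == OverflowPolicy::DiscardOldest)
      {
         m_buffered.pop_front(); // make room by dropping the oldest
         m_buffered.push_back(data);
         ++m_discarded;
      }
      else
      {
         ++m_discarded; // DiscardNewest: drop this write
      }
   }

   // Called when the transport reports a write completion.
   void OnWriteCompleted()
   {
      --m_outstanding;

      // Send from the buffered data first; once the buffer drains,
      // direct sends resume in Write().
      while (m_outstanding < m_maxOutstanding && !m_buffered.empty())
      {
         SendNow(m_buffered.front());
         m_buffered.pop_front();
      }
   }

   size_t Outstanding() const { return m_outstanding; }
   size_t Buffered() const { return m_buffered.size(); }
   size_t Discarded() const { return m_discarded; }

private:
   void SendNow(const Buffer &data)
   {
      ++m_outstanding;
      m_send(data);
   }

   const size_t m_maxOutstanding;
   const size_t m_maxBuffered;
   const OverflowPolicy m_policy;
   const std::function<void(const Buffer &)> m_send;
   std::deque<Buffer> m_buffered;
   size_t m_outstanding;
   size_t m_discarded;
};
```

The per-connection discard counters would also be obvious candidates for the performance counters mentioned above, so that a swamped client shows up immediately.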