Socket connection termination


I've been putting together a sample server for a client that shows how to cleanly terminate a socket connection. This should have been a simple thing to do, but in doing so I've discovered some gnarliness within The Server Framework and the result has been some new TODO items for the 5.3 release...

When you have an active TCP/IP connection that you wish to terminate cleanly, you need to initiate a TCP/IP protocol-level shutdown sequence by calling shutdown(). This causes the appropriate packets to be exchanged between the two TCP/IP stacks (server and client) and terminates the connection. Once this is done you can close the socket by calling closesocket(); this cleans up the resources used by the socket (and associated data structures) within your program. Closing the socket without first initiating the protocol-level shutdown sequence implicitly triggers the shutdown sequence. This is explained in "Graceful shutdown, linger options, and socket closure".
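As a rough illustration of the shutdown-then-close order described above, here is a minimal sketch using Python's standard socket module rather than Winsock, so sock.shutdown() and sock.close() stand in for shutdown() and closesocket(). The helper names make_tcp_pair() and graceful_close() are made up for this example and are not part of The Server Framework.

```python
import socket


def make_tcp_pair():
    """Return (client, server) TCP sockets connected over loopback."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _ = listener.accept()
    listener.close()
    return client, server


def graceful_close(sock):
    """Shut down our sending side, drain the peer's data, then close."""
    sock.shutdown(socket.SHUT_WR)     # protocol-level shutdown of our send side
    leftover = bytearray()
    while True:
        chunk = sock.recv(4096)
        if not chunk:                 # zero bytes: the peer has shut down too
            break
        leftover += chunk
    sock.close()                      # release the local resources
    return bytes(leftover)
```

Note that graceful_close() blocks until the peer has shut down its own sending side, so this sketch only suits code that is allowed to wait; it returns whatever the peer sent before it did so.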

Simple servers written using The Server Framework tend to operate as follows: When an incoming connection is detected an asynchronous read is issued; this increments a reference count on our socket class. When a read completes, the last thing that happens before the function returns to the calling code within the framework is that a new read is issued. If the client closes the connection, the pending read within the server returns with 0 bytes read. This is interpreted as a 'client close' and no further reads can be issued on the socket. This, eventually, causes the reference count on the socket to fall to 0 and the socket gets cleaned up. Part of that clean up involves calling closesocket(). If the server wants to terminate the connection then it calls Shutdown(ShutdownSend) on its socket to indicate to the client that it has no more data to send. This eventually results in the client shutting down its socket, which triggers the server socket cleanup sequence that I described earlier.
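The read-driven lifetime described above can be sketched roughly as follows. This is a synchronous Python approximation, not the framework's code: CountedSocket and serve_connection() are hypothetical names, a blocking recv() loop stands in for the asynchronous reads, and socket.socketpair() is assumed as a convenient way to get two connected sockets for a demo.

```python
import socket


class CountedSocket:
    """Hypothetical stand-in for a reference-counted socket class."""

    def __init__(self, sock):
        self.sock = sock
        self.refs = 0
        self.closed = False

    def add_ref(self):
        self.refs += 1

    def release(self):
        self.refs -= 1
        if self.refs == 0 and not self.closed:
            self.sock.close()           # cleanup runs when the last reference goes
            self.closed = True


def serve_connection(conn, on_data):
    """Keep one read outstanding until the peer closes the connection."""
    conn.add_ref()                      # the pending read holds a reference
    while True:
        data = conn.sock.recv(4096)
        if not data:                    # zero bytes read: treat as 'client close'
            break                       # no further reads are issued
        on_data(data)
        # Going round the loop is the equivalent of issuing the next read.
    conn.release()                      # count falls to zero, socket is closed
```

A caller might wrap one end of socket.socketpair() in a CountedSocket, pass a callback such as a list's append method as on_data, and expect the socket to be closed once the other end has closed.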

Due to the way the server is designed, there's some 'clever stuff' in there to make sure that if you have several writes pending but not yet issued by the framework then the call to shutdown() occurs after the last write has actually been passed off to the TCP/IP stack.
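One way to picture that ordering guarantee is a write queue that remembers a shutdown request and only acts on it once the queue has drained. The sketch below is an assumption about the general technique, written in Python; QueuedWriter and its methods are invented for illustration and do not reflect how the framework actually implements this.

```python
import socket
from collections import deque


class QueuedWriter:
    """Hypothetical write queue that delays shutdown until it has drained."""

    def __init__(self, sock):
        self.sock = sock
        self.pending = deque()
        self.shutdown_requested = False
        self.shutdown_done = False

    def write(self, data):
        """Queue data; it has not reached the TCP/IP stack yet."""
        if self.shutdown_requested:
            raise RuntimeError("write after shutdown was requested")
        self.pending.append(data)

    def shutdown(self):
        """Record the request; act on it only when nothing is queued."""
        self.shutdown_requested = True
        self._maybe_shutdown()

    def flush_one(self):
        """Hand the oldest queued write to the TCP/IP stack."""
        if self.pending:
            self.sock.sendall(self.pending.popleft())
        self._maybe_shutdown()

    def _maybe_shutdown(self):
        if (self.shutdown_requested and not self.pending
                and not self.shutdown_done):
            self.sock.shutdown(socket.SHUT_WR)
            self.shutdown_done = True
```

With two writes queued, a call to shutdown() does nothing visible straight away; the socket-level shutdown happens inside the second flush_one(), after the last write has been handed over.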

The socket class also exposes a Close() method which calls closesocket() on the socket directly; that is, it doesn't do 'clever stuff' to deal with outbound data that is 'in flight' within the framework. You probably don't want to call Close() unless you don't care whether the data gets to the other end, or you know that there's no data 'in flight'.

It gets more complex...

Due to either my misreading of the docs for closesocket() (or the fact that they were originally less clear and have since been clarified), it was my belief that a graceful shutdown using closesocket() would block. Since one of the most important design decisions of the framework is that work done on the I/O threads should not block, the default behaviour for the automatic socket closure that happens when a socket is being cleaned up is for the close to be a 'hard' or 'abortive' close. That is, we deliberately choose not to linger. Because this isn't always what you want (no kidding!) there's some code in there that allows you to intercept the default behaviour and, potentially, call CloseSocket() yourself or marshal the CloseSocket() call off to your own threads so that it can block them instead. However, graceful shutdowns that occur due to closesocket() do not block, so, it seems, most of that code isn't really needed...

Some of the example servers in the 5.2.1 release get this wrong. It doesn't cause them to lose data, since they're not doing anything that complex, but a more complex server that has been modelled on one of the examples may have problems. If you've been having this problem then I'm sure you'd have contacted me already, but, if not, do get in touch and I'll help sort things out for you.

So, in summary, at present, in version 5.2.1 of The Server Framework or before, you should generally be calling Shutdown() to terminate your connections and the framework will deal with the resource cleanup and eventual call of closesocket() itself. You can call Close() but you shouldn't do that unless you KNOW that there cannot be any data 'in flight' that the server has sent but that the client might not have received, OR you don't care if the data gets to the client.

This will become nicer in 5.3, I hope. I plan to make "standard" connection termination easier to manage and provide access to the currently private AbortiveClose() method on the socket class; this sets the socket's linger options in such a way that the socket is closed immediately and all pending data is discarded. What's especially useful is that this also sends a RST (reset) on the TCP/IP connection, which closes the connection without putting the closer into the TIME_WAIT state; that is sometimes useful.
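The linger settings in question are, as far as I can tell, the usual "linger on, timeout zero" combination. A minimal Python sketch of that technique follows; abortive_close() is a name chosen for this example and is not the framework's AbortiveClose() itself. The struct layout is an assumption that depends on the platform: two ints on Linux and similar systems, two unsigned shorts for Winsock.

```python
import socket
import struct
import sys


def abortive_close(sock):
    """Close at once, discarding unsent data; TCP sends RST instead of FIN."""
    # l_onoff=1, l_linger=0: lingering enabled with a zero timeout.
    # Winsock's linger struct holds two unsigned shorts; POSIX uses two ints.
    fmt = "HH" if sys.platform == "win32" else "ii"
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack(fmt, 1, 0))
    sock.close()
```

After this call, the peer of a TCP connection should see a connection reset error on its next read rather than an orderly zero-byte read.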

4 Comments

Hi Len,

In my testing with the socket framework, I found that the shutdown method always leaves a connection in either the FIN_WAIT_1 or the TIME_WAIT state. That means the port cannot be reused until a certain time period has passed (in my case it is 30 sec. after a registry change). Under a high traffic volume this can be an issue, because only a finite number of socket connections can be open concurrently on a PC.

I found that the AbortiveClose() method cleans up a port nicely and cancels any pending data, except that it is a private method.

What is the harm in exposing this method and replacing any shutdown calls with the AbortiveClose() method? Is there anything extra that I need to do?

Dennis

Shutdown initiates a TCP/IP active close. If you look at a TCP/IP state transition diagram you'll see that whoever initiates the active close ends up in TIME_WAIT. This is by design. The usual way to avoid this on the server is to have the client initiate the active close. This leaves the TIME_WAIT on the client and allows the server to reuse resources more quickly. Using AbortiveClose() causes the connection to be reset; it does a non-lingering close. It doesn't close the connection cleanly: a TCP/IP shutdown sequence doesn't occur, data is likely to be lost, etc. Resetting the connection in this way doesn't result in either side being in TIME_WAIT.

I've just been working on the licensed version of the code and have changed how a connection can be shut down. I've exposed AbortiveClose() for those people who want to tear down a connection in this way. I'd suggest that you just make it public if you want; of course the free version of the code isn't supported, so you can do what you like and still receive the same level of support ;)

Hi Len,

I am currently using C# and found something that puzzles me. If I connect my client asynchronously using BeginConnect and the server does not respond (e.g. because of cable loss), I try to close the socket using the Close method. All seems to be fine, but in Perfmon I can see that the number of Active Connections does not decrease.

If I try a new connect, the number of Active Connections increases. I have also tried setting LingerOption to true with 0 sec, but nothing seems to work. How should I handle this kind of scenario?

Johnny,

That's interesting; I'm not in a position to test this in C++ at the moment, but it could be that the async connection attempt will eventually time out and your perfmon counter will drop to 0 at that point. Have you tried leaving things and seeing if that happens? Alternatively it may be that it's a bug...
