Threading flames


Thanks to Ned Batchelder for pointing out the "discussion" about the pros and cons of multi-threaded programming over on the SQLite newsgroup. The comments on Ned's post are well worth reading; they've provided me with a new blog to subscribe to, Jeff Darcy's Canned Platypus, which seems to have lots of the kind of high-quality techie stuff that I like.

My view on multi-threading is probably pretty obvious given the way my socket server framework is designed...

I'm not scared of using threads when they're appropriate, and, as Ned says, it is difficult and you do have to abide by certain rules. It's not something I'd want everybody doing because, quite frankly, lots of developers don't pay enough attention to what they're doing...

Developing high performance servers on Windows pretty much requires multi-threading, ideally in the form of IO Completion Ports and a thread pool. The interesting thing about an IO Completion Port based design is that the focus shifts from working with multiple threads, to working with multiple event-driven state machines... Some people find that a difficult shift to make. It seems, from the C10K document, that this approach has not yet really taken off over in Linux-Land. The event-driven state machine approach is actually quite similar to what you need to do if you use just one thread and non-blocking IO. It's also what some of the people who don't like threads think you should be using as an alternative. For me IO Completion Ports let you exploit the processors that you have and do so in a way that works well and scales well. You do have to spend a little bit of time learning how to do it properly though...
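To make the "event-driven state machine" idea a little more concrete, here is a minimal sketch. It is not code from the socket server framework and it does not touch the IO Completion Port API at all; it is plain Python, and the `Connection` class, the `on_read_completed` callback name and the 4-byte length-prefixed framing are all assumptions made up for the illustration. The point is only that each connection carries its own state, and whichever thread happens to pick up a completed read simply feeds the bytes in and lets the state machine advance.

```python
import struct
from enum import Enum, auto


class State(Enum):
    READING_HEADER = auto()
    READING_BODY = auto()


class Connection:
    """Hypothetical per-connection state machine for length-prefixed messages."""

    HEADER_SIZE = 4  # assumed framing: 4-byte big-endian body length

    def __init__(self):
        self.state = State.READING_HEADER
        self.buffer = bytearray()
        self.body_length = 0
        self.messages = []  # complete messages, in arrival order

    def on_read_completed(self, data):
        """Called once per completed read; advances as far as the data allows."""
        self.buffer.extend(data)
        progressed = True
        while progressed:
            progressed = False
            if (self.state is State.READING_HEADER
                    and len(self.buffer) >= self.HEADER_SIZE):
                (self.body_length,) = struct.unpack(
                    "!I", self.buffer[:self.HEADER_SIZE])
                del self.buffer[:self.HEADER_SIZE]
                self.state = State.READING_BODY
                progressed = True
            elif (self.state is State.READING_BODY
                    and len(self.buffer) >= self.body_length):
                self.messages.append(bytes(self.buffer[:self.body_length]))
                del self.buffer[:self.body_length]
                self.state = State.READING_HEADER
                progressed = True
```

The same class would work unchanged with one thread and non-blocking IO, which is the similarity mentioned above; a thread pool only changes who calls `on_read_completed`, provided that calls for a single connection are not allowed to overlap.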

3 Comments

My favorite posting on that SQLite newsgroup thread is Tim Browse's:

"All you people posting to comp.sys.15yearoldarguments - you know you're
also cross-posting to the sqlite mailing list, right?"

:)

A lot of your recent posts resonate with what I have been dealing with and thinking about lately. I just spent 4 days fixing a bug in my IOCP-based TCP server. It turned out to be a deadlock between 3 critical sections (A waits on B, B waits on C, C waits on A).

It is very difficult to get right, but as you write, there is no alternative if you want a fast TCP server.
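One common defence against the kind of three-way cycle described above is to give every lock a rank and always acquire locks in rank order, so a cycle cannot form. The sketch below is only an illustration of that rule, not the fix that was actually applied to the server in question: it uses Python's `threading.Lock` as a stand-in for a Windows critical section, and `RankedLock`, `acquire_in_order` and `run_demo` are hypothetical names.

```python
import threading
from contextlib import ExitStack, contextmanager


class RankedLock:
    """A lock with a fixed rank; lower ranks must be taken first (assumed rule)."""

    def __init__(self, rank):
        self.rank = rank
        self._lock = threading.Lock()

    def __enter__(self):
        self._lock.acquire()
        return self

    def __exit__(self, *exc_info):
        self._lock.release()


@contextmanager
def acquire_in_order(*locks):
    """Take all the given locks in ascending rank order, release in reverse."""
    with ExitStack() as stack:
        for lock in sorted(locks, key=lambda item: item.rank):
            stack.enter_context(lock)
        yield


def run_demo(iterations=1000):
    """Three workers ask for (A, B), (B, C) and (C, A), the cyclic pattern."""
    a, b, c = RankedLock(1), RankedLock(2), RankedLock(3)
    counter = {"value": 0}

    def worker(first, second):
        for _ in range(iterations):
            # Whatever order the caller names the locks in, they are
            # acquired by rank, so the A -> B -> C -> A cycle cannot occur.
            with acquire_in_order(first, second):
                counter["value"] += 1

    threads = [threading.Thread(target=worker, args=pair)
               for pair in ((a, b), (b, c), (c, a))]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"]
```

Every pair of workers shares at least one lock, so the counter is still protected, and the demo finishes instead of hanging. The rule only helps if all the code that takes more than one lock goes through it, which is exactly the sort of discipline that is hard to enforce in a large program.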


Personally, I am a huge fan of message/event based programming. In my previous job, where I worked for 5 years on embedded systems, there were multiple threads communicating via messages and queues. The good part was that most programming problems were converted to a single thread problem. But it required a bit more scaffolding: You had to define a struct for each message and you needed to keep track of the messages you sent (to match responses).
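The scaffolding mentioned above (one struct per message, plus bookkeeping to match responses to requests) might look roughly like this. It is a sketch in Python rather than the embedded C it would originally have been, and the message types, the `request_id` field and the function names are invented for the example.

```python
import itertools
import queue
import threading
from dataclasses import dataclass


@dataclass(frozen=True)
class AddRequest:
    """Hypothetical request message: one 'struct' per message type."""
    request_id: int
    left: int
    right: int


@dataclass(frozen=True)
class AddResponse:
    """Hypothetical response; request_id ties it back to its request."""
    request_id: int
    total: int


def adder_thread(inbox, outbox):
    """Worker: owns its own state and only talks to others through queues."""
    while True:
        message = inbox.get()
        if message is None:  # sentinel meaning "shut down"
            break
        outbox.put(AddResponse(message.request_id,
                               message.left + message.right))


def run_exchange(pairs):
    """Send one request per pair and match each response to its request."""
    to_worker, to_client = queue.Queue(), queue.Queue()
    worker = threading.Thread(target=adder_thread, args=(to_worker, to_client))
    worker.start()

    next_id = itertools.count(1)
    pending = {}  # request_id -> request still waiting for a response
    for left, right in pairs:
        request = AddRequest(next(next_id), left, right)
        pending[request.request_id] = request
        to_worker.put(request)

    results = {}
    while pending:
        response = to_client.get()
        request = pending.pop(response.request_id)
        results[(request.left, request.right)] = response.total

    to_worker.put(None)
    worker.join()
    return results
```

Neither thread takes a lock of its own; the only shared objects are the two queues, which is what turns most of the work into a single thread problem.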

When I moved to a Windows job I thought things would be easier. But there are no process/thread queues in the Win32 API (or other forms of IPC). You have to build them yourself. It is a bit odd because the Windows kernel uses messaging internally (IRP). They are fixing this in .NET:

http://codebetter.com/blogs/sahil.malik/archive/2004/12/16/37438.aspx

Herb Sutter has written several excellent articles about multi-threading issues lately. I can highly recommend "The Trouble With Locks" (C/C++ Users Journal, 23(3), March 2005). Some quotes:

"Lock-based programming, our status quo, is difficult for experts to get right. Worse, it is also fundamentally flawed for building large programs."

"This demonstrates the fundamental issue that lock-based synchronization isn't composable"

His recent thoughts are right up my alley

http://pluralsight.com/blogs/hsutter/archive/2005/03/19/6804.aspx

--
Vagn Johansen

I'm sure Christopher Baus will comment on our view that there's "no alternative"; he's writing a high performance proxy server and, as far as I know, doing it with just one thread...

I've found that since I've been using IOCP for socket IO I've also got into the habit of using them as an in-process queue for all kinds of other things. This started with my 'business logic thread pool' server, which runs its IO in one fixed-size thread pool and then uses an IOCP to queue complete network messages to a thread pool that can expand and contract as it does blocking database work. Obviously this fits pretty well as the messages are already defined and the application is one that works with message passing anyway... Once I'd added performance counters to the queue and the thread pool so that I could see how many items were queued and how the threads were running I found the whole architecture so damn useful (and composable) that I've used it a lot since. I've written a couple of servers now where there are pipelines of these constructs that connect different stages of the processing. Yes, it's multi-threaded, but there's very little interaction between threads, if any. It works well, scales very well and it's very easy to tune the server.
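For anyone who hasn't seen this kind of design, the shape of it is roughly as follows. This is a portable sketch and not the real thing: it uses Python's `queue.Queue` where the servers described above use an IOCP, the pools are fixed-size rather than expanding and contracting, the counters are plain attributes rather than Windows performance counters, and `Stage` and `run_pipeline` are names invented for the example.

```python
import queue
import threading

_STOP = object()  # sentinel that tells one worker thread to exit


class Stage:
    """One pipeline stage: a queue feeding a small pool of worker threads."""

    def __init__(self, name, handler, workers, downstream=None):
        self.name = name
        self.handler = handler
        self.downstream = downstream
        self.inbox = queue.Queue()
        self.processed = 0  # simple stand-in for a performance counter
        self._count_lock = threading.Lock()
        self._threads = [threading.Thread(target=self._run)
                         for _ in range(workers)]
        for thread in self._threads:
            thread.start()

    def submit(self, item):
        self.inbox.put(item)

    def queued(self):
        """Approximate number of items waiting; useful when tuning."""
        return self.inbox.qsize()

    def _run(self):
        while True:
            item = self.inbox.get()
            if item is _STOP:
                break
            result = self.handler(item)
            with self._count_lock:
                self.processed += 1
            if self.downstream is not None:
                self.downstream.submit(result)

    def shutdown(self):
        """Stop after everything already queued has been handled."""
        for _ in self._threads:
            self.inbox.put(_STOP)
        for thread in self._threads:
            thread.join()


def run_pipeline(lines):
    """Two stages: 'parse' upper-cases each line, 'store' collects results."""
    results = queue.Queue()
    store = Stage("store", results.put, workers=3)
    parse = Stage("parse", str.upper, workers=2, downstream=store)
    for line in lines:
        parse.submit(line)
    parse.shutdown()  # upstream first, so nothing arrives after store stops
    store.shutdown()
    collected = sorted(results.get() for _ in range(len(lines)))
    return collected, parse.processed, store.processed
```

The only thing the threads share is the queue between stages and a counter, which is why there is so little interaction to get wrong, and tuning mostly comes down to watching `queued()` for each stage and adjusting the number of workers.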

I think I've seen the Sutter articles but I'll check them out again. Thanks.
