CLR thread pool woes

Joe Duffy has written an interesting piece over on “Generalities & Details: Adventures in the High-tech Underbelly” about problems with the CLR thread pool. Joe’s a program manager on the CLR team at Microsoft, so he knows what he’s talking about!

I find the issues that Joe raises interesting as I came up against the same problems some time ago when I was designing a flexible thread pool for my C++ IOCP servers. In essence the problem is that you want to allow the thread pool to grow and shrink based on demand, but you don’t want it to grow too fast, as threads are expensive to create and creating too many hurts performance, yet you don’t want it to grow too slowly either, as that hurts the throughput of work items; especially if the work items are performing blocking operations.

Please bear in mind that I haven’t used the CLR thread pool a great deal because most of my clients still require that I write in C++, so I may have things wrong; be gentle with me…

It sounds like Joe’s CLR thread pool design is similar to my first cut of a design. There are a minimum number of threads that the pool will hold and a throttling time that restricts the creation of new threads for a period after a thread has been created. The idea is that as work comes in you allocate each work item to a thread until all threads are busy, and then you either start a new thread or wait a while (because you only just started one). If threads are idle for too long then they are removed from the pool and the pool shrinks. If the work items arrive at a “reasonable” and consistent rate and each takes a “reasonable” amount of time then you’re fine; the pool expands and contracts as required and work gets done in a “reasonable” time. The problem is what happens when the work items arrive in batches with gaps between them and some or all of them spend a long time in blocking operations (database access, for example).
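Something like the following sketch captures that first-cut policy; the names and structure here are invented for this post rather than taken from the CLR’s code or from mine:

```cpp
#include <chrono>
#include <cstddef>

// A sketch of the simple "first cut" grow/shrink policy described above.
struct SimplePoolPolicy
{
   std::size_t minThreads;                       // never shrink below this
   std::chrono::milliseconds creationThrottle;   // wait at least this long between creating threads
   std::chrono::milliseconds idleTimeout;        // idle threads are retired after this long

   bool ShouldCreateThread(
      std::size_t currentThreads,
      std::size_t busyThreads,
      std::chrono::steady_clock::time_point lastThreadCreated) const
   {
      // Only consider growing once every thread is busy with a work item...
      if (busyThreads < currentThreads)
      {
         return false;
      }

      // ...and even then only if we didn't just create a thread. This throttle
      // is what hurts when a batch of blocking work items all arrives at once.
      return (std::chrono::steady_clock::now() - lastThreadCreated) >= creationThrottle;
   }

   bool ShouldRetireThread(
      std::size_t currentThreads,
      std::chrono::milliseconds idleFor) const
   {
      // Shrink back towards the minimum once a thread has been idle too long.
      return currentThreads > minThreads && idleFor >= idleTimeout;
   }
};
```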

My final design was a little more complex than Joe’s (possibly too complex!). I have an initial count of threads, which is the number that the thread pool creates when it starts. I have a minimum number of threads, below which the pool will not shrink, and a maximum number of threads, above which it will not grow. There is also a maximum number of ‘dormant’ threads, that is, threads that aren’t currently processing work items; once this limit is exceeded we start shutting some threads down. Then there are the dispatch timeout values, of which I have two. The first is used when the thread pool is not currently running at maximum threads and determines how long to wait between starting new threads. The second is used when the pool is running at maximum threads; it simply slows the dispatch monitoring loop so that we don’t busy wait when we aren’t allowed to increase the capacity of the pool. The final configuration parameter is the pool maintenance period, which determines how often the pool considers shrinking itself. There’s a detailed, if old, design rationale in this CodeProject article here.
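Pulled together, the parameter set looks roughly like this; the names below are invented for this post, the real ones are in the CodeProject code:

```cpp
#include <chrono>
#include <cstddef>

// An illustrative configuration block for the pool described above.
struct ThreadPoolConfig
{
   std::size_t initialThreads;      // created when the pool starts
   std::size_t minThreads;          // the pool never shrinks below this
   std::size_t maxThreads;          // the pool never grows above this
   std::size_t maxDormantThreads;   // idle threads above this are shut down

   // Used while the pool is below maxThreads: how long the dispatcher waits
   // before deciding to start another new thread.
   std::chrono::milliseconds dispatchTimeout;

   // Used once the pool is at maxThreads: simply slows the dispatch
   // monitoring loop so that we don't busy wait when we can't add capacity.
   std::chrono::milliseconds maxThreadsDispatchTimeout;

   // How often the pool wakes up and considers shrinking itself.
   std::chrono::milliseconds poolMaintenancePeriod;
};
```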

In itself my design simply moves the problem around a little; however, there are two advantages to using my thread pool in a process. Firstly, I can tune the thread pool to suit the application; all of the configuration parameters are passed in to the constructor, so they can be sourced from an external config system if required and adjusted on a per-process basis as you profile the process that’s using them. Secondly, I can have more than one thread pool in a process; there is no need for a single, process-wide thread pool that has to be appropriate for all of the async work within a process.
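In use it looks something like the sketch below, building on the ThreadPoolConfig above; the ThreadPool class, LoadPoolConfig() and the config section names are all invented for illustration:

```cpp
#include <string>

// A hypothetical thread pool that takes all of its tuning parameters in the
// constructor, so each pool in a process can be configured independently.
class ThreadPool
{
   public :

      explicit ThreadPool(const ThreadPoolConfig &config);

      // Dispatch(), shutdown, etc. omitted for brevity.
};

// Imagine this reads a named section from the process's own config file.
ThreadPoolConfig LoadPoolConfig(const std::string &sectionName);

void CreateWorkerPools()
{
   // One pool tuned for slow, blocking database work...
   ThreadPool databasePool(LoadPoolConfig("DatabasePool"));

   // ...and another tuned for short, latency sensitive work items.
   ThreadPool fastWorkPool(LoadPoolConfig("FastWorkPool"));
}
```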

I can understand how the concept of “THE CLR Thread pool” might have come about. It’s convenient for programmers using various APIs if they don’t have to worry about providing a thread pool and the API just goes off and “grabs” THE thread pool. Unfortunately, singletons are evil ;) and, tempting as it may be to make things “easier” for people by hiding things and allowing access to global variables (the single thread pool object!), it’s rarely appropriate. The problem is that for this design to work, the CLR team has to anticipate all possible usage patterns and optimise for all of them… Obviously, needing multiple thread pools is unusual, but it can happen; you may have some work items that are slow and should not take up all of your single thread pool’s resources, and some that are fast and must be processed reasonably quickly… If you, the programmer, can decide to use two, or more, appropriately tuned thread pools then tuning one pool to be just right for a mixed set of work item requirements is no longer an issue… After all, no matter how large a number of threads, N, you allow in your single pool, you only need N+1 of your slow work items to prevent your faster work items from being processed at a “reasonable” rate. With two, or more, pools this is never an issue…
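A rough sketch of that partitioning, reusing the hypothetical ThreadPool from above; the WorkKind classification and the names are again invented for this example:

```cpp
class ThreadPool; // the hypothetical, constructor-configured pool sketched earlier

// Classify work so that slow, blocking items can't hold up the fast ones.
enum class WorkKind
{
   FastCompute,    // short, CPU bound, must complete quickly
   SlowBlocking    // database access, remote calls and other blockers
};

// Pick the pool that a work item should be dispatched to. With a single
// pool of N threads, enough slow items to occupy every thread will hold
// up all of the fast items queued behind them; with two pools that can't
// happen.
ThreadPool &SelectPool(
   WorkKind kind,
   ThreadPool &fastPool,      // small, short dispatch timeout
   ThreadPool &slowPool)      // larger, tolerant of long blocking operations
{
   return (kind == WorkKind::SlowBlocking ? slowPool : fastPool);
}
```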

So, my advice to Joe is this: Firstly, allow the user access to all of the tuning parameters that the pool supports; not just the number of threads, but also any timeouts, etc. Default to “sensible” values, write MSDN articles on why you shouldn’t use “foolish” values and then trust the user to do the right thing. Secondly, for shipping products that use the thread pool, allow the user to configure it via an external source, such as a config file. Thirdly, consider adjusting all of the APIs that currently use THE thread pool so that they can also be called with A thread pool, and then allow the user to create and configure additional pools within their process.
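For that third point, the shape of the change is just an extra overload that takes the pool to use; these names are illustrative rather than the CLR’s actual API:

```cpp
class ThreadPool; // a caller-created, caller-tuned pool, as sketched earlier

typedef void (*WorkCallback)(void *context);

// Existing style of API: always uses THE default, process wide pool.
void QueueWorkItem(WorkCallback callback, void *context);

// Suggested addition: the same operation, queued against A caller
// supplied pool instead.
void QueueWorkItem(ThreadPool &pool, WorkCallback callback, void *context);
```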