I've been lazy this week


As I mentioned in an earlier posting I've been working on a tool this week. I'm too lazy to do a job manually and so I decided to write a tool to help me do it...

Note: the deadlock detector mentioned in this blog post is now available for download from www.lockexplorer.com.

The tool is designed to help me track down deadlocks in code. I decided I needed this tool because I wrote a piece about debugging deadlocks in Visual C++ and realised that using trial and error to locate deadlocks in some client code simply wasn't good enough. The trial and error thrash test required that the code under test actually ended up in a deadlock and, of course, the main problem with locating deadlocks is that they often never happen under test and always happen in production on slightly different hardware with slightly different scheduling. Thus the main aim for the tool was to reliably tell me if a program could deadlock, even if it didn't deadlock during that particular run.

Yesterday evening I got to the point where the report coming out of the program could pinpoint potential deadlocks and today I used that information to locate and fix one of the issues in the code. Cool!

The part of the report that I used this morning was this. It shows two sequences of lock acquisition:

Sequence 1:       0x057f5634 @ 1960,  1:[0x0012facc @ 1970], 2:[0x0012fa5c @ 1995], 0x00571d60 @ 2004
Threads: 3904

Sequence 2:    2:[0x0012fa5c @ 2722],    0x057f4e74 @ 2723,  1:[0x0012facc @ 2729], 0x0012f948 @ 2738, 0x00571d60 @ 2741
Threads: 1920

Each of these sequences is used by only one thread, but the two sequences acquire the locks tagged as 1 and 2 in opposite orders. This means that if both threads are executing at the same time they can deadlock each other: thread 3904 acquires lock 1 and, before it can acquire lock 2, thread 1920 acquires lock 2 and then tries to acquire lock 1...
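To make the idea concrete, here is a minimal sketch of the kind of check that spots an inversion like this one. It is not the tool's actual code. The names LockId, Sequence, OrderedPairs and FindInversions are made up for illustration, and it assumes that every lock in a sequence is still held when the later ones are acquired, which a real tool would have to verify rather than assume.

```cpp
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>
#include <vector>

// Hypothetical types: a lock is identified by its address, as in the report,
// and a sequence lists locks in the order one thread acquired them.
using LockId = std::uintptr_t;
using Sequence = std::vector<LockId>;
using LockPair = std::pair<LockId, LockId>;

// Collect every ordered pair (a, b) where a was acquired before b.
// Assumption: a is still held when b is acquired (nested locking).
std::set<LockPair> OrderedPairs(const Sequence &sequence)
{
   std::set<LockPair> pairs;

   for (std::size_t i = 0; i < sequence.size(); ++i)
   {
      for (std::size_t j = i + 1; j < sequence.size(); ++j)
      {
         // Skip recursive acquisition of the same lock.
         if (sequence[i] != sequence[j])
         {
            pairs.emplace(sequence[i], sequence[j]);
         }
      }
   }

   return pairs;
}

// Return the lock pairs that 'first' acquires in one order and 'second'
// acquires in the opposite order. Each such pair is a potential deadlock
// if the two sequences can run on different threads at the same time.
std::vector<LockPair> FindInversions(const Sequence &first, const Sequence &second)
{
   const std::set<LockPair> firstPairs = OrderedPairs(first);
   const std::set<LockPair> secondPairs = OrderedPairs(second);

   std::vector<LockPair> inversions;

   for (const LockPair &p : firstPairs)
   {
      if (secondPairs.count(std::make_pair(p.second, p.first)) != 0)
      {
         inversions.push_back(p);
      }
   }

   return inversions;
}
```

Feeding it the two sequences from the report (just the lock addresses, in order) should flag the single pair 0x0012facc and 0x0012fa5c, which are the locks tagged 1 and 2. Comparing every pair of locks is quadratic in the sequence length, which is fine for short sequences like these but is a simplification rather than a design recommendation.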

The @ XXXX bits are code locations. They can be expanded like this:

Location: 1970
MTSCSS - I:\JetByteTools\Win32Tools\CriticalSection.cpp: 99 - Win32::CCriticalSection::Owner::Owner
MTSCSS - I:\MTSCSS\ConnectionCacheBase.cpp: 300 - CConnectionCacheBase::GetConnection
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 486 - CProtocolHandler::Connect
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 340 - CProtocolHandler::ProcessCommand
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 262 - CProtocolHandler::OnLocalDataReceived
MTSCSS - I:\MTSCSS\ProtocolHandler.cpp: 162 - CProtocolHandler::OnDataReceived
MTSCSS - ..\JetByteTools\IOTools\TProtocolHandlerImpl.h: 114 - IO::TProtocolHandlerExImpl<1,IO::IProtocolHandler>::OnDataReceived
MTSCSS - I:\MTSCSS\SocketServer.cpp: 215 - CSocketServer::ReadCompleted
MTSCSS - I:\JetByteTools\SocketTools\AsyncSocketConnectionManager.cpp: 336 - Socket::CAsyncSocketConnectionManager::HandleOperation
MTSCSS - I:\JetByteTools\SocketTools\AsyncSocket.cpp: 582 - Socket::CAsyncSocket::HandleOperation
MTSCSS - I:\JetByteTools\IOTools\IOPool.cpp: 222 - IO::CIOPool::WorkerThread::Run
MTSCSS - I:\JetByteTools\Win32Tools\Thread.cpp: 149 - Win32::CThread::ThreadFunction
MTSCSS - threadex.c: 212 - _threadstartex

With this information I can locate the potential deadlock and work out how to fix it. The good thing about running the tool is that it has alerted me to this potential deadlock without the deadlock ever having to occur, and this makes it considerably more useful, to me, than, say, John Robbins' Deadlock Detection Tool.

It's been quite hard work to get the tool to this state, and it's not finished yet. When it's finished it will be able to tell me if code will deadlock and I won't have to look through all the code manually to try and work that out... Lazy is good! ;)

4 Comments

Hi Len,
Will you post this new code on www.codeproject.com ?? :):)
Bye,
Franc

I doubt it. I may turn it into a product though...

Sounds great Len! I'd buy this tool. How much?

Another tool I'd like is a way to measure thread "contention" - generally how many locks, how many required a wait, how long, etc. Sometimes I make a change that I think improves this but have no way to measure it. This seems related to what you're working on.

Lately you keep hearing about how important multi-threading is becoming with dual core, etc. so it follows that these types of tools are going to be increasingly in demand too.

BTW, I've been enjoying your blog lately - please keep it up!

pUnk,

I've no idea. There's a fair bit of work to do before it's actually useful. It's currently generating far too much 'noise'; it's only because I know the application inside out that I can pick up on the real issues. That said, I think reducing the noise is reasonably easy to do once I put some more functionality in place...

The current tool could measure lock contention and I'm hoping to add that soon as it will help me to tune our servers more accurately. Though of course the act of measuring the contention may affect it...

I'll post more news about the tool when I've done some more work on it. At that stage I might start looking for some beta testers...
