Bitten by 'scoped static initialisation' in C++ - turn on Visual Studio's C4640 warning!

2014-04-22

Today I discovered that C++ scoped static initialisation (function level) in Visual Studio is not done in a thread safe manner. It’s the kind of thing that I should have already known but I guess I assumed that since namespace level static initialisation was safe so was function level static initialisation. Unfortunately it’s not. If you read the MSDN documentation in a particular way you could decide that the docs say that it’s not; but it’s imprecise and unclear and I doubt that I did read the MSDN documentation anyway. Looking at the generated code from Visual Studio 2013 clearly shows that there’s no thread safety involved.

** test        byte ptr ds:[5C87D0h],1  
 jne         JetByteTools::IO::CBufferChainStoresNulls::InternalAddToFront+77h (0473967h)  
 or          dword ptr ds:[5C87D0h],1**  
 xor         eax,eax  
 mov         dword ptr [esp+1Ch],eax  
 push        2Dh  
 push        56DFE8h  
 mov         ecx,5C87B4h  
 mov         dword ptr ds:[5C87CCh],7  
 mov         dword ptr ds:[005C87C8h],eax  
 mov         word ptr ds:[005C87B8h],ax  
 call        std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t> >::assign (0404130h)  
 push        55C8B0h  
 call        atexit (047E190h)  
 add         esp,4  
 mov         dword ptr [esp+1Ch],0FFFFFFFFh  

void CBufferChainStoresNulls::InternalAddToFront(
 mov         edi,dword ptr [esi+8]  
</wchar_t></wchar_t,std::char_traits<wchar_t>

The code in bold is the one time static initialisation test, check a value and if set jump past the initialisation, else set the value. Clearly multiple threads could pass through this check at the same time and both end up creating and setting the value to the static variable. I was aware of this but had dismissed it as a minor issue which would, at worst, result in a little extra work and an extra copy of an object which would be cleaned up at program exit. There’s a far worse problem though; a second thread could get to the test just after the first thread had set the ‘constructed’ flag but before it had actually completed constructing the variable. The second thread could then use the variable in an partially constructed state. Raymond Chen goes gives more detail on the issue here.

I’m not sure how I got into the habit of using this code pattern but I seem to have used it mostly as an attempt a micro optimisation where constant strings were required. I’m currently scanning my source code trees to remove all traces of it. There are very few situations where it might have been a valid choice and so far it’s easy to remove.

I’m also not really sure how I’d convinced myself that block scoped static constant variables were OK. The fact that I knew they were lazily constructed at first use and the fact that I knew that I was using them as a micro optimisation means that I must have known that there was no compiler injected synchronisation. I guess I just assumed that “magic happens here”… It’s one of those head slapping moments where the issue is SO obvious once you know about it. Anyway, I live and learn.

I guess I’ve been very lucky (or unlucky, you decide) up until now. Most of the scenarios where I’ve used this pattern have been where the object construction was obviously fast enough that I never had two threads in exactly the right places at exactly the right time. So it’s a standard latent race condition… It finally came to my attention because I was using the “micro optimisation” to create some strings in a function that was being called by multiple threads. My build machine’s unit tests were failing, rarely and strangely. I was getting an access violation in code that looked completely fine. Eventually I tracked the problem down by turning off compiler optimisation and generating a mini dump file when I converted the SEH exception for the access violation into a C++ exception… It then took a little while to work out exactly what the problem was.

Luckily I already compile with /Wall and with ‘warnings as errors’. Visual Studio’s Warning C4640 will locate the issue for me and cause my builds to fail until I fix it. All I need to do is remove the warning suppression #pragma that I have for that warning in my global warning suppression header file… Whilst I’m at it, I guess I should take a good look at the other warnings that are being suppressed…