<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Rambling Comments</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/" />
    <link rel="self" type="application/atom+xml" href="http://www.lenholgate.com/blog/atom.xml" />
    <id>tag:www.lenholgate.com,2010-12-10:/blog//12</id>
    <updated>2012-06-08T17:50:13Z</updated>
    
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 5.12</generator>

<entry>
    <title>C++ 11, Concurrency</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2012/06/c-11-concurrency.html" />
    <id>tag:www.lenholgate.com,2012:/blog//12.1185</id>

    <published>2012-06-08T17:25:29Z</published>
    <updated>2012-06-08T17:50:13Z</updated>

    <summary> I&apos;ve been watching Bartosz Milewski&apos;s C++ 11 Concurrency videos and they&apos;re a pretty good way to get up to speed on the new threading support in the latest C++ standard. They start off nice and slowly, for people who...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Geek Speak" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[
<div>
I've been watching <a href="http://bartoszmilewski.com/" target="_blank">Bartosz Milewski's</a> <a href="http://www.youtube.com/watch?v=80ifzK3b8QQ&amp;feature=channel&amp;list=UL" target="_blank">C++ 11 Concurrency videos</a> and they're a pretty good way to get up to speed on the new threading support in the <a href="http://en.wikipedia.org/wiki/C%2B%2B11" target="_blank">latest C++ standard</a>. They start off nice and slowly, for people who haven't been doing concurrency for years, and explain the various new features provided by the language. It's good stuff.
</div>
<div><br /></div>
<div>
I've been reading <a href="http://www.boost.org/users/people/anthony_williams.html" target="_blank">Anthony Williams'</a> <a href="http://www.manning.com/williams/" target="_blank">C++ Concurrency in Action</a> which is a great way to understand the details of what you'll see in the videos. It's a good book and there's lots of useful stuff in there even if you've been writing multi-threaded code for years. The chapters on designing thread-safe data structures, both lock-based and lock-free, were especially interesting. The lock-free chapter is just about scary enough to put most people off rolling their own lock-free data structures, I hope.
</div>
<div><br /></div>
<div>
To someone who's been writing multi-threaded code for over 10 years this sudden interest in concurrency seems a little strange, but if you want to understand why it's becoming more and more important then Herb Sutter's <a href="http://shadow-technologies.tv/video/179" target="_blank">Welcome to the Jungle</a> video is worth watching. If you're interested in C++11 in general then the <a href="http://channel9.msdn.com/Events/GoingNative/GoingNative-2012" target="_blank">excellent Channel 9 coverage of the Going Native conference</a> is also worth your time.
</div>
<div><br /></div>
<div>
And finally, now that I've worked my way through all of that lot, I need to find some more good technical videos to watch whilst I'm working out. Does anyone have any suggestions?
</div> 

 ]]>
        
    </content>
</entry>

<entry>
    <title>The curious case of the missing copy constructor</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2011/09/the-curious-case-of-the-not-missing-copy-constructor.html" />
    <id>tag:www.lenholgate.com,2011:/blog//12.1135</id>

    <published>2011-09-14T08:15:46Z</published>
    <updated>2011-09-14T08:56:53Z</updated>

    <summary> I have a tendency to write unit tests that are a little more invasive than they need to be; these tests make sure that not only are the results as expected but also that as many of the side-effects...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<div>
I have a <a href="http://www.lenholgate.com/blog/2008/04/what-would-i-do.html">tendency to write unit tests that are a little more invasive</a> than they need to be; these tests make sure that not only are the results as expected but also that as many of the side-effects and interactions with other objects are as expected as well. So, for example, in my current <a href="http://www.lenholgate.com/blog/2011/09/the-websocket-protocol---draft-hybi-13.html">WebSockets development</a> for <a href="http://www.serverframework.com/" target="_blank">The Server Framework</a> I have some tests which check that the correct data is delivered to the client of the API that I'm developing and also check that the API interacts with its buffer allocator correctly and doesn't leak memory. The detailed interaction testing sometimes gets in the way of refactoring: the inputs don't change and the outputs don't change, but the interaction does, so a test that should pass before and after a refactoring may fail just because I've optimised the use of a sub-object that happens to be checked in the test. In cases where both levels of testing are useful I sometimes duplicate the test, with and without the detailed interaction examination... Anyway...
</div>
<div><br /></div>
<div>
I have a test which tracks the way a buffer's reference count is modified as the object under test performs an action; it's useful to have tests which prove that under given circumstances you don't have a reference counting leak. The buffer is passed around via a mix of raw pointers and smart pointers and during the test a mock buffer is used which logs all of the operations performed. At one point in the interaction log the buffer is returned by value via a smart pointer and we get an <code>AddRef()</code>, <code>Release()</code> sequence of calls as the temporary is copied into and out of. The compiler is allowed by the C++ standard to elide copy constructors in some situations, see <a href="http://en.wikipedia.org/wiki/Copy_elision" target="_blank">here</a>, so you should be careful that your copy constructor doesn't do anything other than copy the object as you can't guarantee that it will actually be called. If you then allow for the fact that the compiler may opt to use the <a href="http://en.wikipedia.org/wiki/Return_value_optimization">Return Value Optimisation</a> in some circumstances to avoid creating temporaries you should probably start to realise that my invasive test is somewhat fragile, depending on which compiler optimisations are enabled and which compiler is being used...
</div>
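<div><br /></div>
<div>
As a rough sketch of why a test like this is fragile, the snippet below counts copy constructor calls when an object is returned by value. <code>CountedHandle</code>, <code>MakeHandle()</code> and <code>CopiesMadeByReturn()</code> are hypothetical names invented for the illustration; they stand in for the real smart pointer and mock buffer. The count that comes back depends on the compiler, the language standard and the optimisation settings, which is exactly the problem.
</div>

```cpp
// Hypothetical stand-in for the smart pointer used in the test: it counts
// copies in the same way that the mock buffer logs AddRef()/Release() pairs.
struct CountedHandle
{
   static int copies;

   CountedHandle() {}

   CountedHandle(const CountedHandle &) { ++copies; }
};

int CountedHandle::copies = 0;

// Returns a temporary by value; the compiler may elide the copy out of here.
CountedHandle MakeHandle()
{
   return CountedHandle();
}

// Reports how many copy constructor calls one by-value return produced.
// The answer is anywhere from 0 (copies elided) to 2 (no elision at all).
int CopiesMadeByReturn()
{
   CountedHandle::copies = 0;

   CountedHandle handle = MakeHandle();

   (void)handle;   // silence unused variable warnings

   return CountedHandle::copies;
}
```

<div>
A test that asserts an exact value for <code>CopiesMadeByReturn()</code> is really asserting something about the compiler rather than about the code under test.
</div>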
<div><br /></div>
<div>
So, my tests expect to see the copy constructor called and the resulting sequence of <code>AddRef()</code> and <code>Release()</code> calls on the buffer and in debug builds on all supported compilers this is what they see. Unfortunately on release builds the compilers differ... VS2005 and VS2010 RTM both elide the copy constructor (presumably applying RVO) whereas VS2008 and VS2010 SP1 both call the copy constructor. For now I have a rather clunky macro that detects the compiler version and build type and removes the test requirement where necessary. For a while I had some problems differentiating between VS2010 RTM and VS2010 SP1 but luckily <code>_MSC_FULL_VER</code> can be used for that.
</div>
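<div><br /></div>
<div>
The kind of version check described above might look something like the sketch below. This is not the actual macro from the test code; the <code>SKETCH_</code> names and <code>ExpectCopyConstructorCalls()</code> are hypothetical, and the two <code>_MSC_FULL_VER</code> values are the ones that I understand correspond to VS2010 RTM (16.00.30319) and VS2010 SP1 (16.00.40219), so treat them as assumptions to check against your own compiler.
</div>

```cpp
// _MSC_FULL_VER is only defined by Microsoft's compiler; use 0 elsewhere so
// that the sketch still compiles with other toolchains.
#if defined(_MSC_FULL_VER)
#define SKETCH_MSC_FULL_VER _MSC_FULL_VER
#else
#define SKETCH_MSC_FULL_VER 0
#endif

// Assumed values: 160030319 for VS2010 RTM and 160040219 for VS2010 SP1.
#define SKETCH_IS_VS2010_RTM (SKETCH_MSC_FULL_VER == 160030319)
#define SKETCH_IS_VS2010_SP1 (SKETCH_MSC_FULL_VER == 160040219)

// Hypothetical switch that a test could consult before checking for the
// AddRef()/Release() pair: debug builds always expect the copy, release
// builds expect it unless we are on a compiler that was seen to elide it.
inline bool ExpectCopyConstructorCalls(bool debugBuild)
{
   if (debugBuild)
   {
      return true;
   }

   return !SKETCH_IS_VS2010_RTM;
}
```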
<div><br /></div>
<div>
Now, off to learn a little more about RVO and <a href="http://www.synesis.com.au/resources/articles/cpp/movectors.pdf" target="_blank">move constructors</a>.
</div>
<div><br /></div>
<div>
<b>Edit:</b> It seems that it's less compiler related than I thought. One of the projects (the one where the smart pointer template lives) had optimisations turned off in some of the release builds for some of the compilers... More investigation is needed; with any luck all compilers will elide the copy once optimisations are turned on...
</div>]]>
        
    </content>
</entry>

<entry>
    <title>Invasive containers</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/08/invasive-containers.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.972</id>

    <published>2010-08-04T08:28:08Z</published>
    <updated>2011-01-02T15:17:33Z</updated>

    <summary>Rather than immediately dive into the fun of writing my own invasive alternative for std::map I decided to take a look at what has been done before, as expected boost contains something that might work in the shape of the...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Geek Speak" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>Rather than immediately dive into the fun of writing my own <a href="http://www.lenholgate.com/archives/000907.html">invasive alternative</a> for <code>std::map</code> I decided to take a look at what has been done before; as expected, <a href="http://www.boost.org/">boost</a> contains something that might work in the shape of the "<a href="http://www.boost.org/doc/libs/1_43_0/doc/html/intrusive.html">intrusive containers library</a>". </p>

<p>Of course, being part of boost I first have to work out exactly how much more of boost it will require me to depend on and then I have to work out how I can use it to replace my current <code>std::map</code> usage. It seems quite clever (no surprise there) and allows for a type to be included in multiple intrusive containers by allowing an object to have multiple links embedded in it and allowing you to specify the link to use for a particular container. It seems slightly over-engineered for my needs (again, no surprise there) and it's a pity that it doesn't simply provide obvious drop-in replacements for the equivalent STL containers; yes it's nice that there are two kinds of base tree structure (red-black and AVL) so you can optimise for your specific usage patterns but it would be nicer if you could simply switch from <code>std::map</code> to <code>boost::intrusive::map</code> and add an appropriate data member to the class you want to store in it... </p>
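<p>To make the "multiple links embedded in the object" idea concrete, here's a minimal hand-rolled sketch. It is deliberately <i>not</i> the boost intrusive API; <code>Link</code>, <code>Connection</code> and <code>IntrusiveStack</code> are hypothetical names and the container is a simple stack rather than a map. The point is only that the container allocates nothing of its own and that the link to use is chosen per container.</p>

```cpp
#include <cstddef>

// Hypothetical embedded link: the container's bookkeeping lives in the element.
template <typename T>
struct Link
{
   T *pNext;

   Link() : pNext(0) {}
};

// An object that can be in two containers at once, one per embedded link.
struct Connection
{
   int id;

   Link<Connection> activeLink;
   Link<Connection> timeoutLink;

   explicit Connection(int id_) : id(id_) {}
};

// Minimal intrusive stack; the link to use is selected by a template parameter
// so no memory is allocated when an element is pushed.
template <typename T, Link<T> T::*LinkMember>
class IntrusiveStack
{
   public :

      IntrusiveStack() : m_pHead(0), m_size(0) {}

      void Push(T &item)
      {
         (item.*LinkMember).pNext = m_pHead;
         m_pHead = &item;
         ++m_size;
      }

      // Returns the most recently pushed element, or 0 if the stack is empty.
      T *Pop()
      {
         T *pItem = m_pHead;

         if (pItem)
         {
            m_pHead = (pItem->*LinkMember).pNext;
            (pItem->*LinkMember).pNext = 0;
            --m_size;
         }

         return pItem;
      }

      std::size_t Size() const { return m_size; }

   private :

      T *m_pHead;
      std::size_t m_size;
};

typedef IntrusiveStack<Connection, &Connection::activeLink> ActiveConnections;
typedef IntrusiveStack<Connection, &Connection::timeoutLink> TimeoutConnections;
```

<p>The trade-off is that the element has to know, at compile time, how many containers it might live in; that's the price of avoiding the per-node allocation.</p>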

Still, I guess I should spend some more time learning and understanding... One thing that it doesn't do for me, as far as I can see, is provide a multimap from which I can remove all elements at a specific key in one go...]]>
        
    </content>
</entry>

<entry>
    <title>STL allocators, hmm...</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/08/stl-allocators-hmm.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.971</id>

    <published>2010-08-04T06:17:15Z</published>
    <updated>2011-01-02T15:17:20Z</updated>

    <summary>As I mentioned a while ago, I have some code which needs to perform better than it currently does and one of the areas that could be improved upon is the amount of contention for the heap that&apos;s occurring. The...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Geek Speak" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[As I mentioned <a href="http://www.lenholgate.com/archives/000907.html">a while ago</a>, I have some code which needs to perform better than it currently does and one of the areas that could be improved upon is the amount of contention for the heap that's occurring. The fact that I'm using an STL <code>map</code> for my collection means that the class has a <a href="http://www.lenholgate.com/archives/000908.html">'big C'</a> contention value of <b>C(n threads using the heap)</b> rather than <b>C(n threads using the object)</b>. Of course, the fact that allocations need to be done at all is an unfortunate feature of <code>std::map</code> but rather than immediately replace the container with an invasive collection I decided to look into replacing the STL allocator that's being used with one that uses a private heap so  that I could reduce the contention value of the allocations to <b>C(n threads using the object)</b>. Doing this has required that I take a look at STL allocators and my initial thought is "hmm..."]]>
        <![CDATA[<p>STL containers can be configured to use custom allocators but by default they use a default allocator which, generally, uses the heap to allocate and free memory in a thread safe way. An allocator is a template parameter to the container and, as pointed out in several references, you don't generally need to mess with them. Still, my requirements are to improve performance and part of that job is to reduce potential contention so the allocators need to be looked at.
My reference books didn't adequately cover allocators, though <a href="http://www.amazon.co.uk/gp/product/0201309564?ie=UTF8&amp;tag=ramcom-21&amp;linkCode=as2&amp;camp=1634&amp;creative=19450&amp;creativeASIN=0201309564">Generic Programming and the STL</a> did explain it all in a rather terse manner with lots of "you don't need to know this" provisos.</p>
 
<p>A quick Google led me to <a href="http://www.tantalon.com/pete.htm">Pete Isensee's</a> pages where he talks about STL allocators for games programming. He has a nice set of slides and some code and this has helped me get started. </p>

<p>I can't help thinking that allocators weren't really thought through especially well; the whole 'rebind' from <code>allocator&lt;int&gt;</code> to <code>allocator&lt;node&gt;</code> is, IMHO, just pointless 'aren't we clever with templates' crap ;) especially given the fact that at the end of the day all these things are doing is allocating memory of a given size... </p>
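<p>For reference, here's a minimal sketch of an allocator plugged into a <code>std::map</code>. It assumes a C++11 compiler, where <code>std::allocator_traits</code> supplies the 'rebind' machinery so the allocator itself can stay small. <code>CountingAllocator</code>, <code>g_allocations</code> and <code>CountedMap</code> are hypothetical names, and <code>malloc()</code>/<code>free()</code> stand in for the private heap that a real implementation would use.</p>

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical counter so that we can see the container calling the allocator.
static std::size_t g_allocations = 0;

template <typename T>
struct CountingAllocator
{
   typedef T value_type;

   CountingAllocator() noexcept {}

   // The container rebinds from allocator<pair> to allocator<node>, so the
   // allocator must be constructible from an allocator of another type.
   template <typename U>
   CountingAllocator(const CountingAllocator<U> &) noexcept {}

   T *allocate(std::size_t n)
   {
      ++g_allocations;

      // malloc() is a stand-in; a private heap would be used here instead.
      void *pMemory = std::malloc(n * sizeof(T));

      if (!pMemory)
      {
         throw std::bad_alloc();
      }

      return static_cast<T *>(pMemory);
   }

   void deallocate(T *pMemory, std::size_t) noexcept
   {
      std::free(pMemory);
   }
};

// All instances are interchangeable as they share no per-instance state.
template <typename T, typename U>
bool operator==(const CountingAllocator<T> &, const CountingAllocator<U> &) noexcept
{
   return true;
}

template <typename T, typename U>
bool operator!=(const CountingAllocator<T> &, const CountingAllocator<U> &) noexcept
{
   return false;
}

typedef std::map<
   int,
   int,
   std::less<int>,
   CountingAllocator<std::pair<const int, int> > > CountedMap;
```

<p>Each insertion into a <code>CountedMap</code> ends up in <code>allocate()</code> with the size of the map's internal node type rather than the size of the pair, which is what the rebind is for.</p>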

Anyway, being able to plug my own allocator into the containers is one small step on the path to improving the performance of the objects in question.]]>
    </content>
</entry>

<entry>
    <title>Speeding up C++ builds</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/03/speeding-up-c-builds.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.945</id>

    <published>2010-03-01T11:49:37Z</published>
    <updated>2011-01-02T13:19:59Z</updated>

    <summary>I stumbled on an idea for speeding up C++ builds the other day and it&apos;s not something that I&apos;ve considered before and it really does offer a considerable speed up so I think it may be worth considering in some...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>I stumbled on an idea for speeding up C++ builds the other day and it's not something that I've considered before and it really does offer a considerable speed up so I think it may be worth considering in some situations. It has downsides which make it harder to use with my default style of code structuring but the increase in build speed is tempting...</p>

<p>The idea is that of "Unity builds" which I discovered from an <a href="http://stackoverflow.com/questions/2251212/how-to-improve-visual-c-compilation-times/2307039#2307039">answer</a> by <a href="http://cheind.wordpress.com/">Christoph Heindl</a> on Stack Overflow about how to speed up Visual Studio builds. Christoph refers to this blog entry; <a href="http://buffered.io/2007/12/10/the-magic-of-unity-builds/">The Magic of Unity Builds</a> which explains that due to disk accesses it's more efficient for a compiler to compile a single .cpp file that includes all of your other .cpp files and that this has a major effect on compilation time. Others suggest that this also results in better optimisation due to the fact that all of the code is visible to the optimiser at the same time, but this may not make that much difference to you if you have link time code generation and whole program optimisation turned on.</p>

<p>Anyway, I gave this a go for one of my libraries and it does indeed dramatically improve compile and link times when rebuilding the library. Of course that's the problem though, it always results in a complete rebuild. Because of this I don't see that it's especially suitable for day to day development where my "no precompiled headers" <a href="http://www.serverframework.com/ServerFramework/latest/Docs/group___precomp.html">build configuration</a> means that I only ever have to rebuild the code that has changed. However the Unity Build is faster than my complete rebuild with precompiled headers enabled so it might be useful for the integration builds on my build servers. Since my build servers also test that all of the normal build configurations work, and since I don't see me ever moving away from a standard build, I would only be able to use this technique for the integration builds for all compilers and platforms during the development process and I'd have to use my current method of compilation for the final test builds before release.</p>

<p>The problem with Unity Builds is that you end up with a single C++ "<a href="http://www.efnetcpp.org/wiki/Translation_unit">translation unit</a>" rather than a series of translation units (one per source code file). Translation units are important in C++ compilation and there are some practices that rely on each file being a separate translation unit. In <a href="http://buffered.io/2007/12/10/the-magic-of-unity-builds/">The Magic of Unity Builds</a>, OJ refers to the problems that arise from merging translation units as being due to dodgy coding practices. I disagree here. The main problem that I've come across so far is that static variables and functions are, of course, local to the translation unit in which they're defined. With the single translation unit "unity build" approach you effectively lose the ability to declare static variables and functions. This is quite a big issue for me as I like to limit visibility as much as possible so functions that are used internally by a class and that do not need to be member functions are often implemented as file level static functions. Sure they could be implemented as class level, private, static functions but this exposes them to the header file and thus causes more code to require compiling if they're changed. Likewise file level static variables are sometimes useful; rarely as variables but often as constants. The unity build approach means that you need to be careful with naming to avoid name clashes from your static functions. This may be a big enough issue for me that I won't use Unity Builds no matter how much they improve the speed of my integration test builds.  </p>
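<p>To illustrate the clash: if two source files each define a file level <code>static int Adjust(int)</code> then a unity file that includes both sees two definitions of the same function in one translation unit and fails to compile. One possible workaround, sketched below, is to wrap each file's statics in a uniquely named namespace. This is just an assumption about how you might deal with it, not something that I've adopted, and the file names, namespaces and functions are all hypothetical. The two "files" are shown in a single block, which is effectively what the compiler sees in a unity build.</p>

```cpp
// --- contents of a hypothetical Widget.cpp ---

namespace WidgetDetail
{
   // Still static, so a normal per-file build keeps internal linkage.
   static const int s_factor = 2;

   static int Adjust(int value) { return value * s_factor; }
}

int WidgetSize(int value)
{
   return WidgetDetail::Adjust(value);
}

// --- contents of a hypothetical Gadget.cpp ---

namespace GadgetDetail
{
   // Same names as in Widget.cpp; the namespace prevents the clash when
   // both files are included into a single unity translation unit.
   static const int s_factor = 10;

   static int Adjust(int value) { return value + s_factor; }
}

int GadgetSize(int value)
{
   return GadgetDetail::Adjust(value);
}
```

<p>It works, but you're now relying on a naming convention to do what the translation unit boundary used to do for free.</p>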

<p>Another issue with Unity Builds is the additional build configurations required. To be of value to me during integration testing I would need to be able to build a Unity Build of each of my 4 standard build configurations (Debug, Unicode Debug, Release, Unicode Release). That means 8 more configurations to manage (4 for x86 and 4 for x64). Because of this, and because I expect that I'd want to strip the Unity Build from my released source code distribution, I expect that I'd place all of the Unity Builds in their own project file. So now each library would have two project files per compiler and an additional solution file that pulls them all together. These could be automatically generated from the standard project and solution files.</p>

<p>Finally there's the maintenance of the "Unity" file itself. Again this could be automatically generated, but it's likely to be more complex than simply creating a file that includes every .cpp file in the project. I'm sure that some of my libraries will have enough code that they will blow some internal compiler limits when compiled as a Unity Build. This means that I'll need some way of splitting the Unity file into several files to get around compiler limits - supporting 5 versions of Visual Studio is likely to make this more complex.</p>

The maintenance of the Unity build can be automated but the problems of using a single translation unit could well mean that I don't use Unity Builds for my library source code. Unfortunately the single translation unit issues also mean that the approach is also less likely to work in arbitrary client projects which desperately need to have their compilation times reduced - I'm sure there are lots of dubious coding practices that will make single translation unit builds impossible without code changes. That said it's a potentially useful technique and one that I hadn't heard of before. Certainly something to spend a bit more time exploring...]]>
        
    </content>
</entry>

<entry>
    <title>The most important C++ stuff, ever...</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2006/09/the-most-important-c-stuff-ever.html" />
    <id>tag:www.socketframework.com,2006:/blog//12.714</id>

    <published>2006-09-08T11:11:45Z</published>
    <updated>2010-12-27T11:58:47Z</updated>

    <summary>I&apos;m still skiing in Argentina, the training is going well and within 3 weeks I&apos;ll know if I make the grade and qualify as a BASI Ski Instructor... Because of all the skiing and partying and work out here I...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>I'm still skiing in <a href="http://www.megeveski.com/">Argentina</a>, the training is going well and within 3 weeks I'll know if I make the grade and qualify as a BASI Ski Instructor...</p>

<p>Because of all the skiing and partying and work out here I haven't been keeping up with many technical issues but this morning I checked bloglines and picked a couple of random feeds to catch up on. One of them was the Artima C++ Source feed which has recently published  5 articles by Scott Meyers. These articles are five lists of five of the "Best C++ ... ever", one on <a href="http://www.artima.com/cppsource/top_cpp_books.html">books</a>, one on <a href="http://www.artima.com/cppsource/top_cpp_publications.html">other publications</a>, one on <a href="http://www.artima.com/cppsource/top_cpp_publications.html">software</a>, one on <a href="http://www.artima.com/cppsource/top_cpp_publications.html">people</a> and one on <a href="http://www.artima.com/cppsource/top_cpp_aha_moments.html">aha! moments</a>. They're all worth reading.</p>

Now, must finish breakfast and head off to the slopes again.]]>
        
    </content>
</entry>

<entry>
    <title>C++ Tips: 4 - Learn to work in terms of abstractions, no matter how small</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2006/01/c-tips-4---learn-to-work-in-terms-of-abstractions-no-matter-how-small.html" />
    <id>tag:www.socketframework.com,2006:/blog//12.661</id>

    <published>2006-01-06T22:10:24Z</published>
    <updated>2010-12-26T07:09:38Z</updated>

    <summary>In the fight to make C++ code easier to reason about and understand, never underestimate the value of a name. Giving something a decent name is the first step in thinking about the concept at a slightly more abstract level....</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        In the fight to make C++ code easier to reason about and understand, never underestimate the value of a name. Giving something a decent name is the first step in thinking about the concept at a slightly more abstract level. Abstraction is all about selective forgetfulness, by grouping together several related program elements, defining a concept and giving that concept a name you can, from then on, choose to work at the level of the name rather than the detail. This has a marvellous effect on the amount of information that you can hold in your head at the same time and is a very powerful aid to communication, both between programmers and between the code and the programmer.
        <![CDATA[<p>Probably the simplest form of abstraction available in C++ is the <code>typedef</code>, it's just another name for something but I'm often amazed at how some programmers fail to capitalise on the power of good naming. I often see code like this:</p>
<pre class="brush: cpp gutter: false">std::list&lt;std::pair&lt;std::string,std::string&gt; &gt; tokens;
std::pair&lt;std::string,std::string&gt; token;</pre>
rather than the preferable:
<pre class="brush: cpp gutter: false">typedef std::pair&lt;std::string, std::string&gt; Token;
typedef std::list&lt;Token&gt; Tokens;
  
Tokens tokens;
Token token;
</pre>
<p>Even in the small code snippet above the use of a tiny abstraction, a name, makes the code easier to reason about. A <code>Token</code> is a pair of strings and <code>Tokens</code> is a list of <code>Token</code> data. The tiny abstraction allows us to be explicit about the relationship between the <code>std::pair&lt;std::string, std::string&gt;</code> that's used in the declaration of tokens and the corresponding use in the declaration of token. It's the same usage, the same abstraction; we call it a "Token". In real code, you'd no doubt manipulate this data a little more. Without the simple abstraction of the <code>typedef</code> the code required to iterate through that list of tokens would look something like this:</p>

<pre class="brush: cpp gutter: false">for(std::list&lt;std::pair&lt;std::string, std::string&gt; &gt;::const_iterator it = tokens.begin() ...
</pre>
<p>What should communicate quite clearly that you want to iterate over a series of tokens is, instead, full of implementation details that force you to fully understand, and reason about, what constitutes a token. The simplest abstraction, a name, moves the detail into "need to know" territory and reduces the amount of reasoning that you <i>must</i> do when working with the concept. Now, as I've said before, <a href="http://www.lenholgate.com/archives/000496.html">typedefs aren't perfect</a>, but as a step towards abstraction they're valuable.</p>
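<p>For comparison, here's a small sketch of the same kind of loop written in terms of the <code>typedef</code>s. <code>CountTokensNamed()</code> is a hypothetical function invented for the example; the loop header now says "iterate over tokens" and nothing more.</p>

```cpp
#include <cstddef>
#include <list>
#include <string>
#include <utility>

typedef std::pair<std::string, std::string> Token;
typedef std::list<Token> Tokens;

// Counts the tokens whose first string matches the supplied name.
std::size_t CountTokensNamed(const Tokens &tokens, const std::string &name)
{
   std::size_t count = 0;

   for (Tokens::const_iterator it = tokens.begin(); it != tokens.end(); ++it)
   {
      if (it->first == name)
      {
         ++count;
      }
   }

   return count;
}
```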

<p>I have a rule of thumb that has served me quite well. As soon as you need to start worrying about the C++ parse problem that requires a space between the closing angle brackets of a template that is a template on a template (<code>std::list&lt;foo&lt;bar&gt; &gt;</code>) then you're missing a name for the inner template...</p>

<p>The token in the example above might be better off being defined as a specific kind of structure as it would allow us to convey more meaning and work at the level of the abstraction rather than at the level of the implementation. Using the token shown above we'd end up with code that operated in terms of the <code>first</code> and <code>second</code> elements of the <code>std::pair</code> template. Code like this:</p>
<pre class="brush: cpp gutter: false">DoThing(token.first);
  
DoSomethingElse(token.second);
</pre>
<p>The code above shows the standard names of the two elements of the <code>pair</code>. Whilst <code>pair</code> is useful it's often worth the time and effort to declare your own structure that's more suitable for your particular abstraction, even if it only has two elements. Once again the power of a name should not be underestimated. For the time taken to declare something like this:</p>
<pre class="brush: cpp gutter: false">struct Token 
{
   std::string name;
   std::string value;
};
</pre>
<p>You end up with code that reads better, communicates with the programmer and stays at the level of the abstraction rather than jarringly displaying implementation details. Although it's said that good programmers are <a href="http://blog.outer-court.com/archive/2005-08-24-n14.html">lazy</a>, failure to apply a little abstraction where it's required is the wrong kind of laziness. Amazingly, I've seen code that has built up some quite complex structures using nested pairs. Not surprisingly, the level of communication between code and programmer from a line that reads: <code>DoStuff(first.second-&gt;first.first-&gt;second);</code> is fairly low... The problem is that it's often not quite so obvious that the communication provided by a single pair is often just as low.</p>
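<p>As a small sketch of the difference that the named members make, consider a function that uses the structure. <code>Describe()</code> is a hypothetical helper; compare its body with what it would look like in terms of <code>first</code> and <code>second</code>.</p>

```cpp
#include <string>

struct Token
{
   std::string name;
   std::string value;
};

// Formats a token as "name=value"; the member names document themselves.
std::string Describe(const Token &token)
{
   return token.name + "=" + token.value;
}
```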

<p>As we've seen, naming data and groups of data can help build simple abstractions. Another way to create an abstraction is to move small pieces of code into simple functions. In my opinion, even if a function is only called from one place it's often worth having simply for the fact that the detail is hidden behind a name. Once again by naming something you can work at the level of the name, the concept, rather than at the level of the detail; you can drill down when you need to. </p>

<p><img align="right" alt="RouteMaster.jpg" src="http://www.lenholgate.com/blog/images/RouteMaster.jpg" />Working in terms of abstractions usually means thinking in terms of the problem that you're solving rather than in terms of the way that you're solving it. When you're doing this you'll often find that you don't tend to use 'raw' types that often. The abstraction that is a <code>std::map</code> lives in the realm of the solution whereas a "WidgetCollection" lives in the realm of the problem. What's more, and I've said this <a href="http://www.lenholgate.com/archives/000277.html">before</a>, the abstraction of the <code>std::map</code>, or, indeed, most 'standard' abstractions, is too general to provide maximum value when working in the problem domain. A more precise, more focused abstraction allows you to work at a higher level. Just as "vehicle" is a valuable concept, the concept of a "bus" is more precise, and the concept of a <a href="http://www.greyhound.com/">greyhound bus</a> or a <a href="http://www.routemaster.org.uk/">routemaster</a> even more so. If you are forced to communicate in terms of "vehicle" the whole time when referring to a Routemaster then you need to continually remind your audience that you should only enter through the opening at the rear and that it can't fly and can't travel on water... The more specialised the concept that you're working with the more precise your understanding of the operations that you can perform on it and the clearer it communicates its purpose.</p>
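<p>A minimal sketch of what that might look like in code: a "WidgetCollection" that happens to be implemented with a <code>std::map</code> but which only exposes operations that make sense for widgets. The <code>Widget</code> structure and the member functions shown are hypothetical, chosen purely for the illustration.</p>

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical problem domain type.
struct Widget
{
   std::string name;
   int weight;
};

class WidgetCollection
{
   public :

      // Adds the widget, replacing any existing widget with the same name.
      void Add(const Widget &widget)
      {
         m_widgets[widget.name] = widget;
      }

      bool Contains(const std::string &name) const
      {
         return m_widgets.find(name) != m_widgets.end();
      }

      std::size_t Count() const
      {
         return m_widgets.size();
      }

      int TotalWeight() const
      {
         int total = 0;

         for (Widgets::const_iterator it = m_widgets.begin(); it != m_widgets.end(); ++it)
         {
            total += it->second.weight;
         }

         return total;
      }

   private :

      // The map is an implementation detail that callers never see.
      typedef std::map<std::string, Widget> Widgets;

      Widgets m_widgets;
};
```

<p>Callers can't accidentally use the full <code>std::map</code> interface, and the choice of container can change without touching them.</p>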

<p>Of course, as <a href="http://www.joelonsoftware.com/">Joel Spolsky</a> once pointed out in "<a href="http://www.joelonsoftware.com/articles/LeakyAbstractions.html">The Law of Leaky Abstractions</a>", abstractions tend to be "leaky" and you'll often find that you need to be able to work in terms of their component parts to be able to program effectively, but, and it's a very big but, that doesn't remove the value of the abstraction and small, simple, abstractions can often be as valuable and powerful as larger ones.</p>

<p>In my opinion, being able to work at different levels of abstraction is one of the most important skills of a good programmer. Being able to build these abstractions from simpler abstractions is as important as being able to drill down through them. In fact, I think it's more important. Often, programmers who can drill deeply through abstractions seem to focus on this at the expense of building their own abstractions. I agree that being able to cut through the names and concepts, breaking them apart into their component parts and drilling further and further until you end up at the assembler level, is amazingly useful when you have a nasty bug to chase, or when an API doesn't function quite how you, or the documentation, expect. It's not <i>always</i> necessary to go all the way down to the "metal"; it's just often useful if you can. Joel alludes to this in his "<a href="http://www.joelonsoftware.com/articles/ThePerilsofJavaSchools.html">The Perils of Java Schools</a>" piece; the more layers that you understand the more you can reason about your abstractions on multiple levels at the same time, switching effortlessly from the use of a concept to the detail of the concepts that make up the abstraction that you're working with and those below it. However, if you focus on this at the expense of learning how to build your own abstractions then you're missing out and your code will only communicate at a very detailed and complicated level.</p>

Never underestimate the power of a good name. Simple abstractions are as important as the big and complex abstractions. Once you start building simple abstractions you'll find that they begin to build upon themselves. Your code will begin to communicate at multiple levels at once and you will be able to decide which level of detail is appropriate for each situation.]]>
    </content>
</entry>

<entry>
    <title>C++ Tips: 3 - Strive to be const correct</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2006/01/c-tips-3---strive-to-be-const-correct.html" />
    <id>tag:www.socketframework.com,2006:/blog//12.658</id>

    <published>2006-01-04T22:34:44Z</published>
    <updated>2010-12-26T07:06:49Z</updated>

    <summary>Another extremely powerful tool that you can use to ensure that your C++ code communicates as clearly as possible is const. By correctly using const all the time when designing your abstractions you can divide an object&apos;s interface into two...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>Another extremely powerful tool that you can use to ensure that your C++ code communicates as clearly as possible is <code>const</code>.  By correctly using <code>const</code> all the time when designing your abstractions you can divide an object's interface into two smaller, easier to understand interfaces; one which does change the object's internal state and one which doesn't. By correctly using <code>const</code> all the time when defining constants and variables you can clearly communicate which are which.</p>

Strive to be <a href="http://www.lenholgate.com/archives/000283.html">const correct</a>.]]>
        <![CDATA[<p><code>const</code> isn't just for passing objects to functions as references that can't be changed:</p>
<pre class="brush: cpp gutter: false">CWidget &amp;CWidgets::Get(const std::string &amp;name)
</pre>
<p>Although looking at much C++ code you might think it was.</p>

<p>The first, and simplest, use of <code>const</code> to improve code clarity is simply using it when defining variables that never change, constants. Say we have a function that calls a few other functions to obtain the data that it needs before doing its main work. If that data isn't changed by the function then it should be declared as <code>const</code>. This clearly communicates that the function doesn't change the data later on and it allows the compiler to protect you from accidental attempts to change the data. Seeing <code>const</code> means that you can mentally flag the value as a constant and that's one less thing to worry about. If a value would normally be set using an <code>if</code> statement but the result should be <code>const</code> then consider using the <a href="http://www.lenholgate.com/archives/000445.html">conditional operator </a>(?:) instead.</p>
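<p>A minimal sketch of the idea, using a made-up <code>DescribeCount()</code> helper (the name and the strings are purely illustrative):</p>
<pre class="brush: cpp gutter: false">#include &lt;string&gt;

// Hypothetical helper, for illustration only.
std::string DescribeCount(
   const int count)
{
   // Set once with the conditional operator so that the result can be const;
   // an if/else would force us to declare a non-const variable first.
   const std::string noun = (count == 1) ? "widget" : "widgets";

   const std::string description = std::to_string(count) + " " + noun;

   // noun = "thing";   // would not compile: noun is const

   return description;
}
</pre>
<p>Everything that the function works with is visibly constant, so there's nothing to track as you read down the body.</p>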

<p>Likewise, if parts of your object can never be changed after construction then they should be <code>const</code> so that a) they communicate the fact that they're immutable and b) they can't be changed by mistake. It's relatively simple to do: you burn some of the business rules into the code and you prevent bugs. These parts of the object are not variable; they're constant. If your object represents a trade and a trade has an ID that is assigned once and never ever changes then the trade ID should probably be constant within the trade object. The one downside is that you can't assign to the object, as the left hand side object in the assignment can't be changed because it contains constant data. In practice I haven't found this to be a limiting factor in most cases; if it is then you can either use the <a href="http://www.possibility.com/epowiki/Wiki.jsp?page=BridgePatterns">"handle/body" idiom</a> to allow you to maintain the constantness of the body whilst allowing assignment between handles, or simply always work in terms of pointers or references to the object in question.</p>
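<p>Something like the following sketch shows both the benefit and the downside. The <code>CTrade</code> class and its members are hypothetical, invented just for this example:</p>
<pre class="brush: cpp gutter: false">#include &lt;string&gt;
#include &lt;type_traits&gt;

// Hypothetical trade class, for illustration only.
class CTrade
{
   public :

      CTrade(
         const int id,
         const std::string &amp;counterparty)
         :  m_id(id),
            m_counterparty(counterparty)
      {
      }

      int GetID() const
      {
         return m_id;
      }

      const std::string &amp;GetCounterparty() const
      {
         return m_counterparty;
      }

      void SetCounterparty(
         const std::string &amp;counterparty)
      {
         m_counterparty = counterparty;
      }

   private :

      const int m_id;               // assigned once, in the constructor
      std::string m_counterparty;   // may change during the life of the trade
};

// The const member means that the compiler can't generate copy assignment,
// although the object can still be copy constructed.
static_assert(!std::is_copy_assignable&lt;CTrade&gt;::value, "CTrade is not assignable");
static_assert(std::is_copy_constructible&lt;CTrade&gt;::value, "CTrade can be copied");
</pre>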

<p>Finally you should use <code>const</code> in the interfaces to your classes so that you separate the interface into two, one that will modify the object and one that cannot. If a method is a query on the object and simply returns details of the object's state and doesn't change that state then the method should be declared <code>const</code>.</p>
<pre class="brush: cpp gutter: false">bool CLocationManager::Contains(
   LocationIndex location) const
{
   return m_callStacks.find(location) != m_callStacks.end();     
}
</pre>
<p>This serves three purposes; i) it communicates precisely what's going on, ii) it allows the compiler to prevent accidental mistakes within the body of the method, iii) it allows the method to be used on a constant instance of the object (which is extremely important if you're correctly declaring your constants to be <code>const</code>!).</p>
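<p>To illustrate the third point, here's a cut-down sketch. I'm assuming that <code>LocationIndex</code> is an <code>int</code> and that call stacks are plain strings, and <code>Add()</code> and <code>IsKnown()</code> are invented for the example:</p>
<pre class="brush: cpp gutter: false">#include &lt;map&gt;
#include &lt;string&gt;

typedef int LocationIndex;   // assumption: the real type isn't shown here

class CLocationManager
{
   public :

      // Part of the "adjustment" interface; not const.
      void Add(
         LocationIndex location,
         const std::string &amp;callStack)
      {
         m_callStacks[location] = callStack;
      }

      // Part of the query interface; const.
      bool Contains(
         LocationIndex location) const
      {
         return m_callStacks.find(location) != m_callStacks.end();
      }

   private :

      typedef std::map&lt;LocationIndex, std::string&gt; CallStacks;

      CallStacks m_callStacks;
};

// Hypothetical caller that only has a read-only view of the manager.
bool IsKnown(
   const CLocationManager &amp;manager,
   LocationIndex location)
{
   // manager.Add(location, "...");   // would not compile: Add() is not const

   return manager.Contains(location);
}
</pre>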

<p>The correct use of <code>const</code> is contagious. Once you get into the habit of using <code>const</code> to clearly declare your constants you'll find that you <i>have</i> to use <code>const</code> to correctly define your interfaces. </p>

<p>If you have a class that holds a reference to a collection that it never changes then this reference should be <code>const</code>. Once it is <code>const</code> the interface on the class <i>must</i> be correctly defined to provide both a query interface that is <code>const</code> and an "adjustment" interface that is not <code>const</code>. Part of the query interface might look like this:</p>
<pre class="brush: cpp gutter: false">const CFunctionStack &amp;CLocationManager::Get(
   LocationIndex location) const
{
   FunctionStacks::const_iterator it = m_functionStacks.find(location);

   if (it == m_functionStacks.end())
   {
      throw CException(_T("CLocationManager::Get()"), _T("Location: ") + ToString(location) + _T(" not found"));
   }

   return *(it-&gt;second);
}
</pre>
<p>Notice how this allows us to both limit what the function can do to the object and clearly communicate that limitation. The function can be used on a constant collection of locations as it cannot change that collection. Likewise it returns a constant reference to one of its contained objects. The caller has a "read-only" view of the collection. You can tell, quickly and accurately by looking at either the object in question or the start of the function if data will be changed.</p>

<p>As I've said <a href="http://www.lenholgate.com/archives/000283.html">before</a>, being const correct means you need to think a little bit more when you write the code but you can trade that for thinking a little bit less when you read the code. Since code is read more times than it's written it's well worth being disciplined when you write code.</p>

<i><b>Updated 5/1/6 : for the chapter and verse on const correctness see the <a href="http://www.parashift.com/c++-faq-lite/const-correctness.html">C++ FAQ Lite</a>.</b></i>]]>
    </content>
</entry>

<entry>
    <title>C++ Tips: 2 - Avoid designing undefined behaviour</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2006/01/c-tips-2---avoid-designing-undefined-behaviour.html" />
    <id>tag:www.socketframework.com,2006:/blog//12.657</id>

    <published>2006-01-03T19:45:53Z</published>
    <updated>2010-12-26T07:05:10Z</updated>

    <summary>When designing code it&apos;s often easy to include undefined behaviour. The need for code that exhibits this kind of behaviour is, however, generally pretty rare and there are often ways around allowing undefined behaviour. In general it&apos;s usually best to...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[When designing code it's often easy to include <a href="http://en.wikipedia.org/wiki/Undefined_behavior" rel="nofollow">undefined behaviour</a>. The need for code that exhibits this kind of behaviour is, however, generally pretty rare and there are often ways around allowing undefined behaviour. In general it's usually best to try to avoid undefined behaviour and instead be clear about exactly what happens under all usage conditions.]]>
        <![CDATA[<p>Undefined behaviour can be part of your contract with the client: you expect them to adhere to their side of the contract and if they do then you will keep your side; if they don't then all bets are off. A common example is where you have clients who will trade improved performance for less safety and checking. Your code assumes that they know what they're doing and offers no guarantees as to what happens if they don't. For example, it may be possible for the client to determine the number of elements in a collection and do their own bounds checking so that they never ask for out of bounds data; if your only data access method always provides bounds checking then you are exacting a performance penalty from those who always access you with valid parameters. Skipping the bounds checking makes your code fractionally faster (which may make all the difference when used in a tight loop) but introduces risk because you are assuming that input data is valid when it may not be. In general if you really think that you need to provide a method that can exhibit undefined behaviour under certain error conditions then it's probably best to also include a version that behaves in a controlled way under the same error conditions. For an example see <code>std::basic_string</code> and its <code>at()</code> and <code>operator[]</code> methods which both provide indexed access to the data. <code>at()</code> checks the supplied index and throws an <code>out_of_range</code> exception on an invalid index, whereas <code>operator[]</code> need not perform any index checking and exhibits undefined behaviour when supplied an invalid index. See "<a href="http://www.artima.com/cppsource/deepspace.html" rel="nofollow">Contract Programming 101</a>" by Matthew Wilson for more details.</p>
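<p>The difference between the two access methods can be sketched as follows; the two helper functions are hypothetical and exist only to show the contrast:</p>
<pre class="brush: cpp gutter: false">#include &lt;stdexcept&gt;
#include &lt;string&gt;

// Hypothetical helper: returns true if at() rejected the index.
bool IndexIsRejected(
   const std::string &amp;text,
   const std::string::size_type index)
{
   try
   {
      (void)text.at(index);   // checked: defined behaviour for any index

      return false;
   }
   catch (const std::out_of_range &amp;)
   {
      return true;
   }
}

// Unchecked access; the caller promises that index is less than text.size().
// Breaking that promise is where the undefined behaviour comes in.
char UncheckedCharAt(
   const std::string &amp;text,
   const std::string::size_type index)
{
   return text[index];
}
</pre>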

<p>The thing about undefined behaviour is that, by definition, it can be anything. This makes it difficult to write unit tests for. Sure, it's possible to have the 'undefinedness' be something that integrates with your testing framework in debug builds but it's unlikely that you'd want it to do this in your release builds; especially if the purpose of the undefined behaviour is to improve performance by not checking things and trusting the caller. So, functions that include undefined behaviour in their contracts are hard to unit test under failure conditions that give rise to the undefined behaviour. You can write a test that exercises the code with a contractually correct usage pattern but you can't prove that the code fails when and how it should because you can't know what such a failure will do. </p>

<p>Obviously there may be other reasons that you either cannot or would prefer not to fail in a defined manner but they're generally few and far between (unless, perhaps, you're working on high performance generic container implementations). By all means think about providing an unchecked, undefined on failure, version of a function but don't bother implementing it or using it until your profiling has shown that the code that you think requires it actually does. With fine grained unit tests you should be able to determine the actual performance requirements reasonably quickly by unit testing the user of the code under the kind of situations that you expect performance to be critical. If you attempt to avoid undefined behaviour for as long as possible then you may often find that you don't actually really need it.</p>

<p>I know some people will say that there's nothing really to worry about because they can include an <code>assert</code> to validate that the caller is sticking to the contract. This is true but if the checking code remains in all builds of the code then the code need no longer exhibit undefined behaviour and if the checking code is not included in some builds then you may find yourself looking for <a href="http://en.wikipedia.org/wiki/Heisenbug" rel="nofollow">heisenbugs</a> since the code that runs in production may not be the same code that you can debug and test.</p>

<p>By definition, undefined behaviour is irrecoverable. Since there's no specification of what the behaviour is there's nothing you can do when it happens. If you're writing code that needs to be robust and needs to fail in a graceful and controlled manner then as soon as you step into undefined behaviour you cannot guarantee that the code is robust and you are incapable of failing gracefully under all situations.</p>

<p>Of course some people will claim that using functions which include undefined behaviour in their specifications is OK as long as the caller plays by the rules; this is true, to a point. The problem is that there's very little that you can do to prove that the caller <i>is always</i> going to play by the rules. Checks that compile out of release code may give you a warm fuzzy feeling when running tests in debug mode but they won't necessarily stop the software from failing in production. This is even more likely if the code in question uses multiple threads as thread scheduling is often the most affected by differences between debug and release builds. </p>

<p>What does all this mean in practice? You can usually avoid designing undefined behaviour if you always validate your input parameters and always document exactly how you will fail if they are invalid. This documentation may take the form of a unit test for the code in question, if it does then you can be sure that the documentation stays in sync with the code; the test fails if the documentation gets out of sync with reality.</p>
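<p>As a sketch of what that might look like, assuming a made-up <code>ValueAt()</code> function and a test written as a plain function rather than with any particular testing framework:</p>
<pre class="brush: cpp gutter: false">#include &lt;cstddef&gt;
#include &lt;stdexcept&gt;
#include &lt;vector&gt;

// Hypothetical function with defined behaviour for every input.
int ValueAt(
   const std::vector&lt;int&gt; &amp;values,
   const std::size_t index)
{
   if (index &gt;= values.size())
   {
      throw std::out_of_range("ValueAt(): index out of range");
   }

   return values[index];
}

// A test that doubles as documentation of the failure mode; it returns
// true if an invalid index is reported with std::out_of_range.
bool ValueAtThrowsOnInvalidIndex()
{
   const std::vector&lt;int&gt; values(3, 7);   // three elements, each set to 7

   try
   {
      ValueAt(values, 3);
   }
   catch (const std::out_of_range &amp;)
   {
      return true;
   }

   return false;
}
</pre>
<p>If someone later removes the check then the test fails, which is exactly the kind of drift between documentation and reality that we want to hear about.</p>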

In general I've found it far better to avoid undefined behaviour as much as possible. If you use unit testing then you may find that including undefined behaviour for certain failure conditions makes it impossible to write tests for the code or that the code that you can test is not the code that you can release. As soon as one code path includes a potential trip into undefined behaviour you may be unable to guarantee how the code will behave under certain situations, it can be considerably more difficult to deliver robust software when this is the case.]]>
    </content>
</entry>

<entry>
    <title>C++ Tips: 1 - Avoid unnecessary optionality</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2006/01/c-tips-1---avoid-unnecessary-optionality.html" />
    <id>tag:www.socketframework.com,2006:/blog//12.656</id>

    <published>2006-01-03T12:14:28Z</published>
    <updated>2010-12-26T07:02:58Z</updated>

    <summary>One of my main aims when writing code in C++ is to have the code clearly communicate its purpose. I find it useful to be able to look at a single line in isolation and have a pretty good idea...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>One of my main aims when writing code in C++ is to have the code clearly communicate its purpose. I find it useful to be able to look at a single line in isolation and have a pretty good idea of what its effects are on the code that it cooperates with. Unfortunately C++ code can often be written in an imprecise way that makes reasoning about what it actually does harder than it needs to be. By increasing the precision of your code writing you can limit what the code could potentially do to just what you want it to do; by doing this it's then easy to reason about what the code is doing when you read it as it's clearly and precisely doing only one thing.</p>

One of the most important, and simplest, things that you can do to reduce code complexity and make your code easier to reason about is to avoid unnecessary optionality.]]>
        <![CDATA[<p>At its simplest, avoiding unnecessary optionality is simply replacing the use of a pointer with that of a reference. A pointer can either point to something, or not, whereas a reference must always refer to something. If you see a pointer in a piece of code you often have to look around a bit to find out if the pointer is valid or if it could be null. If you see a reference you know straight away that it is valid. That's pretty much all there is to it. If the code you are writing uses an object that is optional then you might choose to use a pointer to represent that optionality, if the object is not optional and must always be present then you <i>should</i> use a reference.</p>

<p>I personally find that reasoning about references when examining code is less complex than reasoning about pointers. A reference can do less than a pointer; it's more precise because it can only refer to a valid object whereas a pointer can refer to a valid object or null. Each time you see a pointer you have to work out if it is likely to be valid from the context in which you see it and from where you obtained it; when you see a reference you don't have to do that. Whilst reading, or working through, complex code that you're not that familiar with, pointers require that you remember more about a variable than references. </p>

<p>Of course, about now someone will pipe up with a comment that points out that you can subvert the C++ reference mechanism by deliberately dereferencing a null pointer to create a reference. That's true, but if you have programmers that are doing that then you have more fundamental problems than this tip might help you address.</p>

<p>As always, the devil is in the detail. What does it mean to remove unnecessary optionality in practice?</p>

<p>Assume we have a container that maps from string names to widgets. Widget objects are pretty large so we decide to store pointers to widgets in the container and have the container manage the lifetime of the widgets that it contains. This could be a simple <code>std::map</code> but I tend to prefer a more precise interface so I would tend to <a href="http://www.lenholgate.com/archives/000277.html">write a wrapper class</a> that exposes the interface that I actually need.</p>

<p>Suppose that the usage pattern for the collection of widgets is that they're loaded at program start up and the names are presented to the user so that they can select an appropriate widget to manipulate. What should the code that retrieves a widget from the collection look like? </p>

<p>Often you'll see code in the collection like this:</p>
<pre class="brush: cpp gutter: false">CWidget *CWidgets::Find(
   const std::string &amp;name)
{
   CWidget *pWidget = 0;
   
   Widgets::iterator it = m_widgets.find(name);
   
   if (it != m_widgets.end())
   {
      pWidget = it-&gt;second;
   }
   
   return pWidget;
}
</pre>
<p>and code at the point of use like this:</p>
<pre class="brush: cpp gutter: false">
CWidget *pWidget = m_widgets.Find(name);

assert(pWidget != 0);

DoThisWithWidget(pWidget);
DoThatWithWidget(pWidget);
// more code that manipulates the widget
</pre>
<p>One of the problems with this kind of code is that it doesn't clearly express what's going on. The code says the widgets collection <i>may</i> contain a widget with this name whereas the requirement is that the widgets collection <i>always</i> contains a widget with the supplied name. The collection is slightly more general purpose than it needs to be and because of this we inject a small amount of uncertainty into the code that uses it. Although the code following the <code>assert</code> may be protected from bugs that cause the collection not to contain the expected value the programmer still needs to reason in terms of a pointer, a potentially optional widget, rather than in terms of a reference to a widget that must exist. The slight addition in complexity is negligible at this point since you can easily see the <code>assert</code> that states that the pointer must always be valid; however, once we trace the code into <code>DoThisWithWidget()</code> or <code>DoThatWithWidget()</code>, or even as we move further away from the assertion code, things may, or may not, be more complex and harder to reason about.</p>

<p>My preferred solution to this <a href="http://en.wikipedia.org/wiki/Accidental_complexity">accidental complexity</a> is to remove the unnecessary optionality. Instead of using <code>CWidgets::Find()</code> as shown above we would use something like this:</p>
<pre class="brush: cpp gutter: false">CWidget &amp;CWidgets::Get(
   const std::string &amp;name)
{
   Widgets::iterator it = m_widgets.find(name);
   
   if (it == m_widgets.end())
   {
      throw CException("Widget: \"" + name + "\" not found");
   }
   
   return *(it-&gt;second);
}
</pre>
<p>This method on the widget collection will return a widget of the specified name if it exists and throw an exception if it doesn't. Obviously this is inappropriate if you need to find out <i>if</i> a widget exists in the collection, but it <i>is</i> appropriate if the widget <i>must</i> exist. What is done with the exception is up to the user of the widget collection, or, perhaps, the user of the user of the widget collection.</p>

<p>The code at point of use becomes something like this:</p>
<pre class="brush: cpp gutter: false">
CWidget &amp;widget = m_widgets.Get(name);

DoThisWithWidget(widget);
DoThatWithWidget(widget);
// more code that manipulates the widget
</pre>
<p>It's now quite clear that if the code ever returns from the call to <code>Get()</code> then you have a valid widget. Likewise, code following this point is clearly written with the intention of working with a widget that is always present. This allows our removal of optionality to "trickle down" to the functions that follow; <code>DoThisWithWidget()</code> and <code>DoThatWithWidget()</code> can now be written to take a reference and any assertions or checking for valid pointers inside them can be removed. Code such as this:</p>
<pre class="brush: cpp gutter: false">void DoThisWithWidget(
   CWidget *pWidget)
{
   assert(pWidget != 0);

   // do stuff with widget</pre>
or this:
<pre class="brush: cpp gutter: false">void DoThisWithWidget(
   CWidget *pWidget)
{
   if (!pWidget)
   {
      // handle unexpected error, widget can't be null!
   }
   
   // do stuff with widget
</pre>
<p>need never exist. Instead we just have code that clearly communicates its expectations in a way that the compiler can confirm and that we, as programmers, can forget.</p>
<pre class="brush: cpp gutter: false">void DoThisWithWidget(
   CWidget &amp;widget)
{
   // do stuff with widget
</pre>
<p>Often people will complain that they <i>must</i> use pointers rather than references for non-optional objects for some reason. Most of the time they're wrong ;) Sure, you may be passed a pointer that can never be null from an API that you're using but you don't need to pass this design pollution on to the rest of your code. Likewise you may use an API that requires a pointer but that doesn't mean you need to pass the object as a pointer within your own code.</p>

<p>Even when an object <i>is</i> optional you may still gain some clarity by removing the optionality. Of course if the object really is optional then you need to fabricate something to use when the optional object isn't present; for this you can often use the <a href="http://www.cs.oberlin.edu/~jwalker/nullObjPattern/">Null Object Pattern</a>. This replaces an optional object which may <i>be</i> nothing with an object that will always be there but may <i>do</i> nothing. Using a Null Object often allows you to remove the switching based on the presence of the optional object and consolidate the "doing nothing" within the null object rather than spreading it across all the users of the object in question.</p>
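<p>A minimal sketch of a Null Object, using an optional log as the example. The <code>ILog</code>, <code>CMemoryLog</code> and <code>CNullLog</code> names and the <code>ProcessWidget()</code> function are all hypothetical:</p>
<pre class="brush: cpp gutter: false">#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;vector&gt;

// Hypothetical logging interface, for illustration only.
class ILog
{
   public :

      virtual void Write(
         const std::string &amp;message) = 0;

      virtual ~ILog() {}
};

// A log that really does something; it remembers what it was given.
class CMemoryLog : public ILog
{
   public :

      virtual void Write(
         const std::string &amp;message)
      {
         m_messages.push_back(message);
      }

      std::size_t Count() const
      {
         return m_messages.size();
      }

   private :

      std::vector&lt;std::string&gt; m_messages;
};

// The Null Object; always present, but does nothing.
class CNullLog : public ILog
{
   public :

      virtual void Write(
         const std::string &amp; /*message*/)
      {
      }
};

// Takes a reference, so there's no "is there a log?" check to write or read.
void ProcessWidget(
   ILog &amp;log)
{
   log.Write("processing widget");

   // do stuff with widget
}
</pre>
<p>Code that doesn't want logging passes a <code>CNullLog</code>, and <code>ProcessWidget()</code> never needs to know the difference.</p>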

Code that clearly communicates what it is doing is easy to understand. Unnecessary optionality adds a small amount of confusion to code by having the code say that it <i>could</i> do one of two things when in fact it only actually does a single thing. Removing unnecessary optionality can make code communicate more clearly and can reduce uncertainty when reading unfamiliar code.]]>
    </content>
</entry>

</feed>


