<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Rambling Comments</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/" />
    <link rel="self" type="application/atom+xml" href="http://www.lenholgate.com/blog/atom.xml" />
    <id>tag:www.lenholgate.com,2010-12-10:/blog//12</id>
    <updated>2011-09-14T08:56:53Z</updated>
    
    <generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 5.12</generator>

<entry>
    <title>The curious case of the missing copy constructor</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2011/09/the-curious-case-of-the-not-missing-copy-constructor.html" />
    <id>tag:www.lenholgate.com,2011:/blog//12.1135</id>

    <published>2011-09-14T08:15:46Z</published>
    <updated>2011-09-14T08:56:53Z</updated>

    <summary> I have a tendency to write unit tests that are a little more invasive than they need to be; these tests make sure that not only are the results as expected but also that as many of the side-effects...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="C++ Tips" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<div>
I have a <a href="http://www.lenholgate.com/blog/2008/04/what-would-i-do.html">tendency to write unit tests that are a little more invasive</a> than they need to be; these tests check not only that the results are as expected but also that as many as possible of the side-effects and interactions with other objects are as expected. So, for example, in my current <a href="http://www.lenholgate.com/blog/2011/09/the-websocket-protocol---draft-hybi-13.html">WebSockets development</a> for <a href="http://www.serverframework.com/" target="_blank">The Server Framework</a> I have some tests which check that the correct data is delivered to the client of the API that I'm developing and also check that the API interacts with its buffer allocator correctly and doesn't leak memory. The detailed interaction testing sometimes gets in the way of refactoring: the inputs don't change and the outputs don't change, but the interaction does, so a test that should pass before and after a refactoring may fail just because I've optimised the use of a sub object that happens to be checked in the test. In these cases, where both levels of testing are useful, I sometimes duplicate the test with and without the detailed interaction examination... Anyway...
</div>
<div><br /></div>
<div>
I have a test which tracks the way a buffer's reference count is modified as the object under test performs an action; it's useful to have tests which prove that under given circumstances you don't have a reference counting leak. The buffer is passed around via a mix of raw pointers and smart pointers and during the test a mock buffer is used which logs all of the operations performed. At one point in the interaction log the buffer is returned by value via a smart pointer and we get an <code>AddRef()</code>, <code>Release()</code> sequence of calls as the temporary is copied into and out of. The compiler is allowed by the C++ standard to elide copy constructors in some situations, see <a href="http://en.wikipedia.org/wiki/Copy_elision" target="_blank">here</a>, so you should be careful that your copy constructor doesn't do anything other than copy the object as you can't guarantee that it will actually be called. Add the fact that the compiler may opt to use the <a href="http://en.wikipedia.org/wiki/Return_value_optimization">Return Value Optimisation</a> in some circumstances to avoid creating temporaries and it becomes clear that my invasive test is somewhat fragile: its result depends on which compiler is being used and which optimisations are enabled...
</div>
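<div><br /></div>
<div>
As a rough illustration of why the exact call sequence is fragile, here is a minimal sketch. <code>MockBuffer</code>, <code>BufferPtr</code> and <code>GetBuffer()</code> are hypothetical names invented for this example and are not the framework's real classes. The number of <code>AddRef()</code> calls depends on how many copies the compiler elides, but the calls always balance, so a check on the balance survives a change of compiler or optimisation settings.
</div>

```cpp
#include <cassert>

// Hypothetical mock buffer: it only counts reference count changes.
struct MockBuffer
{
   int addRefs = 0;
   int releases = 0;

   void AddRef() { ++addRefs; }
   void Release() { ++releases; }
};

// Hypothetical minimal smart pointer; copying it is visible as an AddRef().
class BufferPtr
{
   public :

      explicit BufferPtr(MockBuffer *pBuffer) : m_pBuffer(pBuffer)
      {
         if (m_pBuffer) m_pBuffer->AddRef();
      }

      BufferPtr(const BufferPtr &rhs) : m_pBuffer(rhs.m_pBuffer)
      {
         if (m_pBuffer) m_pBuffer->AddRef();
      }

      ~BufferPtr()
      {
         if (m_pBuffer) m_pBuffer->Release();
      }

      BufferPtr &operator=(const BufferPtr &) = delete;

   private :

      MockBuffer *m_pBuffer;
};

// Returns a named local by value. The compiler may construct 'result'
// directly in the caller's object, or it may copy it; both are allowed.
BufferPtr GetBuffer(MockBuffer &buffer)
{
   BufferPtr result(&buffer);

   return result;
}
```

<div>
Calling <code>GetBuffer()</code> and letting the result go out of scope gives between one and three <code>AddRef()</code> calls depending on the compiler, the language version and the build settings, and always the same number of <code>Release()</code> calls.
</div>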
<div><br /></div>
<div>
So, my tests expect to see the copy constructor called and the resulting sequence of <code>AddRef()</code> and <code>Release()</code> calls on the buffer and in debug builds on all supported compilers this is what they see. Unfortunately on release builds the compilers differ... VS2005 and VS2010 RTM both elide the copy constructor (presumably applying RVO) whereas VS2008 and VS2010 SP1 both call the copy constructor. For now I have a rather clunky macro that detects the compiler version and build configuration and removes the test requirement where necessary. For a while I had some problems differentiating between VS2010 RTM and VS2010 SP1 but luckily <code>_MSC_FULL_VER</code> can be used for that. 
</div>
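<div><br /></div>
<div>
A macro of that kind might look something like the sketch below. The macro name, the helper function and the log format are made up for this example, the use of <code>_DEBUG</code> to spot a debug build is an assumption about the project settings, and the <code>_MSC_FULL_VER</code> value used for VS2010 SP1 (160040219) is the commonly quoted one and should be checked against the compiler you actually have.
</div>

```cpp
#include <cassert>
#include <string>

// Hypothetical macro: 1 means "expect the AddRef/Release pair caused by the
// copy", 0 means "the copy is expected to be elided in this build".
#if defined(_MSC_VER) && !defined(_DEBUG)
#  if _MSC_VER == 1400                                  // VS2005, release
#    define EXPECT_COPY_ON_RETURN 0
#  elif _MSC_VER == 1600 && _MSC_FULL_VER < 160040219   // VS2010 before SP1, release
#    define EXPECT_COPY_ON_RETURN 0
#  else                                                 // VS2008, VS2010 SP1, others
#    define EXPECT_COPY_ON_RETURN 1
#  endif
#else
   // Debug builds; treating every non-Microsoft compiler the same way is an
   // assumption made only so that this sketch compiles everywhere.
#  define EXPECT_COPY_ON_RETURN 1
#endif

// Hypothetical helper: the fragment of the interaction log that the test
// should expect for the return by value.
inline std::string ExpectedCopyLog()
{
#if EXPECT_COPY_ON_RETURN
   return "|AddRef|Release|";
#else
   return "";
#endif
}
```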
<div><br /></div>
<div>
Now, off to learn a little more about RVO and <a href="http://www.synesis.com.au/resources/articles/cpp/movectors.pdf" target="_blank">move constructors</a>.
</div>
<div><br /></div>
<div>
<b>Edit:</b> It seems that it's less compiler related than I thought. One of the projects (the one where the smart pointer template lives) had optimisations turned off in some of the release builds for some of the compilers... More investigation needed; with any luck all compilers will elide the copy once optimisations are turned on...
</div>]]>
        
    </content>
</entry>

<entry>
    <title>Autobahn WebSockets protocol compliance test suite</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2011/08/autobahn-websockets-protocol-compliance-test-suite.html" />
    <id>tag:www.lenholgate.com,2011:/blog//12.1132</id>

    <published>2011-08-31T07:27:29Z</published>
    <updated>2011-08-31T07:37:25Z</updated>

    <summary> I&apos;m nearing the end of my WebSockets implementation for The Server Framework and have been dealing with various protocol compliance issues. Whilst I have decent unit test coverage I haven&apos;t, yet, sat down and produced compliance specific unit tests...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Socket Servers" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<div>
I'm nearing the end of my <a href="http://www.lenholgate.com/blog/2011/07/the-websocket-protocol-design-by-committee-and-requirements-tracing.html">WebSockets implementation</a> for <a href="http://www.serverframework.com/">The Server Framework</a> and have been dealing with various protocol compliance issues. Whilst I have decent unit test coverage I haven't, yet, sat down and produced compliance specific unit tests which walk through the various parts of the (ever changing) draft RFC and test each aspect. Looking back, I probably should have taken this approach even though the RFC was fluid. Anyway. One of the people involved in the WebSockets Working Group has a nice set of compliance tests for their implementation and they have made these tests freely available; so I've been using them! They've really helped nail a few subtle issues in my implementation (and one or two not so subtle issues!).
</div>
<div><br/></div>
<div>
The <a href="http://www.tavendo.de/autobahn/testsuite.html" target="_blank">Autobahn tests can be found here</a>, and the report generated by the current pre-release version of my implementation can be found <a href="http://www.serverframework.com/ServerFramework/6.5/WebSockets/">here</a>.
</div>   ]]>
        
    </content>
</entry>

<entry>
    <title>In response to @dhanji on unit testing</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2011/06/in-response-to-dhanji-on-unit-testing.html" />
    <id>tag:www.lenholgate.com,2011:/blog//12.1115</id>

    <published>2011-06-07T07:30:59Z</published>
    <updated>2011-06-23T10:10:22Z</updated>

    <summary> Dhanji over at Rethrick Construction has written an interesting piece on the value of unit testing. I agree with his conclusion; &quot;So the next time someone comes to you saying let&apos;s write the tests first, or that we should...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<div>
<a href="https://twitter.com/#!/dhanji" target="_blank">Dhanji</a> over at <a href="http://rethrick.com/" target="_blank">Rethrick Construction</a> has written <a href="http://rethrick.com/#unit-tests-false-idol" target="_blank">an interesting piece</a> on the value of unit testing. 
</div>
<div><br /></div>
<div>
I agree with his conclusion; <em>"So the next time someone comes to you saying let's write the tests first, or that we should aim for 80% code coverage, take it with a healthy dose of skepticism." </em> But then I tend to take everything with a dose of skepticism...
</div>
]]>
        <![CDATA[<div><br /></div>
<div>
I also agree with the fact that sometimes the tests get in the way of refactorings that you'd like to do and sometimes the tests give you more code that needs to be maintained and that they often appear to slow down your development time. I'm probably more affected than many by these problems because I happen to <a href="http://www.lenholgate.com/blog/2008/04/what-would-i-do.html">like rigid tests</a> with hand rolled mocks which <a href="http://www.lenholgate.com/blog/2005/06/easy-interaction-testing-in-c-with-mocks-that-create-logs.html">log and validate</a> all of their interaction with the objects under test. Yes, you need to maintain the tests as well as the code and, yes, sometimes you'll find that a simple change to the code results in lots of changes to your tests. This is, often, a chance to refactor your tests - something that I don't do often enough. It's also often a sign that you need to refactor your code too; if seemingly unrelated tests are failing then you possibly have some unexpected coupling between the code and tests relating to the code you're changing and other tests which are only related via mocks and interfaces... Is that interface too fat, does it do too much?
</div>
<div><br /></div>
<div>
However, I'm not sure that deleting the tests is a good idea, though of course it depends on the tests themselves. Sure it's often faster to test working code with tests which test a lot of functionality; integration tests are great and I have a very blurry line between them and my unit tests - my integration tests simply test larger "units" than my unit tests. Deleting unit tests and testing at a higher level is almost certain to make you feel faster when you're working on a well factored, well tested and stable code base and when the change you're making works and doesn't have any strange and unexpected side effects. You can simply make the changes required and fix up fewer tests and run your integration tests. The problem, in my experience, comes from when the change doesn't go in as easily as expected... 
The higher the level of your tests the more they're testing and the less focused they are; test failures may tell you that the functionality doesn't work any more but they don't necessarily tell you where the problem is. This may make it take longer for you to find the problem and it may mean that you spend longer debugging... You may end up adding new unit tests to track down the problem, or you may simply spend longer in the debugger.
</div>
<div><br /></div>
<div>
These things are hard to quantify which is why I often use <a href="http://www.lenholgate.com/cgi-bin/mt/mt-search.cgi?search=JIT&amp;IncludeBlogs=12&amp;limit=20" target="_blank">JIT</a> and initially write fewer tests than I feel I should until I know I need to write them... I think the important thing, as always, is to <a href="https://twitter.com/#!/LenHolgate/status/55978172881182720" target="_blank">think and question</a>, rather than just to blindly follow someone else's advice. So, I urge you to take my blog with a healthy dose of skepticism, think about it, and come to your own conclusions; this stuff works for me, it might not work for you.
</div>]]>
    </content>
</entry>

<entry>
    <title>Practical Testing: 31 - A bug in DestroyTimer.</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2011/01/practical-testing-31---a-bug-in-destroytimer.html" />
    <id>tag:www.lenholgate.com,2011:/blog//12.1054</id>

    <published>2011-01-13T16:36:25Z</published>
    <updated>2011-06-23T11:04:32Z</updated>

    <summary>Back in 2004, I wrote a series of articles called &quot;Practical Testing&quot; where I took a piece of complicated multi-threaded code and wrote tests for it. I then rebuilt the code from scratch in a test driven development style to...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>Back in 2004, I wrote a series of articles called <a href="http://www.lenholgate.com/blog/2004/05/practical-testing.html">"Practical Testing"</a> where I took a piece of complicated multi-threaded code and wrote tests for it. I then rebuilt the code from scratch in a test driven development style to show how writing your tests before your code changes how you design your code. Since the original articles there have been several bug fixes and redesigns all of which have been supported by the original unit tests and many of which have led to the development of more tests.</p>

Whilst doing some development on a new server design I managed to expose a bug which has been present in the timer queue code for some time and which is also present in the timer wheel implementation. The bug allows memory within the timer queue to be deleted twice if you happen to call <code>DestroyTimer()</code> on a timer from within <code>OnTimer()</code> for that same timer. The problem occurs infrequently as it requires the memory for the timer data that has just been deleted to be set with a specific bit pattern that will then cause the timer queue to think that the data needs to be deleted again. It also requires that you're calling <code>HandleTimeouts()</code> to handle timeouts whilst allowing the timer queue to hold its internal lock; this is something most of my servers don't do. Anyway, circumstances conspired to make this bug visible and so here I am to fix it.]]>
        <![CDATA[<p>Of course, the first thing to do is write a test that exposes the bug. The test is fairly simple to build but it requires that we adjust the mock timer that we use so that it can be told to delete the timer during timeout handling. The new test looks something like this:</p>

<pre class="brush: cpp gutter: false">template &lt;class Q, class T, class P&gt;
void TCallbackTimerQueueTestBase&lt;Q, T, P&gt;::TestDestroyTimerDuringOnTimerInHandleTimeouts()
{
   JetByteTools::Win32::Mock::CMockTimerQueueMonitor monitor;

   P tickProvider;

   tickProvider.logTickCount = false;

   {
      Q timerQueue(monitor, tickProvider);

      CheckConstructionResults(tickProvider);

      CLoggingCallbackTimer timer;

      const Milliseconds timeout = 100;

      const IQueueTimers::UserData userData = 1;

      IQueueTimers::Handle handle = CreateAndSetTimer(
         tickProvider,
         timerQueue,
         timer,
         timeout,
         userData);

      timer.DestroyTimerInOnTimer(timerQueue, handle);

      const Milliseconds expectedTimeout = CalculateExpectedTimeout(timeout);

      THROW_ON_FAILURE_EX(expectedTimeout == timerQueue.GetNextTimeout());

      tickProvider.CheckResult(_T("|GetTickCount|"));

      tickProvider.SetTickCount(expectedTimeout);

      timerQueue.HandleTimeouts();

      tickProvider.CheckResult(_T("|GetTickCount|"));

      timer.CheckResult(_T("|OnTimer: 1|TimerDestroyed|"));

      tickProvider.CheckNoResults();

      THROW_ON_NO_EXCEPTION_EX_1(timerQueue.DestroyTimer, handle);
   }

   THROW_ON_FAILURE_EX(true == monitor.NoTimersAreActive());
}
</pre>
<p>And the change to the mock looks something like this:</p>
<pre class="brush: cpp gutter: false">void CLoggingCallbackTimer::OnTimer(
   UserData userData)
{
   if (logMessage)
   {
      if (logUserData)
      {
         LogMessage(_T("OnTimer: ") + ToString(userData));
      }
      else
      {
         LogMessage(_T("OnTimer"));
      }
   }

   if (m_pTimerQueue)
   {
      m_pTimerQueue-&gt;DestroyTimer(m_handle);

      LogMessage(_T("TimerDestroyed"));
   }

   ::InterlockedIncrement(&amp;m_numTimerEvents);

   m_timerEvent.Set();
}</pre>
<p>The result is that the timer is destroyed during timeout handling and the test demonstrates the failure in the code.</p>

<p>Unfortunately, with the latest build of the code the test does NOT demonstrate the problem, because the PTMalloc implementation <a href="http://www.lenholgate.com/blog/2010/09/practical-testing-30---reducing-contention.html">that we're currently using</a> can't be configured to fill deleted memory with an unlikely bit pattern. The default allocator in the Visual Studio C runtime does allow this in debug builds and this helps to force the bug into view. Adding the new test to the code that was presented in <a href="http://www.lenholgate.com/blog/2010/09/practical-testing-29---fixing-the-timer-wheel.html">part 29</a> causes the bug to manifest: Visual Studio pops up a debug assertion message when the memory is double deleted.</p>

<p>The potential for this problem to occur when using the <code>BeginTimeoutHandling()</code>, <code>EndTimeoutHandling()</code> style of "lock free" timeout handling was already identified and fixed back when I first added the "lock free" timeout handling in <a href="http://www.lenholgate.com/blog/2008/08/practical-testing-18---removing-the-potential-to-deadlock.html">part 18</a>. The fix is pretty similar. There's already a flag in the internal timer data structure that's used to delay destruction until after the timer has finished being processed but it doesn't get set when using <code>HandleTimeouts()</code>. The fixes are fairly simple for both the timer wheel and the timer queue.</p>
<pre class="brush: cpp gutter: false">void CCallbackTimerQueueBase::TimerData::OnTimer()
{
   m_processingTimeout = true;

   OnTimer(m_active);

   m_processingTimeout = false;
}

CCallbackTimerWheel::TimerData *CCallbackTimerWheel::TimerData::OnTimer()
{
   m_ppPrevious = 0;

   m_processingTimeout = true;

   OnTimer(m_active);

   m_processingTimeout = false;

   return m_pNext;
}
</pre>
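<p>To show why setting the flag is enough, here is a much simplified sketch of the deferred destruction idea. It is not the real timer queue code; <code>SketchTimerData</code>, <code>SketchTimerQueue</code> and their members are hypothetical names, the queue holds a single timer, and the real implementation's locking and handle validation are left out.</p>

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for the queue's internal timer data.
class SketchTimerData
{
   public :

      explicit SketchTimerData(std::function<void()> callback)
         : m_callback(std::move(callback)) {}

      // Returns true if the timer can be deleted right now. If the timer is
      // in the middle of its callback the request is remembered instead.
      bool RequestDestroy()
      {
         if (m_processingTimeout)
         {
            m_deleteAfterTimeout = true;

            return false;
         }

         return true;
      }

      // Runs the callback; returns true if destruction was requested
      // during the callback and should now be carried out by the caller.
      bool OnTimer()
      {
         m_processingTimeout = true;

         m_callback();

         m_processingTimeout = false;

         return m_deleteAfterTimeout;
      }

   private :

      std::function<void()> m_callback;
      bool m_processingTimeout = false;
      bool m_deleteAfterTimeout = false;
};

// Hypothetical queue that owns at most one timer.
class SketchTimerQueue
{
   public :

      ~SketchTimerQueue() { delete m_pTimer; }

      SketchTimerData *SetTimer(std::function<void()> callback)
      {
         delete m_pTimer;

         m_pTimer = new SketchTimerData(std::move(callback));

         return m_pTimer;
      }

      void DestroyTimer(SketchTimerData *pTimer)
      {
         if (pTimer == m_pTimer && pTimer->RequestDestroy())
         {
            Delete();
         }
      }

      void HandleTimeouts()
      {
         if (m_pTimer && m_pTimer->OnTimer())
         {
            Delete();   // the deferred delete happens exactly once, here
         }
      }

      bool HasTimer() const { return m_pTimer != nullptr; }

      int deletions = 0;

   private :

      void Delete()
      {
         delete m_pTimer;
         m_pTimer = nullptr;
         ++deletions;
      }

      SketchTimerData *m_pTimer = nullptr;
};
```

<p>If the callback passed to <code>SetTimer()</code> calls <code>DestroyTimer()</code> on its own timer, the delete is postponed until the callback has returned and so it happens once rather than twice.</p>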

<p>The code from part 29 with these fixes applied can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-31a.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-31a.zip']);">here</a>.</p>

<p>The code that uses PTMalloc from part 30 with these fixes applied can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-31b.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-31b.zip']);">here</a>.</p>

<p>Please note that the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.</p>]]>
    </content>
</entry>

<entry>
    <title>WinRM/WinRS job memory limits.</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/11/winrmwinrs-job-memory-limits-1.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.990</id>

    <published>2010-11-19T16:11:00Z</published>
    <updated>2011-11-24T12:15:27Z</updated>

    <summary>My tangential testing that began with my problems with commands run via WinRS during some distributed load testing is slowly unravelling back to the start. I now have a better build and test system for the server examples that ship...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Geek Speak" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>My <a href="http://www.lenholgate.com/archives/000936.html">tangential testing</a> that began with my problems with <a href="http://www.lenholgate.com/archives/000935.html">commands run via WinRS</a> during some distributed load testing is slowly unravelling back to the start. I now have a better build and test system for the server examples that ship as part of <a href="http://www.serverframework.com/">The&nbsp;Server&nbsp;Framework</a>. I have a test runner that runs the examples with memory limits to help spot memory leak bugs and a test runner that checks for lock inversions. My test scripts are also more flexible and I've found a couple of bugs.</p>

<p>Today I used a variation on my job object based memory limiting test runner to investigate the original problem with commands spawned remotely using WinRS. Given that I had a test runner that could spawn a process and place it into a job object it was simple to adjust this so that the target process was instead broken out of any job that the test runner might be in; just add <code>CREATE_BREAKAWAY_FROM_JOB</code> to the process creation flags and hope... This seemed to work so I then looked into reporting on the job that the test runner found itself in... Some simple calls to <code>IsProcessInJob()</code> and <code>QueryInformationJobObject()</code> with a null job handle (which means "the job that the calling process is in") showed that the WinRS processes <i>were</i> being run inside of a job and that the limits (on my particular box) were as follows:</p>

<pre>Job report for process: 10676
Process IS in a job
Flags: 0x2b08 - 
   JOB_OBJECT_LIMIT_ACTIVE_PROCESS
   JOB_OBJECT_LIMIT_PROCESS_MEMORY
   JOB_OBJECT_LIMIT_JOB_MEMORY
   JOB_OBJECT_LIMIT_BREAKAWAY_OK
   JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE

ActiveProcessLimit: 15
Affinity: 0x0
MaximumWorkingSetSize: 0
MinimumWorkingSetSize: 0
PerJobUserTimeLimit: 0
PerProcessUserTimeLimit: 0
PriorityClass: 32
SchedulingClass: 5
JobMemoryLimit: 157286400
ProcessMemoryLimit: 157286400</pre>

<p>The problematic limits are the job and process memory limits, both of which are set to 157286400 bytes (150MB). Luckily the WinRS job has <code>JOB_OBJECT_LIMIT_BREAKAWAY_OK</code> set so my attempt to break the target process free of these limits works fine.</p>
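<p>The flags value in the report can be checked by hand. The sketch below decodes a limit flags value into the flag names; it is not the code from my test runner, the function and table names are invented for this example, and the <code>JOB_OBJECT_LIMIT_*</code> values are repeated from <code>winnt.h</code> so that it compiles without the Windows headers.</p>

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// A flag value and its name, as defined in winnt.h.
struct JobLimitFlag
{
   std::uint32_t value;
   const char *pName;
};

static const JobLimitFlag s_jobLimitFlags[] =
{
   { 0x00000001, "JOB_OBJECT_LIMIT_WORKINGSET" },
   { 0x00000002, "JOB_OBJECT_LIMIT_PROCESS_TIME" },
   { 0x00000004, "JOB_OBJECT_LIMIT_JOB_TIME" },
   { 0x00000008, "JOB_OBJECT_LIMIT_ACTIVE_PROCESS" },
   { 0x00000010, "JOB_OBJECT_LIMIT_AFFINITY" },
   { 0x00000020, "JOB_OBJECT_LIMIT_PRIORITY_CLASS" },
   { 0x00000040, "JOB_OBJECT_LIMIT_PRESERVE_JOB_TIME" },
   { 0x00000080, "JOB_OBJECT_LIMIT_SCHEDULING_CLASS" },
   { 0x00000100, "JOB_OBJECT_LIMIT_PROCESS_MEMORY" },
   { 0x00000200, "JOB_OBJECT_LIMIT_JOB_MEMORY" },
   { 0x00000400, "JOB_OBJECT_LIMIT_DIE_ON_UNHANDLED_EXCEPTION" },
   { 0x00000800, "JOB_OBJECT_LIMIT_BREAKAWAY_OK" },
   { 0x00001000, "JOB_OBJECT_LIMIT_SILENT_BREAKAWAY_OK" },
   { 0x00002000, "JOB_OBJECT_LIMIT_KILL_ON_JOB_CLOSE" }
};

// Hypothetical helper: returns the names of the flags set in 'flags',
// lowest bit first.
std::vector<std::string> DecodeJobLimitFlags(std::uint32_t flags)
{
   std::vector<std::string> names;

   for (const JobLimitFlag &flag : s_jobLimitFlags)
   {
      if (flags & flag.value)
      {
         names.push_back(flag.pName);
      }
   }

   return names;
}
```

<p>Passing <code>0x2b08</code> gives the five flags listed in the report: 0x8 + 0x100 + 0x200 + 0x800 + 0x2000.</p>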

I still haven't found where any of this is documented, but at least I now don't have to write my own remote process spawning code.]]>
        
    </content>
</entry>

<entry>
    <title>A lock inversion detector as part of the build is good</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/11/a-lock-inversion-detector-as-part-of-the-build-is-good.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.989</id>

    <published>2010-11-12T08:47:34Z</published>
    <updated>2011-07-07T07:19:47Z</updated>

    <summary>As I mentioned, I&apos;ve been adjusting my build system and have finally got to the point where my lock inversion detector is suitable to run on all of my example servers during their test phase on the build machines. I&apos;m...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Lock Explorer" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Socket Servers" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p><a href="http://www.lenholgate.com/archives/000936.html">As I mentioned</a>, I've been adjusting my build system and have finally got to the point where my lock inversion detector is suitable to run on all of my example servers during their test phase on the build machines. I'm working my way through the various example servers' test scripts and adjusting them so that they use the lock inversion detector, can be easily configured to run the full blown deadlock detector and also can run the servers under the memory profiling test runner that I put together earlier in the week.</p>

<p><b>Note:</b> the deadlock detector mentioned in this blog post is now available for download from <a href="http://www.lockexplorer.com" target="_blank">www.lockexplorer.com</a>.</p>

<p>So far I've found 3 lock inversions that have never caused deadlocks in practice but which could, given the right circumstances and server design, cause problems. The 3 that I've found (and fixed) so far are in the async connectors used by the <a href="http://www.serverframework.com/products---the-ssltls-using-openssl-option.html">OpenSSL</a>, <a href="http://www.serverframework.com/products---the-ssltls-using-schannel-option.html">SChannel</a> and <a href="http://www.serverframework.com/products---the-sspi-negotiate-option.html">SSPI Negotiate</a> stream filters.</p>

So there will be a 6.3.2 next week but I'll delay it until the new build process is being used by all the example servers.]]>
        
    </content>
</entry>

<entry>
    <title>Tangential testing</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/11/tangential-testing.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.988</id>

    <published>2010-11-10T08:15:03Z</published>
    <updated>2011-07-07T08:57:46Z</updated>

    <summary>My theorising about the strange memory related failures that I was experiencing with my distributed testing using WinRS have led me to putting together a test runner that can limit the amount of memory available to a process and terminate...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Lock Explorer" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Socket Servers" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>My <a href="http://www.lenholgate.com/archives/000935.html">theorising about the strange memory related failures</a> that I was experiencing with my distributed testing using WinRS has led me to put together a test runner that can limit the amount of memory available to a process and terminate it if it exceeds the expected amount. With this in place during my server test runs I can spot the kind of memory leak that slipped through the cracks of my testing and made it into release 6.2 of <a href="http://www.serverframework.com/">The&nbsp;Server&nbsp;Framework</a>.</p>

<p>The idea is that rather than just starting the server we start the server using a monitoring process that first creates a job object for the server process and then starts the server process and assigns it to the job. The monitoring process can set some limits to the amount of memory that the processes in the job can allocate and it gets informed when they reach that limit. For the kind of testing I'm currently doing the monitor then simply terminates the server which causes the test to fail. You need to run the target process under the monitor without the memory limiter in place a few times and dump out the memory allocation stats to get an idea of the maximum memory that it will allocate during a normal run and from then on you can limit the memory allocation to that amount and be sure that the tests will fail if there's a leak like the one that made its way into 6.2.</p>

<p>Whilst I was adjusting the test scripts to test the new test runner I decided that it might be a good idea to merge this functionality into my "lock inversion detector" code. The <a href="http://www.lockexplorer.com/lock-inversion-detector-download-free.html">lock inversion detector</a> is a cut down version of my <a href="http://www.lockexplorer.com/lock-inversion-analyser-buy-now.html">deadlock detection</a> and "lock explorer" tools. The lock inversion detector is considerably faster than the full lock explorer and the idea is that it is used in a similar way to the memory monitoring test runner. Test servers run under the lock inversion detector and if a release introduces the potential to deadlock via a lock inversion then the test will fail. Note that the code under test doesn't actually have to deadlock, it just has to have the potential to... This is quite a powerful tool and it's helped my clients out on many occasions but it's still under development (I tend to use it, tweak it if need be to find the problem, and then move on with fixing the problem). Recent tweaks have made it run fast enough to become part of my build process.</p>

<p><b>Note:</b> the lock inversion detector mentioned in this blog post is now available for download from <a href="http://www.lockexplorer.com" target="_blank">www.lockexplorer.com</a>.</p>

<p>If the lock inversion detector finds a lock inversion you need to run the full deadlock detector to get the information you need to fix the problem. This causes the target process to run a little slower than when under the lock inversion detector but the end result is a list of problem lock sequences with the threads concerned and call stacks showing each lock manipulation. With this information it's usually pretty trivial to find and remove the lock inversion. Again the target process doesn't need to actually deadlock for the deadlock detector to show you where it could deadlock.</p>
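<p>The reason that the target process doesn't need to actually deadlock is that it's the order in which locks are taken that matters. The sketch below is not how my tools are implemented; it's a minimal illustration with hypothetical names that records every "held A, then took B" pair and reports an inversion as soon as it sees the same two locks taken in the opposite order. It only spots direct two lock inversions, not longer cycles, and it assumes that locks are released in the reverse of the order in which they were acquired.</p>

```cpp
#include <cassert>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical recorder. Each thread passes in its own list of the locks
// that it currently holds; the recorder keeps the orderings seen so far.
class LockOrderRecorder
{
   public :

      typedef std::vector<std::string> HeldLocks;

      // Records that a thread holding 'held' has acquired 'lock'.
      // Returns true if this acquisition inverts an order seen earlier.
      bool Acquire(HeldLocks &held, const std::string &lock)
      {
         bool inversion = false;

         for (const std::string &outer : held)
         {
            if (m_order.count(std::make_pair(lock, outer)) != 0)
            {
               inversion = true;   // someone took 'lock' before 'outer'
            }

            m_order.insert(std::make_pair(outer, lock));
         }

         held.push_back(lock);

         return inversion;
      }

      // Releases the most recently acquired lock.
      void Release(HeldLocks &held)
      {
         if (!held.empty())
         {
            held.pop_back();
         }
      }

   private :

      std::set<std::pair<std::string, std::string> > m_order;
};
```

<p>One thread taking A then B and, at some completely different time, another thread taking B then A is enough for <code>Acquire()</code> to return true, even though neither thread ever blocks.</p>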

<p>I need to do some more work to integrate these tests into my build and release process but they're valuable additions. Ideally I'd like to merge the memory monitoring test runner functionality with the lock inversion detector, but for now I may simply run one set of tests with the memory monitoring and one with the lock inversion testing.</p>

These improvements to the build and release process will hopefully be in place for the release of 6.4.]]>
        
    </content>
</entry>

<entry>
    <title>Practical Testing: 30 - Reducing contention</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/09/practical-testing-30---reducing-contention.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.980</id>

    <published>2010-09-23T14:19:51Z</published>
    <updated>2011-01-02T15:28:45Z</updated>

    <summary>Previously on &quot;Practical Testing&quot;... I&apos;ve been looking at the performance of the timer system that I developed and have built a more specialised and higher performance timer system which is more suitable for some high performance reliable UDP work that...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>Previously on <a href="http://www.lenholgate.com/archives/000306.html">"Practical Testing"</a>... I've been looking at the performance of the timer system that I developed and have built a <a href="http://www.lenholgate.com/archives/000909.html">more specialised</a> and higher performance timer system which is more suitable for some high performance reliable UDP work that I'm doing. Whilst developing the new timer wheel I began to consider the thread contention issues that the timer system faced and came up with a notation for talking about contention (see <a href="http://www.lenholgate.com/archives/000908.html">here</a>). Both the general purpose timer queue and the new timer wheel suffered from more potential thread contention than they needed to because of the way the STL containers that I am using a) require memory allocations on insertion and removal and b) use the program heap for those memory allocations. This converted the contention for the timer system from contention between the number of threads accessing the timer system to contention between the number of threads accessing the program heap...</p>

I mentioned a while back that a custom STL allocator would be one way to reduce the thread contention; the allocator could use a private heap that only the timer system used and so the potential contention during memory allocation and release would be reduced to the potential contention for the timer object itself. Today I'll present the results of switching to a private heap using a custom STL allocator for the STL collections that I use.]]>
        <![CDATA[<p>I <a href="http://www.lenholgate.com/archives/000919.html">went looking</a> for information about writing my own STL allocator and ended up with some useful code from <a href="http://www.tantalon.com/pete.htm">Pete Isensee</a>. I then boiled this down to something that fitted with my development style and that supported allocations using <code>HeapAlloc()</code>. Actually using the allocator was trivial, if slightly messy.</p>
<pre class="brush: cpp gutter: false">typedef std::deque&lt;TimerData *&gt; Timers;
 
typedef std::pair&lt;size_t, Timers&gt; TimersAtThisTime;
 
typedef std::map&lt;ULONGLONG, TimersAtThisTime *&gt; TimerQueue;
 
typedef std::pair&lt;TimerQueue::iterator, size_t&gt; TimerLocation;
 
typedef std::map&lt;TimerData *, TimerLocation&gt; HandleMap;
</pre>
<p>became</p>
<pre class="brush: cpp gutter: false">typedef std::deque&lt;TimerData *, CAlloc&lt;TimerData *&gt; &gt; Timers;
 
typedef std::pair&lt;size_t, Timers&gt; TimersAtThisTime;
 
typedef std::map&lt;ULONGLONG, TimersAtThisTime *,
   std::less&lt;ULONGLONG&gt;,
   CAlloc&lt;std::pair&lt;ULONGLONG, TimersAtThisTime *&gt; &gt; &gt; TimerQueue;
 
typedef std::pair&lt;TimerQueue::iterator, size_t&gt; TimerLocation;
 
typedef std::map&lt;TimerData *, TimerLocation,
   std::less&lt;TimerData *&gt;,
   CAlloc&lt;std::pair&lt;TimerData *, TimerLocation&gt; &gt; &gt; HandleMap;
</pre>
<p>And I had to add some allocators and a private heap to the class.</p>
<pre class="brush: cpp gutter: false">CSmartHeapHandle m_heap;
 
CAlloc&lt;TimerData *&gt; m_timersAllocator;
 
CAlloc&lt;std::pair&lt;ULONGLONG, TimersAtThisTime *&gt; &gt; m_timerQueueAllocator;
 
CAlloc&lt;std::pair&lt;TimerData *, TimerLocation&gt; &gt; m_handleMapAllocator;</pre><br />
And then adjust the constructors to make use of all of this:<br />
<pre class="brush: cpp gutter: false">CCallbackTimerQueueBase::CCallbackTimerQueueBase()
   :  m_heap(::HeapCreate(HEAP_NO_SERIALIZE, 0,0)),
      m_timersAllocator(m_heap),
      m_timerQueueAllocator(m_heap),
      m_handleMapAllocator(m_heap),
      m_queue(std::less&lt;ULONGLONG&gt;(), m_timerQueueAllocator),
      m_handleMap(std::less&lt;TimerData *&gt;(), m_handleMapAllocator),
      m_monitor(s_monitor),
      m_maxTimeout(s_timeoutMax),
      m_handlingTimeouts(InvalidTimeoutHandleValue)
{
   if (!m_heap.IsValid())
   {
      throw CException(_T("CCallbackTimerQueueBase::CCallbackTimerQueueBase()"), _T("Failed to create private heap"));
   }
}
</pre>
<p>Note that since we are taking responsibility for locking around access to the heap we can tell <code>HeapCreate()</code> not to bother locking internally with the <code>HEAP_NO_SERIALIZE</code> flag.</p>
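<p>For anyone who hasn't written an STL allocator before, the following is a minimal sketch of the general shape, not the <code>CAlloc</code> from the download. It assumes a C++11 or later compiler, and <code>PrivateHeap</code>, <code>HeapAllocator</code> and <code>FillContainers()</code> are hypothetical names. The heap here simply forwards to <code>malloc()</code> and counts calls so that the sketch is portable; a real version would call <code>HeapAlloc()</code> and <code>HeapFree()</code> on the private heap handle instead.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <deque>
#include <functional>
#include <map>
#include <new>
#include <utility>

// Hypothetical stand-in for a private heap. It forwards to malloc/free and
// counts calls so that we can see which allocations went through it.
struct PrivateHeap
{
   PrivateHeap() : allocations(0), live(0) {}

   void *Allocate(const std::size_t bytes)
   {
      void *p = std::malloc(bytes);
      if (!p) throw std::bad_alloc();
      ++allocations;
      ++live;
      return p;
   }

   void Free(void *p)
   {
      std::free(p);
      --live;
   }

   std::size_t allocations;   // total allocations made
   std::size_t live;          // allocations not yet freed
};

// Minimal allocator; std::allocator_traits supplies everything else.
template <typename T>
class HeapAllocator
{
   public :
      typedef T value_type;

      explicit HeapAllocator(PrivateHeap &heap) : m_pHeap(&heap) {}

      // Lets a container rebind the allocator to its internal node types.
      template <typename U>
      HeapAllocator(const HeapAllocator<U> &rhs) : m_pHeap(rhs.Heap()) {}

      T *allocate(const std::size_t n)
      {
         return static_cast<T *>(m_pHeap->Allocate(n * sizeof(T)));
      }

      void deallocate(T *p, std::size_t) { m_pHeap->Free(p); }

      PrivateHeap *Heap() const { return m_pHeap; }

   private :
      PrivateHeap *m_pHeap;
};

// Allocators that share a heap can free each other's memory.
template <typename T, typename U>
bool operator==(const HeapAllocator<T> &a, const HeapAllocator<U> &b)
{
   return a.Heap() == b.Heap();
}

template <typename T, typename U>
bool operator!=(const HeapAllocator<T> &a, const HeapAllocator<U> &b)
{
   return !(a == b);
}

typedef std::deque<int, HeapAllocator<int> > Timers;

typedef std::map<unsigned long long, int,
   std::less<unsigned long long>,
   HeapAllocator<std::pair<const unsigned long long, int> > > TimerQueue;

// Fills both containers from one heap and returns the allocation count.
std::size_t FillContainers(PrivateHeap &heap)
{
   HeapAllocator<int> timersAllocator(heap);
   HeapAllocator<std::pair<const unsigned long long, int> > queueAllocator(heap);

   Timers timers(timersAllocator);
   TimerQueue queue(std::less<unsigned long long>(), queueAllocator);

   for (int i = 0; i < 100; ++i)
   {
      timers.push_back(i);
      queue[static_cast<unsigned long long>(i)] = i;
   }

   return heap.allocations;
}
```

<p>Calling <code>FillContainers()</code> with a fresh <code>PrivateHeap</code> should leave <code>live</code> at zero once the containers have been destroyed. Note that the sketch uses <code>std::pair&lt;const Key, Value&gt;</code> for the map's allocator as some standard library implementations insist that the allocator's value type matches the container's exactly.</p>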

<p>Unfortunately this makes performance worse, though arguably it has reduced contention. The problem is that <code>HeapAlloc()</code> isn't as efficient as the standard implementation of <code>new</code> and so whilst we've reduced contention we've also reduced overall performance. Not good.</p>

<p>I did some research on high performance memory allocators and decided that <a href="http://www.malloc.de/en/">PTMalloc</a> was a good fit for what I needed. PTMalloc supports separate heaps by what it terms "malloc spaces" or <code>mspace</code> and it supports multi-threaded use where you're responsible for locking. I wrapped the code in one of my library projects and created some helper code so that it integrated more easily with the rest of my code.</p>

<p>A new STL allocator implementation can then allocate and deallocate from a PTMalloc <code>mspace</code> rather than from a heap created with <code>HeapCreate()</code>. The results were good, faster than the original <code>new</code> implementation and, due to the private heap, the contention of the timer queue was reduced to <b>C(n threads using the queue)</b>.</p>

<p>The STL allocator isn't the only place that dynamic memory is being allocated though, we're also allocating a timer handle when we create a timer and in the timer queue we need to allocate the various structures that help us build and manage our queue. If these continue to use the standard program heap then our worst possible contention is still <b>C(n threads using the program heap)</b> rather than <b>C(n threads using the queue)</b>.</p>

<p>Providing custom allocation and deallocation code that uses the PTMalloc private heap for the other memory allocations both deals with the contention and boosts performance.</p>

<p>In the case of the <b>TimerData</b> object we can add a placement new implementation for it that uses our private heap. For the simpler memory objects we allocate and construct them manually using the private heap's allocator.</p>
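<p>As a rough illustration of the idea, here's a sketch of a class-specific placement <code>operator new</code> that takes its memory from a private heap. This isn't the code from the download; <code>PrivateHeap</code>, the <code>Destroy()</code> helper and this cut down <code>TimerData</code> are all hypothetical, and the heap just wraps <code>malloc()</code> so that the sketch stands alone.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical private heap; a real one would wrap an mspace or a heap handle.
struct PrivateHeap
{
   PrivateHeap() : live(0) {}

   void *Allocate(const std::size_t bytes)
   {
      void *p = std::malloc(bytes);
      if (!p) throw std::bad_alloc();
      ++live;
      return p;
   }

   void Free(void *p)
   {
      std::free(p);
      --live;
   }

   std::size_t live;   // allocations not yet freed
};

class TimerData
{
   public :
      explicit TimerData(const int userData) : m_userData(userData) {}

      // Placement form, used as "new (heap) TimerData(...)".
      static void *operator new(const std::size_t bytes, PrivateHeap &heap)
      {
         return heap.Allocate(bytes);
      }

      // Matching placement delete; only called if the constructor throws.
      static void operator delete(void *p, PrivateHeap &heap)
      {
         heap.Free(p);
      }

      // A plain "delete p" can't pass the heap, so destroy explicitly.
      static void Destroy(TimerData *pData, PrivateHeap &heap)
      {
         pData->~TimerData();
         heap.Free(pData);
      }

      int UserData() const { return m_userData; }

   private :
      int m_userData;
};
```

<p>With this in place <code>new (heap) TimerData(42)</code> allocates from the private heap and <code>TimerData::Destroy(pData, heap)</code> gives the memory back. Declaring the operators inside the class also hides the global <code>operator new</code>, so an accidental <code>new TimerData(42)</code> that would use the program heap fails to compile.</p>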

<p>On my system the performance tests show some pretty nice improvements for the timer queues. Every operation is faster. Timer creation is down from 60ms per 100,000 to 40ms. Setting timers down from 130ms to 100ms (again per 100,000) and timer handling down a little.</p>

<p>Of course the allocation and STL allocator changes can also be applied to the timer wheel, albeit with lesser results as the only STL collection that it uses is for timer handle validation and the only dynamically allocated data is the timer handle.</p>

<p>The code for the STL allocator using <code>HeapAlloc()</code> can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-30a.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-30a.zip']);">here</a>.</p>

<p>The code for the STL allocator using PTMalloc can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-30b.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-30b.zip']);">here</a>.</p>

<p>And the code for all allocations using PTMalloc can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-30c.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-30c.zip']);">here</a>.</p>

<p>Please note that the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.</p>

There's still scope for improvement. As I've mentioned before (<a href="http://www.lenholgate.com/archives/000907.html">here</a> and <a href="http://www.lenholgate.com/archives/000920.html">here</a>), the STL containers are not intrusive and so memory must be allocated for each item placed in them. Old school intrusive containers wouldn't require memory allocation and release at all and so should improve performance somewhat. What's more, a custom designed intrusive multi-map could allow for the "remove all entries that match this key" operation which I'm currently fudging using more dynamically allocated structures...]]>
    </content>
</entry>

<entry>
    <title>Practical Testing: 29 - Fixing the timer wheel</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/09/practical-testing-29---fixing-the-timer-wheel.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.978</id>

    <published>2010-09-09T07:41:33Z</published>
    <updated>2011-01-02T15:24:27Z</updated>

    <summary>Previously on &quot;Practical Testing&quot;... I&apos;m writing a timer wheel which matches the interface used by my timer queue. This new implementation is designed for a particular usage scenario with the intention of trading space for speed and improving performance of...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[Previously on <a href="http://www.lenholgate.com/archives/000306.html">"Practical Testing"</a>...  I'm writing a <a href="http://www.lenholgate.com/archives/000909.html">timer wheel</a> which matches the interface used by my timer queue. This  new implementation is designed for a particular usage scenario with the intention of trading space for speed and improving performance of some reliable UDP code. The last entry completed the development of the timer wheel. This time we fix a couple of the bugs that I've discovered since I started to integrate the code with the system that it was developed for.]]>
        <![CDATA[<p>I try to be pragmatic with my testing. I know that I could write tests for the rest of my life and never prove that the code under test was 100% correct and so I try and write the smallest number of tests that give the largest amount of validation and confidence in the code. Because of this I quite expect to discover bugs in my code and find that I need to add a new test that reproduces the bug before I then fix the bug and the test passes.</p>

<p>That said, I still should have included a test for these particular bugs in my original test suite for the timer wheel; it's a fairly obvious hole in the testing. Both bugs are fairly serious, and fairly simple. Luckily once I integrated the code into the real application the problems showed up quickly and regularly; this meant that it was easy to track them down and once I had an idea of what the likely problems were it was easy to put together some tests that showed the problems.</p>

<p>The first bug is in how we handle calculating the next timeout when that timer has wrapped and is before our current position in the timer wheel rather than after it. None of the existing tests put the code in this position, but real usage caused it almost immediately. </p>

<p><img alt="TimerWheel-4.png" src="http://www.lenholgate.com/blog/images/TimerWheel-4.png" width="434" height="301" border="0" /></p>

<p>The calculation for this situation was wrong, it was this:</p>
<pre class="brush: cpp gutter: false">nextTimeout = static_cast&lt;Milliseconds&gt;((
   (m_pFirstTimerSetHint &gt; m_pNow ?
      (m_pFirstTimerSetHint - m_pNow) :
      (m_pNow - m_pFirstTimerSetHint)) + 1) * m_timerGranularity);
</pre>
<p>and it should be this:</p>
<pre class="brush: cpp gutter: false">nextTimeout = static_cast&lt;Milliseconds&gt;((
   (m_pFirstTimerSetHint &gt;= m_pNow ?
      (m_pFirstTimerSetHint - m_pNow) :
      (m_pTimersEnd - m_pNow + m_pFirstTimerSetHint - m_pTimersStart)) + 1) * m_timerGranularity);
</pre>

<p>The original calculation caused the timer wheel to report an incorrect next timeout which caused timeout handling to stall until our first timer was larger than "now" again....</p>
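<p>To see the difference with some numbers, here's a small standalone sketch of the two calculations. The function names are hypothetical and I'm assuming that the end pointer is one past the last slot in the wheel, which is what the corrected calculation implies.</p>

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned long Milliseconds;

// Slots from 'now' forwards to 'hint', wrapping at the end of the wheel.
// pEnd is assumed to point one past the last slot.
std::ptrdiff_t SlotsUntil(
   const int *pStart,
   const int *pEnd,
   const int *pNow,
   const int *pHint)
{
   return pHint >= pNow ?
      (pHint - pNow) :
      (pEnd - pNow) + (pHint - pStart);
}

// The corrected calculation.
Milliseconds NextTimeout(
   const int *pStart,
   const int *pEnd,
   const int *pNow,
   const int *pHint,
   const Milliseconds granularity)
{
   return static_cast<Milliseconds>(
      (SlotsUntil(pStart, pEnd, pNow, pHint) + 1) * granularity);
}

// The original calculation; it measures backwards when the hint has wrapped.
Milliseconds BuggyNextTimeout(
   const int *pNow,
   const int *pHint,
   const Milliseconds granularity)
{
   return static_cast<Milliseconds>(
      ((pHint > pNow ? (pHint - pNow) : (pNow - pHint)) + 1) * granularity);
}
```

<p>With a 10 slot wheel, a granularity of 15ms, "now" at slot 8 and the first timer at slot 1, the timer is 3 slots ahead once we wrap around the end of the wheel and so the corrected code gives (3 + 1) * 15 = 60ms. The original code measures the 7 slots back from "now" to the timer and gives (7 + 1) * 15 = 120ms.</p>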

<p>The second bug is slightly less serious but occurs if there are no timers set and we're using the <code>HandleTimeouts()</code> call for timeout processing. The timer wheel's view of the current time is updated during timer processing. If there are no timers set then the timer processing loop is skipped inside of <code>HandleTimeouts()</code> and the wheel's view of the current time begins to lag. This progressively reduces the value of the timeout that you can set with <code>SetTimer()</code>. The fix is to have <code>SetTimer()</code> reset the wheel if no timers are currently set. In this situation it's safe to set the wheel to its initial state before setting a new timer. The fix is pretty simple, we just add this:</p>
<pre class="brush: cpp gutter: false">Milliseconds CCallbackTimerWheel::CalculateTimeout(
   const Milliseconds timeout)
{
   const Milliseconds now = m_tickCountProvider.GetTickCount();
  
   if (m_numTimersSet == 0)
   {
      m_currentTime = now;
  
      m_pNow = m_pTimersStart;
   }
  
   const Milliseconds actualTimeout = timeout + (now - m_currentTime);
  
   if (actualTimeout &gt; m_maximumTimeout)
   {
      throw CException(
         _T("CCallbackTimerWheel::CalculateTimeout()"),
         _T("Timeout is too long. Max is: ") +
         ToString(m_maximumTimeout) +
         _T(" tried to set: ") + ToString(actualTimeout) +
         _T(" (") + ToString(timeout) + _T(")"));
   }
  
   return actualTimeout;
}
</pre>

<p>Where the fix is the code inside of the <code>if (m_numTimersSet == 0)</code> block.</p>

<p>I've also renamed the <code>#define</code> that's used to enable monitoring; the previous name seemed a little back to front...</p>

The code can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-29.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-29.zip']);">here</a> and the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.]]>
    </content>
</entry>

<entry>
    <title>Practical Testing: 28 - Finishing the timer wheel</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/08/practical-testing-28---finishing-the-timer-wheel.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.973</id>

    <published>2010-08-04T11:54:16Z</published>
    <updated>2011-01-02T15:19:53Z</updated>

    <summary>Previously on &quot;Practical Testing&quot;... I&apos;m writing a timer wheel which matches the interface used by my timer queue. This new implementation is designed for a particular usage scenario with the intention of trading space for speed and improving performance of...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>Previously on <a href="http://www.lenholgate.com/archives/000306.html">"Practical Testing"</a>...  I'm writing a <a href="http://www.lenholgate.com/archives/000909.html">timer wheel</a> which matches the interface used by my timer queue. This  new implementation is designed for a particular usage scenario with the intention of trading space for speed and improving performance of some reliable UDP code.</p>

<p>Over the last four entries I've implemented various parts of the timer wheel and adjusted the test code so that I could reuse the tests that I already had for the other implementations with my timer wheel. The tests needed to be tweaked quite a bit to take into account the different behavioural characteristics of the wheel and the queues, this was accomplished using traits which determine the detail of how the class under test interacts with its service providers (mainly a tick count provider).</p>

Today we finally get to the point where we have a working timer wheel that is compatible with the interface used by the two timer queues. We can then look at the results of the performance tests and work out where we need to go next.]]>
        <![CDATA[<p>The wheel that was presented in the <a href="http://www.lenholgate.com/archives/000918.html">previous entry</a> is mostly complete. In fact only three functions are left to implement and these three functions implement a single feature; the handling of timeouts without needing to hold a lock on the wheel during timeout dispatch, this functionality makes it impossible to deadlock due to lock inversions which involve the timer wheel. I described the changes that were required when I added this functionality to the timer queue <a href="http://www.lenholgate.com/archives/000795.html">here</a>, the changes required for the timer wheel are similar.</p>

<p>The main change is in how we store the timer data so that we can be dispatching an expired timer and setting, cancelling or destroying the same timer during the dispatch. We need to allow for this because the lock that should be held whilst you're updating the timer wheel is not held during timer dispatch. So, rather than holding a single set of timer data within the timer we hold two sets, an active set and a timed out set. When the timeout handling begins the call to <code>BeginTimeoutHandling()</code>, which should be protected by a lock, updates each of the timers to prepare it for timeout dispatch. This means that it needs to walk the list of timers for this time and copy the active set of data to the timed out set and clear the active set. Now when the timer is processed by <code>HandleTimeout()</code> we're working with the timed out set of data which allows the timer to be manipulated normally using the active data. Having to prepare the timers is a bit of a pain, it means that timer dispatch is <b>O(n)</b> where <b>n</b> is the number of timers to be dispatched, however this is the same as for the timer queue and we're avoiding the <b>O(log n)</b> (n being the number of timers currently set in this case) of the balanced tree lookup required for the timer queues...</p>
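<p>The two sets of data can be sketched like this. It's a simplified, single threaded illustration rather than the real <code>TimerData</code>; the callback type, the method names and the <code>RecordTimeout()</code> helper are all hypothetical and the locking is left out.</p>

```cpp
#include <cassert>

typedef void (*TimerCallback)(int userData);   // hypothetical callback type

class TimerData
{
   public :
      TimerData() {}

      // Normal manipulation only ever touches the active set.
      void Set(TimerCallback callback, const int userData)
      {
         m_active.callback = callback;
         m_active.userData = userData;
      }

      bool Cancel()
      {
         const bool wasSet = IsSet();
         m_active = Data();
         return wasSet;
      }

      bool IsSet() const { return m_active.callback != 0; }

      // Done with the lock held: move the active set to the timed out set.
      void PrepareForTimeout()
      {
         m_timedOut = m_active;
         m_active = Data();
      }

      // Done without the lock: only the timed out set is used.
      void DispatchTimeout()
      {
         const Data data = m_timedOut;
         m_timedOut = Data();
         if (data.callback) data.callback(data.userData);
      }

   private :
      struct Data
      {
         Data() : callback(0), userData(0) {}

         TimerCallback callback;
         int userData;
      };

      Data m_active;
      Data m_timedOut;
};

// Records the last user data seen so that the behaviour can be observed.
static int s_lastUserData = -1;

void RecordTimeout(const int userData) { s_lastUserData = userData; }
```

<p>If a timer is set with user data of 7, prepared for timeout and then set again with user data of 9 before it is dispatched, the dispatch still reports 7 and the timer remains set with the new value.</p>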

<p>There's still scope for some refactoring in the code and there's a need for some more tests to make sure that we're doing things sensibly when we fail to handle timeouts for a long period of time but this can be done later. I've added a few new tests to test the timer wheel with the <code>CThreadedCallbackTimerQueue</code>, I expect that could do with a name change now, but that can also wait.</p>

<p>Since we now have a fully functional timer wheel I can compare the performance results with those from the timer queue implementations. Note that these are the results that I get on my development box, the results that you get are likely to be different but the proportional differences should be similar... All test results are an average of 10 runs of the same test with 100,000 timers in use in each test.</p>

<ul>
<li><b>Creating timers</b> - Unsurprisingly, since very similar work is being done by both the queues and the wheel, <code>CreateTimer()</code> is roughly similar with the wheel actually taking fractionally longer at 55ms vs the queues which both take 50ms.</li>
<li><b>Setting timers</b> - Again unsurprisingly, the timer wheel is much faster when setting timers at 8ms compared to 88ms for the queue.  Note that the 8ms value is a best case scenario where we set and reset the same timer, with different timers the times rise to 15ms vs 130ms and with different timers being set for the same times we get 14ms vs 60ms. What doesn't show here is that the wheel is also only <b>C(n wheel users)</b> compared to the queue's <b>C(n queue users)+2C(n heap users)</b> (see <a href="http://www.lenholgate.com/archives/000908.html">here</a> for details of my 'big C' notation for talking about contention).</li>
<li><b>HandleTimeouts</b> - When handling timeouts and holding the lock we get results of 54ms against 94ms for the queues. When not holding the locks the numbers are 57ms and 98ms. Again the wheel has lower contention.</li>
</ul>

<p>All in all I'm pleased with the performance and contention improvements that have come from using a radically different design. The timer wheel isn't as general purpose as the timer queues and it's not going to be a good fit for all of the usage scenarios that I use the queue in but for those situations that it <i>is</i> appropriate for, the performance will be considerably better.</p>

The code can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-28.zip"  onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-28.zip']);">here</a> and the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.]]>
    </content>
</entry>

<entry>
    <title>Practical Testing: 27 - Fixing things...</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/08/practical-testing-27---fixing-things.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.970</id>

    <published>2010-08-03T16:53:50Z</published>
    <updated>2011-01-02T15:16:03Z</updated>

    <summary>Previously on &quot;Practical Testing&quot;... To deal with some specific usage scenarios of a piece of general purpose code I&apos;m in the process of implementing a timer wheel that matches the interface to the timer queue that I previously developed in...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[Previously on <a href="http://www.lenholgate.com/archives/000306.html">"Practical Testing"</a>... To deal with some specific usage scenarios of a piece of general purpose code I'm in the process of implementing a <a href="http://www.lenholgate.com/archives/000909.html">timer wheel</a> that matches the interface to the timer queue that I previously developed in this <a href="http://www.lenholgate.com/archives/000306.html">series of articles</a>. Last time I left myself with a failing test. The problem is that setting a new timer on the timer wheel sets a timer that's relative to the time that the timer wheel thinks is 'now' and the timer wheel's view of the current time could be slightly behind reality; see the <a href="http://www.lenholgate.com/archives/000913.html">previous entry</a> for a diagram that explains the problem.]]>
        <![CDATA[<p>This kind of problem wouldn't exist if the timer wheel was operating on a hard real time system where each tick of the hardware clock caused the timer wheel to 'rotate' and process timers that have expired. Unfortunately since we just have a normal thread to process timers the wheel can get slightly behind reality. There are two ways to solve this problem and both have drawbacks. The first is to cause the wheel to be processed before any new timer is set, this would mean that the wheel is always up to date and therefore the timer insertion would be correct. Unfortunately this leads to timers being handled on any thread that calls <code>SetTimer()</code> which may not be ideal for users of the wheel, it also means that setting a timer is no longer O(1)... The second approach is to simply set the timer based on the current time and allow for the difference between the current time and the timer wheel's view of the current time when the timer is inserted into the wheel. The disadvantage with this approach is that the maximum timeout that can be set will fluctuate with the lag between 'now' and the timer wheel's view of 'now'. You can work around this fluctuation by giving the wheel a maximum timeout that is at least the sum of the actual maximum timeout that you wish to set and the expected lag...</p>

<p>I've taken the second approach as non O(1) timer setting and 'random thread timer dispatch' are not desirable qualities for the usage scenarios that I'm currently targeting. This means that the timer wheel now queries the tick count provider when you call <code>SetTimer()</code> but it's easy to adjust the tests for this due to the test traits that I introduced a while back.</p>
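<p>The effect on the maximum timeout is easy to show with a little arithmetic. This is just a sketch with hypothetical function names; it ignores the timer granularity and any rounding that the real wheel does.</p>

```cpp
#include <cassert>

typedef unsigned long Milliseconds;

// The largest timeout that can be set when the wheel's view of the
// current time lags behind the real time.
Milliseconds LargestSettableTimeout(
   const Milliseconds maximumTimeout,
   const Milliseconds wheelTime,
   const Milliseconds now)
{
   const Milliseconds lag = now - wheelTime;

   return lag >= maximumTimeout ? 0 : maximumTimeout - lag;
}

// The maximum timeout to create the wheel with so that the required
// maximum can still be set when the wheel lags by the expected amount.
Milliseconds WheelMaximumFor(
   const Milliseconds requiredMaximum,
   const Milliseconds expectedLag)
{
   return requiredMaximum + expectedLag;
}
```

<p>So a wheel with a maximum timeout of 1000ms that is 300ms behind can only accept timeouts of up to 700ms, whereas a wheel created with a maximum of 1300ms can still accept the full 1000ms with the same lag.</p>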

<p>Now that <code>SetTimer()</code> works correctly we can move on to implementing the remaining functions. Unfortunately though, before we do that we need to deal with a memory leak bug in the <code>CCallbackTimerQueueBase</code> class which the tests can't detect and which I missed due to <a href="http://www.lenholgate.com/archives/000914.html">not running BoundsChecker</a> after each set of changes... The leak was introduced when I switched from using the <code>std::multimap</code> to using a <code>std::deque</code> back in <a href="http://www.lenholgate.com/archives/000907.html">part 21</a>. Unfortunately I missed out a couple of <code>delete</code> statements to clear up the new structure that we allocate to store in the <code>std::deque</code>  when we set timers. This just goes to show that it doesn't matter how many tests you have and how good your coverage is, it's never enough to prove that the code is without bugs. Running BoundsChecker showed the bug quite clearly but it's a pity that it needs to be a separate stage of testing. Instrumenting memory allocation within the test would help and is something that I might look into... Anyway, the fixes are to add a loop which cleans up the allocated memory in the queue's destructor and to clean up blocks of timers as they expire in <code>HandleTimeouts()</code>.</p>

The code can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-27.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-27.zip']);">here</a> and the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.]]>
    </content>
</entry>

<entry>
    <title>Practical Testing: 26 - More functionality, more refactoring and a new bug</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/07/practical-testing-26---more-functionality-more-refactoring-and-a-new-bug.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.965</id>

    <published>2010-07-23T08:56:01Z</published>
    <updated>2011-01-02T13:45:12Z</updated>

    <summary>Previously on &quot;Practical Testing&quot;... To deal with some specific usage scenarios of a piece of general purpose code I&apos;m in the process of implementing a timer wheel that matches the interface to the timer queue that I previously developed in...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>Previously on <a href="http://www.lenholgate.com/archives/000306.html">"Practical Testing"</a>... To deal with some specific usage scenarios of a piece of general purpose code I'm in the process of implementing a <a href="http://www.lenholgate.com/archives/000909.html">timer wheel</a> that matches the interface to the timer queue that I previously developed in this <a href="http://www.lenholgate.com/archives/000306.html">series of articles</a>. The timer wheel trades space for speed and so far the development has gone well as I've been able to use the tests that I had already developed for the previous implementations to guide the new development.</p>

By the end of <a href="http://www.lenholgate.com/archives/000911.html">last time</a> we'd got to the point where we had four functions left to implement...]]>
        <![CDATA[<p>Today we'll deal with the second style of <code>SetTimer()</code> call. The timers can be set in two ways, the first is useful if you need to repeatedly set and reset a timer, you create a timer handle by calling <code>CreateTimer()</code> and you can then call <code>SetTimer()</code> and <code>CancelTimer()</code> as often as you like. Finally you call <code>DestroyTimer()</code> when you're done with your timer. The second style is for when you simply want to 'fire and forget' a timer. You simply call the second variation of <code>SetTimer()</code> and this creates a timer for you, sets it and destroys it once the timer has timed out or when the timer system is destroyed. This second style of timer makes it easy for the caller and slightly more complex for the timer system since there are now timers that need to be cleaned up once they expire. However, we need to deal with this kind of thing anyway as a timer could be destroyed during timeout processing, or in the gap between a begin/handle/end sequence of timer handling.</p>

<p>Whilst adjusting the tests to make sure they took into account the timer wheel's traits and generally worked correctly with the new implementation I decided that although I like my tests rigid (some would say <a href="http://www.lenholgate.com/archives/000910.html">brittle</a>, or fragile), I'd gone slightly too far with the logging coming out of the tick count providers. These were reporting the value of the tick count that was being provided as well as the fact that the call was made. Now in some of my tests this is useful but here the test was clearly setting the value so logging it was of no use and made the tests more complex in the presence of variable timer granularities. Adjusting the mocks and the tests makes them a bit cleaner.</p>

<p>The tests also needed adjusting now that all of the code can be built with monitoring enabled. The <a href="http://www.lenholgate.com/archives/000910.html">traits</a> work pretty well for this and I'm happy with the results.</p>

<p>There's still quite a bit of duplicate code in the tests and the code to create and set a timer is one piece that's easy to slim down by using the helper function that is used by some but not all of the tests. Only the tests that are actually for <code>SetTimer()</code> need to do it longhand to make it clear what we're actually testing.</p>

<p>Since we now have timers that can delete themselves and since I <a href="http://www.lenholgate.com/archives/000906.html">recently added to the timer monitoring interface</a> to allow us to ensure that all timers were always cleaned up it seems about the right time to add monitoring interface support to the timer wheel.</p>

<p>The resulting changes to implement the second <code>SetTimer()</code> overload are as follows, note that I've adjusted <code>CreateTimer()</code> so that the common code that I need to call when I create a timer in <code>SetTimer()</code> isn't duplicated. The timer data constructor that we're using there sets the timer up appropriately for single use mode.</p>
<pre class="brush: cpp gutter: false">CCallbackTimerWheel::Handle CCallbackTimerWheel::CreateTimer()
{
   TimerData *pData = new TimerData();
  
   return OnTimerCreated(pData);
}

CCallbackTimerWheel::Handle CCallbackTimerWheel::OnTimerCreated(
   TimerData *pData)
{
   m_handles.insert(pData);
  
#if (JETBYTE_PERF_TIMER_WHEEL_MONITORING_DISABLED == 0)
  
   m_monitor.OnTimerCreated();
  
#endif
  
   return reinterpret_cast&lt;Handle&gt;(pData);
}

bool CCallbackTimerWheel::SetTimer(
   const Handle &amp;handle,
   Timer &amp;timer,
   const Milliseconds timeout,
   const UserData userData)
{
   if (timeout &gt; m_maximumTimeout)
   {
      throw CException(
         _T("CCallbackTimerWheel::SetTimer()"), 
         _T("Timeout is too long. Max is: ") + ToString(m_maximumTimeout) + _T(" tried to set: ") + ToString(timeout));
   }
  
   TimerData &amp;data = ValidateHandle(handle);
  
   const bool wasPending = data.CancelTimer();
  
   data.UpdateData(timer, userData);
  
   InsertTimer(timeout, data, wasPending);
  
#if (JETBYTE_PERF_TIMER_WHEEL_MONITORING_DISABLED == 0)
  
   m_monitor.OnTimerSet(wasPending);
  
#endif
  
   return wasPending;
}
  
void CCallbackTimerWheel::SetTimer(
   IQueueTimers::Timer &amp;timer,
   const Milliseconds timeout,
   const IQueueTimers::UserData userData)
{
   if (timeout &gt; m_maximumTimeout)
   {
      throw CException(
         _T("CCallbackTimerWheel::SetTimer()"), 
         _T("Timeout is too long. Max is: ") + ToString(m_maximumTimeout) + _T(" tried to set: ") + ToString(timeout));
   }
  
   TimerData *pData = new TimerData(timer, userData);
  
   OnTimerCreated(pData);
  
   InsertTimer(timeout, *pData);
  
#if (JETBYTE_PERF_TIMER_WHEEL_MONITORING_DISABLED == 0)
  
   m_monitor.OnOneOffTimerSet();
  
#endif
}
</pre>
<p>So that we can delete these "one shot" timers when the timer wheel is destroyed we need to keep the handle in the handle map, this does, however, open a hole in our handle validation code as someone could pass in a random invalid handle value that matches one of the "one shot" timers and convince the timer wheel that the handle is valid. To prevent this we now also check that the handle isn't scheduled to be deleted after the timer expires, the only handles in the handle map that will be set like this are the "one shot" timers and these, by definition, don't have a valid handle that you can manipulate.</p>
<pre class="brush: cpp gutter: false">CCallbackTimerWheel::TimerData &amp;CCallbackTimerWheel::ValidateHandle(
   const Handle &amp;handle) const
{
   TimerData *pData = reinterpret_cast&lt;TimerData *&gt;(handle);
  
   Handles::const_iterator it = m_handles.find(pData);
  
   if (it == m_handles.end())
   {
      throw CException(
         _T("CCallbackTimerWheel::ValidateHandle()"), 
         _T("Invalid timer handle: ") + ToString(handle));
   }
  
   if (pData-&gt;DeleteAfterTimeout())
   {
      throw CException(
         _T("CCallbackTimerWheel::ValidateHandle()"), 
         _T("Invalid timer handle: ") + ToString(handle));
   }
  
   return *pData;
}
</pre>
<p>With all of that done and with some more tests passing I'm left with just the <code>BeginTimeoutHandling()</code>, <code>HandleTimeout()</code>,  <code>EndTimeoutHandling()</code> code to implement. Unfortunately there's a bug in our timer setting code for the timer wheel and the existing tests don't catch it.</p>

<p><img alt="TimerWheel-3.png" src="http://www.lenholgate.com/blog/images/TimerWheel-3.png" width="434" height="301" border="0" /></p>

<p>Let's assume that we're in the situation shown above and we set a timer. The timer wheel has its current time set to 35ms before the actual time because timeouts haven't been handled yet. At present, if we set a timer for 10ms, the timer will be set at the point marked as 30 on the diagram above rather than at the point marked 65. The existing tests don't show this problem as they all set timers for a 'now' that is the same as the time the wheel was created or just after timeouts have been handled, which means that current always equals now when those tests set timers. A new test that sets a timer with now != current clearly shows the problem. I'll leave this broken test to show me the way for next time.</p>
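<p>A minimal sketch of the arithmetic involved, rather than the eventual fix: the slot for a new timer has to allow for the gap between the actual time and the wheel's view of the current time. The function names and the plain integer <code>Milliseconds</code> type are assumptions made for illustration, and wrap-around at the end of the wheel is ignored.</p>

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned long Milliseconds;

// Naive calculation: treats the wheel's current time as if it were 'now'.
std::size_t NaiveSlotOffset(
   const Milliseconds timeout,
   const Milliseconds granularity)
{
   assert(granularity != 0);

   return static_cast<std::size_t>(timeout / granularity);
}

// Adjusted calculation: adds the time that has passed since the wheel last
// handled timeouts, so that the timer lands relative to the actual time.
std::size_t AdjustedSlotOffset(
   const Milliseconds currentTime,   // the wheel's view of the time
   const Milliseconds now,           // the actual time
   const Milliseconds timeout,
   const Milliseconds granularity)
{
   assert(granularity != 0);
   assert(now >= currentTime);

   return static_cast<std::size_t>(((now - currentTime) + timeout) / granularity);
}
```

<p>Assuming the 5ms granularity of the earlier diagrams, a 10ms timeout gives a naive offset of 2 slots, whereas with a 35ms lag the adjusted offset is 9 slots. The difference is 7 slots, or 35ms, which matches the gap between the points marked 30 and 65.</p>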

The code can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-26.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-26.zip']);">here</a> and the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.]]>
    </content>
</entry>

<entry>
    <title>Practical Testing: 25 - Nothing is free</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/07/practical-testing-25---nothing-is-free.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.963</id>

    <published>2010-07-21T13:13:57Z</published>
    <updated>2011-01-02T13:41:50Z</updated>

    <summary>I&apos;m in the process of implementing a timer wheel that matches the interface to the timer queue that I previously developed in this series of articles. The idea being that for certain specific usage scenarios the timer wheel will perform...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>I'm in the process of implementing a <a href="http://www.lenholgate.com/archives/000909.html">timer wheel</a> that matches the interface to the timer queue that I previously developed in this <a href="http://www.lenholgate.com/archives/000306.html">series of articles</a>. The idea being that for certain specific usage scenarios the timer wheel will perform better than the timer queues. Last time I refactored the tests that I was using for the timer queues to remove duplication and I now have a set of failing tests for the new timer wheel. </p>

As soon as I started to look at making some of the failing tests pass I realised that having a heap of failing tests wasn't such a good idea, at least with my home brew test framework. I had stubbed out the timer wheel's interface and had decided to throw exceptions from the functions that weren't implemented yet. Those exceptions caused the tests that used that functionality to fail; so far so good. Unfortunately there was no differentiation between tests that I knew would fail and tests that just happened to be failing; I discovered this when I realised that some of the failing tests were for the timer queues and not the wheel... Switching the exception thrown to one of my testing exceptions, a "test skipped" exception, means that I now have a load of timer wheel tests that are skipped due to lack of implementation code and the remaining test failures are clearly real failures. Once the real failures were fixed I could move on with the new code.]]>
        <![CDATA[<p>The great thing about a timer wheel is that it has O(1) performance for setting a timer: simply index into the array and push the new timer onto the head of the list of other timers at this time. It also has O(1) timer cancellation; each timer knows where it is in the wheel and can unset itself directly. For many uses timeout handling can be O(1) as well; if your timer handling code runs from a hardware timer tick then each tick moves the wheel forward by one slot and expires the timers that are present there. My usage is a little more complicated in that I need to be able to query the wheel for the time when the next timer is due. To do that I need to scan the wheel from 'now' forward until I find a timer... Our worst case is O(n slots) where the number of slots is determined by the maximum supported timeout and the timer granularity.</p>

<p><img alt="TimerWheel-1.png" src="http://www.lenholgate.com/blog/images/TimerWheel-1.png" width="434" height="301" border="0" /></p>

<p>In the diagram above, if the timer wheel's current time is 0 then we would need to scan forward sequentially from 'start' to the first timer that is set at position 30 to determine that the first timer is set at 30...</p>

<p>It's possible to optimise this. We could manage a 'hint' which points to the earliest timer that has been set, discovering the next timeout could then be O(1) if the hint is set. The hint could be managed by our calls to <code>SetTimer()</code>, if the timer we're setting is earlier than the hint, or if it's the first timer set then we set the hint to point at it. Unfortunately this scheme falls down in the presence of timer cancellation. If you cancel the earliest timer then you need to scan forward to update the hint, this forward scan is potentially O(n); so now your cancellation has gone from O(1) to O(n) to keep your timeout processing at O(1)... In some usage scenarios this might be acceptable except that, of course, expiring a timer or setting a timer that is already set are also forms of cancellation...</p>
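<p>A minimal sketch of why cancellation spoils the hint. Everything here is hypothetical and heavily simplified: each slot just counts the timers set in it, slot 0 is always 'now' so there is no wrap-around, and <code>CHintedSlots</code> is not part of the timer wheel.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

const std::size_t NoHint = static_cast<std::size_t>(-1);

class CHintedSlots
{
   public :

      explicit CHintedSlots(const std::size_t numSlots)
         :  m_slots(numSlots, 0),
            m_hint(NoHint),
            m_slotsScanned(0)
      {
      }

      // O(1): keeping the hint up to date only needs a single comparison.
      void Set(const std::size_t slot)
      {
         assert(slot < m_slots.size());

         ++m_slots[slot];

         if (m_hint == NoHint || slot < m_hint)
         {
            m_hint = slot;
         }
      }

      // O(1) unless we empty the hinted slot, in which case we have to scan
      // forwards, potentially over every slot, to find the new earliest timer.
      void Cancel(const std::size_t slot)
      {
         assert(slot < m_slots.size() && m_slots[slot] != 0);

         --m_slots[slot];

         if (slot == m_hint && m_slots[slot] == 0)
         {
            m_hint = NoHint;

            for (std::size_t i = slot + 1; m_hint == NoHint && i < m_slots.size(); ++i)
            {
               ++m_slotsScanned;

               if (m_slots[i] != 0)
               {
                  m_hint = i;
               }
            }
         }
      }

      std::size_t Hint() const { return m_hint; }

      std::size_t SlotsScanned() const { return m_slotsScanned; }

   private :

      std::vector<unsigned int> m_slots;

      std::size_t m_hint;

      std::size_t m_slotsScanned;
};
```

<p>With timers set in slots 30 and 99 of a 100 slot wheel, cancelling the timer in slot 30 forces a scan of 69 slots before the hint can be moved to slot 99.</p>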

<p>For now we'll avoid all of this complexity and settle for O(n) next timeout calculation. We will, however, mitigate the worst case and add a counter that counts how many timers are currently set, if the counter is zero then there's no need to scan the whole array to discover that no timer is set; our worst case is now that only one timer is set and it's set to the maximum timeout value. Likewise we can keep a hint that can be passed from one call to <code>GetNextTimeout()</code> to the next as long as the hint is zeroed upon any timer changes.</p>
<pre class="brush: cpp gutter: false">Milliseconds CCallbackTimerWheel::GetNextTimeout()
{
   Milliseconds nextTimeout = INFINITE;
  
   // We need to work out the time difference between now and the first timer that is set. 
  
   if (!m_pFirstTimerSetHint)
   {
      m_pFirstTimerSetHint = GetFirstTimerSet(); 
   }
  
   if (m_pFirstTimerSetHint)
   {
      // A timer is set! Calculate the timeout in ms
  
      nextTimeout = static_cast&lt;Milliseconds&gt;((
         (m_pFirstTimerSetHint &gt; m_pNow ? 
            (m_pFirstTimerSetHint - m_pNow) : 
            (m_pNow - m_pFirstTimerSetHint)) + 1) * m_timerGranularity);
  
      const Milliseconds now = m_tickCountProvider.GetTickCount();
  
      if (now != m_currentTime)
      {
         // Time has moved on, adjust the next timeout to take into account the difference between now and 
         // the timer wheel's view of the current time...
  
         const Milliseconds timeDiff = (now &gt; m_currentTime ? now - m_currentTime : m_currentTime - now);
  
         if (timeDiff &gt; nextTimeout)
         {
            nextTimeout = 0;
         }
         else
         {
            nextTimeout -= timeDiff;
         }
      }
   }
  
   return nextTimeout;
}
  
CCallbackTimerWheel::TimerData **CCallbackTimerWheel::GetFirstTimerSet() const
{
   TimerData **pFirstTimer = 0;
  
   if (m_numTimersSet != 0)
   {
      // Scan forwards from now to the end of the array...
  
      for (TimerData **p = m_pNow; !pFirstTimer &amp;&amp; p &lt; m_pTimersEnd; ++p)
      {
         if (*p)
         {
            pFirstTimer = p;
         }
      }
  
      if (!pFirstTimer)
      {
         // We haven't yet found our first timer, now scan from the start of the array to 
         // now...
  
         for (TimerData **p = m_pTimersStart; !pFirstTimer &amp;&amp; p &lt; m_pNow; ++p)
         {
            if (*p)
            {
               pFirstTimer = p;
            }
         }
      }
  
      if (!pFirstTimer)
      {
         throw CException(_T("CCallbackTimerWheel::GetFirstTimerSet()"),
            _T("Unexpected, no timer set but count = ") +
            ToString(m_numTimersSet));
      }
   }
  
   return pFirstTimer;
}
</pre>
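<p>Because the array is circular, the first timer found can sit behind 'now' in the array, in which case the forward distance to it goes via the end of the array. A minimal sketch of that calculation, using indices rather than the pointers used above; <code>SlotsUntil</code> and its parameters are hypothetical names and this is not code from the timer wheel.</p>

```cpp
#include <cassert>
#include <cstddef>

// Number of slots that must be stepped over to get from 'now' to 'target'
// on a circular array of 'numSlots' slots, always moving forwards.
std::size_t SlotsUntil(
   const std::size_t now,
   const std::size_t target,
   const std::size_t numSlots)
{
   assert(now < numSlots && target < numSlots);

   if (target >= now)
   {
      return target - now;
   }

   // The target has wrapped: go to the end of the array and then on from
   // the start of the array to the target.

   return (numSlots - now) + target;
}
```

<p>On a wheel of 20 slots, a timer in slot 6 is 4 slots ahead of a 'now' at slot 2, whilst a timer in slot 2 is 12 slots ahead of a 'now' at slot 10.</p>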

<p>Now that we can work out when the next timeout is due we can start to think about handling the timers when they expire. Given the diagram below, if the timer wheel believes that the current time is as indicated and the timers are then expired when the time is at 'now', we will need to process all of the timers that are set in the order shown by their index numbers.</p>

<p><img alt="TimerWheel-3.png" src="http://www.lenholgate.com/blog/images/TimerWheel-3.png" width="434" height="301" border="0" /></p>

<p>Again, if we were driving the wheel from a timer tick then things are simplified as we would only ever 'rotate' the wheel by one slot at a time. In the world of general purpose, multi-threaded, non-real time systems though (is that a big enough proviso?) all manner of reasons might mean that we don't actually get to process the timers until after they're due.</p>

<p>If all we need to worry about is processing the timers in sequence then we could step along the wheel and then walk each chain of timers and handle them as we go. It could be a little more complex than that if we want to use the <code>BeginTimeoutHandling()</code>, <code>HandleTimeouts()</code>, <code>EndTimeoutHandling()</code> methods to allow us to process timers without holding our lock on the timer system whilst the timers are dispatched (I talk about why this is a desirable design <a href="http://www.lenholgate.com/archives/000795.html">here</a>, and why needing to go through the begin, handle, end sequence multiple times to process timers is less than ideal <a href="http://www.lenholgate.com/archives/000907.html">here</a>). Ideally, for the latter situation we'd want our 'begin' to accumulate all 6 timers into a correctly ordered list and remove them from the wheel. We would then unlock the wheel and process the 6 timers in order before locking the wheel again to update the processed timers inside of <code>EndTimeoutHandling()</code>. Doing it this way would mean traversing each slot's list of timers to get to the last one so that we can link the next list onto the end of the previous lists...</p>
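<p>A minimal sketch of that accumulation step, assuming a cut-down, singly linked node and a wheel held in a <code>std::vector</code>. <code>Node</code> and <code>AccumulateTimers()</code> are hypothetical and exist only to show the traversal cost; this is not how the timer wheel does it.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, cut-down timer node; the real timer data holds more than this.
struct Node
{
   explicit Node(const int idIn) : id(idIn), pNext(0) {}

   int id;

   Node *pNext;
};

// Removes the chains from 'count' slots, starting at 'first' and wrapping at
// the end of the wheel, and links them into a single list in expiry order.
Node *AccumulateTimers(
   std::vector<Node *> &wheel,
   const std::size_t first,
   const std::size_t count)
{
   assert(first < wheel.size() && count <= wheel.size());

   Node *pHead = 0;

   Node **ppTail = &pHead;       // where the next chain will be linked

   for (std::size_t i = 0; i < count; ++i)
   {
      Node *&pSlot = wheel[(first + i) % wheel.size()];

      *ppTail = pSlot;           // link this slot's chain onto the end

      pSlot = 0;                 // the wheel no longer holds these timers

      // Walk to the end of the chain that we just linked; this is the
      // per-slot traversal that makes the approach more expensive.

      while (*ppTail)
      {
         ppTail = &(*ppTail)->pNext;
      }
   }

   return pHead;
}
```

<p>The cost is one pass over every timer that is due, on top of the pass that dispatches them, and that first pass happens whilst the lock is held.</p>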

<p>If we ignore the more complex scenario and implement the easy one we end up with code like this to deal with the 'holding a lock whilst dispatching' case.</p>
<pre class="brush: cpp gutter: false">void CCallbackTimerWheel::HandleTimeouts()
{
   const Milliseconds now = m_tickCountProvider.GetTickCount();
  
   while (TimerData *pTimers = GetTimersToProcess(now))
   {
      while (pTimers)
      {
         pTimers = pTimers-&gt;OnTimer();
  
         --m_numTimersSet;
      }
   }
}
  
CCallbackTimerWheel::TimerData *CCallbackTimerWheel::GetTimersToProcess(
   const Milliseconds now)
{
   TimerData *pTimers = 0;
  
   // Round 'now' down to the timer granularity
  
   const Milliseconds thisTime = ((now / m_timerGranularity) * m_timerGranularity);
  
   while (!pTimers &amp;&amp; m_currentTime != thisTime)
   {
      TimerData **ppTimers = GetTimerAtOffset(0);
  
      pTimers = *ppTimers;
  
      // Step along the wheel...
  
      m_pNow++;
  
      if (m_pNow &gt;= m_pTimersEnd)
      {
         m_pNow = m_pTimersStart + (m_pNow - m_pTimersEnd);
      }
  
      m_currentTime += m_timerGranularity;
   }
  
   if (pTimers)
   {
      m_pFirstTimerSetHint = 0;
   }
  
   return pTimers;
}
</pre>
<p>With this in place we're left with 20 tests that fail due to lack of implementation and 4 functions that we need to deal with properly. Three form the begin, handle, end API for unlocked timer dispatch and the fourth is the <code>SetTimer()</code> overload that doesn't require a handle. There's an interesting amount of functionality required to implement the remaining functions as you can see from <a href="http://www.lenholgate.com/archives/000795.html">here</a> and <a href="http://www.lenholgate.com/archives/000803.html">here</a>. We'll look at this next time.</p>

The code can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-25.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-25.zip']);">here</a> and the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.]]>
    </content>
</entry>

<entry>
    <title>Practical Testing: 24 - Removing test duplication</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/07/practical-testing-24---removing-test-duplication.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.962</id>

    <published>2010-07-20T13:53:18Z</published>
    <updated>2011-01-02T13:40:01Z</updated>

    <summary>The most recent articles in the &quot;Practical Testing&quot; series have been discussing the performance of the timer queue that we have built. Once I had got some new, optional, performance tests in place to measure what we were trying to...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>The most recent articles in the <a href="http://www.lenholgate.com/archives/000306.html">"Practical Testing"</a> series have been discussing the performance of the timer queue that we have built. Once I had got some new, optional, performance tests in place to measure what we were trying to improve I eventually came up with a new approach and <a href="http://www.lenholgate.com/archives/000909.html">began to implement a timer wheel</a> that conforms to the interface used by the other implementations of my timer queue. Whilst doing this it became obvious that there was duplication in my test code and so the tests have been refactored to remove the duplication of the test code between <code>CCallbackTimerQueue</code> and <code>CCallbackTimerQueueEx</code> and, in addition, to create a full suite of tests for the new <code>CCallbackTimerWheel</code> class. This puts us firmly in Test Driven Development as we now have a set of tests where most of them fail for the new implementation.</p>

<p>To make sure the tests fail in a reliable way I've gone through the new <code>CCallbackTimerWheel</code> class and added exceptions which are thrown whenever some of the 'currently not implemented' functionality is exercised.</p>

<p>Removing the duplication was fairly straightforward. A new template class has been created and this contains the tests which are common between the different implementations. The class is derived from by the concrete test classes for each implementation class and is templatised on the class under test, the tick count provider and a 'traits' class that I use to tell the tests about various differences in behaviour. The traits are useful to paper over the slight differences which are exposed due to the slightly over invasive way that I like to test...</p>
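<p>A minimal sketch of the shape of this arrangement. All of the names below are hypothetical stand-ins rather than the classes in the download, and the only behaviour that the traits describe here is how many times an implementation asks its tick count provider for the time when a timer is set.</p>

```cpp
#include <cassert>

// Hypothetical mock that counts how often it is asked for the time.
class CMockTickCountProvider
{
   public :

      CMockTickCountProvider() : m_calls(0) {}

      unsigned long GetTickCount() { ++m_calls; return 0; }

      int Calls() const { return m_calls; }

   private :

      int m_calls;
};

// Two hypothetical implementations that differ only in how they interact
// with the tick count provider.
class CQueueLike
{
   public :

      explicit CQueueLike(CMockTickCountProvider &provider) : m_provider(provider) {}

      void SetTimer() { m_provider.GetTickCount(); }

   private :

      CMockTickCountProvider &m_provider;
};

class CWheelLike
{
   public :

      explicit CWheelLike(CMockTickCountProvider &provider) : m_provider(provider) {}

      void SetTimer() { m_provider.GetTickCount(); m_provider.GetTickCount(); }

   private :

      CMockTickCountProvider &m_provider;
};

// The traits tell the shared tests what to expect from each implementation.
struct QueueTraits { enum { tickCountCallsOnSetTimer = 1 }; };
struct WheelTraits { enum { tickCountCallsOnSetTimer = 2 }; };

// Shared tests: Q is the class under test, P the tick count provider and
// T the traits that describe the interactions expected of Q.
template <class Q, class P, class T>
class TSharedTimerTests
{
   public :

      static bool TestSetTimer()
      {
         P provider;

         Q timers(provider);

         timers.SetTimer();

         return provider.Calls() == static_cast<int>(T::tickCountCallsOnSetTimer);
      }
};

// The concrete test classes derive from the shared tests.
class CQueueTests : public TSharedTimerTests<CQueueLike, CMockTickCountProvider, QueueTraits> {};
class CWheelTests : public TSharedTimerTests<CWheelLike, CMockTickCountProvider, WheelTraits> {};
```

<p>The same test body then validates the interactions of both implementations, and pairing an implementation with the wrong traits makes the test fail.</p>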

<p>There's a school of thought that says the way I test leads to brittle tests because I often validate the calls made into my mock objects, the tick count provider, for example, to make sure that the expected sequence of calls happens. Of course this ties my tests to my current implementation. I can understand why this is often a bad thing in that I could change my implementation to improve things and whilst the new implementation does what it should to pass a test it might fail due to the interaction changes that I've made. I agree, it's a pain to have to go and change a bunch of tests because you have changed the number of calls into one of your mocks but I prefer this to not testing these interactions and suddenly finding that our code is calling an expensive call multiple times due to laziness on the part of the developer (me) or that we're suddenly not using an interface that we've been provided with and nobody was aware of it... Anyway, the three implementations of the timer queue all have slightly different interactions with their tick count providers and these differences are captured in the test traits, which allows the tests to be invasive correctly for each implementation...</p>

<p>Another problem was the slight constructor parameter differences between the new timer wheel and the older queues. The wheel needs to know the maximum timeout range that it supports and the timer granularity and the queues don't. To get around this I've used a very thin shim class which simply defaults the timer wheel's parameters for the shared tests.</p>
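<p>A minimal sketch of such a shim, using hypothetical stand-in classes and arbitrary default values rather than the ones in the download:</p>

```cpp
#include <cassert>

typedef unsigned long Milliseconds;

// Hypothetical stand-in for the tick count provider.
class CTickCountProvider
{
};

// Hypothetical stand-in for the wheel: its constructor needs two parameters
// that the constructors of the queues do not.
class CWheelLike
{
   public :

      CWheelLike(
         const CTickCountProvider & /*provider*/,
         const Milliseconds maximumTimeout,
         const Milliseconds timerGranularity)
         :  m_maximumTimeout(maximumTimeout),
            m_timerGranularity(timerGranularity)
      {
      }

      Milliseconds MaximumTimeout() const { return m_maximumTimeout; }

      Milliseconds TimerGranularity() const { return m_timerGranularity; }

   private :

      const Milliseconds m_maximumTimeout;

      const Milliseconds m_timerGranularity;
};

// The shim gives the wheel the same constructor signature as the queues so
// that the shared tests can create any of the implementations in the same way.
class CWheelShim : public CWheelLike
{
   public :

      explicit CWheelShim(const CTickCountProvider &provider)
         :  CWheelLike(provider, 30000, 15)
      {
      }
};
```

<p>The shared tests only ever see the single argument constructor; tests that are specific to the wheel can still construct the real class with whatever range and granularity they need.</p>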

<p>Since the calculation of expected timeouts needed to change in the shared tests due to the timer wheel's granularity settings the tests could now work with timer queues that support a timer granularity other than 1. This may be a simple performance tweak and is something that's now on the list of things to do...</p>

<p>Whilst I've removed the blatant copy and paste nature of the duplicate tests there's still plenty of scope to refactor them to reduce the small scale duplication that's going on; that, however, is a job for another day.</p>

The code can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-24.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-24.zip']);">here</a> and the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.]]>
        
    </content>
</entry>

<entry>
    <title>Practical Testing: 23 - Another new approach: timer wheels</title>
    <link rel="alternate" type="text/html" href="http://www.lenholgate.com/blog/2010/07/practical-testing-23---another-new-approach-timer-wheels.html" />
    <id>tag:www.socketframework.com,2010:/blog//12.961</id>

    <published>2010-07-19T08:01:18Z</published>
    <updated>2011-01-02T13:39:24Z</updated>

    <summary>The most recent articles in the &quot;Practical Testing&quot; series have been discussing the performance of the timer queue that we have built. As I hinted when I first brought up the performance issues, the particular use case that I have...</summary>
    <author>
        <name>Len</name>
        
    </author>
    
        <category term="ENet" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Source Code" scheme="http://www.sixapart.com/ns/types#category" />
    
        <category term="Testing" scheme="http://www.sixapart.com/ns/types#category" />
    
    
    <content type="html" xml:lang="en-us" xml:base="http://www.lenholgate.com/blog/">
        <![CDATA[<p>The most recent articles in the <a href="http://www.lenholgate.com/archives/000306.html">"Practical Testing"</a> series have been discussing the performance of the timer queue that we have built. As I hinted when I <a href="http://www.lenholgate.com/archives/000906.html">first brought up</a> the performance issues, the particular use case that I have which is causing problems might well be more efficiently dealt with using a different (more specialised and less general purpose) design. </p>

<p>The timer queue has adequate performance for general purpose use and can handle timers set within a range of 0ms to 49.7 days with a theoretical granularity of 1ms. It achieves this by using a balanced search tree to store the timers by absolute timeout. The performance of setting a timer is O(log n) due to the tree insertion required. Cancelling a timer is O(1) since we keep an iterator to where the timer was inserted and thus can navigate straight to it to cancel it. Timer expiry is also an O(log n) operation due to the tree lookup. Due to the use of the standard program heap the worst case contention of the queue is <b>C(tq)+(C(tn-tq+1)+C(ts-tq+1)+C(tn-tq+1))</b> (see <a href="http://www.lenholgate.com/archives/000908.html">here</a> for details of my crazy Big C notation for describing contention).</p>

The more specialist use case is for driving reliable UDP protocols. This kind of work generally requires timers per connection for retransmission and data flow pacing. The timeouts tend to be short and the timers tend to expire rather than be reset without expiring. The range of timeouts is generally quite small; 0ms - 30seconds for the ENet system I'm working on. I'm currently looking at improving performance of the timer system for this kind of scenario and to do so requires that timer insertion speed be improved (so we can set timers more quickly), timer expiry speed be improved (so we can process timers faster) and contention be reduced, ideally tending towards <b>C(tq)</b> where we have contention only between users of the timer queue and not between any thread in the process.]]>
        <![CDATA[<p>As I have already mentioned the use of STL containers means that I'm doing more work than is strictly necessary when manipulating the timer queue (including dynamic memory allocation and release during timer insertion and removal). One way of improving contention is to switch to using custom STL allocators so that only the users of the queue ever access the allocators that we use for the queue. Another is to write a custom, invasive, balanced search tree that does not need to use dynamic allocation.</p>

<p>A third solution would be to use a simpler data structure. Our requirement is simply to store timers in order of timeout. Rather than using a complex tree structure we could use a simple sorted list. Unfortunately timer insertion would then rise to O(n) as we would need to traverse the list to locate the correct spot to insert our new timer. Cancellation can stay O(1) if we use our invasive <code>CNodeList</code> and timer handling becomes O(1) because we will always work from the head of the list when expiring timers. The usage pattern of the reliable retransmission means that we'll be inserting timers over the whole of our possible range, so the O(n) insertion would really bite us. </p>

<p>In a classic trade off between memory usage and performance we could use an array and have lots of wasted space in it. Setting a timer becomes O(1), you simply index directly into the array at the correct location. Cancellation and timer processing are also O(1) and there's no dynamic memory allocation required for insertion and removal so the worst case contention is C(tq). Such a structure is called a timer wheel due to the fact that the array is viewed as a circular buffer and timers are inserted with timeouts relative to a 'now' point on the wheel. </p>

<p>The amount of memory used can be reduced by reducing the granularity at which you can set your timers. For example, a timer wheel with a range of 0-30 seconds and a granularity of 1ms requires 30,000 elements in the array; if you reduce the granularity to 15ms (which is pretty much the best you can get from <code>GetTickCount()</code> anyway), then the array size is reduced to a more manageable 2,000 elements. Given that the array is an array of pointers we're looking at 8kB on an x86 and 16kB on x64. Each array element points to either <code>null</code> if no timer is set or to the first timer in a doubly linked list of timers at this time. The list is invasive with the links being part of the data that is stored in the list. Insertion into the list is a case of simply pushing a new node onto the front of the existing list; cancellation is easy as the list is doubly linked and the node contains the links. Thus most timer manipulation becomes simply adjusting pointers.</p>
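<p>The arithmetic behind those figures, as a small sketch. The function names are hypothetical and the calculation ignores any extra slot that a particular implementation might need.</p>

```cpp
#include <cassert>
#include <cstddef>

typedef unsigned long Milliseconds;

// Number of slots needed to cover 'maximumTimeout' at 'granularity'.
std::size_t SlotsRequired(
   const Milliseconds maximumTimeout,
   const Milliseconds granularity)
{
   assert(granularity != 0);

   return static_cast<std::size_t>(maximumTimeout / granularity);
}

// Size of the array in bytes for a given pointer size; 4 bytes on x86 and
// 8 bytes on x64.
std::size_t WheelBytes(
   const std::size_t slots,
   const std::size_t pointerSize)
{
   return slots * pointerSize;
}
```

<p>For a 30,000ms range this gives 30,000 slots at 1ms and 2,000 slots at 15ms, and 2,000 slots take 8,000 bytes with 4 byte pointers or 16,000 bytes with 8 byte pointers.</p>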

<p><img alt="TimerWheel-1.png" src="http://www.lenholgate.com/blog/images/TimerWheel-1.png" width="434" height="301" border="0" /></p>

<p>The wheel in the diagram above has a granularity of 5ms and has timers set at 30 and 50. The wheel is defined by two pointers, one to the start of it and one to one element beyond the end.</p>

<p><img alt="TimerWheel-2.png" src="http://www.lenholgate.com/blog/images/TimerWheel-2.png" width="434" height="301" border="0" /></p>

<p>This diagram clearly shows the circular nature of the array. This is just before we expire the 30ms timer. Note that the next timer is due in 20ms.</p>

<p>My implementation of a timer wheel is made easier by the fact that I have a set of tests that target the interface to which I wish to conform. To start with I'll implement a basic timer wheel that allows us to create, set and cancel timers but that doesn't deal with any of the complexity of expiring timers. Also all of the nice and implied or explicit implementation details will be left out. Don't worry, once we write the tests for these pieces of functionality it'll be obvious where we're failing.</p>

<p>Creation and destruction of the timer wheel are pretty straightforward. We have an array of pointers to create, the size of which is based on the maximum timeout that we can set and the granularity of the timers that can be set. Destruction is similar to the timer queue in that we iterate any existing timers and clean them up. Timer creation is very similar to our timer queue as we dynamically allocate the timer data and insert it into a map for validation and clean up purposes. The timers themselves are, at present at least, quite simple: a link for the next timer in the list, a link to the previous timer and the timer and user data. Setting a timer simply involves validating it, locating the correct index into the timer wheel array and then adding the timer to the list of timers at that point in the array.</p>
<pre class="brush: cpp gutter: false">bool CCallbackTimerWheel::SetTimer(
   const Handle &amp;handle,
   Timer &amp;timer,
   const Milliseconds timeout,
   const UserData userData)
{
   if (timeout &gt; m_maximumTimeout)
   {
      throw CException(
         _T("CCallbackTimerWheel::SetTimer()"), 
         _T("Timeout is too long. Max is: ") + ToString(m_maximumTimeout) + _T(" tried to set: ") + ToString(timeout));
   }
  
   TimerData &amp;data = ValidateHandle(handle);
  
   const bool wasSet = data.CancelTimer();
  
   data.UpdateData(timer, userData);
  
   InsertTimer(timeout, data);
  
   return wasSet;
}
  
void CCallbackTimerWheel::InsertTimer(
   const Milliseconds timeout,
   TimerData &amp;data)
{
   const size_t timerOffset = timeout / m_timerGranularity;
  
   TimerData **ppTimer = GetTimerAtOffset(timerOffset);
  
   data.SetTimer(ppTimer, *ppTimer);
}
  
void CCallbackTimerWheel::TimerData::SetTimer(
   TimerData **ppPrevious,
   TimerData *pNext)
{
   if (m_ppPrevious)
   {
      throw CException(
         _T("CCallbackTimerWheel::TimerData::SetTimer()"),
         _T("Internal Error: Timer is already set"));
   }
  
   m_ppPrevious = ppPrevious;
  
   m_pNext = pNext;
  
   if (m_pNext)
   {
      m_pNext-&gt;m_ppPrevious = &amp;m_pNext;
   }
  
   *ppPrevious = this;
}
</pre>

<p>I'm using a pointer to the previous pointer rather than a pointer to the previous node as it makes things slightly simpler; honest...</p>
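<p>A minimal sketch of why it's simpler, using a hypothetical cut-down node with the same two links. <code>PushFront()</code> mirrors the insertion shown above and <code>Unlink()</code> is my guess at the kind of thing that cancellation needs to do; neither is the code from the download.</p>

```cpp
#include <cassert>

// Hypothetical, cut-down node with the same two links as the timer data.
struct Node
{
   Node() : ppPrevious(0), pNext(0) {}

   Node **ppPrevious;   // address of whichever pointer points at this node

   Node *pNext;
};

// Push 'node' onto the front of the list whose head pointer is '*ppHead'.
void PushFront(
   Node **ppHead,
   Node &node)
{
   assert(!node.ppPrevious);

   node.ppPrevious = ppHead;

   node.pNext = *ppHead;

   if (node.pNext)
   {
      node.pNext->ppPrevious = &node.pNext;
   }

   *ppHead = &node;
}

// Unlink 'node'. There is no special case for the head of the list and no
// need to know which slot the node is in. Returns false if it wasn't set.
bool Unlink(
   Node &node)
{
   if (!node.ppPrevious)
   {
      return false;
   }

   *node.ppPrevious = node.pNext;

   if (node.pNext)
   {
      node.pNext->ppPrevious = node.ppPrevious;
   }

   node.ppPrevious = 0;

   node.pNext = 0;

   return true;
}
```

<p>With a pointer to the previous node, removing the first node in a slot would need to know which slot to update; here the slot in the array and the <code>pNext</code> of a preceding node are treated in exactly the same way.</p>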

<p>With just enough code to get the first set of tests to run I have enough to get some initial performance figures out of the new timer system. Timer creation is about the same as with the queue, but that's expected as the code is almost identical; the contention for creation and destruction are also the same as for the queue and thus could also be improved with custom allocators and private heaps. The performance tests for <code>SetTimer()</code> show a dramatic improvement. On my test machine I get figures of around 4ms to set a single timer 100,000 times against 90ms for the queue and similar improvements in the other two performance tests for <code>SetTimer()</code>. What's even better is that <code>SetTimer()</code> would have a contention of <b>C(t-queue)</b> as we no longer have to do any of the dynamic allocation that was going on with the timer queue's STL manipulation.</p>

<p>Right now we're left with a failing test which points the way for what we need to do next which is deal with being able to process these timers when they time out, but before I look at that I think it's about time that I take a good hard look at the duplication in the tests. We're testing an interface with three implementations and we should have a single set of tests which does that and then have some implementation specific tests as well if we feel we need them. Having one set of duplicate test code for the Ex version of the queue was wrong but I could just about live with it, having another duplicate set for the timer wheel is just something I'm not prepared to put up with unless it's simply not possible to remove the duplication. </p>

The code can be found <a href="http://www.lenholgate.com/zips/PracticalTesting-23.zip" onclick="_gaq.push(['_trackEvent', 'Downloads', 'PracticalTesting-23.zip']);">here</a> and the <a href="http://www.lenholgate.com/archives/000906.html">previous rules</a> apply.]]>
    </content>
</entry>

</feed>



