Allocating page aligned buffers

Back in October 2007 I briefly looked at what seemed, at the time, to be a simple change to The Server Framework so that you had the option to use buffers that were aligned to page boundaries. This could help you scale better as one of the two problems with scalability at the time was the ‘I/O page lock limit’; there’s a finite limit to the number of memory pages that can be locked at one time and in some circumstances data in transit via sockets is locked in memory whilst it is sent. Reducing the number of pages used, by making sure that buffers are aligned on page boundaries and so use the fewest pages possible, can help if your server is hitting this limit.
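To make the page counting concrete, here is a small sketch of the arithmetic. It is only an illustration: the function name PagesSpanned is made up for this example, and the 4096 byte page size used in the usage note is an assumption (query the real value on your own system).

```cpp
#include <cstddef>

// Hypothetical helper, for illustration only: how many pages does a buffer
// of `size` bytes touch if it starts at byte offset `address`?
constexpr std::size_t PagesSpanned(
    std::size_t address,
    std::size_t size,
    std::size_t pageSize)
{
    if (size == 0)
    {
        return 0;
    }

    const std::size_t firstPage = address / pageSize;
    const std::size_t lastPage = (address + size - 1) / pageSize;

    return lastPage - firstPage + 1;
}
```

With a 4096 byte page, a 4096 byte buffer that starts on a page boundary touches one page, whereas the same buffer starting 100 bytes into a page touches two. That extra page per buffer is the one that alignment saves.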

Anyway, I proposed a simple change which was immediately shot down by a commenter for being too simple. The change was to use VirtualAlloc() to allocate our I/O buffers on page boundaries; the reason it was too simple is that VirtualAlloc() works in terms of the system’s allocation granularity and not arbitrary sizes. This meant that the proposed changes, whilst simple, wasted oodles of memory.
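The scale of the waste is easy to estimate. The sketch below assumes an allocation granularity of 64 KB, which is the value commonly reported in dwAllocationGranularity by GetSystemInfo() on Windows; treat that number, and the helper names RoundUp and WastedPerAllocation, as assumptions made for this example.

```cpp
#include <cstddef>

// Round `value` up to the next multiple of `multiple`.
constexpr std::size_t RoundUp(std::size_t value, std::size_t multiple)
{
    return ((value + multiple - 1) / multiple) * multiple;
}

// Hypothetical helper: bytes left unused when one buffer of `requested`
// bytes is given its own region sized in units of `granularity`.
constexpr std::size_t WastedPerAllocation(
    std::size_t requested,
    std::size_t granularity)
{
    return RoundUp(requested, granularity) - requested;
}
```

Under those assumptions, giving each 4096 byte buffer its own VirtualAlloc() call uses a 65536 byte region and leaves 61440 bytes of it unused, which is why one call per buffer doesn't work and a more complicated allocator is needed.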

I thought about it some more and then went away and made some more complicated changes. The results then sat around on a development branch for some time as no clients were desperate for the changes and I never had time to profile them.

Well, I’ve finally profiled them and they perform pretty well and they are, after all, entirely optional, and so they’re going to be included in the next revision of The Server Framework.

The changes are that the CBufferAllocator can now either be passed flags which tell it to use page aligned buffers, in which case it uses a very simple fixed sized memory allocator that can return page aligned allocations of arbitrary size, or you can pass the CBufferAllocator an instance of IAllocateFixedSizedMemory and it will use your own allocator.

The fixed sized memory allocator is very simple and pretty basic. It does deliver page aligned fixed sized memory blocks with little wastage. It doesn’t return memory to the operating system once that memory has been released back to the allocator; it simply keeps it in its free list for later reuse. This may or may not be a problem for you.
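For readers who want a feel for this kind of design, here is a minimal sketch of a page aligned fixed sized allocator with a free list. It is not the code from The Server Framework: the class name PageAlignedFixedSizeAllocator, the slab approach, the use of C++17 aligned operator new in place of VirtualAlloc(), and the default 4096 byte page size are all assumptions made for illustration, and the sketch is not thread safe.

```cpp
#include <cstddef>
#include <cstdint>
#include <new>
#include <vector>

// Hypothetical sketch, not The Server Framework's implementation.
class PageAlignedFixedSizeAllocator
{
public:
    // pageSize is assumed to be a power of two; 4096 is an assumed default.
    PageAlignedFixedSizeAllocator(
        std::size_t blockSize,
        std::size_t blocksPerSlab,
        std::size_t pageSize = 4096)
        : m_pageSize(pageSize),
          // Round the block size up to whole pages so that every block in a
          // slab starts on a page boundary.
          m_blockSize(((blockSize + pageSize - 1) / pageSize) * pageSize),
          m_blocksPerSlab(blocksPerSlab == 0 ? 1 : blocksPerSlab)
    {
    }

    PageAlignedFixedSizeAllocator(const PageAlignedFixedSizeAllocator &) = delete;
    PageAlignedFixedSizeAllocator &operator=(const PageAlignedFixedSizeAllocator &) = delete;

    // Memory only goes back to the system when the allocator is destroyed.
    ~PageAlignedFixedSizeAllocator()
    {
        for (void *slab : m_slabs)
        {
            ::operator delete(slab, std::align_val_t(m_pageSize));
        }
    }

    void *Allocate()
    {
        if (m_freeList.empty())
        {
            AddSlab();
        }

        void *block = m_freeList.back();
        m_freeList.pop_back();
        return block;
    }

    // Released blocks are kept on the free list for later reuse.
    void Release(void *block)
    {
        m_freeList.push_back(block);
    }

    std::size_t BlockSize() const { return m_blockSize; }
    std::size_t SlabCount() const { return m_slabs.size(); }

private:
    // Allocate one page aligned slab and carve it into fixed sized blocks.
    void AddSlab()
    {
        void *slab = ::operator new(
            m_blockSize * m_blocksPerSlab,
            std::align_val_t(m_pageSize));

        m_slabs.push_back(slab);

        unsigned char *base = static_cast<unsigned char *>(slab);

        for (std::size_t i = 0; i < m_blocksPerSlab; ++i)
        {
            m_freeList.push_back(base + (i * m_blockSize));
        }
    }

    const std::size_t m_pageSize;
    const std::size_t m_blockSize;
    const std::size_t m_blocksPerSlab;
    std::vector<void *> m_slabs;
    std::vector<void *> m_freeList;
};
```

Allocating one large slab and carving it up is what keeps per-buffer wastage low in this sketch, and the free list is what makes a released block immediately available to the next caller. The price is the one described above: a burst of allocations leaves the process holding that memory until the allocator itself goes away.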

Anyway, the performance of the allocator is such that it’s pretty much on par with the default allocator that’s used for non aligned memory and the fact that the buffers can be page aligned means that each pending send operation could take up one page less than if you don’t use the page aligned allocator. That may make a big difference to your scalability.

These changes will be included in the 6.1 release of The Server Framework which currently has no scheduled release date.