CORBA - Enumeration

CORBA provides sequences as a way of returning collections of items from a method call. The problem with just using unbounded sequences is that the client has no control over how many items it receives as a result of the call. COM gets around this problem using the IEnum style interfaces that allow a client to control how it accesses the items in a collection.

Setting the scene
As we pointed out in the last article, reference counting without a keep-alive protocol is not especially robust and objects may be leaked on the server. We'll assume for the time being that this is acceptable to us and, in this article and the next, examine how COM and CORBA handle the problem of allowing the client to control how a sequence of objects is returned to it.

Once we've presented our enumeration and iteration interfaces we'll make them more robust by using the Evictor Pattern on the server to ensure that our server never becomes clogged with objects that are no longer used.

The problem
CORBA provides sequences as a way of returning collections of items from a method call. The problem with just using unbounded sequences is that the client has no control over how many items it receives as a result of the call. COM gets around this problem using the IEnum style interfaces that allow a client to control how it accesses the items in a collection. In this article we'll implement a COM style IEnum interface in CORBA and compare the client memory requirements against simply returning a sequence.

Sequences considered harmful
In the articles about reference counted CORBA objects we developed a server that could return a sequence of named counter objects. The problem with this style of interface is that the client has no control over how many objects will be returned. The interface works fine when we have few counter objects to return but can cause problems if there are many of them and the client is operating in a situation where the memory available to it is restricted.

In Enum1.zip we take the server developed in the reference counting articles and add some code so that you can specify the number of counters that are automatically generated. This allows us to test the client with a server that contains thousands of counters. The client code has been changed so that it does some Win32 specific memory usage calculations so that we can show how much memory is used by a call to the server's GetAll() method that returns a sequence of all of the counters that the server contains. We'll use this example as the basis for adding an enumeration interface and compare the memory usage and client side code required for each approach.

If you run the server without any command line arguments it will generate 1000 named counters by default. You can also specify the number of counters on the command line, but I found that 1000 was about the right number to show the differences in client memory consumption. Once the server is running, run the client with the list command line parameter and you'll see the list of named counters displayed. Once the list has finished displaying you'll get some information on the amount of memory used. This display compares the number and size of the blocks in the heap before and after the call to the server. By experimenting with different numbers of counters in the server you will see that as the number of counters is increased so the memory used in the client goes up. The important thing to realise is that the client has no way of controlling how many counters are returned in a call to GetAll() and because of this there is no way to know how much memory may be required. This could lead to problems if the client must process all of the items on the server but doesn't have enough memory available to do so. Of course, if the server interface were more flexible then the client could process the items in batches of a size that it knows it can handle...

IEnumXXXX
A COM programmer would get around the problem above by using an enumeration interface. In COM there is a standard manner for returning a collection of objects and that is to use an interface that resembles the IEnumXXXX style of interface shown below:

interface IEnumXXXX : IUnknown
{
   HRESULT Next(
      [in] ULONG celt,
      [out] IXXXX **rgelt,
      [out] ULONG *pceltFetched);
  
   HRESULT Skip(
      [in] ULONG celt);
  
   HRESULT Reset();
  
   HRESULT Clone(
      [out] IEnumXXXX **ppenum);
};

Rather than returning a sequence, or the COM equivalent, a method call will return an interface similar to the one shown above. This allows the client to decide how many objects to retrieve on each call to Next(). The client is fully in control of how much memory is used while it is processing the sequence of objects.

Note that the additional methods allow the client to skip a number of elements, reset the enumeration to the start and to create a copy of the enumerator which is positioned at the same point in the sequence as the original interface. Of course, being a COM interface it's derived from IUnknown and has the standard methods for reference counting and interface discovery.
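The semantics of these four methods can be sketched without any COM machinery at all. The following is a minimal, in-memory illustration only: SimpleEnumerator is a hypothetical name, it works on a std::vector rather than on interface pointers, and it returns counts and booleans where a real COM enumerator would return HRESULT values.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical in-memory stand-in for an IEnumXXXX style enumerator.
// It is not a COM object: there is no IUnknown, HRESULT or marshalling here.
template <typename T>
class SimpleEnumerator
{
public:
   explicit SimpleEnumerator(std::vector<T> items)
      : m_items(std::move(items)), m_cursor(0) {}

   // Copies up to 'count' items into 'out' and advances the cursor.
   // Returns the number of items fetched; fewer than 'count' means that
   // the end of the sequence was reached.
   std::size_t Next(std::size_t count, std::vector<T> &out)
   {
      assert(m_cursor <= m_items.size());
      const std::size_t fetched = std::min(count, m_items.size() - m_cursor);
      const std::ptrdiff_t first = static_cast<std::ptrdiff_t>(m_cursor);
      const std::ptrdiff_t last = static_cast<std::ptrdiff_t>(m_cursor + fetched);
      out.assign(m_items.begin() + first, m_items.begin() + last);
      m_cursor += fetched;
      return fetched;
   }

   // Moves the cursor forward; returns false if it ran off the end.
   bool Skip(std::size_t count)
   {
      const std::size_t remaining = m_items.size() - m_cursor;
      m_cursor += std::min(count, remaining);
      return count <= remaining;
   }

   // Moves the cursor back to the start of the sequence.
   void Reset() { m_cursor = 0; }

   // The copy carries the current cursor position with it.
   SimpleEnumerator Clone() const { return *this; }

private:
   std::vector<T> m_items;
   std::size_t m_cursor;
};
```

A caller that asks for ten items when only two remain gets two back and knows that it has reached the end; a clone taken part way through continues from the same position, independently of the original.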

A CORBA version of the above interface, specialised for the named counter objects used in our server, could look something like this:

typedef sequence<namedcounter> counterSeq;
  
interface EnumNamedCounter
{
   void AddRef();
  
   void Release();
  
   long Next(
      in unsigned long maxCounters, 
      inout counterSeq theCounters);
     
   boolean Skip(
      in unsigned long numToSkip);
  
   void Reset();
  
   EnumNamedCounter Clone();
};

Note that we've made the reference counting methods explicit as we don't inherit from any other interfaces. The reference counting is required by an enumeration interface as the server creates the enumeration object especially for a particular client - the enumeration object effectively contains a cursor into the sequence that's being traversed and so is caller specific. What's more, since the client itself can create more copies of these objects by calling Clone() it's important that the client can let the object know when it's no longer required and can clean itself up. Of course, we could ignore reference counting here and just have a "destroy" method, but we'll leave that for later...

Implementation of the above interface is shown in Enum2.zip and is relatively straightforward. We use the reference counting template base class that we developed in RefCounted8.zip, which means we can focus on the implementation of the enumerator. We also take the simplest approach that could possibly work. Each enumerator object is initialised by passing in the server's container of counters; the enumerator simply iterates through the server's counter container and adds each counter to its own internal list. This is very much a brute force approach but it allows us to add an enumeration interface to the server with very little in the way of code changes, and it's good enough for us to be able to use from the client to explore the differences in memory usage on the client side. Of course, the memory usage on the server side is less than ideal as a list of pointers to counters is created inside every enumeration object that the server hands out. We'll address that issue in a later build of the server.
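The brute force approach described above might be sketched as follows. This is not the code from Enum2.zip: Counter, CounterMap and SnapshotEnumerator are hypothetical names, the CORBA servant and reference counting plumbing is left out, and the sketch assumes that something else keeps the counters alive for as long as the enumerator exists.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the server's counter implementation.
struct Counter
{
   std::string name;
   long value;
};

typedef std::map<std::string, Counter *> CounterMap;

// Brute force enumerator: copies every pointer out of the server's map
// when it is constructed and then walks its own private list.
class SnapshotEnumerator
{
public:
   explicit SnapshotEnumerator(const CounterMap &counters)
   {
      for (CounterMap::const_iterator it = counters.begin(); it != counters.end(); ++it)
      {
         m_snapshot.push_back(it->second);
      }
      assert(m_snapshot.size() == counters.size());
      m_cursor = m_snapshot.begin();
   }

   // Not copyable in this sketch: a copied list iterator would still
   // point into the original object's list.
   SnapshotEnumerator(const SnapshotEnumerator &) = delete;
   SnapshotEnumerator &operator=(const SnapshotEnumerator &) = delete;

   // Appends up to 'maxCounters' pointers to 'out'; returns how many were added.
   std::size_t Next(std::size_t maxCounters, std::vector<Counter *> &out)
   {
      std::size_t fetched = 0;
      while (fetched < maxCounters && m_cursor != m_snapshot.end())
      {
         out.push_back(*m_cursor);
         ++m_cursor;
         ++fetched;
      }
      return fetched;
   }

   // Moves the cursor back to the start of the private copy.
   void Reset() { m_cursor = m_snapshot.begin(); }

private:
   std::list<Counter *> m_snapshot;
   std::list<Counter *>::iterator m_cursor;
};
```

The cost is visible in the constructor: every enumerator that the server hands out carries its own list with one entry per counter.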

If you run a client with the new server you can compare the memory usage on the client between doing a list, which returns the sequence of objects in one go, and an enum, where you can specify the number of objects to be returned with each call to Next() on the enumeration interface.

The client code is slightly more complex but it's worth it for having a client that can work with any number of items in the sequence rather than having one that may crash if memory becomes tight. It's also possible to make the client code cleaner by using a technique that I discussed in an earlier article on COM enumeration that makes it possible to use STL style iterators with an enumeration interface.
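The extra complexity on the client amounts to a loop around Next(). A rough sketch of that loop is shown below; it is an illustration rather than the client code from the download, NextFn is a hypothetical stand-in for the remote call, and MakeFakeSource is a test double that generates names locally instead of talking to a server.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical signature standing in for a remote Next() call: fill 'batch'
// with at most 'max' items and return how many were supplied.
typedef std::function<std::size_t(std::size_t max, std::vector<std::string> &batch)> NextFn;

// Drains an enumeration in fixed-size batches. Returns the total number of
// items processed; 'peakHeld' receives the largest batch held at one time.
std::size_t DrainInBatches(
   const NextFn &next,
   std::size_t batchSize,
   const std::function<void(const std::string &)> &process,
   std::size_t &peakHeld)
{
   assert(batchSize > 0);
   std::size_t total = 0;
   peakHeld = 0;
   std::vector<std::string> batch;
   for (;;)
   {
      batch.clear();
      const std::size_t fetched = next(batchSize, batch);
      if (fetched == 0) break;                  // nothing left
      if (batch.size() > peakHeld) peakHeld = batch.size();
      for (std::size_t i = 0; i < batch.size(); ++i) process(batch[i]);
      total += fetched;
      if (fetched < batchSize) break;           // a short batch marks the end
   }
   return total;
}

// Test double: serves 'count' generated names, as if from a server.
NextFn MakeFakeSource(std::size_t count)
{
   std::size_t cursor = 0;
   return [count, cursor](std::size_t max, std::vector<std::string> &batch) mutable
   {
      std::size_t fetched = 0;
      while (fetched < max && cursor < count)
      {
         batch.push_back("counter" + std::to_string(cursor++));
         ++fetched;
      }
      return fetched;
   };
}
```

However many items the source holds, the client never has more than batchSize of them in memory at once, which is the property that a single call returning the whole sequence cannot offer.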

A better implementation
Our first implementation of the server side enumeration object is pretty crude. We copy the server's internal list and provide our own cursor into it. It might seem that we should just be able to reference the server's counter collection and use an iterator onto it as our cursor. The copy does have one advantage, though: the sequence that we're enumerating is fixed at the time that the enumerator is created. Enumerating the entire sequence, calling reset and enumerating the sequence again will give exactly the same results even if the server has had counters added or removed whilst we were enumerating.

For some situations there's no other way to implement an enumerator than this brute force method. If we're prepared to relax our requirements somewhat, so that we accept that the enumeration will represent the current state of the collection of counters on the server at the time that the Next() method is called, then we could potentially access the server's collection directly. Unfortunately we need to change the server's collection to do this as at present we're using a std::map, and an iterator into a std::map is invalidated when the element that it refers to is erased. A cursor resting on a counter that another client deletes would be left dangling.

What we need is a container on the server that presents the std::map interface that the server currently uses for looking up individual counters, and also presents an interface that's suitable for iterating over the contents of the map even if the map is added to or deleted from during the iteration.

In Enum3.zip we use a templatised counter container that presents all of the required interfaces and fulfils our requirements for not invalidating our iterator whilst we're traversing the collection. This collection contains a std::map and exposes parts of the map's interfaces so that it can be plugged into the server with few code changes. Internally, however, it's quite different: the map stores the name of the counter and a pointer to a node in the collection's internal list of counters rather than a pointer to the counter itself. This allows us to remove a counter from the server's map of counters but still have that counter available for iteration if required. It happens that the only time this situation occurs is when an enumeration interface's cursor is currently at the item that is being deleted.

The advantage of this code is that the enumeration object doesn't need to copy the server's collection of objects when it's created; this leads to faster object creation and less memory used on the server. The collection represented by the enumeration is dynamic: it reflects the state of the collection within the server. Because of how the list is implemented in the server, the enumeration objects can safely hold onto pointers to nodes in the list and step through the list even if the next node has been deleted by another client...
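One way to get this behaviour is sketched below. It is a guess at the general shape rather than the container from Enum3.zip: CounterCollection, Cursor and the "pins" count are hypothetical names, the sketch is single threaded, and it stores a plain value where the server would store a counter object. The idea is that the map indexes nodes in a std::list, and a node that is removed while a cursor rests on it is only unlinked once that cursor moves on.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <map>
#include <string>

// Hypothetical container: a std::map for lookup by name plus a std::list that
// cursors walk. A removed node stays in the list while a cursor rests on it.
class CounterCollection
{
   struct Node
   {
      std::string name;
      long value;
      int pins;       // number of cursors currently resting on this node
      bool removed;   // true once the counter has been erased from the map
   };
   typedef std::list<Node> NodeList;

public:
   void Add(const std::string &name, long value)
   {
      assert(m_index.find(name) == m_index.end());
      m_nodes.push_back(Node{name, value, 0, false});
      m_index[name] = std::prev(m_nodes.end());
   }

   void Remove(const std::string &name)
   {
      std::map<std::string, NodeList::iterator>::iterator found = m_index.find(name);
      if (found == m_index.end()) return;
      NodeList::iterator node = found->second;
      m_index.erase(found);
      if (node->pins == 0) m_nodes.erase(node);   // nobody is looking: unlink now
      else node->removed = true;                  // defer until the cursor moves on
   }

   std::size_t NodeCount() const { return m_nodes.size(); }

   class Cursor
   {
   public:
      explicit Cursor(CounterCollection &owner)
         : m_owner(owner), m_pos(owner.m_nodes.end()), m_pinned(false), m_started(false) {}
      Cursor(const Cursor &) = delete;
      ~Cursor() { Unpin(); }

      // Advances to the next live counter; returns false at the end.
      bool Next(std::string &name, long &value)
      {
         if (m_started && !m_pinned) return false;   // already ran off the end
         NodeList::iterator next = m_pinned ? std::next(m_pos) : m_owner.m_nodes.begin();
         m_started = true;
         while (next != m_owner.m_nodes.end() && next->removed) ++next;
         Unpin();   // may erase the node we were resting on; 'next' stays valid
         if (next == m_owner.m_nodes.end()) return false;
         m_pos = next;
         ++m_pos->pins;
         m_pinned = true;
         name = m_pos->name;
         value = m_pos->value;
         return true;
      }

   private:
      void Unpin()
      {
         if (!m_pinned) return;
         assert(m_pos->pins > 0);
         m_pinned = false;
         if (--m_pos->pins == 0 && m_pos->removed) m_owner.m_nodes.erase(m_pos);
      }

      CounterCollection &m_owner;
      NodeList::iterator m_pos;
      bool m_pinned;
      bool m_started;
   };

private:
   NodeList m_nodes;
   std::map<std::string, NodeList::iterator> m_index;
};
```

The sketch relies on the fact that erasing one element of a std::list leaves iterators to every other element valid, so a cursor can always step from the node it is resting on to whatever currently follows it.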

Know your clients
There are a lot of round-trips occurring in this simple example. The client is using the objects in a read-only fashion, simply displaying their data, but the server is providing fully fledged object references for the client to work with. This means that for each object the client must make a server call to discover the name of the counter, another to fetch the value and another to release the object once it has finished with it. In this situation it would be advantageous for the server to provide a read-only enumeration or list interface that simply returns the data that's stored in the objects concerned rather than the object references.

By defining a struct that contains the name and value of a named counter we can declare a sequence of these structs and then return this sequence of data rather than the sequence of object references. We add this functionality to the client and server in Enum4.zip.
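The difference in round-trips is easy to see with a small simulation. Everything in the sketch below is hypothetical: FakeServer stands in for the remote server, its "calls" member counts simulated round-trips, and object references are represented by simple indices. It models only the number of calls, not marshalling cost or memory.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical plain-data record mirroring the struct described above.
struct CounterData
{
   std::string name;
   long value;
};

// Hypothetical stand-in for the server; 'calls' counts simulated round-trips.
class FakeServer
{
public:
   explicit FakeServer(std::size_t count) : calls(0)
   {
      for (std::size_t i = 0; i < count; ++i)
      {
         m_data.push_back(CounterData{"counter" + std::to_string(i), static_cast<long>(i)});
      }
   }

   // Object reference style: one call per attribute, plus one to release.
   std::size_t GetAll() { ++calls; return m_data.size(); }
   std::string GetName(std::size_t ref) { ++calls; return m_data.at(ref).name; }
   long GetValue(std::size_t ref) { ++calls; return m_data.at(ref).value; }
   void Release(std::size_t ref) { ++calls; (void)ref; }

   // Read-only style: the data is copied on the server and returned in one call.
   std::vector<CounterData> GetAllData() { ++calls; return m_data; }

   std::size_t calls;

private:
   std::vector<CounterData> m_data;
};

// Returns the number of round-trips needed to read every counter via references.
std::size_t ListViaReferences(FakeServer &server)
{
   const std::size_t before = server.calls;
   const std::size_t count = server.GetAll();
   for (std::size_t ref = 0; ref < count; ++ref)
   {
      const std::string name = server.GetName(ref);
      const long value = server.GetValue(ref);
      assert(!name.empty());
      (void)value;
      server.Release(ref);
   }
   return server.calls - before;
}

// Returns the number of round-trips needed to read every counter as plain data.
std::size_t ListViaData(FakeServer &server)
{
   const std::size_t before = server.calls;
   const std::vector<CounterData> data = server.GetAllData();
   (void)data;
   return server.calls - before;
}
```

With 1000 counters the reference based listing makes 3001 simulated calls (one to fetch the references and three per counter) against a single call for the data based listing.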

If you now compare the performance of the "list" and "listro" commands on the client you will see that the "listro" command returns data much faster than the "list" command. This is because all of the data copying is done on the server and then it's marshalled across to the client in one call. You'll also notice that there's actually less memory used on the client when we do things this way. If we then compare the "enum" and "enumro" calls we'll see the same speed increases and memory reductions.

Of course, the exact amount of memory used on the client will vary with the data that's being retrieved but this just goes to show that in certain circumstances it is more efficient to allow a client access to all of the data they want in one hit, rather than requiring them to use an object reference to retrieve the data one piece at a time.

The CORBA way
The examples above show how to implement a COM style enumeration interface. There is a slightly more "standard" CORBA way to achieve a similar thing. The Naming Service returns sequences using CORBA iteration interfaces; we'll compare these to the COM style enumeration interfaces in the next article.

Download
The following source was built using Visual Studio 6.0 SP3 and tested with OmniORB - the open source CORBA ORB from AT&T Cambridge. You need to add OMNI_HOME to your environment so that the IDL compiler, headers and libraries can be found.

To compile the IDL files you will need to change the path used in the build command for the idl files - well, you will unless you happen to have installed OMNI ORB into exactly the same location as I have... Select the IDL files in the project workspace, right click on them, select settings and edit the path to the OMNIIDL compiler.

Get OmniORB
Enum1.zip - shows the memory issues in returning a sequence
Enum2.zip - a simple enumeration implementation
Enum3.zip - don't copy objects on the server
Enum4.zip - read only enumeration of data rather than objects


CORBA - Reference Counting Issues was the previous entry in this blog.

CORBA - Iteration is the next entry in this blog.
