CORBA - Reference Counting Issues

At the end of the second article we had developed a self-contained reference counting implementation that appears to work. Unfortunately, it’s still far from reliable, as CORBA doesn’t provide the level of support for reference counting that’s built into COM. In this article we discuss the problem and the various CORBA methods for controlling server object lifetime.

The problem

In the last two articles we’ve worked our way towards implementing a COM-style reference counting system in CORBA. The final example works pretty well, is easy to add to an existing servant object and appears to solve our problem. The catch is that, although it looks like we’re doing everything we would be doing under COM, CORBA doesn’t give us the support that the COM run-time provides. Because of this, any form of effective and safe reference counting is very difficult to achieve within a CORBA system.

Firstly, the COM run-time manages the creation of the proxy and stub code used to talk to an out-of-process object. This means that COM has intimate knowledge of who is holding references to which objects. Secondly, when using COM between machines the run-time sends keep-alive messages, or pings, between machines. This allows the COM run-time on each machine to know when another machine is no longer reachable and to release any references held by clients on that machine. CORBA doesn’t do either of these things, and it’s very hard in a CORBA system to know when a client has gone away. What’s more, CORBA object references can be saved as a string and then passed between clients without the server knowing what’s going on - this is similar to being able to save a COM interface pointer in a stream and un-stream it anywhere, any number of times.

As you can see, although our latest example code works if the client is well behaved and reliable, it can’t be guaranteed to work if the client deliberately tries to subvert the reference count or if the client or the client’s network connection terminates unexpectedly.
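To make the failure mode concrete, here is a minimal in-process simulation. It is not the code from the earlier articles: the `CountedServant` class and the two client functions are hypothetical names invented for this sketch, and a raised exception stands in for a client process or network link dying.

```python
class CountedServant:
    """Hypothetical servant with COM-style add_ref/release semantics."""

    def __init__(self):
        self.ref_count = 1        # the creating client holds the first reference
        self.destroyed = False

    def add_ref(self):
        self.ref_count += 1

    def release(self):
        self.ref_count -= 1
        if self.ref_count == 0:
            self.destroyed = True  # a real server would deactivate the object here


def well_behaved_client(servant):
    servant.add_ref()
    # ... use the object ...
    servant.release()


def crashing_client(servant):
    servant.add_ref()
    # Simulates the client (or its connection) dying before release() is sent.
    raise ConnectionError("client went away")


servant = CountedServant()
well_behaved_client(servant)
try:
    crashing_client(servant)
except ConnectionError:
    pass  # the server is never told; nobody decrements the count

servant.release()  # the creator lets go of its own reference
```

After this runs the count is stuck at 1 and the servant is never destroyed. Nothing in the servant can tell the difference between a client that is still using the object and one that no longer exists.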

The DCOM ping

CORBA doesn’t impose a lifetime management protocol on you. COM does. When using COM between different machines the DCOM run-time sends keep-alive messages at periodic intervals on a per-machine basis. That means that only one ping message is sent per machine connected to a server, and the same packet is used for all client connections between those machines. It doesn’t matter if a client has 100 connections to server objects; only one packet needs to be sent to keep all of the client connections alive. What’s more, DCOM uses delta pinging. This means that, to minimise the size of the ping packet sent between machines, only the changes since the last successful ping message are sent. Finally, COM piggybacks the ping message onto normal messages between the client and server, so it’s only if a client is completely inactive that it actually sends periodic ping messages (at 2 minute intervals).
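The idea behind per-machine delta pinging can be sketched in a few lines. This is only an illustration of the bookkeeping, not the DCOM wire protocol: `ClientMachine`, `ServerMachine`, the dictionary-shaped ping and the host name are all assumptions made up for the example.

```python
class ClientMachine:
    def __init__(self):
        self.held = set()        # ids of objects referenced by any client on this machine
        self.last_acked = set()  # what the server knew after the last successful ping

    def build_ping(self):
        """One ping per machine, carrying only the changes since the last success."""
        return {"add": self.held - self.last_acked,
                "remove": self.last_acked - self.held}

    def ping_succeeded(self):
        self.last_acked = set(self.held)


class ServerMachine:
    def __init__(self):
        self.ping_sets = {}      # machine name -> ids that machine is keeping alive

    def receive_ping(self, machine, ping):
        alive = self.ping_sets.setdefault(machine, set())
        alive |= ping["add"]
        alive -= ping["remove"]


client = ClientMachine()
server = ServerMachine()

client.held.update(range(100))          # 100 references, still one ping
first = client.build_ping()
server.receive_ping("hostA", first)
client.ping_succeeded()

client.held.discard(7)                  # one reference released ...
client.held.add(100)                    # ... and one new one acquired
second = client.build_ping()
server.receive_ping("hostA", second)
client.ping_succeeded()

third = client.build_ping()             # nothing changed, so the delta is empty
```

The first ping lists all 100 objects, the second carries just one addition and one removal, and the third is empty, which is why the steady-state cost stays small however many references a machine holds.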

So, the COM lifetime management overhead is fairly low and provides a useful service. Unfortunately this is not an option with CORBA. CORBA doesn’t do any lifetime management for you. If you want it you have to do it yourself. The people who don’t need lifetime management and who want to scale their systems to a great size think this is cool. They don’t want CORBA doing anything behind their backs. They don’t want one extra message per client machine every 2 minutes. Whilst I agree that it would be good if you could easily turn the DCOM ping off, on a per-object basis, when you didn’t need it, I’m surprised that CORBA doesn’t provide any support for this kind of thing. Given the highly configurable object activation system that’s provided by the POA, it seems that it would be easy to offer some form of client/server keep-alive system as an option.

CORBA solutions

So, robust reference counting solutions aren’t as easy as they look on CORBA due to the lack of support you get from the underlying framework. Basically, you have to do any keep-alive stuff that you might want yourself, in user code.

There are several ways to do this but the favorite amongst CORBA types seems to be using the Evictor Pattern on the server. Rather than allowing the client to control the lifetime of a server object, as you would if you used reference counting, you let the server control it. The idea is that the server can decide that your object is no longer needed and dispose of it for you. How inconvenient that is to you as a client depends on the algorithm that the server’s designer chooses for deciding when to dispose of objects. In the absence of any keep-alive protocol, this method is, at best, the lesser of two evils. Assuming a sensible algorithm is chosen, the server should never get overrun with objects and die due to lack of resources. Of course, it may well become unusable if it gets to a point where it begins to thrash and dispose of objects before clients can do useful work with them. The problem is that the Evictor Pattern always results in more complex code on the client. Even if the server’s eviction algorithm is published as part of the interface specification and doesn’t change, the client code could fall foul of it and should probably be able to handle the need to request another object and resynchronize itself with the newly created state of that object.
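As a rough illustration of both halves of that trade-off, the sketch below pairs a least-recently-used evictor with a client-side wrapper that recovers from eviction. It is plain Python rather than a POA servant manager, and the names `EvictingServer`, `ObjectEvicted` and `invoke_with_recovery`, the fixed capacity and the LRU policy are all assumptions chosen for the example. `ObjectEvicted` plays the role that a system exception such as `OBJECT_NOT_EXIST` would play for a real CORBA client.

```python
from collections import OrderedDict


class ObjectEvicted(Exception):
    """Stand-in for the error a client sees when its object has been disposed of."""


class EvictingServer:
    def __init__(self, capacity):
        self.capacity = capacity
        self.active = OrderedDict()  # object id -> state, least recently used first
        self.next_id = 0

    def create(self):
        if len(self.active) >= self.capacity:
            self.active.popitem(last=False)   # evict the least recently used object
        oid = self.next_id
        self.next_id += 1
        self.active[oid] = {"calls": 0}
        return oid

    def invoke(self, oid):
        if oid not in self.active:
            raise ObjectEvicted(oid)
        self.active.move_to_end(oid)          # mark as most recently used
        self.active[oid]["calls"] += 1
        return self.active[oid]["calls"]


def invoke_with_recovery(server, oid):
    """Client side: if the object was evicted, request a new one and start again."""
    try:
        return oid, server.invoke(oid)
    except ObjectEvicted:
        new_oid = server.create()
        return new_oid, server.invoke(new_oid)


server = EvictingServer(capacity=2)
a = server.create()
b = server.create()
server.invoke(a)                 # a is now the most recently used object
c = server.create()              # capacity reached: b is evicted

a, a_calls = invoke_with_recovery(server, a)    # a survived, state intact
b2, b_calls = invoke_with_recovery(server, b)   # b is gone; the client gets a fresh object
```

Note that recovering `b` creates a new object with fresh state and, because the server is full, evicts `c` in turn. That is the thrashing risk described above in miniature, and it is why the client has to be written to resynchronize rather than assume its object is still there.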

One problem with the Evictor Pattern is that it doesn’t work in all situations where, if you were a COM person, you’d usually use reference counting. For example, we can’t replace our use of reference counting in the previous examples with a pure Evictor implementation, as the objects’ lifetimes could be completely random… In the next two articles we’ll discuss implementing COM IEnum style enumeration interfaces in CORBA and then their CORBA equivalent, iteration interfaces. We’ll assume that the level of safety given by our current reference counting implementation is sufficient, for the time being, and then adapt these objects to use the Evictor Pattern - as they are suited to it.

If we wanted to continue towards a truly robust reference counting implementation then we would have to implement the keep-alive protocol ourselves. This can be driven from the client end, by using explicit keep-alive messages or a combination of keep-alive messages and normal method calls, with the server evicting the object if it hasn’t been touched for a certain period of time. Alternatively it can be driven from the server end, with a reverse keep-alive where the client hands out a callback interface, either once or with every reference counted object it requests, and the server then periodically pings its clients - perhaps just before it attempts to evict their object.
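Both variants can be sketched with one small bookkeeping class. This is a hedged illustration only: `LeasedObjects`, its method names, the 120 second timeout and the injected `clock` are assumptions made for the example, and the `ping_client` callable stands in for a call on the client’s callback interface.

```python
class LeasedObjects:
    def __init__(self, timeout, clock):
        self.timeout = timeout
        self.clock = clock           # callable returning the current time in seconds
        self.last_touched = {}       # object id -> time of last call or keep-alive

    def register(self, oid):
        self.last_touched[oid] = self.clock()

    def keep_alive(self, oid):
        """Client-driven: an explicit ping; normal method calls would do the same."""
        if oid not in self.last_touched:
            raise KeyError(oid)
        self.last_touched[oid] = self.clock()

    def sweep(self, ping_client=None):
        """Evict objects untouched for longer than the timeout.

        Server-driven variant: if ping_client is given, it is asked about each
        stale object just before eviction and can renew the lease.
        """
        now = self.clock()
        evicted = []
        for oid, seen in list(self.last_touched.items()):
            if now - seen <= self.timeout:
                continue
            if ping_client is not None and ping_client(oid):
                self.last_touched[oid] = now
                continue
            del self.last_touched[oid]
            evicted.append(oid)
        return evicted


now = [0.0]                                   # fake clock so the example is deterministic
leases = LeasedObjects(timeout=120, clock=lambda: now[0])
for name in ("enum-1", "enum-2", "enum-3"):
    leases.register(name)

now[0] = 100.0
leases.keep_alive("enum-1")                   # client-driven keep-alive

now[0] = 150.0
# Pretend only the owner of enum-3 still answers its callback.
evicted = leases.sweep(ping_client=lambda oid: oid == "enum-3")
```

Here `enum-1` survives because its client pinged in time, `enum-3` survives because its client answered the reverse ping, and `enum-2` is evicted. In a real system the sweep would run on a timer and the callback would be a remote call that can itself fail or time out.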

The thing to remember is, if the run-time doesn’t help you then simple reference counting schemes can’t be robust.

Revision history