Explicit class initialiser methods

Codemonkey uk has an interesting piece on the use of explicit initialiser and destroy member functions rather than allowing object lifetime to be managed by the constructor and destructor.

Codemonkey uk comes down on the right side of the argument, in my opinion; explicit initialiser methods are devil spawn.

The problem with classes that have a separate init method, no matter what it's called, is that the class can exist in at least two states: constructed and initialised. Add in explicit destruction and you have three states: constructed, initialised, destructed. The fact that there are different states means that the object is harder to write and harder to use. As a user of the object, you need to know what you can do in each state and you need to know what state the object is in. As a writer of the object, you need to support these states internally; you need to know which state the object is in and what's allowed in each state. This complicates the object and the use of the object, and unnecessary complexity is the number one enemy of programming.

The arguments that Codemonkey uk's friends use are all pretty weak and all stem from inappropriate use of objects.

1) Explicit initialisation and destruction prevent hidden construction/destruction. Don't ya just love the ol' 'my complexity is for performance' excuse? If you don't want to incur the performance hit of fully constructing your object, don't construct it. Simple. Codemonkey uk hits the nail on the head with his response to their argument. If you have an object that can operate in two modes, make it two objects so that both are always fully constructed and the operations that you can do on each are always available. Then switch from one mode to the other by calling a method on the first object that returns you the second... If you are wary of accidentally constructing objects that are expensive, don't allow accidents to happen; know what you're doing. Use language features such as explicit single-argument constructors to remove, forever, the 'chance' that an object might get constructed when you don't want it to; use private, undefined copy constructors and assignment operators to prevent unexpected copying of objects. Program by intent, not by accident. When you create an object you accept whatever performance hit you're going to take, in exactly the same way you do when you explicitly call the 'init' method. So, the first argument for is rubbish ;)
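As a rough sketch of the two language features mentioned above, here is a hypothetical `Buffer` class (the name and members are made up for illustration). The private, undefined copy constructor was the idiom of the time; since C++11 the same intent can be written with `= delete`, which is what the sketch assumes.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Hypothetical class, for illustration only.
class Buffer
{
public:
    // 'explicit' stops a bare size from silently converting into a Buffer.
    explicit Buffer(std::size_t size) : data_(size) {}

    // Copying an expensive object has to be asked for, never accidental.
    // (Pre-C++11 code declared these private and left them undefined.)
    Buffer(const Buffer &) = delete;
    Buffer &operator=(const Buffer &) = delete;

    std::size_t size() const { return data_.size(); }

private:
    std::vector<char> data_;
};

std::size_t bytesIn(const Buffer &buffer) { return buffer.size(); }

// bytesIn(1024);           // does not compile: no implicit conversion
// Buffer copy = original;  // does not compile: copying is deleted
```

Both accidents now fail at compile time, so the only way to pay for a `Buffer` is to write `Buffer b(1024);` on purpose.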

2) Implementing constructors that can fail where use of exceptions is problematic. Simple answer to this one: an object factory. The factory effectively does everything that the constructor would do before calling the real (private) constructor. The factory can do any checks it likes and report the failure any way it likes, but if it gives you an object, it's a real object, fully constructed and ready to use, not some half-baked blob.
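A minimal sketch of such a factory, assuming a hypothetical `Endpoint` class and an error string as the reporting mechanism (both are my choices for the example, not anything from the original discussion). Validation failures come back through the return value; the constructor itself is private and only runs once the checks have passed.

```cpp
#include <memory>
#include <string>

// Hypothetical class; the names and error-reporting style are assumptions.
class Endpoint
{
public:
    // The factory does the checks a throwing constructor would have done.
    // It returns null and fills in 'error' when validation fails.
    static std::unique_ptr<Endpoint> create(const std::string &host, long port,
                                            std::string &error)
    {
        if (host.empty())
        {
            error = "empty host";
            return nullptr;
        }
        if (port < 1 || port > 65535)
        {
            error = "port out of range";
            return nullptr;
        }
        return std::unique_ptr<Endpoint>(
            new Endpoint(host, static_cast<unsigned short>(port)));
    }

    const std::string &host() const { return host_; }
    unsigned short port() const { return port_; }

private:
    // Private: only the factory can call this, and only with valid input.
    Endpoint(const std::string &host, unsigned short port)
        : host_(host), port_(port) {}

    std::string host_;
    unsigned short port_;
};
```

If `create` hands you a non-null pointer, the `Endpoint` behind it is complete; there is no "constructed but not initialised" state to check for.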

3) Some complex memory management excuse. Yeah, right, whatever. ;) Placement new should solve the problem, and you probably don't need to get that complex most of the time. How does splitting construction and initialisation help?
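For reference, a small sketch of what placement new gives you, using a hypothetical `Message` type. Where the memory comes from is decided separately, but the object is still built in a single step by its constructor.

```cpp
#include <cstddef>
#include <new>
#include <string>

// Hypothetical type, for illustration only.
struct Message
{
    explicit Message(const std::string &text) : text(text) {}
    std::string text;
};

std::size_t placementNewDemo()
{
    // Raw, suitably aligned storage; no Message exists here yet.
    alignas(Message) unsigned char storage[sizeof(Message)];

    // The memory is already there; placement new only runs the constructor.
    Message *message = new (storage) Message("hello");
    std::size_t length = message->text.size();

    // The matching explicit destructor call; the storage itself is not freed.
    message->~Message();
    return length;
}
```

The memory management is as custom as you like, yet there is never a `Message` that exists without having been fully constructed.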

In summary, in my opinion, none of the arguments hold water. Two stage object construction leads to more complicated code, both in the client of the object and in the object itself. Checking an internal state inside the object to see if you're fully initialised is just poor design. Avoid it.

20 Comments

I'd have to agree with you here. I don't think anybody in the C++ community would now say that implicit construction and ESPECIALLY implicit destruction are good ideas. In fact I have become a pretty big fan of RAII, which requires deterministic destruction. It is one of the major benefits of C++ over languages like Java, C#, and Python.

If you are not familiar with ScopeGuard, here's the relevant article:

http://www.cuj.com/documents/s=8000/cujcexp1812alexandr/alexandr.htm

Agree re RAII. Thanks for the link; interesting stuff.

The nice thing about ScopeGuard is not only does it work with exceptions, it works really well with functions that have multiple return paths. This is far more common with my style of coding. It is very useful for low level systems programming where you either have to clean up handles or file descriptors in failure conditions.
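To illustrate the multiple-return-path point, here is a minimal guard of my own. It is not the ScopeGuard from the linked article, and `acquireHandle`/`releaseHandle` are stand-ins for whatever handle-based API is being wrapped.

```cpp
#include <functional>
#include <utility>

// A minimal guard, written for this example; not the article's ScopeGuard.
class ExitGuard
{
public:
    explicit ExitGuard(std::function<void()> cleanup)
        : cleanup_(std::move(cleanup)) {}
    ~ExitGuard() { cleanup_(); }

    ExitGuard(const ExitGuard &) = delete;
    ExitGuard &operator=(const ExitGuard &) = delete;

private:
    std::function<void()> cleanup_;
};

// Stand-ins for a handle-based API; 'openHandles' counts handles in use.
int openHandles = 0;
int acquireHandle() { return ++openHandles; }
void releaseHandle(int) { --openHandles; }

// Three return paths, one cleanup statement.
bool process(int input)
{
    int handle = acquireHandle();
    ExitGuard guard([handle] { releaseHandle(handle); });

    if (input < 0)
        return false;      // early failure: handle released
    if (input == 0)
        return true;       // early success: handle released
    return input % 2 == 0; // normal path: handle released
}
```

Whichever `return` is taken, the guard's destructor runs and the handle count goes back to zero.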

My colleague Thomas Becker has pointed out a couple of problems with the implementation. First, it doesn't allow you to check the return value of the clean up function. This may or may not be important. Secondly, it uses catch(...) in the destructor to prevent the exception from propagating out of the destructor (never a good thing). This prevents you from handling the exceptions that are caught, and if you are using _set_se_translator to throw the structured exception as a C++ exception (which many people in the C++ community now claim is a bad idea), then you could mask a lot of problems (e.g. NULL pointer exceptions, etc.)

Christopher, I presume you meant "nobody would say that explicit ctors/dtors are good ideas?"

Anyway, for what it's worth, as a C guy I'd agree ... even though it's sometimes required to have them be separate (and even then, that's limited to when allocating on the stack).

It also seems like a major wart in Java/C#/etc. that you have to manually call dispose, and I'm not convinced that 'using' papers over it well enough.

Christopher,

I tend to craft my own RAII style wrapper classes, but I can see that scopeguard has its place. I usually end up with more than just an 'auto close' wrapper; ie one that adjusts interfaces to suit how I want to use them, so the fact that the dtor is doing the clean up is just a small part of the class...

Agree re catch(...), you could throw two kinds of exception from _set_se_translator though, ones you might be able to recover from and ones you can't... I've been bitten by catch(...) recovering from things that shouldn't be recovered from and it's not fun; but I still prefer to have my SE exceptions funnelled down the same route as my C++ ones; at least at present.

James,

Agree re using and dispose; I'd personally prefer to see using as optional and if you have a dispose method then it gets called when your object 'goes out of scope'. If you don't want that to happen you use a 'notusing' ...

I'm now pretty convinced that translating SEs to C++ exceptions is a bad idea, although I've done it myself for years. I had a long discussion on comp.lang.c++ over the stack overflow problem. If a stack overflow occurs while calling a function and a C++ exception is generated, there is no way to enforce the no-throw guarantee. This is a bad thing. Basically, the C++ exception model is broken. An argument David Abrahams made convinced me that it is generally a bad idea. Most of the time when an SE is raised there isn't a whole lot that can be done anyway. I usually just log it and exit, and try to prevent SEs from occurring.

> Christopher, I presume you meant "nobody would say that explicit ctors/dtors are good ideas?"

Yes that is correct. Sorry. I wish I could have JIT editing.

There's no way to enforce a no-throw guarantee with SEs anyway, and if you've got a stack overflow you're screwed anyway, so, like you say, the only thing you can do is log as much info as you can and try to make sure that they don't happen. However, I personally prefer to at least try to shut down cleanly, and if it's 'just' an access violation, you can usually manage to stagger towards shutdown... I find that converting SEs to C++ exceptions makes this easier. It may just be that almost all of my exception "handling" is of the 'oh dear, we're screwed, let's see if we can die cleanly' variety, so it fits reasonably well with that...

Len, yeh, maybe. It might even be better to have it so everything declared in a function would always be using() ... but it's too late for either of those things now.

You might also be sickly amused by:

http://blogs.msdn.com/ericgu/archive/2004/07/23/192819.aspx

Agree it's too late :( If deterministic destruction didn't matter, why add using at all, and if it did, why not do it properly ;)

Thanks for the link, I saw that post yesterday and cringed... :)

Hear, hear, Len.

Deterministic destruction does matter. In fact I believe it is possible to address many of the problems that exceptions were meant to solve with RAII, return values, and functions with multiple exit points. This is the style I now use in C++, and I have to say it works very well for me. In fact I find languages like Python very frustrating since they don't provide deterministic destruction. I personally don't think try: finally: is a good work around.

I very rarely use exceptions in C++. RAII is mostly all I need.

Chris

How do you get around the need to multiplex valid return values with errors? Take, for example, a factory method that returns an object by value... If you switch to returning by pointer with null to indicate failure then you can no longer write code that ignores optionality - this function always returns an object or throws - if you 'return' the error result by reference as an 'in/out' param then you need to have your object support a 'not valid' state ... I know you can take the COM style route; everything uses the return value for error and status and anything real comes out via in/out params, but I find that you lose so much simplicity by taking that route; lack of const correctness, complex error handling, lots of optionality in the code. I just find exceptions help so much with all of that that the resulting code is just so much easier to work with... I'd be interested in seeing some examples of your style though.

Len

boost::tuple is an option.
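A sketch of what the tuple route might look like, using `std::tuple` (the standard library's counterpart to boost::tuple) and a hypothetical `Shape` factory; none of these names come from the discussion.

```cpp
#include <string>
#include <tuple>

// Hypothetical status code and object type, for illustration only.
enum class Status { Ok, UnknownType };

struct Shape
{
    std::string name;
    int sides;
};

// Returns the status and the object together, with no exceptions involved.
std::tuple<Status, Shape> makeShape(const std::string &typeName)
{
    if (typeName == "triangle")
        return std::make_tuple(Status::Ok, Shape{"triangle", 3});
    if (typeName == "square")
        return std::make_tuple(Status::Ok, Shape{"square", 4});

    // On failure a Shape still has to be returned, so the type needs
    // some 'not valid' value to fill the slot.
    return std::make_tuple(Status::UnknownType, Shape{"", 0});
}
```

The caller unpacks the pair with `std::tie(status, shape) = makeShape("square");` and has to check `status` before trusting `shape`. It works, though the failure branch shows the cost: the object type has to be able to represent "nothing here".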

> Take, for example, a factory method that returns an object by value...

Give me a more concrete example. Do you mean:


foo buildFoo()
{
    return foo(1,2,3);
}

?

If the constructor of foo is no-throw, how could this fail? Stack overflow is one way, but we've pretty much decided you are screwed in that case anyway.

You can argue for using exceptions from constructors that could fail. Although I would try to avoid writing such constructors. I didn't say I didn't use exceptions at all. I just don't use them often.

For instance:


struct STATUS
{
    ..blah..
};

STATUS DoIt()
{
    int fd = ::connect(..blah..);
    if(fd < 0){
        return ERROR;
    }
    ON_BLOCK_EXIT(close, fd);

    if(::write(..blah) == -1)
        return ERROR;

    if(::read(..blah..) == -1)
        return ERROR;

    return SUCCESS;
}

Looks like your comments box isn't too code friendly.

Christopher

I've edited the comment and added some <pre> tags to pretty it up a bit...

Will reply when I've had some sleep.

Chris

I was actually referring to more complicated object factories; such as pluggable factories where you may be requesting an object by type name, or whatever. These kinds of factories have many more chances to fail; request for unknown object type, etc, as well as failure to construct a valid object... In these cases I'm loath to use the return value for error reporting...
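Something like the following is the kind of pluggable factory meant here; a rough sketch with made-up names (`Codec`, `CodecFactory`), where an unknown type name is reported by throwing so that the return value is always a usable object.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical interface and implementation, for illustration only.
struct Codec
{
    virtual ~Codec() {}
    virtual std::string name() const = 0;
};

struct NullCodec : Codec
{
    std::string name() const override { return "null"; }
};

class CodecFactory
{
public:
    typedef std::function<std::unique_ptr<Codec>()> Creator;

    // Plug in a creator function under a type name.
    void registerType(const std::string &typeName, Creator creator)
    {
        creators_[typeName] = std::move(creator);
    }

    // Returns a usable object or throws; callers never have to test for null.
    std::unique_ptr<Codec> create(const std::string &typeName) const
    {
        std::map<std::string, Creator>::const_iterator it = creators_.find(typeName);
        if (it == creators_.end())
            throw std::invalid_argument("unknown type: " + typeName);
        return it->second();
    }

private:
    std::map<std::string, Creator> creators_;
};
```

Client code can then write `factory.create("null")->name()` without any optionality, and the unknown-type case is handled wherever the exception is caught.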

Len

There are scenarios where you want an explicit Init() function, separate from constructors.

1) Where you have multiple constructors that each take different parameters, but have a lot of duplicate setup code. You have them all call Init() at some point.

2) Where you have serialization, that is, a way to set an object's state completely from an external source. You need to have the object constructed before you stream a stored version of it, into its binary form. If you have full construction it is a complete waste because you are immediately going to overwrite all of the internal state.

3) Member objects may need to exist in a "closed" state until the containing class needs them. For example you may have a class that wants to access files and has a "File" member object. That object may need to be opened and closed (init and de-init) multiple times. You may have a TCP socket class that you want to "re-connect" if it loses a connection. It needs to be able to be re-initialized and closed without being destroyed. This means you must actually get a new socket() from the operating system without destroying the enclosing class.

4) You may have static objects which cannot be initialized but need to be constructed. For example a global TCP socket object that must be constructed, but cannot be connected until the entire program is up and running. Is a TCP socket initialized if it's not connected? Who knows? "Initialized" is a subjective thing. There is no clear objective meaning.

You may say, the above examples should have a "pointer" which you new and delete. That may be true but if you do that, you are really just admitting that you need an object to virtually be there, if it isn't ready to be used, which is my point.

I have been working with Java and C# and I can say without a hint of uncertainty that non-deterministic destruction is terrible. The entire concept shows a lack of understanding of what a destructor is. The Java authors attempted to solve a minor C++ problem (memory leaks) by destroying the most important part of OOP.

I always said if there was one thing I could have from OOP, it would be destructors. Destructors are the one thing you cannot write in another language. You cannot add them to C or assembly or earlier languages because they are a concept, not a statement. If you leave a scope, there is no trigger mechanism in non-OOP languages to destruct temporaries.

Someone at Sun thought that memory de-allocation was all that destructors did. WRONG. I continually rely on automatic destruction to close files, release mutexes, drop tcp connections, close windows etc. When exceptions are thrown destructors are the only code you can rely on to clean up. Imagine if you were relying on a mutex destructor to release a mutex, and the language didn't call that destructor until it felt like it?
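The mutex case can be sketched with the standard library's `std::lock_guard`; the function and variable names below are invented for the example.

```cpp
#include <mutex>
#include <stdexcept>

std::mutex counterMutex;
int counter = 0;

// Takes the lock, then either throws or increments the counter.
void incrementOrThrow(bool fail)
{
    std::lock_guard<std::mutex> lock(counterMutex);

    if (fail)
        throw std::runtime_error("failed while holding the lock");

    ++counter;
} // the lock_guard destructor releases the mutex on both exits
```

Calling `incrementOrThrow(true)`, catching the exception, and then calling `incrementOrThrow(false)` works because the destructor ran during stack unwinding. If the release happened only when the runtime "felt like it", the second call would have no guarantee of ever getting the mutex.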

Walter,

I agree that sometimes Init methods have their place, but most of your examples would have a private init method that would be called as part of construction, either by the object's constructors in the case of multiple ctors with common requirements or, perhaps, by a factory that handles serialising the object.
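A small sketch of the private init idea, with a hypothetical `Logger` class. Since C++11, delegating constructors can do the same job without the helper.

```cpp
#include <string>

// Hypothetical class, for illustration only.
class Logger
{
public:
    Logger() { init("default", 0); }
    explicit Logger(const std::string &name) { init(name, 0); }
    Logger(const std::string &name, int level) { init(name, level); }

    const std::string &name() const { return name_; }
    int level() const { return level_; }

private:
    // Shared setup. It is private and only called from constructors,
    // so no caller can ever observe an uninitialised Logger.
    void init(const std::string &name, int level)
    {
        name_ = name;
        level_ = level;
    }

    std::string name_;
    int level_;
};
```

The duplicate setup code lives in one place, but from the outside the class still has exactly one state: constructed and ready.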

For the file and TCP connection examples I'd say that the object could be designed to be completely constructed and not referencing a socket or file. You'd then call open. The point is not that the object doesn't need anything else done to it to be usable but that construction itself doesn't require multiple steps.
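Roughly what I mean, as a sketch with a hypothetical `Connection` class. To keep it self-contained, `open` only records the address and does no real I/O; a real class would acquire a socket there.

```cpp
#include <string>

// Hypothetical class; 'open' only records the address, no real I/O is done.
class Connection
{
public:
    // Construction is one step and always yields a valid, closed connection.
    Connection() : open_(false) {}
    ~Connection() { close(); }

    // Opening is an operation on a complete object, not a second stage of
    // construction. It can be called again later to reconnect.
    bool open(const std::string &address)
    {
        close();
        if (address.empty())
            return false;
        address_ = address;
        open_ = true;
        return true;
    }

    void close() { open_ = false; }
    bool isOpen() const { return open_; }

private:
    std::string address_;
    bool open_;
};
```

"Closed" here is a legitimate state of a finished object with a well-defined set of operations, which is different from an object that has been constructed but is waiting for an `Init()` call before anything is safe.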

Oh, and I completely agree about the deterministic destruction comments.
