Wednesday, June 14, 2006 By: Jason Doucette
Large Classes Mean No Copies Wanted
I am involved in a rather large project written mostly in C. I use C++ where it helps, so I have a few classes. They exist mainly to define a proper hierarchy and to encapsulate data. These classes hold a large amount of data, which is dynamically allocated, since large fixed-size members would eat up stack space whenever an object is created as a local variable. Because they hold so much, I do not want copies of them being made. Making copies would be a performance issue and a memory issue, and there is simply no reason to have duplicates to begin with. If a copy were ever made, it would definitely be a bug in my code, and I would like to know about it so I could fix it. It would be great if the compiler simply disallowed this and refused to compile, since compile time is the best time to find bugs. But such is not the case…
C++'s Forced Favours are Unhelpful
The C++ standard states that a default copy constructor and a default assignment operator will be generated for any class that does not declare its own. This happens whether you want them or not. It seems an awfully nice thing to do. Except, you cannot reject this help. Favours that you did not ask for are usually not all that appealing, and this case is no different. The C++ standard makes the most ridiculous versions of these member functions you can possibly imagine. The default versions of the copy constructor and the assignment operator merely copy all of your data members over as-is, with no further thought. Yes, they do invoke the copy constructor or the assignment operator of each data member, so members of class type are copied properly. They do not just perform a raw data copy via memcpy(), which would not invoke any copy constructors or assignment operators, so it is not that bad. But realize that the copy constructors and assignment operators invoked on your data members, if they too were created automatically by the compiler, will merely do the same thing. What’s bad about this? If you have pointers, you will wind up with two pointers pointing to the same piece of data. This is not only a concern for dynamically allocated memory. It is equally bad for pointers that point to anything other than NULL, such as other data members. Most C++ mentors fail to point this out, likely with the excuse that it was “left as an exercise for the student”. Let’s hope the exercise is not production code. So, for the purposes of anything larger than a trivially contrived C++ tutorial example, these member functions may as well just call exit().
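To make the pointer problem concrete, here is a minimal sketch. The struct `Naive`, its members, and the helper `CopySharesPointers()` are hypothetical names invented for illustration; the struct deliberately has no destructor so that the demonstration can free the shared block exactly once.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical class: owns a heap block and keeps a pointer into its own array.
struct Naive
{
    char  name[16];   // in-object storage
    char* cursor;     // points into this object's `name`
    int*  samples;    // points at heap memory

    Naive() : cursor(name), samples(new int[4]())
    {
        std::strcpy(name, "original");
    }
    // No copy constructor or assignment operator is declared,
    // so the compiler generates member-wise versions.
    // No destructor either: the demo below frees `samples` by hand.
};

// Returns true if the compiler-generated copy shares both pointers with its source.
inline bool CopySharesPointers()
{
    Naive a;
    assert(a.cursor == a.name);

    Naive b(a);                                    // implicit member-wise copy

    bool sharedHeap  = (b.samples == a.samples);   // both point at the same heap block
    bool strayCursor = (b.cursor == a.name);       // b's cursor points into a, not b

    b.samples[0] = 42;                             // writes through to a's data
    bool aliased = (a.samples[0] == 42);

    delete[] a.samples;                            // freed once; b.samples now dangles
    return sharedHeap && strayCursor && aliased;
}
```

If `Naive` had an ordinary destructor that called `delete[] samples`, the same block would be freed twice, once for each object.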
Stop Complaining. It is Easily Resolved.
Hey, after all, you just need to program your own copy constructor and assignment operator, right? C’mon! It’s easy. For the copy constructor, if you have pointers, just allocate some more memory, assign the result of the allocation to the new object’s pointer, and copy the data over. Voilà! The assignment operator is not much harder. You can do the same thing as above, but first you should check that you are not assigning an object to itself. If you are, return immediately. If not, deallocate the destination’s old memory before allocating the new block (if you did this during self-assignment, you would lose your data!). In fact, for the assignment operator, if the dynamic memory is known to be the same size, you do not even have to deallocate it, since you can just reuse it!
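The recipe above can be sketched roughly as follows. `Buffer` and all of its members are hypothetical names chosen for illustration, and this is only one reasonable way to write it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical class that owns a dynamically allocated array of ints.
class Buffer
{
public:
    explicit Buffer(std::size_t count)
        : m_count(count), m_data(new int[count]()) {}

    ~Buffer() { delete[] m_data; }

    // Copy constructor: allocate fresh memory, then copy the contents over.
    Buffer(const Buffer& other)
        : m_count(other.m_count), m_data(new int[other.m_count])
    {
        std::memcpy(m_data, other.m_data, m_count * sizeof(int));
    }

    // Assignment operator.
    Buffer& operator=(const Buffer& other)
    {
        if (this == &other)            // self-assignment: nothing to do
            return *this;

        if (m_count != other.m_count)  // sizes differ: replace the block
        {
            int* fresh = new int[other.m_count];
            delete[] m_data;
            m_data  = fresh;
            m_count = other.m_count;
        }
        // Same size (or freshly replaced): copy into the block we now hold.
        std::memcpy(m_data, other.m_data, m_count * sizeof(int));
        return *this;
    }

    int&        operator[](std::size_t i)       { return m_data[i]; }
    const int&  operator[](std::size_t i) const { return m_data[i]; }
    std::size_t Count() const                   { return m_count; }

private:
    std::size_t m_count;
    int*        m_data;
};
```

One small departure from the prose: the sketch allocates the new block before releasing the old one, so that a failed allocation leaves the destination object untouched.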
So, How is this Bad?
I will tell you:
So, you need to worry about ALL this even when you never want a copy of your classes to be made! This is the result of an early decision that C++ should make its own versions of these member functions... versions that do not work except for the most elementary classes. I hope someone can explain what purpose they serve, because they have little use in my experience. Their detriments far outweigh any use I can imagine at this time. It is unfortunate that this mistake cannot simply be fixed by turning off this ‘feature’ without breaking millions of lines of code that rely on the original behaviour. Perhaps compilers could have a switch that toggles the existence of the feature, thus allowing both worlds to co-exist.
Is There Any Hope?
Must we submit to this atrocity? If you do not want these member functions, and C++ gives them to you anyway, perhaps you can prevent them from being invoked. A first stab may be to write our own versions that do nothing but assert() that they are never run. This warns you to fix whatever code is causing a copy to be made. But this is a run-time catch, not a compile-time catch. It is fine in the lab, but it is not OK when one of these is invoked in a shipped build, in some code path that was never exercised during beta testing, on a customer’s machine. Ouch.
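A minimal sketch of that first stab, assuming a hypothetical class `Level` (the class, its members, and the helper `TileCount()` are invented for illustration):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical class that should never be copied.
class Level
{
public:
    explicit Level(std::size_t tiles)
        : m_tiles(tiles), m_map(new unsigned char[tiles]()) {}
    ~Level() { delete[] m_map; }

    // Copying is a bug: trap it at run time.
    Level(const Level&) : m_tiles(0), m_map(0)
    {
        assert(false && "Level must not be copied");
    }
    Level& operator=(const Level&)
    {
        assert(false && "Level must not be assigned");
        return *this;
    }

    std::size_t Tiles() const { return m_tiles; }

private:
    std::size_t    m_tiles;
    unsigned char* m_map;
};

// Passing by reference makes no copy, so the assertions never fire here.
inline std::size_t TileCount(const Level& level) { return level.Tiles(); }
```

Note that assert() expands to nothing when NDEBUG is defined, as it typically is for release builds, so there the accidental copy would silently produce an empty object instead of stopping the program.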
The Real Solution
The only decent way to prevent them from being invoked is to declare them as private. This way, no users of your classes can invoke them, and the compiler will complain if they try. Code cannot leave the lab without being compiled, so no such functions will ever be invoked on a customer’s machine. Catching bugs at compile-time is orders of magnitude better than at run-time: they are much less costly to fix, and you will not have angry customers. Not so fast. The class itself can access its own private members. Is there a way to avoid this? Yes! Declare the member functions, but do not define them. That is, provide the function prototypes, so that the compiler does not automatically generate its own versions and knows you want them private, but do not write the function bodies. If you attempt to use them from within the class, compilation succeeds, but the linker will fail to find the code it needs and will issue an error such as Microsoft’s “LNK2019: unresolved external symbol”. This is the solution in code: class CExample
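A minimal sketch of the idiom described above, assuming a hypothetical class that owns a heap buffer. Only the name `CExample` comes from the text; the members and the helper `CountOf()` are illustrative.

```cpp
#include <cassert>
#include <cstddef>

class CExample
{
public:
    explicit CExample(std::size_t count)
        : m_count(count), m_data(new int[count]()) {}
    ~CExample() { delete[] m_data; }

    std::size_t Count() const { return m_count; }

private:
    // Declared private and intentionally never defined.
    CExample(const CExample&);
    CExample& operator=(const CExample&);

    std::size_t m_count;
    int*        m_data;
};

// Passing by reference never needs a copy, so this still compiles and links.
inline std::size_t CountOf(const CExample& e) { return e.Count(); }

// Outside the class, any attempt to copy fails at compile time:
//     CExample a(8);
//     CExample b(a);   // error: the copy constructor is private
//     b = a;           // error: the assignment operator is private
```

In C++11 and later, the same intent can be written as `CExample(const CExample&) = delete;` (and likewise for `operator=`), which turns even a copy made inside the class into a compile-time error rather than a link-time one.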
Final Thoughts
Your C++ mentor may have led you to believe these member functions are special (apparently the C++ Standard calls them "special"), but they are just member functions. The copy constructor is just a constructor. It is a constructor that takes one parameter, whose type happens to be the class itself. This parameter does not even need to be constant. It does, however, need to be a reference. (If it were ‘passed by value’, the compiler would have to make a copy of the argument whenever the function was invoked. How would it make that copy? By calling the copy constructor, of course. Say “Hello, infinite recursion”. The standard rejects such a declaration as ill-formed.) The default assignment operator is just an overloaded operator. If the compiler did not automatically make its own useless versions, you would not even think about them in a special way. You would just write them when they needed to be written. How wonderful would that be?
About the Author: I am Jason Doucette of Xona Games, an award-winning, team-of-two indie studio concentrating on "intense retro" games (Xbox LIVE, PSN, WiiWare, and Windows PC). We've released Decimation X (XBLIG), a 1-4 player shmup, #1 best selling and #1 top rated XBLIG in Japan. We're working on Duality ZF (XBLA), a groundbreaking 1-4 player shmup, which placed #1 in Canada and #5 in the world in Microsoft's Dream Build Play 2010 contest. It features dual play, the ability to control two fighters at once, and a massively upgradable 32-stage spread/laser weapon system. 4 player dual play allows up to eight fighters at once. Many of these features are never before seen shoot'em up firsts. Both games feature beautiful electronic Imphenzia soundtracks. Help spread the word with our official dualityzf.com and decimationx.com websites. P.S. Watch out for Score Rush (official website scorerush.com), another 1-4 player shmup. Coming soon to XBLIG. *Shmup also known as: shoot'em up, 2D shooter, scrolling shooter, space shooter, spaceship shooter, retro shooter, etc.