Tuesday, September 28, 2004

Top Ten Traps in C# for C++ Programmers

Trap #1: Nondeterministic finalization and the C# destructor

The biggest difference in C# for most C++ programmers will be garbage collection. You will no longer have to worry about memory leaks and ensuring that pointers are deleted, but you also give up precise control over when your objects will be destroyed.
If you control an unmanaged resource, however, you will need to explicitly free that resource when you are done with it. Implicit control over unmanaged resources is provided by a destructor, which will be called by the garbage collector when your object is destroyed.
The destructor should only release unmanaged resources that your object holds on to, and it should not reference other objects. If you have only managed references you do not need to (and should not) implement a destructor. You want this only for handling unmanaged resources. Because there is some cost to having a destructor, you ought to implement one only on classes that hold valuable, unmanaged resources.
You never call an object’s destructor directly. The garbage collector will call it for you.
How destructors work
The garbage collector maintains a list of objects that have a destructor. This list is updated every time such an object is created or destroyed.
When an object on this list is first collected, it is placed on a queue with other objects waiting to be destroyed. After the destructor executes, the garbage collector then collects the object and updates the queue, as well as its list of destructible objects.

The C# destructor
C#’s destructor looks, syntactically, much like a C++ destructor, but it behaves quite differently. You declare a C# destructor with a tilde as follows: ~MyClass(){}
In C#, however, this syntax is simply a shortcut for declaring a Finalize() method that chains up to its base class. Thus, when you write:
~MyClass()
{
    // do work here
}
the C# compiler translates it to:
protected override void Finalize()
{
    try { /* do work here */ }
    finally { base.Finalize(); } // chain up to the base class
}

Trap #2: Finalize versus Dispose
It is not legal to call a destructor explicitly. Your destructor will be called by the garbage collector. If you do handle precious unmanaged resources (such as file handles) that you want to close and dispose of as quickly as possible, you ought to implement the IDisposable interface. The IDisposable interface requires its implementers to define one method, named Dispose(), to perform whatever cleanup you consider to be crucial. The availability of Dispose() is a way for your clients to say, "Don’t wait for the destructor to be called; do it right now."
If you provide a Dispose() method, you should stop the garbage collector from calling your object's destructor. To do so, you call the static method GC.SuppressFinalize(), passing in the this reference for your object. Your destructor can then call your Dispose() method. Thus, you might write:
using System;
class Testing : IDisposable
{
    bool is_disposed = false;
    protected virtual void Dispose(bool disposing)
    {
        if (!is_disposed) // only dispose once!
        {
            if (disposing)
                Console.WriteLine("Not in destructor, OK to reference other objects");
            // perform cleanup for this object
        }
        this.is_disposed = true;
    }
    public void Dispose()
    {
        Dispose(true);
        GC.SuppressFinalize(this); // tell the GC not to finalize
    }
    ~Testing()
    {
        Dispose(false);
        Console.WriteLine("In destructor.");
    }
}
Implementing the Close method
For some objects, you’d rather have your clients call the Close() method. (For example, Close makes more sense than Dispose() for file objects.) You can implement this by creating a private Dispose() method and a public Close() method and having your Close() method invoke Dispose().
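A minimal sketch of this arrangement follows. The LogFile class and its members are hypothetical names chosen for illustration. Because a method that implements an interface cannot simply be declared private, the sketch assumes explicit interface implementation as one way to keep Dispose() off the public surface of the class.

```csharp
using System;

// Hypothetical resource wrapper, for illustration only.
class LogFile : IDisposable
{
    bool isClosed = false;

    public bool IsClosed
    {
        get { return isClosed; }
    }

    // Public, domain-friendly name that clients call.
    public void Close()
    {
        ((IDisposable)this).Dispose();
    }

    // Explicit interface implementation: reachable only through an
    // IDisposable reference, so it is not a public member of LogFile.
    void IDisposable.Dispose()
    {
        if (!isClosed)
        {
            // release the underlying resource here
            isClosed = true;
        }
        GC.SuppressFinalize(this);
    }
}

class CloseDemo
{
    static void Main()
    {
        LogFile log = new LogFile();
        log.Close();
        Console.WriteLine(log.IsClosed);
    }
}
```

Clients call log.Close(), while a using statement can still dispose of the object because it works through the IDisposable interface.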

Trap #3: C# distinguishes between value types and reference types
Like C++, C# is a strongly typed language, and like C++, C# divides types into two sets: intrinsic (built-in) types offered by the language, and user-defined types that are defined by the programmer.
In addition to intrinsic types and user-defined types, C# differentiates between value types and reference types. Value types hold their value on the stack, like variables in C++, unless they are embedded within a reference type. Reference-type variables sit on the stack, but they hold the address of an object on the heap, much like pointers in C++. Value types are passed to methods by value (a copy is made). With a reference type it is the reference that is copied, so the method works on the same object as the caller, and the effect is much like passing by reference.
Classes and interfaces create reference types, but note carefully that structs are value types, as are all the intrinsic types (see Trap #5).
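The difference shows up as soon as you pass an instance to a method. The following is a small sketch; PointValue, PointRef, and CopyDemo are hypothetical names invented for the example.

```csharp
using System;

struct PointValue   // hypothetical value type
{
    public int X;
}

class PointRef      // hypothetical reference type
{
    public int X;
}

class CopyDemo
{
    static void Bump(PointValue p) { p.X++; }  // changes a copy
    static void Bump(PointRef p)   { p.X++; }  // changes the caller's object

    public static int BumpValue()
    {
        PointValue v = new PointValue();
        Bump(v);
        return v.X;
    }

    public static int BumpRef()
    {
        PointRef r = new PointRef();
        Bump(r);
        return r.X;
    }

    static void Main()
    {
        Console.WriteLine(BumpValue()); // 0: the struct was copied
        Console.WriteLine(BumpRef());   // 1: the same object was modified
    }
}
```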

Trap #4: Watch out for implicit boxing
Boxing and unboxing are the processes that enable value types (for example, integers) to be treated as reference types (objects). The value is "boxed" inside an object, and subsequently "unboxed" back to a value type. Every type in C#, including the intrinsic types, derives from Object and may be implicitly cast to an object. Boxing a value allocates an instance of Object and copies the value into the new object instance. Boxing is implicit, so when you provide a value type where a reference is expected, the value is implicitly boxed. Boxing brings some performance overhead, so avoid boxing where possible, especially in large collections.
To return the boxed object back to a value type you must explicitly unbox it. The unboxing occurs in two steps: first, the object instance is checked to make sure it is a boxed value of the given value type; then the value is copied from the instance to the value-type variable. In order for the unboxing to succeed, the object being unboxed must be a reference to an object that was created by boxing a value of the value type.
using System;
public class UnboxingTest
{
    public static void Main()
    {
        int i = 123;
        object o = i; // boxing
        int j = (int) o; // unboxing (must be explicit)
        Console.WriteLine("j: {0}", j);
    }
}
If the object being unboxed is a reference to an object of a different type, an InvalidCastException is thrown. If it is null, a NullReferenceException is thrown.
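The type check is strict: a boxed int cannot be unboxed directly to a long, even though an int converts implicitly to a long. A short sketch of the failing and succeeding cases follows; UnboxDemo and CanUnboxToLong are hypothetical names for the example.

```csharp
using System;

class UnboxDemo
{
    // Returns true when unboxing o to long succeeds.
    public static bool CanUnboxToLong(object o)
    {
        try
        {
            long ignored = (long)o; // valid only if o is a boxed long
            return true;
        }
        catch (InvalidCastException)
        {
            return false;
        }
    }

    static void Main()
    {
        object boxedInt = 123;   // boxes an int
        object boxedLong = 123L; // boxes a long
        Console.WriteLine(CanUnboxToLong(boxedInt));  // False
        Console.WriteLine(CanUnboxToLong(boxedLong)); // True

        long viaInt = (int)boxedInt; // unbox to int first, then widen
        Console.WriteLine(viaInt);   // 123
    }
}
```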

Trap #5: Struct is very different in C#
In C++ a struct is nearly identical to a class. In C++, the only difference is that a struct has public access as its default (rather than private) and its inheritance is public by default (again, rather than private). Some C++ programmers use structs as data-only objects, but that is a convention not supported by the language and discouraged by many object-oriented designers.
In C#, a struct is a simple user-defined type, a lightweight alternative that is quite different from a class. While structs do support properties, methods, fields, and operators, structs don’t support inheritance or destructors.
More importantly, while a class is a reference type, a struct is a value type (see Trap #3). Thus, structs are useful for representing objects that do not require reference semantics. Structs are somewhat more efficient in their use of memory in arrays; however, they may be less efficient when used in collections. Collections expect references, so structs must be boxed (see Trap #4). There is overhead in boxing and unboxing, and classes may be more efficient in large collections.

Trap #6: Virtual methods must be explicitly overridden
In C#, the programmer’s decision to override a virtual method must be made explicit with the override keyword.
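As a rough sketch of what the keyword changes, compare a method marked override with one marked new, the companion keyword that hides a base method instead of overriding it. Control, Button, Label, and Describe are hypothetical names invented for the example.

```csharp
using System;

class Control                       // hypothetical base class
{
    public virtual string Describe() { return "Control"; }
}

class Button : Control
{
    // override: replaces the base implementation in virtual dispatch
    public override string Describe() { return "Button"; }
}

class Label : Control
{
    // new: hides the base method and starts an unrelated method
    public new string Describe() { return "Label"; }
}

class OverrideDemo
{
    public static string ViaBase(Control c) { return c.Describe(); }

    static void Main()
    {
        Console.WriteLine(ViaBase(new Button())); // Button
        Console.WriteLine(ViaBase(new Label()));  // Control
    }
}
```

Through a Control reference, only the method marked override takes part in virtual dispatch.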
To see why this is useful, assume that a Window class was written by Company A, and that ListBox and RadioButton classes were written by programmers from Company B, using a purchased copy of the Company A Window class as a base. The programmers in Company B have little or no control over the design of the Window class, including future changes that Company A might choose to make.
Now suppose that one of the programmers for Company B decides to add a Sort method to ListBox:
public class ListBox : Window
{
    public virtual void Sort() {...}
}
This presents no problems until Company A, the author of Window, releases version 2 of its Window class. It turns out that the programmers in Company A also added a Sort method:
public class Window
{
    // ...
    public virtual void Sort() {...}
}
In C++ the new virtual Sort method in Window would now act as a base method for the virtual Sort method in ListBox. The compiler would call the Sort method in ListBox when you intend to call the Sort in Window. In C#, a virtual function is always considered to be the root of a virtual dispatch; that is, once C# finds a virtual method, it looks no further up the inheritance hierarchy. If a new virtual Sort method is introduced into Window, the run-time behavior of ListBox is unchanged. When ListBox is recompiled, the compiler warns that ListBox.Sort() hides the inherited Window.Sort(), and you state your intent by marking the method with either override or new.

Trap #7: You may not initialize member variables in the constructor header
In C++ you might initialize a member variable in the initialization list of a constructor:
Employee::Employee(int theAge, int theSalaryLevel):
    Person(theAge), // initialize base
    salaryLevel(theSalaryLevel) // initialize member variable
{
    // body of constructor
}
This construct is not legal in C#. While you can still initialize the base (with : base(theAge)), the initialization of the member variable as shown here would cause a compile error. You can, however, set the initial value for the member variable in C# when you declare it:
class Employee : Person
{
    // declarations here
    private int salaryLevel = 3; // initialization
}
Note also that you do not add a semicolon after the class declaration, and that access is declared on each member individually rather than in labeled sections (a member with no access modifier is private).
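Putting the pieces together, a C# version of the constructor might look like the sketch below. The Person and Employee names come from the example above; the Age and SalaryLevel properties and the InitDemo class are assumptions added so the result can be observed.

```csharp
using System;

class Person
{
    private int age;
    public Person(int theAge) { age = theAge; }
    public int Age { get { return age; } }
}

class Employee : Person
{
    private int salaryLevel = 3; // initialized at declaration

    // ": base(theAge)" initializes the base class; member variables
    // are assigned in the constructor body.
    public Employee(int theAge, int theSalaryLevel) : base(theAge)
    {
        salaryLevel = theSalaryLevel;
    }

    public int SalaryLevel { get { return salaryLevel; } }
}

class InitDemo
{
    static void Main()
    {
        Employee e = new Employee(40, 5);
        Console.WriteLine("{0} {1}", e.Age, e.SalaryLevel); // 40 5
    }
}
```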

Trap #8: Boolean values do not convert to integers
In C#, Boolean values (true, false) do not equate to integer variables. Thus, you may not write:
if ( someFuncWhichReturnsAValue() )
and count on the idea that if someFuncWhichReturnsAValue returns a zero it will evaluate false, otherwise true. The good news is that the old error of using assignment versus equality is no longer a problem. Thus, if you write:
if ( x = 5 )
you will not inadvertently assign 5 to x; you will get a compile error, since x = 5 evaluates to 5, which is not a Boolean value.
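The fix is to state the comparison explicitly. Here is a brief sketch; CountItems and Check are hypothetical helpers invented for the example.

```csharp
using System;

class BoolDemo
{
    // Hypothetical function that returns an integer.
    static int CountItems() { return 0; }

    public static string Check(int count)
    {
        // "if (count)" would not compile; spell out the comparison
        if (count != 0)
        {
            return "nonzero";
        }
        return "zero";
    }

    static void Main()
    {
        Console.WriteLine(Check(CountItems())); // zero
        Console.WriteLine(Check(5));            // nonzero
    }
}
```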

Trap #9: You may not “fall through” in switch statements
In C#, a switch statement may not “fall through” to the next statement if it does any work. Thus, while the following is legal in C++, it is not legal in C#:
switch (i)
{
    case 4:
        CallFuncOne();
    case 5: // error, no fall through
        CallSomeFunc();
        break;
}

To accomplish this, you need to use an explicit goto statement:
switch (i)
{
    case 4:
        CallFuncOne();
        goto case 5;
    case 5:
        CallSomeFunc();
        break;
}
If the case statement does no work (has no code within it) then you can fall through:
switch (i)
{
    case 4: // fall through
    case 5: // fall through
    case 6:
        CallSomeFunc();
        break;
}
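Both forms can be combined in one switch. The following runnable sketch shows them side by side; SwitchDemo, Describe, and the strings it builds are hypothetical, chosen only to make the control flow visible.

```csharp
using System;

class SwitchDemo
{
    public static string Describe(int i)
    {
        string result = "";
        switch (i)
        {
            case 4:
                result += "four ";
                goto case 5;        // explicit jump replaces fall-through
            case 5:
                result += "five";
                break;
            case 6:                 // empty labels may be stacked
            case 7:
                result += "six or seven";
                break;
            default:
                result += "other";
                break;
        }
        return result;
    }

    static void Main()
    {
        Console.WriteLine(Describe(4)); // four five
        Console.WriteLine(Describe(6)); // six or seven
    }
}
```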

Trap #10: C# requires definite assignment
C# imposes definite assignment, which requires that all variables be assigned a value before they are used. Thus, you can declare a variable without initializing it, but you may not pass it to a method until it has a value.

This raises a problem with values you create simply to pass them to a method by reference, to act as "out" parameters. For example, suppose you have a method that returns the current hour, minute, and second. If you were to write:
int theHour;
int theMinute;
int theSecond;
timeObject.GetTime( ref theHour, ref theMinute, ref theSecond);
you would get a compile error for using theHour, theMinute, and theSecond without initializing them:
Use of unassigned local variable ‘theHour’
Use of unassigned local variable ‘theMinute’
Use of unassigned local variable ‘theSecond’

You can initialize them to zero or some other innocuous value to quiet the pesky compiler:
int theHour = 0;
int theMinute = 0;
int theSecond = 0;
timeObject.GetTime( ref theHour, ref theMinute, ref theSecond);

But that is too silly. The point of these variables is to pass them by reference into GetTime, where they'll be changed. C# provides the out parameter modifier for this situation. The out modifier removes the requirement that a reference parameter be initialized. The parameters to GetTime, for example, provide no information to the method; they are simply a mechanism for getting information out of it. Thus, by marking all three as out parameters, you eliminate the need to initialize them outside the method. Out parameters must be assigned a value before the method they are passed into returns. Here are the altered parameter declarations for GetTime:
public void GetTime(out int h, out int m, out int s)
{
    h = Hour;
    m = Minute;
    s = Second;
}
and here is the new invocation of the GetTime method:
timeObject.GetTime( out theHour, out theMinute, out theSecond);
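For reference, a complete program built around this call might look like the sketch below. The Time class and its fixed field values are assumptions made for the example; only the GetTime signature and the variable names come from the text above.

```csharp
using System;

class Time   // hypothetical class holding a fixed time
{
    int hour = 10, minute = 30, second = 15;

    public void GetTime(out int h, out int m, out int s)
    {
        // every out parameter must be assigned before returning
        h = hour;
        m = minute;
        s = second;
    }
}

class OutDemo
{
    static void Main()
    {
        Time timeObject = new Time();
        int theHour, theMinute, theSecond; // no initialization needed
        timeObject.GetTime(out theHour, out theMinute, out theSecond);
        Console.WriteLine("{0}:{1}:{2}", theHour, theMinute, theSecond); // 10:30:15
    }
}
```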
