C++

The C++ language evolved gradually from C. The evolution was not planned but rather followed the lessons learned through successes and mistakes. Classes, inheritance, operator overloading and function name overloading were added first. Later, multiple inheritance appeared and function name overloading was modified. Eventually, templates and exceptions were included, along with a number of minor additions and modifications. Booleans, name space management and run time type identification are still being added, and standard libraries are still being defined. Because of this, different compilers do not necessarily provide the same set of features, and different libraries do not use the same features and conventions. This section describes the new features available to a programmer migrating from C to C++.

Classes

Classes are similar to structures. They define static data members (class variables) and dynamic data members (instance variables) as well as methods. Static methods receive no instance; they operate only on static data members and may be called directly: class_name::method_name(). Dynamic methods may operate on both static and dynamic data members (the latter being specific to an instance of the class) and are thus called upon an instance: instance.method_name(). Virtual methods are called through the method table. Thus, the method called does not depend on the declared type of the pointer or reference designating the instance but on the actual type of the instance.

Within a method, the instance upon which it was called is received as an implicit argument (not listed in the argument list) and is accessible through the reserved keyword this. Members of the instance, and class variables, are accessible either through their name member_name or through this->member_name. If the same name is used for more than one member, inherited from different parent classes, the ambiguity is resolved by prefixing the class name: class_name::member_name.
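The two access forms, and the class name prefix used to resolve an ambiguity, can be sketched as follows. The classes Printer, Scanner and Copier and all their members are hypothetical names chosen for illustration.

```cpp
// Hypothetical classes: both parents declare a member named id.
class Printer {
  public:
    int id;
    Printer() : id(1) {}
};

class Scanner {
  public:
    int id;
    Scanner() : id(2) {}
};

class Copier : public Printer, public Scanner {
    int pages;
  public:
    Copier() : pages(0) {}

    // pages and this->pages name the same member of the instance
    int count_page() { pages++; return this->pages; }

    // A bare id would be ambiguous here; the class name prefix
    // selects which inherited member is meant.
    int printer_id() { return Printer::id; }
    int scanner_id() { return Scanner::id; }
};
```

Writing return id; inside Copier would be rejected by the compiler, since it cannot tell which parent's member is meant.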

Each data member and method may be put in the private, protected or public portion of the class. The private portion may only be accessed from within the methods of the class. The protected portion is also accessible from the child classes that inherit from the class. The public portion is accessible from anywhere, including code outside any method. However, functions declared as friends of a class may access all portions as if they were methods of the class.



// Declaration of class Shape

class Shape {
   // These members may only be accessed by methods
   private: 
      static int nb;  // this is a global variable for the class
      int color;      // each instance has a color

   // These members may be accessed from child classes
   protected:
      int get_color() { return (color); }

   // These members may be accessed from anywhere
   public:
      // A pure virtual method (= 0) means that it will be implemented
      // in child classes and that Shape is an abstract class.
      // No instances of it may be created.
      virtual int Width() = 0;

      // This method may be called as Shape::get_count()
      static int get_count()
       { return(nb);}

      // A method with the same name as the class is
      // a constructor, called each time a new Shape is created
      Shape() 
       { color = 1; nb++; }

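      // This is the destructor called each time a Shape is deleted.
      // It is virtual so that deleting a child class instance
      // through a Shape pointer calls the right destructor.
      virtual ~Shape() { nb--; }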
 };

int Shape::nb = 0;

// Rectangle is a child class of Shape

class Rectangle : public Shape {
      // By default, these members are private
      int width, height;
   public:
      // This method being virtual, it is accessed through
      // the method table
      virtual int Width() 
       { return(width); }
};

// Circle is another child class of Shape

class Circle : public Shape {
      int radius;
   public:
      virtual int Width()
       { return(2 * radius); }
};
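To see the static method and the method table at work, the sketch below repeats a reduced version of the three classes. Public inheritance, the constructors with arguments and the helper function width_of are assumptions added so that the example is complete; they are not part of the listing above.

```cpp
class Shape {
    static int nb;               // shared by all instances
  public:
    Shape() { nb++; }
    virtual ~Shape() { nb--; }
    virtual int Width() = 0;
    static int get_count() { return nb; }
};

int Shape::nb = 0;

class Rectangle : public Shape {
    int width, height;
  public:
    Rectangle(int w, int h) : width(w), height(h) {}
    virtual int Width() { return width; }
};

class Circle : public Shape {
    int radius;
  public:
    Circle(int r) : radius(r) {}
    virtual int Width() { return 2 * radius; }
};

// The declared type is Shape&, yet the method of the actual
// class of the instance is called.
int width_of(Shape &s) { return s.Width(); }
```

Given Rectangle r(3, 4) and Circle c(5), width_of(r) calls Rectangle::Width and width_of(c) calls Circle::Width, while Shape::get_count() may be called without any instance.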

References

A reference to a variable is a constant pointer which is automatically dereferenced when accessed. This is mainly useful for passing arguments and function results by reference instead of by value. By default, arguments are passed by value, (i.e. copied), and modifications performed on the copy do not affect the original value. In C, arguments to be modified were passed by address such that, even though the address was copied, the original value was accessible through that pointer.



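// C style: the address is passed and dereferenced explicitly
void c_increment(int *pi)
{
  *pi += 1;
}

// C++ style: i is a reference to the caller's variable
void cpp_increment(int &i)
{
  i++;
}

int main()
{
  int i = 0;

  c_increment(&i);

  cpp_increment(i);

  return 0;
}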

Passing arguments by reference is thus simpler than sending the address, when these arguments must be modified by the function. Sometimes, for performance reasons, it is also desirable to pass an argument by reference even if it need not be modified. The function result is returned by reference when it must be modified by the calling function or for performance reasons. It is usually a bad idea to pass both the arguments and the return value by reference. Indeed, having one passed by value, usually the return value, insulates the input from the output and avoids aliasing problems.

In the example below, everything is passed by reference. When the function is called, matrices a and b are multiplied and the result goes into a. Because this is done by reference, the same memory region is used for both the result (a) and one argument (a). Most matrix multiplication algorithms will not work in this context. Passing the result by value is less costly than passing both arguments by value and solves the problem.


matrix &multiply(const matrix &arg1, const matrix &arg2);

matrix a, b;

int main() {
  a = multiply(a,b);
}
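A minimal sketch of the safer arrangement follows: the arguments are still passed by reference, but the result is returned by value. The 2x2 matrix structure is a hypothetical stand-in, since the text does not define the matrix type.

```cpp
// Hypothetical 2x2 matrix type, for illustration only.
struct matrix {
    double m[2][2];
};

// The result is built in a local variable and returned by value,
// so it never shares memory with arg1 or arg2.
matrix multiply(const matrix &arg1, const matrix &arg2)
{
    matrix result;
    for (int i = 0; i < 2; i++)
        for (int j = 0; j < 2; j++) {
            result.m[i][j] = 0.0;
            for (int k = 0; k < 2; k++)
                result.m[i][j] += arg1.m[i][k] * arg2.m[k][j];
        }
    return result;
}
```

With this version, a = multiply(a,b) is safe: a is only overwritten once the product has been completely computed.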

Automatic Type Conversion

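When an expression requires a float value and an integer is supplied, the compiler automatically converts the integer into a float value. In C++, such automatic conversions may also be defined by the programmer. A constructor accepting a single argument is called automatically when a value of its class is required and a value of the argument type is supplied; a conversion operator (a method named operator followed by a type name) does the same in the other direction. A conversion may also be requested explicitly with the function notation type_name(value). The compiler applies at most one programmer-defined conversion to a given value, although built-in conversions may be applied before and after it.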


float f = float(10);

typedef char* charp;

char *p = charp(0xffff);
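Programmer-defined conversions can be sketched with a small class. Celsius, is_freezing and to_fahrenheit are hypothetical names; the class offers one conversion in each direction.

```cpp
// Hypothetical class, for illustration only.
class Celsius {
    double degrees;
  public:
    // Single-argument constructor: converts a double into a Celsius
    Celsius(double d) : degrees(d) {}

    // Conversion operator: converts a Celsius into a double
    operator double() const { return degrees; }
};

// Expects a Celsius argument
bool is_freezing(Celsius c) { return double(c) <= 0.0; }

// Expects a double argument
double to_fahrenheit(double c) { return c * 9.0 / 5.0 + 32.0; }
```

A call such as is_freezing(-5.0) builds a temporary Celsius through the constructor, and to_fahrenheit(boiling), where boiling is a Celsius, goes through the conversion operator. In is_freezing(20), the built-in conversion from int to double is applied first, followed by the constructor.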

Constructors and Destructors

Static variables, including global variables, are allocated before the main program starts and deallocated when the main program ends or upon calling the exit function. Local variables are allocated when declared (typically at the beginning of a function) and deallocated when the corresponding block ends (typically the function end). They are often allocated on the stack. Some local variables are also allocated without explicit declaration to hold temporary values in expressions. Dynamic variables are allocated with the new operator (instead of the malloc function) and deallocated with the delete operator (delete[] for arrays).
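The lifetimes of local and dynamic variables can be observed with a class that counts its instances, using a constructor and a destructor in the same way as class Shape above. Tracked and lifetimes are hypothetical names.

```cpp
// Hypothetical class counting how many instances currently exist.
class Tracked {
  public:
    static int live;
    Tracked()  { live++; }
    ~Tracked() { live--; }
};

int Tracked::live = 0;

int lifetimes()
{
    int seen = 0;
    {
        Tracked local;                // allocated when declared
        seen = Tracked::live;         // one instance exists here
    }                                 // deallocated when the block ends

    Tracked *one  = new Tracked;      // dynamic variable
    Tracked *many = new Tracked[3];   // dynamic array of three instances
    seen += Tracked::live;            // four instances exist here
    delete one;                       // matches new
    delete[] many;                    // matches new[]
    return seen;
}
```

Using delete instead of delete[] on the array is an error that compilers do not necessarily detect.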

Each time a variable is allocated, its constructor method is called. The constructor is a method with the same name as the class. There may be several constructors defined for each class, each accepting different argument types. The arguments are specified when allocating an object through a declaration or the new operator.

For objects created to hold temporary expressions or arguments passed by value, the constructor argument is implicitly an instance of the same class. The constructor accepting a single argument of the same class is thus special and is called the copy constructor. When no copy constructor has been defined, a default one is supplied which copies the data members one by one. When an array is allocated, the constructor without argument is called for each instance in the array. Each time a variable is deallocated, its destructor method is called. Destructors take no arguments.


class table {
    int size;
    int *vector;
...
};

void h()
{
   // The constructor accepting an integer as argument is called
   table t1(100);  

   // The copy constructor is called. The default copy constructor
   // would copy each member including the vector pointer.
   // A copy of the vector, not of its pointer, is probably required.
   table t2 = t1;  // beware
   table t3(200);
   // The assignment operator is used. The default implementation
   // copies each member which is probably not correct here.
   // The vector, not the pointer, should be copied.
   t3 = t2;        // beware
}
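One possible way of making both statements marked beware safe is to define the copy constructor and the assignment operator explicitly. The sketch below is an assumption about how class table could be completed; the methods at and length are hypothetical additions used only to inspect the result.

```cpp
class table {
    int size;
    int *vector;
  public:
    table(int s) : size(s), vector(new int[s])
     { for (int i = 0; i < size; i++) vector[i] = 0; }

    // Copy constructor: allocate a new vector and copy the elements
    table(const table &t) : size(t.size), vector(new int[t.size])
     { for (int i = 0; i < size; i++) vector[i] = t.vector[i]; }

    // Assignment operator: copy first, then release the old vector,
    // so that assigning a table to itself remains safe
    table &operator=(const table &t)
     { int *copy = new int[t.size];
       for (int i = 0; i < t.size; i++) copy[i] = t.vector[i];
       delete[] vector;
       vector = copy;
       size = t.size;
       return *this;
     }

    ~table() { delete[] vector; }

    int &at(int i) { return vector[i]; }
    int length() const { return size; }
};
```

With these definitions, t2 and t3 each own a separate vector, and each destructor releases only its own memory.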

When a class inherits from others, the constructors of the parent classes are called first. Then, if the class contains data members of other classes, their constructors are called. The class constructor itself is called last. The definition of the class constructor must also list which of its arguments are sent to its parent classes and data member classes. The destructors are called in the opposite order: first the class destructor, then the data member destructors and finally the parent class destructors.


class employee {
  const char *name;
  Address home;
public:
  employee(const char *n, Address adr) : name(n), home(adr) {}
};

class manager: public employee {
  Group group;
public:
  manager(const char *n, Address adr, Group grp): 
      employee(n,adr), group(grp) {}
};

manager boss("John Smith","1 Main Street CA",Finance);
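The calling order can be made visible by having each constructor and destructor append a letter to a string. The classes below are reduced, hypothetical versions of employee, manager, Address and Group without arguments; trace and make_manager are names introduced for this sketch.

```cpp
#include <string>

// Records the order in which constructors and destructors run.
std::string trace;

class Address {
  public:
    Address()  { trace += "A"; }
    ~Address() { trace += "a"; }
};

class Group {
  public:
    Group()  { trace += "G"; }
    ~Group() { trace += "g"; }
};

class employee {
    Address home;
  public:
    employee()  { trace += "E"; }
    ~employee() { trace += "e"; }
};

class manager : public employee {
    Group group;
  public:
    manager()  { trace += "M"; }
    ~manager() { trace += "m"; }
};

// boss is constructed, then destroyed when the function returns
void make_manager() { manager boss; }
```

After a call to make_manager(), trace holds AEGMmgea: the parent class (with its own data member first), then the data member, then the class itself, and the reverse order for the destructors.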

Function Name Overload

In C++, many functions may be declared with the same name, as long as the number or types of their arguments differ. When a function call is encountered, the compiler selects the function to call among all the functions previously declared with that name. Only the functions that can accept the arguments supplied are considered, with default arguments providing any missing trailing arguments. Among these, a function whose argument types match exactly is preferred, then one reachable through built-in promotions and conversions, and finally one requiring programmer-defined conversions. If no function qualifies, or if no single function is better than the others, the call is rejected.

It is important to note that this is quite different from virtual methods or generic functions in the CLOS language. Indeed, here the function is selected at compile time based on the declared types of the arguments, and not on their actual types each time the function is called at run time.


#include <stdio.h>

// print a float

void print(float f, FILE *fp = stdout)
 { fprintf(fp,"f = %g\n",f);
 }

// print a string

void print(const char *s, FILE *fp = stdout)
 { fprintf(fp,"s = %s\n",s);
 }

int main()
{
  int val = 15;

  const char* str = "yes";

  // val may be converted to float and the first function
  // may be used
  print(val,stderr);

  // the second function may be used with stdout as default
  // value for fp
  print(str);

  return 0;
}
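The difference between overloaded functions, selected from the declared types, and virtual methods, selected from the actual type, can be sketched as follows. Shape and Circle are reduced, hypothetical versions of the earlier classes, and describe and name are names introduced for this example.

```cpp
#include <string>

class Shape {
  public:
    virtual ~Shape() {}
    virtual std::string name() { return "shape"; }
};

class Circle : public Shape {
  public:
    virtual std::string name() { return "circle"; }
};

// Two overloaded functions, selected at compile time
std::string describe(Shape &)  { return "overload for Shape"; }
std::string describe(Circle &) { return "overload for Circle"; }
```

Given Circle c and Shape &s = c, the call describe(s) selects the Shape version because s is declared as a Shape, whereas s.name() returns "circle" because the instance really is a Circle.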

Operator Overload

In C++, operators such as + - * / = simply call the corresponding function or method operator+, operator-, operator*, operator/ and operator=. These operators are predefined for the built-in types (int, float, char...) and for all the pointer types. A default implementation of the assignment operator is provided for every user defined class but may be redefined. The default behavior is memberwise assignment of all the data members. Operator overloading is convenient for user defined types for which the operators have a natural meaning. For example, using + to add matrices or complex numbers may be more convenient than calling a function named Add. A few operators, however, cannot be redefined, including the "." operator for member access.


class complex {
    double re, im;
  public:
    friend complex operator+(complex, complex);
};
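The declaration above only announces operator+. A possible definition and use are sketched below; the constructor and the accessor methods real and imag are assumptions added to make the example complete.

```cpp
class complex {
    double re, im;
  public:
    complex(double r, double i) : re(r), im(i) {}
    double real() const { return re; }
    double imag() const { return im; }
    friend complex operator+(complex, complex);
};

// As a friend, operator+ may read the private members re and im
complex operator+(complex a, complex b)
{
    return complex(a.re + b.re, a.im + b.im);
}
```

The expression a + b is then simply another way of writing the call operator+(a, b).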

Templates

Templates are classes or functions defined in terms of some parameters. A template is not unlike a macro which may be expanded with different sets of parameters. Whenever a parameterized class is used with specific values for the parameters, the compiler compiles the template in the context of the parameters supplied. The resulting object code may use the name of the template concatenated with the values specified as parameters. When the template is used several times with the same parameters, only one copy of the resulting object code is required.


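template <class C, int size> class Queue {

      C v[size];
      int pos;
   public:
      // start with an empty queue
      Queue() : pos(0) {}

      // returns 1 if elem was stored, 0 if the queue is full
      int push(C elem) 
       { if(pos >= size) return(0);
         v[pos] = elem; pos++; return(1);
       }
};
...
// Shape is abstract, so the queue stores pointers to shapes
Queue<Shape*,100> drawing;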
...
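As a sketch of how one template yields several classes, the Queue template is repeated below with two assumed additions, a constructor that resets pos and a count method, and is then used with two different sets of parameters. The variable names tiny and samples are hypothetical.

```cpp
template <class C, int size> class Queue {
      C v[size];
      int pos;
   public:
      Queue() : pos(0) {}
      int push(C elem)
       { if(pos >= size) return(0);
         v[pos] = elem; pos++; return(1);
       }
      int count() const { return pos; }
};

// Two distinct classes are generated from the same template
Queue<int,2> tiny;
Queue<double,100> samples;
```

Here tiny accepts two elements and refuses the third, while samples is an unrelated class with room for one hundred doubles.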

Exceptions

Exceptions are used to interrupt the current function, when an exceptional condition is encountered, and return to one of the calling functions possibly several levels up in the call tree. This way, error conditions need not be checked at every level. The error types are ordinary classes used to store the associated information and to specify which errors each handler is willing to handle. In particular, the class inheritance is used to decide at which granularity the error types are handled by separate program sections.


class Vector {
    int size;
...
  public:
    class Range {
      public:
        int index;
        Range(int i) : index(i) {}
     };
...
    int& operator[](int i);
...
};

int& Vector::operator[](int i)
{
  if(0 <= i && i < size) return(p[i]);
  throw Range(i);
}

void f(Vector& v)
{
  // Vector v is accessed in one of the nested calls of do_something
  try { do_something(v); }  
  catch (Vector::Range r)
   { cerr << "bad index " << r.index << '\n'; 
     ... 
   }
}
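The role of inheritance in choosing the granularity of error handling can be sketched with a small hierarchy of error classes. MathError, Overflow, Underflow, checked_add and classify are all hypothetical names.

```cpp
// Hypothetical error classes: Overflow and Underflow are kinds of MathError.
class MathError { public: virtual ~MathError() {} };
class Overflow  : public MathError {};
class Underflow : public MathError {};

int checked_add(int a, int b, int limit)
{
    if (a + b > limit)  throw Overflow();
    if (a + b < -limit) throw Underflow();
    return a + b;
}

// Returns 0 on success, 1 for an overflow, 2 for any other MathError.
int classify(int a, int b)
{
    try { checked_add(a, b, 100); }
    catch (Overflow &)  { return 1; }   // handled specifically
    catch (MathError &) { return 2; }   // handles the remaining error types
    return 0;
}
```

The handlers are tried in order, so the more specific class must be listed first; a handler for MathError placed first would also catch every Overflow.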

Linkage with C

In principle, C++ is a superset of C. There are however a few C constructs which are not valid in C++. A number of keywords exist in C++ and not in C; for instance, variables or functions named class, private, public... are not accepted in C++. Also, class and structure names in C++ are type names in their own right, usable without the struct keyword. A structure and a variable may still share a name, as in C, but the variable then hides the type, which must be written struct name; a typedef giving that name to a different type is rejected.

The above mentioned problems arise when compiling a C program with a C++ compiler. Linking modules compiled with a C compiler with C++ compiled modules brings another problem. In C++, because of function name overloading, several functions may have the same name. To differentiate these functions, the compiler generates a mangled name for the function, which includes the argument types, in the object files. This way, the mangled names produced in the C++ compiled modules are unique, which is required by the linker. Fortunately, it is possible to tell the compiler not to use mangled names for some declared functions. This is required for C functions called from C++ or for C++ functions called from C functions.

Whether a C program calls C++ functions or a C++ program calls C functions, the final link must be performed with a C++ aware linker so that the calls to constructors for static variables are set up properly.


extern "C" {
...
#include <stdio.h>
...
}
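The opposite direction, a C++ function callable from C, can be sketched in the same way. The function name c_callable_sum is hypothetical. The test on __cplusplus, a macro defined by C++ compilers only, lets the same declarations be read by a C compiler, which does not know extern "C".

```cpp
// Declarations as they would appear in a header shared by C and C++ code.
#ifdef __cplusplus
extern "C" {
#endif

int c_callable_sum(int a, int b);   /* hypothetical function name */

#ifdef __cplusplus
}
#endif

// Defined in C++ but, because of the declaration above, exported
// under the unmangled name c_callable_sum so that C code can call it.
int c_callable_sum(int a, int b)
{
    return a + b;
}
```

Since the name is not mangled, such a function cannot be overloaded: only one function named c_callable_sum may have C linkage.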

Conclusion

The strengths of C++ are its similarity with C, its efficiency and its popularity. It has however a number of weaknesses, mostly due to its gradual, unplanned evolution from C. The syntax is complex and somewhat irregular and thus difficult to implement. Interfaces are not well separated from implementations, which makes it difficult to minimize recompilation. C++, like C, makes little distinction between arrays and pointers, which makes optimization, vectorization, parallelization, memory checks and garbage collection much more difficult.


Copyright 1995 Michel Dagenais, dagenais@vlsi.polymtl.ca, Wed Mar 8 14:41:03 EST 1995