MindView Inc.

Thinking in C++, 2nd ed. Volume 1

©2000 by Bruce Eckel


12: Operator Overloading

Operator overloading is just “syntactic sugar,” which means it is simply another way for you to make a function call.

The difference is that the arguments for this function don’t appear inside parentheses, but instead they surround or are next to characters you’ve always thought of as immutable operators.

There are two differences between the use of an operator and an ordinary function call. The syntax is different; an operator is often “called” by placing it between or sometimes after the arguments. The second difference is that the compiler determines which “function” to call. For instance, if you are using the operator + with floating-point arguments, the compiler “calls” the function to perform floating-point addition (this “call” is typically the act of inserting in-line code, or a floating-point-processor instruction). If you use operator + with a floating-point number and an integer, the compiler “calls” a special function to turn the int into a float, and then “calls” the floating-point addition code.

But in C++, it’s possible to define new operators that work with classes. This definition is just like an ordinary function definition except that the name of the function consists of the keyword operator followed by the operator. That’s the only difference, and it becomes a function like any other function, which the compiler calls when it sees the appropriate pattern.

Warning & reassurance

It’s tempting to become overenthusiastic with operator overloading. It’s a fun toy, at first. But remember it’s only syntactic sugar, another way of calling a function. Looking at it this way, you have no reason to overload an operator except if it will make the code involving your class easier to write and especially easier to read. (Remember, code is read much more than it is written.) If this isn’t the case, don’t bother.

Another common response to operator overloading is panic; suddenly, C operators have no familiar meaning anymore. “Everything’s changed and all my C code will do different things!” This isn’t true. Operators used in expressions that contain only built-in data types cannot be changed. You can never overload operators such that

1 << 4;

behaves differently, or

1.414 << 2;

has meaning. Only an expression containing a user-defined type can have an overloaded operator.

Syntax

Defining an overloaded operator is like defining a function, but the name of that function is operator@, in which @ represents the operator that’s being overloaded. The number of arguments in the overloaded operator’s argument list depends on two factors:

  1. Whether it’s a unary operator (one argument) or a binary operator (two arguments).
  2. Whether the operator is defined as a global function (one argument for unary, two for binary) or a member function (zero arguments for unary, one for binary – the object becomes the left-hand argument).

Here’s a small class that shows the syntax for operator overloading:

//: C12:OperatorOverloadingSyntax.cpp
#include <iostream>
using namespace std;

class Integer {
  int i;
public:
  Integer(int ii) : i(ii) {}
  const Integer
  operator+(const Integer& rv) const {
    cout << "operator+" << endl;
    return Integer(i + rv.i);
  }
  Integer&
  operator+=(const Integer& rv) {
    cout << "operator+=" << endl;
    i += rv.i;
    return *this;
  }
};

int main() {
  cout << "built-in types:" << endl;
  int i = 1, j = 2, k = 3;
  k += i + j;
  cout << "user-defined types:" << endl;
  Integer ii(1), jj(2), kk(3);
  kk += ii + jj;
} ///:~

The two overloaded operators are defined as inline member functions that announce when they are called. The single argument is what appears on the right-hand side of the operator for binary operators. Unary operators have no arguments when defined as member functions. The member function is called for the object on the left-hand side of the operator.

For non-conditional operators (conditionals usually return a Boolean value), you’ll almost always want to return an object or reference of the same type you’re operating on if the two arguments are the same type. (If they’re not the same type, the interpretation of what it should produce is up to you.) This way, complicated expressions can be built up:

kk += ii + jj;

The operator+ produces a new Integer (a temporary) that is used as the rv argument for the operator+=. This temporary is destroyed as soon as it is no longer needed.

Overloadable operators

Although you can overload almost all the operators available in C, the use of operator overloading is fairly restrictive. In particular, you cannot invent new operators by combining symbols that currently have no meaning in C (such as ** to represent exponentiation), you cannot change the evaluation precedence of operators, and you cannot change the number of arguments required by an operator. This makes sense: all of these actions would produce operators that confuse meaning rather than clarify it.

The next two subsections give examples of all the “regular” operators, overloaded in the form that you’ll most likely use.

Unary operators

The following example shows the syntax to overload all the unary operators, in the form of both global functions (non-member friend functions) and as member functions. These will expand upon the Integer class shown previously and add a new Byte class. The meaning of your particular operators will depend on the way you want to use them, but consider the client programmer before doing something unexpected.

Here is a catalog of all the unary functions:

//: C12:OverloadingUnaryOperators.cpp
#include <iostream>
using namespace std;

// Non-member functions:
class Integer {
  long i;
  Integer* This() { return this; }
public:
  Integer(long ll = 0) : i(ll) {}
  // No side effects takes const& argument:
  friend const Integer&
    operator+(const Integer& a);
  friend const Integer
    operator-(const Integer& a);
  friend const Integer
    operator~(const Integer& a);
  friend Integer*
    operator&(Integer& a);
  friend int
    operator!(const Integer& a);
  // Side effects have non-const& argument:
  // Prefix:
  friend const Integer&
    operator++(Integer& a);
  // Postfix:
  friend const Integer
    operator++(Integer& a, int);
  // Prefix:
  friend const Integer&
    operator--(Integer& a);
  // Postfix:
  friend const Integer
    operator--(Integer& a, int);
};

// Global operators:
const Integer& operator+(const Integer& a) {
  cout << "+Integer\n";
  return a; // Unary + has no effect
}
const Integer operator-(const Integer& a) {
  cout << "-Integer\n";
  return Integer(-a.i);
}
const Integer operator~(const Integer& a) {
  cout << "~Integer\n";
  return Integer(~a.i);
}
Integer* operator&(Integer& a) {
  cout << "&Integer\n";
  return a.This(); // &a is recursive!
}
int operator!(const Integer& a) {
  cout << "!Integer\n";
  return !a.i;
}
// Prefix; return incremented value
const Integer& operator++(Integer& a) {
  cout << "++Integer\n";
  a.i++;
  return a;
}
// Postfix; return the value before increment:
const Integer operator++(Integer& a, int) {
  cout << "Integer++\n";
  Integer before(a.i);
  a.i++;
  return before;
}
// Prefix; return decremented value
const Integer& operator--(Integer& a) {
  cout << "--Integer\n";
  a.i--;
  return a;
}
// Postfix; return the value before decrement:
const Integer operator--(Integer& a, int) {
  cout << "Integer--\n";
  Integer before(a.i);
  a.i--;
  return before;
}

// Show that the overloaded operators work:
void f(Integer a) {
  +a;
  -a;
  ~a;
  Integer* ip = &a;
  !a;
  ++a;
  a++;
  --a;
  a--;
}

// Member functions (implicit "this"):
class Byte {
  unsigned char b;
public:
  Byte(unsigned char bb = 0) : b(bb) {}
  // No side effects: const member function:
  const Byte& operator+() const {
    cout << "+Byte\n";
    return *this;
  }
  const Byte operator-() const {
    cout << "-Byte\n";
    return Byte(-b);
  }
  const Byte operator~() const {
    cout << "~Byte\n";
    return Byte(~b);
  }
  Byte operator!() const {
    cout << "!Byte\n";
    return Byte(!b);
  }
  Byte* operator&() {
    cout << "&Byte\n";
    return this;
  }
  // Side effects: non-const member function:
  const Byte& operator++() { // Prefix
    cout << "++Byte\n";
    b++;
    return *this;
  }
  const Byte operator++(int) { // Postfix
    cout << "Byte++\n";
    Byte before(b);
    b++;
    return before;
  }
  const Byte& operator--() { // Prefix
    cout << "--Byte\n";
    --b;
    return *this;
  }
  const Byte operator--(int) { // Postfix
    cout << "Byte--\n";
    Byte before(b);
    --b;
    return before;
  }
};

void g(Byte b) {
  +b;
  -b;
  ~b;
  Byte* bp = &b;
  !b;
  ++b;
  b++;
  --b;
  b--;
}

int main() {
  Integer a;
  f(a);
  Byte b;
  g(b);
} ///:~

The functions are grouped according to the way their arguments are passed. Guidelines for how to pass and return arguments are given later. The forms above (and the ones that follow in the next section) are typically what you’ll use, so start with them as a pattern when overloading your own operators.

Increment & decrement

The overloaded ++ and -- operators present a dilemma because you want to be able to call different functions depending on whether they appear before (prefix) or after (postfix) the object they’re acting upon. The solution is simple, but people sometimes find it a bit confusing at first. When the compiler sees, for example, ++a (a pre-increment), it generates a call to operator++(a); but when it sees a++, it generates a call to operator++(a, int). That is, the compiler differentiates between the two forms by making calls to different overloaded functions. In OverloadingUnaryOperators.cpp for the member function versions, if the compiler sees ++b, it generates a call to Byte::operator++( ); if it sees b++ it calls Byte::operator++(int).

All the user sees is that a different function gets called for the prefix and postfix versions. Underneath, however, the two function calls have different signatures, so they link to two different function bodies. The compiler passes a dummy constant value for the int argument (which is never given an identifier because the value is never used) to generate the different signature for the postfix version.

Binary operators

The following listing repeats the example of OverloadingUnaryOperators.cpp for binary operators so you have an example of all the operators you might want to overload. Again, both global versions and member function versions are shown.

//: C12:Integer.h
// Non-member overloaded operators
#ifndef INTEGER_H
#define INTEGER_H
#include <iostream>

// Non-member functions:
class Integer { 
  long i;
public:
  Integer(long ll = 0) : i(ll) {}
  // Operators that create new, modified value:
  friend const Integer
    operator+(const Integer& left,
              const Integer& right);
  friend const Integer
    operator-(const Integer& left,
              const Integer& right);
  friend const Integer
    operator*(const Integer& left,
              const Integer& right);
  friend const Integer
    operator/(const Integer& left,
              const Integer& right);
  friend const Integer
    operator%(const Integer& left,
              const Integer& right);
  friend const Integer
    operator^(const Integer& left,
              const Integer& right);
  friend const Integer
    operator&(const Integer& left,
              const Integer& right);
  friend const Integer
    operator|(const Integer& left,
              const Integer& right);
  friend const Integer
    operator<<(const Integer& left,
               const Integer& right);
  friend const Integer
    operator>>(const Integer& left,
               const Integer& right);
  // Assignments modify & return lvalue:
  friend Integer&
    operator+=(Integer& left,
               const Integer& right);
  friend Integer&
    operator-=(Integer& left,
               const Integer& right);
  friend Integer&
    operator*=(Integer& left,
               const Integer& right);
  friend Integer&
    operator/=(Integer& left,
               const Integer& right);
  friend Integer&
    operator%=(Integer& left,
               const Integer& right);
  friend Integer&
    operator^=(Integer& left,
               const Integer& right);
  friend Integer&
    operator&=(Integer& left,
               const Integer& right);
  friend Integer&
    operator|=(Integer& left,
               const Integer& right);
  friend Integer&
    operator>>=(Integer& left,
                const Integer& right);
  friend Integer&
    operator<<=(Integer& left,
                const Integer& right);
  // Conditional operators return true/false:
  friend int
    operator==(const Integer& left,
               const Integer& right);
  friend int
    operator!=(const Integer& left,
               const Integer& right);
  friend int
    operator<(const Integer& left,
              const Integer& right);
  friend int
    operator>(const Integer& left,
              const Integer& right);
  friend int
    operator<=(const Integer& left,
               const Integer& right);
  friend int
    operator>=(const Integer& left,
               const Integer& right);
  friend int
    operator&&(const Integer& left,
               const Integer& right);
  friend int
    operator||(const Integer& left,
               const Integer& right);
  // Write the contents to an ostream:
  void print(std::ostream& os) const { os << i; }
}; 
#endif // INTEGER_H ///:~

//: C12:Integer.cpp {O}
// Implementation of overloaded operators
#include "Integer.h"
#include "../require.h"

const Integer
  operator+(const Integer& left,
            const Integer& right) {
  return Integer(left.i + right.i);
}
const Integer
  operator-(const Integer& left,
            const Integer& right) {
  return Integer(left.i - right.i);
}
const Integer
  operator*(const Integer& left,
            const Integer& right) {
  return Integer(left.i * right.i);
}
const Integer
  operator/(const Integer& left,
            const Integer& right) {
  require(right.i != 0, "divide by zero");
  return Integer(left.i / right.i);
}
const Integer
  operator%(const Integer& left,
            const Integer& right) {
  require(right.i != 0, "modulo by zero");
  return Integer(left.i % right.i);
}
const Integer
  operator^(const Integer& left,
            const Integer& right) {
  return Integer(left.i ^ right.i);
}
const Integer
  operator&(const Integer& left,
            const Integer& right) {
  return Integer(left.i & right.i);
}
const Integer
  operator|(const Integer& left,
            const Integer& right) {
  return Integer(left.i | right.i);
}
const Integer
  operator<<(const Integer& left,
             const Integer& right) {
  return Integer(left.i << right.i);
}
const Integer
  operator>>(const Integer& left,
             const Integer& right) {
  return Integer(left.i >> right.i);
}
// Assignments modify & return lvalue:
Integer& operator+=(Integer& left,
                    const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i += right.i;
   return left;
}
Integer& operator-=(Integer& left,
                    const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i -= right.i;
   return left;
}
Integer& operator*=(Integer& left,
                    const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i *= right.i;
   return left;
}
Integer& operator/=(Integer& left,
                    const Integer& right) {
   require(right.i != 0, "divide by zero");
   if(&left == &right) {/* self-assignment */}
   left.i /= right.i;
   return left;
}
Integer& operator%=(Integer& left,
                    const Integer& right) {
   require(right.i != 0, "modulo by zero");
   if(&left == &right) {/* self-assignment */}
   left.i %= right.i;
   return left;
}
Integer& operator^=(Integer& left,
                    const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i ^= right.i;
   return left;
}
Integer& operator&=(Integer& left,
                    const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i &= right.i;
   return left;
}
Integer& operator|=(Integer& left,
                    const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i |= right.i;
   return left;
}
Integer& operator>>=(Integer& left,
                     const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i >>= right.i;
   return left;
}
Integer& operator<<=(Integer& left,
                     const Integer& right) {
   if(&left == &right) {/* self-assignment */}
   left.i <<= right.i;
   return left;
}
// Conditional operators return true/false:
int operator==(const Integer& left,
               const Integer& right) {
    return left.i == right.i;
}
int operator!=(const Integer& left,
               const Integer& right) {
    return left.i != right.i;
}
int operator<(const Integer& left,
              const Integer& right) {
    return left.i < right.i;
}
int operator>(const Integer& left,
              const Integer& right) {
    return left.i > right.i;
}
int operator<=(const Integer& left,
               const Integer& right) {
    return left.i <= right.i;
}
int operator>=(const Integer& left,
               const Integer& right) {
    return left.i >= right.i;
}
int operator&&(const Integer& left,
               const Integer& right) {
    return left.i && right.i;
}
int operator||(const Integer& left,
               const Integer& right) {
    return left.i || right.i;
} ///:~

//: C12:IntegerTest.cpp
//{L} Integer
#include "Integer.h"
#include <fstream>
using namespace std;
ofstream out("IntegerTest.out");

void h(Integer& c1, Integer& c2) {
  // A complex expression:
  c1 += c1 * c2 + c2 % c1;
  #define TRY(OP) \
    out << "c1 = "; c1.print(out); \
    out << ", c2 = "; c2.print(out); \
    out << ";  c1 " #OP " c2 produces "; \
    (c1 OP c2).print(out); \
    out << endl;
  TRY(+) TRY(-) TRY(*) TRY(/)
  TRY(%) TRY(^) TRY(&) TRY(|)
  TRY(<<) TRY(>>) TRY(+=) TRY(-=)
  TRY(*=) TRY(/=) TRY(%=) TRY(^=)
  TRY(&=) TRY(|=) TRY(>>=) TRY(<<=)
  // Conditionals:
  #define TRYC(OP) \
    out << "c1 = "; c1.print(out); \
    out << ", c2 = "; c2.print(out); \
    out << ";  c1 " #OP " c2 produces "; \
    out << (c1 OP c2); \
    out << endl;
  TRYC(<) TRYC(>) TRYC(==) TRYC(!=) TRYC(<=)
  TRYC(>=) TRYC(&&) TRYC(||)
} 

int main() {
  cout << "friend functions" << endl;
  Integer c1(47), c2(9);
  h(c1, c2);
} ///:~

//: C12:Byte.h
// Member overloaded operators
#ifndef BYTE_H
#define BYTE_H
#include "../require.h"
#include <iostream>
// Member functions (implicit "this"):
class Byte { 
  unsigned char b;
public:
  Byte(unsigned char bb = 0) : b(bb) {}
  // No side effects: const member function:
  const Byte
    operator+(const Byte& right) const {
    return Byte(b + right.b);
  }
  const Byte
    operator-(const Byte& right) const {
    return Byte(b - right.b);
  }
  const Byte
    operator*(const Byte& right) const {
    return Byte(b * right.b);
  }
  const Byte
    operator/(const Byte& right) const {
    require(right.b != 0, "divide by zero");
    return Byte(b / right.b);
  }
  const Byte
    operator%(const Byte& right) const {
    require(right.b != 0, "modulo by zero");
    return Byte(b % right.b);
  }
  const Byte
    operator^(const Byte& right) const {
    return Byte(b ^ right.b);
  }
  const Byte
    operator&(const Byte& right) const {
    return Byte(b & right.b);
  }
  const Byte
    operator|(const Byte& right) const {
    return Byte(b | right.b);
  }
  const Byte
    operator<<(const Byte& right) const {
    return Byte(b << right.b);
  }
  const Byte
    operator>>(const Byte& right) const {
    return Byte(b >> right.b);
  }
  // Assignments modify & return lvalue.
  // operator= can only be a member function:
  Byte& operator=(const Byte& right) {
    // Handle self-assignment:
    if(this == &right) return *this;
    b = right.b;
    return *this;
  }
  Byte& operator+=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b += right.b;
    return *this;
  }
  Byte& operator-=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b -= right.b;
    return *this;
  }
  Byte& operator*=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b *= right.b;
    return *this;
  }
  Byte& operator/=(const Byte& right) {
    require(right.b != 0, "divide by zero");
    if(this == &right) {/* self-assignment */}
    b /= right.b;
    return *this;
  }
  Byte& operator%=(const Byte& right) {
    require(right.b != 0, "modulo by zero");
    if(this == &right) {/* self-assignment */}
    b %= right.b;
    return *this;
  }
  Byte& operator^=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b ^= right.b;
    return *this;
  }
  Byte& operator&=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b &= right.b;
    return *this;
  }
  Byte& operator|=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b |= right.b;
    return *this;
  }
  Byte& operator>>=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b >>= right.b;
    return *this;
  }
  Byte& operator<<=(const Byte& right) {
    if(this == &right) {/* self-assignment */}
    b <<= right.b;
    return *this;
  }
  // Conditional operators return true/false:
  int operator==(const Byte& right) const {
      return b == right.b;
  }
  int operator!=(const Byte& right) const {
      return b != right.b;
  }
  int operator<(const Byte& right) const {
      return b < right.b;
  }
  int operator>(const Byte& right) const {
      return b > right.b;
  }
  int operator<=(const Byte& right) const {
      return b <= right.b;
  }
  int operator>=(const Byte& right) const {
      return b >= right.b;
  }
  int operator&&(const Byte& right) const {
      return b && right.b;
  }
  int operator||(const Byte& right) const {
      return b || right.b;
  }
  // Write the contents to an ostream:
  void print(std::ostream& os) const {
    os << "0x" << std::hex << int(b) << std::dec;
  }
}; 
#endif // BYTE_H ///:~

//: C12:ByteTest.cpp
#include "Byte.h"
#include <fstream>
using namespace std;
ofstream out("ByteTest.out");

void k(Byte& b1, Byte& b2) {
  b1 = b1 * b2 + b2 % b1;

  #define TRY2(OP) \
    out << "b1 = "; b1.print(out); \
    out << ", b2 = "; b2.print(out); \
    out << ";  b1 " #OP " b2 produces "; \
    (b1 OP b2).print(out); \
    out << endl;

  b1 = 9; b2 = 47;
  TRY2(+) TRY2(-) TRY2(*) TRY2(/)
  TRY2(%) TRY2(^) TRY2(&) TRY2(|)
  TRY2(<<) TRY2(>>) TRY2(+=) TRY2(-=)
  TRY2(*=) TRY2(/=) TRY2(%=) TRY2(^=)
  TRY2(&=) TRY2(|=) TRY2(>>=) TRY2(<<=)
  TRY2(=) // Assignment operator

  // Conditionals:
  #define TRYC2(OP) \
    out << "b1 = "; b1.print(out); \
    out << ", b2 = "; b2.print(out); \
    out << ";  b1 " #OP " b2 produces "; \
    out << (b1 OP b2); \
    out << endl;

  b1 = 9; b2 = 47;
  TRYC2(<) TRYC2(>) TRYC2(==) TRYC2(!=) TRYC2(<=)
  TRYC2(>=) TRYC2(&&) TRYC2(||)

  // Chained assignment:
  Byte b3 = 92;
  b1 = b2 = b3;
}

int main() {
  out << "member functions:" << endl;
  Byte b1(47), b2(9);
  k(b1, b2);
} ///:~

You can see that operator= is only allowed to be a member function. This is explained later.

Notice that all of the assignment operators have code to check for self-assignment; this is a general guideline. In some cases this is not necessary; for example, with operator+= you often want to say A+=A and have it add A to itself. The most important place to check for self-assignment is operator= because with complicated objects disastrous results may occur. (In some cases it’s OK, but you should always keep it in mind when writing operator=.)

All of the operators shown in the previous two examples are overloaded to handle a single type. It’s also possible to overload operators to handle mixed types, so you can add apples to oranges, for example. Before you start on an exhaustive overloading of operators, however, you should look at the section on automatic type conversion later in this chapter. Often, a type conversion in the right place can save you a lot of overloaded operators.

Arguments & return values

It may seem a little confusing at first when you look at OverloadingUnaryOperators.cpp, Integer.h and Byte.h and see all the different ways that arguments are passed and returned. Although you can pass and return arguments any way you want to, the choices in these examples were not selected at random. They follow a logical pattern, the same one you’ll want to use in most of your choices.

  1. As with any function argument, if you only need to read from the argument and not change it, default to passing it as a const reference. Ordinary arithmetic operations (like + and -, etc.) and Booleans will not change their arguments, so pass by const reference is predominantly what you’ll use. When the function is a class member, this translates to making it a const member function. Only with the operator-assignments (like +=) and the operator=, which change the left-hand argument, is the left argument not a constant, but it’s still passed in as an address because it will be changed.
  2. The type of return value you should select depends on the expected meaning of the operator. (Again, you can do anything you want with the arguments and return values.) If the effect of the operator is to produce a new value, you will need to generate a new object as the return value. For example, Integer::operator+ must produce an Integer object that is the sum of the operands. This object is returned by value as a const, so the result cannot be modified as an lvalue.
  3. All the assignment operators modify the lvalue. To allow the result of the assignment to be used in chained expressions, like a=b=c, it’s expected that you will return a reference to that same lvalue that was just modified. But should this reference be a const or nonconst? Although you read a=b=c from left to right, the compiler parses it from right to left, so you’re not forced to return a nonconst to support assignment chaining. However, people do sometimes expect to be able to perform an operation on the thing that was just assigned to, such as (a=b).func( ); to call func( ) on a after assigning b to it. Thus, the return value for all of the assignment operators should be a nonconst reference to the lvalue.
  4. For the logical operators, everyone expects to get at worst an int back, and at best a bool. (Libraries developed before most compilers supported C++’s built-in bool will use int or an equivalent typedef.)

The increment and decrement operators present a dilemma because of the pre- and postfix versions. Both versions change the object and so cannot treat the object as a const. The prefix version returns the value of the object after it was changed, so you expect to get back the object that was changed. Thus, with prefix you can just return *this as a reference. The postfix version is supposed to return the value before the value is changed, so you’re forced to create a separate object to represent that value and return it. So with postfix you must return by value if you want to preserve the expected meaning. (Note that you’ll sometimes find the increment and decrement operators returning an int or bool to indicate, for example, whether an object designed to move through a list is at the end of that list.)

Now the question is: Should these be returned as const or nonconst? If you allow the object to be modified and someone writes (++a).func( ), func( ) will be operating on a itself, but with (a++).func( ), func( ) operates on the temporary object returned by the postfix operator++. Any change func( ) makes to that temporary is lost, so returning the postfix result as a const lets the compiler flag a call to a non-const func( ). For consistency’s sake it may make more sense to make them both const, as was done here. Or you may choose to make the prefix version non-const and the postfix const. Because of the variety of meanings you may want to give the increment and decrement operators, they will need to be considered on a case-by-case basis.

Return by value as const

Returning by value as a const can seem a bit subtle at first, so it deserves a bit more explanation. Consider the binary operator+. If you use it in an expression such as f(a+b), the result of a+b becomes a temporary object that is used in the call to f( ). Because it’s a temporary, it cannot be bound to an ordinary non-const reference, so f( ) must take it by value or by const reference; in that use, whether you explicitly make the return value const or not has no effect.

However, it’s also possible for you to send a message to the return value of a+b, rather than just passing it to a function. For example, you can say (a+b).g( ), in which g( ) is some member function of Integer, in this case. By making the return value const, you state that only a const member function can be called for that return value. This is const-correct, because it prevents you from storing potentially valuable information in an object that will most likely be lost.

The return optimization

When new objects are created to return by value, notice the form used. In operator+, for example:

return Integer(left.i + right.i);

This may look at first like a “function call to a constructor,” but it’s not. The syntax is that of a temporary object; the statement says “make a temporary Integer object and return it.” Because of this, you might think that the result is the same as creating a named local object and returning that. However, it’s quite different. If you were to say instead:

Integer tmp(left.i + right.i);
return tmp;

three things will happen. First, the tmp object is created including its constructor call. Second, the copy-constructor copies the tmp to the location of the outside return value. Third, the destructor is called for tmp at the end of the scope.

In contrast, the “returning a temporary” approach works quite differently. When the compiler sees you do this, it knows that you have no other need for the object it’s creating than to return it. The compiler takes advantage of this by building the object directly into the location of the outside return value. This requires only a single ordinary constructor call (no copy-constructor is necessary) and there’s no destructor call because you never actually create a local object. Thus, while it doesn’t cost anything but programmer awareness, it’s significantly more efficient. This is often called the return value optimization.

Unusual operators

Several additional operators have a slightly different syntax for overloading.

The subscript, operator[ ], must be a member function and it requires a single argument. Because operator[ ] implies that the object it’s being called for acts like an array, you will often return a reference from this operator, so it can be conveniently used on the left-hand side of an equal sign. This operator is commonly overloaded; you’ll see examples in the rest of the book.

The operators new and delete control dynamic storage allocation and can be overloaded in a number of different ways. This topic is covered in Chapter 13.

Operator comma

The comma operator is called when it appears next to an object of the type the comma is defined for. However, operator, is not called for function argument lists, only for objects that are out in the open, separated by commas. There don’t seem to be many practical uses for this operator; it’s in the language for consistency. Here’s an example showing how the comma function can be called when the comma appears before an object, as well as after:

//: C12:OverloadingOperatorComma.cpp
#include <iostream>
using namespace std;

class After {
public:
  const After& operator,(const After&) const {
    cout << "After::operator,()" << endl;
    return *this;
  }
};

class Before {};

Before& operator,(int, Before& b) {
  cout << "Before::operator,()" << endl;
  return b;
}

int main() {
  After a, b;
  a, b;  // Operator comma called

  Before c;
  1, c;  // Operator comma called
} ///:~

The global function allows the comma to be placed before the object in question. The usage shown is fairly obscure and questionable. Although you would probably use a comma-separated list as part of a more complex expression, it’s too subtle to use in most situations.

Operator->

The operator-> is generally used when you want to make an object appear to be a pointer. Since such an object has more “smarts” built into it than exist for a typical pointer, an object like this is often called a smart pointer. These are especially useful if you want to “wrap” a class around a pointer to make that pointer safe, or in the common usage of an iterator, which is an object that moves through a collection/container of other objects and selects them one at a time, without providing direct access to the implementation of the container. (You’ll often find containers and iterators in class libraries, such as in the Standard C++ Library, described in Volume 2 of this book.)

A pointer dereference operator must be a member function. It has additional, atypical constraints: It must return an object (or reference to an object) that also has a pointer dereference operator, or it must return a pointer that can be used to select the member named on the right of the arrow. Here’s a simple example:

//: C12:SmartPointer.cpp
#include <iostream>
#include <vector>
#include "../require.h"
using namespace std;

class Obj {
  static int i, j;
public:
  void f() const { cout << i++ << endl; }
  void g() const { cout << j++ << endl; }
};

// Static member definitions:
int Obj::i = 47;
int Obj::j = 11;

// Container:
class ObjContainer {
  vector<Obj*> a;
public:
  void add(Obj* obj) { a.push_back(obj); }
  friend class SmartPointer;
};

class SmartPointer {
  ObjContainer& oc;
  int index;
public:
  SmartPointer(ObjContainer& objc) : oc(objc) {
    index = 0;
  }
  // Return value indicates end of list:
  bool operator++() { // Prefix
    // Stop before stepping past the last element:
    if(index + 1 >= oc.a.size()) return false;
    if(oc.a[++index] == 0) return false;
    return true;
  }
  bool operator++(int) { // Postfix
    return operator++(); // Use prefix version
  }
  Obj* operator->() const {
    require(oc.a[index] != 0, "Zero value "
      "returned by SmartPointer::operator->()");
    return oc.a[index];
  }
};

int main() {
  const int sz = 10;
  Obj o[sz];
  ObjContainer oc;
  for(int i = 0; i < sz; i++)
    oc.add(&o[i]); // Fill it up
  SmartPointer sp(oc); // Create an iterator
  do {
    sp->f(); // Pointer dereference operator call
    sp->g();
  } while(sp++);
} ///:~

The class Obj defines the objects that are manipulated in this program. The functions f( ) and g( ) simply print out interesting values using static data members. Pointers to these objects are stored inside containers of type ObjContainer using its add( ) function. ObjContainer looks like an array of pointers, but you’ll notice there’s no way to get the pointers back out again. However, SmartPointer is declared as a friend class, so it has permission to look inside the container. The SmartPointer class looks very much like an intelligent pointer: you can move it forward using operator++ (you can also define an operator--), it won’t go past the end of the container it’s pointing to, and it produces (via the pointer dereference operator) the value it’s pointing to. Notice that the SmartPointer is a custom fit for the container it’s created for; unlike an ordinary pointer, there isn’t a “general purpose” smart pointer. You will learn more about the smart pointers called “iterators” in the last chapter of this book and in Volume 2 (downloadable from www.BruceEckel.com).

In main( ), once the container oc is filled with Obj objects, a SmartPointer sp is created. The smart pointer calls happen in the expressions:

sp->f(); // Smart pointer calls
sp->g(); 

Here, even though sp doesn’t actually have f( ) and g( ) member functions, the pointer dereference operator automatically calls those functions for the Obj* that is returned by SmartPointer::operator->. The compiler performs all the checking to make sure the function call works properly.

Although the underlying mechanics of the pointer dereference operator are more complex than the other operators, the goal is exactly the same: to provide a more convenient syntax for the users of your classes.
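
The example above only shows an operator-> that returns a pointer. The other permitted form, returning an object that has its own operator->, can be sketched as follows. Account, Inner, and Outer are hypothetical names invented for this illustration:

```cpp
#include <string>
using namespace std;

struct Account { // Hypothetical target type
  string owner;
  string name() const { return owner; }
};

// Inner wrapper: operator-> returns a real pointer
class Inner {
  Account* p;
public:
  Inner(Account* ap) : p(ap) {}
  Account* operator->() const { return p; }
};

// Outer wrapper: operator-> returns an object that
// itself has an operator->
class Outer {
  Inner in;
public:
  Outer(Account* ap) : in(ap) {}
  Inner operator->() const { return in; }
};

string ownerThroughOuter() {
  Account acct;
  acct.owner = "Fido";
  Outer o(&acct);
  // The compiler applies Outer::operator->, then
  // Inner::operator->, then selects name():
  return o->name();
}
```

The compiler keeps applying operator-> until it reaches an ordinary pointer, so a single arrow in the client code is enough.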

A nested iterator

It’s more common to see a “smart pointer” or “iterator” class nested within the class that it services. The previous example can be rewritten to nest SmartPointer inside ObjContainer like this:

//: C12:NestedSmartPointer.cpp
#include <iostream>
#include <vector>
#include "../require.h"
using namespace std;

class Obj {
  static int i, j;
public:
  void f() { cout << i++ << endl; }
  void g() { cout << j++ << endl; }
};

// Static member definitions:
int Obj::i = 47;
int Obj::j = 11;

// Container:
class ObjContainer {
  vector<Obj*> a;
public:
  void add(Obj* obj) { a.push_back(obj); }
  class SmartPointer;
  friend class SmartPointer;
  class SmartPointer {
    ObjContainer& oc;
    unsigned int index;
  public:
    SmartPointer(ObjContainer& objc) : oc(objc) {
      index = 0;
    }
    // Return value indicates end of list:
    bool operator++() { // Prefix
      // Stop before stepping past the last element:
      if(index + 1 >= oc.a.size()) return false;
      if(oc.a[++index] == 0) return false;
      return true;
    }
    bool operator++(int) { // Postfix
      return operator++(); // Use prefix version
    }
    Obj* operator->() const {
      require(oc.a[index] != 0, "Zero value "
        "returned by SmartPointer::operator->()");
      return oc.a[index];
    }
  };
  // Function to produce a smart pointer that 
  // points to the beginning of the ObjContainer:
  SmartPointer begin() { 
    return SmartPointer(*this);
  }
};

int main() {
  const int sz = 10;
  Obj o[sz];
  ObjContainer oc;
  for(int i = 0; i < sz; i++)
    oc.add(&o[i]); // Fill it up
  ObjContainer::SmartPointer sp = oc.begin();
  do {
    sp->f(); // Pointer dereference operator call
    sp->g();
  } while(++sp);
} ///:~

Besides the actual nesting of the class, there are only two differences here. The first is in the declaration of the class so that it can be a friend:

class SmartPointer;
friend class SmartPointer;

The compiler must first know that the class exists before it can be told that it’s a friend.

The second difference is in the ObjContainer member function begin( ), which produces a SmartPointer that points to the beginning of the ObjContainer sequence. Although it’s really only a convenience, it’s valuable because it follows part of the form used in the Standard C++ Library.

Operator->*

The operator->* is a binary operator that behaves like all the other binary operators. It is provided for those situations when you want to mimic the behavior provided by the built-in pointer-to-member syntax, described in the previous chapter.

Just like operator->, the pointer-to-member dereference operator is generally used with some kind of object that represents a “smart pointer,” although the example shown here will be simpler so it’s understandable. The trick when defining operator->* is that it must return an object for which the operator( ) can be called with the arguments for the member function you’re calling.

The function call operator( ) must be a member function, and it is unique in that it allows any number of arguments. It makes your object look like it’s actually a function. Although you could define several overloaded operator( ) functions with different arguments, it’s often used for types that only have a single operation, or at least an especially prominent one. You’ll see in Volume 2 that the Standard C++ Library uses the function call operator in order to create “function objects.”
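
Before putting operator( ) to work inside operator->*, it may help to see it on its own. This is a minimal sketch; the class Linear and its two overloads are hypothetical and exist only to show that operator( ) can take any number of arguments:

```cpp
// Hypothetical function object: computes m * x + b
class Linear {
  double m, b;
public:
  Linear(double slope, double offset)
    : m(slope), b(offset) {}
  // One argument:
  double operator()(double x) const {
    return m * x + b;
  }
  // Overloaded with two arguments:
  double operator()(double x, double y) const {
    return (*this)(x) + (*this)(y);
  }
};

double applyLinear() {
  Linear f(2.0, 1.0);
  // f looks like a function: 7 + (3 + 5)
  return f(3.0) + f(1.0, 2.0);
}
```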

To create an operator->* you must first create a class with an operator( ) that is the type of object that operator->* will return. This class must somehow capture the necessary information so that when the operator( ) is called (which happens automatically), the pointer-to-member will be dereferenced for the object. In the following example, the FunctionObject constructor captures and stores both the pointer to the object and the pointer to the member function, and then the operator( ) uses those to make the actual pointer-to-member call:

//: C12:PointerToMemberOperator.cpp
#include <iostream>
using namespace std;

class Dog {
public:
  int run(int i) const { 
    cout << "run\n";  
    return i; 
  }
  int eat(int i) const { 
     cout << "eat\n";  
     return i; 
  }
  int sleep(int i) const { 
    cout << "ZZZ\n"; 
    return i; 
  }
  typedef int (Dog::*PMF)(int) const;
  // operator->* must return an object 
  // that has an operator():
  class FunctionObject {
    Dog* ptr;
    PMF pmem;
  public:
    // Save the object pointer and member pointer
    FunctionObject(Dog* wp, PMF pmf) 
      : ptr(wp), pmem(pmf) { 
      cout << "FunctionObject constructor\n";
    }
    // Make the call using the object pointer
    // and member pointer
    int operator()(int i) const {
      cout << "FunctionObject::operator()\n";
      return (ptr->*pmem)(i); // Make the call
    }
  };
  FunctionObject operator->*(PMF pmf) { 
    cout << "operator->*" << endl;
    return FunctionObject(this, pmf);
  }
};
 
int main() {
  Dog w;
  Dog::PMF pmf = &Dog::run;
  cout << (w->*pmf)(1) << endl;
  pmf = &Dog::sleep;
  cout << (w->*pmf)(2) << endl;
  pmf = &Dog::eat;
  cout << (w->*pmf)(3) << endl;
} ///:~

Dog has three member functions, all of which take an int argument and return an int. PMF is a typedef to simplify defining a pointer-to-member to Dog’s member functions.

A FunctionObject is created and returned by operator->*. Notice that operator->* knows both the object that the pointer-to-member is being called for (this) and the pointer-to-member, and it passes those to the FunctionObject constructor that stores the values. When operator->* is called, the compiler immediately turns around and calls operator( ) for the return value of operator->*, passing in the arguments that follow it in parentheses (for example, the 1 in (w->*pmf)(1)). The FunctionObject::operator( ) takes the arguments and then dereferences the “real” pointer-to-member using its stored object pointer and pointer-to-member.

Notice that what you are doing here, just as with operator->, is inserting yourself in the middle of the call to operator->*. This allows you to perform some extra operations if you need to.

The operator->* mechanism implemented here only works for member functions that take an int argument and return an int. This is limiting, but if you try to create overloaded mechanisms for each different possibility, it seems like a prohibitive task. Fortunately, C++’s template mechanism (described in the last chapter of this book, and in Volume 2) is designed to handle just such a problem.
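
As a rough preview of how templates lift that restriction, here is a sketch that accepts any const member function of Dog. It relies on variadic templates, a C++11 feature that is newer than this book, and the member function add( ) is a hypothetical addition for the illustration:

```cpp
class Dog {
public:
  int run(int i) const { return i; }
  int add(int a, int b) const { return a + b; }
  // Works for any const member function of Dog:
  template<class R, class... Args>
  class FunctionObject {
    typedef R (Dog::*PMF)(Args...) const;
    const Dog* ptr;
    PMF pmem;
  public:
    FunctionObject(const Dog* dp, PMF pmf)
      : ptr(dp), pmem(pmf) {}
    R operator()(Args... args) const {
      return (ptr->*pmem)(args...); // Make the call
    }
  };
  // Return and argument types are deduced from pmf:
  template<class R, class... Args>
  FunctionObject<R, Args...>
  operator->*(R (Dog::*pmf)(Args...) const) const {
    return FunctionObject<R, Args...>(this, pmf);
  }
};

int callBoth() {
  Dog w;
  int (Dog::*one)(int) const = &Dog::run;
  int (Dog::*two)(int, int) const = &Dog::add;
  return (w->*one)(1) + (w->*two)(2, 3); // 1 + 5
}
```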

Operators you can’t overload

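There are certain operators in the available set that cannot be overloaded: the member selection operator ., the pointer-to-member dereference operator .*, the scope resolution operator ::, the conditional operator ?:, and sizeof. The general reason for the restriction is safety. If these operators were overloadable, it would jeopardize or break safety mechanisms, make things harder, or confuse existing practice.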

Non-member operators

In some of the previous examples, the operators may be members or non-members, and it doesn’t seem to make much difference. This usually raises the question, “Which should I choose?” In general, if it doesn’t make any difference, they should be members, to emphasize the association between the operator and its class. When the left-hand operand is always an object of the current class, this works fine.

However, sometimes you want the left-hand operand to be an object of some other class. A common place you’ll see this is when the operators << and >> are overloaded for iostreams. Since iostreams is a fundamental C++ library, you’ll probably want to overload these operators for most of your classes, so the process is worth memorizing:

//: C12:IostreamOperatorOverloading.cpp
// Example of non-member overloaded operators
#include "../require.h"
#include <iostream>
#include <sstream> // "String streams"
#include <cstring>
using namespace std;

class IntArray {
  enum { sz = 5 };
  int i[sz];
public:
  IntArray() { memset(i, 0, sz* sizeof(*i)); }
  int& operator[](int x) {
    require(x >= 0 && x < sz,
      "IntArray::operator[] out of range");
    return i[x];
  }
  friend ostream&
    operator<<(ostream& os, const IntArray& ia);
  friend istream&
    operator>>(istream& is, IntArray& ia);
};

ostream& 
operator<<(ostream& os, const IntArray& ia) {
  for(int j = 0; j < ia.sz; j++) {
    os << ia.i[j];
    if(j != ia.sz -1)
      os << ", ";
  }
  os << endl;
  return os;
}

istream& operator>>(istream& is, IntArray& ia){
  for(int j = 0; j < ia.sz; j++)
    is >> ia.i[j];
  return is;
}

int main() {
  stringstream input("47 34 56 92 103");
  IntArray I;
  input >> I;
  I[4] = -1; // Use overloaded operator[]
  cout << I;
} ///:~

This class also contains an overloaded operator [ ], which returns a reference to a legitimate value in the array. Because a reference is returned, the expression

I[4] = -1;

not only looks much more civilized than if pointers were used, it also accomplishes the desired effect.

It’s important that the overloaded shift operators pass and return by reference, so the actions will affect the external objects. In the function definitions, expressions like

os << ia.i[j];

cause the existing overloaded operator functions to be called (that is, those defined in <iostream>). In this case, the function called is the ostream member function ostream& ostream::operator<<(int) because ia.i[j] resolves to an int.

Once all the actions are performed on the istream or ostream, it is returned so it can be used in a more complicated expression.

In main( ), a new type of iostream is used: the stringstream (declared in <sstream>). This is a class that takes a string (which it can create from a char array, as shown here) and turns it into an iostream. In the example above, this means that the shift operators can be tested without opening a file or typing data in on the command line.

The form shown in this example for the inserter and extractor is standard. If you want to create these operators for your own class, copy the function signatures and return types above and follow the form of the body.
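
To show the same form applied to a different class, here is a sketch using a hypothetical Point class (the class and the roundTrip( ) helper are invented for this illustration). Because each operator returns its stream by reference, the calls chain:

```cpp
#include <iostream>
#include <sstream>
#include <string>
using namespace std;

class Point { // Hypothetical class
  int x, y;
public:
  Point(int xx = 0, int yy = 0) : x(xx), y(yy) {}
  friend ostream&
    operator<<(ostream& os, const Point& p);
  friend istream&
    operator>>(istream& is, Point& p);
};

ostream& operator<<(ostream& os, const Point& p) {
  return os << "(" << p.x << ", " << p.y << ")";
}

istream& operator>>(istream& is, Point& p) {
  return is >> p.x >> p.y;
}

// Reads two Points from text and writes them back out:
string roundTrip(const string& text) {
  istringstream in(text);
  Point a, b;
  in >> a >> b; // Chained extraction
  ostringstream out;
  out << a << " " << b; // Chained insertion
  return out.str();
}
```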

Basic guidelines

Murray[49] suggests these guidelines for choosing between members and non-members:

Operator                              Recommended use
All unary operators                   member
= ( ) [ ] -> ->*                      must be member
+= -= /= *= ^= &= |= %= >>= <<=       member
All other binary operators            non-member
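
One practical reason behind the last row is automatic type conversion: a non-member operator lets the compiler convert either operand, while a member operator never converts the left-hand one. A minimal sketch, using a hypothetical Number class:

```cpp
class Number { // Hypothetical class
  int v;
public:
  // Also acts as a conversion from int to Number:
  Number(int vv = 0) : v(vv) {}
  int value() const { return v; }
  // Member: modifies the left-hand operand
  Number& operator+=(const Number& r) {
    v += r.v;
    return *this;
  }
};

// Non-member: either operand may be converted
inline Number
operator+(const Number& l, const Number& r) {
  Number result(l);
  result += r;
  return result;
}

int mixedSum() {
  Number n(2);
  Number a = n + 1; // Fine as a member or non-member
  Number b = 1 + n; // Requires the non-member form
  a += b;
  return a.value(); // (2 + 1) + (1 + 2)
}
```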

Overloading assignment

A common source of confusion for new C++ programmers is assignment. This is no doubt because the = sign is such a fundamental operation in programming, right down to copying a register at the machine level. In addition, the copy-constructor (described in Chapter 11) is also sometimes invoked when the = sign is used:

MyType b;
MyType a = b;
a = b;

In the second line, the object a is being defined. A new object is being created where one didn’t exist before. Because you know by now how defensive the C++ compiler is about object initialization, you know that a constructor must always be called at the point where an object is defined. But which constructor? a is being created from an existing MyType object (b, on the right side of the equal sign), so there’s only one choice: the copy-constructor. Even though an equal sign is involved, the copy-constructor is called.

In the third line, things are different. On the left side of the equal sign, there’s a previously initialized object. Clearly, you don’t call a constructor for an object that’s already been created. In this case MyType::operator= is called for a, taking as an argument whatever appears on the right-hand side. (You can have multiple operator= functions to take different types of right-hand arguments.)

This behavior is not restricted to the copy-constructor. Any time you’re initializing an object using an = instead of the ordinary function-call form of the constructor, the compiler will look for a constructor that accepts whatever is on the right-hand side:

//: C12:CopyingVsInitialization.cpp
class Fi {
public:
  Fi() {}
};

class Fee {
public:
  Fee(int) {}
  Fee(const Fi&) {}
};

int main() {
  Fee fee = 1; // Fee(int)
  Fi fi;
  Fee fum = fi; // Fee(Fi)
} ///:~

When dealing with the = sign, it’s important to keep this distinction in mind: If the object hasn’t been created yet, initialization is required; otherwise the assignment operator= is used.
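
You can confirm which function is chosen by having each one leave a mark. In this sketch the static log string and the single-letter codes are assumptions made for the illustration; MyType here is a stand-in for the type in the three-line fragment earlier in this section:

```cpp
#include <string>
using namespace std;

class MyType { // Hypothetical tracing version
public:
  static string log;
  MyType() { log += "D"; } // Default constructor
  MyType(const MyType&) { log += "C"; } // Copy-constructor
  MyType& operator=(const MyType&) { // Assignment
    log += "A";
    return *this;
  }
};

string MyType::log;

string traceCalls() {
  MyType::log.clear();
  MyType b;     // D: default constructor
  MyType a = b; // C: copy-constructor, despite the =
  a = b;        // A: operator=
  return MyType::log;
}
```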

It’s even better to avoid writing code that uses the = for initialization; instead, always use the explicit constructor form. The two constructions with the equal sign then become:

Fee fee(1);
Fee fum(fi);

This way, you’ll avoid confusing your readers.

Behavior of operator=

In Integer.h and Byte.h, you saw that operator= can be only a member function. It is intimately connected to the object on the left side of the ‘=’. If it were possible to define operator= globally, then you might attempt to redefine the built-in ‘=’ sign:

int operator=(int, MyType); // Global = not allowed!

The compiler skirts this whole issue by forcing you to make operator= a member function.

When you create an operator=, you must copy all of the necessary information from the right-hand object into the current object (that is, the object that operator= is being called for) to perform whatever you consider “assignment” for your class. For simple objects, this is obvious:

//: C12:SimpleAssignment.cpp
// Simple operator=()
#include <iostream>
using namespace std;

class Value {
  int a, b;
  float c;
public:
  Value(int aa = 0, int bb = 0, float cc = 0.0)
    : a(aa), b(bb), c(cc) {}
  Value& operator=(const Value& rv) {
    a = rv.a;
    b = rv.b;
    c = rv.c;
    return *this;
  }
  friend ostream&
  operator<<(ostream& os, const Value& rv) {
    return os << "a = " << rv.a << ", b = "
      << rv.b << ", c = " << rv.c;
  }
};

int main() {
  Value a, b(1, 2, 3.3);
  cout << "a: " << a << endl;
  cout << "b: " << b << endl;
  a = b;
  cout << "a after assignment: " << a << endl;
} ///:~

Here, the object on the left side of the = copies all the elements of the object on the right, then returns a reference to itself, which allows a more complex expression to be created.

This example includes a common mistake. When you’re assigning two objects of the same type, you should always check first for self-assignment: is the object being assigned to itself? In some cases, such as this one, it’s harmless if you perform the assignment operations anyway, but if changes are made to the implementation of the class, it can make a difference, and if you don’t do it as a matter of habit, you may forget and cause hard-to-find bugs.
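
To see a case where the check does make a difference, consider a class that owns heap storage. This is only a sketch; Buffer and its helper dup( ) are hypothetical names invented for the illustration:

```cpp
#include <cstring>

class Buffer { // Hypothetical class owning heap storage
  char* data;
  static char* dup(const char* s) {
    char* d = new char[std::strlen(s) + 1];
    std::strcpy(d, s);
    return d;
  }
public:
  Buffer(const char* s) : data(dup(s)) {}
  Buffer(const Buffer& b) : data(dup(b.data)) {}
  ~Buffer() { delete []data; }
  Buffer& operator=(const Buffer& rv) {
    // Without this test, self-assignment would
    // delete data and then copy from the deleted
    // storage:
    if(&rv != this) {
      delete []data;
      data = dup(rv.data);
    }
    return *this;
  }
  const char* c_str() const { return data; }
};

bool survivesSelfAssignment() {
  Buffer b("Fido");
  Buffer& alias = b;
  b = alias; // Self-assignment through a reference
  return std::strcmp(b.c_str(), "Fido") == 0;
}
```

Self-assignment rarely appears as plainly as b = b; it usually arrives through a reference or pointer that happens to refer to the same object, as it does here.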

Pointers in classes

What happens if the object is not so simple? For example, what if the object contains pointers to other objects? Simply copying a pointer means that you’ll end up with two objects pointing to the same storage location. In situations like these, you need to do bookkeeping of your own.

There are two common approaches to this problem. The simplest technique is to copy whatever the pointer refers to when you do an assignment or a copy-construction. This is straightforward:

//: C12:CopyingWithPointers.cpp
// Solving the pointer aliasing problem by
// duplicating what is pointed to during 
// assignment and copy-construction.
#include "../require.h"
#include <string>
#include <iostream>
using namespace std;

class Dog {
  string nm;
public:
  Dog(const string& name) : nm(name) {
    cout << "Creating Dog: " << *this << endl;
  }
  // Synthesized copy-constructor & operator= 
  // are correct.
  // Create a Dog from a Dog pointer:
  Dog(const Dog* dp, const string& msg) 
    : nm(dp->nm + msg) {
    cout << "Copied dog " << *this << " from "
         << *dp << endl;
  }
  ~Dog() { 
    cout << "Deleting Dog: " << *this << endl;
  }
  void rename(const string& newName) {
    nm = newName;
    cout << "Dog renamed to: " << *this << endl;
  }
  friend ostream&
  operator<<(ostream& os, const Dog& d) {
    return os << "[" << d.nm << "]";
  }
};

class DogHouse {
  Dog* p;
  string houseName;
public:
  DogHouse(Dog* dog, const string& house)
   : p(dog), houseName(house) {}
  DogHouse(const DogHouse& dh)
    : p(new Dog(dh.p, " copy-constructed")),
      houseName(dh.houseName 
        + " copy-constructed") {}
  DogHouse& operator=(const DogHouse& dh) {
    // Check for self-assignment:
    if(&dh != this) {
      Dog* old = p;
      p = new Dog(dh.p, " assigned");
      delete old; // Don't leak the previous Dog
      houseName = dh.houseName + " assigned";
    }
    return *this;
  }
  void renameHouse(const string& newName) {
    houseName = newName;
  }
  Dog* getDog() const { return p; }
  ~DogHouse() { delete p; }
  friend ostream&
  operator<<(ostream& os, const DogHouse& dh) {
    return os << "[" << dh.houseName 
      << "] contains " << *dh.p;
  }
}; 

int main() {
  DogHouse fidos(new Dog("Fido"), "FidoHouse");
  cout << fidos << endl;
  DogHouse fidos2 = fidos; // Copy construction
  cout << fidos2 << endl;
  fidos2.getDog()->rename("Spot");
  fidos2.renameHouse("SpotHouse");
  cout << fidos2 << endl;
  fidos = fidos2; // Assignment
  cout << fidos << endl;
  fidos.getDog()->rename("Max");
  fidos2.renameHouse("MaxHouse");
} ///:~

Dog is a simple class that contains only a string that holds the name of the dog. However, you’ll generally know when something happens to a Dog because the constructors and destructors print information when they are called. Notice that the second constructor is a bit like a copy-constructor except that it takes a pointer to a Dog instead of a reference, and it has a second argument that is a message that’s concatenated to the argument Dog’s name. This is used to help trace the behavior of the program.

You can see that whenever a member function prints information, it doesn’t access that information directly but instead sends *this to cout. This in turn calls the ostream operator<<. It’s valuable to do it this way because if you want to reformat the way that Dog information is displayed (as I did by adding the ‘[’ and ‘]’) you only need to do it in one place.

A DogHouse contains a Dog* and demonstrates the four functions you will always need to define when your class contains pointers: all necessary ordinary constructors, the copy-constructor, operator= (either define it or disallow it), and a destructor. The operator= checks for self-assignment as a matter of course, even though it’s not strictly necessary here. This virtually eliminates the possibility that you’ll forget to check for self-assignment if you do change the code so that it matters.

Reference counting

In the example above, the copy-constructor and operator= make a new copy of what the pointer points to, and the destructor deletes it. However, if your object requires a lot of memory or a high initialization overhead, you may want to avoid this copying. A common approach to this problem is called reference counting. You give intelligence to the object that’s being pointed to so it knows how many objects are pointing to it. Then copy-construction or assignment means attaching another pointer to an existing object and incrementing the reference count. Destruction means reducing the reference count and destroying the object if the reference count goes to zero.

But what if you want to write to the object (the Dog in the example above)? More than one object may be using this Dog, so you’d be modifying someone else’s Dog as well as yours, which doesn’t seem very neighborly. To solve this “aliasing” problem, an additional technique called copy-on-write is used. Before writing to a block of memory, you make sure no one else is using it. If the reference count is greater than one, you must make yourself a personal copy of that block before writing it, so you don’t disturb someone else’s turf. Here’s a simple example of reference counting and copy-on-write:

//: C12:ReferenceCounting.cpp
// Reference count, copy-on-write
#include "../require.h"
#include <string>
#include <iostream>
using namespace std;

class Dog {
  string nm;
  int refcount;
  Dog(const string& name) 
    : nm(name), refcount(1) {
    cout << "Creating Dog: " << *this << endl;
  }
  // Prevent assignment:
  Dog& operator=(const Dog& rv);
public:
  // Dogs can only be created on the heap:
  static Dog* make(const string& name) {
    return new Dog(name);
  }
  Dog(const Dog& d) 
    : nm(d.nm + " copy"), refcount(1) {
    cout << "Dog copy-constructor: " 
         << *this << endl;
  }
  ~Dog() { 
    cout << "Deleting Dog: " << *this << endl;
  }
  void attach() { 
    ++refcount;
    cout << "Attached Dog: " << *this << endl;
  }
  void detach() {
    require(refcount != 0);
    cout << "Detaching Dog: " << *this << endl;
    // Destroy object if no one is using it:
    if(--refcount == 0) delete this;
  }
  // Conditionally copy this Dog.
  // Call before modifying the Dog, assign
  // resulting pointer to your Dog*.
  Dog* unalias() {
    cout << "Unaliasing Dog: " << *this << endl;
    // Don't duplicate if not aliased:
    if(refcount == 1) return this;
    --refcount;
    // Use copy-constructor to duplicate:
    return new Dog(*this);
  }
  void rename(const string& newName) {
    nm = newName;
    cout << "Dog renamed to: " << *this << endl;
  }
  friend ostream&
  operator<<(ostream& os, const Dog& d) {
    return os << "[" << d.nm << "], rc = " 
      << d.refcount;
  }
};

class DogHouse {
  Dog* p;
  string houseName;
public:
  DogHouse(Dog* dog, const string& house)
   : p(dog), houseName(house) {
    cout << "Created DogHouse: "<< *this << endl;
  }
  DogHouse(const DogHouse& dh)
    : p(dh.p),
      houseName("copy-constructed " + 
        dh.houseName) {
    p->attach();
    cout << "DogHouse copy-constructor: "
         << *this << endl;
  }
  DogHouse& operator=(const DogHouse& dh) {
    // Check for self-assignment:
    if(&dh != this) {
      houseName = dh.houseName + " assigned";
      // Clean up what you're using first:
      p->detach();
      p = dh.p; // Like copy-constructor
      p->attach();
    }
    cout << "DogHouse operator= : "
         << *this << endl;
    return *this;
  }
  // Decrement refcount, conditionally destroy
  ~DogHouse() {
    cout << "DogHouse destructor: " 
         << *this << endl;
    p->detach(); 
  }
  void renameHouse(const string& newName) {
    houseName = newName;
  }
  void unalias() { p = p->unalias(); }
  // Copy-on-write. Anytime you modify the 
  // contents of the pointer you must 
  // first unalias it:
  void renameDog(const string& newName) {
    unalias();
    p->rename(newName);
  }
  // ... or when you allow someone else access:
  Dog* getDog() {
    unalias();
    return p; 
  }
  friend ostream&
  operator<<(ostream& os, const DogHouse& dh) {
    return os << "[" << dh.houseName 
      << "] contains " << *dh.p;
  }
}; 

int main() {
  DogHouse 
    fidos(Dog::make("Fido"), "FidoHouse"),
    spots(Dog::make("Spot"), "SpotHouse");
  cout << "Entering copy-construction" << endl;
  DogHouse bobs(fidos);
  cout << "After copy-constructing bobs" << endl;
  cout << "fidos:" << fidos << endl;
  cout << "spots:" << spots << endl;
  cout << "bobs:" << bobs << endl;
  cout << "Entering spots = fidos" << endl;
  spots = fidos;
  cout << "After spots = fidos" << endl;
  cout << "spots:" << spots << endl;
  cout << "Entering self-assignment" << endl;
  bobs = bobs;
  cout << "After self-assignment" << endl;
  cout << "bobs:" << bobs << endl;
  // Comment out the following lines:
  cout << "Entering rename(\"Bob\")" << endl;
  bobs.getDog()->rename("Bob");
  cout << "After rename(\"Bob\")" << endl;
} ///:~

The class Dog is the object pointed to by a DogHouse. It contains a reference count and functions to control and read the reference count. There’s a copy-constructor so you can make a new Dog from an existing one.

The attach( ) function increments the reference count of a Dog to indicate there’s another object using it. detach( ) decrements the reference count. If the reference count goes to zero, then no one is using it anymore, so the member function destroys its own object by saying delete this.

Before you make any modifications (such as renaming a Dog), you should ensure that you aren’t changing a Dog that some other object is using. You do this by calling DogHouse::unalias( ), which in turn calls Dog::unalias( ). The latter function will return the existing Dog pointer if the reference count is one (meaning no one else is pointing to that Dog), but will duplicate the Dog if the reference count is more than one.

The copy-constructor, instead of creating its own memory, points its Dog* at the Dog of the source object. Then, because there’s now an additional object using that block of memory, it increments the reference count by calling Dog::attach( ).

The operator= deals with an object that has already been created on the left side of the =, so it must first clean that up by calling detach( ) for that Dog, which will destroy the old Dog if no one else is using it. Then operator= repeats the behavior of the copy-constructor. Notice that it first checks to detect whether you’re assigning the same object to itself.

The destructor calls detach( ) to conditionally destroy the Dog.

To implement copy-on-write, you must control all the actions that write to your block of memory. For example, the renameDog( ) member function allows you to change the values in the block of memory. But first, it uses unalias( ) to prevent the modification of an aliased Dog (a Dog with more than one DogHouse object pointing to it). And if you need to produce a pointer to a Dog from within a DogHouse, you unalias( ) that pointer first.

main( ) tests the various functions that must work correctly to implement reference counting: the constructor, copy-constructor, operator=, and destructor. It also tests the copy-on-write by calling getDog( )->rename( ), which unaliases the Dog before modifying it.

Here’s the output (after a little reformatting):

Creating Dog: [Fido], rc = 1
Created DogHouse: [FidoHouse] 
  contains [Fido], rc = 1
Creating Dog: [Spot], rc = 1
Created DogHouse: [SpotHouse] 
  contains [Spot], rc = 1
Entering copy-construction
Attached Dog: [Fido], rc = 2
DogHouse copy-constructor: 
  [copy-constructed FidoHouse] 
    contains [Fido], rc = 2
After copy-constructing bobs
fidos:[FidoHouse] contains [Fido], rc = 2
spots:[SpotHouse] contains [Spot], rc = 1
bobs:[copy-constructed FidoHouse] 
  contains [Fido], rc = 2
Entering spots = fidos
Detaching Dog: [Spot], rc = 1
Deleting Dog: [Spot], rc = 0
Attached Dog: [Fido], rc = 3
DogHouse operator= : [FidoHouse assigned]
  contains [Fido], rc = 3
After spots = fidos
spots:[FidoHouse assigned] contains [Fido], rc = 3
Entering self-assignment
DogHouse operator= : [copy-constructed FidoHouse]
  contains [Fido], rc = 3
After self-assignment
bobs:[copy-constructed FidoHouse] 
  contains [Fido], rc = 3
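Entering rename("Bob")
Unaliasing Dog: [Fido], rc = 3
Dog copy-constructor: [Fido copy], rc = 1
Dog renamed to: [Bob], rc = 1
After rename("Bob")
DogHouse destructor: [copy-constructed FidoHouse]
  contains [Bob], rc = 1
Detaching Dog: [Bob], rc = 1
Deleting Dog: [Bob], rc = 0
DogHouse destructor: [FidoHouse assigned] 
  contains [Fido], rc = 2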
Detaching Dog: [Fido], rc = 2
DogHouse destructor: [FidoHouse] 
  contains [Fido], rc = 1
Detaching Dog: [Fido], rc = 1
Deleting Dog: [Fido], rc = 0

By studying the output, tracing through the source code, and experimenting with the program, you’ll deepen your understanding of these techniques.
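
For comparison, here is a much shorter sketch of the same copy-on-write idea. It is not the technique shown above: it swaps the hand-written attach( ) and detach( ) for std::shared_ptr, a C++11 library class that is newer than this book and keeps the count itself. Kennel is a hypothetical stand-in for DogHouse that holds only the dog’s name:

```cpp
#include <memory>
#include <string>
using namespace std;

class Kennel { // Hypothetical stand-in for DogHouse
  shared_ptr<string> dog; // shared_ptr keeps the count
public:
  Kennel(const string& name)
    : dog(make_shared<string>(name)) {}
  // The synthesized copy-constructor and operator=
  // share the string and adjust the count.
  long users() const { return dog.use_count(); }
  const string& dogName() const { return *dog; }
  // Copy-on-write: unalias before modifying
  void renameDog(const string& newName) {
    if(dog.use_count() > 1)
      dog = make_shared<string>(*dog); // Private copy
    *dog = newName;
  }
};

bool cowWorks() {
  Kennel fidos("Fido");
  Kennel bobs(fidos); // Shares the string; count is 2
  bool shared = fidos.users() == 2;
  bobs.renameDog("Bob"); // bobs gets its own copy
  return shared && fidos.dogName() == "Fido"
    && bobs.dogName() == "Bob" && fidos.users() == 1;
}
```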

Automatic operator= creation

Because assigning an object to another object of the same type is an activity most people expect to be possible, the compiler will automatically create a type::operator=(type) if you don’t make one. The behavior of this operator mimics that of the automatically created copy-constructor; if the class contains objects (or is inherited from another class), the operator= for those objects is called recursively. This is called memberwise assignment. For example,

//: C12:AutomaticOperatorEquals.cpp
#include <iostream>
using namespace std;

class Cargo {
public:
  Cargo& operator=(const Cargo&) {
    cout << "inside Cargo::operator=()" << endl;
    return *this;
  }
};

class Truck {
  Cargo b;
};

int main() {
  Truck a, b;
  a = b; // Prints: "inside Cargo::operator=()"
} ///:~

The automatically generated operator= for Truck calls Cargo::operator=.
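
The same memberwise rule applies to built-in members such as pointers, where it copies only the address. A minimal sketch, using a hypothetical Leash struct invented for this illustration:

```cpp
struct Leash { // Hypothetical class with a raw pointer
  int* length;
  // No operator= is defined, so the compiler
  // synthesizes one that copies the pointer, not
  // what it points to.
};

bool sharesStorageAfterAssignment() {
  int x = 3, y = 5;
  Leash a, b;
  a.length = &x;
  b.length = &y;
  a = b;         // Memberwise: a.length = b.length
  *a.length = 7; // Also changes what b refers to
  return a.length == b.length && y == 7 && x == 3;
}
```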

In general, you don’t want to let the compiler do this for you. With classes of any sophistication (especially if they contain pointers!) you want to explicitly create an operator=. If you really don’t want people to perform assignment, declare operator= as a private function. (You do