logo

C++ Reference Counting Map

Posted on: 2015-02-16

In the source base I was working on I have a 'Object' base class which uses reference counting. I needed a container which could honor reference counting where necessary but also other features. I have two different styles of smart pointers - pointers and references. A 'reference' smart pointer is like a regular smart pointer, except it cannot be null and when they are compared they are compared by value not by pointer. This is the same behavior as a C++ reference - and thus the name. The smart references also have the added feature that their value can change, unlike C++, as long as the value is not null.

The map implementation needs to work between Object and non object derived types. It also needs to handle comparison as a reference, or as a pointer as is appropriate. If I were going to implement this as a single class the class would have to handle the following combinations

Producing 4 combinations. Since a map has both a key and value, that means we have 16 combinations. Clearly we don't want to write 16 classes.

When the key is a Reference it has a huge impact on the implementation. With a Reference objects are compared by value, not just a pointer comparison. In practice for equality testing this means calling a virtual method valueEquals if the pointers aren't equal. The actual map functionality uses a hash - so any object that wants to be a key in a map must also implement the calcHash virtual method. Meaning if we have a map with a Reference key, the implementation will call calcHash on it, and see if it can find a match using valueEquals by finding all the objects with the same hash in a multi map.

With a non reference key - it's much easier. The pointer itself is the key - and the pointers are unique. So the underlying implementation can basically just use a regular hash map, and we can build reference counting and reference behaviour elsewhere.

So there is a natural split in the containers. If the key is a smart reference we implement with the multimap as described above. If not we use a regular pointer map, and thus my initial implementation has two classes TRefHashMap for a map where the key is a Reference and TPtrHashMap where the key is a Pointer.

Okay - now each of the implementations has 8 possible combinations, from

So another questions arise - how is this implemented in the container? Will we have a template that generates all the combinations? How will the user of the container indicate their intent?

On the second point the changes between the combinations are fairly small. So instead of making a template that generated all combinations, I made a concrete base class that held flags for the combination and then have a template that derives from this and acts as a type 'shim' doing the type casting but deferring the work to the underlying class. On a release build the shim will dissolve to nothing - but I get all the strong typing behaviour I want. The advantage of the approach is very much smaller amounts of code produced, better code cache usage, and the concrete implementation can be exhaustively tested programmatically. The disadvantage is perhaps a small runtime hit to testing for the special cases. On balance I think this is a good deal.

To inform the templates of the behaviour wanted the types are embellished with the appropriate smart pointer. Notionally the smart pointer is how the type is handled in the container.  Using a traits template class I can pull out the flags, and get to the underlying type. Heres how it looks

template <typename KEY, typename VALUE>
class TRefHashMap;

// To use I mark the types to say the behavior I want
// o TStrongRef - Ref counted, and a reference
// o TWeakRef   - Not ref counted and a reference
// o TStrongPtr - Ref counted, and a pointer
// o TWeakPtr   - Not ref counted, and a pointer
// o T*         - Same as a weak pointer 

// Maps a ref counted (object derived 'Thing') behaving as a reference
// (the key has to be Reference type on TRefHashMap)
TRefHashMap<TStrongRef<Thing>, char*> map;

So how does the template use the embellishments on the template types to set the flags and get the underlying target types (Thing, char)?

We use another template to pull the bits we want out

// Note - no default impl, as we only want specializations 
template <typename T>
struct TPointerTraits;

template <typename T> struct TPointerTraits<T*> { 
    typedef T Type; 
    enum { IS_REF_COUNTED = 0 , IS_REF = 0 }; 
};
template <typename T> struct TPointerTraits<TStrongRef<T> > { 
    typedef T Type; 
    enum { IS_REF_COUNTED = 1 , IS_REF = 1 }; 
};
template <typename T> struct TPointerTraits<TWeakRef<T> > { 
    typedef T Type; 
    enum { IS_REF_COUNTED = 0 , IS_REF = 1 }; 
};
template <typename T> struct TPointerTraits<TStrongPtr<T> > { 
    typedef T Type; 
    enum { IS_REF_COUNTED = 1 , IS_REF = 0 }; 
};
template <typename T> struct TPointerTraits<TWeakPtr<T> > { 
    typedef T Type; 
    enum { IS_REF_COUNTED = 0 , IS_REF = 0 }; 
};

It is now quite easy to get the flags and the underlying types returned by the container

template <typename KEY, typename VALUE>
class TRefHashMap {
    typedef TPointerTraits<KEY> KeyTraits;
    typedef TPointerTraits<VALUE> ValueTraits;

    typedef typename KeyTraits::Type K;
    typedef typename ValueTraits::Type V;

    enum { FLAGS = KeyTraits::IS_REF_COUNTED | 
                   (ValueTraits::IS_REF_COUNTED << 1) |
                   (ValueTraits::IS_REF << 2) };

// ...
};

There's one remaining thing that feels a little wrong - that's that we have two separate classes and we need to remember to use the right one depending on how we want the key used. It's not quite that bad, because a compile time assert can determine if a ref or not is used and so display an error if the wrong class is chosen. Still it would be better if the client didn't have to know this extra feature.

The key type can be used to pull in the appropriate implementation via - surprise - a template. The implementation could be applied either via composition or through derivation. For my purposes derivation is fine - and means for less code, and less work for the compiler to do. All that is basically needed is

template <typename V, typename T, typename F> 
struct TIfRefPointer { typedef F Value; };
template <typename V, typename T, typename F>
struct TIfRefPointer<TStrongRef<V> > { typedef T Value; };
template <typename V, typename T, typename F>
struct TIfRefPointer<TWeakRef<V> > { typedef T Value; };

Now we just have one shim and it just derives from the appropriate base

template <typename KEY, typename VALUE>
class TPtrHashMap: public 
    TIfRefPointer<KEY, RefHashMap, PtrHashMap>::Value {
    public:
    // ...
};

Not exactly beautiful  - but a little bit simpler to use.

That was basically it. It allowed me to throw away a variety of other templates and classes, and simplify yet others as I could just use one of these templates instead of doing the ref counting in the client code.