Value Semantics

Posted on: 2023-01-01

Much of this is based on thoughts after watching "Value Semantics: Safety, Independence, Projection, & Future of Programming" by Dave Abrahams.

Independence is about knowing that a value can only be accessed via one path.

Much C++ code that uses references assumes that its parameters are independent. If they are not, you get undefined or unexpected behavior.

"functional update" via inout. It would be akin to something like

void doThing(Array<T>& t, ...);

To get independence in C++ we could do

Array<T> doThing(Array<T> t, ...)
{
    // do some work

    // Hopefully returning the by-value parameter uses implicit move, not a copy
    return t;
}

//...
Array<T> t;
// Note the use of move such that we don't have to copy
t = doThing(std::move(t), ... );

It may work reasonably efficiently, but it's not very easy to write. With inout we could do

void doThing(Array<T> inout t, ...);

Array<T> t;
doThing(t, ...);

Where the compiler can ensure t is independent, and does not do expensive copies.

in is somewhat similar to const& but has the additional guarantee of independence.

Note that this makes in different from leaving a parameter unmarked (normal pass-by-value): without the marking, the compiler is not being told there is a guarantee of independence, and so it might perform an expensive copy unnecessarily. The marking also allows the compiler to see if a call site violates independence and give an error.
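
As a rough illustration, current C++ offers two approximations, neither with the full guarantee (a minimal sketch; the function names are hypothetical):

#include <vector>

// Pass-by-value is independent but may copy; const& avoids the copy but
// gives no independence guarantee. `in` would combine the benefits of both.
void byValue(std::vector<int> t);           // independent, possibly expensive copy
void byConstRef(const std::vector<int>& t); // cheap, but may alias other arguments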

As concrete examples

void someFunc(Thing& t, const Thing& t2);

Thing t;
someFunc(t, t);

If someFunc mutates t, there will likely be unexpected behavior: during the mutation, t2 may (or may not) appear to change, because both parameters refer to the same object. If we use in and inout:

void someFunc(Thing inout t, Thing in t2);

Thing t;
// Compiler can fail with an error message, as it can `see` that the
// parameters don't have the `independence` property.
someFunc(t, t);

Checking this might be more difficult in general, although in simple cases, where a value is only accessible via its root, a trivial test is to check that all in/inout arguments have independent roots.
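
For illustration, here is a self-contained C++ sketch of both the aliasing hazard and the trivial independent-roots test, done here at runtime with an assert (the scaleBy function is hypothetical; a compiler with in/inout could perform the same check statically):

#include <cassert>
#include <vector>

// dst plays the role of inout, src the role of in.
void scaleBy(std::vector<int>& dst, const std::vector<int>& src, int k)
{
    // Trivial independent-roots test: the two roots must be distinct objects.
    assert(&dst != &src && "in/inout parameters must be independent");
    dst.clear();
    for (int x : src)            // if dst aliased src, src would now be empty
        dst.push_back(x * k);
}

int main()
{
    std::vector<int> a{1, 2, 3};
    std::vector<int> b;
    scaleBy(b, a, 2);     // fine: independent roots
    // scaleBy(a, a, 2);  // fires the assert; would otherwise yield {} not {2, 4, 6}
}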

In a typesystem that has modifiers in order, you'd have something like

void someFunc(inout Thing t, in Thing t2);

Which is arguably a little more readable as a replacement for the C/C++ style * and &.

Heterogeneous Programming

Value semantics may be useful in heterogeneous programming environments because it limits pointer use. One problem with pointers is that they don't work across memory spaces. It is typical in modern systems to have different pools of memory. Sometimes these pools alias some shared memory address space, as with caches. On such systems a pointer may work, but in doing so it requires the memory system to sync memory across caches. This process, whilst simple from a programming point of view (it's largely a hardware abstraction), is under limited control. It also doesn't change the fact that copying through relatively slow memory is likely.

It is perhaps also worth saying that copying in this way implies synchronization. When a compute element completes writing data, it's not enough for another element to simply access it; there need to be memory barriers such that the full extent of the writer's changes is visible to the reader.

A discrete GPU typically has local memory. This may be in a different address space. If we want to move data between CPU and GPU there may be a need to remap the addresses.

Note that holding a pointer without accessing it is ok: a pointer can be passed between compute elements as long as it's only dereferenced on the compute element where the address is valid. In this scenario a pointer can be considered a kind of opaque handle. It would be nice if a programming language had some type system mechanism that made this distinction clear and only allowed indirection on compute elements where the handle is valid.
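
A minimal C++ sketch of that idea (the Handle and Pool types are hypothetical): the handle is just an index, freely copyable between compute elements, and indirection is only possible where a matching Pool exists.

#include <cstdint>
#include <vector>

struct Handle { std::uint32_t index; };   // opaque: no dereference operator

template <typename T>
struct Pool
{
    std::vector<T> items;                 // lives in one memory space

    // Only code holding the Pool (i.e. running where that memory is valid)
    // can turn a Handle back into a reference.
    T& deref(Handle h) { return items[h.index]; }
};

In a richer type system, deref could be statically restricted to the compute element that owns the pool.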

The SPEs on the Cell processor form a heterogeneous compute system with fast but limited local memory and no shared-memory caching. Access to main memory is achieved via DMA. Such a system requires address remapping if addresses are going to be used to access local memory. Using indices or some other handle that works across address spaces greatly simplifies programming such a system.

Value semantics can help in many of these situations because it attempts to remove/reduce/limit the use of pointers. If we use indices to reference data, these work across boundaries. Additionally, value semantics prefers data to be either copied or moved such that there is only one write accessor. This removes race issues between compute elements.

Reference Counting

Elsewhere I've talked about Nim and its use of non-atomic reference counting. This improves performance, because atomic reference counting is typically somewhat slow, although recent Apple Silicon has made it surprisingly fast. The problem is how to make this work in an environment with multiple cores. This can be achieved by having each thread act as an "island" of non-atomic items; when values move between threads a new copy is created. In a language that allows lots of references, this implies a deep copy.
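
A minimal sketch of the island idea (the Rc type is hypothetical, loosely modelled on non-atomic reference counting as in Nim):

#include <utility>

template <typename T>
struct Rc
{
    struct Box { T value; int count; };   // plain int count: cheap, single-thread only
    Box* box;

    explicit Rc(T v) : box(new Box{std::move(v), 1}) {}
    Rc(const Rc& o) : box(o.box) { ++box->count; }   // non-atomic increment
    ~Rc() { if (--box->count == 0) delete box; }     // non-atomic decrement
    Rc& operator=(const Rc&) = delete;               // kept minimal for the sketch

    // Crossing islands: hand the other thread a fresh, unshared box.
    Rc deepCopyForSend() const { return Rc(box->value); }
};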

With value semantics and copy on write (COW) you can perhaps do a little better. You typically will not have regular pointers, and it is possible to move an object such that it is available to another thread. With COW, though, this isn't quite enough. Imagine trying to move an array of strings, where the array and strings are COW and non-atomically reference counted. If the reference count of the array is 1, it seems as if we can move ownership to another thread, but that isn't sufficient. Imagine there is a string in the array with a reference count of 2. If that's the case, there is a pre-existing other reference to this data on the current thread. So just moving is not enough, and can lead to a race.

What we actually want, as well as move, is uniqueMove. A uniqueMove will do a move if that's possible, but if not (because some sub-element is referenced elsewhere) it will copy. In this example it will need to copy the string, such that it is unique. Unlike a "deep copy", a uniqueMove is well defined in value semantics because of the ownership/containment relationship. It's perhaps also worth noting that, depending on the structure, the work a uniqueMove does will vary with how (or if) its constituents are referenced. If they aren't shared (or aren't COW types), it's the same as a move. A uniqueMove only needs to copy what is required to achieve the uniqueness constraint. Although such a function could be generated, it may not be super efficient even when no copying is required, because it still needs to traverse the value checking which COW constituents need copies.
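
Here's a minimal sketch of uniqueMove over the array-of-strings example (the CowString/CowArray types are hypothetical, with std::shared_ptr standing in for a COW buffer and use_count() for its reference count):

#include <memory>
#include <string>
#include <vector>

struct CowString
{
    std::shared_ptr<std::string> buf;   // shared COW storage

    void makeUnique()
    {
        if (buf.use_count() != 1)       // shared elsewhere: must copy
            buf = std::make_shared<std::string>(*buf);
    }
};

struct CowArray { std::vector<CowString> items; };

// Move the array, then copy only those constituents that are still shared,
// so no reference into the result remains on the sending thread.
CowArray uniqueMove(CowArray&& a)
{
    CowArray out = std::move(a);
    for (CowString& s : out.items)
        s.makeUnique();                 // traversal happens even if nothing copies
    return out;
}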

I suppose on silicon where atomic reference counting is really fast, it is conceivable to make a uniqueMove just do a move. Perhaps we need to distinguish between a move that produces unique values, and one that copies just enough for the value to be usable on another thread; let's call the latter threadMove.

Torn off buffers

A separate observation is about buffers in older graphics APIs, where the application can 'tear off' a copy. This provides a mechanism to change data while letting the system bound the amount of space used. If a torn-off buffer is not possible (say because of lack of memory) an implementation has choices, including stalling. An implementation can see how the data is accessed: if nothing is using the data, it can be discarded and the buffer reused, for example. This is all without having to alter the program, or pass around the data.
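
For reference, a sketch of the classic OpenGL buffer-orphaning idiom, which is one form of tear-off (assumes a current GL context, an existing buffer object vbo, and a byte count size):

glBindBuffer(GL_ARRAY_BUFFER, vbo);

// Re-specifying the store with nullptr "tears off" the old storage: pending
// draws keep reading it, while the app gets fresh space to write into.
glBufferData(GL_ARRAY_BUFFER, size, nullptr, GL_STREAM_DRAW);

void* p = glMapBufferRange(GL_ARRAY_BUFFER, 0, size,
                           GL_MAP_WRITE_BIT | GL_MAP_UNSYNCHRONIZED_BIT);
// ... write the new data into p ...
glUnmapBuffer(GL_ARRAY_BUFFER);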

It relies on an implicit serialization though. In a graphics API the serialization is the command buffer, and there would need to be synchronization around it. It wouldn't be meaningful for some threads to tear off buffers while other threads issue draws, for example, without coordination, because there is an implicit association between the draws and which underlying buffer is to be used.
