You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucy.apache.org by Marvin Humphrey <ma...@rectangular.com> on 2012/12/17 06:55:17 UTC

[lucy-dev] CAS example

Greets,

This week, the Lucy Book Club is reading up on a section of Programming
Language Pragmatics 3 that covers the atomic "compare-and-swap" operation, or
"CAS" for short.  Lucy uses CAS in Clownfish::LockFreeRegistry, a lock-free
hash table where we store our metaclass singletons, so I thought I'd write to
the dev list and explore an example of CAS usage in practice.

The book provides the following two examples on how to update a variable
atomically, when calculating the new value involves an arbitrary instruction
sequence rather than something simple like an increment:

1.  Using a global lock:

        acquire(L)
            r1 := x
            r2 := foo(r1)         -- probably a multi-instruction sequence
            x := r2
        release(L)

2.  Using CAS:

        start:
            r1 := x
            r2 := foo(r1)         -- probably a multi-instruction sequence
            r2 := CAS(x, r1, r2)  -- replace x if it hasn't changed
            if !r2 goto start

The CAS-based algorithm performs better, so we use it when we can -- however
CAS is not very standardized: on OS X 10.4 and later, it's in
<libkern/OSAtomic.h>, in Solaris 10 and later it's in <sys/atomic.h>,
depending on target architecture you can sometimes get CAS via GCC, etc.  If
Charmonizer manages to scrounge up a version of CAS from somewhere, we use it.
If not, we fall back to simulating CAS using the "global lock" algorithm above
with pthread mutex locks:

>From Clownfish/Util/Atomic.cfh:

    /**************** Fall back to pthread.h. ******************/
    #elif defined(CHY_HAS_PTHREAD_H)
    #include <pthread.h>

    extern pthread_mutex_t lucy_Atomic_mutex;

    static CHY_INLINE bool
    lucy_Atomic_cas_ptr(void *volatile *target, void *old_value,
                        void *new_value) {
        pthread_mutex_lock(&lucy_Atomic_mutex);
        if (*target == old_value) {
            *target = new_value;
            pthread_mutex_unlock(&lucy_Atomic_mutex);
            return true;
        }
        else {
            pthread_mutex_unlock(&lucy_Atomic_mutex);
            return false;
        }
    }

Our simulated CAS would not work well under high contention, but it's good
enough for our purposes for now because LockFreeRegistry storage operations
are not heavily utilized: we only go through them when a new metaclass is
inserted into our class registry -- and that doesn't happen very often.

Here's the place in LockFreeRegistry.c where we actually call CAS:

    /* Attempt to append the new node onto the end of the linked list.
     * However, if another thread filled the slot since we found it (perhaps
     * while we were allocating that new node), the compare-and-swap will
     * fail.  If that happens, we have to go back and find the new end of the
     * linked list, then try again. */
    if (!Atomic_cas_ptr((void*volatile*)slot, NULL, new_entry)) {
        goto FIND_END_OF_LINKED_LIST;
    }

The algorithm is exactly as described in the text (minus the optional "clean
up" phase):

    repeat
        prepare
        CAS
    until success
    clean up

Marvin Humphrey