You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by Ryan Rawson <ry...@gmail.com> on 2010/04/16 00:00:59 UTC

volatile considered harmful

So there I was, running a unit test.  I step away and hours later it
still had not finished.  It was stuck on this line of the code:

      putThread.done();
>    putThread.join();

So the putThread was not finishing as it was supposed to do.  Well,
what does putThread.done() do:

    public void done() {
      done = true;
      synchronized (this) {
        interrupt();
      }
    }


Looks reasonable, wait, what is this access of a variable shared among
multiple threads outside of a sync block?  Oh wait, its "ok", it's
volatile:

    private volatile boolean done;


And what does the run loop of the thread look like?  Again reasonable:

    public void run() {
      done.set(false);
      int val = 0;
      while (!done.get()) {
       // Do HRegion.put call with some exception handling
      }
    }


Well in theory this all looks reasonable.

But why harmful?

I had changed the Gets to not acquire row locks, thus removing a
synchronization block between the main thread (which did Gets and then
eventually called that code up top).  Two consequences, one was puts
happened much faster (good!), and a negative consequence is sometimes
the test would fail to stop the put thread.  For minutes and hours
after the main thread had signalled the put thread to stop, it still
had not.

I switched the volatile boolean => AtomicBoolean and the problem went away.

In short, the use of volatile is not guaranteed to see an update from
a different thread in any timely manner. The Java Memory Model does
not ensure this.  If you need to rely on passing data between threads
you need to either use a monitor lock (using synchronized keyword
either at the method level or as an explicit block) or use one of the
concurrent utilities classes - such as AtomicLong, AtomicBoolean, or
the excellent read/write locks.

If you wish to increment a counter from multiple threads, doing this:
volatile int count = 0;
// later:
count++;

will not work.  You must use an AtomicInt/Long or use a monitor lock.
This is because the operation "count++" is not atomic, it is actually
3 - get, add, put.

As we code in HBase, we should avoid the use of volatile as much as
possible.  The guarantees it makes are not good enough and can cause a
lot of pain figuring this out months after you originally coded it.

Re: volatile considered harmful

Posted by Joydeep Sarma <js...@gmail.com>.
just to clarify - the volatile in that link referred to C language
'volatile' keyword that happened to be used in the JDK native
implementation of park(), unpark() routines. C language 'volatile'
only prevents compiler optimizations - it doesn't apply memory
barriers after/before load/store.

i remember we had a terrible lost interrupt in hdfs once that was
similar to this problem (as ryan finally described it).

On Thu, Apr 15, 2010 at 3:28 PM, Kannan Muthukkaruppan
<Ka...@facebook.com> wrote:
> Ryan:
>
> Joy had brought up a related issue to attention recently, and fwd'ed this link:
>
> http://blogs.sun.com/dave/entry/a_race_in_locksupport_park
>
> It has to do with a volatile "_counter" variable as well.
>
> From the above writeup:
>
> <<< The problem would only manifest when we were using the -UseMembar optimization that lets us remove fences from certain hot thread state transitions paths that need to coordinate safepoints between mutator threads and the JVM. This feature is enabled by default, but we can turn it off with the -XX:+UseMembar switch, which causes the JVM to emit normal fence instructions in the state transitions paths. (That particular optimization is an example of asymmetric Dekker synchronization). Crucially, the park() path contains such a state transition. In reality the fence emitted by the +UseMembar switch was simply covering up the otherwise latent Parker:: bug. +UseMembar constitutes a work-around. Sensitivity to UseMembar was initially confounding but ultimately a valuable clue.>>>
>
>
> -----Original Message-----
> From: Dhruba Borthakur [mailto:dhruba@gmail.com]
> Sent: Thursday, April 15, 2010 3:24 PM
> To: hbase-dev@hadoop.apache.org
> Subject: Re: volatile considered harmful
>
> I agree. The Java spec clearly mentions that load/store/read/write to
> volatile are atomic
>
> http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28330
>
> <http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28330>but
> definitely not increment and decrement. increment and decrement operations
> need to load *and* store data to a volatile variable and are not atomic.
>
> -dhruba
>
>
> On Thu, Apr 15, 2010 at 3:00 PM, Ryan Rawson <ry...@gmail.com> wrote:
>
>> So there I was, running a unit test.  I step away and hours later it
>> still had not finished.  It was stuck on this line of the code:
>>
>>      putThread.done();
>> >    putThread.join();
>>
>> So the putThread was not finishing as it was supposed to do.  Well,
>> what does putThread.done() do:
>>
>>    public void done() {
>>      done = true;
>>      synchronized (this) {
>>        interrupt();
>>      }
>>    }
>>
>>
>> Looks reasonable, wait, what is this access of a variable shared among
>> multiple threads outside of a sync block?  Oh wait, its "ok", it's
>> volatile:
>>
>>    private volatile boolean done;
>>
>>
>> And what does the run loop of the thread look like?  Again reasonable:
>>
>>    public void run() {
>>      done.set(false);
>>      int val = 0;
>>      while (!done.get()) {
>>       // Do HRegion.put call with some exception handling
>>      }
>>    }
>>
>>
>> Well in theory this all looks reasonable.
>>
>> But why harmful?
>>
>> I had changed the Gets to not acquire row locks, thus removing a
>> synchronization block between the main thread (which did Gets and then
>> eventually called that code up top).  Two consequences, one was puts
>> happened much faster (good!), and a negative consequence is sometimes
>> the test would fail to stop the put thread.  For minutes and hours
>> after the main thread had signalled the put thread to stop, it still
>> had not.
>>
>> I switched the volatile boolean => AtomicBoolean and the problem went away.
>>
>> In short, the use of volatile is not guaranteed to see an update from
>> a different thread in any timely manner. The Java Memory Model does
>> not ensure this.  If you need to rely on passing data between threads
>> you need to either use a monitor lock (using synchronized keyword
>> either at the method level or as an explicit block) or use one of the
>> concurrent utilities classes - such as AtomicLong, AtomicBoolean, or
>> the excellent read/write locks.
>>
>> If you wish to increment a counter from multiple threads, doing this:
>> volatile int count = 0;
>> // later:
>> count++;
>>
>> will not work.  You must use an AtomicInt/Long or use a monitor lock.
>> This is because the operation "count++" is not atomic, it is actually
>> 3 - get, add, put.
>>
>> As we code in HBase, we should avoid the use of volatile as much as
>> possible.  The guarantees it makes are not good enough and can cause a
>> lot of pain figuring this out months after you originally coded it.
>>
>
>
>
> --
> Connect to me at http://www.facebook.com/dhruba
>
>

RE: volatile considered harmful

Posted by Kannan Muthukkaruppan <Ka...@facebook.com>.
Ryan:

Joy had brought up a related issue to attention recently, and fwd'ed this link:

http://blogs.sun.com/dave/entry/a_race_in_locksupport_park

It has to do with a volatile "_counter" variable as well.

>From the above writeup:

<<< The problem would only manifest when we were using the -UseMembar optimization that lets us remove fences from certain hot thread state transitions paths that need to coordinate safepoints between mutator threads and the JVM. This feature is enabled by default, but we can turn it off with the -XX:+UseMembar switch, which causes the JVM to emit normal fence instructions in the state transitions paths. (That particular optimization is an example of asymmetric Dekker synchronization). Crucially, the park() path contains such a state transition. In reality the fence emitted by the +UseMembar switch was simply covering up the otherwise latent Parker:: bug. +UseMembar constitutes a work-around. Sensitivity to UseMembar was initially confounding but ultimately a valuable clue.>>>


-----Original Message-----
From: Dhruba Borthakur [mailto:dhruba@gmail.com]
Sent: Thursday, April 15, 2010 3:24 PM
To: hbase-dev@hadoop.apache.org
Subject: Re: volatile considered harmful

I agree. The Java spec clearly mentions that load/store/read/write to
volatile are atomic

http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28330

<http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28330>but
definitely not increment and decrement. increment and decrement operations
need to load *and* store data to a volatile variable and are not atomic.

-dhruba


On Thu, Apr 15, 2010 at 3:00 PM, Ryan Rawson <ry...@gmail.com> wrote:

> So there I was, running a unit test.  I step away and hours later it
> still had not finished.  It was stuck on this line of the code:
>
>      putThread.done();
> >    putThread.join();
>
> So the putThread was not finishing as it was supposed to do.  Well,
> what does putThread.done() do:
>
>    public void done() {
>      done = true;
>      synchronized (this) {
>        interrupt();
>      }
>    }
>
>
> Looks reasonable, wait, what is this access of a variable shared among
> multiple threads outside of a sync block?  Oh wait, its "ok", it's
> volatile:
>
>    private volatile boolean done;
>
>
> And what does the run loop of the thread look like?  Again reasonable:
>
>    public void run() {
>      done.set(false);
>      int val = 0;
>      while (!done.get()) {
>       // Do HRegion.put call with some exception handling
>      }
>    }
>
>
> Well in theory this all looks reasonable.
>
> But why harmful?
>
> I had changed the Gets to not acquire row locks, thus removing a
> synchronization block between the main thread (which did Gets and then
> eventually called that code up top).  Two consequences, one was puts
> happened much faster (good!), and a negative consequence is sometimes
> the test would fail to stop the put thread.  For minutes and hours
> after the main thread had signalled the put thread to stop, it still
> had not.
>
> I switched the volatile boolean => AtomicBoolean and the problem went away.
>
> In short, the use of volatile is not guaranteed to see an update from
> a different thread in any timely manner. The Java Memory Model does
> not ensure this.  If you need to rely on passing data between threads
> you need to either use a monitor lock (using synchronized keyword
> either at the method level or as an explicit block) or use one of the
> concurrent utilities classes - such as AtomicLong, AtomicBoolean, or
> the excellent read/write locks.
>
> If you wish to increment a counter from multiple threads, doing this:
> volatile int count = 0;
> // later:
> count++;
>
> will not work.  You must use an AtomicInt/Long or use a monitor lock.
> This is because the operation "count++" is not atomic, it is actually
> 3 - get, add, put.
>
> As we code in HBase, we should avoid the use of volatile as much as
> possible.  The guarantees it makes are not good enough and can cause a
> lot of pain figuring this out months after you originally coded it.
>



--
Connect to me at http://www.facebook.com/dhruba


Re: volatile considered harmful

Posted by Dhruba Borthakur <dh...@gmail.com>.
I agree. The Java spec clearly mentions that load/store/read/write to
volatile are atomic

http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28330

<http://java.sun.com/docs/books/jls/second_edition/html/memory.doc.html#28330>but
definitely not increment and decrement. increment and decrement operations
need to load *and* store data to a volatile variable and are not atomic.

-dhruba


On Thu, Apr 15, 2010 at 3:00 PM, Ryan Rawson <ry...@gmail.com> wrote:

> So there I was, running a unit test.  I step away and hours later it
> still had not finished.  It was stuck on this line of the code:
>
>      putThread.done();
> >    putThread.join();
>
> So the putThread was not finishing as it was supposed to do.  Well,
> what does putThread.done() do:
>
>    public void done() {
>      done = true;
>      synchronized (this) {
>        interrupt();
>      }
>    }
>
>
> Looks reasonable, wait, what is this access of a variable shared among
> multiple threads outside of a sync block?  Oh wait, its "ok", it's
> volatile:
>
>    private volatile boolean done;
>
>
> And what does the run loop of the thread look like?  Again reasonable:
>
>    public void run() {
>      done.set(false);
>      int val = 0;
>      while (!done.get()) {
>       // Do HRegion.put call with some exception handling
>      }
>    }
>
>
> Well in theory this all looks reasonable.
>
> But why harmful?
>
> I had changed the Gets to not acquire row locks, thus removing a
> synchronization block between the main thread (which did Gets and then
> eventually called that code up top).  Two consequences, one was puts
> happened much faster (good!), and a negative consequence is sometimes
> the test would fail to stop the put thread.  For minutes and hours
> after the main thread had signalled the put thread to stop, it still
> had not.
>
> I switched the volatile boolean => AtomicBoolean and the problem went away.
>
> In short, the use of volatile is not guaranteed to see an update from
> a different thread in any timely manner. The Java Memory Model does
> not ensure this.  If you need to rely on passing data between threads
> you need to either use a monitor lock (using synchronized keyword
> either at the method level or as an explicit block) or use one of the
> concurrent utilities classes - such as AtomicLong, AtomicBoolean, or
> the excellent read/write locks.
>
> If you wish to increment a counter from multiple threads, doing this:
> volatile int count = 0;
> // later:
> count++;
>
> will not work.  You must use an AtomicInt/Long or use a monitor lock.
> This is because the operation "count++" is not atomic, it is actually
> 3 - get, add, put.
>
> As we code in HBase, we should avoid the use of volatile as much as
> possible.  The guarantees it makes are not good enough and can cause a
> lot of pain figuring this out months after you originally coded it.
>



-- 
Connect to me at http://www.facebook.com/dhruba

Re: Re: volatile considered harmful

Posted by Stack <st...@duboce.net>.
On Fri, Apr 16, 2010 at 6:00 PM, Ryan Rawson <ry...@gmail.com> wrote:
> Ok mystery solved, the Put thread was stuck in a wait() loop inside
> HRegion which catches InterruptedException, eats it then goes back to
> waiting for a condition that will never clear (memstore too big).
>

Sounds like we could run into this in other than unit test context?
You got a patch for us?  (Smile)
St.Ack

Re: Re: volatile considered harmful

Posted by Ryan Rawson <ry...@gmail.com>.
Ok mystery solved, the Put thread was stuck in a wait() loop inside
HRegion which catches InterruptedException, eats it then goes back to
waiting for a condition that will never clear (memstore too big).

Attempted solution is to bring up the done flag, flush the cache, then
wait for the put thread to join.

The problem manifests itself on Linux with JDK 1.6.0_19.  In the end
it wasnt a JDK bug so.

Having said that, I think it's good for people to read this:
http://www.cs.umd.edu/users/pugh/java/memoryModel/jsr-133-faq.html

Which explains how volatile works.  I had some mistaken impressions
about how it works, and more importantly I was confused about how
monitor blocks and field read/writes work.  In short: if you use a
monitor block , you dont need volatile.



-ryan

On Fri, Apr 16, 2010 at 12:25 AM, Jean-Daniel Cryans
<jd...@apache.org> wrote:
> I think I've seen stuff like that too when writing the replication
> code, and it only seems to happen on OSX.
>
> @Ryan, did you try running the same test a couple of times on a linux box?
>
> J-D
>
> On Fri, Apr 16, 2010 at 2:29 AM, Ryan Rawson <ry...@gmail.com> wrote:
>> At least it's in test code right?
>>
>> *sigh*
>>
>> On Thu, Apr 15, 2010 at 5:27 PM, Todd Lipcon <to...@cloudera.com> wrote:
>>> On Thu, Apr 15, 2010 at 5:26 PM, Ryan Rawson <ry...@gmail.com> wrote:
>>>
>>>> Looking at the code to AtomicBoolean it uses an atomic int to
>>>> accomplish it's task on OSX.
>>>>
>>>> So just what the heck is going on here?
>>>>
>>>>
>>> I think you're fooling yourself, and the bug isn't gone, just hiding :)
>>>
>>> -Todd
>>>
>>>
>>>>  On Thu, Apr 15, 2010 at 5:24 PM, Ryan Rawson <ry...@gmail.com> wrote:
>>>> > I doubt it's case #2, there is a lot of complex code that runs between
>>>> > putThread.start() and putThread.done().
>>>> >
>>>> > In terms of JVMs, I'm using Java 6 on OSX x64.  HBase effectively
>>>> > requires Java 6 (and if we dont explicitly require it, we should) and
>>>> > it also specifically cannot use certain broken JVM pushes (eg:
>>>> > jdk6u18) so much that we are adding in code to prevent ourselves from
>>>> > running on it and warning the user.
>>>> >
>>>> > But just for a moment, I think it's inappropriate for the JVM to be
>>>> > specifying caching or non-caching of variables in the systems cache.
>>>> > That is way too much abstraction leakage up to the language level.
>>>> > Most SMP systems have cache coherency control that allow you to read
>>>> > from cache yet get invalidations when other processors (on other dies)
>>>> > write to that memory entry.
>>>> >
>>>> > But nevertheless, the problem no longer exists with AtomicBoolean :-)
>>>> >
>>>> >
>>>> >
>>>> > On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote:
>>>> >> On -9/01/37 05:59, Ryan Rawson wrote:
>>>> >>>
>>>> >>> So the previous use of volatile for a boolean seems like a textbook
>>>> >>> case, but the situation i discovered was pretty clear cut. I have no
>>>> >>> other explanation than a highly delayed volatile read (which are
>>>> >>> allowed).
>>>> >>
>>>> >> I don't see that they are allowed, actually.
>>>> >>
>>>> >> Section 17.4.5 of the JLS says that:
>>>> >>
>>>> >>> * An unlock on a monitor happens-before every subsequent lock on that
>>>> >>> monitor.
>>>> >>> * A write to a volatile field (§8.3.1.4) happens-before every
>>>> subsequent
>>>> >>> read of that field.
>>>> >>
>>>> >> IOW, the situations (unlock-then-lock) and (volatile-write then
>>>> >> volatile-read) have the same visibility guarantees.
>>>> >>
>>>> >> Section 8.3.1.4 says:
>>>> >>
>>>> >>> A field may be declared volatile, in which case the Java memory model
>>>> >>> (§17)  ensures that all threads see a consistent value for the
>>>> variable.
>>>> >>
>>>> >> In your case, the thread calling done() is not seeing the same value as
>>>> the
>>>> >> thread calling run(), which is not consistent.
>>>> >>
>>>> >> And for good measure Java Concurrency in Practice makes it much more
>>>> >> explicit (emphasis mine):
>>>> >>
>>>> >>> Volatile variables are not cached in registers or in caches where they
>>>> are
>>>> >>> hidden from other processors, so *a read of a volatile variable always
>>>> >>> returns the most recent write by any thread*.
>>>> >>
>>>> >> And finally, on changing to an AtomicBoolean fixing the problem, JCIP
>>>> says:
>>>> >>
>>>> >>> Atomic variables offer the same memory semantics as volatile variables
>>>> >>
>>>> >> So this doesn't really make sense either.
>>>> >>
>>>> >> All that's a long way of saying that the only ways I can see your
>>>> situation
>>>> >> happening are:
>>>> >>
>>>> >> * pre-Java-1.5 (and hence pre-JSR-133) JVM
>>>> >> * JVM with a bug
>>>> >> * ordering is not as you expect, i.e. the actual chronological order is
>>>> not:
>>>> >>
>>>> >>    THREAD 1                 THREAD 2
>>>> >>    spawn new thread
>>>> >>                             run()
>>>> >>    done()
>>>> >>    join()
>>>> >>
>>>> >> but rather:
>>>> >>
>>>> >>    THREAD 1                 THREAD 2
>>>> >>    spawn new thread
>>>> >>    done()
>>>> >>                             run()
>>>> >>    join()
>>>> >>
>>>> >> in which case the set of run to false at the start of run() overwrites
>>>> the
>>>> >> set of it to true at the start of done(), and you're in for infinite
>>>> loop
>>>> >> fun.
>>>> >>
>>>> >> Cheers,
>>>> >>
>>>> >> Paul
>>>> >>
>>>> >
>>>>
>>>
>>>
>>>
>>> --
>>> Todd Lipcon
>>> Software Engineer, Cloudera
>>>
>>
>

Re: Re: volatile considered harmful

Posted by Jean-Daniel Cryans <jd...@apache.org>.
I think I've seen stuff like that too when writing the replication
code, and it only seems to happen on OSX.

@Ryan, did you try running the same test a couple of times on a linux box?

J-D

On Fri, Apr 16, 2010 at 2:29 AM, Ryan Rawson <ry...@gmail.com> wrote:
> At least it's in test code right?
>
> *sigh*
>
> On Thu, Apr 15, 2010 at 5:27 PM, Todd Lipcon <to...@cloudera.com> wrote:
>> On Thu, Apr 15, 2010 at 5:26 PM, Ryan Rawson <ry...@gmail.com> wrote:
>>
>>> Looking at the code to AtomicBoolean it uses an atomic int to
>>> accomplish it's task on OSX.
>>>
>>> So just what the heck is going on here?
>>>
>>>
>> I think you're fooling yourself, and the bug isn't gone, just hiding :)
>>
>> -Todd
>>
>>
>>>  On Thu, Apr 15, 2010 at 5:24 PM, Ryan Rawson <ry...@gmail.com> wrote:
>>> > I doubt it's case #2, there is a lot of complex code that runs between
>>> > putThread.start() and putThread.done().
>>> >
>>> > In terms of JVMs, I'm using Java 6 on OSX x64.  HBase effectively
>>> > requires Java 6 (and if we dont explicitly require it, we should) and
>>> > it also specifically cannot use certain broken JVM pushes (eg:
>>> > jdk6u18) so much that we are adding in code to prevent ourselves from
>>> > running on it and warning the user.
>>> >
>>> > But just for a moment, I think it's inappropriate for the JVM to be
>>> > specifying caching or non-caching of variables in the systems cache.
>>> > That is way too much abstraction leakage up to the language level.
>>> > Most SMP systems have cache coherency control that allow you to read
>>> > from cache yet get invalidations when other processors (on other dies)
>>> > write to that memory entry.
>>> >
>>> > But nevertheless, the problem no longer exists with AtomicBoolean :-)
>>> >
>>> >
>>> >
>>> > On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote:
>>> >> On -9/01/37 05:59, Ryan Rawson wrote:
>>> >>>
>>> >>> So the previous use of volatile for a boolean seems like a textbook
>>> >>> case, but the situation i discovered was pretty clear cut. I have no
>>> >>> other explanation than a highly delayed volatile read (which are
>>> >>> allowed).
>>> >>
>>> >> I don't see that they are allowed, actually.
>>> >>
>>> >> Section 17.4.5 of the JLS says that:
>>> >>
>>> >>> * An unlock on a monitor happens-before every subsequent lock on that
>>> >>> monitor.
>>> >>> * A write to a volatile field (§8.3.1.4) happens-before every
>>> subsequent
>>> >>> read of that field.
>>> >>
>>> >> IOW, the situations (unlock-then-lock) and (volatile-write then
>>> >> volatile-read) have the same visibility guarantees.
>>> >>
>>> >> Section 8.3.1.4 says:
>>> >>
>>> >>> A field may be declared volatile, in which case the Java memory model
>>> >>> (§17)  ensures that all threads see a consistent value for the
>>> variable.
>>> >>
>>> >> In your case, the thread calling done() is not seeing the same value as
>>> the
>>> >> thread calling run(), which is not consistent.
>>> >>
>>> >> And for good measure Java Concurrency in Practice makes it much more
>>> >> explicit (emphasis mine):
>>> >>
>>> >>> Volatile variables are not cached in registers or in caches where they
>>> are
>>> >>> hidden from other processors, so *a read of a volatile variable always
>>> >>> returns the most recent write by any thread*.
>>> >>
>>> >> And finally, on changing to an AtomicBoolean fixing the problem, JCIP
>>> says:
>>> >>
>>> >>> Atomic variables offer the same memory semantics as volatile variables
>>> >>
>>> >> So this doesn't really make sense either.
>>> >>
>>> >> All that's a long way of saying that the only ways I can see your
>>> situation
>>> >> happening are:
>>> >>
>>> >> * pre-Java-1.5 (and hence pre-JSR-133) JVM
>>> >> * JVM with a bug
>>> >> * ordering is not as you expect, i.e. the actual chronological order is
>>> not:
>>> >>
>>> >>    THREAD 1                 THREAD 2
>>> >>    spawn new thread
>>> >>                             run()
>>> >>    done()
>>> >>    join()
>>> >>
>>> >> but rather:
>>> >>
>>> >>    THREAD 1                 THREAD 2
>>> >>    spawn new thread
>>> >>    done()
>>> >>                             run()
>>> >>    join()
>>> >>
>>> >> in which case the set of run to false at the start of run() overwrites
>>> the
>>> >> set of it to true at the start of done(), and you're in for infinite
>>> loop
>>> >> fun.
>>> >>
>>> >> Cheers,
>>> >>
>>> >> Paul
>>> >>
>>> >
>>>
>>
>>
>>
>> --
>> Todd Lipcon
>> Software Engineer, Cloudera
>>
>

Re: Re: volatile considered harmful

Posted by Ryan Rawson <ry...@gmail.com>.
At least it's in test code right?

*sigh*

On Thu, Apr 15, 2010 at 5:27 PM, Todd Lipcon <to...@cloudera.com> wrote:
> On Thu, Apr 15, 2010 at 5:26 PM, Ryan Rawson <ry...@gmail.com> wrote:
>
>> Looking at the code to AtomicBoolean it uses an atomic int to
>> accomplish it's task on OSX.
>>
>> So just what the heck is going on here?
>>
>>
> I think you're fooling yourself, and the bug isn't gone, just hiding :)
>
> -Todd
>
>
>>  On Thu, Apr 15, 2010 at 5:24 PM, Ryan Rawson <ry...@gmail.com> wrote:
>> > I doubt it's case #2, there is a lot of complex code that runs between
>> > putThread.start() and putThread.done().
>> >
>> > In terms of JVMs, I'm using Java 6 on OSX x64.  HBase effectively
>> > requires Java 6 (and if we dont explicitly require it, we should) and
>> > it also specifically cannot use certain broken JVM pushes (eg:
>> > jdk6u18) so much that we are adding in code to prevent ourselves from
>> > running on it and warning the user.
>> >
>> > But just for a moment, I think it's inappropriate for the JVM to be
>> > specifying caching or non-caching of variables in the systems cache.
>> > That is way too much abstraction leakage up to the language level.
>> > Most SMP systems have cache coherency control that allow you to read
>> > from cache yet get invalidations when other processors (on other dies)
>> > write to that memory entry.
>> >
>> > But nevertheless, the problem no longer exists with AtomicBoolean :-)
>> >
>> >
>> >
>> > On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote:
>> >> On -9/01/37 05:59, Ryan Rawson wrote:
>> >>>
>> >>> So the previous use of volatile for a boolean seems like a textbook
>> >>> case, but the situation i discovered was pretty clear cut. I have no
>> >>> other explanation than a highly delayed volatile read (which are
>> >>> allowed).
>> >>
>> >> I don't see that they are allowed, actually.
>> >>
>> >> Section 17.4.5 of the JLS says that:
>> >>
>> >>> * An unlock on a monitor happens-before every subsequent lock on that
>> >>> monitor.
>> >>> * A write to a volatile field (§8.3.1.4) happens-before every
>> subsequent
>> >>> read of that field.
>> >>
>> >> IOW, the situations (unlock-then-lock) and (volatile-write then
>> >> volatile-read) have the same visibility guarantees.
>> >>
>> >> Section 8.3.1.4 says:
>> >>
>> >>> A field may be declared volatile, in which case the Java memory model
>> >>> (§17)  ensures that all threads see a consistent value for the
>> variable.
>> >>
>> >> In your case, the thread calling done() is not seeing the same value as
>> the
>> >> thread calling run(), which is not consistent.
>> >>
>> >> And for good measure Java Concurrency in Practice makes it much more
>> >> explicit (emphasis mine):
>> >>
>> >>> Volatile variables are not cached in registers or in caches where they
>> are
>> >>> hidden from other processors, so *a read of a volatile variable always
>> >>> returns the most recent write by any thread*.
>> >>
>> >> And finally, on changing to an AtomicBoolean fixing the problem, JCIP
>> says:
>> >>
>> >>> Atomic variables offer the same memory semantics as volatile variables
>> >>
>> >> So this doesn't really make sense either.
>> >>
>> >> All that's a long way of saying that the only ways I can see your
>> situation
>> >> happening are:
>> >>
>> >> * pre-Java-1.5 (and hence pre-JSR-133) JVM
>> >> * JVM with a bug
>> >> * ordering is not as you expect, i.e. the actual chronological order is
>> not:
>> >>
>> >>    THREAD 1                 THREAD 2
>> >>    spawn new thread
>> >>                             run()
>> >>    done()
>> >>    join()
>> >>
>> >> but rather:
>> >>
>> >>    THREAD 1                 THREAD 2
>> >>    spawn new thread
>> >>    done()
>> >>                             run()
>> >>    join()
>> >>
>> >> in which case the set of run to false at the start of run() overwrites
>> the
>> >> set of it to true at the start of done(), and you're in for infinite
>> loop
>> >> fun.
>> >>
>> >> Cheers,
>> >>
>> >> Paul
>> >>
>> >
>>
>
>
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: Re: volatile considered harmful

Posted by Todd Lipcon <to...@cloudera.com>.
On Thu, Apr 15, 2010 at 5:26 PM, Ryan Rawson <ry...@gmail.com> wrote:

> Looking at the code to AtomicBoolean it uses an atomic int to
> accomplish it's task on OSX.
>
> So just what the heck is going on here?
>
>
I think you're fooling yourself, and the bug isn't gone, just hiding :)

-Todd


>  On Thu, Apr 15, 2010 at 5:24 PM, Ryan Rawson <ry...@gmail.com> wrote:
> > I doubt it's case #2, there is a lot of complex code that runs between
> > putThread.start() and putThread.done().
> >
> > In terms of JVMs, I'm using Java 6 on OSX x64.  HBase effectively
> > requires Java 6 (and if we dont explicitly require it, we should) and
> > it also specifically cannot use certain broken JVM pushes (eg:
> > jdk6u18) so much that we are adding in code to prevent ourselves from
> > running on it and warning the user.
> >
> > But just for a moment, I think it's inappropriate for the JVM to be
> > specifying caching or non-caching of variables in the systems cache.
> > That is way too much abstraction leakage up to the language level.
> > Most SMP systems have cache coherency control that allow you to read
> > from cache yet get invalidations when other processors (on other dies)
> > write to that memory entry.
> >
> > But nevertheless, the problem no longer exists with AtomicBoolean :-)
> >
> >
> >
> > On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote:
> >> On -9/01/37 05:59, Ryan Rawson wrote:
> >>>
> >>> So the previous use of volatile for a boolean seems like a textbook
> >>> case, but the situation i discovered was pretty clear cut. I have no
> >>> other explanation than a highly delayed volatile read (which are
> >>> allowed).
> >>
> >> I don't see that they are allowed, actually.
> >>
> >> Section 17.4.5 of the JLS says that:
> >>
> >>> * An unlock on a monitor happens-before every subsequent lock on that
> >>> monitor.
> >>> * A write to a volatile field (§8.3.1.4) happens-before every
> subsequent
> >>> read of that field.
> >>
> >> IOW, the situations (unlock-then-lock) and (volatile-write then
> >> volatile-read) have the same visibility guarantees.
> >>
> >> Section 8.3.1.4 says:
> >>
> >>> A field may be declared volatile, in which case the Java memory model
> >>> (§17)  ensures that all threads see a consistent value for the
> variable.
> >>
> >> In your case, the thread calling done() is not seeing the same value as
> the
> >> thread calling run(), which is not consistent.
> >>
> >> And for good measure Java Concurrency in Practice makes it much more
> >> explicit (emphasis mine):
> >>
> >>> Volatile variables are not cached in registers or in caches where they
> are
> >>> hidden from other processors, so *a read of a volatile variable always
> >>> returns the most recent write by any thread*.
> >>
> >> And finally, on changing to an AtomicBoolean fixing the problem, JCIP
> says:
> >>
> >>> Atomic variables offer the same memory semantics as volatile variables
> >>
> >> So this doesn't really make sense either.
> >>
> >> All that's a long way of saying that the only ways I can see your
> situation
> >> happening are:
> >>
> >> * pre-Java-1.5 (and hence pre-JSR-133) JVM
> >> * JVM with a bug
> >> * ordering is not as you expect, i.e. the actual chronological order is
> not:
> >>
> >>    THREAD 1                 THREAD 2
> >>    spawn new thread
> >>                             run()
> >>    done()
> >>    join()
> >>
> >> but rather:
> >>
> >>    THREAD 1                 THREAD 2
> >>    spawn new thread
> >>    done()
> >>                             run()
> >>    join()
> >>
> >> in which case the set of run to false at the start of run() overwrites
> the
> >> set of it to true at the start of done(), and you're in for infinite
> loop
> >> fun.
> >>
> >> Cheers,
> >>
> >> Paul
> >>
> >
>



-- 
Todd Lipcon
Software Engineer, Cloudera

Re: Re: volatile considered harmful

Posted by Ryan Rawson <ry...@gmail.com>.
Looking at the code to AtomicBoolean it uses an atomic int to
accomplish it's task on OSX.

So just what the heck is going on here?

On Thu, Apr 15, 2010 at 5:24 PM, Ryan Rawson <ry...@gmail.com> wrote:
> I doubt it's case #2, there is a lot of complex code that runs between
> putThread.start() and putThread.done().
>
> In terms of JVMs, I'm using Java 6 on OSX x64.  HBase effectively
> requires Java 6 (and if we dont explicitly require it, we should) and
> it also specifically cannot use certain broken JVM pushes (eg:
> jdk6u18) so much that we are adding in code to prevent ourselves from
> running on it and warning the user.
>
> But just for a moment, I think it's inappropriate for the JVM to be
> specifying caching or non-caching of variables in the systems cache.
> That is way too much abstraction leakage up to the language level.
> Most SMP systems have cache coherency control that allow you to read
> from cache yet get invalidations when other processors (on other dies)
> write to that memory entry.
>
> But nevertheless, the problem no longer exists with AtomicBoolean :-)
>
>
>
> On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote:
>> On -9/01/37 05:59, Ryan Rawson wrote:
>>>
>>> So the previous use of volatile for a boolean seems like a textbook
>>> case, but the situation i discovered was pretty clear cut. I have no
>>> other explanation than a highly delayed volatile read (which are
>>> allowed).
>>
>> I don't see that they are allowed, actually.
>>
>> Section 17.4.5 of the JLS says that:
>>
>>> * An unlock on a monitor happens-before every subsequent lock on that
>>> monitor.
>>> * A write to a volatile field (§8.3.1.4) happens-before every subsequent
>>> read of that field.
>>
>> IOW, the situations (unlock-then-lock) and (volatile-write then
>> volatile-read) have the same visibility guarantees.
>>
>> Section 8.3.1.4 says:
>>
>>> A field may be declared volatile, in which case the Java memory model
>>> (§17)  ensures that all threads see a consistent value for the variable.
>>
>> In your case, the thread calling done() is not seeing the same value as the
>> thread calling run(), which is not consistent.
>>
>> And for good measure Java Concurrency in Practice makes it much more
>> explicit (emphasis mine):
>>
>>> Volatile variables are not cached in registers or in caches where they are
>>> hidden from other processors, so *a read of a volatile variable always
>>> returns the most recent write by any thread*.
>>
>> And finally, on changing to an AtomicBoolean fixing the problem, JCIP says:
>>
>>> Atomic variables offer the same memory semantics as volatile variables
>>
>> So this doesn't really make sense either.
>>
>> All that's a long way of saying that the only ways I can see your situation
>> happening are:
>>
>> * pre-Java-1.5 (and hence pre-JSR-133) JVM
>> * JVM with a bug
>> * ordering is not as you expect, i.e. the actual chronological order is not:
>>
>>    THREAD 1                 THREAD 2
>>    spawn new thread
>>                             run()
>>    done()
>>    join()
>>
>> but rather:
>>
>>    THREAD 1                 THREAD 2
>>    spawn new thread
>>    done()
>>                             run()
>>    join()
>>
>> in which case the set of run to false at the start of run() overwrites the
>> set of it to true at the start of done(), and you're in for infinite loop
>> fun.
>>
>> Cheers,
>>
>> Paul
>>
>

Re: Re: volatile considered harmful

Posted by Ryan Rawson <ry...@gmail.com>.
I doubt it's case #2, there is a lot of complex code that runs between
putThread.start() and putThread.done().

In terms of JVMs, I'm using Java 6 on OSX x64.  HBase effectively
requires Java 6 (and if we dont explicitly require it, we should) and
it also specifically cannot use certain broken JVM pushes (eg:
jdk6u18) so much that we are adding in code to prevent ourselves from
running on it and warning the user.

But just for a moment, I think it's inappropriate for the JVM to be
specifying caching or non-caching of variables in the systems cache.
That is way too much abstraction leakage up to the language level.
Most SMP systems have cache coherency control that allow you to read
from cache yet get invalidations when other processors (on other dies)
write to that memory entry.

But nevertheless, the problem no longer exists with AtomicBoolean :-)



On Thu, Apr 15, 2010 at 5:05 PM, Paul Cowan <co...@aconex.com> wrote:
> On -9/01/37 05:59, Ryan Rawson wrote:
>>
>> So the previous use of volatile for a boolean seems like a textbook
>> case, but the situation i discovered was pretty clear cut. I have no
>> other explanation than a highly delayed volatile read (which are
>> allowed).
>
> I don't see that they are allowed, actually.
>
> Section 17.4.5 of the JLS says that:
>
>> * An unlock on a monitor happens-before every subsequent lock on that
>> monitor.
>> * A write to a volatile field (§8.3.1.4) happens-before every subsequent
>> read of that field.
>
> IOW, the situations (unlock-then-lock) and (volatile-write then
> volatile-read) have the same visibility guarantees.
>
> Section 8.3.1.4 says:
>
>> A field may be declared volatile, in which case the Java memory model
>> (§17)  ensures that all threads see a consistent value for the variable.
>
> In your case, the thread calling done() is not seeing the same value as the
> thread calling run(), which is not consistent.
>
> And for good measure Java Concurrency in Practice makes it much more
> explicit (emphasis mine):
>
>> Volatile variables are not cached in registers or in caches where they are
>> hidden from other processors, so *a read of a volatile variable always
>> returns the most recent write by any thread*.
>
> And finally, on changing to an AtomicBoolean fixing the problem, JCIP says:
>
>> Atomic variables offer the same memory semantics as volatile variables
>
> So this doesn't really make sense either.
>
> All that's a long way of saying that the only ways I can see your situation
> happening are:
>
> * pre-Java-1.5 (and hence pre-JSR-133) JVM
> * JVM with a bug
> * ordering is not as you expect, i.e. the actual chronological order is not:
>
>    THREAD 1                 THREAD 2
>    spawn new thread
>                             run()
>    done()
>    join()
>
> but rather:
>
>    THREAD 1                 THREAD 2
>    spawn new thread
>    done()
>                             run()
>    join()
>
> in which case the set of run to false at the start of run() overwrites the
> set of it to true at the start of done(), and you're in for infinite loop
> fun.
>
> Cheers,
>
> Paul
>

Re: Re: volatile considered harmful

Posted by Paul Cowan <co...@aconex.com>.
On -9/01/37 05:59, Ryan Rawson wrote:
> So the previous use of volatile for a boolean seems like a textbook
> case, but the situation i discovered was pretty clear cut. I have no
> other explanation than a highly delayed volatile read (which are
> allowed).

I don't see that they are allowed, actually.

Section 17.4.5 of the JLS says that:

> * An unlock on a monitor happens-before every subsequent lock on that monitor.
> * A write to a volatile field (§8.3.1.4) happens-before every subsequent read of that field.

IOW, the situations (unlock-then-lock) and (volatile-write then 
volatile-read) have the same visibility guarantees.

Section 8.3.1.4 says:

> A field may be declared volatile, in which case the Java memory model (§17)  ensures that all threads see a consistent value for the variable.

In your case, the thread calling done() is not seeing the same value as 
the thread calling run(), which is not consistent.

And for good measure Java Concurrency in Practice makes it much more 
explicit (emphasis mine):

> Volatile variables are not cached in registers or in caches where they are hidden from other processors, so *a read of a volatile variable always returns the most recent write by any thread*.

And finally, on changing to an AtomicBoolean fixing the problem, JCIP says:

> Atomic variables offer the same memory semantics as volatile variables

So this doesn't really make sense either.

All that's a long way of saying that the only ways I can see your 
situation happening are:

* pre-Java-1.5 (and hence pre-JSR-133) JVM
* JVM with a bug
* ordering is not as you expect, i.e. the actual chronological order is not:

     THREAD 1                 THREAD 2
     spawn new thread
                              run()
     done()
     join()

but rather:

     THREAD 1                 THREAD 2
     spawn new thread
     done()
                              run()
     join()

in which case the set of run to false at the start of run() overwrites 
the set of it to true at the start of done(), and you're in for infinite 
loop fun.

Cheers,

Paul

Re: volatile considered harmful

Posted by Ryan Rawson <ry...@gmail.com>.
So the previous use of volatile for a boolean seems like a textbook
case, but the situation i discovered was pretty clear cut. I have no
other explanation than a highly delayed volatile read (which are
allowed).

-ryan

On Thu, Apr 15, 2010 at 3:34 PM, Todd Lipcon <to...@cloudera.com> wrote:
> On Thu, Apr 15, 2010 at 3:00 PM, Ryan Rawson <ry...@gmail.com> wrote:
>
>>
>> In short, the use of volatile is not guaranteed to see an update from
>> a different thread in any timely manner. The Java Memory Model does
>> not ensure this.
>
>
> Are you sure?
> http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile seems
> to indicate it does, and I've always believed it does as well.
>
> The incrementing thing is obvious, though :)
>
> -Todd
>
> --
> Todd Lipcon
> Software Engineer, Cloudera
>

Re: volatile considered harmful

Posted by Todd Lipcon <to...@cloudera.com>.
On Thu, Apr 15, 2010 at 3:00 PM, Ryan Rawson <ry...@gmail.com> wrote:

>
> In short, the use of volatile is not guaranteed to see an update from
> a different thread in any timely manner. The Java Memory Model does
> not ensure this.


Are you sure?
http://www.cs.umd.edu/~pugh/java/memoryModel/jsr-133-faq.html#volatile seems
to indicate it does, and I've always believed it does as well.

The incrementing thing is obvious, though :)

-Todd

-- 
Todd Lipcon
Software Engineer, Cloudera