Posted to dev@harmony.apache.org by Weldon Washburn <we...@gmail.com> on 2007/02/21 07:03:13 UTC

[drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

It seems Dekker's algorithm is not expected to work on SPARC or IA32 SMP
boxes unless memory fences are used.  DekkerTest.java in Harmony-2986 does
not contain memory fences.  The volatile keyword guarantees the compiler
will write a given variable to memory.  However, the HW may actually have a
write buffer and allow reads to pass writes.  As far as I know, the Java
language does not provide a means to invoke a memory fence.  Thus there is
no way to fix up DekkerTest.java.  I may be misunderstanding something
here.  Does anyone have a comment?
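
For reference, here is a minimal sketch of the kind of two-thread Dekker lock
under discussion. It is not the DekkerTest.java attached to Harmony-2986: the
class and method names are hypothetical, and it uses volatile boolean/int
fields rather than the test's long variables. The store to a thread's own flag
followed by the load of the other thread's flag in lock() is the sequence that
breaks if a read is allowed to pass a write. The sketch only gives mutual
exclusion on a VM that orders volatile accesses as the Java 5 memory model
(JSR-133) requires.

```java
// Hypothetical sketch of Dekker's algorithm for two threads (ids 0 and 1).
// Not the test from HARMONY-2986; all names here are illustrative only.
final class DekkerSketch {
    // One "I want to enter" flag per thread, plus a tie-breaking turn variable.
    private volatile boolean wants0, wants1;
    private volatile int turn = 0;
    private int counter = 0;   // plain field, protected only by the Dekker lock

    private boolean wants(int id) { return id == 0 ? wants0 : wants1; }

    private void setWants(int id, boolean v) {
        if (id == 0) wants0 = v; else wants1 = v;
    }

    void lock(int id) {
        int other = 1 - id;
        setWants(id, true);                 // store to my own flag ...
        while (wants(other)) {              // ... then load the other thread's flag
            if (turn != id) {
                setWants(id, false);        // back off while it is the other's turn
                while (turn != id) { Thread.yield(); }
                setWants(id, true);
            } else {
                Thread.yield();             // my turn: wait for the other to back off
            }
        }
    }

    void unlock(int id) {
        turn = 1 - id;                      // hand the tie-break to the other thread
        setWants(id, false);
    }

    // Two threads each increment the shared counter 'iterations' times.
    static int run(int iterations) {
        final DekkerSketch d = new DekkerSketch();
        Thread[] ts = new Thread[2];
        for (int i = 0; i < 2; i++) {
            final int id = i;
            ts[i] = new Thread(() -> {
                for (int n = 0; n < iterations; n++) {
                    d.lock(id);
                    d.counter++;
                    d.unlock(id);
                }
            });
            ts[i].start();
        }
        try {
            for (Thread t : ts) t.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return d.counter;
    }
}
```

If mutual exclusion holds, DekkerSketch.run(n) returns 2 * n; a smaller value
means increments were lost inside the critical section.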

An excellent description of the issues involved is in a David Dice
presentation at:

http://blogs.sun.com/dave/resource/synchronization-public2.pdf

-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Weldon Washburn <we...@gmail.com>.
On 3/1/07, Rana Dasgupta <rd...@gmail.com> wrote:
>
> What is the purpose of the Dekker test in 2986? Is it intended to test
> correct implementation of long volatile? Other than this, I am not sure
> why the test exists.



For now, I linked JIRA H2986 so that it is "a part of" JIRA H2092.  We also
need to write an "int" version of H2092.


Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Rana Dasgupta <rd...@gmail.com>.
What is the purpose of the Dekker test in 2986? Is it intended to test
correct implementation of long volatile? Other than this, I am not sure why
the test exists.
If this test is to work, 64-bit volatile loads/stores will have to be atomic,
and there is no way around Weldon's locked implementation of volatile long
on x86. SSE2 supports 64-bit aligned moves, and in effect memory moves that
don't split cache lines should be atomic because they are a single bus
transaction. But x86 does not provide an atomicity guarantee on the 64-bit
moves. Worse, no write-ordering guarantees are provided with SSE2
instructions (e.g. movntps, movaps, etc.), which is a much bigger problem,
unless we want to start generating all the sfence, lfence, etc. instructions
as well.
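
As an illustration of what "atomic" means for a volatile long, here is a
minimal sketch of a tearing check. The class and method names are hypothetical
and this is not one of the Harmony tests. A writer alternates between two bit
patterns whose 32-bit halves differ, and a reader counts any value that is
neither of them. The Java Language Specification (section 17.7) requires reads
and writes of volatile long values to be atomic, so a conforming VM should
report zero torn reads however it implements them.

```java
// Hypothetical sketch: detect torn (half-updated) reads of a volatile long.
final class LongTearingCheck {
    private volatile long value;     // the 64-bit variable under test
    private volatile boolean done;   // set by the writer when it has finished

    // Returns the number of reads that saw a mix of two different writes.
    static long countTornReads(int writes) {
        final LongTearingCheck c = new LongTearingCheck();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < writes; i++) {
                // Alternate between all-ones and all-zeros in both halves.
                c.value = (i & 1) == 0 ? -1L : 0L;
            }
            c.done = true;
        });
        writer.start();

        long torn = 0;
        while (!c.done) {
            long v = c.value;
            // Anything other than the two written patterns is a torn read.
            if (v != 0L && v != -1L) torn++;
        }
        try {
            writer.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return torn;
    }
}
```

A VM that implements volatile long as two unguarded 32-bit moves could return
a non-zero count here, for example after reading 0x00000000FFFFFFFF.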

Solving the volatile problem does not eliminate the weakness due to the lack
of fences in Java.

E.g., the following is a perfectly reasonable usage:

class SingleClass {
    volatile static SingleClass singleinst;
    public String val;
    public SingleClass() { val = "initial"; }

    public static SingleClass fetch() {
        if (singleinst == null) {    //  check instead of lock
            synchronized(SingleClass.class) {
                if (singleinst == null)      // another check for the race
                    singleinst = new SingleClass();
            }
        }
        return singleinst;
    }
}

In the common case, one does not need the lock, since the singleton will
usually already be initialized. But if the assignment to "val" passes the
setting of the singleton under a stress test, one would get an uninitialized
singleton. This cannot happen on x86 because of its strong store ordering,
even on SMP, but it certainly can on IPF, Alpha and other architectures. But
on x86, the fact that loads are not ordered and can pass stores will create
Dekker-like problems (using 32-bit volatiles) on SMP.

I don't think implementations of Dekker/Peterson etc. algorithms make much
sense in Java.  There are better JIT tests for volatiles. The Linux kernel,
e.g., uses Dekker etc. heavily to implement critical sections, spin locks,
etc., but that's a different type of usage, and Linux both uses fences
heavily and offers its own platform-neutral fence calls.





Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Gregory Shimansky <gs...@gmail.com>.
On Wednesday 28 February 2007 23:28 Weldon Washburn wrote:
> On 2/28/07, Gregory Shimansky <gs...@gmail.com> wrote:
> > Weldon Washburn wrote:
> > > On second thought, the only way I know to implement volatile long
> > > (64-bit) Java variables on ia32 is:
> > >
> > > grab critical section
> > > mov [ecx], low32bits;   // to do a write, the code for doing a read is
> > > similar
> > > mov[ecx+4], hi32bits;
> > > release critical section
> >
> > Is it possible for 64-bit atomic load stores to use double load/stores
>
> hmm... can you tell us the specific instructions you are suggesting?  I see
> quad loads/stores but can't find the double load/store version.  I also
> tried to find the guarantees on bus transactions.  Somewhere I recall it is
> documented that 4-byte aligned loads/stores are guaranteed to be atomic.
> Maybe there are some new guarantees on 64-bit writes.  In any case, we
> would still have to be compatible with existing Pentium III hardware and
> probably have to go with some sort of critical section approach.

Yes, this is true. I hoped that someone would point out exactly whether there
are any 64-bit atomic operations that work with doubles. It seems like there
aren't, because the patch by Ivan in HARMONY-2092 has comments saying that it
is enough to change the GC and class loader to align objects on a 64-bit
boundary, and that this suffices for 64-bit loads/stores, but only with
additional memory fence instructions in the interpreter.

> > or SSE4 on the processors that have it?
>
> Good point.  I recall old versions were really only focused on multimedia.
> And writing multimedia bits to memory is not sensitive to order or
> atomicity.  In other words, if you are writing to a frame buffer, speed of
> writes is important but the order the bits hit the buffer is not.  Again, I
> looked but could not find the latest info SSE4 and atomicity.

Actually, it should have been SSE2. I pressed the wrong digit. I just meant
quad loads/stores when I mentioned it.


-- 
Gregory

Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Weldon Washburn <we...@gmail.com>.
On 2/28/07, Gregory Shimansky <gs...@gmail.com> wrote:
>
> Weldon Washburn wrote:
> > On second thought, the only way I know to implement volatile long
> > (64-bit) Java variables on ia32 is:
> >
> > grab critical section
> > mov [ecx], low32bits;   // to do a write, the code for doing a read is
> > similar
> > mov[ecx+4], hi32bits;
> > release critical section
>
> Is it possible for 64-bit atomic load stores to use double load/stores


hmm... can you tell us the specific instructions you are suggesting?  I see
quad loads/stores but can't find the double load/store version.  I also
tried to find the guarantees on bus transactions.  Somewhere I recall it is
documented that 4-byte aligned loads/stores are guaranteed to be atomic.
Maybe there are some new guarantees on 64-bit writes.  In any case, we would
still have to be compatible with existing Pentium III hardware and probably
have to go with some sort of critical section approach.



> or SSE4 on the processors that have it?


Good point.  I recall old versions of SSE were really only focused on
multimedia.  And writing multimedia bits to memory is not sensitive to order
or atomicity.  In other words, if you are writing to a frame buffer, the
speed of writes is important but the order in which the bits hit the buffer
is not.  Again, I looked but could not find the latest info on SSE4 and
atomicity.




-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Gregory Shimansky <gs...@gmail.com>.
Weldon Washburn wrote:
> On second thought, the only way I know to implement volatile long (64-bit)
> Java variables on ia32 is:
> 
> grab critical section
> mov [ecx], low32bits;   // to do a write, the code for doing a read is
> similar
> mov[ecx+4], hi32bits;
> release critical section

Is it possible for 64-bit atomic loads/stores to use double loads/stores
or SSE4 on the processors that have it?


-- 
Gregory


Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Weldon Washburn <we...@gmail.com>.
On second thought, the only way I know to implement volatile long (64-bit)
Java variables on ia32 is:

grab critical section
mov [ecx], low32bits;     // to do a write, the code for doing a read is similar
mov [ecx+4], hi32bits;
release critical section
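
The critical-section sequence above can be sketched in Java terms as follows.
This is only an illustration of the idea, not DRLVM code: the class name
LockedLong and its methods are hypothetical, and a real VM would do this in
generated machine code rather than with a Java monitor. The 64-bit value is
kept as two 32-bit halves, and every read and write of the pair happens while
holding the same lock, so no reader can see one half from an old value and the
other half from a new one.

```java
// Hypothetical sketch: a 64-bit value stored as two 32-bit halves,
// made atomic by guarding every access with the same lock.
final class LockedLong {
    private final Object lock = new Object();   // stands in for the critical section
    private int lo;                              // low 32 bits  (mov [ecx], low32bits)
    private int hi;                              // high 32 bits (mov [ecx+4], hi32bits)

    void set(long v) {
        synchronized (lock) {                    // grab critical section
            lo = (int) v;
            hi = (int) (v >>> 32);
        }                                        // release critical section
    }

    long get() {
        synchronized (lock) {
            // Mask the low half so its sign bit does not smear into the high half.
            return ((long) hi << 32) | (lo & 0xFFFFFFFFL);
        }
    }
}
```

The cost is one lock acquisition per access, which is the performance concern
raised in this thread.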

Some observations:
1)
Fixing the "volatile long" bug (Harmony-2092) by using a critical section as
above should, as a side effect, allow DekkerTest.java to run.
2)
Using volatile long sort of, kind of defeats a major reason to use Dekker's
algorithm in the first place.  Why bother if the performance is the same as
using critical sections?
3)
Using "volatile int" in DekkerTest.java probably still fails because reads
can pass writes.  One way to fix this might be to make the JIT emit a r/w
memory fence whenever reading/writing the volatile int.  While memory fences
are often cheaper than HW locks, they are not free.
4)
My guess is that there are no old legacy Java apps that use Dekker's
algorithm.  In other words, nobody is dependent on Dekker's algorithm
working.  My guess is that they are, however, dependent on volatile long and
volatile int working properly (which has the side effect of making Dekker's
algo work).
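
The "reads can pass writes" case in observation 3 can be reduced to a small
litmus test, sketched below. The names are hypothetical and this is not part
of the Harmony test suite. Each thread stores 1 to its own variable and then
loads the other thread's variable. If the load may pass the store (for example
because the store is still sitting in a write buffer), both threads can read
0. With both variables declared volatile, the Java 5 memory model (JSR-133)
forbids that outcome, which in practice means the JIT has to emit a fence or
an equivalent locked instruction between the store and the load.

```java
// Hypothetical sketch of a store-buffering litmus test.
final class StoreBufferLitmus {
    // With 'volatile', the outcome r1 == 0 && r2 == 0 is forbidden by the
    // Java memory model. Without it, that outcome becomes legal.
    private volatile int x, y;
    private int r1, r2;   // results, read by the caller only after join()

    // Runs one trial and reports whether both threads read 0.
    static boolean bothSawZero() {
        final StoreBufferLitmus s = new StoreBufferLitmus();
        Thread a = new Thread(() -> { s.x = 1; s.r1 = s.y; });   // store x, load y
        Thread b = new Thread(() -> { s.y = 1; s.r2 = s.x; });   // store y, load x
        a.start();
        b.start();
        try {
            a.join();
            b.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return s.r1 == 0 && s.r2 == 0;
    }

    // Counts how many of 'trials' runs ended with both threads reading 0.
    static int countBothZero(int trials) {
        int hits = 0;
        for (int i = 0; i < trials; i++) {
            if (bothSawZero()) hits++;
        }
        return hits;
    }
}
```

A non-zero count from a run with volatile fields would point at the VM not
ordering volatile accesses, which is the same failure that breaks Dekker's
algorithm.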





-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Weldon Washburn <we...@gmail.com>.
On 2/21/07, Gregory Shimansky <gs...@gmail.com> wrote:
>
> On Wednesday 21 February 2007 21:47 Rana Dasgupta wrote:
> > Weldon,
> >   But I am not sure why the behavior would be different from J9 on the
> > same hardware. Do we jit volatiles differently?


The differences in behavior can be caused by lots of things that are not
related to the memory model.  For example, the JIT might actually emit
slightly different code.  Slightly different code can easily open/close race
conditions.  The important point is that both J9 and drlvm fail.  And the
failure appears to be because modern hardware is most likely not designed to
run Dekker's algo without memory fences.

> There is a bug on DRLVM about volatile variables HARMONY-2092. It is about
> long and double type variables assignments. Is it the same as in Dekker's
> algorithm?

 DekkerTest.java uses "long" variables.  Yes, this could change the rate of
failure but not eliminate failures completely.





-- 
Weldon Washburn
Intel Enterprise Solutions Software Division

Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Gregory Shimansky <gs...@gmail.com>.
On Wednesday 21 February 2007 21:47 Rana Dasgupta wrote:
> Weldon,
>   But I am not sure why the behavior would be different from J9 on the same
> hardware. Do we jit volatiles differently?

There is a bug in DRLVM about volatile variables, HARMONY-2092. It is about
assignments to long and double type variables. Is it the same issue as in
Dekker's algorithm?


-- 
Gregory

Re: [drlvm][reliability tests] Harmony-2986, Dekker's algorithm -- is this a valid test for modern SMP hardware?

Posted by Rana Dasgupta <rd...@gmail.com>.
Weldon,
  But I am not sure why the behavior would be different from J9 on the same
hardware. Do we jit volatiles differently?

Rana

