Posted to dev@harmony.apache.org by Weldon Washburn <we...@gmail.com> on 2007/08/19 16:54:19 UTC

[drlvm][threading] a description of thread suspension race conditions plus initial ideas on how to fix

All,

Four things.

1)
I put a description of the thread suspension race conditions in
harmony-4644, HarmonyJVM_Thread_State_Diagrams.ppt.  If the analysis
is correct, there is a race condition where the suspend requestor
thread can see a stale value of the target thread's disable_count.  In
other words, the requestor can set the target's thread->request field
to true and then (accidentally) read a stale value of
thread->disable_count.  The stale value misleads the requestor into
thinking it is OK to enumerate the target's stack.
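
To make the race concrete, here is a rough sketch of the bad
interleaving, using the same field names as the current code:

// Requestor thread                      Target (mutator) thread
// ----------------                      -----------------------
// thread->request = true;   (store
//   sits in requestor's store buffer)
//                                       self->disable_count++;   (store
//                                         sits in target's store buffer)
//                                       reads thread->request == false (stale)
//                                       keeps running GC-unsafe code
// reads thread->disable_count == 0 (stale)
// concludes the target is stopped and
// starts enumerating its stack  <-----  while the target still mutates it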

2)
Below is rough code showing an algorithm that might fix the above
problem.  The idea is to repair the existing code base rather than
attempt a complete redesign.  Basically, the target thread sets
disable_count = 1 *before* it checks thread->request, and the
suspension requestor waits long enough for the target's HW store
buffer to drain.  I do not know of any HW architectural guarantee on
the maximum number of clock cycles needed to drain a store buffer.  I
made an initial guess of 300 msec.  It may be that 30 msec is
sufficient for many generations of future HW.  I simply do not know at
this point.  Please see the code below for more details.  (Even more
details are in harmony-4644,  AAA_thread_suspension.cpp.)
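
For reference, below is a stand-alone sketch of the underlying
store->load ordering problem, separate from the hythread code.  The
mfence() macro is x86/GCC inline asm; in the actual fix we want to
avoid the fence on the mutator's hot path, which is exactly why the
requestor-side sleep is being considered instead:

/* Stand-alone sketch of the Dekker-style store->load problem.  Without
   the fences, each thread's store can sit in its HW store buffer while
   the following load executes, so BOTH threads can read 0 and both take
   their "safe" path at the same time.  With the fences, at least one
   thread is guaranteed to see the other's store. */
#define mfence() __asm__ __volatile__("mfence" ::: "memory")

volatile int request = 0;        /* written by requestor, read by target */
volatile int disable_count = 0;  /* written by target, read by requestor */

void requestor_side(void)
{
    request = 1;
    mfence();                    /* drain this CPU's store buffer */
    if (disable_count == 0) {
        /* target has not disabled suspension; safe to treat as stopped */
    }
}

void target_side(void)
{
    disable_count = 1;
    mfence();                    /* same barrier on the target side */
    if (request == 0) {
        /* no suspension pending; safe to keep running GC-unsafe code */
    }
}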

3)
A question for the GC folks: what is the maximum sleep() time in the
code below that is acceptable?  Note that when the GC suspends a
single thread, we always need to wait for the HW store buffer to
drain.  However, for stop-the-world GC we do not necessarily have to
wait for each target thread's store buffer individually.  We can set
thread->request on all threads, then do a single sleep(xx msec) to
ensure all the store buffers have drained.
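
As a rough sketch of the stop-the-world variant (the for_each_thread()
iteration and xx_msec are placeholders, not real hythread API):

void gc_suspend_all_sketch(void)
{
    hythread_t t;

    // Set every target's request first ...
    for_each_thread(t) {                 // placeholder iteration
        t->gc_suspend_request = true;
        apr_atomic_inc32(&t->request);
    }

    // ... then pay for ONE drain window instead of one per target.
    sleep(xx_msec);

    // The reads below are no longer stale.
    for_each_thread(t) {
        while (readLocation(&t->disable_count) != 0) {
            yield();
        }
    }
}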

4)
Next steps -- clean up the suspension kernel below and run experiments
on a 4-way SMP to look for race conditions.  I hope to do this today.



///////////////////////////////////////////////

void mutator__hythread_suspend_disable()
{
    hythread_t self = get_TLS_thread();
    IDATA status;

    // The following write will slowly dribble out to the coherency domain.
    // We could add an mfence here, but that would slow down every
    // hythread_suspend_disable() call (bad idea).  Instead of an mfence()
    // in the target thread, we sleep 300 msec in the requestor thread to
    // allow the mutator's HW store buffer to empty.
    self->disable_count++;

    if (self->disable_count == 1) {
        if (readLocation(&self->request)) {

            self->disable_count = 0;
            self->now_blocked_at_disable_count_0_to_1 = true;
            mfence();

            while (self->j_l_t_suspend_request || self->gc_suspend_request) {
                // Use a timeout so a missed semaphore set does not
                // hang the system (graceful degradation).
                status = hysem_wait(self->resume_event, timeout_3000msec);
                if (status == timeout && self->gc_suspend_request) {
                    // log a warning/diagnostic somewhere
                }
            }

            self->disable_count = 1;
            self->now_blocked_at_disable_count_0_to_1 = false;
            mfence();
        }
        else {
            // Intentionally do nothing.  Eventually the 0 -> 1 transition
            // of disable_count will make it into the coherency domain.
        }
    }
}

void hythread_suspend_other(hythread_t thread, suspend_t kind)
{
    if (kind == J_L_T_SUSPEND) {
        if (thread->j_l_t_suspend_request == true) return;  // already dealt with
        thread->j_l_t_suspend_request = true;
    } else {
        assert(kind == GCSUSPEND);
        // only allow one GC at a time for now
        assert(thread->gc_suspend_request == false);
        thread->gc_suspend_request = true;
    }
    apr_atomic_inc32(&thread->request);

    // If the target thread just did a "disable_count = 1", the sleep below
    // allows enough time for the target thread's HW store buffer to empty
    // into the coherency domain.
    sleep(300msec);

    // We now know we are not fetching a stale version of disable_count
    // from the coherency domain.
    while (readLocation(&thread->disable_count) != 0) {
        yield();
    }
    if ((thread->now_blocked_at_back_branch == 0) &&
        (thread->now_blocked_at_disable_count_0_to_1 == 0)) {
            assert(0);  // target must be parked at one of the two safe points
    }
}

Re: [drlvm][threading] a description of thread suspension race conditions plus initial ideas on how to fix

Posted by Weldon Washburn <we...@gmail.com>.
On 8/19/07, Xiao-Feng Li <xi...@gmail.com> wrote:
> On 8/19/07, Weldon Washburn <we...@gmail.com> wrote:
> > [...sections 1) and 2) snipped; see the original post above...]
> > store buffer.  I made an initial guess of 300 msec.  It may be that
> > 30 msec is sufficient for many generations of future HW.
>
> I guess it's not a good idea to sleep explicitly for 300ms. A minor
> collection can be as fast as tens of ms.

Thanks.  This is the info I was looking for.  It puts a limit on the
sleep() duration of something like 100 microseconds.  Now that I think
about it, HW store buffers should drain fairly quickly since they
interface with the L0 caches.  I will have to find out whether a HW
store buffer can drain in 100 microseconds.
>
> > [...section 3) snipped...]
>
> To force the store buffer to drain, I think a safer (and more elegant)
> way is to send a signal to the target thread, so that the target
> thread is interrupted.
>
> I am afraid this kind of explicit timing control, such as a 300 msec
> sleep, should always be avoided.

I agree that hard-coded timings that depend on the HW implementation
should always be avoided.  Using pthread_kill on Linux and OS thread
suspend/resume on Windows is a good idea.  The Linux signal handler
would execute an mfence, and a Windows XP thread suspend/resume causes
a context switch (which in turn flushes the store buffer).  I don't
know if pthread_kill does exactly what we need, but I will find out.
A first cut at the Linux side might look like the sketch below.
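
This is untested; the fence_ack flag and the choice of SIGUSR2 are my
own guesses.  The point is only that the mfence runs on the target
thread, and the requestor waits until it has:

#include <pthread.h>
#include <sched.h>
#include <signal.h>

/* Untested sketch of the pthread_kill idea (Linux only).  The handler
   executes on the target thread, so the mfence drains the target's own
   store buffer; fence_ack lets the requestor wait for that to happen. */
static volatile sig_atomic_t fence_ack;

static void fence_handler(int sig)
{
    (void)sig;
    __asm__ __volatile__("mfence" ::: "memory");  /* drain store buffer */
    fence_ack = 1;               /* plain flag store: async-signal-safe */
}

void install_fence_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = fence_handler;
    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGUSR2, &sa, NULL);   /* SIGUSR2 picked arbitrarily */
}

void force_remote_fence(pthread_t target)
{
    fence_ack = 0;
    pthread_kill(target, SIGUSR2);   /* interrupt the target thread */
    while (!fence_ack) {
        sched_yield();               /* spin until the handler has run */
    }
}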

>
> > [...section 4) and code snipped...]


-- 
Weldon Washburn

Re: [drlvm][threading] a description of thread suspension race conditions plus initial ideas on how to fix

Posted by Xiao-Feng Li <xi...@gmail.com>.
On 8/19/07, Weldon Washburn <we...@gmail.com> wrote:
> [...sections 1) and 2) snipped; see the original post above...]
> store buffer.  I made an initial guess of 300 msec.  It may be that
> 30 msec is sufficient for many generations of future HW.

I guess it's not a good idea to sleep explicitly for 300ms. A minor
collection can be as fast as tens of ms.

> 3)
> A question for the GC folks: what is the maximum sleep() time in the
> code below that is acceptable?  [...rest of section 3) snipped...]

To force the store buffer to drain, I think a safer (and more elegant)
way is to send a signal to the target thread, so that the target
thread is interrupted.

I am afraid this kind of explicit timing control, such as a 300 msec
sleep, should always be avoided.

> [...section 4) and code snipped...]


-- 
http://xiao-feng.blogspot.com