Posted to dev@httpd.apache.org by Brian Pane <br...@cnet.com> on 2002/05/21 20:59:17 UTC

[PATCH] worker MPM deadlock

This patch addresses a problem that Ian found this morning:
it's possible for the current worker code (from 2.0.36 onward)
to get stuck in a state where all the worker threads are waiting
in ap_queue_pop() but the queue_info object thinks there are no
idle threads.  With this patch, the queue and queue_info objects
are combined, and the increment/decrement of the idle worker
count is done around the condition variable wait in ap_queue_pop().
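
Roughly, the idea looks like this (a simplified sketch, not the patch
itself; the field names follow the fd_queue_t used in the patch later in
this thread, and most error handling is omitted):

/* Simplified sketch of the combined-queue idea: the idle count is
 * adjusted under the same mutex that guards the condition-variable
 * wait, so the count can never disagree with the number of workers
 * actually blocked in ap_queue_pop().
 */
apr_status_t ap_queue_pop_sketch(fd_queue_t *queue,
                                 apr_socket_t **sd, apr_pool_t **p)
{
    apr_status_t rv;

    if ((rv = apr_thread_mutex_lock(queue->one_big_mutex)) != APR_SUCCESS) {
        return rv;
    }

    queue->idlers++;                     /* this worker is now idle */
    if (queue->idlers == 1) {
        /* the listener may be blocked waiting for an idle worker */
        apr_thread_cond_signal(queue->idlers_available);
    }

    while (queue->nelts == 0 && !queue->terminated) {
        apr_thread_cond_wait(queue->not_empty, queue->one_big_mutex);
    }

    queue->idlers--;                     /* no longer idle (or terminating) */

    if (queue->terminated) {
        apr_thread_mutex_unlock(queue->one_big_mutex);
        return APR_EOF;
    }

    *sd = queue->data[--queue->nelts].sd;
    *p  = queue->data[queue->nelts].p;

    return apr_thread_mutex_unlock(queue->one_big_mutex);
}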

Ian and I will be doing stress testing of the patch this afternoon,
but I'm posting the patch now in case anyone else wants to try it
in the meantime.

--Brian


Re: [PATCH 2] worker MPM deadlock

Posted by Aaron Bannert <aa...@clove.org>.
On Wed, May 22, 2002 at 02:14:30PM -0400, Cliff Woolley wrote:
> If it isn't guaranteed to do something but is rather just a "hint", that's
> fine, as long as it's documented.  But we should still make an attempt to
> honor the hint if the underlying system will let us.
> 
> [For example, I know my research work frequently requires these hints
> because ~10ms is farrrrr too long a timeslice in my field, and I love
> using APR for my research projects... I personally just assumed
> apr_thread_yield() did what I wanted until I looked into it and found that
> it did nothing.]

[moving this discussion to APR]

FWIW, here's the top of the thread that introduced this API:

http://www.apachelabs.org/apr-mbox/200107.mbox/%3csb655900.070@prv-mail20.provo.novell.com%3e

You bring up a good point, so my only question is whether we know how
to implement this on all the unix thread libraries out there.

-aaron

Re: [PATCH 2] worker MPM deadlock

Posted by Bill Stoddard <bi...@wstoddard.com>.

> On Wed, May 22, 2002 at 01:59:02PM -0400, Cliff Woolley wrote:
> > On Wed, 22 May 2002, Aaron Bannert wrote:
> >
> > > The reason it's not implemented is because it's not guaranteed to do
> > > anything. Yielding is up to the discretion of the underlying system,
> > > and depending on many things it may behave differently. We can not
> > > get predictable scheduling with any variant of yield() in a way that
> > > will be portable. The only method we have right now for predictable
> > > scheduling in APR is apr_thread_cond.h.
> >
> > I'd say it's up to us to guarantee that it *does* do something on a
> > cross-platform basis.  That's APR's job.  If we're not going to do that,
> > then it needs to go away.  Having a function in the API that doesn't do
> > what it's supposed to do uniformly across platforms is bad karma.  It
> > works on Win32, why shouldn't it work on Unix?
>
> IIRC, this function exists so that NetWare and other
> single-multiplexed-process based thread libraries can have finer-grain
> control over execution sharing. The use case is a large computationally
> expensive function that hogs the CPU for a long time and can't yield the
> CPU since it hits none of the yieldable system calls. On systems that do
> true context switching between userspace threads, this doesn't need to be
> implemented. It in no way guarantees that the execution will be yielded,
> since that's up to the scheduling mechanism.
>
> -aaron
>

Aaron,
That is very disturbing but makes a lot of sense. Now I wonder if Sleep(0) on
WinNT -really- behaves as documented...

Bill


Re: [PATCH 2] worker MPM deadlock

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 22 May 2002, Aaron Bannert wrote:

> CPU since it hits none of the yieldable system calls. On systems that do
> true context switching between userspace threads, this doesn't need to
> be implemented. It in no way guarantees that the execution will be
> yielded, since that's up to the scheduling mechanism.

If it isn't guaranteed to do something but is rather just a "hint", that's
fine, as long as it's documented.  But we should still make an attempt to
honor the hint if the underlying system will let us.

[For example, I know my research work frequently requires these hints
because ~10ms is farrrrr too long a timeslice in my field, and I love
using APR for my research projects... I personally just assumed
apr_thread_yield() did what I wanted until I looked into it and found that
it did nothing.]

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: [PATCH 2] worker MPM deadlock

Posted by Aaron Bannert <aa...@clove.org>.
On Wed, May 22, 2002 at 01:59:02PM -0400, Cliff Woolley wrote:
> On Wed, 22 May 2002, Aaron Bannert wrote:
> 
> > The reason it's not implemented is because it's not guaranteed to do
> > anything. Yielding is up to the discretion of the underlying system,
> > and depending on many things it may behave differently. We can not
> > get predictable scheduling with any variant of yield() in a way that
> > will be portable. The only method we have right now for predictable
> > scheduling in APR is apr_thread_cond.h.
> 
> I'd say it's up to us to guarantee that it *does* do something on a
> cross-platform basis.  That's APR's job.  If we're not going to do that,
> then it needs to go away.  Having a function in the API that doesn't do
> what it's supposed to do uniformly across platforms is bad karma.  It
> works on Win32, why shouldn't it work on Unix?

IIRC, this function exists so that NetWare and other
single-multiplexed-process based thread libraries can have finer-grain
control over execution sharing. The use case is a large computationally
expensive function that hogs the CPU for a long time and can't yield the
CPU since it hits none of the yieldable system calls. On systems that do
true context switching between userspace threads, this doesn't need to be
implemented. It in no way guarantees that the execution will be yielded,
since that's up to the scheduling mechanism.
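
To illustrate the kind of code the hint is meant for (a hypothetical
example; crunch() and its workload are made up, apr_thread_yield() is the
real APR call):

#include "apr_thread_proc.h"

/* Hypothetical CPU-bound routine: on a cooperatively scheduled,
 * single-multiplexed-process thread library (NetWare-style), nothing
 * here ever blocks, so other user-space threads could starve without
 * the explicit yield hint.  On preemptively scheduled systems the hint
 * may be a no-op, which is exactly the "hint, not guarantee" semantic.
 */
static void crunch(double *data, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        data[i] = data[i] * data[i] + 1.0;    /* stand-in for real work */
        if ((i & 0xffff) == 0) {
            apr_thread_yield();
        }
    }
}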

-aaron

Re: [PATCH 2] worker MPM deadlock

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 22 May 2002, Aaron Bannert wrote:

> The reason it's not implemented is because it's not guaranteed to do
> anything. Yielding is up to the discretion of the underlying system,
> and depending on many things it may behave differently. We can not
> get predictable scheduling with any variant of yield() in a way that
> will be portable. The only method we have right now for predictable
> scheduling in APR is apr_thread_cond.h.

I'd say it's up to us to guarantee that it *does* do something on a
cross-platform basis.  That's APR's job.  If we're not going to do that,
then it needs to go away.  Having a function in the API that doesn't do
what it's supposed to do uniformly across platforms is bad karma.  It
works on Win32, why shouldn't it work on Unix?

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: [PATCH 2] worker MPM deadlock

Posted by Aaron Bannert <aa...@clove.org>.
On Wed, May 22, 2002 at 01:50:44PM -0400, Cliff Woolley wrote:
> Right.  pthread_yield seems to be a non-portable extension, if I'm reading
> right.  From /usr/include/pthread.h on Linux:
> 
> #ifdef __USE_GNU
> /* Yield the processor to another thread or process.
>    This function is similar to the POSIX `sched_yield' function but
>    might be differently implemented in the case of a m-on-n thread
>    implementation.  */
> extern int pthread_yield (void) __THROW;
> #endif
> 
> But likewise, yield() is not portable and neither is sched_yield().  This
> is why we need apr_thread_yield() to be implemented on Unix, as I
> mentioned the last time this came up.  But it's not.
> 
> void apr_thread_yield()
> {
> }

The reason it's not implemented is because it's not guaranteed to do
anything. Yielding is up to the discretion of the underlying system,
and depending on many things it may behave differently. We can not
get predictable scheduling with any variant of yield() in a way that
will be portable. The only method we have right now for predictable
scheduling in APR is apr_thread_cond.h.
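
A sketch of what that looks like in practice (the handoff struct and
function names here are illustrative, not part of the APR API):

#include "apr_thread_mutex.h"
#include "apr_thread_cond.h"

/* Sketch: a condition-variable hand-off gives an ordering guarantee
 * that no yield() hint can.  The waiter provably does not proceed
 * until 'ready' has been set under the mutex. */
typedef struct handoff_t {
    apr_thread_mutex_t *mutex;
    apr_thread_cond_t  *cond;
    int                 ready;
} handoff_t;

static void handoff_wait(handoff_t *h)
{
    apr_thread_mutex_lock(h->mutex);
    while (!h->ready) {                 /* loop guards spurious wakeups */
        apr_thread_cond_wait(h->cond, h->mutex);
    }
    apr_thread_mutex_unlock(h->mutex);
}

static void handoff_post(handoff_t *h)
{
    apr_thread_mutex_lock(h->mutex);
    h->ready = 1;
    apr_thread_cond_signal(h->cond);
    apr_thread_mutex_unlock(h->mutex);
}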

-aaron

RE: [PATCH 2] worker MPM deadlock

Posted by Cliff Woolley <jw...@virginia.edu>.
On Wed, 22 May 2002, Ryan Bloom wrote:

> > You may be right but I would -hope- that a spec pthread implementation
> > would handle pthread_yield() correctly...
>
> The problem is that Single Unix doesn't even define pthread_yield, and
> neither does the pthreads spec if I am reading it right.  It looks like
> they both require sched_yield, which should do the same thing, however.
> All I am saying is that this isn't a panacea, a lot more research would
> be needed before pthread_ or sched_ yield could be considered portable.

Right.  pthread_yield seems to be a non-portable extension, if I'm reading
right.  From /usr/include/pthread.h on Linux:

#ifdef __USE_GNU
/* Yield the processor to another thread or process.
   This function is similar to the POSIX `sched_yield' function but
   might be differently implemented in the case of a m-on-n thread
   implementation.  */
extern int pthread_yield (void) __THROW;
#endif

But likewise, yield() is not portable and neither is sched_yield().  This
is why we need apr_thread_yield() to be implemented on Unix, as I
mentioned the last time this came up.  But it's not.

void apr_thread_yield()
{
}

Oh joy.
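
For the record, one way the stub could be filled in on Unix -- a sketch
only, not what APR ships; HAVE_PTHREAD_YIELD and HAVE_SCHED_YIELD are
assumed autoconf-style feature macros, not existing APR symbols:

/* Sketch of a possible Unix implementation.  Either call is still
 * only a scheduling hint. */
#if defined(HAVE_PTHREAD_YIELD)
#include <pthread.h>
#elif defined(HAVE_SCHED_YIELD)
#include <sched.h>
#endif

void apr_thread_yield(void)
{
#if defined(HAVE_PTHREAD_YIELD)
    pthread_yield();     /* GNU/draft extension where available */
#elif defined(HAVE_SCHED_YIELD)
    sched_yield();       /* POSIX.1b sched_yield() */
#else
    /* nothing portable to call; stays a no-op */
#endif
}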

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



RE: [PATCH 2] worker MPM deadlock

Posted by Ryan Bloom <rb...@covalent.net>.
Copying APR, because this is becoming an APR issue quickly.

> > > From: trawick@rdu88-251-253.nc.rr.com [mailto:trawick@rdu88-251-
> > >
> > > "Bill Stoddard" <bi...@wstoddard.com> writes:
> > >
> > > > Is there a unix equivalent to the Win32 Sleep(0) call? If so, then we
> > > > can rip out all this cruft and use the patch I posted earlier.
> > >
> > > google for pthread_yield() and compare the doc with the doc for
> > > Win32's Sleep(0).  I suspect it is the same as long as we're dealing
> > > with pthreads.
> >
> > The biggest problem with pthread_yield is that it doesn't always do
> > anything.  Some pthreads implementations don't really implement
> > pthread_yield, in fact, if I remember correctly AIX is one of those.
> 
> You are thinking about the old draft 7 pthread implementation in AIX 4.?
> (where ?=2 if I recall correctly).

Yep, you are correct, even down to AIX 4.2.

> 
> > I could be wrong about AIX though.  I do know that when I first wrote
> > the APR thread library, I specifically left out yield because it had
> > different results on different platforms.
> 
> You may be right but I would -hope- that a spec pthread implementation
> would handle pthread_yield() correctly...

The problem is that Single Unix doesn't even define pthread_yield, and
neither does the pthreads spec if I am reading it right.  It looks like
they both require sched_yield, which should do the same thing, however.
All I am saying is that this isn't a panacea, a lot more research would
be needed before pthread_ or sched_ yield could be considered portable.

Ryan



Re: [PATCH 2] worker MPM deadlock

Posted by Bill Stoddard <bi...@wstoddard.com>.
> > From: trawick@rdu88-251-253.nc.rr.com [mailto:trawick@rdu88-251-
> >
> > "Bill Stoddard" <bi...@wstoddard.com> writes:
> >
> > > Is there a unix equivalent to the Win32 Sleep(0) call? If so, then we
> > > can rip out all this cruft and use the patch I posted earlier.
> >
> > google for pthread_yield() and compare the doc with the doc for
> > Win32's Sleep(0).  I suspect it is the same as long as we're dealing
> > with pthreads.
>
> The biggest problem with pthread_yield is that it doesn't always do
> anything.  Some pthreads implementations don't really implement
> pthread_yield, in fact, if I remember correctly AIX is one of those.

You are thinking about the old draft 7 pthread implementation in AIX 4.? (where ?=2 if I
recall correctly).

> I
> could be wrong about AIX though.  I do know that when I first wrote the
> APR thread library, I specifically left out yield because it had
> different results on different platforms.

You may be right but I would -hope- that a spec pthread implementation would handle
pthread_yield() correctly...

Bill



RE: [PATCH 2] worker MPM deadlock

Posted by Ryan Bloom <rb...@covalent.net>.
> From: trawick@rdu88-251-253.nc.rr.com [mailto:trawick@rdu88-251-
> 
> "Bill Stoddard" <bi...@wstoddard.com> writes:
> 
> > Is there a unix equivalent to the Win32 Sleep(0) call? If so, then we
> > can rip out all this cruft and use the patch I posted earlier.
> 
> google for pthread_yield() and compare the doc with the doc for
> Win32's Sleep(0).  I suspect it is the same as long as we're dealing
> with pthreads.

The biggest problem with pthread_yield is that it doesn't always do
anything.  Some pthreads implementations don't really implement
pthread_yield, in fact, if I remember correctly AIX is one of those.  I
could be wrong about AIX though.  I do know that when I first wrote the
APR thread library, I specifically left out yield because it had
different results on different platforms.

Ryan



Re: [PATCH 2] worker MPM deadlock

Posted by Jeff Trawick <tr...@attglobal.net>.
"Bill Stoddard" <bi...@wstoddard.com> writes:

> Is there a unix equivalent to the Win32 Sleep(0) call? If so, then we can rip out all this
> cruft and use the patch I posted earlier.

google for pthread_yield() and compare the doc with the doc for
Win32's Sleep(0).  I suspect it is the same as long as we're dealing
with pthreads.

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [PATCH 2] worker MPM deadlock

Posted by Jeff Trawick <tr...@attglobal.net>.
"Ryan Bloom" <rb...@covalent.net> writes:

> > From: Aaron Bannert [mailto:aaron@clove.org]
> > 
> > On Wed, May 22, 2002 at 01:31:06PM -0400, Bill Stoddard wrote:
> > > Yep this is a problem. I believe the pthread spec states that the
> > > cond_wait can be spuriously triggered.
> > 
> > I'm pretty sure it does, which is why we always check for the condition
> > when we wake up and go right back to sleep if the condition hasn't been
> > met.
> 
> I'll raise you and say without a doubt, pthread_cond can be triggered
> spuriously.

Yep, I too have seen it repeatedly.  Infuriating to have to code for
something like that.

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

RE: [PATCH 2] worker MPM deadlock

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Aaron Bannert [mailto:aaron@clove.org]
> 
> On Wed, May 22, 2002 at 01:31:06PM -0400, Bill Stoddard wrote:
> > Yep this is a problem. I believe the pthread spec states that the
> > cond_wait can be spuriously triggered.
> 
> I'm pretty sure it does, which is why we always check for the condition
> when we wake up and go right back to sleep if the condition hasn't been
> met.

I'll raise you and say without a doubt, pthread_cond can be triggered
spuriously.

Ryan



Re: [PATCH 2] worker MPM deadlock

Posted by Aaron Bannert <aa...@clove.org>.
On Wed, May 22, 2002 at 01:31:06PM -0400, Bill Stoddard wrote:
> Yep this is a problem. I believe the pthread spec states that the cond_wait can be
> spuriously triggered.

I'm pretty sure it does, which is why we always check for the condition
when we wake up and go right back to sleep if the condition hasn't been
met.
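
The standard idiom, sketched with a hypothetical queue (the struct and
field names are illustrative):

#include <pthread.h>

/* Because pthread_cond_wait() may return spuriously, the predicate is
 * re-tested in a loop and the thread goes straight back to sleep if it
 * still does not hold. */
struct work_queue {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    int             nelts;
};

static void wait_for_work(struct work_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->nelts == 0) {          /* 'while', not 'if' */
        pthread_cond_wait(&q->not_empty, &q->lock);
    }
    /* ...dequeue one element here, still holding the lock... */
    pthread_mutex_unlock(&q->lock);
}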

-aaron

Re: [PATCH 2] worker MPM deadlock

Posted by Bill Stoddard <bi...@wstoddard.com>.

> Bill Stoddard wrote:
>
> >I don't know if this will actually work, haven't fully considered all failure
> >possibilities. I think there are better implementations of the basic idea. Need to use
> >atomic increment/decrement and replace the yield() call with pthread_yield() or
> >whatever...
> >
> >Bill
> >
> >===================================================================
> >RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
> >retrieving revision 1.117
> >diff -u -r1.117 worker.c
> >--- worker.c 18 Apr 2002 17:46:20 -0000 1.117
> >+++ worker.c 26 Apr 2002 17:59:16 -0000
> >@@ -156,6 +156,7 @@
> >  */
> >
> > int ap_threads_per_child = 0;         /* Worker threads per child */
> >+static int worker_thread_cnt = 0;
> > static int ap_daemons_to_start = 0;
> > static int min_spare_threads = 0;
> > static int max_spare_threads = 0;
> >@@ -693,6 +694,14 @@
> >         }
> >         if (listener_may_exit) break;
> >
> >+        /* If no worker threads are available, yield our quanta and try again
> >+         * later
> >+         */
> >+        if (!worker_thread_cnt) {
> >+            yield();
> >+            continue;
> >+        }
> >+
> >         if ((rv = SAFE_ACCEPT(apr_proc_mutex_lock(accept_mutex)))
> >             != APR_SUCCESS) {
> >             int level = APLOG_EMERG;
> >@@ -791,6 +800,7 @@
> >                 signal_threads(ST_GRACEFUL);
> >             }
> >             if (csd != NULL) {
> >+                worker_thread_cnt--;
> >                 rv = ap_queue_push(worker_queue, csd, ptrans,
> >                                    &recycled_pool);
> >                 if (rv) {
> >@@ -852,6 +862,7 @@
> >
> >     while (!workers_may_exit) {
> >         ap_update_child_status_from_indexes(process_slot, thread_slot, SERVER_READY,
> >NULL);
> >+        worker_thread_cnt++;
> >         rv = ap_queue_pop(worker_queue, &csd, &ptrans, last_ptrans);
> >         last_ptrans = NULL;
> >
>
> Thanks for re-posting.  At first glance, what I like about this patch is
> that it creates a model in which the workers produce "tokens" and the
> listener consumes the tokens.  This removes some of the race conditions
> associated with my patch, in which the workers do both the increment and
> the decrement.
>
> However, there's a problem with this approach (aside from the whole issue
> of spinning around the (worker_thread_cnt != 0) check).  If the
> ap_queue_pop() function returns without actually acquiring a connection,
> you'll end up doubly incrementing the idle worker count (as the worker
> loop enters its next iteration).  This can happen if the cond_wait gets
> interrupted.  I believe Aaron has some test cases in which this has
> happened, and I saw what I think is this same phenomenon yesterday when
> debugging.
>
> --Brian

Yep this is a problem. I believe the pthread spec states that the cond_wait can be
spuriously triggered.

Bill



Re: [PATCH 2] worker MPM deadlock

Posted by Brian Pane <br...@cnet.com>.
Bill Stoddard wrote:

>I don't know if this will actually work, haven't fully considered all failure
>possibilities. I think there are better implementations of the basic idea. Need to use
>atomic increment/decrement and replace the yield() call with pthread_yield() or
>whatever...
>
>Bill
>
>===================================================================
>RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
>retrieving revision 1.117
>diff -u -r1.117 worker.c
>--- worker.c 18 Apr 2002 17:46:20 -0000 1.117
>+++ worker.c 26 Apr 2002 17:59:16 -0000
>@@ -156,6 +156,7 @@
>  */
>
> int ap_threads_per_child = 0;         /* Worker threads per child */
>+static int worker_thread_cnt = 0;
> static int ap_daemons_to_start = 0;
> static int min_spare_threads = 0;
> static int max_spare_threads = 0;
>@@ -693,6 +694,14 @@
>         }
>         if (listener_may_exit) break;
>
>+        /* If no worker threads are available, yield our quanta and try again
>+         * later
>+         */
>+        if (!worker_thread_cnt) {
>+            yield();
>+            continue;
>+        }
>+
>         if ((rv = SAFE_ACCEPT(apr_proc_mutex_lock(accept_mutex)))
>             != APR_SUCCESS) {
>             int level = APLOG_EMERG;
>@@ -791,6 +800,7 @@
>                 signal_threads(ST_GRACEFUL);
>             }
>             if (csd != NULL) {
>+                worker_thread_cnt--;
>                 rv = ap_queue_push(worker_queue, csd, ptrans,
>                                    &recycled_pool);
>                 if (rv) {
>@@ -852,6 +862,7 @@
>
>     while (!workers_may_exit) {
>         ap_update_child_status_from_indexes(process_slot, thread_slot, SERVER_READY,
>NULL);
>+        worker_thread_cnt++;
>         rv = ap_queue_pop(worker_queue, &csd, &ptrans, last_ptrans);
>         last_ptrans = NULL;
>

Thanks for re-posting.  At first glance, what I like about this patch is
that it creates a model in which the workers produce "tokens" and the
listener consumes the tokens.  This removes some of the race conditions
associated with my patch, in which the workers do both the increment and
the decrement.

However, there's a problem with this approach (aside from the whole issue
of spinning around the (worker_thread_cnt != 0) check).  If the
ap_queue_pop() function returns without actually acquiring a connection,
you'll end up doubly incrementing the idle worker count (as the worker
loop enters its next iteration).  This can happen if the cond_wait gets
interrupted.  I believe Aaron has some test cases in which this has
happened, and I saw what I think is this same phenomenon yesterday when
debugging.

--Brian



Re: [PATCH 2] worker MPM deadlock

Posted by Bill Stoddard <bi...@wstoddard.com>.
I don't know if this will actually work, haven't fully considered all failure
possibilities. I think there are better implementations of the basic idea. Need to use
atomic increment/decrement and replace the yield() call with pthread_yield() or
whatever...

Bill
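
A rough sketch of that direction -- using today's apr_atomic_*32 names,
which differ from the 2002-era APR atomic API, and apr_thread_yield() in
place of the bare yield(); announce_idle()/wait_for_idle_worker() are
illustrative helpers, not part of the patch below:

#include "apr_atomic.h"
#include "apr_thread_proc.h"

/* Plays the same role as worker_thread_cnt in the diff below. */
static volatile apr_uint32_t idle_workers = 0;

/* worker side: announce availability before blocking for work */
static void announce_idle(void)
{
    apr_atomic_inc32(&idle_workers);
}

/* listener side: back off until at least one worker is idle,
 * then consume one idle "token" before queueing a connection */
static void wait_for_idle_worker(void)
{
    while (apr_atomic_read32(&idle_workers) == 0) {
        apr_thread_yield();   /* hint only; may effectively spin */
    }
    apr_atomic_dec32(&idle_workers);
}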

===================================================================
RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
retrieving revision 1.117
diff -u -r1.117 worker.c
--- worker.c 18 Apr 2002 17:46:20 -0000 1.117
+++ worker.c 26 Apr 2002 17:59:16 -0000
@@ -156,6 +156,7 @@
  */

 int ap_threads_per_child = 0;         /* Worker threads per child */
+static int worker_thread_cnt = 0;
 static int ap_daemons_to_start = 0;
 static int min_spare_threads = 0;
 static int max_spare_threads = 0;
@@ -693,6 +694,14 @@
         }
         if (listener_may_exit) break;

+        /* If no worker threads are available, yield our quanta and try again
+         * later
+         */
+        if (!worker_thread_cnt) {
+            yield();
+            continue;
+        }
+
         if ((rv = SAFE_ACCEPT(apr_proc_mutex_lock(accept_mutex)))
             != APR_SUCCESS) {
             int level = APLOG_EMERG;
@@ -791,6 +800,7 @@
                 signal_threads(ST_GRACEFUL);
             }
             if (csd != NULL) {
+                worker_thread_cnt--;
                 rv = ap_queue_push(worker_queue, csd, ptrans,
                                    &recycled_pool);
                 if (rv) {
@@ -852,6 +862,7 @@

     while (!workers_may_exit) {
         ap_update_child_status_from_indexes(process_slot, thread_slot, SERVER_READY,
NULL);
+        worker_thread_cnt++;
         rv = ap_queue_pop(worker_queue, &csd, &ptrans, last_ptrans);
         last_ptrans = NULL;



----- Original Message -----
From: "Brian Pane" <br...@cnet.com>
To: <de...@httpd.apache.org>
Sent: Wednesday, May 22, 2002 11:52 AM
Subject: Re: [PATCH 2] worker MPM deadlock


> Bill Stoddard wrote:
>
> >Is there a unix equivalent to the Win32 Sleep(0) call? If so, then we can rip out all this
> >cruft and use the patch I posted earlier.
> >
>
> Can you repost your patch (or post a link to an archived copy)?
>
> Thanks,
> --Brian
>
>
>


Re: [PATCH 2] worker MPM deadlock

Posted by Brian Pane <br...@cnet.com>.
Bill Stoddard wrote:

>Is there a unix equivalent to the Win32 Sleep(0) call? If so, then we can rip out all this
>cruft and use the patch I posted earlier.
>

Can you repost your patch (or post a link to an archived copy)?

Thanks,
--Brian




Re: [PATCH 2] worker MPM deadlock

Posted by Bill Stoddard <bi...@wstoddard.com>.
Is there a unix equivalent to the Win32 Sleep(0) call? If so, then we can rip out all this
cruft and use the patch I posted earlier.

Bill

----- Original Message -----
From: "Brian Pane" <br...@cnet.com>
To: <de...@httpd.apache.org>
Sent: Tuesday, May 21, 2002 8:08 PM
Subject: [PATCH 2] worker MPM deadlock


> Here is an updated worker patch.  It fixes a race condition
> that the first patch didn't: it was possible for the listener
> thread to push multiple connections onto the fd queue between
> the cond_signal and the subsequent wakeup of the next worker.
> This caused problems because the idle worker count, which the
> listener used to decide when to accept more connections, didn't
> get decremented until the worker woke up.  Thus the listener
> could overflow the connection queue.
>
> The fix, as implemented in this new patch, is to make the listener
> block if either: 1) there are no idle workers, or 2) the queue is
> full.
>
> This patch isn't a complete fix, as there is now an error case
> in which the listener thread fails to exit during shutdown.  It
> needs some more testing and cleanup work.
>
> Meanwhile, I'm going to take another look at leader/follower,
> because IMHO the worker synchronization logic is getting far too
> complicated.
>
> --Brian
>
>


--------------------------------------------------------------------------------


> Index: server/mpm/worker/fdqueue.h
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/mpm/worker/fdqueue.h,v
> retrieving revision 1.19
> diff -u -r1.19 fdqueue.h
> --- server/mpm/worker/fdqueue.h 28 Apr 2002 23:12:35 -0000 1.19
> +++ server/mpm/worker/fdqueue.h 21 May 2002 23:57:16 -0000
> @@ -71,16 +71,6 @@
>  #endif
>  #include <apr_errno.h>
>
> -typedef struct fd_queue_info_t fd_queue_info_t;
> -
> -apr_status_t ap_queue_info_create(fd_queue_info_t **queue_info,
> -                                  apr_pool_t *pool, int max_idlers);
> -apr_status_t ap_queue_info_set_idle(fd_queue_info_t *queue_info,
> -                                    apr_pool_t *pool_to_recycle);
> -apr_status_t ap_queue_info_wait_for_idler(fd_queue_info_t *queue_info,
> -                                          apr_pool_t **recycled_pool);
> -apr_status_t ap_queue_info_term(fd_queue_info_t *queue_info);
> -
>  struct fd_queue_elem_t {
>      apr_socket_t      *sd;
>      apr_pool_t        *p;
> @@ -94,13 +84,19 @@
>      apr_thread_mutex_t *one_big_mutex;
>      apr_thread_cond_t  *not_empty;
>      int                 terminated;
> +    int                 idlers;
> +    apr_thread_cond_t  *idlers_available;
> +    apr_pool_t        **recycled_pools;
> +    int num_recycled;
>  };
>  typedef struct fd_queue_t fd_queue_t;
>
>  apr_status_t ap_queue_init(fd_queue_t *queue, int queue_capacity, apr_pool_t *a);
>  apr_status_t ap_queue_push(fd_queue_t *queue, apr_socket_t *sd, apr_pool_t *p);
> -apr_status_t ap_queue_pop(fd_queue_t *queue, apr_socket_t **sd, apr_pool_t **p);
> +apr_status_t ap_queue_pop(fd_queue_t *queue, apr_socket_t **sd, apr_pool_t **p,
> +                          apr_pool_t **recycled);
>  apr_status_t ap_queue_interrupt_all(fd_queue_t *queue);
> +apr_status_t ap_queue_wait_for_idler(fd_queue_t *queue, apr_pool_t **recycled);
>  apr_status_t ap_queue_term(fd_queue_t *queue);
>
>  #endif /* FDQUEUE_H */
> Index: server/mpm/worker/fdqueue.c
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/mpm/worker/fdqueue.c,v
> retrieving revision 1.22
> diff -u -r1.22 fdqueue.c
> --- server/mpm/worker/fdqueue.c 29 Apr 2002 01:57:39 -0000 1.22
> +++ server/mpm/worker/fdqueue.c 21 May 2002 23:57:16 -0000
> @@ -58,138 +58,6 @@
>
>  #include "fdqueue.h"
>
> -struct fd_queue_info_t {
> -    int idlers;
> -    apr_thread_mutex_t *idlers_mutex;
> -    apr_thread_cond_t *wait_for_idler;
> -    int terminated;
> -    int max_idlers;
> -    apr_pool_t        **recycled_pools;
> -    int num_recycled;
> -};
> -
> -static apr_status_t queue_info_cleanup(void *data_)
> -{
> -    fd_queue_info_t *qi = data_;
> -    int i;
> -    apr_thread_cond_destroy(qi->wait_for_idler);
> -    apr_thread_mutex_destroy(qi->idlers_mutex);
> -    for (i = 0; i < qi->num_recycled; i++) {
> -        apr_pool_destroy(qi->recycled_pools[i]);
> -    }
> -    return APR_SUCCESS;
> -}
> -
> -apr_status_t ap_queue_info_create(fd_queue_info_t **queue_info,
> -                                  apr_pool_t *pool, int max_idlers)
> -{
> -    apr_status_t rv;
> -    fd_queue_info_t *qi;
> -
> -    qi = apr_palloc(pool, sizeof(*qi));
> -    memset(qi, 0, sizeof(*qi));
> -
> -    rv = apr_thread_mutex_create(&qi->idlers_mutex, APR_THREAD_MUTEX_DEFAULT,
> -                                 pool);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    rv = apr_thread_cond_create(&qi->wait_for_idler, pool);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    qi->recycled_pools = (apr_pool_t **)apr_palloc(pool, max_idlers *
> -                                                   sizeof(apr_pool_t *));
> -    qi->num_recycled = 0;
> -    qi->max_idlers = max_idlers;
> -    apr_pool_cleanup_register(pool, qi, queue_info_cleanup,
> -                              apr_pool_cleanup_null);
> -
> -    *queue_info = qi;
> -
> -    return APR_SUCCESS;
> -}
> -
> -apr_status_t ap_queue_info_set_idle(fd_queue_info_t *queue_info,
> -                                    apr_pool_t *pool_to_recycle)
> -{
> -    apr_status_t rv;
> -    rv = apr_thread_mutex_lock(queue_info->idlers_mutex);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    AP_DEBUG_ASSERT(queue_info->idlers >= 0);
> -    AP_DEBUG_ASSERT(queue_info->num_recycled < queue_info->max_idlers);
> -    if (pool_to_recycle) {
> -        queue_info->recycled_pools[queue_info->num_recycled++] =
> -            pool_to_recycle;
> -    }
> -    if (queue_info->idlers++ == 0) {
> -        /* Only signal if we had no idlers before. */
> -        apr_thread_cond_signal(queue_info->wait_for_idler);
> -    }
> -    rv = apr_thread_mutex_unlock(queue_info->idlers_mutex);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    return APR_SUCCESS;
> -}
> -
> -apr_status_t ap_queue_info_wait_for_idler(fd_queue_info_t *queue_info,
> -                                          apr_pool_t **recycled_pool)
> -{
> -    apr_status_t rv;
> -    *recycled_pool = NULL;
> -    rv = apr_thread_mutex_lock(queue_info->idlers_mutex);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    AP_DEBUG_ASSERT(queue_info->idlers >= 0);
> -    while ((queue_info->idlers == 0) && (!queue_info->terminated)) {
> -        rv = apr_thread_cond_wait(queue_info->wait_for_idler,
> -                                  queue_info->idlers_mutex);
> -        if (rv != APR_SUCCESS) {
> -            apr_status_t rv2;
> -            rv2 = apr_thread_mutex_unlock(queue_info->idlers_mutex);
> -            if (rv2 != APR_SUCCESS) {
> -                return rv2;
> -            }
> -            return rv;
> -        }
> -    }
> -    queue_info->idlers--; /* Oh, and idler? Let's take 'em! */
> -    if (queue_info->num_recycled) {
> -        *recycled_pool =
> -            queue_info->recycled_pools[--queue_info->num_recycled];
> -    }
> -    rv = apr_thread_mutex_unlock(queue_info->idlers_mutex);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    else if (queue_info->terminated) {
> -        return APR_EOF;
> -    }
> -    else {
> -        return APR_SUCCESS;
> -    }
> -}
> -
> -apr_status_t ap_queue_info_term(fd_queue_info_t *queue_info)
> -{
> -    apr_status_t rv;
> -    rv = apr_thread_mutex_lock(queue_info->idlers_mutex);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    queue_info->terminated = 1;
> -    apr_thread_cond_broadcast(queue_info->wait_for_idler);
> -    rv = apr_thread_mutex_unlock(queue_info->idlers_mutex);
> -    if (rv != APR_SUCCESS) {
> -        return rv;
> -    }
> -    return APR_SUCCESS;
> -}
> -
>  /**
>   * Detects when the fd_queue_t is full. This utility function is expected
>   * to be called from within critical sections, and is not threadsafe.
> @@ -209,13 +77,17 @@
>  static apr_status_t ap_queue_destroy(void *data)
>  {
>      fd_queue_t *queue = data;
> +    int i;
>
>      /* Ignore errors here, we can't do anything about them anyway.
>       * XXX: We should at least try to signal an error here, it is
>       * indicative of a programmer error. -aaron */
>      apr_thread_cond_destroy(queue->not_empty);
> +    apr_thread_cond_destroy(queue->idlers_available);
>      apr_thread_mutex_destroy(queue->one_big_mutex);
> -
> +    for (i = 0; i < queue->num_recycled; i++) {
> +        apr_pool_destroy(queue->recycled_pools[i]);
> +    }
>      return APR_SUCCESS;
>  }
>
> @@ -234,15 +106,25 @@
>      if ((rv = apr_thread_cond_create(&queue->not_empty, a)) != APR_SUCCESS) {
>          return rv;
>      }
> +    if ((rv = apr_thread_cond_create(&queue->idlers_available, a)) !=
> +        APR_SUCCESS) {
> +        return rv;
> +    }
>
>      queue->data = apr_palloc(a, queue_capacity * sizeof(fd_queue_elem_t));
>      queue->bounds = queue_capacity;
>      queue->nelts = 0;
> +    queue->terminated = 0;
> +    queue->idlers = 0;
>
>      /* Set all the sockets in the queue to NULL */
>      for (i = 0; i < queue_capacity; ++i)
>          queue->data[i].sd = NULL;
>
> +    queue->recycled_pools = (apr_pool_t **)apr_palloc(a, queue->bounds *
> +                                                      sizeof(apr_pool_t *));
> +    queue->num_recycled = 0;
> +
>      apr_pool_cleanup_register(a, queue, ap_queue_destroy, apr_pool_cleanup_null);
>
>      return APR_SUCCESS;
> @@ -285,7 +167,8 @@
>   * Once retrieved, the socket is placed into the address specified by
>   * 'sd'.
>   */
> -apr_status_t ap_queue_pop(fd_queue_t *queue, apr_socket_t **sd, apr_pool_t **p)
> +apr_status_t ap_queue_pop(fd_queue_t *queue, apr_socket_t **sd, apr_pool_t **p,
> +                          apr_pool_t **recycled)
>  {
>      fd_queue_elem_t *elem;
>      apr_status_t rv;
> @@ -294,6 +177,21 @@
>          return rv;
>      }
>
> +    if (recycled && *recycled) {
> +        if (queue->num_recycled == queue->bounds) {
> +            apr_pool_destroy(*recycled);
> +        }
> +        else {
> +            queue->recycled_pools[queue->num_recycled++] = *recycled;
> +        }
> +        *recycled = NULL;
> +    }
> +
> +    queue->idlers++;
> +    if (queue->idlers == 1) {
> +        apr_thread_cond_signal(queue->idlers_available);
> +    }
> +
>      /* Keep waiting until we wake up and find that the queue is not empty. */
>      if (ap_queue_empty(queue)) {
>          if (!queue->terminated) {
> @@ -301,6 +199,7 @@
>          }
>          /* If we wake up and it's still empty, then we were interrupted */
>          if (ap_queue_empty(queue)) {
> +            queue->idlers--;
>              rv = apr_thread_mutex_unlock(queue->one_big_mutex);
>              if (rv != APR_SUCCESS) {
>                  return rv;
> @@ -312,7 +211,10 @@
>                  return APR_EINTR;
>              }
>          }
> -    }
> +    }
> +    else if (queue->nelts == queue->bounds) {
> +        apr_thread_cond_signal(queue->idlers_available);
> +    }
>
>      elem = &queue->data[--queue->nelts];
>      *sd = elem->sd;
> @@ -322,6 +224,7 @@
>      elem->p = NULL;
>  #endif /* AP_DEBUG */
>
> +    queue->idlers--;
>      rv = apr_thread_mutex_unlock(queue->one_big_mutex);
>      return rv;
>  }
> @@ -338,6 +241,28 @@
>          return rv;
>      }
>      return APR_SUCCESS;
> +}
> +
> +apr_status_t ap_queue_wait_for_idler(fd_queue_t *queue,
> +                                     apr_pool_t **recycled_pool)
> +{
> +    apr_status_t rv;
> +    if ((rv = apr_thread_mutex_lock(queue->one_big_mutex)) != APR_SUCCESS) {
> +        return rv;
> +    }
> +    while (!queue->terminated && (queue->idlers == 0 ||
> +                                  queue->nelts == queue->bounds)) {
> +        apr_thread_cond_wait(queue->idlers_available, queue->one_big_mutex);
> +    }
> +    if (recycled_pool && *recycled_pool) {
> +        if (queue->num_recycled > 0) {
> +            *recycled_pool = queue->recycled_pools[--queue->num_recycled];
> +        }
> +        else {
> +            *recycled_pool = NULL;
> +        }
> +    }
> +    return apr_thread_mutex_unlock(queue->one_big_mutex);
>  }
>
>  apr_status_t ap_queue_term(fd_queue_t *queue)
> Index: server/mpm/worker/worker.c
> ===================================================================
> RCS file: /home/cvs/httpd-2.0/server/mpm/worker/worker.c,v
> retrieving revision 1.124
> diff -u -r1.124 worker.c
> --- server/mpm/worker/worker.c 17 May 2002 11:11:39 -0000 1.124
> +++ server/mpm/worker/worker.c 21 May 2002 23:57:17 -0000
> @@ -173,7 +173,6 @@
>  static int num_listensocks = 0;
>  static int resource_shortage = 0;
>  static fd_queue_t *worker_queue;
> -static fd_queue_info_t *worker_queue_info;
>
>  /* The structure used to pass unique initialization info to each thread */
>  typedef struct {
> @@ -315,7 +314,6 @@
>      if (mode == ST_UNGRACEFUL) {
>          workers_may_exit = 1;
>          ap_queue_interrupt_all(worker_queue);
> -        ap_queue_info_term(worker_queue_info);
>          close_worker_sockets(); /* forcefully kill all current connections */
>      }
>  }
> @@ -711,8 +709,7 @@
>          }
>          if (listener_may_exit) break;
>
> -        rv = ap_queue_info_wait_for_idler(worker_queue_info,
> -                                          &recycled_pool);
> +        rv = ap_queue_wait_for_idler(worker_queue, &recycled_pool);
>          if (APR_STATUS_IS_EOF(rv)) {
>              break; /* we've been signaled to die now */
>          }
> @@ -796,6 +793,7 @@
>              }
>              else {
>                  ptrans = recycled_pool;
> +                recycled_pool = NULL;
>              }
>              apr_pool_tag(ptrans, "transaction");
>              rv = lr->accept_func(&csd, lr, ptrans);
> @@ -883,22 +881,13 @@
>      bucket_alloc = apr_bucket_alloc_create(apr_thread_pool_get(thd));
>
>      while (!workers_may_exit) {
> -        rv = ap_queue_info_set_idle(worker_queue_info, last_ptrans);
>          last_ptrans = NULL;
> -        if (rv != APR_SUCCESS) {
> -            ap_log_error(APLOG_MARK, APLOG_EMERG, rv, ap_server_conf,
> -                         "ap_queue_info_set_idle failed. Attempting to "
> -                         "shutdown process gracefully.");
> -            signal_threads(ST_GRACEFUL);
> -            break;
> -        }
> -
>          ap_update_child_status_from_indexes(process_slot, thread_slot, SERVER_READY,
NULL);
>  worker_pop:
>          if (workers_may_exit) {
>              break;
>          }
> -        rv = ap_queue_pop(worker_queue, &csd, &ptrans);
> +        rv = ap_queue_pop(worker_queue, &csd, &ptrans, &last_ptrans);
>
>          if (rv != APR_SUCCESS) {
>              /* We get APR_EOF during a graceful shutdown once all the connections
> @@ -1010,14 +999,6 @@
>      if (rv != APR_SUCCESS) {
>          ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf,
>                       "ap_queue_init() failed");
> -        clean_child_exit(APEXIT_CHILDFATAL);
> -    }
> -
> -    rv = ap_queue_info_create(&worker_queue_info, pchild,
> -                              ap_threads_per_child);
> -    if (rv != APR_SUCCESS) {
> -        ap_log_error(APLOG_MARK, APLOG_ALERT, rv, ap_server_conf,
> -                     "ap_queue_info_create() failed");
>          clean_child_exit(APEXIT_CHILDFATAL);
>      }
>
>


[PATCH 2] worker MPM deadlock

Posted by Brian Pane <br...@cnet.com>.
Here is an updated worker patch.  It fixes a race condition
that the first patch didn't: it was possible for the listener
thread to push multiple connections onto the fd queue between
the cond_signal and the subsequent wakeup of the next worker.
This caused problems because the idle worker count, which the
listener used to decide when to accept more connections, didn't
get decremented until the worker woke up.  Thus the listener
could overflow the connection queue.

The fix, as implemented in this new patch, is to make the listener
block if either: 1) there are no idle workers, or 2) the queue is
full.
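
The heart of that change is the listener-side wait in
ap_queue_wait_for_idler() -- a trimmed excerpt of the patch quoted earlier
in the thread, with the pool recycling and error paths omitted:

/* The listener blocks while there is no idle worker OR the queue is
 * already full, and continues only once both constraints have cleared. */
apr_status_t ap_queue_wait_for_idler(fd_queue_t *queue,
                                     apr_pool_t **recycled_pool)
{
    apr_status_t rv;

    if ((rv = apr_thread_mutex_lock(queue->one_big_mutex)) != APR_SUCCESS) {
        return rv;
    }
    while (!queue->terminated && (queue->idlers == 0 ||
                                  queue->nelts == queue->bounds)) {
        apr_thread_cond_wait(queue->idlers_available, queue->one_big_mutex);
    }
    /* (the full patch also hands a recycled pool back to the caller here) */
    return apr_thread_mutex_unlock(queue->one_big_mutex);
}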

This patch isn't a complete fix, as there is now an error case
in which the listener thread fails to exit during shutdown.  It
needs some more testing and cleanup work.

Meanwhile, I'm going to take another look at leader/follower,
because IMHO the worker synchronization logic is getting far too
complicated.

--Brian