Posted to dev@httpd.apache.org by Aaron Bannert <aa...@clove.org> on 2002/04/10 07:51:16 UTC

Re: [PATCH] convert worker MPM to leader/followers design

On Tue, Apr 09, 2002 at 10:51:29PM -0700, Brian Pane wrote:
> ...

Having not read or reviewed this patch, I must say that I'm -1
for replacing the worker MPM, but I'm fully in favor of adding
a new MPM with these new features.

That said, I'll review this patch tomorrow, :)
-aaron

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.
> On Thu, Apr 11, 2002 at 07:33:40AM -0700, Brian Pane wrote:
> > I disagree.  Unless someone wants to volunteer to put a workaround
> > in the current worker code to fix the queue-full case, I can't
> > rationalize including it in another beta or GA release.  We need
> > to either fix it or remove it.
> 
> I volunteer to fix it.
> 
> -aaron
> 

I actually like Brian's patch quite a bit.  I think it is the right approach.

Bill


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Aaron Bannert <aa...@clove.org>.
On Thu, Apr 11, 2002 at 05:19:36PM -0700, Roy T. Fielding wrote:
> It doesn't cause a ceiling -- it causes the M+1 request to go hang out in
> limbo when it could have been processed by the other child.  This isn't
> only likely, it will happen consistently on any server that handles more
> than M requests per second on a regular basis.  Any high-end site.

OK, I think I've just realized something. When I was working on the worker
MPM I tested all these kinds of scenarios, with both ab and flood and
I never ran into what we're seeing now. I have a theory:

Back then we had the POD and at least one listener, which essentially
caused us to never use S_L_U_A, meaning we always had an accept mutex.
Now that that's been corrected, the "we can accept more connections
than we can immediately process" problem is showing up because, as you
said (below), the OS is electing the accept() call from the same
process before migrating to another process.

A simple way to check whether this is true is to add a second Listen
directive and see if the problem goes away.
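
For example (the port numbers here are arbitrary):

    # httpd.conf: a second Listen defeats the single-listener
    # shortcut, so the accept mutex is back in play
    Listen 80
    Listen 8080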

[...]

> What do you mean by "worst case"?  That is almost every case.  You are
> forgetting that the child has LIFO characteristics, which means it will
> handle every request until the M+1 arrives.

Not exactly, not unless you're talking about the OS and not the worker
design. That is not what LIFO means to the worker. If you have an accept
mutex then the election over that mutex determines which child will
call accept.

> Furthermore, you are assuming
> that request arrival rates are normally distributed, which simply isn't
> the case.

I stated that assumption. I don't know what kind of distribution we're
getting, but I can assure you that when I did my load tests on the
worker MPM it was very nicely balanced between the child processes,
even in the extreme case (few ThreadsPerChild, many Children).

> What will happen in the real case is a single child will
> accept connections from a sequence of M/4 or so clients until its threads
> are busy, and the last client will have one of its requests stuck waiting
> for the other threads to be finished with serving a slow client or simply
> waiting in lingering close or simply writing the log file.  In other words,
> one in every M/4 clients will encounter two or three seconds of additional
> latency because the MPM is broken.

Was this by observation? Can you tell me what version this is? Did you
have S_L_U_A?

[...]

-aaron

RE: [PATCH] convert worker MPM to leader/followers design

Posted by Ryan Bloom <rb...@covalent.net>.
> From: Brian Pane [mailto:brian.pane@cnet.com]
> Justin Erenkrantz wrote:
> 
> >I'm not terribly sure how forking a new process can be worked
> >around - unless we use a spmt model (my favorite, but the
> >group doesn't like it because it isn't "robust").  Perhaps we
> >could do something where a process can dynamically add threads
> >as needed?  (worker has a fixed number of threads per process,
> >right?)
> >
> 
> I think the right combination may be fixed # of processes (greater
> than 1 for robustness) and variable # of threads per process.

That would be the definition of perchild.

Ryan




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
Justin Erenkrantz wrote:

>I'm not terribly sure how forking a new process can be worked
>around - unless we use a spmt model (my favorite, but the
>group doesn't like it because it isn't "robust").  Perhaps we
>could do something where a process can dynamically add threads
>as needed?  (worker has a fixed number of threads per process,
>right?)
>

I think the right combination may be fixed # of processes (greater
than 1 for robustness) and variable # of threads per process.

--Brian




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Justin Erenkrantz <je...@apache.org>.
On Thu, Apr 11, 2002 at 05:10:51PM -0700, Brian Pane wrote:
> Adding the mutex check will fix one of the three problems in worker that
> I know of.  The other two--the large-grained overhead of forking a new
> process to add concurrency, and the mutex contention on the queue--will
> require more radical solutions, so they're better explored in a new MPM.

AIUI, these two problems aren't going to impact the user as
drastically as an accept() when no threads are available.

I'm not terribly sure how forking a new process can be worked
around - unless we use a spmt model (my favorite, but the
group doesn't like it because it isn't "robust").  Perhaps we
could do something where a process can dynamically add threads
as needed?  (worker has a fixed number of threads per process,
right?)

Mutex contention is a fact of life.  I'd be curious to see
how we can reduce it, but I don't think we'll be able to
remove it entirely.

Correct first, fast second.  =)  -- justin

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
Justin Erenkrantz wrote:

>Can't we just add in the extra mutex check to worker and move on?
>(i.e. don't call accept() when the worker queue is empty).
>

+1

Adding the mutex check will fix one of the three problems in worker that
I know of.  The other two--the large-grained overhead of forking a new
process to add concurrency, and the mutex contention on the queue--will
require more radical solutions, so they're better explored in a new MPM.

--Brian




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Justin Erenkrantz <je...@apache.org>.
On Thu, Apr 11, 2002 at 04:57:23PM -0700, Brian Pane wrote:
> On the contrary, production servers sometimes have *huge* discontinuities
> in the number of concurrent connections.  Think about what happens to the
> connection rate at an online brokerage every day at the instant when the
> stock markets open, for example.  Or a news site when there's major breaking
> news.  Or a small site that's just been featured on Slashdot.
> 
> A server that can't handle discontinuities in request volume just isn't
> suitable for general-purpose use in the real world.

Can't we just add in the extra mutex check to worker and move on?
(i.e. don't call accept() when the worker queue is empty).
If the performance is proven to be detrimental, then we can
replace it with another model - only after that new MPM is
proven to be better in all aspects.  -- justin
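
A minimal sketch of the guard Justin describes, in APR terms -- the names
(idle_workers, queue_lock, not_busy) are invented for illustration, not
taken from the actual worker code:

    /* Sketch only: the listener refuses to accept until a worker is idle.
     * idle_workers, queue_lock, and not_busy are hypothetical names. */
    static apr_status_t guarded_accept(apr_socket_t **csd,
                                       apr_socket_t *lsd,
                                       apr_pool_t *ptrans)
    {
        apr_thread_mutex_lock(queue_lock);
        while (idle_workers == 0) {
            /* all workers busy: sleep instead of accepting a
             * connection we cannot service yet */
            apr_thread_cond_wait(not_busy, queue_lock);
        }
        --idle_workers;          /* reserve a worker for this connection */
        apr_thread_mutex_unlock(queue_lock);

        return apr_socket_accept(csd, lsd, ptrans);
    }

Each worker would increment idle_workers and signal not_busy as it
finishes a connection.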

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
Cliff Woolley wrote:

>On Thu, 11 Apr 2002, Aaron Bannert wrote:
>
>> Under typical conditions, long-running and short-running requests will
>> be distributed throughout the children. In order for this scenario to
>> occur, all M threads in a child would have to be in use by a long-lived
>> connection. Assuming a random distribution of these clients, I don't
>> see how this scenario can consistently occur except when all threads
>> across all children are already being occupied by long-lived connections.
>>
>
>Another thing of note is that this sort of problem will only happen (or
>rather, will only be severe) when the server goes instantaneously from x
>concurrent connections on average to x+y concurrent connections, where y
>is large, which is what's happening when you suddenly ab pound a server,
>because p_i_s_m doesn't have a chance to keep up.
>

So far, so good, but...

>Under production
>circumstances, the number of concurrent connections would tend to have
>less significant discontinuities.
>

On the contrary, production servers sometimes have *huge* discontinuities
in the number of concurrent connections.  Think about what happens to the
connection rate at an online brokerage every day at the instant when the
stock markets open, for example.  Or a news site when there's major breaking
news.  Or a small site that's just been featured on Slashdot.

A server that can't handle discontinuities in request volume just isn't
suitable for general-purpose use in the real world.

--Brian



Re: [PATCH] convert worker MPM to leader/followers design

Posted by Cliff Woolley <jw...@virginia.edu>.
On Fri, 12 Apr 2002 dirkx@covalent.net wrote:

> or the POS systems at the end of the day. Even the newspaper site I used
> to work for had regular peaks which were about 150-250x higher, during
> predictable 15-30 minute time slots, than the average - and the median
> was well below that.

I said _instantaneously_.  But fine....

> So no - you want to design for the capability to handle the worst cases -
> whilst ensuring it does a good job of the average case :-)

Of course.

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: [PATCH] convert worker MPM to leader/followers design

Posted by di...@covalent.net.
Whoa! That sort of situation is in my experience extremely common; e.g.
a URL flashed on TV or in an advert - or during a soap/talk show to 'vote'
or something. Bazillions of people on crappy modem links going online and
fetching too-big images, as the producers of the TV show think that you
reduce the size of an image by using <img width=10 height=10> but still
leave the SRC a 200k tiff.

Another area is corporate intranets, where people all log on during the
morning and need to fetch a 1Mb applet. Or brokerage/finance trading,
or the POS systems at the end of the day. Even the newspaper site I used
to work for had regular peaks which were about 150-250x higher, during
predictable 15-30 minute time slots, than the average - and the median
was well below that.

So no - you want to design for the capability to handle the worst cases -
whilst ensuring it does a good job of the average case :-)

Dw
-- 
Dirk-Willem van Gulik

On Thu, 11 Apr 2002, Cliff Woolley wrote:

> On Thu, 11 Apr 2002, Aaron Bannert wrote:
>
> >  Under typical conditions, long-running and short-running requests will
> >  be distributed throughout the children. In order for this scenario to
> >  occur, all M threads in a child would have to be in use by a long-lived
> >  connection. Assuming a random distribution of these clients, I don't
> >  see how this scenario can consistently occur except when all threads
> >  across all children are already being occupied by long-lived connections.
>
> Another thing of note is that this sort of problem will only happen (or
> rather, will only be severe) when the server goes instantaneously from x
> concurrent connections on average to x+y concurrent connections, where y
> is large, which is what's happening when you suddenly ab pound a server,
> because p_i_s_m doesn't have a chance to keep up.  Under production
> circumstances, the number of concurrent connections would tend to have
> less significant discontinuities.
>
> --Cliff
>
> --------------------------------------------------------------
>    Cliff Woolley
>    cliffwoolley@yahoo.com
>    Charlottesville, VA
>
>
>


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Cliff Woolley <jw...@virginia.edu>.
On Thu, 11 Apr 2002, Aaron Bannert wrote:

>  Under typical conditions, long-running and short-running requests will
>  be distributed throughout the children. In order for this scenario to
>  occur, all M threads in a child would have to be in use by a long-lived
>  connection. Assuming a random distribution of these clients, I don't
>  see how this scenario can consistently occur except when all threads
>  across all children are already being occupied by long-lived connections.

Another thing of note is that this sort of problem will only happen (or
rather, will only be severe) when the server goes instantaneously from x
concurrent connections on average to x+y concurrent connections, where y
is large, which is what's happening when you suddenly ab pound a server,
because p_i_s_m doesn't have a chance to keep up.  Under production
circumstances, the number of concurrent connections would tend to have
less significant discontinuities.

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: [PATCH] convert worker MPM to leader/followers design

Posted by "Roy T. Fielding" <fi...@apache.org>.
> I do not believe that we have a scalability problem in the worker MPM.
> I believe we have a scalability problem in our testing tool. I agree
> that there is a problem that can cause some new connections to appear
> to hang under certain unlikely conditions, but I do not believe this can
> cause the server to hang as a whole, nor do I believe that this problem
> can show up enough to cause a ceiling on concurrent request processing.

It doesn't cause a ceiling -- it causes the M+1 request to go hang out in
limbo when it could have been processed by the other child.  This isn't
only likely, it will happen consistently on any server that handles more
than M requests per second on a regular basis.  Any high-end site.

Go ahead and try it with flood -- it has nothing whatsoever to do with the
tool other than the fact that the tool is attempting to make simultaneous
connections very fast.  The same problem will be seen by a busy site trying
to serve many slow connections over time.  I'd show it to you on our own
apache.org server, but that one doesn't use worker.

> Since this is an important issue, and I do not want this to become a
> flame fest, I will describe what I think is happening here:
>
>  The worker MPM has N children.
>  Each child has M threads.
>  Each thread can handle exactly 1 concurrent request.
>
>  In the worst case imagine that M requests were all handled by the same
>  child, and that 1 additional request arrives and is to be handled by
>  that same child. In this case, that last request must wait for 1 of the
>  M busy threads to finish with a previous request before it can be processed.
>  The likelihood of this happening, however, is a function of the ability
>  of the accept mutex to deal with lock contention, and the number of
>  concurrent requests. In my opinion, this likelihood is very small,
>  so small that in normal testing I do not believe we will encounter
>  this scenario.

What do you mean by "worst case"?  That is almost every case.  You are
forgetting that the child has LIFO characteristics, which means it will
handle every request until the M+1 arrives.  Furthermore, you are assuming
that request arrival rates are normally distributed, which simply isn't
the case.  What will happen in the real case is a single child will
accept connections from a sequence of M/4 or so clients until its threads
are busy, and the last client will have one of its requests stuck waiting
for the other threads to be finished with serving a slow client or simply
waiting in lingering close or simply writing the log file.  In other words,
one in every M/4 clients will encounter two or three seconds of additional
latency because the MPM is broken.

On a server that is receiving 1000 reqs/sec with 25 threads/child, a child
will become impacted within 25 ms.  That means the 26th request is going
to be sitting around for at least

    (time required to close fastest connection) - 25ms.

which is a buttload of time in Web terms.  Furthermore, the actual
long-lived, multi-minute connections will pile up over time, reducing the
number of "short-timer" threads available on average, and thus increasing
the rate of impact on clients.

Brian is right -- the worker MPM must be fixed to not accept connections
when it has no available threads.

....Roy


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Aaron Bannert <aa...@clove.org>.
On Thu, Apr 11, 2002 at 03:27:23PM -0700, Roy T. Fielding wrote:
> >Ok, now we're on the same page. I see this as a problem as well, but I
> >don't think this is what is causing the problem described earlier in this
> >thread. Considering how unlikely it is that all of the threads on one
> >process are on long-lived connections, I don't see this as a critical
> >short-term problem. What is more likely is that 'ab', used to observe
> >this phenomenon, is flawed in a way that prevents it from truly testing the
> >concurrent processing capabilities of the worker MPM, when it is possible
> >for a request on a different socket to be returned sooner than another.
> >Flood would be much more appropriate for this kind of a test.
> 
> So, what you are saying is that it isn't common for Apache httpd to be used
> for sites that serve large images to people behind modems.  Right?  And
> therefore we shouldn't fix the only MPM that exists solely because sites
> that mostly serve large images to people behind modems didn't want the
> memory overhead of prefork.  Think about it.

I do not believe that we have a scalability problem in the worker MPM.
I believe we have a scalability problem in our testing tool. I agree
that there is a problem that can cause some new connections to appear
to hang under certain unlikely conditions, but I do not believe this can
cause the server to hang as a whole, nor do I believe that this problem
can show up enough to cause a ceiling on concurrent request processing.

Since this is an important issue, and I do not want this to become a
flame fest, I will describe what I think is happening here:

 The worker MPM has N children.
 Each child has M threads.
 Each thread can handle exactly 1 concurrent request.

 In the worst case imagine that M requests were all handled by the same
 child, and that 1 additional request arrives and is to be handled by
 that same child. In this case, that last request must wait for 1 of the M
 busy threads to finish with a previous request before it can be processed.
 The likelihood of this happening, however, is a function of the ability
 of the accept mutex to deal with lock contention, and the number of
 concurrent requests. In my opinion, this likelihood is very small,
 so small that in normal testing I do not believe we will encounter
 this scenario.

 Under typical conditions, long-running and short-running requests will
 be distributed throughout the children. In order for this scenario to
 occur, all M threads in a child would have to be in use by a long-lived
 connection. Assuming a random distribution of these clients, I don't
 see how this scenario can consistently occur except when all threads
 across all children are already being occupied by long-lived connections.


Please do not misunderstand me. I believe Brian has a good design for a
potential replacement of the worker MPM, but I do not believe that it is
right to change the current design. It will be much more appropriate to
create another MPM to implement and test this design, so that the group as
a whole can determine when it has become better than the worker.

-aaron

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
Brian Pane wrote:

> While flood definitely has more concurrency, I don't think the
> performance problem in this test case was ab's fault.  From what I was
> able to observe


It's also worth noting that the performance was fine when
I used ab with prefork and leader/follower; only with worker
did the performance problems appear.

--Brian




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
Aaron Bannert wrote:

>On Thu, Apr 11, 2002 at 02:09:27PM -0700, Brian Pane wrote:
>
>>The problem isn't that the busy worker threads will never become unbusy
>>and pick up new work from the queue.  If the queue is full, and the listener
>>is blocked, the listener will (with the current code) be properly awakened
>>as soon as one of the workers finishes its current connection.  The problem
>>is that it could be a long time before that happens, if all the workers are
>>handling very long-lived connections.  And meanwhile, there could be other
>>child procs with idle worker threads.
>>
>
>Ok, now we're on the same page. I see this as a problem as well, but I
>don't think this is what is causing the problem described earlier in this
>thread. Considering how unlikely it is that all of the threads on one
>process are on long-lived connections, I don't see this as a critical
>short-term problem. What is more likely is that 'ab', used to observe
>this phenomenon, is flawed in a way that prevents it from truly testing the
>concurrent processing capabilities of the worker MPM, when it is possible
>for a request on a different socket to be returned sooner than another.
>Flood would be much more appropriate for this kind of a test.
>

While flood definitely has more concurrency, I don't think the performance
problem in this test case was ab's fault.  From what I was able to observe
when I recreated Nick De Decker's original test case on Linux, I think the
two big factors contributing to the poor performance are:

  * Adding concurrency (to handle increasing traffic) is expensive
    with the worker design, as it requires creating another child proc
    and a whole bunch of threads.

  * And while this expensive operation is happening, the existing child
    processes, which by definition already have most of their worker
    threads busy, are blindly accepting and queuing more connections.
    And those connections pile up in the queue because the currently
    busy worker threads are fighting for CPU cycles with the startup
    of the new child process.  When the new child proc finally starts
    up, a lot of the connections that it ought to be handling are
    stuck in the queues of the existing children.

>>Which all leads me back to the conclusion that we need to stop managing a
>>separate queue within each child process.  (In the short term, though,
>>blocking the listener from doing an accept if the # of idle workers is
>>zero would be a reasonable alternative.)
>>
>
>At one point I had an alternative queue design for worker where slots
>in the queue represented an available worker, as opposed to now, when each
>represents an accepted connection. This way, when the queue is empty
>(or full, depending on how you want to look at it) there are no
>more immediately available worker threads.
>

Yep, your "time/space tradeoff" design is similar to leader/follower.
They're almost the same design, except that leader/follower keeps
a connection within the same thread that accepted it, to try to get
the context switch and a couple of mutex operations out of the critical
path of request processing.

--Brian



Re: [PATCH] convert worker MPM to leader/followers design

Posted by "Roy T. Fielding" <fi...@apache.org>.
> Ok, now we're on the same page. I see this as a problem as well, but I
> don't think this is what is causing the problem described earlier in this
> thread. Considering how unlikely it is that all of the threads on one
> process are on long-lived connections, I don't see this as a critical
> short-term problem. What is more likely is that 'ab', used to observe
> this phenomenon, is flawed in a way that prevents it from truly testing the
> concurrent processing capabilities of the worker MPM, when it is possible
> for a request on a different socket to be returned sooner than another.
> Flood would be much more appropriate for this kind of a test.

So, what you are saying is that it isn't common for Apache httpd to be used
for sites that serve large images to people behind modems.  Right?  And
therefore we shouldn't fix the only MPM that exists solely because sites
that mostly serve large images to people behind modems didn't want the
memory overhead of prefork.  Think about it.

....Roy


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Aaron Bannert <aa...@clove.org>.
On Thu, Apr 11, 2002 at 02:09:27PM -0700, Brian Pane wrote:
> The problem isn't that the busy worker threads will never become unbusy
> and pick up new work from the queue.  If the queue is full, and the listener
> is blocked, the listener will (with the current code) be properly awakened
> as soon as one of the workers finishes its current connection.  The problem
> is that it could be a long time before that happens, if all the workers are
> handling very long-lived connections.  And meanwhile, there could be other
> child procs with idle worker threads.

Ok, now we're on the same page. I see this as a problem as well, but I
don't think this is what is causing the problem described earlier in this
thread. Considering how unlikely it is that all of the threads on one
process are on long-lived connections, I don't see this as a critical
short-term problem. What is more likely is that 'ab', used to observe
this phenomenon, is flawed in a way that prevents it from truly testing the
concurrent processing capabilities of the worker MPM, when it is possible
for a request on a different socket to be returned sooner than another.
Flood would be much more appropriate for this kind of a test.

> Which all leads me back to the conclusion that we need to stop managing a
> separate queue within each child process.  (In the short term, though,
> blocking the listener from doing an accept if the # of idle workers is
> zero would be a reasonable alternative.)

At one point I had an alternative queue design for worker where slots
in the queue represented an available worker, as opposed to now, when each
represents an accepted connection. This way, when the queue is empty
(or full, depending on how you want to look at it) there are no
more immediately available worker threads.

-aaron
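
In outline, the design Aaron describes inverts the queue: its elements are
idle workers rather than accepted sockets. A rough sketch, with names
invented for the example:

    /* Illustrative only: slots represent idle workers, not connections. */
    typedef struct {
        int idle;                      /* workers currently ready for work */
        apr_thread_mutex_t *lock;
        apr_thread_cond_t  *has_idle;
    } worker_slots_t;

    /* worker: announce availability before waiting for a socket */
    static void slot_post(worker_slots_t *s)
    {
        apr_thread_mutex_lock(s->lock);
        s->idle++;
        apr_thread_cond_signal(s->has_idle);
        apr_thread_mutex_unlock(s->lock);
    }

    /* listener: consume a slot; blocks while no worker is available,
     * so accept() can never outrun the workers */
    static void slot_take(worker_slots_t *s)
    {
        apr_thread_mutex_lock(s->lock);
        while (s->idle == 0)
            apr_thread_cond_wait(s->has_idle, s->lock);
        s->idle--;
        apr_thread_mutex_unlock(s->lock);
    }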

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
Bill Stoddard wrote:

>
>>Right, the problem is that the listener needs to avoid doing the
>>accept if the queue is full.
>>
>>That part is easy: add a "block_if_queue_full()" method on the
>>fd_queue class, and have the listener call it before doing an
>>accept.
>>
>
>I am not an expert on the worker MPM but I don't think that is an accurate statement of
>the problem.  The accept thread uses ap_queue_push() to enqueue a socket for the worker
>threads. ap_queue_push() will block if the queue is full.
>

You're right.  The logic really needs to be: don't do the accept unless
there is an idle thread.

--Brian

>The problem is that the queue can be EMPTY even when all the worker threads are busy
>handling connections.  The way the code is today, the queue can hold ap_threads_per_child
>elements. Now consider 2 x ap_threads_per_child connections coming into the server at
>once. The first 1 x ap_threads_per_child connections will be accepted and handled by the
>worker threads. The next ap_threads_per_child connections will get queued and this is
>precisely the problem...
>
>Bill
>




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
>
>
>>Now I know I'm missing something here, so maybe you can fill in the
>>blanks for me. This doesn't seem like a problem that would hang the
>>entire server or put a hard limit on the number of concurrent connections
>>(across processes). I would expect a finishing worker thread to return
>>to the queue and immediately try to pick up an accepted and queued
>>connection, waking up the listener thread if it was waiting for a slot
>>to free up in the queue. Is this not the case? Where does the listener
>>thread block in such a way that it prevents other idle threads from
>>processing incoming requests?
>>
>>-aaron
>>
>
>That sounds like a bug in the worker queue code (preventing the accept thread from being
>awakened if the queue is 'almost' full...)
>
>Bill
>

The problem isn't that the busy worker threads will never become unbusy
and pick up new work from the queue.  If the queue is full, and the listener
is blocked, the listener will (with the current code) be properly awakened
as soon as one of the workers finishes its current connection.  The problem
is that it could be a long time before that happens, if all the workers are
handling very long-lived connections.  And meanwhile, there could be other
child procs with idle worker threads.

Which all leads me back to the conclusion that we need to stop managing a
separate queue within each child process.  (In the short term, though,
blocking the listener from doing an accept if the # of idle workers is
zero would be a reasonable alternative.)

--Brian




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.
> On Thu, Apr 11, 2002 at 03:04:27PM -0400, Bill Stoddard wrote:
> > I am not an expert on the worker MPM but I don't think that is an
> > accurate statement of the problem.  The accept thread uses
> > ap_queue_push() to enqueue a socket for the worker threads.
> > ap_queue_push() will block if the queue is full.
> >
> > The problem is that the queue can be EMPTY even when all the worker
> > threads are busy handling connections.  The way the code is today, the
> > queue can hold ap_threads_per_child elements. Now consider
> > 2 x ap_threads_per_child connections coming into the server at once.
> > The first 1 x ap_threads_per_child connections will be accepted and
> > handled by the worker threads. The next ap_threads_per_child
> > connections will get queued and this is precisely the problem...
>
> Now I know I'm missing something here, so maybe you can fill in the
> blanks for me. This doesn't seem like a problem that would hang the
> entire server or put a hard limit on the number of concurrent connections
> (across processes). I would expect a finishing worker thread to return
> to the queue and immediately try to pick up an accepted and queued
> connection, waking up the listener thread if it was waiting for a slot
> to free up in the queue. Is this not the case? Where does the listener
> thread block in such a way that it prevents other idle threads from
> processing incoming requests?
>
> -aaron

That sounds like a bug in the worker queue code (preventing the accept thread from being
awakened if the queue is 'almost' full...)

Bill


RE: [PATCH] convert worker MPM to leader/followers design

Posted by Cliff Woolley <jw...@virginia.edu>.
On Thu, 11 Apr 2002, Ryan Bloom wrote:

> > > And you can always play some games with the counters to enable you
> > > to accept a few additional connections (however you define 'few')
> > > in order to keep some work in the queue.
> >
> > It's just hard to think about what "few" should be given that there
> > is always a spastic case (all threads handling large DAV transactions
> > from deep space probes).  But in practice the normal definition of
> > "few" should be fine :)
>
> I was thinking that the definition of "few" could be a directive.


My take on it is that any non-zero fixed number, configurable or not, is
just as wrong as any other, in the sense that it does not eliminate the
problem, it just changes how many clients can be affected by it.

That said, I think the idea of shortening the queue would help.  It most
likely doesn't need to be as big as the number of worker threads.
However, it's hard to say what "the right length" is, because that number
changes dynamically.  Ideally, the listener thread would be able to tell
how long its workers spent on requests on average, and adjust the queue
length dynamically in a sort of feedback loop.

That way the listener thread will always have work ready for the
workers as soon after they finish the last one as possible
(that's the point of the per-process queue as I understood it), and it
allows listeners who know that their workers are all tied up to avoid
shanghaiing more clients than they expect to be able to deal with in the
near future.

Just a thought.

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA
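
One possible shape for the feedback loop Cliff sketches above, with
invented names and arbitrary constants (nothing here is from the worker
code):

    /* Sketch: derive a queue-length target from a smoothed average of
     * recent request service times. */
    static int tune_queue_len(double *avg_ms, double last_ms, int nthreads)
    {
        /* exponentially weighted moving average of service time */
        *avg_ms = 0.9 * *avg_ms + 0.1 * last_ms;

        /* fast requests -> deeper queue so workers never starve;
         * slow requests -> shallow queue so the listener doesn't
         * shanghai clients it can't serve soon */
        int target = (int)(nthreads * 10.0 / (*avg_ms + 10.0));
        return target < 1 ? 1 : target;
    }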



RE: [PATCH] convert worker MPM to leader/followers design

Posted by Ryan Bloom <rb...@covalent.net>.
> > > But this bites
> > >
> > > 1) when there is just one child process (wasted syscall)
> > >
> > > 2) because it would normally go faster if the listener could stay just
> > >    ahead of the workers so that workers can dequeue new work when they
> > >    finish with the old without having to wait on the listener to
> > >    enqueue something
>
> > Sounds like a reasonable solution. I doubt you could measure the
> > performance difference due to a wasted syscall.  And you can always
> > play some games with the counters to enable you to accept a few
> > additional connections (however you define 'few') in order to keep
> > some work in the queue.
> 
> agreed...
> 
> It's just hard to think about what "few" should be given that there
> is always a spastic case (all threads handling large DAV transactions
> from deep space probes).
> 
> But in practice the normal definition of "few" should be fine :)

I was thinking that the definition of "few" could be a directive.

Ryan



Re: [PATCH] convert worker MPM to leader/followers design

Posted by Jeff Trawick <tr...@attglobal.net>.
"Bill Stoddard" <bi...@wstoddard.com> writes:

> > But this bites
> >
> > 1) when there is just one child process (wasted syscall)
> >
> > 2) because it would normally go faster if the listener could stay just
> >    ahead of the workers so that workers can dequeue new work when they
> >    finish with the old without having to wait on the listener to
> >    enqueue something

> Sounds like a reasonable solution. I doubt you could measure the performance difference
> due to a wasted syscall.  And you can always play some games with the counters to enable
> you to accept a few additional connections (however you define 'few') in order to keep
> some work in the queue.

agreed...

It's just hard to think about what "few" should be given that there
is always a spastic case (all threads handling large DAV transactions
from deep space probes).

But in practice the normal definition of "few" should be fine :)

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.
> "Bill Stoddard" <bi...@wstoddard.com> writes:
>
> > > Right, the problem is that the listener needs to avoid doing the
> > > accept if the queue is full.
> > >
> > > That part is easy: add a "block_if_queue_full()" method on the
> > > fd_queue class, and have the listener call it before doing an
> > > accept.
> > >
> >
> > I am not an expert on the worker MPM but I don't think that is an
> > accurate statement of the problem.  The accept thread uses
> > ap_queue_push() to enqueue a socket for the worker threads.
> > ap_queue_push() will block if the queue is full.
> >
> > The problem is that the queue can be EMPTY even when all the worker
> > threads are busy handling connections.  The way the code is today, the
> > queue can hold ap_threads_per_child elements. Now consider
> > 2 x ap_threads_per_child connections coming into the server at once.
> > The first 1 x ap_threads_per_child connections will be accepted and
> > handled by the worker threads. The next ap_threads_per_child
> > connections will get queued and this is precisely the problem...
>
> That is the problem I thought was being dealt with originally...
> After seeing it stated differently a couple of times I figured I was confused.
>
> probably-stupid idea:
>
> Maybe a counter of blocked threads can be cheaply maintained within the
> existing serialization.  The listener can use this to decide if it
> should call accept() or give up the CPU.
>
> But this bites
>
> 1) when there is just one child process (wasted syscall)
>
> 2) because it would normally go faster if the listener could stay just
>    ahead of the workers so that workers can dequeue new work when they
>    finish with the old without having to wait on the listener to
>    enqueue something
>
> --
> Jeff Trawick | trawick@attglobal.net


Sounds like a reasonable solution. I doubt you could measure the performance difference
due to a wasted syscall.  And you can always play some games with the counters to enable
you to accept a few additional connections (however you define 'few') in order to keep
some work in the queue.

Bill


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Jeff Trawick <tr...@attglobal.net>.
"Bill Stoddard" <bi...@wstoddard.com> writes:

> > Right, the problem is that the listener needs to avoid doing the
> > accept if the queue is full.
> >
> > That part is easy: add a "block_if_queue_full()" method on the
> > fd_queue class, and have the listener call it before doing an
> > accept.
> >
> 
> I am not an expert on the worker MPM but I don't think that is an accurate statement of
> the problem.  The accept thread uses ap_queue_push() to enqueue a socket for the worker
> threads. ap_queue_push() will block if the queue is full.
> 
> The problem is that the queue can be EMPTY even when all the worker threads are busy
> handling connections.  The way the code is today, the queue can hold ap_threads_per_child
> elements. Now consider 2 x ap_threads_per_child connections coming into the server at
> once. The first 1 x ap_threads_per_child connections will be accepted and handled by the
> worker threads. The next ap_threads_per_child connections will get queued and this is
> precisely the problem...

That is the problem I thought was being dealt with originally...
After seeing it stated differently a couple of times I figured I was confused.

probably-stupid idea: 

Maybe a counter of blocked threads can be cheaply maintained within the
existing serialization.  The listener can use this to decide if it
should call accept() or give up the CPU.

But this bites

1) when there is just one child process (wasted syscall)

2) because it would normally go faster if the listener could stay just
   ahead of the workers so that workers can dequeue new work when they
   finish with the old without having to wait on the listener to
   enqueue something

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...
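
In code, the counter idea above might look something like this (sketch
only; blocked_workers is hypothetical, maintained under the queue's
existing mutex, and the ap_queue_push() arguments are guessed from
context):

    /* listener loop body: accept only when a worker is known to be
     * blocked waiting for work, otherwise give up the CPU */
    if (blocked_workers > 0) {
        rv = apr_socket_accept(&csd, lsd, ptrans);
        if (rv == APR_SUCCESS)
            ap_queue_push(worker_queue, csd, ptrans);
    }
    else {
        apr_thread_yield();   /* point 1 above: a wasted syscall when
                               * there is just one child process */
    }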

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Aaron Bannert <aa...@clove.org>.
On Thu, Apr 11, 2002 at 03:04:27PM -0400, Bill Stoddard wrote:
> I am not an expert on the worker MPM but I don't think that is an accurate statement of
> the problem.  The accept thread uses ap_queue_push() to enqueue a socket for the worker
> threads. ap_queue_push() will block if the queue is full.
> 
> The problem is that the queue can be EMPTY even when all the worker threads are busy
> handling connections.  The way the code is today, the queue can hold ap_threads_per_child
> elements. Now consider 2 x ap_threads_per_child connections coming into the server at
> once. The first 1 x ap_threads_per_child connections will be accepted and handled by the
> worker threads. The next ap_threads_per_child connections will get queued and this is
> precisely the problem...

Now I know I'm missing something here, so maybe you can fill in the
blanks for me. This doesn't seem like a problem that would hang the
entire server or put a hard limit on the number of concurrent connections
(across processes). I would expect a finishing worker thread to return
to the queue and immediately try to pick up an accepted and queued
connection, waking up the listener thread if it was waiting for a slot
to free up in the queue. Is this not the case? Where does the listener
thread block in such a way that it prevents other idle threads from
processing incoming requests?

-aaron

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.

> Right, the problem is that the listener needs to avoid doing the
> accept if the queue is full.
>
> That part is easy: add a "block_if_queue_full()" method on the
> fd_queue class, and have the listener call it before doing an
> accept.
>

I am not an expert on the worker MPM but I don't think that is an accurate statement of
the problem.  The accept thread uses ap_queue_push() to enqueue a socket for the worker
threads. ap_queue_push() will block if the queue is full.

The problem is that the queue can be EMPTY even when all the worker threads are busy
handling connections.  The way the code is today, the queue can hold ap_threads_per_child
elements. Now consider 2 x ap_threads_per_child connections coming into the server at
once. The first 1 x ap_threads_per_child connections will be accepted and handled by the
worker threads. The next ap_threads_per_child connections will get queued and this is
precisely the problem...

Bill



Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <br...@cnet.com>.
Right, the problem is that the listener needs to avoid doing the
accept if the queue is full.

That part is easy: add a "block_if_queue_full()" method on the
fd_queue class, and have the listener call it before doing an
accept.

But there are two negative side-effects:

* More contention on a global mutex.  (This one, IMHO, is bad
  but not a showstopper.  The worker design already has way too
  much mutex contention, and the real solution is to move on to
  a new MPM design, so for now I won't object to adding one more
  lock/unlock pair.)

* If there's anything in the shutdown/restart logic that assumes
  that the listener will always do an accept in a timely manner,
  that code will break in cases where the listener is blocked
  waiting for the queue to become non-full.  (I'm not sure if
  this one is a problem in reality or not.)

--Brian

Bill Stoddard wrote:

>Nope. I take that back. This would not fix anything...
>
>Bill
>
>>A quick and dirty hack that would fix this problem is to reduce the queue size to
>>ThreadsPerChild/2.
>>
>>Bill
>>
>>
>>>On Thu, Apr 11, 2002 at 07:33:40AM -0700, Brian Pane wrote:
>>>
>>>>I disagree.  Unless someone wants to volunteer to put a workaround
>>>>in the current worker code to fix the queue-full case, I can't
>>>>rationalize including it in another beta or GA release.  We need
>>>>to either fix it or remove it.
>>>>
>>>I volunteer to fix it.
>>>
>>>-aaron
>>>
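
Guessing at the fd_queue internals, the block_if_queue_full() Brian
proposes above might be as small as this (the field names nelts, bounds,
one_big_mutex, and not_full are assumptions, not the real struct):

    /* Sketch: park the listener while the queue is at capacity. */
    apr_status_t ap_queue_block_if_full(fd_queue_t *queue)
    {
        apr_thread_mutex_lock(queue->one_big_mutex);
        while (queue->nelts >= queue->bounds) {
            apr_thread_cond_wait(queue->not_full, queue->one_big_mutex);
        }
        apr_thread_mutex_unlock(queue->one_big_mutex);
        /* advisory only: the queue can fill again between this check
         * and the subsequent accept/push */
        return APR_SUCCESS;
    }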




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.
Nope. I take that back. This would not fix anything...

Bill

> A quick and dirty hack that would fix this problem is to reduce the queue size to
> ThreadsPerChild/2.
> 
> Bill
> 
> 
> > On Thu, Apr 11, 2002 at 07:33:40AM -0700, Brian Pane wrote:
> > > I disagree.  Unless someone wants to volunteer to put a workaround
> > > in the current worker code to fix the queue-full case, I can't
> > > rationalize including it in another beta or GA release.  We need
> > > to either fix it or remove it.
> >
> > I volunteer to fix it.
> >
> > -aaron
> >
> 


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.
A quick and dirty hack that would fix this problem is to reduce the queue size to
ThreadsPerChild/2.

Bill


> On Thu, Apr 11, 2002 at 07:33:40AM -0700, Brian Pane wrote:
> > I disagree.  Unless someone wants to volunteer to put a workaround
> > in the current worker code to fix the queue-full case, I can't
> > rationalize including it in another beta or GA release.  We need
> > to either fix it or remove it.
>
> I volunteer to fix it.
>
> -aaron
>


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Aaron Bannert <aa...@clove.org>.
On Thu, Apr 11, 2002 at 07:33:40AM -0700, Brian Pane wrote:
> I disagree.  Unless someone wants to volunteer to put a workaround
> in the current worker code to fix the queue-full case, I can't
> rationalize including it in another beta or GA release.  We need
> to either fix it or remove it.

I volunteer to fix it.

-aaron

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.

> "Bill Stoddard" <bi...@wstoddard.com> writes:
>
> > > -----
> > > > Jeff Trawick wrote:
> > > >
> > > > >Aaron Bannert <aa...@clove.org> writes:
> > > > >
> > > > >>I appreciate that you are trying to moderate my usage of the -1 (veto),
> > > > >>but feel it is my duty to inform the list as soon as possible that I
> > > > >>wouldn't be happy with this big of a change.
> > > > >>
> > > > >
> > > > >Maybe a veto wasn't appropriate, but it seemed clear to me that a
> > > > >number of people would rather have the new code side by side with the
> > > > >old for a while, and Brian didn't seem disturbed about that notion at
> > > > >all.  Hopefully nobody will actually go remove worker as soon as
> > > > >Brian's code is committed :)
> > > > >
> > > >
> > > > I disagree.  Unless someone wants to volunteer to put a workaround
> > > > in the current worker code to fix the queue-full case, I can't
> > > > rationalize including it in another beta or GA release.  We need
> > > > to either fix it or remove it.
> > > >
> > > > --Brian
> > >
> > > I am in grudging agreement with Brian. I don't have a lot of time to
> > > spend on this, but I'll take a look at making worker behave more like
> > > mpm_winnt, which should fix the problem.
> > >
> > > Bill
> >
> > Before anyone goes weird on me for that last comment, I am specifically
> > referring to how mpm_winnt manages the queue between the accept and
> > worker threads.  Not planning on changing anything else about how
> > worker operates other than how the accept/worker queue is managed.
>
> I'll be interested in exactly what you want to change.

The basic problem is that you want to not allow the worker to accept more connections
than it has threads (i.e., a worker should never have more than ThreadsPerChild active
connections).  One way of doing this is to use Brian's approach.  It is pretty invasive,
though, and would likely take a while to drive out the bugs. There are much easier
solutions.

The problem with worker is that fd_queue_elem_t's are recycled too quickly.  They should
not be added back to the queue (and made available to the accept thread) until the worker
thread dispatched by the element is finished with the connection. (mpm_winnt does this
correctly with a queue element called a "completion_context".) In mpm_winnt, the accept
thread does the equivalent of an ap_queue_pop to get a queue element. If one is not
available, then the accept thread yields its time slice.  The workers do the equivalent of
an ap_queue_push at the end of handling a connection to make more queue elements
available.

Bill
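
Roughly, the recycling discipline Bill describes (all names here are
illustrative; this is not the mpm_winnt source):

    /* accept thread: a free context is the ticket to accept;
     * no free context means every worker is still busy */
    for (;;) {
        ctx = context_pop(free_list);        /* NULL when none free */
        if (ctx == NULL) {
            apr_thread_yield();              /* let the workers run */
            continue;
        }
        apr_socket_accept(&ctx->sock, lsd, ctx->ptrans);
        dispatch_to_worker(ctx);
    }

    /* worker thread, once the connection is completely finished: */
    apr_pool_clear(ctx->ptrans);             /* recycle the pool */
    context_push(free_list, ctx);            /* ticket back in play */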


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Jeff Trawick <tr...@attglobal.net>.
"Bill Stoddard" <bi...@wstoddard.com> writes:

> > -----
> > > Jeff Trawick wrote:
> > >
> > > >Aaron Bannert <aa...@clove.org> writes:
> > > >
> > > >>I appreciate that you are trying to moderate my usage of the -1 (veto),
> > > >>but feel it is my duty to inform the list as soon as possible that I
> > > >>wouldn't be happy with this big of a change.
> > > >>
> > > >
> > > >Maybe a veto wasn't appropriate, but it seemed clear to me that a
> > > >number of people would rather have the new code side by side with the
> > > >old for a while, and Brian didn't seem disturbed about that notion at
> > > >all.  Hopefully nobody will actually go remove worker as soon as
> > > >Brian's code is committed :)
> > > >
> > >
> > > I disagree.  Unless someone wants to volunteer to put a workaround
> > > in the current worker code to fix the queue-full case, I can't
> > > rationalize including it in another beta or GA release.  We need
> > > to either fix it or remove it.
> > >
> > > --Brian
> >
> > I am in grudging agreement with Brian. I don't have a lot of time to spend on this, but
> > I'll take a look at making worker behave more like mpm_winnt, which should fix the
> > problem.
> >
> > Bill
> 
> Before anyone goes weird on me for that last comment, I am specifically referring to how
> mpm_winnt manages the queue between the accept and worker threads.  Not planning on
> changing anything else about how worker operates other than how the accept/worker queue is
> managed.

I'll be interested in exactly what you want to change.  I'd be shocked
if this problem can't be solved with nothing more than a tweak to the design.
Graceful termination (after user intervention or maxrequests or some
syscall failure) is built around the current queue handling, so it is
valuable to avoid changing the current code too much, particularly if
people have ideas about a better overall design which could be ready
before too long.

We have this recurring problem with the MPMs where somebody wants to
switch out the design for some MPM function because the new design
supposedly doesn't have some problem that the current design has.
Just about every time, new problems are introduced and not resolved, and
whoever did the design switch-out is uninterested or busy.

Something else I "go weird on" is the idea that worker isn't good
enough to go in another beta or GA release.  It is in pretty decent
shape right now.  prefork is the default MPM anyway.  Look at
perchild, for crying out loud.  And how many people read and work on
PRs? There's stuff that doesn't work right with any MPM.  Should that
function be removed before another beta or GA release unless the PR is
resolved?  (I don't think so.)

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.
> -----
> > Jeff Trawick wrote:
> >
> > >Aaron Bannert <aa...@clove.org> writes:
> > >
> > >>I appreciate that you are trying to moderate my usage of the -1 (veto),
> > >>but feel it is my duty to inform the list as soon as possible that I
> > >>wouldn't be happy with this big of a change.
> > >>
> > >
> > >Maybe a veto wasn't appropriate, but it seemed clear to me that a
> > >number of people would rather have the new code side by side with the
> > >old for a while, and Brian didn't seem disturbed about that notion at
> > >all.  Hopefully nobody will actually go remove worker as soon as
> > >Brian's code is committed :)
> > >
> >
> > I disagree.  Unless someone wants to volunteer to put a workaround
> > in the current worker code to fix the queue-full case, I can't
> > rationalize including it in another beta or GA release.  We need
> > to either fix it or remove it.
> >
> > --Brian
>
> I am in grudging agreement with Brian. I don't have a lot of time to spend on this, but
> I'll take a look at making worker behave more like mpm_winnt, which should fix the
> problem.
>
> Bill

Before anyone goes weird on me for that last comment, I am specifically referring to how
mpm_winnt manages the queue between the accept and worker threads.  Not planning on
changing anything else about how worker operates other than how the accept/worker queue is
managed.

Bill


Re: [PATCH] convert worker MPM to leader/followers design

Posted by Bill Stoddard <bi...@wstoddard.com>.
-----
> Jeff Trawick wrote:
>
> >Aaron Bannert <aa...@clove.org> writes:
> >
> >>I appreciate that you are trying to moderate my usage of the -1 (veto),
> >>but feel it is my duty to inform the list as soon as possible that I
> >>wouldn't be happy with this big of a change.
> >>
> >
> >Maybe a veto wasn't appropriate, but it seemed clear to me that a
> >number of people would rather have the new code side by side with the
> >old for a while, and Brian didn't seem disturbed about that notion at
> >all.  Hopefully nobody will actually go remove worker as soon as
> >Brian's code is committed :)
> >
>
> I disagree.  Unless someone wants to volunteer to put a workaround
> in the current worker code to fix the queue-full case, I can't
> rationalize including it in another beta or GA release.  We need
> to either fix it or remove it.
>
> --Brian

I am in grudging agreement with Brian. I don't have a lot of time to spend on this, but
I'll take a look at making worker behave more like mpm_winnt, which should fix the
problem.

Bill



Re: [PATCH] convert worker MPM to leader/followers design

Posted by Brian Pane <bp...@pacbell.net>.
Jeff Trawick wrote:

>Aaron Bannert <aa...@clove.org> writes:
>
>>I appreciate that you are trying to moderate my usage of the -1 (veto),
>>but feel it is my duty to inform the list as soon as possible that I
>>wouldn't be happy with this big of a change.
>>
>
>Maybe a veto wasn't appropriate, but it seemed clear to me that a
>number of people would rather have the new code side by side with the
>old for a while, and Brian didn't seem disturbed about that notion at
>all.  Hopefully nobody will actually go remove worker as soon as
>Brian's code is committed :)
>

I disagree.  Unless someone wants to volunteer to put a workaround
in the current worker code to fix the queue-full case, I can't
rationalize including it in another beta or GA release.  We need
to either fix it or remove it.

--Brian




Re: [PATCH] convert worker MPM to leader/followers design

Posted by Jeff Trawick <tr...@attglobal.net>.
Aaron Bannert <aa...@clove.org> writes:

> I appreciate that you are trying to moderate my usage of the -1 (veto),
> but feel it is my duty to inform the list as soon as possible that I
> wouldn't be happy with this big of a change.

Maybe a veto wasn't appropriate, but it seemed clear to me that a
number of people would rather have the new code side by side with the
old for a while, and Brian didn't seem disturbed about that notion at
all.  Hopefully nobody will actually go remove worker as soon as
Brian's code is committed :)

-- 
Jeff Trawick | trawick@attglobal.net
Born in Roswell... married an alien...

Re: [PATCH] convert worker MPM to leader/followers design

Posted by Aaron Bannert <aa...@clove.org>.
On Wed, Apr 10, 2002 at 02:03:23PM -0700, Roy T. Fielding wrote:
> What Brian describes is a fatal flaw in the current Worker MPM code.
> In order to veto his change, you will be expected to review the code
> first and then propose a better solution.  Until then, choose a better
> way than -1 to make reservations.
> 
> If Brian needs to create a new MPM to fix this kind of problem, then
> I will remove the worker MPM from the code base as soon as the new one
> is committed.  I don't believe in leaving code in httpd that is not
> robust enough for normal use.

The particular problem that Brian is describing (hanging when one process'
worker-queue has filled) was a case that I specifically tested for and
did not encounter back when I was working in that code. I have not yet
had time to review the current state of the worker MPM, but I believe
that it is something that can be corrected in the current model. My
veto was intended to imply that I do not want the model of the worker
MPM changed, but that I am totally in favor of this new model and would
support a new MPM to implement it.

I appreciate that you are trying to moderate my usage of the -1 (veto),
but feel it is my duty to inform the list as soon as possible that I
wouldn't be happy with this big of a change.

-aaron

Re: [PATCH] convert worker MPM to leader/followers design

Posted by "Roy T. Fielding" <fi...@apache.org>.
What Brian describes is a fatal flaw in the current Worker MPM code.
In order to veto his change, you will be expected to review the code
first and then propose a better solution.  Until then, choose a better
way than -1 to make reservations.

If Brian needs to create a new MPM to fix this kind of problem, then
I will remove the worker MPM from the code base as soon as the new one
is committed.  I don't believe in leaving code in httpd that is not
robust enough for normal use.

....Roy

On Tuesday, April 9, 2002, at 10:51  PM, Aaron Bannert wrote:

> On Tue, Apr 09, 2002 at 10:51:29PM -0700, Brian Pane wrote:
>> ...
>
> Having not read or reviewed this patch, I must say that I'm -1
> for replacing the worker MPM, but I'm fully in favor of adding
> a new MPM with these new features.
>
> That said, I'll review this patch tomorrow, :)
> -aaron
>