Posted to dev@httpd.apache.org by Jeff Trawick <tr...@attglobal.net> on 2001/06/20 15:11:22 UTC

pod+connect doesn't work with threaded idle server maintenance

Please tell me how I'm confused...

If we want to take out a single server process to get rid of some
threads because we're now idle, we want to do this gracefully so we
don't trash any current connections.

Assume for this scenario:

  threaded + the new pod+connect design and
  SINGLE_LISTEN_UNSERIALIZED_ACCEPT

  process A
    twenty threads in accept()
    twenty threads processing connections

  process B
    twenty threads in accept()
    twenty threads processing connections

  process C
    twenty threads in accept()
    twenty threads processing connections

We want to get rid of a whole process.  

If we write one char to the pod and connect() one time, we tell one
process, call it process A, to go away but we leave 19 other threads
in process A stranded in accept().  Processes B and C are unaffected
so far.

If we write to the pod and connect() again in hopes of waking up one
of the 19 threads stranded in process A, we likely will tell process B
or process C to go away since there is no guarantee which thread will
be awakened by the kernel.
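
The wake-up lottery in the scenario above can be illustrated with a small simulation. This is only a sketch, not httpd code: the names `pod_sim`, `pod_sim_signal` and `next_rand` are hypothetical, and the fixed-seed generator merely stands in for the kernel's unspecified choice of which accept()ing thread to wake. Each call to `pod_sim_signal` models one "write a char to the pod, then connect()" step.

```c
#define NPROC 3
#define THREADS_PER_PROC 20

struct pod_sim {
    int in_accept[NPROC];  /* threads still blocked in accept() */
    int dying[NPROC];      /* 1 once a thread of this process read a pod char */
};

/* Small deterministic generator so runs are repeatable. It is an
 * assumption of this sketch, standing in for the kernel's choice. */
static unsigned next_rand(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

static void pod_sim_init(struct pod_sim *s)
{
    for (int p = 0; p < NPROC; p++) {
        s->in_accept[p] = THREADS_PER_PROC;
        s->dying[p] = 0;
    }
}

/* One pod write plus one connect(): an arbitrary waiting thread wakes,
 * reads the pod char and tells its own process to go away.
 * Returns the index of that process, or -1 if nobody is waiting. */
static int pod_sim_signal(struct pod_sim *s, unsigned *rng)
{
    int total = 0;
    for (int p = 0; p < NPROC; p++)
        total += s->in_accept[p];
    if (total == 0)
        return -1;

    int pick = (int)(next_rand(rng) % (unsigned)total);
    for (int p = 0; p < NPROC; p++) {
        if (pick < s->in_accept[p]) {
            s->in_accept[p]--;   /* this thread leaves accept() */
            s->dying[p] = 1;     /* its siblings stay stranded in accept() */
            return p;
        }
        pick -= s->in_accept[p];
    }
    return -1; /* not reached */
}

/* How many processes have been told to go away so far. */
static int pod_sim_dying_count(const struct pod_sim *s)
{
    int n = 0;
    for (int p = 0; p < NPROC; p++)
        n += s->dying[p];
    return n;
}
```

After one call, exactly one process is marked as dying and 19 of its threads are still counted in `in_accept`. Nothing in the model steers later calls toward that same process, which is the problem described above.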

We can't simply connect() ap_daemons_limit*ap_threads_per_child times
and not write to the pod in hopes of waking up the 19 stranded
threads in process B or C, since a given thread in process B or C may
be awakened more than once.

It would seem that the old pod design used by threaded (extra mutex
calls + poll which breaks SINGLE_LISTEN_UNSERIALIZED_ACCEPT) is
required in order to be able to gracefully kill an entire threaded
server process.

-- 
Jeff Trawick | trawick@attglobal.net | PGP public key at web site:
       http://www.geocities.com/SiliconValley/Park/9289/
             Born in Roswell... married an alien...

Re: pod+connect doesn't work with threaded idle server maintenance

Posted by rb...@covalent.net.

On Wed, 20 Jun 2001, Paul J. Reder wrote:

> Jeff Trawick wrote:
> >
> > Please tell me how I'm confused...
> > ...
> > We want to get rid of a whole process.
> >
> > If we write one char to the pod and connect() one time, we tell one
> > process, call it process A, to go away but we leave 19 other threads
> > in process A stranded in accept().  Processes B and C are unaffected
> > so far.
> >
> > If we write to the pod and connect() again in hopes of waking up one
> > of the 19 threads stranded in process A, we likely will tell process B
> > or process C to go away since there is no guarantee which thread will
> > be awakened by the kernel.
>
> Since worker_thread is given the child_num of the process, could we not
> send the child_num of the process being killed on the POD so the kids can
> check? If it doesn't match their process number (i.e. if this is a B or C
> thread), then they ignore it or pump it back onto the POD.
>
> We may need to loop sending the POD message and joining threads until we
> have hit all 20 desired threads.
>
> Any joy here?

No.  The problem is that you can't ever be sure that you will actually
kill the child process.  There is nothing saying that a loop sending 20
characters down the pipe will wake up 20 different threads.  For the
threaded case, you MUST have a mutex around the accept call, but you can
still avoid the poll call.
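
To make the objection concrete, here is a rough simulation of the child_num idea from the quoted proposal. It is a sketch under stated assumptions, not httpd code: `signals_to_drain` and `lcg_next` are hypothetical names, the parent is assumed to resend the message each time a non-matching thread ignores it, and the fixed-seed generator stands in for the kernel's unspecified choice of which waiter to wake.

```c
#define NPROC 3
#define THREADS_PER_PROC 20
#define MAX_SIGNALS 100000

/* Deterministic stand-in for the kernel's choice of waiter. */
static unsigned lcg_next(unsigned *state)
{
    *state = *state * 1103515245u + 12345u;
    return (*state >> 16) & 0x7fffu;
}

/* Sends tagged pod messages (each paired with one connect()) until no
 * thread of `target` is left in accept(). Returns how many were sent,
 * or -1 if MAX_SIGNALS was reached first. in_accept[] receives the
 * final number of accept()ing threads per process. */
static int signals_to_drain(int target, unsigned seed, int in_accept[NPROC])
{
    unsigned rng = seed;
    int sent = 0;

    for (int p = 0; p < NPROC; p++)
        in_accept[p] = THREADS_PER_PROC;

    while (in_accept[target] > 0) {
        int total = 0;
        int p;

        if (sent == MAX_SIGNALS)
            return -1;
        for (p = 0; p < NPROC; p++)
            total += in_accept[p];

        int pick = (int)(lcg_next(&rng) % (unsigned)total);
        sent++;

        /* Find the process that owns the awakened thread. */
        for (p = 0; p < NPROC; p++) {
            if (pick < in_accept[p])
                break;
            pick -= in_accept[p];
        }

        if (p == target)
            in_accept[target]--;  /* child_num matches: the thread exits */
        /* Otherwise the thread ignores the message and goes back to
         * accept(), so this send was wasted. */
    }
    return sent;
}
```

In this model, 20 sends is only the best case. Every send that wakes a thread of another process is wasted, and nothing puts an upper bound on how many of those there will be.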

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------

Re: pod+connect doesn't work with threaded idle server maintenance

Posted by "Paul J. Reder" <re...@raleigh.ibm.com>.

Jeff Trawick wrote:
> 
> Please tell me how I'm confused...
> ...
> We want to get rid of a whole process.
> 
> If we write one char to the pod and connect() one time, we tell one
> process, call it process A, to go away but we leave 19 other threads
> in process A stranded in accept().  Processes B and C are unaffected
> so far.
> 
> If we write to the pod and connect() again in hopes of waking up one
> of the 19 threads stranded in process A, we likely will tell process B
> or process C to go away since there is no guarantee which thread will
> be awakened by the kernel.

Since worker_thread is given the child_num of the process, could we not
send the child_num of the process being killed on the POD so the kids can
check? If it doesn't match their process number (i.e. if this is a B or C
thread), then they ignore it or pump it back onto the POD.

We may need to loop sending the POD message and joining threads until we
have hit all 20 desired threads.

Any joy here?

-- 
Paul J. Reder
-----------------------------------------------------------
"The strength of the Constitution lies entirely in the determination of each
citizen to defend it.  Only if every single citizen feels duty bound to do
his share in this defense are the constitutional rights secure."
-- Albert Einstein

Re: pod+connect doesn't work with threaded idle server maintenance

Posted by rb...@covalent.net.

On 20 Jun 2001, Jeff Trawick wrote:

> Please tell me how I'm confused...
>
> If we want to take out a single server process to get rid of some
> threads because we're now idle, we want to do this gracefully so we
> don't trash any current connections.
>
> Assume for this scenario:
>
>   threaded + the new pod+connect design and
>   SINGLE_LISTEN_UNSERIALIZED_ACCEPT
>
>   process A
>     twenty threads in accept()
>     twenty threads processing connections
>
>   process B
>     twenty threads in accept()
>     twenty threads processing connections
>
>   process C
>     twenty threads in accept()
>     twenty threads processing connections
>
> We want to get rid of a whole process.
>
> If we write one char to the pod and connect() one time, we tell one
> process, call it process A, to go away but we leave 19 other threads
> in process A stranded in accept().  Processes B and C are unaffected
> so far.
>
> If we write to the pod and connect() again in hopes of waking up one
> of the 19 threads stranded in process A, we likely will tell process B
> or process C to go away since there is no guarantee which thread will
> be awakened by the kernel.
>
> We can't simply connect() ap_daemons_limit*ap_threads_per_child times
> and not write to the pod in hopes of waking up the 19 stranded
> threads in process B or C, since a given thread in process B or C may
> be awakened more than once.
>
> It would seem that the old pod design used by threaded (extra mutex
> calls + poll which breaks SINGLE_LISTEN_UNSERIALIZED_ACCEPT) is
> required in order to be able to gracefully kill an entire threaded
> server process.

You can avoid the poll, but leave in the mutex, and this will still work.
Just make sure you check the pod before you unlock the mutex.  I had
assumed that we required the mutex in the threaded MPM, otherwise the pod
just can't work in this case.
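
A minimal sketch of that ordering, for one child process only. This is not the threaded MPM's code: `struct child`, `worker` and `run_graceful_exit_demo` are hypothetical names, a blocking read on a pipe stands in for accept(), and a plain pthread mutex stands in for the accept mutex. The point it tries to show is that only the mutex holder ever blocks in the accept() stand-in, and that it checks the pod before unlocking, so one pod char plus one wake-up is enough for every thread to leave.

```c
#include <fcntl.h>
#include <pthread.h>
#include <unistd.h>

#define NTHREADS 4

struct child {
    pthread_mutex_t accept_mutex; /* serializes the accept() stand-in */
    int listen_fd;                /* blocking read here stands in for accept() */
    int pod_fd;                   /* non-blocking read end of the pod */
    int dying;                    /* set under accept_mutex once the pod was read */
    int exited;                   /* threads that left the loop, under accept_mutex */
};

static void *worker(void *arg)
{
    struct child *c = arg;

    for (;;) {
        char byte;
        int stop;

        pthread_mutex_lock(&c->accept_mutex);
        if (!c->dying) {
            /* Only the mutex holder blocks here; siblings wait on the mutex. */
            if (read(c->listen_fd, &byte, 1) != 1)
                c->dying = 1;     /* treat EOF or error as shutdown */
            /* Check the pod BEFORE unlocking the mutex. */
            if (read(c->pod_fd, &byte, 1) == 1)
                c->dying = 1;
        }
        stop = c->dying;
        if (stop)
            c->exited++;
        pthread_mutex_unlock(&c->accept_mutex);

        if (stop)
            return NULL;
        /* A real worker would process the accepted connection here. */
    }
}

/* Starts NTHREADS workers, sends ONE pod char and ONE wake-up, joins
 * the workers and returns how many exited, or -1 on setup failure. */
static int run_graceful_exit_demo(void)
{
    int conn[2], pod[2], started = 0;
    pthread_t tids[NTHREADS];
    struct child c;

    if (pipe(conn) != 0)
        return -1;
    if (pipe(pod) != 0) {
        close(conn[0]);
        close(conn[1]);
        return -1;
    }
    fcntl(pod[0], F_SETFL, fcntl(pod[0], F_GETFL) | O_NONBLOCK);

    pthread_mutex_init(&c.accept_mutex, NULL);
    c.listen_fd = conn[0];
    c.pod_fd = pod[0];
    c.dying = 0;
    c.exited = 0;

    while (started < NTHREADS &&
           pthread_create(&tids[started], NULL, worker, &c) == 0)
        started++;

    /* Parent side: pod char first, then the wake-up. */
    if (write(pod[1], "!", 1) != 1 || write(conn[1], "c", 1) != 1)
        c.exited = -1;  /* no worker has woken yet; -1 marks a failed send */
    /* Closing the write ends keeps already written bytes readable and
     * turns a failed send into EOF, so no worker can block forever. */
    close(pod[1]);
    close(conn[1]);

    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    pthread_mutex_destroy(&c.accept_mutex);
    close(conn[0]);
    close(pod[0]);
    return c.exited < 0 ? -1 : c.exited;
}
```

Calling `run_graceful_exit_demo()` from a `main` should report that all `NTHREADS` workers exited after a single pod char. Extending this across processes A, B and C would need a cross-process mutex, which the sketch does not attempt.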

Ryan

_______________________________________________________________________________
Ryan Bloom                        	rbb@apache.org
406 29th St.
San Francisco, CA 94131
-------------------------------------------------------------------------------