Posted to dev@httpd.apache.org by Ryan Bloom <rb...@raleigh.ibm.com> on 1999/02/02 14:15:38 UTC

Accept Serialization.

A proposal for Acceptor serialization.  This is not an easy thing to
describe over e-mail, so let me describe it in words first, and then with
some simple examples.  This assumes a hybrid thread/process model.

We have an acceptor thread per socket per process.  Those threads do the
accept and receive on their socket, and put the resulting connection on a
blocking queue.  Across processes, those threads are serialized using the
same methods process-based Apache currently implements.  We also have a
pool of worker threads.  When data is put on the queue, one worker is
woken up; it grabs the data off the queue and serves the request, doing
all logging and everything else itself (as per the earlier decision).

The size of the pool of worker threads is based on the threadsperchild
directive.  The number of acceptor threads is based on the number of ports
to listen to.
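
To make that concrete, here is a rough sketch of the per-child pieces in
plain POSIX threads.  The names (fd_queue, acceptor_main, worker_main,
process_request) are made up for illustration, queue initialization is left
out, and the cross-process serialization around the accept is not shown:

    #include <pthread.h>
    #include <sys/socket.h>
    #include <unistd.h>

    #define QUEUE_SIZE 30       /* one slot per worker, i.e. threadsperchild */

    /* A bounded blocking queue of accepted connection fds.
     * Mutex and condition variables are assumed to be initialized
     * at child start-up. */
    typedef struct {
        int fds[QUEUE_SIZE];
        int head, tail, count;
        pthread_mutex_t lock;
        pthread_cond_t  not_empty;
        pthread_cond_t  not_full;
    } fd_queue;

    static void queue_push(fd_queue *q, int fd)
    {
        pthread_mutex_lock(&q->lock);
        while (q->count == QUEUE_SIZE)           /* every worker busy: acceptor waits */
            pthread_cond_wait(&q->not_full, &q->lock);
        q->fds[q->tail] = fd;
        q->tail = (q->tail + 1) % QUEUE_SIZE;
        q->count++;
        pthread_cond_signal(&q->not_empty);      /* wake exactly one worker */
        pthread_mutex_unlock(&q->lock);
    }

    static int queue_pop(fd_queue *q)
    {
        int fd;
        pthread_mutex_lock(&q->lock);
        while (q->count == 0)
            pthread_cond_wait(&q->not_empty, &q->lock);
        fd = q->fds[q->head];
        q->head = (q->head + 1) % QUEUE_SIZE;
        q->count--;
        pthread_cond_signal(&q->not_full);
        pthread_mutex_unlock(&q->lock);
        return fd;
    }

    extern void process_request(int fd);         /* stand-in for serving and logging */

    typedef struct {
        int       listen_fd;
        fd_queue *q;
    } acceptor_args;

    /* One of these per listening socket per child.  The existing
     * cross-process accept serialization would wrap the accept(). */
    static void *acceptor_main(void *arg)
    {
        acceptor_args *a = arg;
        for (;;) {
            int conn = accept(a->listen_fd, NULL, NULL);
            if (conn >= 0)
                queue_push(a->q, conn);          /* hand the connection to a worker */
        }
        return NULL;
    }

    /* threadsperchild of these per child, all blocking on the same queue. */
    static void *worker_main(void *arg)
    {
        fd_queue *q = arg;
        for (;;) {
            int conn = queue_pop(q);             /* sleep until the acceptor gives us work */
            process_request(conn);
            close(conn);
        }
        return NULL;
    }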

Examples:

We are listening on ports 80 and 443.  We have 2 processes and 30
threadsperchild.

Child 1:
	acceptor on 80
	acceptor on 443

	30 worker threads blocking on Q1

Child 2:
	acceptor on 80
	acceptor on 443

	30 worker threads blocking on Q2

Ex 1:
A request comes in on port 80.
Child 1 gets it, and puts the fd on Q1.  One of its worker threads
wakes up and serves the request.  

Ex 2:
Two requests come in, one on port 80 and one on 443.
Child 2 gets them both, accepts them, and puts the data on Q2.  Two workers
are woken up and they serve the requests.

Ex 3:
Two requests come in on port 80.
Child 1 gets one and puts it on Q1; child 2 gets the other and puts it on Q2.
One worker for each child is woken up, and they serve their respective
requests.

These are simple examples, and obviously we are counting on the OS to do
some of our scheduling with a bit of intelligence.  This is not the best
algorithm if the OS starves one of our processes.

Here is why it is good.  There is one path through the code for all OS's.
Only one thread is ever looking at the accept at any one time.  Our worker
threads don't care what kind of request they serve.  For example, if we
had two pools of workers, one for each port, then potentially half of them
would sit idle waiting for a request, even if the other port was struggling
to keep up with its requests.  With this algorithm, if one port gets 99% of
the requests, the threads concentrate on that port.  But if that port goes
quiet for some reason, and everybody starts asking for requests on another
port, the threads automatically switch to serving those requests.

Another solution is to have all threads block on the accept.  Then you
take the chance of the OS having a problem with a thundering herd.  If all
threads block on the accept and a request comes in, many OS's will wake
up all of the threads, and then put them back to sleep after the first
thread has gotten the request.  This causes many, many context switches.
BAD :(  There is also the potential, out of pure dumb luck, that two
threads will get through to the accept loop at once (IBM's last web server
had this problem when it used this algorithm); that will cause bad
things to happen.
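
For contrast, the all-threads-block-on-accept alternative is basically the
loop below (sketch only; process_request is a stand-in for the real request
handling):

    #include <sys/socket.h>
    #include <unistd.h>

    extern void process_request(int fd);

    /* Every worker thread runs this loop on the *same* listening socket.
     * On OS's with the thundering herd problem, each incoming connection
     * wakes every thread sleeping in accept(), and all but one go straight
     * back to sleep. */
    static void *worker_accept_main(void *arg)
    {
        int listen_fd = *(int *)arg;
        for (;;) {
            int conn = accept(listen_fd, NULL, NULL);
            if (conn < 0)
                continue;                        /* e.g. EINTR */
            process_request(conn);
            close(conn);
        }
        return NULL;
    }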

Thoughts?

Ryan 

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	


Re: Accept Serialization.

Posted by Ryan Bloom <rb...@raleigh.ibm.com>.
Responses to the small problems.....

On Tue, 2 Feb 1999, Michael H. Voase wrote:

> This ain't a bad idea and is something I've looked at meself.  A
> couple of small problems are:
> 
>     1) You're waking up two processes at every request (the
> acceptor and the worker, increasing load on the OS)

You pretty much have to wake up two threads regardless.  If you have each
thread blocking on a mutex before the accept, then the currently accepting
thread has to finish the accept and release the mutex.  Then one of the
blocked threads moves into the accept code and begins to block there, and
the original thread goes off to serve the request.
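
In other words, the mutex-before-accept scheme looks roughly like this
(sketch only; accept_mutex and process_request are illustrative names, and
the cross-process case would use a different lock):

    #include <pthread.h>
    #include <sys/socket.h>
    #include <unistd.h>

    static pthread_mutex_t accept_mutex = PTHREAD_MUTEX_INITIALIZER;

    extern void process_request(int fd);

    static void *worker_serialized_main(void *arg)
    {
        int listen_fd = *(int *)arg;
        int conn;

        for (;;) {
            pthread_mutex_lock(&accept_mutex);    /* idle threads queue up here */
            conn = accept(listen_fd, NULL, NULL); /* only the lock holder accepts */
            pthread_mutex_unlock(&accept_mutex);  /* wakes one waiter to take our place */
            if (conn < 0)
                continue;
            process_request(conn);                /* meanwhile, we go serve the request */
            close(conn);
        }
        return NULL;
    }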

Assuming the OS doesn't have a thundering herd problem, you are correct:
we are potentially waking up two THREADS, not processes, when we don't
need to.

>     2) Buffer queues get rather hectic (if you're currently
> undergoing something like the slashdot effect, the actual
> queue contributes to the delay time as well as constantly
> checking for queue overflow)

IBM's experience with Go (not the best example, but the one we have
numbers for) is that with static pages the queue doesn't add much
overhead, and you don't see those annoying "Could not connect, try again
later" messages.  You only make the queue the size of the number of
threads in the process.  If you aren't serving static pages, then you
really aren't configured for performance, and maybe you don't care.  This
can also be alleviated a bit by having two processes serving instead of
just one.

> 
>     3) Creating code to help overcome the deficiencies of
> poorly designed OS's only helps them get away with it
> instead of prompting them to fix it ...

True, but what do you do while they are fixing it?  Let the user base
suffer?  Personally, I would rather create a method that will work for all
users, because let's face it, it's not the user's fault that their OS
sucks.

Ryan

_______________________________________________________________________
Ryan Bloom		rbb@raleigh.ibm.com
4205 S Miami Blvd	
RTP, NC 27709		It's a beautiful sight to see good dancers 
			doing simple steps.  It's a painful sight to
			see beginners doing complicated patterns.	


Re: Accept Serialization.

Posted by "Michael H. Voase" <mv...@midcoast.com.au>.
This ain't a bad idea and is something I've looked at meself.  A
couple of small problems are:

    1) You're waking up two processes at every request (the
acceptor and the worker, increasing load on the OS)

    2) Buffer queues get rather hectic (if you're currently
undergoing something like the slashdot effect, the actual
queue contributes to the delay time as well as constantly
checking for queue overflow)

    3) Creating code to help overcome the deficiencies of
poorly designed OS's only helps them get away with it
instead of prompting them to fix it ...

Just my 2c worth.

Otherwise it's a really neat idea ...

Cheers, Mik Voase.


--
----------------------------------------------------------------------------
 /~\     /~\            CASTLE INDUSTRIES PTY. LTD.
 | |_____| |            Incorporated 1969. in N.S.W., Australia
 |         |            Phone +612 6562 1345 Fax +612 6567 1449
 |   /~\   |            Web http://www.midcoast.com.au/~mvoase
 |   [ ]   |            Michael H. Voase.  Director.
~~~~~~~~~~~~~~          Cause Linux Flies and Windoze Dies ... 'nuf said.
----------------------------------------------------------------------------




Re: Accept Serialization.

Posted by Dean Gaudet <dg...@arctic.org>.

On Tue, 2 Feb 1999, Ryan Bloom wrote:

> Another solution is to have all threads block on the accept.  Then, you
> take the chance of the OS having a problem with a thundering herd.  If all
> threads block on the accept, and a request comes in, many OS's will wake
> up all of the threads, and then put them back to sleep after the first
> thread has gotten the request.  This causes many many context switches.

But *some* OSs get it right (freebsd, for example, doesn't have a
thundering herd on accept()). 

In my opinion it's wrong to advocate a single solution.  We should
abstract this away and tuck it in a library and allow folks to optimize it
for their configuration.
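
A rough sketch of what that abstraction might look like (the names here are
invented for illustration, not an existing Apache interface):

    #include <pthread.h>

    /* One "accept serialization" strategy = a pair of hooks bracketing accept(). */
    typedef struct {
        const char *name;
        void (*acquire)(void);    /* called before accept() */
        void (*release)(void);    /* called after accept() returns */
    } accept_serialize_ops;

    static pthread_mutex_t accept_mutex = PTHREAD_MUTEX_INITIALIZER;

    static void mutex_acquire(void) { pthread_mutex_lock(&accept_mutex); }
    static void mutex_release(void) { pthread_mutex_unlock(&accept_mutex); }
    static void no_op(void)         { /* nothing: let the OS sort it out */ }

    /* Picked per platform/configuration at build or start-up time. */
    static const accept_serialize_ops serialize_mutex =
        { "pthread-mutex", mutex_acquire, mutex_release };
    static const accept_serialize_ops serialize_none  =
        { "none", no_op, no_op };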

But yeah we need one solution which is the "default" and yours probably
works.

> BAD :(  There is also the potential, out of pure dumb luck, that two
> threads will get through to the accept loop at once (IBM's last web server
> had this problem when they used this algorithm), this will cause bad
> things to happen.

Sounds like that OS is broken :)

BTW, do folks know how widespread sem_{init,wait,trywait,...} are?  I know
linuxthreads and solaris have them... do all pthreads implementations have
them?  (Any known speed problems with them?)
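
For what it's worth, they are straightforward to use for the same job as a
mutex around the accept; whether every pthreads implementation ships them,
and how fast they are, is exactly the open question.  Sketch only:

    #include <semaphore.h>

    static sem_t accept_sem;

    /* pshared = 0: shared between the threads of one process only.
     * A non-zero pshared (where supported) would allow cross-process use. */
    static void accept_sem_init(void) { sem_init(&accept_sem, 0, 1); }

    static void accept_enter(void) { sem_wait(&accept_sem); }  /* block until we may accept() */
    static void accept_leave(void) { sem_post(&accept_sem); }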

Dean