Posted to dev@httpd.apache.org by Greg Ames <gr...@remulak.net> on 2004/06/08 16:23:56 UTC

[PATCH] event driven MPM

I'm interested to know how httpd 2.x can be made more scalable.  Could we serve
10,000 clients with current platforms as discussed at
http://www.kegel.com/c10k.html , without massive code churn and module breakage?
I believe that reducing the number of active threads would help by reducing
the stack memory requirements.
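To put rough numbers on it: at a typical 8MB default thread stack, 10,000 
threads would reserve on the order of 80GB of address space (mostly virtual, 
but still a real cost); even trimmed to 512KB per thread, that is still about 
5GB just for stacks.  Back-of-the-envelope figures, of course.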

Bill Stoddard created an event driven socket I/O patch a couple of years ago 
that could serve pages.  I picked it up and decided to see if I could simplify 
it to minimize the changes to request processing.

There are sites/workloads where keepalive timeouts tie up the majority of the
active threads or processes, such as our own web site as seen at
http://apache.org/server-status .  I also see a lot of "K"s when running 
specweb99 on a stock httpd.  Since we are between requests during keepalive 
waits, they can be handled differently without impacting a lot of code.

This patch decouples the connection from the worker thread, and offloads
keepalive processing to a new event thread in the worker MPM.  The event thread
uses apr_pollset_* calls to wait for new requests or for the keepalive timeout
to expire.  When the event thread sees that a socket is readable or a timeout
has occurred, it passes the connection back to a worker thread, most likely a
different thread than the one previously used by this connection.
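
To give a feel for the control flow without reading the whole patch, here is 
a rough sketch of the event thread's loop using the portable apr_pollset_* 
calls.  This is illustrative only - push_to_worker_queue and the timeout 
handling are placeholders, not the actual code in the patch:

#include <apr_poll.h>
#include <apr_time.h>

/* placeholder: in the patch the connection goes back to the worker queue */
void push_to_worker_queue(void *conn);

void event_thread_loop(apr_pollset_t *pollset)
{
    apr_int32_t num;
    const apr_pollfd_t *results;
    apr_status_t rv;

    for (;;) {
        /* wake up at least once per second so keepalive timeouts
         * can be expired even when no socket becomes readable */
        rv = apr_pollset_poll(pollset, apr_time_from_sec(1), &num, &results);
        if (rv == APR_SUCCESS) {
            apr_int32_t i;
            for (i = 0; i < num; i++) {
                /* client_data was set to the connection when the
                 * descriptor was added to the pollset */
                apr_pollset_remove(pollset, &results[i]);
                push_to_worker_queue(results[i].client_data);
            }
        }
        /* walk the timeout list here and recycle any connections
         * whose keepalive timer has expired */
    }
}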

Here is the patch - http://apache.org/~gregames/event.patch .  It is currently
based on the worker MPM from 2.1 HEAD to make it easier to review.  It is
working well for me under load on Linux and serves pages on Solaris.  When I run 
specweb99 I've seen around 170 connections served by 5 to 20 worker threads. 
This is totally dependent on the time the workload spends in keepalive/user 
think time vs. time spent doing anything else, so YMMV.

This approach could very well hit scalability bottlenecks with various poll() 
implementations as described on Kegel's web site listed above.  There are many 
alternatives emerging.  I've gotten a few inquiries about using sys_epoll on 
Linux, which sounds like a great fit.  I wanted to get something working with 
plain ol' generic (portable) poll() first before trying out any OS dependent 
variations.  The current apr_pollset_remove will eventually become a bottleneck 
too -

#ifdef HAVE_POLL
     for (i = 0; i < pollset->nelts; i++) {
etc
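
For comparison, removing a descriptor under sys_epoll is a single 
constant-time syscall rather than a scan over the whole set.  A minimal 
sketch, error handling omitted:

#include <sys/epoll.h>

int remove_from_epoll(int epfd, int fd)
{
    /* the event argument is ignored for EPOLL_CTL_DEL, but early
     * 2.6 kernels require a non-NULL pointer anyway */
    struct epoll_event ev;
    return epoll_ctl(epfd, EPOLL_CTL_DEL, fd, &ev);
}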

Other issues are described at http://apache.org/~gregames/event.laundry_list .
However I think the patch is robust enough now to post.  I'm very interested in
feedback and having people try it out before going further.

Greg


Re: [PATCH] event driven MPM

Posted by Greg Ames <gr...@remulak.net>.
Paul Querna wrote:
> On Tue, 2004-06-08 at 10:23 -0400, Greg Ames wrote:
> 
>>Here is the patch - http://apache.org/~gregames/event.patch .  
> 
> Very Neat :D

thanks!

> I don't think everyone on this list is aware of this, but I have an
> outstanding patch[1] for apr_pollset to add both KQueue and sys_epoll
> backends.  Hopefully this will get committed soon.

excellent.  My Apache time recently has been spent pretty much heads down 
getting event.patch ready to post, so I missed it - sorry.  Thanks for the pointer.

>>Other issues are described at http://apache.org/~gregames/event.laundry_list .
> 
> Mentioned on that laundry list is adding to the pollset from another
> thread.  I believe this is supported by both KQueue and sys_epoll.  I
> guess it could be easily supported with some #ifdefs. 

sounds good.  This is the first I've heard that KQueue could do it.  Adding 
connections to the timeout list complicates things, though.  Right now all the 
timeout list operations are done on the event thread, so they are serialized 
for free.
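
If we ever do need to add to the pollset from a worker thread on platforms 
where that isn't safe, one common workaround is to keep all pollset and 
timeout list manipulation on the event thread and just wake it up through a 
pipe whose read end sits in the pollset.  A sketch - the pending-queue names 
are made up, this is not in the patch:

#include <apr_poll.h>
#include <apr_file_io.h>
#include <apr_thread_mutex.h>

/* hypothetical shared state, names invented for illustration */
extern apr_thread_mutex_t *pending_mutex; /* guards the pending list    */
extern apr_file_t *wakeup_write;          /* write end of the pipe; the */
                                          /* read end is in the pollset */
                                          /* as an APR_POLL_FILE entry  */

/* called from a worker thread: queue the connection, then poke the
 * pipe so apr_pollset_poll() in the event thread wakes up */
void hand_off_to_event_thread(void *conn)
{
    apr_size_t nbytes = 1;

    apr_thread_mutex_lock(pending_mutex);
    /* ... enqueue conn on the pending list here ... */
    apr_thread_mutex_unlock(pending_mutex);

    apr_file_write(wakeup_write, "x", &nbytes);
}

When the read end becomes readable, the event thread drains the pipe and 
then, still single-threaded, moves the queued connections onto the pollset 
and the timeout list.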

>>From: http://apache.org/~gregames/event.laundry_list
>>Should new connections be passed to the event_thread to test for 
>>readability? It's good if there are long delays before the first data
>>packet is processed; if there is no delay, it's extra complexity and
>>wasted cycles.  The thread switches could be minimized if the event
>>thread and listener were merged. 
> 
> I like the general idea, but as a note, FreeBSD has accept filters
> which already do some of this.

yep, and Win32 has AcceptEx.

> I think having the Event Thread also handle the Accept() would be the
> best method in the long run.

I think so too, long term.  Short term, getting it in shape to do benchmarking 
seems more important.

Thanks much for taking the time to review this and for the feedback.
Greg


Re: [PATCH] event driven MPM

Posted by Paul Querna <ch...@force-elite.com>.
On Tue, 2004-06-08 at 10:23 -0400, Greg Ames wrote:
> Here is the patch - http://apache.org/~gregames/event.patch .  It is currently
> based on the worker MPM from 2.1 HEAD to make it easier to review.  It is
> working well for me under load on Linux and serves pages on Solaris.  When I run 
> specweb99 I've seen around 170 connections served by 5 to 20 worker threads. 
> This is totally dependent on the time the workload spends in keepalive/user 
> think time vs. time spent doing anything else, so YMMV.

Very Neat :D

> This approach could very well hit scalability bottlenecks with various poll() 
> implementations as described on Kegel's web site listed above.  There are many 
> alternatives emerging.  I've gotten a few inquiries about using sys_epoll on 
> Linux, which sounds like a great fit.  I wanted to get something working with 
> plain ol' generic (portable) poll() first before trying out any OS dependent 
> variations.  The current apr_pollset_remove will eventually become a bottleneck 
> too -
> 
> #ifdef HAVE_POLL
>      for (i = 0; i < pollset->nelts; i++) {
> etc

I don't think everyone on this list is aware of this, but I have an
outstanding patch[1] for apr_pollset to add both KQueue and sys_epoll
backends.  Hopefully this will get committed soon.


> Other issues are described at http://apache.org/~gregames/event.laundry_list .
> However I think the patch is robust enough now to post.  I'm very interested in
> feedback and having people try it out before going further.

Mentioned on that laundry list is adding to the pollset from another
thread.  I believe this is supported by both KQueue and sys_epoll.  I
guess it could be easily supported with some #ifdefs.

> From: http://apache.org/~gregames/event.laundry_list
> Should new connections be passed to the event_thread to test for 
> readability? It's good if there are long delays before the first data
> packet is processed; if there is no delay, it's extra complexity and
> wasted cycles.  The thread switches could be minimized if the event
> thread and listener were merged.

I like the general idea, but as a note, FreeBSD has accept filters
which already do some of this. See:
http://www.freebsd.org/cgi/man.cgi?query=accf_http
(It is enabled in the Apache code, but is not commonly enabled in the
FreeBSD kernel.  I don't believe Linux has any equivalent)
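
For reference, attaching the filter from user space is just a setsockopt() 
on the listening socket.  A sketch, assuming the accf_http module is loaded 
in the kernel:

#include <sys/types.h>
#include <sys/socket.h>
#include <string.h>

/* FreeBSD only: make accept() wait until a full HTTP request has
 * arrived, so a worker never blocks reading the first packet */
int enable_http_accept_filter(int listen_fd)
{
    struct accept_filter_arg afa;

    memset(&afa, 0, sizeof(afa));
    strcpy(afa.af_name, "httpready");
    return setsockopt(listen_fd, SOL_SOCKET, SO_ACCEPTFILTER,
                      &afa, sizeof(afa));
}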

I think having the Event Thread also handle the Accept() would be the
best method in the long run.
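
A rough sketch of what that merge might look like - the listener becomes 
just another descriptor in the pollset, distinguished here by a NULL 
client_data (a hypothetical convention, not something in the current patch):

#include <apr_poll.h>
#include <apr_network_io.h>

void handle_pollset_event(const apr_pollfd_t *pfd, apr_pool_t *pool)
{
    if (pfd->client_data == NULL) {
        /* the listener is readable: accept the new connection, then
         * either dispatch it to a worker or keep polling it until
         * the first data packet arrives */
        apr_socket_t *csock;
        if (apr_socket_accept(&csock, pfd->desc.s, pool) == APR_SUCCESS) {
            /* ... add csock to the pollset or hand it to a worker ... */
        }
    }
    else {
        /* an established connection is readable: client_data holds
         * the connection, hand it back to a worker thread */
    }
}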

-Paul Querna

[1] - http://marc.theaimsgroup.com/?t=108650227500001&r=1&w=2


Re: [PATCH] event driven MPM

Posted by Brian Akins <ba...@web.turner.com>.
Paul Querna wrote:

>
>My basic question for the list is, are we better off modifying
>the Worker MPM, or should we create a new 'event' MPM for now?

My $.02 worth:

Make an event MPM that is experimental.  That way people can use "tried 
and true" worker, or try the fancy new event one.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: [PATCH] event driven MPM

Posted by Paul Querna <ch...@force-elite.com>.
On Tue, 2004-08-03 at 08:18 -0400, Brian Akins wrote:
> Greg Ames wrote:
> > Bill Stoddard created an event driven socket I/O patch a couple of 
> > years ago that could serve pages.  I picked it up and decided to see 
> > if I could simplify it to minimize the changes to request processing.
>
> What's the status of this?  I'd be willing to help if needed.  We are 
> interested in this.
> 

I am interested in it too.  I talked to Greg Ames a week ago, and he
doesn't have any updates for the patch.  I planned on mailing this list
about it later this week.

I would like to get the patch (or something based on it) into CVS
soon.  My basic question for the list is, are we better off modifying
the Worker MPM, or should we create a new 'event' MPM for now?

-Paul Querna


Re: [PATCH] event driven MPM

Posted by Brian Akins <ba...@web.turner.com>.
Greg Ames wrote:

>
> Bill Stoddard created an event driven socket I/O patch a couple of 
> years ago that could serve pages.  I picked it up and decided to see 
> if I could simplify it to minimize the changes to request processing.


What's the status of this?  I'd be willing to help if needed.  We are 
interested in this.


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: [PATCH] event driven MPM

Posted by Colm MacCarthaigh <co...@stdlib.net>.
On Tue, Jun 08, 2004 at 10:23:56AM -0400, Greg Ames wrote:
> I'm interested to know how httpd 2.x can be made more scalable.  Could we
> serve 10,000 clients with current platforms as discussed at
> http://www.kegel.com/c10k.html , without massive code churn and module
> breakage?

I've served over 20,000 clients using httpd 2.x; our current record was just
over 23,000, back in February. That was stock 2.x code with only a slight
10-line patch that enabled sendfile for IPv4 connections only, plus the higher
hardlimits patch I jokingly sent to the list (and got committed ;).

I mailed Dan (the c10k person) back on the third of February about it, but
I never got a reply. As previously stated on the list, the server is
running Linux 2.6.x-mm2 (where x is usually the most current); it's a
Dell 2650 with dual 2.4GHz Xeons and 12GB of RAM.

Keepalives are on, though "Timeout" is 30, MaxKeepAliveRequests is 100,
KeepAliveTimeout is 15, and the net/ipv4/tcp_keepalive_time sysctl is 300.
I've found that:

fs/xfs/refcache_size = 512
vm/min_free_kbytes = 1024000
vm/lower_zone_protection = 1024
vm/page-cluster = 5
vm/swappiness = 10

all helped lots :) This is all with prefork as well; worker works out
slower for our load for some reason. Oh, and while I'm at it, the same
server (http://ftp.heanet.ie/) recently shipped 966Mbit/sec in
production, but only to about 5,000 concurrent users; that was the
release of Fedora Core 2.

10k is way too low a target; I didn't even have to configure much to
achieve that. 100k is a good target :)

> I believe that reducing the number of active threads would help by reducing
> the stack memory requirements.

I've found that to work for us, with definitely good results. I'll
certainly try it on ftp.heanet.ie (once it looks ready ;) and report
back if that's useful.

-- 
Colm MacCárthaigh                        Public Key: colm+pgp@stdlib.net

Re: [PATCH] event driven MPM

Posted by Joshua Slive <jo...@slive.ca>.
On Tue, 8 Jun 2004, Bill Stoddard wrote:

> Joshua Slive wrote:
> > I don't have any technical comments, other than "cool".  But I can
> > confirm that many people report needing to turn KeepAlive off to get
> > reasonable performance from apache.
>
> Joshua, Tell us more. If you can't start enough processes/threads to
> handle the number of incoming connections, then setting KeepAliveTimeout
> from 15 to 5 seconds or turning off keepalive entirely will boost the
> apparent number of 'concurrent' connections the server can handle. I've
> found this useful in several customer cases I've worked on.
> The event-driven patch should solve this problem quite handily.

Yes, that is what I meant.  People with limited memory (so they can't
just up MaxClients) can sometimes serve many more "people" by turning off
keepalive in order to free slots.  With this patch, they should get the
best of both worlds (keepalive plus all worker threads handling actual
content).

Joshua.

Re: [PATCH] event driven MPM

Posted by Bill Stoddard <bi...@wstoddard.com>.
Joshua Slive wrote:
> 
> On Tue, 8 Jun 2004, Greg Ames wrote:
> 
>> There are sites/workloads where keepalive timeouts tie up the majority
>> of the active threads or processes, such as our own web site as seen at
>> http://apache.org/server-status .  I also see a lot of "K"s when
>> running specweb99 on a stock httpd.  Since we are between requests
>> during keepalive waits, they can be handled differently without
>> impacting a lot of code.
> 
> 
> I don't have any technical comments, other than "cool".  But I can 
> confirm that many people report needing to turn KeepAlive off to get 
> reasonable performance from apache.  

Joshua,
Tell us more. If you can't start enough processes/threads to handle the number of incoming connections, then 
setting keepalivetimeout from 15 to 5 seconds or turning off keepalive entirely will boost the apparent number 
of 'concurrent' connections able to be handled by the server. I've found this useful in several customer cases 
I've worked on.  The eventdriven patch should solve this problem quite handily.
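
To make the arithmetic concrete: if a request takes ~50ms of actual work but the connection then holds its 
thread for a 15 second keepalive wait, each slot spends roughly 99.7% of its time idle; cutting the timeout 
to 5 seconds, or dropping keepalive, raises the apparent capacity accordingly. (Illustrative numbers, not 
measurements.)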

Bill

Re: [PATCH] event driven MPM

Posted by gr...@apache.org.
Joshua Slive wrote:

> I don't have any technical comments, other than "cool".  

thanks!  :-)

> But I can 
> confirm that many people report needing to turn KeepAlive off to get 
> reasonable performance from apache.  

like Bill, I'm curious to know any details.  I'm guessing:
* this is more of a 1.3 issue,
* the servers are constrained by memory, processes, or threads, and
* "KeepAlive off" makes it hurt less.

> It might be interesting to see the
> KeepAlive-off case alongside the current implementation and the
> event-driven implementation in benchmarking, in order to prove that this
> really wins.

yep.  At the moment I think work needs to be done to reduce the overhead of 
passing a socket to the event thread before getting really serious about 
benchmarking.  But yes, it should be benchmarked before too long.  Something 
like what Kegel's web site suggests (a 4K download every second per simulated 
user) would be interesting.
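
(For scale: at 10,000 simulated users that workload would be about 40MB/s of 
payload, or roughly 320Mbit/sec - within what Colm reports a single box can 
ship.)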

Greg


Re: [PATCH] event driven MPM

Posted by Joshua Slive <jo...@slive.ca>.
On Tue, 8 Jun 2004, Greg Ames wrote:
> There are sites/workloads where keepalive timeouts tie up the majority of the
> active threads or processes, such as our own web site as seen at
> http://apache.org/server-status .  I also see a lot of "K"s when running 
> specweb99 on a stock httpd.  Since we are between requests during keepalive 
> waits, they can be handled differently without impacting a lot of code.

I don't have any technical comments, other than "cool".  But I can confirm 
that many people report needing to turn KeepAlive off to get reasonable 
performance from apache.  It might be interesting to see the KeepAlive-off 
case alongside the current implementation and the event-driven 
implementation in benchmarking, in order to prove that this really wins.

Joshua.