Posted to dev@httpd.apache.org by Apache Software Foundation <hu...@Apache.Org> on 2001/09/21 20:15:45 UTC

FW: Apache Optimization - Post-graduate Research

Not acked.

----- Forwarded message from Markus van Aardt <ma...@24netave.com> -----

From: Markus van Aardt <ma...@24netave.com>
To: human-response@Apache.Org
Subject: Apache Optimization - Post-graduate Research
Date: Sun, 26 Aug 2001 11:21:41 +0200

Hi

If I sent this mail to the wrong address, could you please forward it
to the appropriate address.  This message is related to optimization
issues for a post-graduate study.  This is the only address listed on
the Contacts page.

I am investigating the performance profiles of several web servers for
a post-graduate study at the University of Pretoria, South Africa.  I
have read through several performance reports, including the report
published by MindCraft, as well as the storm it stirred up.

I hoped that I could obtain sufficient information from Apache and
Linux documentation to optimize Apache to such an extent that it would
outperform all other servers, including Microsoft's IIS 5.0.
Unfortunately, the information I could gather from the Apache
Performance Notes (written by Dean Gaudet) and several other
Apache-related web sites did not improve the performance to match the
IIS's performance.

My test environment is based on bottom-end hosted server specifications:

        Pentium II 350
        64 MB RAM
        100 Mb Ethernet
        IDE HDD

        Redhat 7.1
        Apache 1.3.20

On a basic installation, with no optimization applied, I was able to
get the server to handle a load of just over 600 requests per second.
After I applied the optimization recommendations, I could only get
this up to around 740-750 requests per second.  This figure is
undoubtedly quite impressive, but to my dismay, IIS could run at
around 1100-1200 requests per second.

Furthermore, there seems to be a definite point, which I reached in
all my tests at 256 clients, where the performance figures take a
serious hit and start to go down quite rapidly, up to around 430
clients, where throughput stabilises at 150 requests per second.  I
could find no clear explanation for this, as there seem to be few
benchmark results available that cover performance figures for more
than 500 clients.  Could you advise as to the reason for this?

I would really like to get the server running at higher speeds, but
the performance recommendations seem to be insufficient to tune the
server beyond its current state.  Could you point me to any other
documentation or person who might assist me with this issue?  The
performance figures I obtained for this particular test were based on
a static 16K HTML file.  I am also running the tests on DSO support
through Borland's Kylix.
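
For orientation, the request rates quoted above can be converted into
payload bandwidth and compared with the 100 Mb Ethernet link.  This is
a rough sketch only: it assumes that "16K" means 16 * 1024 bytes, that
every request transfers the whole file, and it ignores HTTP header and
TCP/IP overhead.  The function and constant names are illustrative.

```python
# Payload bandwidth implied by a request rate, ignoring HTTP headers and
# TCP/IP overhead.  Assumes "16K" means 16 * 1024 bytes and that every
# request transfers the whole file (both are assumptions).
FILE_BYTES = 16 * 1024
LINK_MBIT = 100  # the 100 Mb Ethernet link from the test environment


def payload_mbit_per_s(requests_per_s, file_bytes=FILE_BYTES):
    """Return the payload rate in megabits per second (1 Mbit = 10**6 bits)."""
    return requests_per_s * file_bytes * 8 / 1_000_000


if __name__ == "__main__":
    for label, rate in [("Apache, untuned", 600), ("Apache, tuned", 750),
                        ("IIS, low", 1100), ("IIS, high", 1200)]:
        mbit = payload_mbit_per_s(rate)
        print(f"{label:16s} {rate:5d} req/s -> {mbit:6.1f} Mbit/s "
              f"({mbit / LINK_MBIT:.0%} of the link)")
```

Under those assumptions, 750 requests per second already comes close
to the nominal capacity of the link, so it may be worth checking
whether the network rather than the server bounds the measurement.  A
rate that works out above the link capacity would suggest that at
least one of the assumptions does not hold for that run.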

I appreciate the time you invested in reading this letter.

Best regards
Markus van Aardt
 

----
Solutions for the Internet - http://www.24netave.com/

With Compliments
  Markus van Aardt
  Mobile : +27.82.468.8719
  Facsimile : +27.12.346.1144
  mailto:markus@24netave.com

----- End forwarded message -----

-- 
#ken	P-)}

Ken Coar, Sanagendamgagwedweinini  http://Golux.Com/coar/
Author, developer, opinionist      http://Apache-Server.Com/

"All right everyone!  Step away from the glowing hamburger!"

Re: FW: Apache Optimization - Post-graduate Research

Posted by Brian Pane <bp...@pacbell.net>.
My first inclination is to propose that Markus grab a copy of the current
httpd-2.0 CVS code or the next beta and try out the worker and prefork
MPMs.  The 2.0 code base is starting to look reasonable from a performance
perspective.  And now would be a good time to get some more comparative
benchmark results.

--Brian





Re: FW: Apache Optimization - Post-graduate Research

Posted by dean gaudet <de...@arctic.org>.
On Thu, 27 Sep 2001, Scott Manley wrote:

>
> > on production servers, with probability close to 1, all data required for
> > the http parsing to decide on a handler arrives in the first packet of the
> > request.  so rather than complicate this code by making it able to handle
> > the almost-never-happens case it's better to just use a small small small
> > pool of threads to parse incoming requests.  a pool that has maybe 1
> > thread per cpu in it.
>
> All very nice until someone realises this and writes some lame DoS code
> to lock up your servers....

this is trivial to design around.
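
The message does not say how to design around it, so the following is
only one possible approach, sketched under assumptions: parser threads
never wait for a slow client.  A connection whose first packet lacks a
complete header block is moved to a separate holding area with a
deadline.  The SlowPath class, its method names, and the 5-second
limit are all hypothetical.

```python
import time

HEADER_DEADLINE_S = 5.0  # hypothetical limit on how long headers may take


class SlowPath:
    """Holds connections whose first packet lacked a complete header block.

    Parser threads hand such connections over and move on, so a client
    that trickles bytes ties up an entry here, not a parser thread.
    """

    def __init__(self, deadline_s=HEADER_DEADLINE_S, clock=time.monotonic):
        self.deadline_s = deadline_s
        self.clock = clock
        self.pending = {}  # conn_id -> (buffered bytes, expiry time)

    def defer(self, conn_id, data):
        """Park an incomplete request and start its deadline."""
        self.pending[conn_id] = (data, self.clock() + self.deadline_s)

    def feed(self, conn_id, data):
        """Append data; return the buffered bytes once headers are complete."""
        buffered, expiry = self.pending[conn_id]
        buffered += data
        if b"\r\n\r\n" in buffered:
            del self.pending[conn_id]
            return buffered
        self.pending[conn_id] = (buffered, expiry)
        return None

    def expire(self):
        """Drop connections past their deadline; return their ids."""
        now = self.clock()
        dead = [c for c, (_, expiry) in self.pending.items() if expiry <= now]
        for c in dead:
            del self.pending[c]
        return dead
```

The deadline is fixed when the connection is parked and is not
extended by later bytes, so sending one byte at a time does not keep
an entry alive indefinitely.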

-dean


Re: FW: Apache Optimization - Post-graduate Research

Posted by Scott Manley <sc...@myplay.com>.
> on production servers, with probability close to 1, all data required for
> the http parsing to decide on a handler arrives in the first packet of the
> request.  so rather than complicate this code by making it able to handle
> the almost-never-happens case it's better to just use a small small small
> pool of threads to parse incoming requests.  a pool that has maybe 1
> thread per cpu in it.

All very nice until someone realises this and writes some lame DoS code
to lock up your servers....

-- 
Scott Manley (AKA Szyzyg)
Streaming Media Hacker
www.myplay.com

Re: FW: Apache Optimization - Post-graduate Research

Posted by dean gaudet <de...@arctic.org>.
On Thu, 27 Sep 2001, Brian Pane wrote:

> The other big complication is that we'd need a way to avoid blocking
> on disk reads if a client requests a file that isn't currently in the
> filesystem cache.  I guess mincore(2) would work in cases where we happen
> to have the file mmap'ed, but is there a way to tell whether a sendfile
> call will block on a filesystem read?

there's no async sendfile equivalent that i know of ... although i think
something of that sort is available in some linux patches and planned for
the 2.5 kernel.

-dean


Re: FW: Apache Optimization - Post-graduate Research

Posted by Brian Pane <bp...@pacbell.net>.
dean gaudet wrote:

>On Thu, 27 Sep 2001, Bill Stoddard wrote:
>
[...]

>>FWIW, the worker MPM is a step in the right direction for making
>>Apache an event driven server. We just need to mangle the core HTTP
>>parsing engine to make it stateful and impose some programming
>>disciplines on an async Apache API. Good 3.0 stuff :-)
>>
>
>the core HTTP parsing engine has been ready for an event-driven server
>since about apache 0.7.  the only thing that needs to happen is changes to
>the content handlers.
>

The other big complication is that we'd need a way to avoid blocking
on disk reads if a client requests a file that isn't currently in the
filesystem cache.  I guess mincore(2) would work in cases where we happen
to have the file mmap'ed, but is there a way to tell whether a sendfile
call will block on a filesystem read?

--Brian




Re: FW: Apache Optimization - Post-graduate Research

Posted by dean gaudet <de...@arctic.org>.
On Thu, 27 Sep 2001, Bill Stoddard wrote:

> From my experience, a properly implemented kernel engine will be from
> 1.5 to 2x faster than the fastest user space implementations (Zeus,
> IIS for example). An event driven server will be somewhat faster than
> a thread-per-connection server.  The real advantage of an event driven
> server is that it scales well to large numbers of concurrent clients.

this was true until linux 2.4.x ... where X15 has demonstrated that with
the right kernel features, you can write a fast server in userland.  the
right kernel features include:  zero-copy tcp and scalable event
processing (i.e. not poll()).
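
As a small illustration of the event-driven style (not of X15 itself),
the sketch below services several connections from a single thread.
It uses Python's selectors module, whose DefaultSelector picks the most
efficient readiness mechanism available on the platform.  The watch()
and poll_once() helpers are made up for this example, and socketpair()
stands in for accepted client connections.

```python
import selectors
import socket


def watch(sel, conn):
    """Register a connection for read-readiness notifications."""
    conn.setblocking(False)
    sel.register(conn, selectors.EVENT_READ)


def poll_once(sel, timeout=1.0):
    """Service every connection that is readable now; return how many."""
    handled = 0
    for key, _mask in sel.select(timeout):
        conn = key.fileobj
        chunk = conn.recv(4096)
        if chunk:
            # Sketch only: assumes the reply fits in the socket send buffer.
            conn.sendall(chunk.upper())
        else:
            # An empty read means the peer closed the connection.
            sel.unregister(conn)
            conn.close()
        handled += 1
    return handled


if __name__ == "__main__":
    sel = selectors.DefaultSelector()
    # socketpair() stands in for an accepted client connection.
    server_side, client_side = socket.socketpair()
    watch(sel, server_side)
    client_side.sendall(b"get /index.html")
    print(poll_once(sel), client_side.recv(4096))
```

The per-connection cost here is one selector registration rather than
a thread or process, which is the property the thread-per-connection
comparison turns on.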

> FWIW, the worker MPM is a step in the right direction for making
> Apache an event driven server. We just need to mangle the core HTTP
> parsing engine to make it stateful and impose some programming
> disciplines on an async Apache API. Good 3.0 stuff :-)

the core HTTP parsing engine has been ready for an event-driven server
since about apache 0.7.  the only thing that needs to happen is changes to
the content handlers.

on production servers, with probability close to 1, all data required for
the http parsing to decide on a handler arrives in the first packet of the
request.  so rather than complicate this code by making it able to handle
the almost-never-happens case it's better to just use a small small small
pool of threads to parse incoming requests.  a pool that has maybe 1
thread per cpu in it.
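
A minimal sketch of that idea, assuming the whole header block arrives
in the first packet: a pool with about one thread per CPU parses the
request line and picks a handler.  The handler names and the
"/cgi-bin/" routing rule are illustrative, not Apache's.

```python
import os
from concurrent.futures import ThreadPoolExecutor


# Hypothetical handlers; the names are illustrative only.
def serve_static(path):
    return f"static:{path}"


def serve_cgi(path):
    return f"cgi:{path}"


def choose_handler(first_packet: bytes):
    """Parse the request line from the first packet and pick a handler.

    Returns None when the packet does not hold a complete header block,
    which is the rare case the text prefers not to optimise for.
    """
    head, sep, _ = first_packet.partition(b"\r\n\r\n")
    if not sep:
        return None
    request_line = head.split(b"\r\n", 1)[0].decode("latin-1")
    parts = request_line.split()
    if len(parts) != 3:
        return None
    _method, path, _version = parts
    handler = serve_cgi if path.startswith("/cgi-bin/") else serve_static
    return handler, path


# A deliberately small pool: roughly one parsing thread per CPU.
parser_pool = ThreadPoolExecutor(max_workers=os.cpu_count() or 1)


def dispatch(first_packet: bytes):
    """Parse on the small pool, then run the chosen handler."""
    choice = parser_pool.submit(choose_handler, first_packet).result()
    if choice is None:
        return "incomplete"
    handler, path = choice
    return handler(path)
```

Calling dispatch() with a complete request such as
b"GET /index.html HTTP/1.0\r\n\r\n" routes it to serve_static, while a
truncated first packet is reported as "incomplete" instead of being
waited on.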

-dean


Re: FW: Apache Optimization - Post-graduate Research

Posted by Bill Stoddard <bi...@wstoddard.com>.
> dean gaudet wrote:
>
> >your numbers look about in the right ballpark for the top performance
> >you'll get from apache on that hardware.  the apache architecture has some
> >fundamental performance issues... consider using TUX instead (it's
> >included with redhat 7.1) or X15 (which is also linux, userland only and
> >performs as well as TUX).
> >
>
> This is related to an issue that I've been thinking about lately...
>
> I characterize Apache's performance as being limited by two very
> different classes of factors:
>
>   * Architectural factors -- e.g., a thread-per-connection server
>     generally will be slower than an event-loop server.
>
>   * Implementation factors -- e.g., using O(n)-time algorithms where
>     O(log(n)) is possible, or making extraneous system calls.
>
> In the past, implementation factors seem to have had a nontrivial
> effect on Apache's performance.  For 2.0, we've made some progress
> in fixing this through numerous optimizations to bottlenecks in the
> user-space code.  As we continue to fix the implementation inefficiencies,
> 2.0's throughput and CPU utilization will asymptotically approach the
> limits of its architecture on each platform.
>
> The interesting question is: once we've finished fixing the implementation
> factors (where "finish" means "reach a point of diminishing returns"),
> and architecture has a bigger impact on performance than implementation
> does, how will the performance compare to servers with different
> architectures?

> I expect that Apache 2.0 with the worker MPM will be
> slower than in-kernel servers and multiple-connection-per-thread
> user-space servers, but it's not clear how big the speed difference
> will be.

From my experience, a properly implemented kernel engine will be from 1.5 to 2x faster
than the fastest user space implementations (Zeus, IIS for example). An event driven
server will be somewhat faster than a thread-per-connection server.  The real advantage of
an event driven server is that it scales well to large numbers of concurrent clients.

FWIW, the worker MPM is a step in the right direction for making Apache an event driven
server. We just need to mangle the core HTTP parsing engine to make it stateful and impose
some programming disciplines on an async Apache API. Good 3.0 stuff :-)

Bill


Re: FW: Apache Optimization - Post-graduate Research

Posted by Brian Pane <bp...@pacbell.net>.
dean gaudet wrote:

>
>On Wed, 26 Sep 2001, Brian Pane wrote:
>
>>I characterize Apache's performance as being limited by two very
>>different classes of factors:
>>
>>  * Architectural factors -- e.g., a thread-per-connection server
>>    generally will be slower than an event-loop server.
>>
>
>another way to think of these is as the constants which are hidden by the
>O() notation.  the architecture is OK in a theoretical sense, but there
>are some really really big constants hidden in the O()s that we'd use to
>describe the architecture.
>

I like that constant-factor description because it nicely captures the other
big way in which httpd performance has improved over time: As key operations
in the OS get more efficient (quicker context switches, for example, or
zero-copy sendfile), the constant factor drops.

>>  * Implementation factors -- e.g., using O(n)-time algorithms where
>>    O(log(n)) is possible, or making extraneous system calls.
>>
>
>if you know of any cases where an O(n) -> O(log(n)) change can be made i'd
>like to hear about them...
>

request_rec->headers_in

apr_table_t in general: aside from the HTTP headers, the O(n) "set"
functions result in an O(n^2) cost for things that need to insert n
items into subprocess_env (mod_include, CGIs).
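
To make the O(n^2) claim concrete, here is a toy model (not APR's
actual code) of a table kept as a flat list, where "set" scans the
existing entries before appending.  The class and function names are
made up for the illustration.

```python
class LinearTable:
    """Toy key/value table stored as a flat list, with a linear-scan set()."""

    def __init__(self):
        self.entries = []      # list of [key, value] pairs
        self.comparisons = 0   # key comparisons performed so far

    def set(self, key, value):
        wanted = key.lower()   # header-style, case-insensitive keys
        for entry in self.entries:
            self.comparisons += 1
            if entry[0].lower() == wanted:
                entry[1] = value   # overwrite the existing entry
                return
        self.entries.append([key, value])


def comparisons_for(n):
    """Count key comparisons needed to insert n distinct keys."""
    table = LinearTable()
    for i in range(n):
        table.set(f"VAR_{i}", str(i))
    return table.comparisons
```

Inserting n distinct keys costs 0 + 1 + ... + (n - 1) = n(n - 1)/2
comparisons in this model, so doubling n roughly quadruples the work.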

Maybe ap_find_command_in_modules() too (possibly useful in speeding up
.htaccess file processing)

There are various places where we may be able to go from O(n) to O(1) by
optimizing away strdup() calls.

--Brian



Re: FW: Apache Optimization - Post-graduate Research

Posted by Cliff Woolley <cl...@yahoo.com>.
On Thu, 27 Sep 2001, dean gaudet wrote:

> 'cause i tend to think apache-2.0, in particular, is way more limited by
> big huge constants surrounding malloc() costs, ...

At least 75% of those that I know about are going away RSN.  =-)

--Cliff

--------------------------------------------------------------
   Cliff Woolley
   cliffwoolley@yahoo.com
   Charlottesville, VA



Re: FW: Apache Optimization - Post-graduate Research

Posted by dean gaudet <de...@arctic.org>.

On Wed, 26 Sep 2001, Brian Pane wrote:

> I characterize Apache's performance as being limited by two very
> different classes of factors:
>
>   * Architectural factors -- e.g., a thread-per-connection server
>     generally will be slower than an event-loop server.

another way to think of these is as the constants which are hidden by the
O() notation.  the architecture is OK in a theoretical sense, but there
are some really really big constants hidden in the O()s that we'd use to
describe the architecture.

>   * Implementation factors -- e.g., using O(n)-time algorithms where
>     O(log(n)) is possible, or making extraneous system calls.

if you know of any cases where an O(n) -> O(log(n)) change can be made i'd
like to hear about them...

'cause i tend to think apache-2.0, in particular, is way more limited by
big huge constants surrounding malloc() costs, indirect calls, memory
foot-print, ...

-dean


Re: FW: Apache Optimization - Post-graduate Research

Posted by Brian Pane <bp...@pacbell.net>.
dean gaudet wrote:

>your numbers look about in the right ballpark for the top performance
>you'll get from apache on that hardware.  the apache architecture has some
>fundamental performance issues... consider using TUX instead (it's
>included with redhat 7.1) or X15 (which is also linux, userland only and
>performs as well as TUX).
>

This is related to an issue that I've been thinking about lately...

I characterize Apache's performance as being limited by two very
different classes of factors:

  * Architectural factors -- e.g., a thread-per-connection server
    generally will be slower than an event-loop server.

  * Implementation factors -- e.g., using O(n)-time algorithms where
    O(log(n)) is possible, or making extraneous system calls.

In the past, implementation factors seem to have had a nontrivial
effect on Apache's performance.  For 2.0, we've made some progress
in fixing this through numerous optimizations to bottlenecks in the
user-space code.  As we continue to fix the implementation inefficiencies,
2.0's throughput and CPU utilization will asymptotically approach the
limits of its architecture on each platform.

The interesting question is: once we've finished fixing the implementation
factors (where "finish" means "reach a point of diminishing returns"),
and architecture has a bigger impact on performance than implementation
does, how will the performance compare to servers with different
architectures?  I expect that Apache 2.0 with the worker MPM will be
slower than in-kernel servers and multiple-connection-per-thread
user-space servers, but it's not clear how big the speed difference
will be.

--Brian



Re: FW: Apache Optimization - Post-graduate Research

Posted by dean gaudet <de...@arctic.org>.
your numbers look about in the right ballpark for the top performance
you'll get from apache on that hardware.  the apache architecture has some
fundamental performance issues... consider using TUX instead (it's
included with redhat 7.1) or X15 (which is also linux, userland only and
performs as well as TUX).

(i'd say your hardware needs more RAM at any rate.  64MB is nothing
today.)

if you want pointers, for your study, as to why apache's architecture is
fundamentally limited then stuff you should consider is the number of
bytes of per-client state information in a server such as apache (thread
or process per client means a stack per client) vs.
TUX/X15/IIS/thttpd/zeus... which are event driven and have roughly a
thread/process per CPU.  also consider the context switching cost of
thread/process per client.  also consider the expense of the "rich"
configuration language in apache (see new-httpd threads in the past few
months regarding per_dir config merging).  also consider the cost of
dozens of indirect function calls per request (to implement a generalised
module interface).
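
The per-client state point can be turned into a back-of-the-envelope
estimate.  The byte counts below are hypothetical placeholders, not
measurements of apache or of any of the event-driven servers named
above; substitute measured values for a real comparison.

```python
def per_client_memory_mb(clients, bytes_per_client):
    """Total per-client state in MiB for a given client count."""
    return clients * bytes_per_client / (1024 * 1024)


# Hypothetical figures, chosen only to show the shape of the comparison.
STACK_PER_CLIENT = 64 * 1024   # assumed stack per thread/process
STATE_PER_CLIENT = 4 * 1024    # assumed state per event-driven connection

if __name__ == "__main__":
    for clients in (256, 430):
        print(clients,
              per_client_memory_mb(clients, STACK_PER_CLIENT),
              per_client_memory_mb(clients, STATE_PER_CLIENT))
```

Whatever the real numbers are, the first total grows with the stack
size times the client count, which matters on a machine with 64 MB of
RAM.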

-dean
