Posted to cvs@httpd.apache.org by rb...@apache.org on 2003/03/10 05:29:14 UTC
cvs commit: httpd-docs-1.3/htdocs/manual/misc perf-tuning.html
rbowen 2003/03/09 20:29:14
Modified: htdocs/manual/misc perf-tuning.html
Log:
Sorry about how noisy this patch is. I've added quite a bit of text here
- a section about mod_mmap_static and a bit about removing modules that
you're not using. But there's also quite a bit of grammatical stuff, as
well as conversion to correct xhtml. The patch on the 2.x side should be
a lot more readable, if you want to see exactly what text has been
modified, except that the mod_mmap_static stuff does not appear in the
2.x version of the patch.
Revision Changes Path
1.28 +600 -576 httpd-docs-1.3/htdocs/manual/misc/perf-tuning.html
Index: perf-tuning.html
===================================================================
RCS file: /home/cvs/httpd-docs-1.3/htdocs/manual/misc/perf-tuning.html,v
retrieving revision 1.27
retrieving revision 1.28
diff -u -r1.27 -r1.28
--- perf-tuning.html 8 Oct 2001 01:26:54 -0000 1.27
+++ perf-tuning.html 10 Mar 2003 04:29:13 -0000 1.28
@@ -9,8 +9,8 @@
</head>
<!-- Background white, links blue (unvisited), navy (visited), red (active) -->
- <body bgcolor="#FFFFFF" text="#000000" link="#0000FF"
- vlink="#000080" alink="#FF0000">
+ <body bgcolor="#FFFFFF" text="#000000" link="#0000FF" vlink="#000080"
+ alink="#FF0000">
<!--#include virtual="header.html" -->
<h1 align="center">Apache Performance Notes</h1>
@@ -20,20 +20,28 @@
<ul>
<li><a href="#introduction">Introduction</a></li>
- <li><a href="#hardware">Hardware and Operating System
- Issues</a></li>
+ <li><a href="#hardware">Hardware and Operating System Issues</a></li>
<li><a href="#runtime">Run-Time Configuration Issues</a></li>
- <li><a href="#compiletime">Compile-Time Configuration
- Issues</a></li>
+ <!--
+ Contains subsections:
+ #dns
+ #symlinks
+ #htaccess
+ #negotiation
+ #process
+ #modules
+ #mmap
+ -->
+
+ <li><a href="#compiletime">Compile-Time Configuration Issues</a></li>
<li>
Appendixes
<ul>
- <li><a href="#trace">Detailed Analysis of a
- Trace</a></li>
+ <li><a href="#trace">Detailed Analysis of a Trace</a></li>
<li><a href="#patches">Patches Available</a></li>
@@ -43,88 +51,95 @@
</ul>
<hr />
- <h3><a id="introduction"
- name="introduction">Introduction</a></h3>
+ <h3><a id="introduction" name="introduction">Introduction</a></h3>
- <p>Apache is a general webserver, which is designed to be
- correct first, and fast second. Even so, its performance is
- quite satisfactory. Most sites have less than 10Mbits of
- outgoing bandwidth, which Apache can fill using only a low end
- Pentium-based webserver. In practice sites with more bandwidth
- require more than one machine to fill the bandwidth due to
- other constraints (such as CGI or database transaction
- overhead). For these reasons the development focus has been
- mostly on correctness and configurability.</p>
+ <p>Apache is a general webserver, which is designed to be correct
+ first, and fast second. Even so, its performance is quite satisfactory.
+ Most sites have less than 10Mbits of outgoing bandwidth, which Apache
+ can fill using only a low end Pentium-based webserver. In practice,
+ sites with more bandwidth require more than one machine to fill the
+ bandwidth due to other constraints (such as CGI or database transaction
+ overhead). For these reasons, the development focus has been mostly on
+ correctness and configurability.</p>
<p>Unfortunately many folks overlook these facts and cite raw
- performance numbers as if they are some indication of the
- quality of a web server product. There is a bare minimum
- performance that is acceptable, beyond that extra speed only
- caters to a much smaller segment of the market. But in order to
- avoid this hurdle to the acceptance of Apache in some markets,
- effort was put into Apache 1.3 to bring performance up to a
- point where the difference with other high-end webservers is
- minimal.</p>
-
- <p>Finally there are the folks who just plain want to see how
- fast something can go. The author falls into this category. The
- rest of this document is dedicated to these folks who want to
- squeeze every last bit of performance out of Apache's current
- model, and want to understand why it does some things which
- slow it down.</p>
-
- <p>Note that this is tailored towards Apache 1.3 on Unix. Some
- of it applies to Apache on NT. Apache on NT has not been tuned
- for performance yet; in fact it probably performs very poorly
- because NT performance requires a different programming
- model.</p>
+ performance numbers as if they are some indication of the quality of a
+ web server product. There is a bare minimum performance that is
+ acceptable; beyond that, extra speed only caters to a much smaller
+ segment of the market. But in order to avoid this hurdle to the
+ acceptance of Apache in some markets, effort was put into Apache 1.3 to
+ bring performance up to a point where the difference with other
+ high-end webservers is minimal.</p>
+
+ <p>Finally, there are the folks who just want to see how fast something
+ can go. The author falls into this category. The rest of this document
+ is dedicated to these folks who want to squeeze every last bit of
+ performance out of Apache's current model, and want to understand why
+ it does some things which slow it down.</p>
+
+ <p>Note that this is tailored towards Apache 1.3 on Unix. Some of it
+ applies to Apache on NT. Apache on NT has not been tuned for
+ performance yet; in fact it probably performs very poorly because NT
+ performance requires a different programming model.</p>
<hr />
- <h3><a id="hardware" name="hardware">Hardware and Operating
- System Issues</a></h3>
+ <h3><a id="hardware" name="hardware">Hardware and Operating System
+ Issues</a></h3>
- <p>The single biggest hardware issue affecting webserver
- performance is RAM. A webserver should never ever have to swap,
- swapping increases the latency of each request beyond a point
- that users consider "fast enough". This causes users to hit
- stop and reload, further increasing the load. You can, and
- should, control the <code>MaxClients</code> setting so that
- your server does not spawn so many children it starts
- swapping.</p>
-
- <p>Beyond that the rest is mundane: get a fast enough CPU, a
- fast enough network card, and fast enough disks, where "fast
- enough" is something that needs to be determined by
- experimentation.</p>
-
- <p>Operating system choice is largely a matter of local
- concerns. But a general guideline is to always apply the latest
- vendor TCP/IP patches. HTTP serving completely breaks many of
- the assumptions built into Unix kernels up through 1994 and
- even 1995. Good choices include recent FreeBSD, and Linux.</p>
+ <p>The single biggest hardware issue affecting webserver performance is
+ RAM. A webserver should never ever have to swap, as swapping increases
+ the latency of each request beyond a point that users consider "fast
+ enough". This causes users to hit stop and reload, further increasing
+ the load. You can, and should, control the <code>MaxClients</code>
+ setting so that your server does not spawn so many children it starts
+ swapping. The procedure for doing this is simple: determine the size of
+ your average Apache process, by looking at your process list via a tool
+ such as <code>top</code>, and divide this into your total available
+ memory, leaving some room for other processes.</p>
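The sizing procedure just described can be sketched numerically; every figure below is a hypothetical example, not a measurement from a real server:

```python
# Rough MaxClients estimate: divide the RAM you can spare for Apache by
# the size of an average child process. All numbers are hypothetical
# examples; measure your own with a tool such as `top`.
total_ram_mb = 512    # total RAM in the machine (assumed)
reserved_mb = 128     # headroom for the OS and other processes (assumed)
avg_child_mb = 4      # observed size of an average Apache child (assumed)

max_clients = (total_ram_mb - reserved_mb) // avg_child_mb
print(max_clients)    # -> 96, a starting point for the MaxClients directive
```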
+
+ <p>Beyond that, the rest is mundane: get a fast enough CPU, a fast
+ enough network card, and fast enough disks, where "fast enough" is
+ something that needs to be determined by experimentation.</p>
+
+ <p>Operating system choice is largely a matter of local concerns. But a
+ general guideline is to always apply the latest vendor TCP/IP
+ patches.</p>
<hr />
<h3><a id="runtime" name="runtime">Run-Time Configuration
Issues</a></h3>
- <h4>HostnameLookups</h4>
+ <h4><a id="dns" name="dns"><code>HostnameLookups</code> and other DNS considerations</a></h4>
- <p>Prior to Apache 1.3, <code>HostnameLookups</code> defaulted
- to On. This adds latency to every request because it requires a
- DNS lookup to complete before the request is finished. In
- Apache 1.3 this setting defaults to Off. However (1.3 or
- later), if you use any <code>Allow from domain</code> or
- <code>Deny from domain</code> directives then you will pay for
- a double reverse DNS lookup (a reverse, followed by a forward
- to make sure that the reverse is not being spoofed). So for the
- highest performance avoid using these directives (it's fine to
- use IP addresses rather than domain names).</p>
-
- <p>Note that it's possible to scope the directives, such as
- within a <code><Location /server-status></code> section.
- In this case the DNS lookups are only performed on requests
- matching the criteria. Here's an example which disables lookups
- except for .html and .cgi files:</p>
+ <p>Prior to Apache 1.3, <a
+ href="../mod/core.html#hostnamelookups"><code>HostnameLookups</code></a>
+ defaulted to <code>On</code>. This adds latency to every request
+ because it requires a DNS lookup to complete before the request is
+ finished. In Apache 1.3 this setting defaults to <code>Off</code>. If
+ you need to have addresses in your log files resolved to hostnames, use
+ the <a href="../programs/logresolve.html">logresolve</a> program that
+ comes with Apache, or one of the numerous log reporting packages which
+ are available.</p>
+
+ <p>It is recommended that you do this sort of postprocessing of your
+ log files on some machine other than the production web server machine,
+ in order that this activity not adversely affect server
+ performance.</p>
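For illustration, logresolve reads a log file on stdin and writes the hostname-resolved copy to stdout, so a typical offline invocation (file names hypothetical) looks like:

```
logresolve < access_log > access_log.resolved
```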
+
+ <p>If you use any <code><a
+ href="../mod/mod_access.html#allow">Allow</a> from domain</code> or
+ <code><a href="../mod/mod_access.html#deny">Deny</a> from domain</code>
+ directives (i.e., using a hostname, or a domain name, rather than an IP
+ address) then you will pay for a double reverse DNS lookup (a reverse,
+ followed by a forward to make sure that the reverse is not being
+ spoofed). For best performance, therefore, use IP addresses, rather
+ than names, when using these directives, if possible.</p>
+
+ <p>Note that it's possible to scope the directives, such as within a
+ <code><Location /server-status></code> section. In this case the
+ DNS lookups are only performed on requests matching the criteria.
+ Here's an example which disables lookups except for .html and .cgi
+ files:</p>
<blockquote>
<pre>
@@ -134,27 +149,18 @@
</Files>
</pre>
</blockquote>
- But even still, if you just need DNS names in some CGIs you
- could consider doing the <code>gethostbyname</code> call in the
- specific CGIs that need it.
-
- <p>Similarly, if you need to have hostname information in your
- server logs in order to generate reports of this information,
- you can postprocess your log file with <a
- href="../programs/logresolve.html">logresolve</a>, so that
- these lookups can be done without making the client wait. It is
- recommended that you do this postprocessing, and any other
- statistical analysis of the log file, somewhere other than your
- production web server machine, in order that this activity does
- not adversely affect server performance.</p>
- <h4>FollowSymLinks and SymLinksIfOwnerMatch</h4>
+ <p>But even still, if you just need DNS names in some CGIs you could
+ consider doing the <code>gethostbyname</code> call in the specific CGIs
+ that need it.</p>
+
+ <h4><a id="symlinks" name="symlinks">FollowSymLinks and SymLinksIfOwnerMatch</a></h4>
<p>Wherever in your URL-space you do not have an <code>Options
FollowSymLinks</code>, or you do have an <code>Options
- SymLinksIfOwnerMatch</code> Apache will have to issue extra
- system calls to check up on symlinks. One extra call per
- filename component. For example, if you had:</p>
+ SymLinksIfOwnerMatch</code> Apache will have to issue extra system
+ calls to check up on symlinks. One extra call per filename component.
+ For example, if you had:</p>
<blockquote>
<pre>
@@ -164,13 +170,13 @@
</Directory>
</pre>
</blockquote>
- and a request is made for the URI <code>/index.html</code>.
- Then Apache will perform <code>lstat(2)</code> on
- <code>/www</code>, <code>/www/htdocs</code>, and
- <code>/www/htdocs/index.html</code>. The results of these
- <code>lstats</code> are never cached, so they will occur on
- every single request. If you really desire the symlinks
- security checking you can do something like this:
+
+ <p>and a request is made for the URI <code>/index.html</code>. Then
+ Apache will perform <code>lstat(2)</code> on <code>/www</code>,
+ <code>/www/htdocs</code>, and <code>/www/htdocs/index.html</code>. The
+ results of these <code>lstats</code> are never cached, so they will
+ occur on every single request. If you really desire the symlinks
+ security checking you can do something like this:</p>
<blockquote>
<pre>
@@ -183,20 +189,19 @@
</Directory>
</pre>
</blockquote>
- This at least avoids the extra checks for the
- <code>DocumentRoot</code> path. Note that you'll need to add
- similar sections if you have any <code>Alias</code> or
- <code>RewriteRule</code> paths outside of your document root.
- For highest performance, and no symlink protection, set
- <code>FollowSymLinks</code> everywhere, and never set
- <code>SymLinksIfOwnerMatch</code>.
- <h4>AllowOverride</h4>
+ <p>This at least avoids the extra checks for the
+ <code>DocumentRoot</code> path. Note that you'll need to add similar
+ sections if you have any <code>Alias</code> or <code>RewriteRule</code>
+ paths outside of your document root. For highest performance, and no
+ symlink protection, set <code>FollowSymLinks</code> everywhere, and
+ never set <code>SymLinksIfOwnerMatch</code>.</p>
+
+ <h4><a id="htaccess" name="htaccess">AllowOverride</a></h4>
<p>Wherever in your URL-space you allow overrides (typically
<code>.htaccess</code> files) Apache will attempt to open
- <code>.htaccess</code> for each filename component. For
- example,</p>
+ <code>.htaccess</code> for each filename component. For example,</p>
<blockquote>
<pre>
@@ -206,118 +211,183 @@
</Directory>
</pre>
</blockquote>
- and a request is made for the URI <code>/index.html</code>.
- Then Apache will attempt to open <code>/.htaccess</code>,
- <code>/www/.htaccess</code>, and
- <code>/www/htdocs/.htaccess</code>. The solutions are similar
- to the previous case of <code>Options FollowSymLinks</code>.
- For highest performance use <code>AllowOverride None</code>
- everywhere in your filesystem.
-
- <h4>Negotiation</h4>
-
- <p>If at all possible, avoid content-negotiation if you're
- really interested in every last ounce of performance. In
- practice the benefits of negotiation outweigh the performance
- penalties. There's one case where you can speed up the server.
- Instead of using a wildcard such as:</p>
+
+ <p>and a request is made for the URI <code>/index.html</code>. Then
+ Apache will attempt to open <code>/.htaccess</code>,
+ <code>/www/.htaccess</code>, and <code>/www/htdocs/.htaccess</code>.
+ The solutions are similar to the previous case of <code>Options
+ FollowSymLinks</code>. For highest performance use <code>AllowOverride
+ None</code> everywhere in your filesystem.</p>
+
+ <p>See also the <a href="../howto/htaccess.html">.htaccess tutorial</a>
+ for further discussion of this.</p>
+
+ <h4><a id="negotiation" name="negotiation">Negotiation</a></h4>
+
+ <p>If at all possible, avoid content-negotiation if you're really
+ interested in every last ounce of performance. In practice the benefits
+ of negotiation outweigh the performance penalties. There's one case
+ where you can speed up the server. Instead of using a wildcard such
+ as:</p>
<blockquote>
<pre>
DirectoryIndex index
</pre>
</blockquote>
- Use a complete list of options:
+
+ <p>Use a complete list of options:</p>
<blockquote>
<pre>
DirectoryIndex index.cgi index.pl index.shtml index.html
</pre>
</blockquote>
- where you list the most common choice first.
- <h4>Process Creation</h4>
+ <p>where you list the most common choice first.</p>
- <p>Prior to Apache 1.3 the <code>MinSpareServers</code>,
- <code>MaxSpareServers</code>, and <code>StartServers</code>
- settings all had drastic effects on benchmark results. In
- particular, Apache required a "ramp-up" period in order to
- reach a number of children sufficient to serve the load being
- applied. After the initial spawning of
- <code>StartServers</code> children, only one child per second
- would be created to satisfy the <code>MinSpareServers</code>
- setting. So a server being accessed by 100 simultaneous
- clients, using the default <code>StartServers</code> of 5 would
- take on the order 95 seconds to spawn enough children to handle
- the load. This works fine in practice on real-life servers,
- because they aren't restarted frequently. But does really
- poorly on benchmarks which might only run for ten minutes.</p>
-
- <p>The one-per-second rule was implemented in an effort to
- avoid swamping the machine with the startup of new children. If
- the machine is busy spawning children it can't service
- requests. But it has such a drastic effect on the perceived
- performance of Apache that it had to be replaced. As of Apache
- 1.3, the code will relax the one-per-second rule. It will spawn
- one, wait a second, then spawn two, wait a second, then spawn
- four, and it will continue exponentially until it is spawning
- 32 children per second. It will stop whenever it satisfies the
+ <p>If your site needs content negotiation, consider using
+ <code>type-map</code> files rather than the <code>Options
+ MultiViews</code> directive to accomplish the negotiation. See the <a
+ href="../content-negotiation.html">Content Negotiation</a>
+ documentation for a full discussion of the methods of negotiation, and
+ instructions for creating <code>type-map</code> files.</p>
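As a hypothetical illustration (file names and languages invented for the example), a minimal type-map file, in the format described in the Content Negotiation documentation, might look like:

```
URI: foo

URI: foo.en.html
Content-type: text/html
Content-language: en

URI: foo.fr.html
Content-type: text/html
Content-language: fr
```

Each blank-line-separated record names one variant, so Apache can pick the best match without having to scan the directory for candidates.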
+
+ <h4><a name="process" id="process">Process Creation</a></h4>
+
+ <p>Prior to Apache 1.3 the <a
+ href="../mod/core.html#minspareservers"><code>MinSpareServers</code></a>,
+ <a
+ href="../mod/core.html#maxspareservers"><code>MaxSpareServers</code></a>,
+ and <a
+ href="../mod/core.html#startservers"><code>StartServers</code></a>
+ settings all had drastic effects on benchmark results. In particular,
+ Apache required a "ramp-up" period in order to reach a number of
+ children sufficient to serve the load being applied. After the initial
+ spawning of <code>StartServers</code> children, only one child per
+ second would be created to satisfy the <code>MinSpareServers</code>
+ setting. So a server being accessed by 100 simultaneous clients, using
+ the default <code>StartServers</code> of 5 would take on the order of 95
+ seconds to spawn enough children to handle the load. This works fine in
+ practice on real-life servers, because they aren't restarted
+ frequently. But it results in poor performance on benchmarks, which might
+ only run for ten minutes.</p>
+
+ <p>The one-per-second rule was implemented in an effort to avoid
+ swamping the machine with the startup of new children. If the machine
+ is busy spawning children it can't service requests. But it has such a
+ drastic effect on the perceived performance of Apache that it had to be
+ replaced. As of Apache 1.3, the code will relax the one-per-second
+ rule. It will spawn one, wait a second, then spawn two, wait a second,
+ then spawn four, and it will continue exponentially until it is
+ spawning 32 children per second. It will stop whenever it satisfies the
<code>MinSpareServers</code> setting.</p>
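The spawning schedule described above can be sketched as follows; this is a simplification for illustration (the real server also re-checks its spare-server settings between steps), not Apache's actual C code:

```python
# Sketch of Apache 1.3's exponential child-spawning schedule: 1, 2, 4,
# ... doubling each second, capped at 32 children per second, stopping
# once the needed number of children exists.
def spawn_schedule(needed_children, cap=32):
    spawned, batch, batches = 0, 1, []
    while spawned < needed_children:
        step = min(batch, needed_children - spawned)
        batches.append(step)          # children created in this one-second step
        spawned += step
        batch = min(batch * 2, cap)   # double, up to the per-second cap
    return batches

# 100 simultaneous clients are covered in 8 seconds instead of ~95:
print(spawn_schedule(100))  # -> [1, 2, 4, 8, 16, 32, 32, 5]
```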
- <p>This appears to be responsive enough that it's almost
- unnecessary to twiddle the <code>MinSpareServers</code>,
- <code>MaxSpareServers</code> and <code>StartServers</code>
- knobs. When more than 4 children are spawned per second, a
- message will be emitted to the <code>ErrorLog</code>. If you
- see a lot of these errors then consider tuning these settings.
- Use the <code>mod_status</code> output as a guide.</p>
+ <p>This appears to be responsive enough that it's almost unnecessary to
+ adjust the <code>MinSpareServers</code>, <code>MaxSpareServers</code>
+ and <code>StartServers</code> settings. When more than 4 children are
+ spawned per second, a message will be emitted to the
+ <code>ErrorLog</code>. If you see a lot of these errors then consider
+ tuning these settings. Use the <code>mod_status</code> output as a
+ guide.</p>
+
+ <p>In particular, you may need to set <code>MinSpareServers</code>
+ higher if traffic on your site is extremely bursty; that is, if the
+ number of connections to your site fluctuates radically in short
+ periods of time. This may be the case, for example, if traffic to your
+ site is highly event-driven, such as sites for major sports events, or
+ other sites where users are encouraged to visit the site at a
+ particular time.</p>
<p>Related to process creation is process death induced by the
- <code>MaxRequestsPerChild</code> setting. By default this is 0,
- which means that there is no limit to the number of requests
- handled per child. If your configuration currently has this set
- to some very low number, such as 30, you may want to bump this
- up significantly. If you are running SunOS or an old version of
- Solaris, limit this to 10000 or so because of memory leaks.</p>
-
- <p>When keep-alives are in use, children will be kept busy
- doing nothing waiting for more requests on the already open
- connection. The default <code>KeepAliveTimeout</code> of 15
- seconds attempts to minimize this effect. The tradeoff here is
- between network bandwidth and server resources. In no event
- should you raise this above about 60 seconds, as <a
+ <code>MaxRequestsPerChild</code> setting. By default this is 0, which
+ means that there is no limit to the number of requests handled per
+ child. If your configuration currently has this set to some very low
+ number, such as 30, you may want to bump this up significantly. If you
+ are running SunOS or an old version of Solaris, limit this to 10000 or
+ so because of memory leaks.</p>
+
+ <p>When keep-alives are in use, children will be kept busy doing
+ nothing waiting for more requests on the already open connection. The
+ default <code>KeepAliveTimeout</code> of 15 seconds attempts to
+ minimize this effect. The tradeoff here is between network bandwidth
+ and server resources. In no event should you raise this above about 60
+ seconds, as <a
href="http://www.research.digital.com/wrl/techreports/abstracts/95.4.html">
most of the benefits are lost</a>.</p>
+
+ <h4><a name="modules" id="modules">Modules</a></h4>
+
+ <p>Since memory usage is such an important consideration in
+ performance, you should attempt to eliminate modules that you are not
+ actually using. If you have built the modules as <a
+ href="../dso.html">DSOs</a>, eliminating modules is a simple matter of
+ commenting out the associated <a
+ href="../mod/core.html#addmodule">AddModule</a> and <a
+ href="../mod/mod_so.html#loadmodule">LoadModule</a> directives for
+ that module. This allows you to experiment with removing modules and
+ seeing if your site still functions in their absence.</p>
+
+ <p>If, on the other hand, you have modules statically linked into your
+ Apache binary, you will need to recompile Apache in order to remove
+ unwanted modules.</p>
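As a hypothetical httpd.conf fragment (module and file names illustrative), disabling a DSO module means commenting out both of its directives:

```
# Comment out both lines to disable a module built as a DSO:
# LoadModule status_module libexec/mod_status.so
# AddModule mod_status.c
```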
+
+ <p>An associated question that arises here is, of course, what modules
+ you need, and which ones you don't. The answer here will, of course,
+ vary from one web site to another. However, the <i>minimal</i> list of
+ modules which you can get by with tends to include <a
+ href="../mod/mod_mime.html">mod_mime</a>, <a
+ href="../mod/mod_dir.html">mod_dir</a>, and <a
+ href="../mod/mod_log_config.html">mod_log_config</a>.
+ <code>mod_log_config</code> is, of course, optional, as you can run a
+ web site without log files. This is, however, not recommended.</p>
+
+ <h4><a name="mmap" id="mmap">mod_mmap_static</a></h4>
+
+ <p>Apache comes with a module, <a
+ href="../mod/mod_mmap_static.html">mod_mmap_static</a>, not enabled
+ by default, which allows you to map files into RAM and serve them
+ directly from memory rather than from the disk. This should result
+ in a substantial performance improvement for frequently-requested
+ files. Note that when files are modified, you
+ will need to restart your server in order to serve the latest
+ version of the file, so this is not appropriate for files which
+ change frequently. See the documentation for this module for more
+ complete details.</p>
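For illustration, each file to be memory-mapped is named with an <code>MMapFile</code> directive (the paths here are hypothetical):

```
MMapFile /www/htdocs/index.html
MMapFile /www/htdocs/images/logo.gif
```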
+
<hr />
- <h3><a id="compiletime" name="compiletime">Compile-Time
- Configuration Issues</a></h3>
+ <h3><a id="compiletime" name="compiletime">Compile-Time Configuration
+ Issues</a></h3>
<h4>mod_status and ExtendedStatus On</h4>
- <p>If you include <code>mod_status</code> and you also set
- <code>ExtendedStatus On</code> when building and running
- Apache, then on every request Apache will perform two calls to
- <code>gettimeofday(2)</code> (or <code>times(2)</code>
- depending on your operating system), and (pre-1.3) several
- extra calls to <code>time(2)</code>. This is all done so that
- the status report contains timing indications. For highest
- performance, set <code>ExtendedStatus off</code> (which is the
- default).</p>
+ <p>If you include <a
+ href="../mod/mod_status.html"><code>mod_status</code></a> and you also
+ set <code>ExtendedStatus On</code> when building and running Apache,
+ then on every request Apache will perform two calls to
+ <code>gettimeofday(2)</code> (or <code>times(2)</code> depending on
+ your operating system), and (pre-1.3) several extra calls to
+ <code>time(2)</code>. This is all done so that the status report
+ contains timing indications. For highest performance, set
+ <code>ExtendedStatus off</code> (which is the default).</p>
+
+ <p><code>mod_status</code> should probably be configured to allow
+ access by only a few users, rather than to the general public, so this
+ will likely have very low impact on your overall performance.</p>
<h4>accept Serialization - multiple sockets</h4>
- <p>This discusses a shortcoming in the Unix socket API. Suppose
- your web server uses multiple <code>Listen</code> statements to
- listen on either multiple ports or multiple addresses. In order
- to test each socket to see if a connection is ready Apache uses
- <code>select(2)</code>. <code>select(2)</code> indicates that a
- socket has <em>zero</em> or <em>at least one</em> connection
- waiting on it. Apache's model includes multiple children, and
- all the idle ones test for new connections at the same time. A
- naive implementation looks something like this (these examples
- do not match the code, they're contrived for pedagogical
- purposes):</p>
+ <p>This discusses a shortcoming in the Unix socket API. Suppose your
+ web server uses multiple <code>Listen</code> statements to listen on
+ either multiple ports or multiple addresses. In order to test each
+ socket to see if a connection is ready Apache uses
+ <code>select(2)</code>. <code>select(2)</code> indicates that a socket
+ has <em>zero</em> or <em>at least one</em> connection waiting on it.
+ Apache's model includes multiple children, and all the idle ones test
+ for new connections at the same time. A naive implementation looks
+ something like this (these examples do not match the code, they're
+ contrived for pedagogical purposes):</p>
<blockquote>
<pre>
@@ -344,42 +414,37 @@
}
</pre>
</blockquote>
- But this naive implementation has a serious starvation problem.
- Recall that multiple children execute this loop at the same
- time, and so multiple children will block at
- <code>select</code> when they are in between requests. All
- those blocked children will awaken and return from
- <code>select</code> when a single request appears on any socket
- (the number of children which awaken varies depending on the
- operating system and timing issues). They will all then fall
- down into the loop and try to <code>accept</code> the
- connection. But only one will succeed (assuming there's still
- only one connection ready), the rest will be <em>blocked</em>
- in <code>accept</code>. This effectively locks those children
- into serving requests from that one socket and no other
- sockets, and they'll be stuck there until enough new requests
- appear on that socket to wake them all up. This starvation
- problem was first documented in <a
- href="http://bugs.apache.org/index/full/467">PR#467</a>. There
- are at least two solutions.
-
- <p>One solution is to make the sockets non-blocking. In this
- case the <code>accept</code> won't block the children, and they
- will be allowed to continue immediately. But this wastes CPU
- time. Suppose you have ten idle children in
- <code>select</code>, and one connection arrives. Then nine of
- those children will wake up, try to <code>accept</code> the
- connection, fail, and loop back into <code>select</code>,
- accomplishing nothing. Meanwhile none of those children are
- servicing requests that occurred on other sockets until they
- get back up to the <code>select</code> again. Overall this
- solution does not seem very fruitful unless you have as many
- idle CPUs (in a multiprocessor box) as you have idle children,
- not a very likely situation.</p>
-
- <p>Another solution, the one used by Apache, is to serialize
- entry into the inner loop. The loop looks like this
- (differences highlighted):</p>
+ But this naive implementation has a serious starvation problem. Recall
+ that multiple children execute this loop at the same time, and so
+ multiple children will block at <code>select</code> when they are in
+ between requests. All those blocked children will awaken and return
+ from <code>select</code> when a single request appears on any socket
+ (the number of children which awaken varies depending on the operating
+ system and timing issues). They will all then fall down into the loop
+ and try to <code>accept</code> the connection. But only one will
+ succeed (assuming there's still only one connection ready), the rest
+ will be <em>blocked</em> in <code>accept</code>. This effectively locks
+ those children into serving requests from that one socket and no other
+ sockets, and they'll be stuck there until enough new requests appear on
+ that socket to wake them all up. This starvation problem was first
+ documented in <a
+ href="http://bugs.apache.org/index/full/467">PR#467</a>. There are at
+ least two solutions.
+
+ <p>One solution is to make the sockets non-blocking. In this case the
+ <code>accept</code> won't block the children, and they will be allowed
+ to continue immediately. But this wastes CPU time. Suppose you have ten
+ idle children in <code>select</code>, and one connection arrives. Then
+ nine of those children will wake up, try to <code>accept</code> the
+ connection, fail, and loop back into <code>select</code>, accomplishing
+ nothing. Meanwhile none of those children are servicing requests that
+ occurred on other sockets until they get back up to the
+ <code>select</code> again. Overall this solution does not seem very
+ fruitful unless you have as many idle CPUs (in a multiprocessor box) as
+ you have idle children, not a very likely situation.</p>
+
+ <p>Another solution, the one used by Apache, is to serialize entry into
+ the inner loop. The loop looks like this (differences highlighted):</p>
<blockquote>
<pre>
@@ -410,158 +475,141 @@
</blockquote>
<a id="serialize" name="serialize">The functions</a>
<code>accept_mutex_on</code> and <code>accept_mutex_off</code>
- implement a mutual exclusion semaphore. Only one child can have
- the mutex at any time. There are several choices for
- implementing these mutexes. The choice is defined in
- <code>src/conf.h</code> (pre-1.3) or
- <code>src/include/ap_config.h</code> (1.3 or later). Some
- architectures do not have any locking choice made, on these
- architectures it is unsafe to use multiple <code>Listen</code>
- directives.
+ implement a mutual exclusion semaphore. Only one child can have the
+ mutex at any time. There are several choices for implementing these
+ mutexes. The choice is defined in <code>src/conf.h</code> (pre-1.3) or
+ <code>src/include/ap_config.h</code> (1.3 or later). Some architectures
+ do not have any locking choice made; on these architectures it is
+ unsafe to use multiple <code>Listen</code> directives.
<dl>
<dt><code>HAVE_FLOCK_SERIALIZED_ACCEPT</code></dt>
- <dd>This method uses the <code>flock(2)</code> system call to
- lock a lock file (located by the <code>LockFile</code>
- directive).</dd>
+ <dd>This method uses the <code>flock(2)</code> system call to lock a
+ lock file (located by the <code>LockFile</code> directive).</dd>
<dt><code>HAVE_FCNTL_SERIALIZED_ACCEPT</code></dt>
- <dd>This method uses the <code>fcntl(2)</code> system call to
- lock a lock file (located by the <code>LockFile</code>
- directive).</dd>
+ <dd>This method uses the <code>fcntl(2)</code> system call to lock a
+ lock file (located by the <code>LockFile</code> directive).</dd>
<dt><code>HAVE_SYSVSEM_SERIALIZED_ACCEPT</code></dt>
<dd>(1.3 or later) This method uses SysV-style semaphores to
- implement the mutex. Unfortunately SysV-style semaphores have
- some bad side-effects. One is that it's possible Apache will
- die without cleaning up the semaphore (see the
- <code>ipcs(8)</code> man page). The other is that the
- semaphore API allows for a denial of service attack by any
- CGIs running under the same uid as the webserver
- (<em>i.e.</em>, all CGIs, unless you use something like
- suexec or cgiwrapper). For these reasons this method is not
- used on any architecture except IRIX (where the previous two
- are prohibitively expensive on most IRIX boxes).</dd>
+ implement the mutex. Unfortunately SysV-style semaphores have some
+ bad side-effects. One is that it's possible Apache will die without
+ cleaning up the semaphore (see the <code>ipcs(8)</code> man page).
+ The other is that the semaphore API allows for a denial of service
+ attack by any CGIs running under the same uid as the webserver
+ (<em>i.e.</em>, all CGIs, unless you use something like suexec or
+ cgiwrapper). For these reasons this method is not used on any
+ architecture except IRIX (where the previous two are prohibitively
+ expensive on most IRIX boxes).</dd>
<dt><code>HAVE_USLOCK_SERIALIZED_ACCEPT</code></dt>
- <dd>(1.3 or later) This method is only available on IRIX, and
- uses <code>usconfig(2)</code> to create a mutex. While this
- method avoids the hassles of SysV-style semaphores, it is not
- the default for IRIX. This is because on single processor
- IRIX boxes (5.3 or 6.2) the uslock code is two orders of
- magnitude slower than the SysV-semaphore code. On
- multi-processor IRIX boxes the uslock code is an order of
- magnitude faster than the SysV-semaphore code. Kind of a
- messed up situation. So if you're using a multiprocessor IRIX
- box then you should rebuild your webserver with
+ <dd>(1.3 or later) This method is only available on IRIX, and uses
+ <code>usconfig(2)</code> to create a mutex. While this method avoids
+ the hassles of SysV-style semaphores, it is not the default for IRIX.
+ This is because on single processor IRIX boxes (5.3 or 6.2) the
+ uslock code is two orders of magnitude slower than the SysV-semaphore
+ code. On multi-processor IRIX boxes the uslock code is an order of
+ magnitude faster than the SysV-semaphore code. Kind of a messed up
+ situation. So if you're using a multiprocessor IRIX box then you
+ should rebuild your webserver with
<code>-DHAVE_USLOCK_SERIALIZED_ACCEPT</code> added to
<code>EXTRA_CFLAGS</code>.</dd>
<dt><code>HAVE_PTHREAD_SERIALIZED_ACCEPT</code></dt>
- <dd>(1.3 or later) This method uses POSIX mutexes and should
- work on any architecture implementing the full POSIX threads
- specification, however appears to only work on Solaris (2.5
- or later), and even then only in certain configurations. If
- you experiment with this you should watch out for your server
- hanging and not responding. Static content only servers may
- work just fine.</dd>
+ <dd>(1.3 or later) This method uses POSIX mutexes and should work on
+ any architecture implementing the full POSIX threads specification;
+ however, it appears to work only on Solaris (2.5 or later), and even
+ then only in certain configurations. If you experiment with this you
+ should watch out for your server hanging and not responding. Servers
+ serving only static content may work just fine.</dd>
</dl>
- <p>If your system has another method of serialization which
- isn't in the above list then it may be worthwhile adding code
- for it (and submitting a patch back to Apache). The above
- <code>HAVE_METHOD_SERIALIZED_ACCEPT</code> defines specify
- which method is available and works on the platform (you can
- have more than one); <code>USE_METHOD_SERIALIZED_ACCEPT</code>
- is used to specify the default method (see the
- <code>AcceptMutex</code> directive).</p>
-
- <p>Another solution that has been considered but never
- implemented is to partially serialize the loop -- that is, let
- in a certain number of processes. This would only be of
- interest on multiprocessor boxes where it's possible multiple
- children could run simultaneously, and the serialization
- actually doesn't take advantage of the full bandwidth. This is
- a possible area of future investigation, but priority remains
+ <p>If your system has another method of serialization which isn't in
+ the above list then it may be worthwhile adding code for it (and
+ submitting a patch back to Apache). The above
+ <code>HAVE_METHOD_SERIALIZED_ACCEPT</code> defines specify which method
+ is available and works on the platform (you can have more than one);
+ <code>USE_METHOD_SERIALIZED_ACCEPT</code> is used to specify the
+ default method (see the <code>AcceptMutex</code> directive).</p>
+
+ <p>Another solution that has been considered but never implemented is
+ to partially serialize the loop -- that is, let in a certain number of
+ processes. This would only be of interest on multiprocessor boxes where
+ it's possible multiple children could run simultaneously, and the
+ serialization actually doesn't take advantage of the full bandwidth.
+ This is a possible area of future investigation, but priority remains
low because highly parallel web servers are not the norm.</p>
- <p>Ideally you should run servers without multiple
- <code>Listen</code> statements if you want the highest
- performance. But read on.</p>
+ <p>Ideally you should run servers without multiple <code>Listen</code>
+ statements if you want the highest performance. But read on.</p>
<h4>accept Serialization - single socket</h4>
- <p>The above is fine and dandy for multiple socket servers, but
- what about single socket servers? In theory they shouldn't
- experience any of these same problems because all children can
- just block in <code>accept(2)</code> until a connection
- arrives, and no starvation results. In practice this hides
- almost the same "spinning" behavior discussed above in the
- non-blocking solution. The way that most TCP stacks are
- implemented, the kernel actually wakes up all processes blocked
- in <code>accept</code> when a single connection arrives. One of
- those processes gets the connection and returns to user-space,
- the rest spin in the kernel and go back to sleep when they
- discover there's no connection for them. This spinning is
- hidden from the user-land code, but it's there nonetheless.
- This can result in the same load-spiking wasteful behavior
- that a non-blocking solution to the multiple sockets case
- can.</p>
-
- <p>For this reason we have found that many architectures behave
- more "nicely" if we serialize even the single socket case. So
- this is actually the default in almost all cases. Crude
- experiments under Linux (2.0.30 on a dual Pentium pro 166
- w/128Mb RAM) have shown that the serialization of the single
- socket case causes less than a 3% decrease in requests per
- second over unserialized single-socket. But unserialized
- single-socket showed an extra 100ms latency on each request.
- This latency is probably a wash on long haul lines, and only an
- issue on LANs. If you want to override the single socket
+ <p>The above is fine and dandy for multiple socket servers, but what
+ about single socket servers? In theory they shouldn't experience any of
+ these same problems because all children can just block in
+ <code>accept(2)</code> until a connection arrives, and no starvation
+ results. In practice this hides almost the same "spinning" behavior
+ discussed above in the non-blocking solution. The way that most TCP
+ stacks are implemented, the kernel actually wakes up all processes
+ blocked in <code>accept</code> when a single connection arrives. One of
+ those processes gets the connection and returns to user-space, the rest
+ spin in the kernel and go back to sleep when they discover there's no
+ connection for them. This spinning is hidden from the user-land code,
+ but it's there nonetheless. This can result in the same load-spiking
+ wasteful behavior that a non-blocking solution to the multiple sockets
+ case can cause.</p>
+
+ <p>For this reason we have found that many architectures behave more
+ "nicely" if we serialize even the single socket case. So this is
+ actually the default in almost all cases. Crude experiments under Linux
+ (2.0.30 on a dual Pentium pro 166 w/128Mb RAM) have shown that the
+ serialization of the single socket case causes less than a 3% decrease
+ in requests per second over unserialized single-socket. But
+ unserialized single-socket showed an extra 100ms latency on each
+ request. This latency is probably a wash on long haul lines, and only
+ an issue on LANs. If you want to override the single socket
serialization you can define
- <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> and then
- single-socket servers will not serialize at all.</p>
+ <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> and then single-socket
+ servers will not serialize at all.</p>
<h4>Lingering Close</h4>
<p>As discussed in <a
href="http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt">
- draft-ietf-http-connection-00.txt</a> section 8, in order for
- an HTTP server to <strong>reliably</strong> implement the
- protocol it needs to shutdown each direction of the
- communication independently (recall that a TCP connection is
- bi-directional, each half is independent of the other). This
- fact is often overlooked by other servers, but is correctly
- implemented in Apache as of 1.2.</p>
-
- <p>When this feature was added to Apache it caused a flurry of
- problems on various versions of Unix because of a
- shortsightedness. The TCP specification does not state that the
- FIN_WAIT_2 state has a timeout, but it doesn't prohibit it. On
- systems without the timeout, Apache 1.2 induces many sockets
- stuck forever in the FIN_WAIT_2 state. In many cases this can
- be avoided by simply upgrading to the latest TCP/IP patches
- supplied by the vendor. In cases where the vendor has never
- released patches (<em>i.e.</em>, SunOS4 -- although folks with
- a source license can patch it themselves) we have decided to
- disable this feature.</p>
-
- <p>There are two ways of accomplishing this. One is the socket
- option <code>SO_LINGER</code>. But as fate would have it, this
- has never been implemented properly in most TCP/IP stacks. Even
- on those stacks with a proper implementation (<em>i.e.</em>,
- Linux 2.0.31) this method proves to be more expensive (cputime)
- than the next solution.</p>
-
- <p>For the most part, Apache implements this in a function
- called <code>lingering_close</code> (in
- <code>http_main.c</code>). The function looks roughly like
- this:</p>
+ draft-ietf-http-connection-00.txt</a> section 8, in order for an HTTP
+ server to <strong>reliably</strong> implement the protocol it needs to
+ shutdown each direction of the communication independently (recall that
+ a TCP connection is bi-directional, each half is independent of the
+ other). This fact is often overlooked by other servers, but is
+ correctly implemented in Apache as of 1.2.</p>
+
+ <p>When this feature was added to Apache it caused a flurry of problems
+ on various versions of Unix because of a shortsighted omission in many
+ TCP/IP stacks. The TCP specification does not state that the
+ FIN_WAIT_2 state has a timeout, but it doesn't prohibit it. On systems
+ without the timeout, Apache 1.2
+ induces many sockets stuck forever in the FIN_WAIT_2 state. In many
+ cases this can be avoided by simply upgrading to the latest TCP/IP
+ patches supplied by the vendor. In cases where the vendor has never
+ released patches (<em>i.e.</em>, SunOS4 -- although folks with a source
+ license can patch it themselves) we have decided to disable this
+ feature.</p>
+
+ <p>There are two ways of accomplishing this. One is the socket option
+ <code>SO_LINGER</code>. But as fate would have it, this has never been
+ implemented properly in most TCP/IP stacks. Even on those stacks with a
+ proper implementation (<em>i.e.</em>, Linux 2.0.31) this method proves
+ to be more expensive (cputime) than the next solution.</p>
+
+ <p>For the most part, Apache implements this in a function called
+ <code>lingering_close</code> (in <code>http_main.c</code>). The
+ function looks roughly like this:</p>
<blockquote>
<pre>
@@ -590,51 +638,47 @@
}
</pre>
</blockquote>
- This naturally adds some expense at the end of a connection,
- but it is required for a reliable implementation. As HTTP/1.1
- becomes more prevalent, and all connections are persistent,
- this expense will be amortized over more requests. If you want
- to play with fire and disable this feature you can define
- <code>NO_LINGCLOSE</code>, but this is not recommended at all.
- In particular, as HTTP/1.1 pipelined persistent connections
- come into use <code>lingering_close</code> is an absolute
+ This naturally adds some expense at the end of a connection, but it is
+ required for a reliable implementation. As HTTP/1.1 becomes more
+ prevalent, and all connections are persistent, this expense will be
+ amortized over more requests. If you want to play with fire and disable
+ this feature you can define <code>NO_LINGCLOSE</code>, but this is not
+ recommended at all. In particular, as HTTP/1.1 pipelined persistent
+ connections come into use <code>lingering_close</code> is an absolute
necessity (and <a
- href="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">
- pipelined connections are faster</a>, so you want to support
- them).
+ href="http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html">pipelined
+ connections are faster</a>, so you want to support them).
<h4>Scoreboard File</h4>
- <p>Apache's parent and children communicate with each other
- through something called the scoreboard. Ideally this should be
- implemented in shared memory. For those operating systems that
- we either have access to, or have been given detailed ports
- for, it typically is implemented using shared memory. The rest
- default to using an on-disk file. The on-disk file is not only
- slow, but it is unreliable (and less featured). Peruse the
- <code>src/main/conf.h</code> file for your architecture and
- look for either <code>USE_MMAP_SCOREBOARD</code> or
- <code>USE_SHMGET_SCOREBOARD</code>. Defining one of those two
- (as well as their companions <code>HAVE_MMAP</code> and
- <code>HAVE_SHMGET</code> respectively) enables the supplied
- shared memory code. If your system has another type of shared
- memory, edit the file <code>src/main/http_main.c</code> and add
- the hooks necessary to use it in Apache. (Send us back a patch
- too please.)</p>
-
- <p>Historical note: The Linux port of Apache didn't start to
- use shared memory until version 1.2 of Apache. This oversight
- resulted in really poor and unreliable behavior of earlier
- versions of Apache on Linux.</p>
+ <p>Apache's parent and children communicate with each other through
+ something called the scoreboard. Ideally this should be implemented in
+ shared memory. For those operating systems that we either have access
+ to, or have been given detailed ports for, it typically is implemented
+ using shared memory. The rest default to using an on-disk file. The
+ on-disk file is not only slow, but it is unreliable (and less
+ featured). Peruse the <code>src/main/conf.h</code> file for your
+ architecture and look for either <code>USE_MMAP_SCOREBOARD</code> or
+ <code>USE_SHMGET_SCOREBOARD</code>. Defining one of those two (as well
+ as their companions <code>HAVE_MMAP</code> and <code>HAVE_SHMGET</code>
+ respectively) enables the supplied shared memory code. If your system
+ has another type of shared memory, edit the file
+ <code>src/main/http_main.c</code> and add the hooks necessary to use it
+ in Apache. (Send us back a patch too please.)</p>
+
+ <p>Historical note: The Linux port of Apache didn't start to use shared
+ memory until version 1.2 of Apache. This oversight resulted in really
+ poor and unreliable behavior of earlier versions of Apache on
+ Linux.</p>
<h4><code>DYNAMIC_MODULE_LIMIT</code></h4>
- <p>If you have no intention of using dynamically loaded modules
- (you probably don't if you're reading this and tuning your
- server for every last ounce of performance) then you should add
- <code>-DDYNAMIC_MODULE_LIMIT=0</code> when building your
- server. This will save RAM that's allocated only for supporting
- dynamically loaded modules.</p>
+ <p>If you have no intention of using dynamically loaded modules (you
+ probably don't if you're reading this and tuning your server for every
+ last ounce of performance) then you should add
+ <code>-DDYNAMIC_MODULE_LIMIT=0</code> when building your server. This
+ will save RAM that's allocated only for supporting dynamically loaded
+ modules.</p>
<hr />
<h3><a id="trace" name="trace">Appendix: Detailed Analysis of a
@@ -650,13 +694,12 @@
</Directory>
</pre>
</blockquote>
- The file being requested is a static 6K file of no particular
- content. Traces of non-static requests or requests with content
- negotiation look wildly different (and quite ugly in some
- cases). First the entire trace, then we'll examine details.
- (This was generated by the <code>strace</code> program, other
- similar programs include <code>truss</code>,
- <code>ktrace</code>, and <code>par</code>.)
+ The file being requested is a static 6K file of no particular content.
+ Traces of non-static requests or requests with content negotiation look
+ wildly different (and quite ugly in some cases). First the entire
+ trace, then we'll examine details. (This was generated by the
+ <code>strace</code> program; other similar programs include
+ <code>truss</code>, <code>ktrace</code>, and <code>par</code>.)
<blockquote>
<pre>
@@ -698,8 +741,7 @@
</pre>
</blockquote>
These two calls can be removed by defining
- <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> as described
- earlier.
+ <code>SINGLE_LISTEN_UNSERIALIZED_ACCEPT</code> as described earlier.
<p>Notice the <code>SIGUSR1</code> manipulation:</p>
@@ -712,49 +754,46 @@
sigaction(SIGUSR1, {0x8059954, [], SA_INTERRUPT}, {SIG_IGN}) = 0
</pre>
</blockquote>
- This is caused by the implementation of graceful restarts. When
- the parent receives a <code>SIGUSR1</code> it sends a
- <code>SIGUSR1</code> to all of its children (and it also
- increments a "generation counter" in shared memory). Any
- children that are idle (between connections) will immediately
- die off when they receive the signal. Any children that are in
- keep-alive connections, but are in between requests will die
- off immediately. But any children that have a connection and
- are still waiting for the first request will not die off
- immediately.
-
- <p>To see why this is necessary, consider how a browser reacts
- to a closed connection. If the connection was a keep-alive
- connection and the request being serviced was not the first
- request then the browser will quietly reissue the request on a
- new connection. It has to do this because the server is always
- free to close a keep-alive connection in between requests
- (<em>i.e.</em>, due to a timeout or because of a maximum number
- of requests). But, if the connection is closed before the first
- response has been received the typical browser will display a
- "document contains no data" dialogue (or a broken image icon).
- This is done on the assumption that the server is broken in
- some way (or maybe too overloaded to respond at all). So Apache
- tries to avoid ever deliberately closing the connection before
- it has sent a single response. This is the cause of those
- <code>SIGUSR1</code> manipulations.</p>
-
- <p>Note that it is theoretically possible to eliminate all
- three of these calls. But in rough tests the gain proved to be
- almost unnoticeable.</p>
+ This is caused by the implementation of graceful restarts. When the
+ parent receives a <code>SIGUSR1</code> it sends a <code>SIGUSR1</code>
+ to all of its children (and it also increments a "generation counter"
+ in shared memory). Any children that are idle (between connections)
+ will immediately die off when they receive the signal. Any children
+ that are in keep-alive connections but are in between requests will
+ die off immediately. But any children that have a connection and are
+ still waiting for the first request will not die off immediately.
+
+ <p>To see why this is necessary, consider how a browser reacts to a
+ closed connection. If the connection was a keep-alive connection and
+ the request being serviced was not the first request then the browser
+ will quietly reissue the request on a new connection. It has to do this
+ because the server is always free to close a keep-alive connection in
+ between requests (<em>i.e.</em>, due to a timeout or because of a
+ maximum number of requests). But, if the connection is closed before
+ the first response has been received the typical browser will display a
+ "document contains no data" dialogue (or a broken image icon). This is
+ done on the assumption that the server is broken in some way (or maybe
+ too overloaded to respond at all). So Apache tries to avoid ever
+ deliberately closing the connection before it has sent a single
+ response. This is the cause of those <code>SIGUSR1</code>
+ manipulations.</p>
+
+ <p>Note that it is theoretically possible to eliminate all three of
+ these calls. But in rough tests the gain proved to be almost
+ unnoticeable.</p>
- <p>In order to implement virtual hosts, Apache needs to know
- the local socket address used to accept the connection:</p>
+ <p>In order to implement virtual hosts, Apache needs to know the local
+ socket address used to accept the connection:</p>
<blockquote>
<pre>
getsockname(3, {sin_family=AF_INET, sin_port=htons(8080), sin_addr=inet_addr("127.0.0.1")}, [16]) = 0
</pre>
</blockquote>
- It is possible to eliminate this call in many situations (such
- as when there are no virtual hosts, or when <code>Listen</code>
- directives are used which do not have wildcard addresses). But
- no effort has yet been made to do these optimizations.
+ It is possible to eliminate this call in many situations (such as when
+ there are no virtual hosts, or when <code>Listen</code> directives are
+ used which do not have wildcard addresses). But no effort has yet been
+ made to do these optimizations.
<p>Apache turns off the Nagle algorithm:</p>
@@ -764,8 +803,8 @@
</pre>
</blockquote>
because of problems described in <a
- href="http://www.isi.edu/~johnh/PAPERS/Heidemann97a.html">a
- paper by John Heidemann</a>.
+ href="http://www.isi.edu/~johnh/PAPERS/Heidemann97a.html">a paper by
+ John Heidemann</a>.
<p>Notice the two <code>time</code> calls:</p>
@@ -776,18 +815,17 @@
time(NULL) = 873959960
</pre>
</blockquote>
- One of these occurs at the beginning of the request, and the
- other occurs as a result of writing the log. At least one of
- these is required to properly implement the HTTP protocol. The
- second occurs because the Common Log Format dictates that the
- log record include a timestamp of the end of the request. A
- custom logging module could eliminate one of the calls. Or you
- can use a method which moves the time into shared memory, see
- the <a href="#patches">patches section below</a>.
-
- <p>As described earlier, <code>ExtendedStatus On</code> causes
- two <code>gettimeofday</code> calls and a call to
- <code>times</code>:</p>
+ One of these occurs at the beginning of the request, and the other
+ occurs as a result of writing the log. At least one of these is
+ required to properly implement the HTTP protocol. The second occurs
+ because the Common Log Format dictates that the log record include a
+ timestamp of the end of the request. A custom logging module could
+ eliminate one of the calls. Or you can use a method which moves the
+ time into shared memory; see the <a href="#patches">patches section
+ below</a>.
+
+ <p>As described earlier, <code>ExtendedStatus On</code> causes two
+ <code>gettimeofday</code> calls and a call to <code>times</code>:</p>
<blockquote>
<pre>
@@ -797,8 +835,8 @@
times({tms_utime=5, tms_stime=0, tms_cutime=0, tms_cstime=0}) = 446747
</pre>
</blockquote>
- These can be removed by setting <code>ExtendedStatus Off</code>
- (which is the default).
+ These can be removed by setting <code>ExtendedStatus Off</code> (which
+ is the default).
<p>It might seem odd to call <code>stat</code>:</p>
@@ -808,21 +846,19 @@
</pre>
</blockquote>
This is part of the algorithm which calculates the
- <code>PATH_INFO</code> for use by CGIs. In fact if the request
- had been for the URI <code>/cgi-bin/printenv/foobar</code> then
- there would be two calls to <code>stat</code>. The first for
- <code>/home/dgaudet/ap/apachen/cgi-bin/printenv/foobar</code>
- which does not exist, and the second for
- <code>/home/dgaudet/ap/apachen/cgi-bin/printenv</code>, which
- does exist. Regardless, at least one <code>stat</code> call is
- necessary when serving static files because the file size and
- modification times are used to generate HTTP headers (such as
- <code>Content-Length</code>, <code>Last-Modified</code>) and
- implement protocol features (such as
- <code>If-Modified-Since</code>). A somewhat more clever server
- could avoid the <code>stat</code> when serving non-static
- files, however doing so in Apache is very difficult given the
- modular structure.
+ <code>PATH_INFO</code> for use by CGIs. In fact if the request had been
+ for the URI <code>/cgi-bin/printenv/foobar</code> then there would be
+ two calls to <code>stat</code>. The first for
+ <code>/home/dgaudet/ap/apachen/cgi-bin/printenv/foobar</code> which
+ does not exist, and the second for
+ <code>/home/dgaudet/ap/apachen/cgi-bin/printenv</code>, which does
+ exist. Regardless, at least one <code>stat</code> call is necessary
+ when serving static files because the file size and modification times
+ are used to generate HTTP headers (such as <code>Content-Length</code>,
+ <code>Last-Modified</code>) and implement protocol features (such as
+ <code>If-Modified-Since</code>). A somewhat more clever server could
+ avoid the <code>stat</code> when serving non-static files; however,
+ doing so in Apache is very difficult given the modular structure.
<p>All static files are served using <code>mmap</code>:</p>
@@ -833,48 +869,46 @@
munmap(0x400ee000, 6144) = 0
</pre>
</blockquote>
- On some architectures it's slower to <code>mmap</code> small
- files than it is to simply <code>read</code> them. The define
- <code>MMAP_THRESHOLD</code> can be set to the minimum size
- required before using <code>mmap</code>. By default it's set to
- 0 (except on SunOS4 where experimentation has shown 8192 to be
- a better value). Using a tool such as <a
- href="http://www.bitmover.com/lmbench/">lmbench</a> you can
- determine the optimal setting for your environment.
-
- <p>You may also wish to experiment with
- <code>MMAP_SEGMENT_SIZE</code> (default 32768) which determines
- the maximum number of bytes that will be written at a time from
- mmap()d files. Apache only resets the client's
- <code>Timeout</code> in between write()s. So setting this large
- may lock out low bandwidth clients unless you also increase the
+ On some architectures it's slower to <code>mmap</code> small files than
+ it is to simply <code>read</code> them. The define
+ <code>MMAP_THRESHOLD</code> can be set to the minimum size required
+ before using <code>mmap</code>. By default it's set to 0 (except on
+ SunOS4 where experimentation has shown 8192 to be a better value).
+ Using a tool such as <a
+ href="http://www.bitmover.com/lmbench/">lmbench</a> you can determine
+ the optimal setting for your environment.
+
+ <p>You may also wish to experiment with <code>MMAP_SEGMENT_SIZE</code>
+ (default 32768) which determines the maximum number of bytes that will
+ be written at a time from mmap()d files. Apache only resets the
+ client's <code>Timeout</code> in between write()s. So setting this too
+ large may lock out low-bandwidth clients unless you also increase the
<code>Timeout</code>.</p>
- <p>It may even be the case that <code>mmap</code> isn't used on
- your architecture; if so then defining
- <code>USE_MMAP_FILES</code> and <code>HAVE_MMAP</code> might
- work (if it works then report back to us).</p>
-
- <p>Apache does its best to avoid copying bytes around in
- memory. The first write of any request typically is turned into
- a <code>writev</code> which combines both the headers and the
- first hunk of data:</p>
+ <p>It may even be the case that <code>mmap</code> isn't used on your
+ architecture; if so then defining <code>USE_MMAP_FILES</code> and
+ <code>HAVE_MMAP</code> might work (if it works then report back to
+ us).</p>
+
+ <p>Apache does its best to avoid copying bytes around in memory. The
+ first write of any request typically is turned into a
+ <code>writev</code> which combines both the headers and the first hunk
+ of data:</p>
<blockquote>
<pre>
writev(3, [{"HTTP/1.1 200 OK\r\nDate: Thu, 11"..., 245}, {"\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 6144}], 2) = 6389
</pre>
</blockquote>
- When doing HTTP/1.1 chunked encoding Apache will generate up to
- four element <code>writev</code>s. The goal is to push the byte
- copying into the kernel, where it typically has to happen
- anyhow (to assemble network packets). On testing, various
- Unixes (BSDI 2.x, Solaris 2.5, Linux 2.0.31+) properly combine
- the elements into network packets. Pre-2.0.31 Linux will not
- combine, and will create a packet for each element, so
- upgrading is a good idea. Defining <code>NO_WRITEV</code> will
- disable this combining, but result in very poor chunked
- encoding performance.
+ When doing HTTP/1.1 chunked encoding Apache will generate up to
+ four-element <code>writev</code>s. The goal is to push the byte copying into
+ the kernel, where it typically has to happen anyhow (to assemble
+ network packets). On testing, various Unixes (BSDI 2.x, Solaris 2.5,
+ Linux 2.0.31+) properly combine the elements into network packets.
+ Pre-2.0.31 Linux will not combine, and will create a packet for each
+ element, so upgrading is a good idea. Defining <code>NO_WRITEV</code>
+ will disable this combining, but result in very poor chunked encoding
+ performance.
<p>The log write:</p>
@@ -883,13 +917,12 @@
write(17, "127.0.0.1 - - [10/Sep/1997:23:39"..., 71) = 71
</pre>
</blockquote>
- can be deferred by defining <code>BUFFERED_LOGS</code>. In this
- case up to <code>PIPE_BUF</code> bytes (a POSIX defined
- constant) of log entries are buffered before writing. At no
- time does it split a log entry across a <code>PIPE_BUF</code>
- boundary because those writes may not be atomic.
- (<em>i.e.</em>, entries from multiple children could become
- mixed together). The code does its best to flush this buffer
  + can be deferred by defining <code>BUFFERED_LOGS</code>. In this case up
  + to <code>PIPE_BUF</code> bytes (a POSIX-defined constant) of log
  + entries are buffered before writing. At no time does it split a log
  + entry across a <code>PIPE_BUF</code> boundary, because those writes may
  + not be atomic (<em>i.e.</em>, entries from multiple children could
  + become mixed together). The code does its best to flush this buffer
when a child dies.
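The buffering rule above can be sketched as follows. This is a hedged illustration of the idea behind `BUFFERED_LOGS`, not Apache's code; the class and method names are our own. The key invariant is that a flush happens before any entry would straddle a `PIPE_BUF` boundary, so each entry is written atomically and entries from concurrent children cannot interleave:

```python
import os
from select import PIPE_BUF  # POSIX atomic-write limit for pipes

# Illustrative sketch of BUFFERED_LOGS (names are ours, not Apache's):
# accumulate whole log entries, flushing before an entry would cross a
# PIPE_BUF boundary, so no entry is ever split across two write() calls.
class BufferedLog:
    def __init__(self, fd):
        self.fd = fd
        self.buf = b""

    def write_entry(self, entry: bytes):
        # Flush first if appending would exceed PIPE_BUF; the entry
        # then goes out whole in the next buffered write.
        if len(self.buf) + len(entry) > PIPE_BUF:
            self.flush()
        self.buf += entry

    def flush(self):
        if self.buf:
            os.write(self.fd, self.buf)
            self.buf = b""

# Demonstrate on a pipe standing in for the log file descriptor.
r, w = os.pipe()
log = BufferedLog(w)
for i in range(3):
    log.write_entry(b"127.0.0.1 - - [10/Sep/1997] request %d\n" % i)
log.flush()  # in Apache, this also happens when a child dies
os.close(w)
entries = [l for l in os.read(r, PIPE_BUF).split(b"\n") if l]
os.close(r)
```

Three requests thus cost one `write(2)` instead of three.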
<p>The lingering close code causes four system calls:</p>
@@ -905,9 +938,8 @@
which were described earlier.
<p>Let's apply some of these optimizations:
- <code>-DSINGLE_LISTEN_UNSERIALIZED_ACCEPT
- -DBUFFERED_LOGS</code> and <code>ExtendedStatus Off</code>.
- Here's the final trace:</p>
+ <code>-DSINGLE_LISTEN_UNSERIALIZED_ACCEPT -DBUFFERED_LOGS</code> and
+ <code>ExtendedStatus Off</code>. Here's the final trace:</p>
<blockquote>
<pre>
@@ -932,91 +964,83 @@
munmap(0x400e3000, 6144) = 0
</pre>
</blockquote>
- That's 19 system calls, of which 4 remain relatively easy to
- remove, but don't seem worth the effort.
+ That's 19 system calls, of which 4 remain relatively easy to remove,
+ but don't seem worth the effort.
- <h3><a id="patches" name="patches">Appendix: Patches
- Available</a></h3>
- There are <a
- href="http://www.arctic.org/~dgaudet/apache/1.3/">several
- performance patches available for 1.3.</a> Although they may
- not apply cleanly to the current version, it shouldn't be
- difficult for someone with a little C knowledge to update them.
- In particular:
+ <h3><a id="patches" name="patches">Appendix: Patches Available</a></h3>
  + There are <a href="http://www.arctic.org/~dgaudet/apache/1.3/">several
  + performance patches available for 1.3</a>. Although they may not apply
  + cleanly to the current version, it shouldn't be difficult for someone
  + with a little C knowledge to update them. In particular:
<ul>
<li>A <a
- href="http://www.arctic.org/~dgaudet/apache/1.3/shared_time.patch">
- patch</a> to remove all <code>time(2)</code> system
- calls.</li>
+ href="http://www.arctic.org/~dgaudet/apache/1.3/shared_time.patch">patch</a>
+ to remove all <code>time(2)</code> system calls.</li>
<li>A <a
href="http://www.arctic.org/~dgaudet/apache/1.3/mod_include_speedups.patch">
patch</a> to remove various system calls from
- <code>mod_include</code>, these calls are used by few sites
- but required for backwards compatibility.</li>
  + <code>mod_include</code>; these calls are used by few sites but are
  + required for backwards compatibility.</li>
<li>A <a
- href="http://www.arctic.org/~dgaudet/apache/1.3/top_fuel.patch">
- patch</a> which integrates the above two plus a few other
- speedups at the cost of removing some functionality.</li>
+ href="http://www.arctic.org/~dgaudet/apache/1.3/top_fuel.patch">patch</a>
+ which integrates the above two plus a few other speedups at the cost
+ of removing some functionality.</li>
</ul>
- <h3><a id="preforking" name="preforking">Appendix: The
- Pre-Forking Model</a></h3>
+ <h3><a id="preforking" name="preforking">Appendix: The Pre-Forking
+ Model</a></h3>
<p>Apache (on Unix) is a <em>pre-forking</em> model server. The
- <em>parent</em> process is responsible only for forking
- <em>child</em> processes, it does not serve any requests or
- service any network sockets. The child processes actually
- process connections, they serve multiple connections (one at a
- time) before dying. The parent spawns new or kills off old
- children in response to changes in the load on the server (it
- does so by monitoring a scoreboard which the children keep up
- to date).</p>
-
- <p>This model for servers offers a robustness that other models
- do not. In particular, the parent code is very simple, and with
- a high degree of confidence the parent will continue to do its
- job without error. The children are complex, and when you add
- in third party code via modules, you risk segmentation faults
- and other forms of corruption. Even should such a thing happen,
- it only affects one connection and the server continues serving
- requests. The parent quickly replaces the dead child.</p>
  + <em>parent</em> process is responsible only for forking <em>child</em>
  + processes; it does not serve any requests or service any network
  + sockets. The child processes actually handle connections, serving
  + multiple connections (one at a time) before dying. The parent spawns
  + new children or kills off old ones in response to changes in the load
  + on the server (it does so by monitoring a scoreboard which the
  + children keep up to date).</p>
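The parent/child division of labor can be sketched minimally. This is an illustrative example under our own names (`prefork`, `child_main`), not Apache's scoreboard logic: the parent only forks and reaps children, while each child would run its accept-and-serve loop before exiting:

```python
import os

# Minimal pre-forking sketch (illustrative, not Apache's code): the
# parent forks children and reaps them; it serves no requests itself.
def prefork(num_children, child_main):
    pids = []
    for _ in range(num_children):
        pid = os.fork()
        if pid == 0:
            # Child: would accept and serve connections, then exit.
            child_main()
            os._exit(0)
        pids.append(pid)  # parent records the child
    # Parent: wait for children; a real server would also respawn or
    # kill children here in response to load (the scoreboard's job).
    for pid in pids:
        os.waitpid(pid, 0)
    return pids

pids = prefork(3, lambda: None)
```

In Apache the parent loops forever, consulting the scoreboard to decide whether to spawn or reap; here the children simply run once and exit.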
+
  + <p>This model for servers offers a robustness that other models do not.
  + In particular, the parent code is very simple, so there is a high
  + degree of confidence that the parent will continue to do its job
  + without error. The children are complex, and when you add in
  + third-party code via modules, you risk segmentation faults and other
  + forms of corruption. Even if such a thing happens, it only affects one
  + connection, and the server continues serving requests. The parent
  + quickly replaces the dead child.</p>
<p>Pre-forking is also very portable across dialects of Unix.
Historically this has been an important goal for Apache, and it
continues to remain so.</p>
- <p>The pre-forking model comes under criticism for various
- performance aspects. Of particular concern are the overhead of
- forking a process, the overhead of context switches between
- processes, and the memory overhead of having multiple
- processes. Furthermore it does not offer as many opportunities
- for data-caching between requests (such as a pool of
- <code>mmapped</code> files). Various other models exist and
- extensive analysis can be found in the <a
- href="http://www.cs.wustl.edu/~jxh/research/research.html">papers
- of the JAWS project</a>. In practice all of these costs vary
- drastically depending on the operating system.</p>
-
- <p>Apache's core code is already multithread aware, and Apache
- version 1.3 is multithreaded on NT. There have been at least
- two other experimental implementations of threaded Apache, one
- using the 1.3 code base on DCE, and one using a custom
- user-level threads package and the 1.0 code base; neither is
- publicly available. There is also an experimental port of
- Apache 1.3 to <a
- href="http://www.mozilla.org/docs/refList/refNSPR/">Netscape's
- Portable Run Time</a>, which <a
- href="http://www.arctic.org/~dgaudet/apache/2.0/">is
- available</a> (but you're encouraged to join the <a
- href="http://dev.apache.org/mailing-lists">new-httpd mailing
- list</a> if you intend to use it). Part of our redesign for
- version 2.0 of Apache will include abstractions of the server
- model so that we can continue to support the pre-forking model,
- and also support various threaded models.
- <!--#include virtual="footer.html" -->
+ <p>The pre-forking model comes under criticism for various performance
+ aspects. Of particular concern are the overhead of forking a process,
+ the overhead of context switches between processes, and the memory
  + overhead of having multiple processes. Furthermore, it does not offer as
+ many opportunities for data-caching between requests (such as a pool of
+ <code>mmapped</code> files). Various other models exist and extensive
+ analysis can be found in the <a
+ href="http://www.cs.wustl.edu/~jxh/research/research.html">papers of
+ the JAWS project</a>. In practice all of these costs vary drastically
+ depending on the operating system.</p>
+
  + <p>Apache's core code is already multithread-aware, and Apache version
  + 1.3 is multithreaded on NT. There have been at least two other
+ experimental implementations of threaded Apache, one using the 1.3 code
+ base on DCE, and one using a custom user-level threads package and the
+ 1.0 code base; neither is publicly available. There is also an
+ experimental port of Apache 1.3 to <a
+ href="http://www.mozilla.org/docs/refList/refNSPR/">Netscape's Portable
+ Run Time</a>, which <a
+ href="http://www.arctic.org/~dgaudet/apache/2.0/">is available</a> (but
+ you're encouraged to join the <a
+ href="http://dev.apache.org/mailing-lists">new-httpd mailing list</a>
+ if you intend to use it). Part of our redesign for version 2.0 of
+ Apache will include abstractions of the server model so that we can
+ continue to support the pre-forking model, and also support various
  + threaded models.
  + <!--#include virtual="footer.html" -->
</p>
</body>
</html>