Posted to dev@httpd.apache.org by Min Xu <mx...@cae.wisc.edu> on 2003/02/10 20:50:35 UTC

Strange Behavior of Apache 2.0.43 on SPARC MP system

Hi All,

Sorry for posting this directly to the development list, but I don't
think this is a user setup problem, and it is strange enough that
maybe only you will have a clue about what's going on.

I am a student at UW-Madison. To study the computer architecture of
commercial multiprocessor servers, we use Apache as one of our
important workloads.

I am the one who set up the workload on a 14-processor Sun
Enterprise server. During setup I found a very strange behavior
of the Apache server (running with the worker MPM). Essentially,
the strange thing is this:

  The server's optimal throughput is not achieved by a greedy
  client that drives the server with no think time; with a tiny
  amount of think time, much better throughput is achievable.
  Also, with the greedy client, the server's performance
  decreases over time, which seems very counter-intuitive.

Of course, just giving you the short description above does not
help you to help me, so I give the detailed problem description
and data below. With your understanding of the source code, maybe
you can suggest some more hypotheses for me to try.

Workload background
-------------------
The setup of the Apache workload is fairly simple compared with
some of the other workloads we have (OLTP). In this workload, we
have an HTTP server and an automatic request generator (SURGE).
Both programs are highly multi-threaded. The server has a pool of
static text files that are served from known URLs to the request
generator (the client). The file sizes follow a statistical
distribution, and the client has multiple threads, each emulating
a user who accesses a series of files in a fixed order.

In the previous setup of the workload, we removed client think time.
The basis for that is the following (we also have to put the server
and the client on the same machine for other reasons):

The workload (server + client) is a closed queueing system. The
throughput of the system is ultimately determined by the bottleneck in
the system. Having think time in the client only increases the
parallelism in the system; it shouldn't change the maximum throughput
much. BTW, our goal is to achieve a realistic workload setup with the
available hardware.

If you think about it, at our current server throughput level, say 5000
trans/sec, if each user has 1 second of think time between file
fetches, sustaining this throughput takes about 5000 users. On the
other hand, if we remove the think time from the client, maybe 10 users
can generate the same 5000 requests per second. So the difference here
is that one server has 5000 httpd threads and the other has only 10.
Ten won't be worse (in terms of server behavior) than 5000, right? A
greedy client won't be worse (in terms of performance) than a lazy
client, right?
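
(In queueing terms this is just Little's Law for a closed system,
N = X * (R + Z); R here is not something we measured, 2 ms is only an
assumed round number to make the arithmetic concrete.)

  Z = 1 s think time, X = 5000 trans/sec  =>  N ~ 5000 * (R + 1) ~ 5000 users
  Z = 0 and an assumed R of 2 ms          =>  N = 5000 * 0.002   =   10 users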

Well it is not that simple...


I know how to get higher performance, but I don't know why it works!
--------------------------------------------------------------------
I have two versions of the SURGE client in hand: the original and my
modified version. The difference between them is client efficiency: my
modified version fetches files more efficiently (because I made it
emulate a simpler user) and has less thread-switching overhead.

However, when I compared the server throughput using these two clients,
I got very surprising results, roughly:

  old client: 3000 trans/sec
  new client: starts out at 4700 trans/sec, gradually degrading to 2500
              trans/sec after 10-20 minutes of runtime.

This really puzzled me for a long time. My supposed performance
enhancement did not improve the server Xput, but hurt it!

It turns out the reason is that the new client was too efficient! I
added think time between URL requests, and the new client was then able
to drive the server Xput as high as 5000 trans/sec. But note that the
really interesting thing is not the think time itself, but how
sensitive the Xput is to it.

I'd prefer to call the think time "delay time" in the following,
because I really only introduced a very small amount of delay between
file fetches. The results can be seen in the following plots:

http://www.cs.wisc.edu/~xu/files/delay_results.eps
http://www.cs.wisc.edu/~xu/files/side1.eps
http://www.cs.wisc.edu/~xu/files/side2.eps

In this experiment, instead of using both the old and new versions of
the client, I used just the new version with varying delay time and
number of threads. Since there are two degrees of freedom in the
client, the plot is in 3D. The figures side1 and side2 are roughly the
2D projections of Xput vs. threads and Xput vs. delay time.

Each point on the plot is a 30-minute benchmark run on a 14P MP system.

Clearly, driving the server with no delay time is not optimal. Whether
using the same number of threads or fewer, the server Xput is no higher
than its delayed counterparts. However, you can see that the server
Xput rises rapidly with the number of clients when the delay time is 0.
On the other hand, with a small number of clients the server Xput is
inversely proportional to the delay time, while with a larger number of
clients the server Xput is proportional to the delay time.

I don't understand why a small delay time (1-3us, using nanosleep on
Solaris) would help.
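
For reference, the delay is inserted with a plain nanosleep() call,
roughly like this (a minimal sketch; 3us is just one of the values I
tried):

  #include <time.h>

  /* sleep ~3 microseconds between page fetches */
  struct timespec delay = { 0, 3000 };    /* 0 s, 3000 ns */
  nanosleep(&delay, NULL);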

Some hypotheses are that the Apache server itself has some internal
mechanism that slows down greedy clients, or that Solaris does not
schedule the server threads well enough to handle such short request
intervals, or that the greedy client simply consumes too much CPU time.

I'd appreciate any suggestions/comments from you.

-Min

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Ian Holsman <li...@holsman.net>.
Hi Min.

I'm not sure if this would make a difference in 2.0.43,
but you might want to configure Apache to use non-portable atomics:
   --enable-nonportable-atomics=yes

I think the worker MPM can then take advantage of this and remove a
mutex lock (replacing it with a spin), which may give you better
performance with multiple CPUs.
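
i.e. when you build, something like the following (the --with-mpm flag
is shown only to make the example complete):

   ./configure --with-mpm=worker --enable-nonportable-atomics=yes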

One other point: you might want to disable some CPUs and re-run your
test (e.g. as an 8-CPU machine) to see if that helps.



Min Xu wrote:
> [...]
> 


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Aaron Bannert <aa...@clove.org>.
On Wednesday, February 12, 2003, at 01:08  PM, Min Xu wrote:
> We will soon have two new 8P Sun servers equipped with
> Gigabit ethernet coming to our lab. With that, I should be able to
> experiment with separate machines.

I'd be very interested in seeing updated results from a multi-machine
networked test. Feel free to post them here once you have them.

-aaron


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
On Wed, Feb 12, 2003 at 10:35:18AM -0800, Justin Erenkrantz wrote:
> --On Wednesday, February 12, 2003 11:52 AM -0600 Min Xu 
> <mx...@cae.wisc.edu> wrote:
> 
> >First, I don't think the disk should be bottleneck in any case,
> >this is because the system has 2GB memory, Solaris's file cache is
> >able to cache all the file content. top shows the following stats:
> 
> The size of memory has nothing to do with the available bandwidth 
> that the memory has.

I didn't say it does. I was trying to rule out the possibility of the
disk being the bottleneck. But you have apparently underestimated the
Sun server's memory system; ours uses Sun's Gigaplane/Sun Fire system
interconnect.

> ...
> 
> All MP Sparcs share the same memory backplane.  That's why you hardly 
> ever see performance improvements past 8x CPUs because the memory 
> bandwidth kills you (the CPUs are starved for memory).  Moving to a 
> NUMA architecture might help, but I think that's not a feature 
> UltraSparc or Solaris support.  (I hear Linux has experimental NUMA 
> support now.)

It is indeed the case that Apache's performance doesn't go up very
much past 8 CPUs in my experiments. However, whether this is due to
limited memory bandwidth is yet to be tested. Also, I am not aware of
any literature supporting the claim that a NUMA architecture would have
higher memory bandwidth.

> I'd recommend reading http://www.sunperf.com/perfmontools.html.  You 
> should also experiment with mod_mem_cache and mod_disk_cache.

Thanks for the suggestions; I would like to try mod_mem_cache and
mod_disk_cache.

> >To test the context switching hypothesis and the backplane
> >hypothesis I changed all files in the repository to 2 bytes long,
> >that's an "a" plus an "eof". I rerun the experiment, the
> >performance is poorer!
> 
> There will still be overhead in the OS networking layer.  You are 
> using connection keep-alives and pipelining, right?  The fact that 
> your top output had a lot of kernel time, I'd bet you are spending a 
> lot of time contending on the virtual network (which is usually the 
> case when you are not using connection keep-alives - the TCP stack 
> just gets hammered).  I'd bet the local network is not optimized for 
> performance.  (DMA can't be used and functionality that could be 
> implemented on dedicated hardware must be done on the main CPU.)

Sounds interesting. I'd like to find a way to test whether the
networking layer is the problem.

> Please stop trying to convince us to pay attention to benchmarks 
> where the client and server are on the same machine.  There are just 
> too many variables that will screw things up.  The performance 
> characteristics change dramatically when they are physically separate 
> boxes.  -- justin

I agree. We will soon have two new 8P Sun servers equipped with
Gigabit ethernet coming to our lab. With that, I should be able to
experiment with separate machines.

Thanks for your insightful comments.

-Min

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Wednesday, February 12, 2003 11:52 AM -0600 Min Xu 
<mx...@cae.wisc.edu> wrote:

> First, I don't think the disk should be bottleneck in any case,
> this is because the system has 2GB memory, Solaris's file cache is
> able to cache all the file content. top shows the following stats:

The size of memory has nothing to do with the available bandwidth 
that the memory has.  I believe recent Sparcs still only use 133MHz 
RAM (PC133 at best - Sparcs don't yet use DDR, I think).  Since all 
code pages will most likely not fit entirely in CPU cache, some 
section has to be read from main memory.  IIRC, some versions of 
the UIIIi have 4MB of CPU cache, but I wouldn't be surprised if that's 
not enough (kernel pages would also have to be counted).  (I don't 
know your specifics here.)

So, if you have 14 processors (I think this is what you said you 
had), they will all be contending on the memory (~133MHz) bus.  The 
effective share of the memory bus for each processor will be roughly 
133/14 ~ 9.5MHz.  That's a severe bottleneck if main memory is 
accessed in a critical path for all processes.

All MP Sparcs share the same memory backplane.  That's why you hardly 
ever see performance improvements past 8x CPUs because the memory 
bandwidth kills you (the CPUs are starved for memory).  Moving to a 
NUMA architecture might help, but I think that's not a feature 
UltraSparc or Solaris support.  (I hear Linux has experimental NUMA 
support now.)

I'd recommend reading http://www.sunperf.com/perfmontools.html.  You 
should also experiment with mod_mem_cache and mod_disk_cache.

> To test the context switching hypothesis and the backplane
> hypothesis I changed all files in the repository to 2 bytes long,
> that's an "a" plus an "eof". I rerun the experiment, the
> performance is poorer!

There will still be overhead in the OS networking layer.  You are 
using connection keep-alives and pipelining, right?  Given that your 
top output showed a lot of kernel time, I'd bet you are spending a 
lot of time contending on the virtual network (which is usually the 
case when you are not using connection keep-alives - the TCP stack 
just gets hammered).  I'd bet the local network path is not optimized 
for performance.  (DMA can't be used, and functionality that could be 
implemented on dedicated hardware must be done on the main CPU.)
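
For reference, keep-alives are controlled by the usual httpd.conf
directives; the values below are the 2.0 defaults:

KeepAlive On
MaxKeepAliveRequests 100
KeepAliveTimeout 15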

Please stop trying to convince us to pay attention to benchmarks 
where the client and server are on the same machine.  There are just 
too many variables that will screw things up.  The performance 
characteristics change dramatically when they are physically separate 
boxes.  -- justin

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
On Wed, Feb 12, 2003 at 11:52:03AM -0600, Min Xu wrote:
> with normal file size:
>   0us delay: system xput ~= 2500 pages/sec
>   3us delay: system xput ~= 4700 pages/sec
> 
> with 2byte file size:
>   0us delay: system xput ~= 1900 pages/sec
>   3us delay: system xput ~= 1700 pages/sec
>  10us delay: system xput ~= 1700 pages/sec

also:

  50us delay: system xput ~= 1700 pages/sec

  I ran 100 client threads.

-Min

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
On Wed, Feb 12, 2003 at 12:30:28AM -0500, Cliff Woolley wrote:
> Of course the client isn't spinning, but your statement that "the server
> should have done its job" assumes that the entire response is sent at
> once.  Depending on its length, it is entirely possible that the server
> would send the response in multiple packets.  But as soon as *any* packets
> arrive, the client is woken up.  That means that the server and the client
> are vying for timeslices.  Let's say the client gets it.  The client
> receives the partial response and then goes right back to sleep again.
> Sure the server wakes right back up and starts sending more of the
> response, but in the meanwhile, several context switches had to happen.
> 

I am sorry for my bad assumption. The reason I thought it was "mostly"
the case that the server had done its job is the following:

I have a file repository containing about 20000 files in one
directory. I know putting them all in one directory is bad, but I
didn't have time to fix the client for this. 15000 of these 20000 files
are smaller than 8K, which is the page size of the system, so I thought
that, in terms of DMA, a file smaller than 8K should be transferred in
one shot.

Anyway, I think your point is valid, and Justin's hypothesis about the
backplane/disk being the bottleneck is also very interesting, so I did
some more experiments this morning.

First, I don't think the disk should be the bottleneck in any case,
because the system has 2GB of memory and Solaris's file cache is able
to cache all the file content. top shows the following stats:

load averages:  4.05,  4.68,  5.25                                     10:33:36
47 processes:  45 sleeping, 2 on cpu
CPU states: 51.2% idle, 11.3% user, 27.8% kernel,  9.8% iowait,  0.0% swap
Memory: 2048M real, 1327M free, 508M swap in use, 3256M swap free

The total file size is < 500MB, so there is no reason Solaris can't
cache them all.

To test the context-switching hypothesis and the backplane hypothesis,
I changed all the files in the repository to 2 bytes long (an "a" plus
an "eof"). I reran the experiment, and the performance was poorer!

with normal file size:
  0us delay: system xput ~= 2500 pages/sec
  3us delay: system xput ~= 4700 pages/sec

with 2byte file size:
  0us delay: system xput ~= 1900 pages/sec
  3us delay: system xput ~= 1700 pages/sec
 10us delay: system xput ~= 1700 pages/sec

It seems that the bottleneck is somewhere else, and that it is
stressed harder when the file size is smaller. But I doubt it is the
backplane or the context switching.

-Min

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts

Re: Apache 2.0.45 -- When?

Posted by "William A. Rowe, Jr." <wr...@rowe-clan.net>.
At 10:01 AM 2/12/2003, Jess M. Holle wrote:
>There were mutterings that Apache 2.0.45 would be following shortly after 2.0.44. It's now been a bit, and I'm left wondering: what is the release timeframe for 2.0.45?

My goal is to tag ASAP - but I'm just back from CA and had too many
deliverables lately...

Here are my goals (your mileage may vary):

  .Resolve Greg's concerns about ENABLE_SENDFILE, a 2.0.44
   regression.

  .Fix the 2.0.44 regression in OtherChild platform compatibility

  .Call APR 0.9.2 "done" so they can move on to 0.9.3 (and soon 1.0.0)

  .Fix my choice of .dbgmark for .dbg file creation timestamps (it turns
   out that *.dbg includes 8.3 names of .dbgmark files, yuck!)

I'm tracking a few higher priority issues at the moment so I will probably
tag and roll a 2.0.45_RC1 tarball candidate late Friday afternoon.

Bill 


Apache 2.0.45 -- When?

Posted by "Jess M. Holle" <je...@ptc.com>.
There were mutterings that Apache 2.0.45 would be following shortly 
after 2.0.44. It's now been a bit, and I'm left wondering: what is the 
release timeframe for 2.0.45?

--
Jess Holle



Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Cliff Woolley <jw...@virginia.edu>.
On Tue, 11 Feb 2003, Min Xu wrote:

> I don't understand, if the client is blocked, it only be woke up when
> the server reply is avaiable. In that case, the server should have done
> it is job. In other words, the client is not spinning, it is instead
> put on sleep when calling read syscall.

Of course the client isn't spinning, but your statement that "the server
should have done its job" assumes that the entire response is sent at
once.  Depending on its length, it is entirely possible that the server
would send the response in multiple packets.  But as soon as *any* packets
arrive, the client is woken up.  That means that the server and the client
are vying for timeslices.  Let's say the client gets it.  The client
receives the partial response and then goes right back to sleep again.
Sure the server wakes right back up and starts sending more of the
response, but in the meanwhile, several context switches had to happen.

--Cliff


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
On Tue, Feb 11, 2003 at 08:09:14PM -0500, Cliff Woolley wrote:
> On Tue, 11 Feb 2003, Min Xu wrote:
> 
> > Well, I am not defending this server/client-on-one-system is better
> > or anything. Just want to understand this better. Isn't the clients
> > block when the servers can not response? From a higher level of point
> > of view, the system is a closed queuing system. In the steady state
> > there should be a balance between servers and clients, right?
> 
> You still end up with situations where the client steals the timeslice
> away from the server when it doesn't really have any significant work to
> do, and then the server has to switch back in and do a little more
> work.... etc.

I don't understand: if the client is blocked, it is only woken up when
the server's reply is available. In that case, the server should have
done its job. In other words, the client is not spinning; it is instead
put to sleep when calling the read syscall.

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Cliff Woolley <jw...@virginia.edu>.
On Tue, 11 Feb 2003, Justin Erenkrantz wrote:

> Isolating processors is not enough.  Your bottleneck is probably the
> memory backplane (or disk throughput), which doesn't partition.  The

A very good point.


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Tuesday, February 11, 2003 7:05 PM -0600 Min Xu 
<mx...@cae.wisc.edu> wrote:

> case. On the other hand, I have used solaris "psrset" to logically
> divide the 14p server into "two" machines, and I bind the server
> and the client to different processor sets. And the results shows
> again the small delay time have important impact on system
> throughput.

Isolating processors is not enough.  Your bottleneck is probably the 
memory backplane (or disk throughput), which doesn't partition.  The 
small time delay in your client is probably freeing the backplane to 
service the starved processes.  The only way the backplane won't be a 
factor is if everything fits into cache - which almost never happens.

If you want to mimic the real world, use separate machines and a fast 
network backbone.  GigE is probably what you'd need to make the CPU 
the bottleneck - special cases can be crafted with mod_include or 
large files that saturate FastE (see flood).  For optimal 
performance, you'd need to ensure that your NICs and disk use DMA. 
This allows for zero copy - ensure you are using sendfilev() on 
Solaris.  -- justin
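
For reference, a minimal sketch of the Solaris zero-copy path with
sendfilev() (the descriptor names are made up and error handling is
omitted):

  #include <sys/sendfile.h>   /* Solaris: sendfilev(), struct sendfilevec */
  #include <sys/types.h>

  /* Send an already-open file over an already-connected socket. */
  ssize_t send_whole_file(int sock_fd, int file_fd, size_t file_len)
  {
      struct sendfilevec vec;
      size_t xferred = 0;

      vec.sfv_fd   = file_fd;    /* read from this file descriptor     */
      vec.sfv_flag = 0;
      vec.sfv_off  = 0;          /* start at the beginning of the file */
      vec.sfv_len  = file_len;   /* send the whole file                */

      /* the kernel moves data from the file cache to the socket
         without copying it through user space */
      return sendfilev(sock_fd, &vec, 1, &xferred);
  }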

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Cliff Woolley <jw...@virginia.edu>.
On Tue, 11 Feb 2003, Min Xu wrote:

> Well, I am not defending this server/client-on-one-system is better
> or anything. Just want to understand this better. Isn't the clients
> block when the servers can not response? From a higher level of point
> of view, the system is a closed queuing system. In the steady state
> there should be a balance between servers and clients, right?

You still end up with situations where the client steals the timeslice
away from the server when it doesn't really have any significant work to
do, and then the server has to switch back in and do a little more
work.... etc.


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
On Tue, Feb 11, 2003 at 03:11:36PM -0800, David Burry wrote:
> Because the client will contend very heavily with the server for many
> system resources.  It's indeterminate which one (client or server)
> requires more resources, which one wins more, and how much more of which
> resources.

Well, I am not claiming that this server/client-on-one-system setup is
better or anything; I just want to understand it better. Don't the
clients block when the server cannot respond? From a higher-level point
of view, the system is a closed queueing system, and in the steady
state there should be a balance between servers and clients, right?

> Running both on the same machine will certainly stress the
> machine pretty well, but you can't compare any measurement you get with
> what the same machine will perform if Apache doesn't have to contend
> with a client for its resources, it won't be the same result at all.

We have no intention of comparing these two at all. Our goal was to
achieve a reasonable workload that behaves similarly to the real-world
application.

> In
> the real world apache doesn't have a client stealing its system
> resources, therefore an accurate test of how apache would behave in the
> real world can only be done if you set up a test with the same
> situation.  This could be why apache is performing better when you let
> your client sleep a little (then again, it could be something else,
> that's why I say it's "indeterminate" (unknown) how much of the
> resources the client itself is stealing away from the server).  To
> measure the effect of anything, you have to limit the number of
> variables that can influence the result.

I agree. We did try other experiments to test this hypothesis. We
first used separate machines for the clients, and this behavior
disappeared. But I think the reason was that the network (Ethernet)
latency between the server and client machines essentially served as
the small delay time I had added in the single-machine case. On the
other hand, I have used the Solaris "psrset" facility to logically
divide the 14p server into "two" machines, binding the server and the
client to different processor sets. The results again show that the
small delay time has an important impact on system throughput.
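
For reference, the processor-set setup looks roughly like this (the
set id printed by psrset and the pids are placeholders; the CPU ids
are half of those listed in the mpstat output elsewhere in this
thread):

  psrset -c 0 1 4 5 12 13 16    # create a set from half the CPUs; prints a set id, say 1
  psrset -b 1 <httpd pids>      # bind the server processes to set 1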

How strange!

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts

RE: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by David Burry <db...@tagnet.org>.
> > >I am running the server/client on the
> > >same machine.
> > 
> > You will not get reliable results by doing this.
> 
> Can you elaborate why? Plus we were forced to do this,
> but would like to avoid in the future if it really affects
> our results.

Because the client will contend very heavily with the server for many
system resources.  It's indeterminate which one (client or server)
requires more resources, which one wins more, and how much more of
which resources.  Running both on the same machine will certainly
stress the machine pretty well, but you can't compare any measurement
you get with how the same machine would perform if Apache didn't have
to contend with a client for its resources; it won't be the same result
at all.  In the real world Apache doesn't have a client stealing its
system resources, so an accurate test of how Apache would behave in the
real world can only be done if you set up a test with the same
situation.  This could be why Apache performs better when you let your
client sleep a little (then again, it could be something else; that's
why I say it's "indeterminate" (unknown) how much of the resources the
client itself is stealing away from the server).  To measure the effect
of anything, you have to limit the number of variables that can
influence the result.

Dave


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Aaron Bannert <aa...@clove.org>.
On Tuesday, February 11, 2003, at 01:13  PM, Min Xu wrote:

> On Tue, Feb 11, 2003 at 01:00:53PM -0800, Aaron Bannert wrote:
>>
>> On Tuesday, February 11, 2003, at 11:34  AM, Min Xu wrote:
>>> I am running the server/client on the
>>> same machine.
>>
>> You will not get reliable results by doing this.
>
> Can you elaborate why? Plus we were forced to do this, but would
> like to avoid in the future if it really affects our results.

It is not an accurate model of real-world web-serving scenarios.
Besides, your client and server processes are obviously going to
contend for access to system resources.

-aaron


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
On Tue, Feb 11, 2003 at 01:00:53PM -0800, Aaron Bannert wrote:
> 
> On Tuesday, February 11, 2003, at 11:34  AM, Min Xu wrote:
> >I am running the server/client on the
> >same machine.
> 
> You will not get reliable results by doing this.

Can you elaborate on why? Also, we were forced to do this, but we
would like to avoid it in the future if it really affects our results.

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Aaron Bannert <aa...@clove.org>.
On Tuesday, February 11, 2003, at 11:34  AM, Min Xu wrote:
> I am running the server/client on the
> same machine.

You will not get reliable results by doing this.

-aaron


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
On Tue, Feb 11, 2003 at 10:08:29AM -0800, Aaron Bannert wrote:
> A couple questions and a couple observations:
> 
> 1) How many network cards are in the server? (How many unique
>    interrupt handlers for the network?)

There are two network cards in the server, but I don't think they
are used in my experiment. I am running the server and client on the
same machine, so I don't think the real network interface, such as
hme0, is used at all.

> 2) How much context switching was going on, and how impacted
>    were the mutexes (see mpstat)?

For 100 threads and a 3us sleep time, here is typical output from
'mpstat 5':

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   33  17 6543   748  300 1694  357  289  679   24  4744   14  33  21  33
  1   37  16 6164   623  100 2162  498  356  688   29  5122   10  34  19  37
  4   32  18 5723   700  100 2449  556  361  681   21  5431    9  33  17  42
  5   24  14 5648   721  100 2648  581  364  663   18  5597   12  32  18  39
 12   23  11 5802   756  102 2827  612  419  663   18  5629   11  24  15  50
 13   21  15 3507   791  100 2923  646  416  657   20  5708    8  29  20  43
 16   25  16 2268   803  100 3025  662  434  651   22  5794    9  31  18  42
 17   23  11 5944  1186  571 2646  590  421  562   20  5207   11  31  18  41
 20   30  17 4773   637  100 2212  515  402  660   27  5125    8  33  21  39
 21   31  17 4582   615  100 2196  495  369  700   24  5281   11  30  18  41
 24   31  16 3647   647  100 2359  514  355  709   24  5359   10  28  20  42
 25   25  13 3780   706  100 2655  574  362  695   18  5536   13  27  21  39
 28   27  14 3800   711  100 2686  582  391  666   19  5524   10  27  17  46
 29   26  14 3344   821  100 2980  675  423  649   19  5768    9  30  21  40
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   34  17 4135   701  299 1596  315  265  670   24  4681    7  21  22  50
  1   37  16 3130   523  100 1989  404  331  658   23  5220    6  20  20  54
  4   29  12 6719   563  100 2236  448  355  662   21  5212    5  17  21  56
  5   26  13 2100   639  100 2498  520  355  606   20  5495    8  15  18  60
 12   26   8 3041   614  102 2593  486  372  621   20  5603    4  18  14  64
 13   21  12 3256   664  100 2808  541  411  620   18  5561    6  15  18  61
 16   19  11 3042   616  100 2768  491  439  556   13  5542    4  17  16  63
 17   22   6 1554  1021  480 2777  519  455  570   18  5521    4  17   8  71
 20   31  11 5914   552  100 2157  433  414  644   16  5290    5  20  15  60
 21   31  17 3986   513  100 2012  396  343  682   20  5221    5  22  21  51
 24   33  12 4072   520  100 2105  404  349  683   20  5118    4  19  20  57
 25   27  14 2756   606  100 2478  488  366  632   20  5459    4  19  20  57
 28   26  13 6575   646  100 2659  521  380  620   14  5353    5  16  23  56
 29   24  10 3818   652  100 2708  520  403  615   18  5552    5  19  15  61
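
(Column legend, all rates per second per CPU: xcal = cross-calls,
csw = context switches, icsw = involuntary context switches,
migr = thread migrations, smtx = spins on mutexes, syscl = system
calls.)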

The context-switch rate is a bit high. smtx has been reduced; it used
to be higher due to client synchronization.

In total, two httpd processes are started:

xu       26557 13.5  0.4 9904 7760 ?        O 13:25:32  1:44
/local.pbr.2/static-web/apache/bin/httpd -k start
xu       26587  0.0  0.1 1080  784 pts/3    S 13:26:12  0:00 grep httpd
xu       26512  0.0  0.2 4568 3080 ?        S 13:19:22  0:00
/local.pbr.2/static-web/apache/bin/httpd -k start
xu       26513  0.0  0.1 4344 1456 ?        S 13:19:22  0:00
/local.pbr.2/static-web/apache/bin/httpd -k start

server config:

<IfModule worker.c>
StartServers         5
ServerLimit          16
MinSpareThreads      1
MaxSpareThreads      1
ThreadsPerChild      64
# MaxClients = ServerLimit * ThreadsPerChild
MaxClients           1024
MaxRequestsPerChild  100000
</IfModule>

> 3) Was the workload uniformly distributed across the CPUs?

I think so. (see mpstat)

> I've seen large MP systems completely fail to distribute the
> workload, and suffer because of it. My current theory for why
> this occurs is that the interrupt load is overwhelming the CPU
> where that interrupt is being serviced. This combined with the
> relatively small amount of userspace work that must be done to
> push a small static file out is wreaking havoc on the scheduler
> (and it's probably more dramatic if your system enabled sendfile
> support).

I think sendfile is enabled in Apache too (is it the default?).

-Min

Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Aaron Bannert <aa...@clove.org>.
A couple questions and a couple observations:

1) How many network cards are in the server? (How many unique
    interrupt handlers for the network?)

2) How much context switching was going on, and how impacted
    were the mutexes (see mpstat)?

3) Was the workload uniformly distributed across the CPUs?

I've seen large MP systems completely fail to distribute the
workload, and suffer because of it. My current theory for why
this occurs is that the interrupt load is overwhelming the CPU
where that interrupt is being serviced. This combined with the
relatively small amount of userspace work that must be done to
push a small static file out is wreaking havoc on the scheduler
(and it's probably more dramatic if your system enabled sendfile
support).

-aaron


On Tuesday, February 11, 2003, at 09:34  AM, Min Xu wrote:

> Thanks to Owen Garrett, who reminded me that I should have
> mentioned a little more details about the client configuration.
>
> My modified SURGE client fetch web pages on an "object" basis,
> each object contains multiple pages. For each object the client
> uses HTTP/1.1 keepalive, but not pipeline. After an object
> being fetched completely, the client close the connection
> to the server and reopen a new one for next object.
>
> The delay time I added was between each web pages, so the
> client goes to sleep for a little while with the connection
> being still open.
>
> FYI, I have attached the client code. Anyone have a wild guess
> on what's going on? ;-) Thanks a lot!


Re: Strange Behavior of Apache 2.0.43 on SPARC MP system

Posted by Min Xu <mx...@cae.wisc.edu>.
Thanks to Owen Garrett, who reminded me that I should have
mentioned a few more details about the client configuration.

My modified SURGE client fetches web pages on an "object" basis;
each object contains multiple pages. For each object the client
uses HTTP/1.1 keep-alive, but not pipelining. After an object
has been fetched completely, the client closes the connection
to the server and opens a new one for the next object.

The delay time I added was between web pages, so the client
goes to sleep for a little while with the connection still open.
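
In outline, each client thread does something like this (a simplified
sketch, not the real SURGE code; the helper functions are
placeholders):

  #include <time.h>      /* nanosleep */
  #include <unistd.h>    /* close */

  /* One emulated user: fetch "objects", each made of several pages. */
  void user_loop(void)
  {
      for (;;) {
          int sock  = connect_to_server();      /* new connection per object */
          int pages = pages_in_next_object();   /* placeholder helpers       */
          int i;

          for (i = 0; i < pages; i++) {
              /* HTTP/1.1 keep-alive: one request at a time, no pipelining */
              send_get_request(sock, i);
              read_full_response(sock);

              /* the small per-page "delay time" (e.g. 3us) */
              struct timespec d = { 0, 3000 };
              nanosleep(&d, NULL);
          }

          close(sock);    /* close and reopen for the next object */
      }
  }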

FYI, I have attached the client code. Anyone have a wild guess
on what's going on? ;-) Thanks a lot!

-Min

On Mon, Feb 10, 2003 at 01:50:35PM -0600, Min Xu wrote:
> [original post quoted in full; snipped]

-- 
Rapid keystrokes and painless deletions often leave a writer satisfied with
work that is merely competent.
  -- "Writing Well" Donald Hall and Sven Birkerts