Posted to dev@apr.apache.org by Greg Ames <gr...@remulak.net> on 2001/08/17 16:18:19 UTC

Re: Performance numbers..... ;(

"Victor J. Orlikowski" wrote:
> 
> Hi all,
> 
> Was running some performance tests on AIX...
> And oddities popped up between 1.3 and 2.0 (latest CVS of both).
> 
> Requests/sec.
>                     no keepalive          keepalive
> 2.0 - prefork          420                  590
> 2.0 - threaded         390                  580
> 1.3                    420                  700

Victor,

Now we're going to have to beat you with a stick.  You didn't mention
what kind of workload you're running.  Are you serving a static file? 
How big?  These things are important to know when analyzing benchmark
results.  

I'm guessing you're serving a smallish static file.  The Magic 8-ball at
http://8ball.federated.com/ said "Without a doubt" when I asked this
question.

If ssi's were working, and you were measuring them, I think we would see
2.0 look a lot worse compared to 1.3, and threaded worst of all, because
we don't have a fast mutex-free replacement for malloc/free yet <sigh>. 
For the buckets code, having some kind of connection lifetime apr_pool
might be good enough, if we had a way to terminate keepalive connections
that ate too much memory.  But we need a good apr_malloc/free for other
stuff anyway, like caching.

I'm thinking it might be worth the effort to have some kind of
apr_compare_and_swap primitive.  Yes, it would involve writing a little
assembler code for the different CPU architectures, which Apache hasn't
done before AFAIK.  But let's say we got past that and had it
available.  We could use it to create:

* apr_push/pop, which could be used for memory block allocators
(apr_pools and apr_malloc),  and
* apr_atomic_add, which could be used to safely increment and decrement
shared counters etc.,

all without mutexes.  
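
To make the idea concrete, here is a rough sketch of how both could sit
on top of a compare-and-swap.  It uses C11 <stdatomic.h> as a stand-in
for the per-CPU assembler, and every name in it (sketch_atomic_add,
sketch_push, sketch_pop) is hypothetical, not an existing APR function.
The pop as written ignores the ABA problem and assumes nodes are never
freed while another thread may still be reading them, so treat it as an
illustration of the retry loop rather than a finished allocator.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; none of these are APR functions. */

/* Atomic add built from compare-and-swap: retry until no other thread
 * changed the counter between our read and our write.
 * Returns the value the counter held before the add. */
static long sketch_atomic_add(_Atomic long *counter, long delta)
{
    long old = atomic_load(counter);
    /* A failed compare-exchange stores the current value into 'old'. */
    while (!atomic_compare_exchange_weak(counter, &old, old + delta)) {
        /* retry with the refreshed 'old' */
    }
    return old;
}

/* Intrusive node: a free memory block can hold this in its first bytes. */
typedef struct sketch_node {
    struct sketch_node *next;
} sketch_node;

typedef struct {
    _Atomic(sketch_node *) head;
} sketch_stack;

/* Push: link the node to the current head, then swing head to the node. */
static void sketch_push(sketch_stack *s, sketch_node *n)
{
    assert(n != NULL);
    sketch_node *old = atomic_load(&s->head);
    do {
        n->next = old;
    } while (!atomic_compare_exchange_weak(&s->head, &old, n));
}

/* Pop: swing head to head->next.  Returns NULL when the stack is empty.
 * Not ABA-safe; a real version needs a version counter or similar. */
static sketch_node *sketch_pop(sketch_stack *s)
{
    sketch_node *old = atomic_load(&s->head);
    while (old != NULL &&
           !atomic_compare_exchange_weak(&s->head, &old, old->next)) {
        /* 'old' was refreshed by the failed compare-exchange; retry */
    }
    return old;
}
```

A block allocator would call sketch_pop to grab a free block and
sketch_push to return it, falling back to malloc only when the pop
comes back NULL.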

Greg

p.s. just kidding about the stick

Re: Performance numbers..... ;(

Posted by Greg Ames <gr...@remulak.net>.
Brian Pane wrote:

> I know the conventional wisdom is that SSI is slow because of
> the malloc calls in bucket creation, but the problem might
> be elsewhere.  I've tried adding in a free list of recycled
> buckets, to reduce the calls to malloc, but it didn't seem to
> affect performance measurably.  

hmmm, interesting.  Just a while ago I was testing SSI's on daedalus
because of the mod_include problem with
http://httpd.apache.org/docs-2.0/misc/FAQ.html .  The functional bug is
gone, but it was abysmally slow (2 req/sec).  top said we weren't
burning up the CPU (94+% idle).  server-status had every worker in "W"
state.  ps ax -O wchan said almost everybody was blocked in select, so I
wrote it off as a bandwidth constraint on the path back to my ThinkPad.

>     The bottleneck might instead
> be centered on the other operations required for brigade
> in SSI requests in 2.0, like splitting buckets and registering
> pool cleanup functions.  Further profiling should be helpful
> here...

It would be great if we could get a handle on this somehow.

Greg

Re: Performance numbers..... ;(

Posted by Bill Stoddard <bi...@wstoddard.com>.

> Greg Ames wrote:
>
> >"Victor J. Orlikowski" wrote:
> >
> [...]
>
> >If ssi's were working, and you were measuring them, I think we would see
> >2.0 look a lot worse compared to 1.3, and threaded worst of all, because
> >we don't have a fast mutex-free replacement for malloc/free yet <sigh>.
> >For the buckets code, having some kind of connection lifetime apr_pool
> >might be good enough, if we had a way to terminate keepalive connections
> >that ate too much memory.  But we need a good apr_malloc/free for other
> >stuff anyway, like caching.
> >
> I know the conventional wisdom is that SSI is slow because of
> the malloc calls in bucket creation, but the problem might
> be elsewhere.  I've tried adding in a free list of recycled
> buckets, to reduce the calls to malloc, but it didn't seem to
> affect performance measurably.

I've replaced the malloc/free calls in the bucket code on Windows with a
power-of-2 allocator, and it makes a BIG difference in performance.  I
expect the same on every OS with the exception of Linux.
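
For readers who haven't seen one, a power-of-2 allocator in its simplest
form looks something like the sketch below.  This is not the Windows
patch itself: the names, the 32-byte minimum, and the eight size classes
are all assumptions picked for illustration.  It is also single-threaded
as written and expects the caller to pass the size back on free.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical constants: classes of 32, 64, ... 4096 bytes. */
#define SKETCH_MIN_SHIFT   5
#define SKETCH_NUM_CLASSES 8

typedef struct sketch_block {
    struct sketch_block *next;
} sketch_block;

/* One free list per size class; not thread-safe in this sketch. */
static sketch_block *sketch_free_lists[SKETCH_NUM_CLASSES];

/* Map a request size to its class index, or -1 if it is too large. */
static int sketch_size_class(size_t size)
{
    size_t cap = (size_t)1 << SKETCH_MIN_SHIFT;
    int cls = 0;
    while (cls < SKETCH_NUM_CLASSES && cap < size) {
        cap <<= 1;
        cls++;
    }
    return cls < SKETCH_NUM_CLASSES ? cls : -1;
}

/* Reuse a recycled block of the right class if there is one;
 * otherwise malloc a block rounded up to the class size. */
static void *sketch_alloc(size_t size)
{
    int cls = sketch_size_class(size);
    if (cls < 0)
        return malloc(size);          /* oversized: plain malloc */
    sketch_block *b = sketch_free_lists[cls];
    if (b != NULL) {
        sketch_free_lists[cls] = b->next;
        return b;
    }
    return malloc((size_t)1 << (SKETCH_MIN_SHIFT + cls));
}

/* Return a block to its class's free list instead of calling free(). */
static void sketch_free(void *p, size_t size)
{
    assert(p != NULL);
    int cls = sketch_size_class(size);
    if (cls < 0) {
        free(p);
        return;
    }
    sketch_block *b = p;
    b->next = sketch_free_lists[cls];
    sketch_free_lists[cls] = b;
}
```

The win, if there is one, comes from the common path touching only a
list head rather than the general-purpose heap.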

> The bottleneck might instead
> be centered on the other operations required for brigade
> in SSI requests in 2.0, like splitting buckets and registering
> pool cleanup functions.  Further profiling should be helpful
> here...

Certainly.

>
> --Brian

Re: Performance numbers..... ;(

Posted by Brian Pane <bp...@pacbell.net>.
Greg Ames wrote:

>"Victor J. Orlikowski" wrote:
>
[...]

>If ssi's were working, and you were measuring them, I think we would see
>2.0 look a lot worse compared to 1.3, and threaded worst of all, because
>we don't have a fast mutex-free replacement for malloc/free yet <sigh>. 
>For the buckets code, having some kind of connection lifetime apr_pool
>might be good enough, if we had a way to terminate keepalive connections
>that ate too much memory.  But we need a good apr_malloc/free for other
>stuff anyway, like caching.
>
I know the conventional wisdom is that SSI is slow because of
the malloc calls in bucket creation, but the problem might
be elsewhere.  I've tried adding in a free list of recycled
buckets, to reduce the calls to malloc, but it didn't seem to
affect performance measurably.  The bottleneck might instead
be centered on the other operations required for brigade
in SSI requests in 2.0, like splitting buckets and registering
pool cleanup functions.  Further profiling should be helpful
here...

--Brian



