Posted to dev@httpd.apache.org by Justin Erenkrantz <ju...@erenkrantz.com> on 2004/08/03 12:13:51 UTC

mod_cache performance

--On Monday, August 2, 2004 2:49 PM -0400 Bill Stoddard <bi...@wstoddard.com> 
wrote:

> To get mod_cache/mod_mem_cache (I know little or nothing about
> mod_disk_cache) really performing competitively against best-of-breed caches
> will require bypassing output filters (and prebuilding headers) and possibly

Here are some comparative numbers to chew on.

One client and one server on 100Mbps network (cheapy 100Base-T switch);
50 simulated users hitting 7 URLs 100 times with flood (35,000 requests).

mod_disk_cache: Requests: 35000 Time: 40.91 Req/Sec: 856.78
mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
squid:          Requests: 35000 Time: 105.35 Req/Sec: 332.25

mod_disk_cache completely filled out the network at ~50% CPU usage.
    [Can't push through more than ~8MB/sec (~64Mb/sec) without GigE.]
mod_mem_cache filled up the CPU but not the network
    [Poor scaling characteristics.  It goes to 100% CPU with just 5 users!]
No caching was better CPU-wise (less utilization) than mod_mem_cache
    [Still not as good network or CPU-wise as mod_disk_cache]
squid was really inefficient both CPU and network-wise.
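
As a rough sanity check on the figures above, the sketch below recomputes the request count, converts the ~8MB/sec ceiling into megabits, and estimates the average response size. The function names are made up for illustration, and the average-size estimate assumes the ~8MB/sec rate was sustained for the whole mod_disk_cache run, which is not stated above.

```python
def total_requests(users, urls, iterations):
    """Each simulated user fetches every URL once per iteration."""
    return users * urls * iterations


def megabytes_to_megabits(mb_per_sec):
    """1 byte = 8 bits."""
    return mb_per_sec * 8


def avg_response_kb(mb_per_sec, req_per_sec):
    """Average payload per request in KB (1 MB = 1000 KB), assuming a sustained rate."""
    return mb_per_sec * 1000 / req_per_sec


print(total_requests(50, 7, 100))             # 35000
print(megabytes_to_megabits(8))               # 64
print(round(avg_response_kb(8, 856.78), 1))   # roughly 9.3 KB per response
```

If that estimate is in the right ballpark, the average response is close to 10,000 bytes, so some of the seven URLs may well be larger than that.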

The squid numbers *completely* baffle me.  I have to believe I've got 
something stupid configured in squid (or I did something stupid with flood; 
but the network traces and truss output convince me otherwise).  My squid is 
using the default RHEL3 installation (Squid Cache: Version 2.5.STABLE3).

squid and httpd are on the same box - I may try to move squid to another box - 
will see if I have time tomorrow to find a suitable target to move to...

For those playing along at home, I am hitting the following URLs with flood:

   /
   /apache_pb.gif
   /manual/
   /manual/images/left.gif
   /manual/images/feather.gif
   /manual/content-negotiation.html
   /icons/

HTH.  -- justin

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.
Graham Leggett wrote:

>
> mem cache and disk cache were created because not every platform 
> performs best using the same techniques.
>
> This competition between mem cache and disk cache will hopefully make 
> them both faster, and in turn faster than other caches out there.
>
True.  Competition is good.


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.
Brian Akins wrote:

>> mod_mem_cache is broken then. It used to kick the pants off of 'no 
>> cache' and mod_disk_cache.

> If mod_disk_cache were patched to use sendfile, it would perform better.

mem cache and disk cache were created because not every platform 
performs best using the same techniques.

This competition between mem cache and disk cache will hopefully make 
them both faster, and in turn faster than other caches out there.

Regards,
Graham
--

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.
Bill Stoddard wrote:

>
> mod_mem_cache is broken then. It used to kick the pants off of 'no 
> cache' and mod_disk_cache.


If mod_disk_cache were patched to use sendfile, it would perform better.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Tuesday, August 3, 2004 9:12 AM -0400 Brian Akins <ba...@web.turner.com> 
wrote:

> Probably not, because you would probably have to lock around it.  It just
> seems it's better to let the filesystem worry about a lot of this stuff
> (locking, reference counting, etc.).

+1.  =)  -- justin

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.
Graham Leggett wrote:

>
> This is true - mem cache would probably improve drastically with a 
> shared memory cache.


Probably not, because you would probably have to lock around it.  It 
just seems it's better to let the filesystem worry about a lot of this 
stuff (locking, reference counting, etc.).

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.
Brian Akins wrote:

> The big hits for mem cache are:
> 
> The cache is not shared between processes, so you use a lot more memory 
> and get far fewer "hits."

This is true - mem cache would probably improve drastically with a 
shared memory cache.
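
A toy model of the point being quoted, assuming every process ends up serving every URL: with one cache per process, each process has to miss once per URL before it can serve hits, and each holds its own copy of the body. The function names and the worker count of 8 are hypothetical; this is not how mod_mem_cache accounts for anything.

```python
def cold_misses(distinct_urls, processes, shared):
    """Misses needed before every cache that serves traffic holds every URL."""
    caches = 1 if shared else processes
    return distinct_urls * caches


def cached_bytes(working_set_bytes, processes, shared):
    """Memory spent on cached bodies once every cache is warm."""
    copies = 1 if shared else processes
    return working_set_bytes * copies


# 7 URLs as in the flood test; 8 worker processes is an assumed figure.
print(cold_misses(7, 8, shared=False))   # 56
print(cold_misses(7, 8, shared=True))    # 7
```

The model ignores locking cost entirely, which is the objection raised against a shared-memory cache elsewhere in this thread.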

Regards,
Graham
--

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.
Eli Marmor wrote:

> Graham Leggett wrote:
>
>> Brian Akins wrote:
>>
>>> On an OS that supports sendfile, a disk based cache will almost always
>>> bury a memory based one.
>>
>> Quite probably. But on a system without a disk, chances are it won't. :(
>
> It will.
> Unless mod_disk_cache + ram-disk + sendfile doesn't outperform
> mod_mem_cache.

This setup performs quite nicely on Linux.

The big hits for mem cache are:

The cache is not shared between processes, so you use a lot more memory 
and get far fewer "hits."

You have to copy data from user space to the kernel, which can be a huge hit.


Even without sendfile, mmap is generally faster than mem cache.
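
To make the copy-versus-sendfile distinction concrete, here is a minimal sketch that sends the same file over a local socket pair twice: once by reading it into a userspace buffer and writing it back out, and once through Python's socket.sendfile(), which uses os.sendfile() where the platform has it and falls back to plain send() otherwise. The helper names are made up for illustration; this shows the two I/O patterns only and is not how mod_disk_cache or mod_mem_cache are implemented.

```python
import os
import socket
import tempfile


def send_with_copy(sock, path, chunk_size=65536):
    """Pull file data into a userspace buffer, then push it back into the kernel."""
    sent = 0
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)   # kernel -> user copy
            if not chunk:
                break
            sock.sendall(chunk)          # user -> kernel copy
            sent += len(chunk)
    return sent


def send_with_sendfile(sock, path):
    """Hand the open file to socket.sendfile() and let the kernel move the bytes."""
    with open(path, "rb") as f:
        return sock.sendfile(f)


def recv_exactly(sock, n):
    """Read exactly n bytes from a blocking socket."""
    buf = bytearray()
    while len(buf) < n:
        data = sock.recv(n - len(buf))
        if not data:
            raise ConnectionError("peer closed early")
        buf.extend(data)
    return bytes(buf)


def demo(payload=b"cached response body\n" * 100):
    """Send one small file both ways; return (bytes_sent, bytes_received) pairs."""
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(payload)
        path = tmp.name
    a, b = socket.socketpair()
    try:
        results = []
        for sender in (send_with_copy, send_with_sendfile):
            sent = sender(a, path)
            results.append((sent, recv_exactly(b, sent)))
        return results
    finally:
        a.close()
        b.close()
        os.unlink(path)
```

Both paths deliver identical bytes; the difference is only in how many times the data crosses the user/kernel boundary. The payload is kept small so it fits in the socket buffer without a concurrent reader.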

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: mod_cache performance

Posted by Eli Marmor <ma...@netmask.it>.
Graham Leggett wrote:
> 
> Brian Akins wrote:
> 
> > On an OS that supports sendfile, a disk based cache will almost always
> > bury a memory based one.
> 
> Quite probably. But on a system without a disk, chances are it won't. :(

It will.
Unless mod_disk_cache + ram-disk + sendfile doesn't outperform
mod_mem_cache.

-- 
Eli Marmor
marmor@netmask.it
CTO, Founder
Netmask (El-Mar) Internet Technologies Ltd.
__________________________________________________________
Tel.:   +972-9-766-1020          8 Yad-Harutzim St.
Fax.:   +972-9-766-1314          P.O.B. 7004
Mobile: +972-50-23-7338          Kfar-Saba 44641, Israel

Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.
Brian Akins wrote:

> On an OS that supports sendfile, a disk based cache will almost always 
> bury a memory based one.

Quite probably. But on a system without a disk, chances are it won't. :(

Regards,
Graham
--

Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Tuesday, August 3, 2004 2:35 PM -0400 David Nicklay 
<dn...@web.turner.com> wrote:

> Send us your squid.conf and your configure options from when you built it
> (as well as what squid version), and I can tell you how to optimize it.
> I've had a lot of practice......

I've posted the squid.conf from RHEL3 at:

<http://www.ics.uci.edu/~jerenkra/caching/>

There is also the output of 'squid -v' in squid-configure.

This is just the straight RHEL3 install.  I'm open to building from sources 
as long as someone tells me what config options and squid.conf to use.  ;-)

At that URL is the proxy.xml flood test case as well.  Plus, summary 
results from flood's analyze-relative report.  (mod_mem_cache and 
mod_disk_cache are maxing out the network now...)

Thanks!  -- justin

Re: mod_cache performance

Posted by David Nicklay <dn...@web.turner.com>.
Hi,

Send us your squid.conf and your configure options from when you built 
it (as well as what squid version), and I can tell you how to optimize 
it.  I've had a lot of practice......

Brian Akins wrote:
> Justin Erenkrantz wrote:
> 
>> --On Tuesday, August 3, 2004 8:11 AM -0400 Brian Akins 
>> <ba...@web.turner.com> wrote:
>>
>>> Under load, squid will always use 100% of the CPU.  This is because it
>>> uses poll/select.
>>
>> Ouch.  That sucks.
>>
>> (But, httpd uses poll - so why does that force 100% CPU usage?)
>>
> 
> httpd blocks.  Squid doesn't in general.  Squid just calls poll over and 
> over and does lots of very small reads and writes.
> 
>>
>> Is it worth compiling my own squid then?  (Read that as 'reboot my box 
>> to FreeBSD and use the squid port.')
>>
> Check the configure options and make sure you raise the open-file limit
> and use poll.  Also kill ident checks.
> 
> 

-- 
David Nicklay     O-
Location: CNN Center - SE0811A
Office: 404-827-2698	Cell: 404-545-6218

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.
Justin Erenkrantz wrote:

> --On Tuesday, August 3, 2004 8:11 AM -0400 Brian Akins 
> <ba...@web.turner.com> wrote:
>
>> Under load, squid will always use 100% of the CPU.  This is because it
>> uses poll/select.
>
>
> Ouch.  That sucks.
>
> (But, httpd uses poll - so why does that force 100% CPU usage?)
>

httpd blocks.  Squid doesn't in general.  Squid just calls poll over and 
over and does lots of very small reads and writes.
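
A small sketch of the difference described here, assuming nothing about squid's or httpd's real event loops: one waiter sleeps inside a single blocking call until data arrives, the other keeps asking with a zero timeout and burns CPU while it waits. It uses select.select() rather than poll() purely for portability, and all function names are hypothetical.

```python
import select
import socket
import threading


def wait_blocking(sock):
    """Sleep inside select() until the socket is readable; return the call count."""
    calls = 0
    while True:
        calls += 1
        readable, _, _ = select.select([sock], [], [])      # no timeout: block
        if readable:
            return calls


def wait_busy(sock):
    """Call select() with a zero timeout over and over; return the call count."""
    calls = 0
    while True:
        calls += 1
        readable, _, _ = select.select([sock], [], [], 0)   # returns immediately
        if readable:
            return calls


def measure(wait, delay=0.05):
    """Run one wait strategy against a peer that writes a byte after `delay` seconds."""
    reader, writer = socket.socketpair()
    timer = threading.Timer(delay, writer.send, args=(b"x",))
    timer.start()
    try:
        return wait(reader)
    finally:
        timer.join()
        reader.close()
        writer.close()
```

Calling measure(wait_blocking) should need a single select() call, while measure(wait_busy) typically makes thousands in the same 50 ms, which is the sort of behaviour that would pin a CPU under load.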

>
> Is it worth compiling my own squid then?  (Read that as 'reboot my box 
> to FreeBSD and use the squid port.')
>
Check the configure options and make sure you raise the open-file limit 
and use poll.  Also kill ident checks.


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Tuesday, August 3, 2004 8:11 AM -0400 Brian Akins <ba...@web.turner.com> 
wrote:

> Under load, squid will always use 100% of the CPU.  This is because it uses
> poll/select.

Ouch.  That sucks.

(But, httpd uses poll - so why does that force 100% CPU usage?)

> RHEL 3 sucks.  Fedora Core 2 would have been a much better choice.  Also,
> did you use poll?  I know a large website that does several dozen hits per
> day using squid :)

Heh.  RHEL3 is the Linux distribution we use within the ASF.  (My local box is 
a mirror of the ASF Linux and FreeBSD setups.)  Fedora Core 2 isn't an option.

Is it worth compiling my own squid then?  (Read that as 'reboot my box to 
FreeBSD and use the squid port.')

> On an OS that supports sendfile, a disk based cache will almost always bury
> a memory based one.

Agreed.

I don't think it's worth putting a lot of effort into mod_mem_cache.  Doing 
zero-copy is just going to scale better than memory caching.  -- justin

Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.
Justin Erenkrantz wrote:

>  
> squid was really inefficient both CPU and network-wise.
>
Under load, squid will always use 100% of the CPU.  This is because it 
uses poll/select.

> The squid numbers *completely* baffle me.  I have to believe I've got 
> something stupid configured in squid (or I did something stupid with 
> flood; but the network traces and truss output convince me 
> otherwise).  My squid is using the default RHEL3 installation (Squid 
> Cache: Version 2.5.STABLE3).
>
RHEL 3 sucks.  Fedora Core 2 would have been a much better choice.  
Also, did you use poll?  I know a large website that does several dozen 
hits per day using squid :)


On an OS that supports sendfile, a disk based cache will almost always 
bury a memory based one.

-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: mod_cache performance

Posted by Ian Holsman <li...@holsman.net>.
Brian Akins wrote:

> Justin Erenkrantz wrote:
> 
>>
>> That brings it in line with mod_disk_cache in maxing out my network.  
>> Time to craft some better tests or find a faster network...  -- justin
>>
> 
> I can probably help with the latter :)
> 
> Can you send me details of your setup and I'll try to test later this week.
> 
> 
We have some boxes with a GigE network as well (set up to use flood 
with 10 PCs generating the load).

Also, we might have 1-2 amd-64 boxes I could persuade the higher-ups 
to use.



Re: mod_cache performance

Posted by Brian Akins <ba...@web.turner.com>.
Justin Erenkrantz wrote:

>
> That brings it in line with mod_disk_cache in maxing out my network.  
> Time to craft some better tests or find a faster network...  -- justin
>

I can probably help with the latter :)

Can you send me details of your setup and I'll try to test later this week.


-- 
Brian Akins
Senior Systems Engineer
CNN Internet Technologies


Re: mod_cache performance

Posted by Justin Erenkrantz <ju...@erenkrantz.com>.
--On Tuesday, August 3, 2004 6:50 PM +0200 Graham Leggett <mi...@sharp.fm> 
wrote:

>>> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
>>> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
>
> The above result would suggest that mod_mem_cache isn't being used in this
> case. It could be that mem cache has decided not to cache the requested file
> for whatever reason, which is being served via the normal "no cache" path.

It'd help if I compiled mod_mem_cache in.  *duck*  (We need better error 
messages when the cache type isn't found!  Can't we error out at config time?)

Anyway, mod_mem_cache yields (after bumping up MCacheMaxObjectSize to 100k):

Requests: 35000 Time: 40.99 Req/Sec: 856.73

That brings it in line with mod_disk_cache in maxing out my network.  Time to 
craft some better tests or find a faster network...  -- justin

Re: mod_cache performance

Posted by Graham Leggett <mi...@sharp.fm>.
Bill Stoddard wrote:

>> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
>> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81

The above result would suggest that mod_mem_cache isn't being used in 
this case. It could be that mem cache has decided not to cache the 
requested file for whatever reason, which is being served via the normal 
"no cache" path.

Regards,
Graham
--

Re: mod_cache performance

Posted by Bill Stoddard <bi...@wstoddard.com>.
Bill Stoddard wrote:

> Justin Erenkrantz wrote:
> 
>> --On Monday, August 2, 2004 2:49 PM -0400 Bill Stoddard 
>> <bi...@wstoddard.com> wrote:
>>
>>> To get mod_cache/mod_mem_cache (I know little or nothing about
>>> mod_disk_cache) really performing competitively against best-of-breed
>>> caches will require bypassing output filters (and prebuilding headers)
>>> and possibly
>>
>> Here's some comparative numbers to chew on.
>>
>> One client and one server on 100Mbps network (cheapy 100Base-T switch);
>> 50 simulated users hitting 7 URLs 100 times with flood (35,000 requests).
>>
>> mod_disk_cache: Requests: 35000 Time: 40.91 Req/Sec: 856.78
>> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
>> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
>> squid:          Requests: 35000 Time: 105.35 Req/Sec: 332.25
>>
>> mod_disk_cache completely filled out the network at ~50% CPU usage.
>>    [Can't push through more than ~8MB/sec (~64Mb/sec) without GigE.]
>> mod_mem_cache filled up the CPU but not the network
>>    [Poor scaling characteristics.  It goes to 100% CPU with just 5 users!]
> 
> 
> mod_mem_cache is broken 
Or mistuned?
Here are the defaults for the mem_cache directives:

MCacheSize               ~100 MB
MCacheMaxObjectCount     1009
MCacheMinObjectSize      0 (bytes)
MCacheMaxObjectSize      10000 (bytes)
MCacheRemovalAlgorithm   GDSF
MCacheMaxStreamingBuffer 100000 (bytes)

I have no idea if the URLs ending in / are being served at all by
mod_mem_cache.  Wouldn't surprise me if there is a bug there.
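
One way to see how the size limits above could quietly keep objects out of the cache is a simplified check like the following. It only models the MCacheMinObjectSize/MCacheMaxObjectSize bounds from the list; the function name, the inclusive boundary handling, and the example sizes are assumptions, and the real module applies other criteria as well.

```python
DEFAULT_MIN_OBJECT_SIZE = 0        # MCacheMinObjectSize default from the list above (bytes)
DEFAULT_MAX_OBJECT_SIZE = 10000    # MCacheMaxObjectSize default from the list above (bytes)


def fits_size_limits(size,
                     min_size=DEFAULT_MIN_OBJECT_SIZE,
                     max_size=DEFAULT_MAX_OBJECT_SIZE):
    """True if an object of `size` bytes falls inside the configured bounds."""
    return min_size <= size <= max_size


# Hypothetical response sizes, not measured from the test URLs.
print(fits_size_limits(2000))                     # True
print(fits_size_limits(50000))                    # False with the defaults
print(fits_size_limits(50000, max_size=100000))   # True once the limit is raised
```

Anything over the default 10000-byte ceiling would fall through to the normal "no cache" path under this model, which would fit with the mod_mem_cache and no-cache numbers being nearly identical.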

Bill

Re: mod_cache performance

Posted by Bill Stoddard <bi...@wstoddard.com>.
Justin Erenkrantz wrote:
> --On Monday, August 2, 2004 2:49 PM -0400 Bill Stoddard 
> <bi...@wstoddard.com> wrote:
> 
>> To get mod_cache/mod_mem_cache (I know little or nothing about
>> mod_disk_cache) really performing competitively against best-of-breed
>> caches will require bypassing output filters (and prebuilding headers)
>> and possibly
>
> Here's some comparative numbers to chew on.
> 
> One client and one server on 100Mbps network (cheapy 100Base-T switch);
> 50 simulated users hitting 7 URLs 100 times with flood (35,000 requests).
> 
> mod_disk_cache: Requests: 35000 Time: 40.91 Req/Sec: 856.78
> mod_mem_cache:  Requests: 35000 Time: 54.90 Req/Sec: 637.81
> no cache:       Requests: 35000 Time: 54.86 Req/Sec: 638.81
> squid:          Requests: 35000 Time: 105.35 Req/Sec: 332.25
> 
> mod_disk_cache completely filled out the network at ~50% CPU usage.
>    [Can't push through more than ~8MB/sec (~64Mb/sec) without GigE.]
> mod_mem_cache filled up the CPU but not the network
>    [Poor scaling characteristics.  It goes to 100% CPU with just 5 users!]

mod_mem_cache is broken then. It used to kick the pants off of 'no cache' and mod_disk_cache.

Bill