Posted to users@trafficserver.apache.org by Rob Maidment <Ro...@clearswift.com> on 2010/07/14 15:27:07 UTC

why does cache hit rate tail off?

Hello

I'm measuring trafficserver performance using the web polygraph tool:
http://www.web-polygraph.org/

Performance is generally impressive, but I'm trying to understand the strange results I'm seeing for cache hit rate.  The test is a short duration (20 minutes) and initially the hit rate measured by polygraph exactly matches the expected (offered) rate.  However 3 minutes into the test the hit rate starts to drop below the expected rate and continues to tail off for the duration of the test.  Can anyone explain why this might be?

I'm testing on a 1 Gbps network - does trafficserver somehow work out it's quicker to get responses from the server than from its local cache??  Is this tuneable at all?

regards
Rob


This e-mail and any files transmitted with it are strictly confidential, may be privileged and are intended only for use by the addressee unless otherwise indicated.  If you are not the intended recipient any use, dissemination, printing or copying is strictly prohibited and may be unlawful.  If you have received this e-mail in error, please delete it immediately and contact the sender as soon as possible.  Clearswift cannot be held liable for delays in receipt of an email or any errors in its content. Clearswift accepts no responsibility once an e-mail and any attachments leave us. Unless expressly stated, opinions in this message are those of the individual sender and not of Clearswift.

This email message has been inspected by Clearswift for inappropriate content and security threats. 

To find out more about Clearswift’s solutions please visit www.clearswift.com

RE: why does cache hit rate tail off?

Posted by Rob Maidment <Ro...@clearswift.com>.
Sorry, I don’t know why I didn’t spot that!  Having increased the cache capacity, the measured hit rate now exactly matches the expected hit rate.
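Why an undersized cache produces exactly this kind of tail-off can be illustrated with a toy simulation. This is a minimal sketch, not Traffic Server's actual eviction logic: it assumes a plain LRU cache counted in objects rather than bytes, and a synthetic workload in which a fixed fraction of requests revisit a uniformly chosen earlier object. The function name `simulate` and all parameter values are made up for illustration.

```python
import random
from collections import OrderedDict


def simulate(capacity, requests, offered_rate, window, seed=0):
    """Return a list of (offered_hits, measured_hits), one pair per window.

    offered_hits counts requests for an object that was requested before;
    measured_hits counts those that were still in the (toy LRU) cache.
    """
    rng = random.Random(seed)
    cache = OrderedDict()   # object id -> True, oldest entry first
    seen = 0                # object ids 0..seen-1 have been requested so far
    results = []
    offered = measured = 0
    for i in range(1, requests + 1):
        if seen and rng.random() < offered_rate:
            obj = rng.randrange(seen)   # revisit an earlier object
            offered += 1
        else:
            obj = seen                  # brand-new object, always a miss
            seen += 1
        if obj in cache:
            measured += 1
            cache.move_to_end(obj)      # mark as most recently used
        else:
            cache[obj] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict least recently used
        if i % window == 0:
            results.append((offered, measured))
            offered = measured = 0
    return results


if __name__ == "__main__":
    for n, (off, hit) in enumerate(simulate(200, 4000, 0.5, 200)):
        print(f"window {n:2d}: offered {off:3d}  measured {hit:3d}")
```

In the first window nothing can have been evicted yet, so the measured hits equal the offered hits; once the number of distinct objects exceeds the capacity, the measured count falls behind and keeps falling as the working set grows.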


Re: why does cache hit rate tail off?

Posted by Leif Hedstrom <zw...@apache.org>.
On 07/14/2010 10:02 AM, Rob Maidment wrote:
>
> The old (LRU) algorithm made little difference.
>
> I guess it could be that I’m simply exceeding the cache capacity.  The 
> test uses a working set of 2.24GB.  How do I configure the size of the 
> cache?
>

Two settings:

In etc/trafficserver/storage.config, you will see a line similar to

     var/trafficserver 140M


You'll want that closer to 2.24G, I imagine :). For the RAM cache, the
setting is in etc/trafficserver/records.config:

     CONFIG proxy.config.cache.ram_cache.size INT -1


The default (-1) means it'll calculate the RAM cache size based on the
disk cache size: for every 1GB of disk cache, it uses 1MB of RAM cache
(which is really small, honestly). I'd definitely bump this up as much as
your box can handle, but make sure to leave room for the connections and
other buffers (setting this is a matter of trial and error, and is very
application specific). Fwiw, an inactive connection consumes about 32KB
of memory, and an active one doubles that (for additional buffers). SSL
connections consume even more memory. (I think this is an area where we
might be able to improve too; patches are welcome.)
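
The sizing arithmetic above can be sketched as a rough memory budget. This is a back-of-the-envelope illustration only: it takes the figures quoted in this message (1MB of RAM cache per 1GB of disk cache by default, about 32KB per inactive connection, double that per active one) at face value, treats KB/MB/GB as powers of 1024, and ignores SSL overhead. The function names and the example numbers are hypothetical.

```python
KB = 1024
MB = 1024 * KB
GB = 1024 * MB

# Approximate per-connection figures quoted in the message above.
INACTIVE_CONN_BYTES = 32 * KB
ACTIVE_CONN_BYTES = 2 * INACTIVE_CONN_BYTES


def default_ram_cache_bytes(disk_cache_bytes):
    """RAM cache size implied by the -1 default: 1MB per 1GB of disk cache."""
    return disk_cache_bytes // 1024


def connection_memory_bytes(inactive, active):
    """Rough memory used by connection buffers (SSL overhead not included)."""
    return inactive * INACTIVE_CONN_BYTES + active * ACTIVE_CONN_BYTES


def ram_cache_budget_bytes(total_ram_bytes, inactive, active, reserve_bytes):
    """Memory left for the RAM cache after connections and a fixed reserve."""
    left = total_ram_bytes - reserve_bytes - connection_memory_bytes(inactive, active)
    return max(0, left)


if __name__ == "__main__":
    disk = 3 * GB  # hypothetical disk cache, somewhat above the 2.24GB working set
    print("default RAM cache (bytes):", default_ram_cache_bytes(disk))
    budget = ram_cache_budget_bytes(4 * GB, inactive=10000, active=5000,
                                    reserve_bytes=1 * GB)
    print("RAM cache budget (bytes):", budget)
```

The result is only a starting point for the trial and error described above; the value actually used would go into proxy.config.cache.ram_cache.size.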

I hope some of this helps your benchmarks (really looking forward to
seeing your results).

-- leif


RE: why does cache hit rate tail off?

Posted by Rob Maidment <Ro...@clearswift.com>.
The old (LRU) algorithm made little difference.
I guess it could be that I’m simply exceeding the cache capacity.  The test uses a working set of 2.24GB.  How do I configure the size of the cache?


Re: why does cache hit rate tail off?

Posted by Leif Hedstrom <zw...@apache.org>.
On Jul 14, 2010, at 7:27 AM, Rob Maidment <Ro...@clearswift.com> wrote:

> Hello
>
> I’m measuring trafficserver performance using the web polygraph tool:
> http://www.web-polygraph.org/
>
> Performance is generally impressive, but I’m trying to understand the strange results I’m seeing for cache hit rate.  The test is a short duration (20 minutes) and initially the hit rate measured by polygraph exactly matches the expected (offered) rate.  However 3 minutes into the test the hit rate starts to drop below the expected rate and continues to tail off for the duration of the test.  Can anyone explain why this might be?
>

That doesn't sound good. Is the cache churning by chance? How large is the RAM cache and disk cache?  Is there significant disk I/O or contention on the disks?

> I’m testing on a 1 Gbps network – does trafficserver somehow work out it’s quicker to get responses from the server than from its local cache??  Is this tuneable at all?
>

No. But there are algorithms for promoting objects from disk cache to RAM cache, for evicting out of RAM cache, and of course for evicting out of disk cache. The default RAM cache eviction algorithm was recently changed; maybe you can try the old (LRU) policy as well?

One other thing to check is that the traffic_server process isn't crashing / restarting. This might happen fairly transparently, but would increase cache misses, since while restarting and initializing the caches, ATS will forward every request to the backend. If restarts happen frequently, it might not even get a chance to sync the directories to disk.
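
One way to check for silent restarts is to sample the PID of traffic_server periodically during the test and look for changes. A minimal sketch, assuming the samples have already been collected by some external means (how to collect them is not shown, and `count_restarts` is a hypothetical helper, not part of ATS):

```python
def count_restarts(pid_samples):
    """Count PID changes between consecutive samples of one process.

    pid_samples is an iterable of integers, with None for any sample
    where no process was found (for example, mid-restart).
    """
    restarts = 0
    last_pid = None
    for pid in pid_samples:
        if pid is None:
            continue                    # process was down at this sample
        if last_pid is not None and pid != last_pid:
            restarts += 1               # a new PID means a new process
        last_pid = pid
    return restarts


if __name__ == "__main__":
    samples = [4100, 4100, None, 4177, 4177, 4230]  # made-up sample data
    print("restarts observed:", count_restarts(samples))
```

A non-zero count during a benchmark run would be worth ruling out before tuning anything else. Sampling can miss a restart that happens between two samples if the PID is reused, so this is a hint rather than proof.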

Cheers!

- Leif

> regards
> Rob