Posted to users@trafficserver.apache.org by avinash katika <av...@gmail.com> on 2014/07/29 19:22:37 UTC

Cache Inspector Problem for Handling Huge number of Cache Objects

Hi, we ran into a problem when attempting to use Cache Inspector for regex
lookup and deletion of cached objects on an ATS cache VM (single-object
lookup and deletion were OK).

Our setup details are as follows:

Current Version: 4.2.1

“Pristine is enabled”

Cache Objects Present: ~150K

Cached bytes used: ~50GB

Sample CI regex lookup/deletion commands, issued with “curl” (we
have titles title_001 to title_250 to mimic 250 VOD titles with ~600
objects per title):

- curl http://<app_id>/ci/lookup_regex?url=http://<sample_origin_domain>/title_199/*

- curl http://<app_id>/ci/delete_regex?url=http://<sample_origin_domain>/title_199/*
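
One detail worth checking in the patterns above: if the "url" value is treated as a regular expression (as the names lookup_regex and delete_regex suggest) rather than as a shell glob, the trailing "/*" means "zero or more slashes", not "anything under this path". The sketch below uses Python's re module purely to illustrate that difference. The host name origin.example is a placeholder, and whether Cache Inspector anchors its matches is an assumption that this thread does not confirm.

```python
import re

# Pattern shaped like the one in the curl example (placeholder host, assumption).
loose = re.compile(r"http://origin.example/title_199/*")

# A stricter pattern: escaped dot, anchored start, explicit "/" then anything.
strict = re.compile(r"^http://origin\.example/title_199/.*$")

urls = [
    "http://origin.example/title_199/seg_001.ts",
    "http://origin.example/title_1990/seg_001.ts",  # a different (hypothetical) title
]


def matches(pattern, url_list):
    """Return the URLs that the compiled pattern matches anywhere (unanchored search)."""
    return [u for u in url_list if pattern.search(u)]


loose_hits = matches(loose, urls)    # both URLs: "/*" also accepts zero slashes
strict_hits = matches(strict, urls)  # only the title_199 URL
```

With titles numbered title_001 to title_250 the loose form happens to be harmless, but an explicit ".*" states the intent more clearly.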



Two problems:

1.) The regex request resulted in CI reporting “Network Error” after ~10 min
(with the access log reporting a cache result code of “ERR_CLIENT_ABORT”). Is
this a known issue with this many (~150K) cache objects?

Our target production VM is to host 10M+ cache objects, and the observed
regex lookup/deletion performance does not look adequate for Operations
needs (if we are to look up/purge objects by pattern).

2.) We also noticed that when the regex lookup failed/aborted on the client
side, the regex request continued to run in ATS with the cache disk(s) at
~100% utilization. The request only stopped when
“proxy.config.http.transaction_active_timeout_in” (default: 900 sec) kicked
in. When we increased the timeout to 2 hours, the same request by itself
took over 1 hour (~1.3 hours to be exact) to complete on a quiet VM (no
traffic load). When we repeated the test with 500 titles provisioned (~300K
cache objects), a similar regex request ran for over 2 hours (and was
aborted at the 2-hour mark when
proxy.config.http.transaction_active_timeout_in kicked in), which is not
acceptable.

Is there any workaround so that we can reliably look up and purge multiple
objects sharing a certain pattern?
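
A rough back-of-the-envelope extrapolation from the timings reported above shows why this matters for the 10M-object target. It assumes the scan time grows linearly with the number of cached objects, which the two data points in this report (about 1.3 hours at ~150K objects, more than 2 hours at ~300K) suggest but do not prove.

```python
# Reported data point: ~150K objects scanned in ~1.3 hours on an idle VM.
objects_measured = 150_000
hours_measured = 1.3

seconds_per_object = hours_measured * 3600 / objects_measured


def estimated_hours(object_count):
    """Linear extrapolation of one regex scan; an assumption, not a measurement."""
    return object_count * seconds_per_object / 3600


estimate_300k = estimated_hours(300_000)    # compare with the observed "over 2 hours"
estimate_10m = estimated_hours(10_000_000)  # the stated production target
print(f"{estimate_300k:.1f} h at 300K, {estimate_10m:.0f} h at 10M")
```

Under that assumption a single pattern scan at 10M objects would take on the order of days, far beyond any reasonable transaction timeout.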

Re: Cache Inspector Problem for Handling Huge number of Cache Objects

Posted by po...@gmail.com.
Yes, that’s a good idea.

--  
portl4t.cn@gmail.com


On Friday, August 1, 2014, at 2:48 AM, Shu Kit Chan wrote:

> Or we can simply offer a new function in lua to call "TSHttpTxnCacheLookupStatusSet" as an enhancement.


Re: Cache Inspector Problem for Handling Huge number of Cache Objects

Posted by Shu Kit Chan <ch...@gmail.com>.
Or we can simply offer a new function in lua to call
"TSHttpTxnCacheLookupStatusSet" as an enhancement.


On Wed, Jul 30, 2014 at 11:30 AM, <ge...@free.fr> wrote:

> Thanks a lot Kit!
> It seems to do what I would like to do, but the syntax of the config file
> is not documented.
> What I have found is two space-separated fields:
> 1- any characters except '#' (maybe a full URL)
> 2- decimal digits (maybe an epoch timestamp, not easy to manage)
>
> This plugin might be used from remap.config, with the config file path
> maybe taken from @pparam. If Phil has more detailed information on it, I
> will gladly take it! Now I have to try compiling Traffic Server 5 under
> FreeBSD 10; I run 4.2 from an unofficial port...
>
> This plugin calls "TSHttpTxnCacheLookupStatusSet(txn,
> TS_CACHE_LOOKUP_HIT_STALE)" to force revalidation of the object.
> With the current Lua plugin, only "ts.http.get_cache_lookup_status" is
> available, but maybe it would be possible to work with
> ts.hook(TS_LUA_HOOK_CACHE_LOOKUP_COMPLETE, do_something), where
> do_something tries to add a Cache-Control="must-revalidate" with
> ts.client_request.header.HEADER?
>
> Denis

Re: Cache Inspector Problem for Handling Huge number of Cache Objects

Posted by ge...@free.fr.
Thanks a lot Kit!
It seems to do what I would like to do, but the syntax of the config file is not documented.
What I have found is two space-separated fields:
1- any characters except '#' (maybe a full URL)
2- decimal digits (maybe an epoch timestamp, not easy to manage)
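
If that reading of the source is right (one rule per line: a pattern, a space, then a decimal value that looks like an epoch expiry), such lines need not be written or checked by hand. The sketch below only illustrates that assumed two-field layout. It is not taken from the plugin's documentation, the treatment of '#' as a comment marker is a guess, and the function names are hypothetical.

```python
import time


def make_rule(pattern, ttl_seconds, now=None):
    """Build one 'pattern epoch' line; the layout is an assumption, not a documented format."""
    if " " in pattern or "#" in pattern:
        raise ValueError("pattern must not contain spaces or '#'")
    now = int(time.time()) if now is None else int(now)
    return f"{pattern} {now + ttl_seconds}"


def parse_rule(line):
    """Split a line into (pattern, epoch); return None for blank lines and '#' comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    pattern, epoch = line.rsplit(None, 1)
    return pattern, int(epoch)


# Example with a fixed clock so the result is reproducible.
rule = make_rule(r"http://origin\.example/title_199/.*", ttl_seconds=3600, now=1406745000)
parsed = parse_rule(rule)
```

Computing the second field from "now plus a lifetime" sidesteps the hand-calculated epoch values that make the format awkward to manage.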

This plugin might be used from remap.config, with the config file path maybe taken from @pparam. If Phil has more detailed information on it, I will gladly take it! Now I have to try compiling Traffic Server 5 under FreeBSD 10; I run 4.2 from an unofficial port...

This plugin calls "TSHttpTxnCacheLookupStatusSet(txn, TS_CACHE_LOOKUP_HIT_STALE)" to force revalidation of the object.
With the current Lua plugin, only "ts.http.get_cache_lookup_status" is available, but maybe it would be possible to work with ts.hook(TS_LUA_HOOK_CACHE_LOOKUP_COMPLETE, do_something), where do_something tries to add a Cache-Control="must-revalidate" with ts.client_request.header.HEADER?

Denis

> Have you checked out the regex_revalidate plugin?
> https://github.com/apache/trafficserver/tree/master/plugins/experimental/regex_revalidate
> There's not much documentation, but I think it has some similarities
> to what you are trying to do. I think Phil is the contributor of this
> plugin.
>
> I will also keep your feedback in mind when I try to improve the Lua
> plugin in the coming weeks.
>
> Thanks.
>
> Kit

Re: Cache Inspector Problem for Handling Huge number of Cache Objects

Posted by Shu Kit Chan <ch...@gmail.com>.
Have you checked out the regex_revalidate plugin?
https://github.com/apache/trafficserver/tree/master/plugins/experimental/regex_revalidate
There's not much documentation, but I think it has some similarities to what
you are trying to do. I think Phil is the contributor of this plugin.

I will also keep your feedback in mind when I try to improve the Lua
plugin in the coming weeks.
Thanks.

Kit


On Tue, Jul 29, 2014 at 11:12 AM, <ge...@free.fr> wrote:

> Hi,
>
> This problem came up earlier, and I have not given up on solving it.
> The solution I wrote using shell/awk is not suitable for multiple millions
> of objects, as it needs hours to purge objects one by one and requires
> maintaining the list of all objects outside of Traffic Server.
>
> I am currently trying to write a new plugin that works like
> cache-key-genid, which I tested earlier but which was not sufficient for
> the platform I use. My goal is to store the configuration in a file and
> keep it in memory (mostly inspired by the header_rewrite plugin). This
> plugin will also be able to evaluate regexes on host and path.
> Now I have two possibilities:
> 1- simply increment an ID to invalidate all the objects matching a rule,
> at the risk of increasing the number of rules over time
> 2- use it as a ban list by adding a timestamp in the file (a sort of
> Varnish ban list) recording when it was last used/called
>
> Maintaining this list can become a problem over time if it grows a lot
> (more than 500 rules does not seem to be a good thing). Objects could be
> invalidated and/or simply refreshed when they are requested by the client,
> because the requested URL is known and therefore the hash key too...
>
> As I said before, I am not a C++ programmer, but I will try. I hope Lua
> will become more and more capable so that new plugins to manipulate
> objects in cache are easy to write. I am much more comfortable with
> shell/awk scripting than with compiled code, and Lua seems to be the right
> way for me.
>
> Denis
>
> References:
> http://mail-archives.apache.org/mod_mbox/trafficserver-users and search
> "cache inspector alternative"
> https://github.com/godaddy/ats-plugin-cache-key-genid

Re: Cache Inspector Problem for Handling Huge number of Cache Objects

Posted by ge...@free.fr.
Hi,

This problem came up earlier, and I have not given up on solving it. The solution I wrote using shell/awk is not suitable for multiple millions of objects, as it needs hours to purge objects one by one and requires maintaining the list of all objects outside of Traffic Server.
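
The per-object approach described above can be sketched as follows: keep the list of cached URLs outside Traffic Server, filter it with a regular expression, and send one HTTP PURGE request per match. This is a hedged Python illustration, not the shell/awk script itself. The proxy address, the URL list, and the function names are assumptions, and it presumes PURGE requests are permitted from the client's address.

```python
import http.client
import re


def select_urls(url_list, pattern):
    """Return the URLs from an externally maintained list that match the regex."""
    rx = re.compile(pattern)
    return [u for u in url_list if rx.search(u)]


def purge(proxy_host, proxy_port, url, timeout=10):
    """Send one PURGE request for an absolute URL via the proxy; return the HTTP status."""
    conn = http.client.HTTPConnection(proxy_host, proxy_port, timeout=timeout)
    try:
        conn.request("PURGE", url)
        resp = conn.getresponse()
        resp.read()  # drain the body so the connection closes cleanly
        return resp.status
    finally:
        conn.close()


def purge_matching(proxy_host, proxy_port, url_list, pattern):
    """Purge every matching URL one by one; returns {url: status}."""
    return {u: purge(proxy_host, proxy_port, u) for u in select_urls(url_list, pattern)}


# Selecting is cheap; the per-object round trips are what add up at scale.
known = [f"http://origin.example/title_{t:03d}/seg_{s:03d}.ts"
         for t in (198, 199) for s in range(3)]
to_purge = select_urls(known, r"/title_199/")
```

The work grows linearly with the number of matching objects, and the URL list must be kept in sync with the cache by some external process, which are the two drawbacks noted above.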

I am currently trying to write a new plugin that works like cache-key-genid, which I tested earlier but which was not sufficient for the platform I use. My goal is to store the configuration in a file and keep it in memory (mostly inspired by the header_rewrite plugin). This plugin will also be able to evaluate regexes on host and path.
Now I have two possibilities:
1- simply increment an ID to invalidate all the objects matching a rule, at the risk of increasing the number of rules over time
2- use it as a ban list by adding a timestamp in the file (a sort of Varnish ban list) recording when it was last used/called
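
The first option (a per-rule generation ID) can be modeled in a few lines. In the sketch below, a hypothetical cache key is the URL plus the generation number of the first rule that matches it, so incrementing that number makes every matching object unreachable under its old key without touching the cache. This is a conceptual Python model only, not plugin code, and all names are illustrative assumptions.

```python
import re


class GenerationRules:
    """Conceptual model: regex rules, each carrying a generation counter."""

    def __init__(self):
        self._rules = []  # list of [compiled_regex, generation]

    def add_rule(self, pattern):
        self._rules.append([re.compile(pattern), 0])

    def bump(self, pattern):
        """Invalidate everything matching an existing rule by incrementing its generation."""
        for rule in self._rules:
            if rule[0].pattern == pattern:
                rule[1] += 1
                return rule[1]
        raise KeyError(pattern)

    def cache_key(self, url):
        """Key is the URL plus the generation of the first matching rule (0 if none)."""
        for rx, gen in self._rules:
            if rx.search(url):
                return f"{url}#gen={gen}"
        return f"{url}#gen=0"


rules = GenerationRules()
rules.add_rule(r"/title_199/")
url = "http://origin.example/title_199/seg_001.ts"
other = "http://origin.example/title_198/seg_001.ts"

key_before = rules.cache_key(url)
other_before = rules.cache_key(other)
rules.bump(r"/title_199/")
key_after = rules.cache_key(url)
other_after = rules.cache_key(other)
```

In this model every key computation walks the rule list in order, so the per-request cost grows with the number of rules, and objects stored under an old generation are never looked up again and simply stay in the cache until they are evicted.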

Maintaining this list can become a problem over time if it grows a lot (more than 500 rules does not seem to be a good thing). Objects could be invalidated and/or simply refreshed when they are requested by the client, because the requested URL is known and therefore the hash key too...

As I said before, I am not a C++ programmer, but I will try. I hope Lua will become more and more capable so that new plugins to manipulate objects in cache are easy to write. I am much more comfortable with shell/awk scripting than with compiled code, and Lua seems to be the right way for me.

Denis

References:
http://mail-archives.apache.org/mod_mbox/trafficserver-users and search "cache inspector alternative"
https://github.com/godaddy/ats-plugin-cache-key-genid


----- Original message -----
> From: "Leif Hedstrom" <zw...@apache.org>
> To: users@trafficserver.apache.org
> Sent: Tuesday, July 29, 2014 19:38:22
> Subject: Re: Cache Inspector Problem for Handling Huge number of Cache Objects
> 
> On Jul 29, 2014, at 11:22 AM, avinash katika <avinash.katika@gmail.com> wrote:
> 
> Hi, we ran into a problem when attempting to use Cache Inspector for
> regex lookup and deletion of cached objects on an ATS cache VM
> (single-object lookup and deletion were OK).
> 
> Our setup details are as follows:
> 
> Current Version: 4.2.1
> 
> “Pristine is enabled”
> 
> Cache Objects Present: ~150K
> 
> Cached bytes used: ~50GB
> 
> Sample CI regex lookup/deletion commands, issued with “curl”
> (we have titles title_001 to title_250 to mimic 250 VOD titles
> with ~600 objects per title):
> 
> - curl http://<app_id>/ci/lookup_regex?url=http://<sample_origin_domain>/title_199/*
> 
> - curl http://<app_id>/ci/delete_regex?url=http://<sample_origin_domain>/title_199/*
> 
> 
> 
> 
> The short answer is, don’t use the cache inspector / regexes to
> manage the cache. It does not scale.
> 
> 
> — leif
> 
> 

Re: Cache Inspector Problem for Handling Huge number of Cache Objects

Posted by Leif Hedstrom <zw...@apache.org>.
On Jul 29, 2014, at 11:22 AM, avinash katika <av...@gmail.com> wrote:

> Hi, we ran into a problem when attempting to use Cache Inspector for regex lookup and deletion of cached objects on an ATS cache VM (single-object lookup and deletion were OK).
> 
> Our setup details are as follows:
> 
> Current Version: 4.2.1
> 
> “Pristine is enabled”
> 
> Cache Objects Present: ~150K
> 
> Cached bytes used: ~50GB
> 
> Sample CI regex lookup/deletion commands, issued with “curl” (we have titles title_001 to title_250 to mimic 250 VOD titles with ~600 objects per title):
> 
> - curl http://<app_id>/ci/lookup_regex?url=http://<sample_origin_domain>/title_199/*
> 
> - curl http://<app_id>/ci/delete_regex?url=http://<sample_origin_domain>/title_199/*
> 


The short answer is, don’t use the cache inspector / regexes to manage the cache. It does not scale.

— leif