Posted to solr-user@lucene.apache.org by Shawn Heisey <so...@elyograg.org> on 2011/11/10 19:55:50 UTC

Question about solr caches and warming

Do Solr's LRU caches pay attention to hitcount when deciding which 
entries to age out and use for autowarming, or is it purely based on the 
last time that entry was touched?  Is it a reasonable idea to come up 
with an algorithm that uses hitcount along with entry age, ideally with 
a configurable weight value?

I'm asking because I have really nasty filter queries that make for very 
slow autowarming of the filterCache.  I've had to drop my autowarmCount 
on the filterCache to 4, and warming that cache can still take 30-60 
seconds.  Sometimes it's very fast, as low as 3 seconds.

Although I don't have statistics to back my claim, I suspect that the 
really nasty filters don't have as high a hitcount as the ones that are 
more simple.  Typically the really nasty filters are used when an 
employee logs into the site.  Employees have access to a lot more than 
customers do, but the search still needs to be filtered to be 
appropriate for whatever search options are active.

If hitcount could be examined as well as entry age, I think that it 
could do a better job of selecting entries for warming, and I could use 
more than 4.
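
To make the idea above concrete, here is a minimal sketch of selecting autowarm candidates by a blend of hit count and entry age. It is not how Solr's caches are implemented. The names CacheEntry, select_for_autowarm, and hit_weight are hypothetical, and the linear scoring formula is only one assumed way to combine the two signals.

```python
from dataclasses import dataclass


@dataclass
class CacheEntry:
    key: str
    hits: int            # number of cache hits recorded for this entry
    last_access: float   # time of the most recent access, in seconds


def select_for_autowarm(entries, autowarm_count, now, hit_weight=0.5):
    """Return the keys of the best `autowarm_count` entries by blended score.

    hit_weight=0.0 ranks purely by recency (LRU-style);
    hit_weight=1.0 ranks purely by hit count.
    """
    if not entries or autowarm_count <= 0:
        return []

    # Normalise both signals to the 0..1 range so the weight is meaningful.
    max_hits = max(e.hits for e in entries) or 1
    max_age = max(now - e.last_access for e in entries) or 1.0

    def score(entry):
        popularity = entry.hits / max_hits
        freshness = 1.0 - (now - entry.last_access) / max_age
        return hit_weight * popularity + (1.0 - hit_weight) * freshness

    ranked = sorted(entries, key=score, reverse=True)
    return [e.key for e in ranked[:autowarm_count]]


# Hypothetical sample data: one rarely used but recent "nasty" filter,
# and two customer filters with many more hits.
sample = [
    CacheEntry("customer_simple", hits=500, last_access=90.0),
    CacheEntry("employee_nasty", hits=3, last_access=99.0),
    CacheEntry("customer_other", hits=200, last_access=50.0),
]
```

With hit_weight=0.0 the recently touched employee filter is warmed first. Raising the weight pushes it behind the filters that are hit more often, which is the behaviour asked about here.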

Thanks,
Shawn


Re: Question about solr caches and warming

Posted by Chris Hostetter <ho...@fucit.org>.
: Although I don't have statistics to back my claim, I suspect that the really
: nasty filters don't have as high a hitcount as the ones that are more simple.
: Typically the really nasty filters are used when an employee logs into the
: site.  Employees have access to a lot more than customers do, but the search
: still needs to be filtered to be appropriate for whatever search options are
: active.

A low-impact change to consider would be to leverage the "cache=false" 
local param that was added in Solr 3.4...

  https://wiki.apache.org/solr/CommonQueryParameters#Caching_of_filters

...you could add this local param any time you know the query is coming 
from an employee -- or any time you know the filter query is "esoteric".
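
As a rough illustration of the suggestion above, the following sketch builds a Solr query string in which selected filter queries carry the {!cache=false} prefix. The function name build_params and the field names in the example are hypothetical. It also assumes the filters passed in do not already start with their own local params block.

```python
from urllib.parse import urlencode


def build_params(user_query, filters, uncached_filters=()):
    """Build a URL-encoded Solr query string.

    Filters in `uncached_filters` are prefixed with {!cache=false} so that
    Solr does not store their results in the filterCache. Assumes these
    filters do not already begin with a {!...} local params block.
    """
    params = [("q", user_query)]
    for fq in filters:
        params.append(("fq", fq))
    for fq in uncached_filters:
        params.append(("fq", "{!cache=false}" + fq))
    return urlencode(params)


# Hypothetical example: an ordinary customer filter plus an employee-only
# filter that should stay out of the filterCache.
qs = build_params(
    "laptop",
    filters=["in_stock:true"],
    uncached_filters=["acl_group:(staff OR admin)"],
)
```

Filters that never enter the filterCache cannot be picked up by autowarming, so the autowarmCount budget is spent only on the filters that are still cached.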

A higher-impact change would be to create a dedicated query slave 
machine (or just an alternate core name that polls the same master) that 
is *only* used by employees and has much lower sizes on the caches -- this 
is the approach I have advocated and seen work very well since the 
pre-Apache days of Solr: dedicated instances for each major "user base" 
with key settings (i.e., replication frequencies, cache sizes, cache 
warming, static warming of sorts, etc.) tuned for that user base.

-Hoss

Re: Question about solr caches and warming

Posted by Shawn Heisey <so...@elyograg.org>.

Replying to myself ... not surprisingly, I've just discovered that such 
a beast DOES exist: a cache algorithm that weighs both recency and 
frequency, the Adaptive Replacement Cache (ARC).  I've opened a Jira 
issue, SOLR-2889.

http://en.wikipedia.org/wiki/Adaptive_Replacement_Cache

Thanks,
Shawn