Posted to solr-user@lucene.apache.org by Andrew Ingram <an...@andrewingram.net> on 2009/08/27 16:06:55 UTC

Optimal Cache Settings, complicated by regular commits

Hi all,
I'm trying to work out the optimum cache settings for our Solr server; I'll
begin by outlining our usage.

Number of documents: approximately 25,000
Commit frequency: sometimes we do massive amounts of sequential commits;
most of the time it's less frequent, but still several times an hour
We make heavy use of faceting and sorting, and the number of possible facets
led to choosing a filterCache size of about 50,000

The problem we have is that the default cache settings resulted in very low
hit rates (less than 30% for documents, less than 1% for the filterCache), so we
gradually increased the cache sizes until the hit rates were in the 80s-90s.
Now we have the issue of commits being very slow (more than 5 seconds for a
document), to the point where it causes timeouts elsewhere in our systems.
This is made worse by the fact that committing seems to empty the caches;
given that it takes about an hour to get the caches back to a good state, this
is obviously very problematic.

Is there a way for commits to selectively empty the cache? Any advice
regarding the config would be appreciated. The server load is relatively
low, ideally we're looking to minimize the response time rather than aim for
CPU or memory efficiency.

Regards,
Andrew Ingram

Re: Optimal Cache Settings, complicated by regular commits

Posted by Chris Hostetter <ho...@fucit.org>.
: I'm trying to work out the optimum cache settings for our Solr server; I'll
: begin by outlining our usage.

...but you didn't give any information about what your cache settings look 
like ... size is only part of the picture, the autowarm counts are more 
significant.

: Commit frequency: sometimes we do massive amounts of sequential commits,

if you know you are going to be indexing more docs soon, then you can hold 
off on issuing a commit ... it really comes down to what kind of SLA you 
have to provide on how quickly an add/update is visible in the index -- 
don't commit any more often than that.
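
one way to enforce that, rather than relying on every client to space out 
its commits, is to stop committing from the clients entirely and let Solr 
auto-commit on a timer.  a rough sketch for solrconfig.xml, assuming (purely 
for illustration) that a 10 minute visibility window is acceptable:

  <updateHandler class="solr.DirectUpdateHandler2">
    <!-- commit at most once every 10 minutes (600000 ms); pick a value
         that matches whatever visibility SLA you actually have -->
    <autoCommit>
      <maxTime>600000</maxTime>
    </autoCommit>
  </updateHandler>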

: The problem we have is that the default cache settings resulted in very low
: hit rates (less than 30% for documents, less than 1% for the filterCache), so we

under 1% for filterCache sounds like you either have some really unique 
filter queries, or you are using enum-based faceting on a huge field and 
the LRU cache is working against you by expunging values during a single 
request ... what version of solr are you using? what do the fieldtype 
declarations look like for the fields you are faceting on? what do the 
luke stats look like for the fields you are faceting on?
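
for reference, the kind of declaration you'd normally see for a field you 
facet on is a plain (non-tokenized) string field, something along these 
lines in schema.xml -- the field and type names below are made up purely 
for illustration:

  <fieldType name="string" class="solr.StrField" sortMissingLast="true"
             omitNorms="true"/>
  <!-- hypothetical facet field; stored=false since faceting only needs
       the indexed terms -->
  <field name="category" type="string" indexed="true" stored="false"/>

and if you're on a recent enough build, the facet.method request parameter 
lets you choose between the enum and fc approaches, which is a quick way to 
check whether enum faceting is what's churning your filterCache.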

: now we have the issue of commits being very slow (more than 5 seconds for a
: document), to the point where it causes a timeout elsewhere in our systems.
: This is made worse by the fact that committing seems to empty the cache,
: given that it takes about an hour to get the cache to a good state this is
: obviously very problematic.

1) using waitSearcher=false can help speed up the commit if all you care 
about is not having your client time out.
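
(as an XML update message posted to your /update handler, that's just 
something like:

  <commit waitSearcher="false"/>

the commit returns without waiting for the new searcher to finish warming.)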

2) using autowarming can help fill the caches up prior to users making 
requests (you may already know that, but since you didn't provide your 
cache configs i have no idea) ... the key is finding a good autowarm count 
that helps your cache stats w/o taking too long to fill up.
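
as a rough sketch, the relevant pieces of solrconfig.xml would look 
something like the following -- the sizes, autowarm counts, field names and 
warming query are placeholders to tune against your own stats, not 
recommendations:

  <!-- autowarmCount controls how many entries get regenerated on the new
       searcher after a commit; bigger means warmer caches but slower
       commits -->
  <filterCache class="solr.LRUCache" size="50000"
               initialSize="512" autowarmCount="1024"/>
  <queryResultCache class="solr.LRUCache" size="512"
                    initialSize="512" autowarmCount="256"/>
  <!-- the documentCache can't be autowarmed, so it always starts cold -->
  <documentCache class="solr.LRUCache" size="2048"
                 initialSize="512" autowarmCount="0"/>

  <!-- optional: fire a few static warming queries (hypothetical field
       names) whenever a new searcher is opened, so the first real user
       doesn't pay the full faceting/sorting cost -->
  <listener event="newSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst>
        <str name="q">*:*</str>
        <str name="facet">true</str>
        <str name="facet.field">category</str>
        <str name="sort">price asc</str>
      </lst>
    </arr>
  </listener>

the trade-off is exactly the one you're seeing: the higher the autowarm 
counts, the longer a commit takes before the new searcher is registered, so 
push them up only as far as your commit-time budget allows.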


-Hoss


Re: Optimal Cache Settings, complicated by regular commits

Posted by Jason Rutherglen <ja...@gmail.com>.
Andrew,

Which version of Solr are you using?

There's an open issue for caching filters at the segment
level, which would avoid clearing the caches on each commit;
you can vote on it to indicate your interest.
http://issues.apache.org/jira/browse/SOLR-1308

-J

On Thu, Aug 27, 2009 at 7:06 AM, Andrew Ingram<an...@andrewingram.net> wrote:
> Hi all,
> I'm trying to work out the optimum cache settings for our Solr server, I'll
> begin by outlining our usage.
>
> Number of documents: approximately 25,000
> Commit frequency: sometimes we do massive amounts of sequential commits;
> most of the time it's less frequent, but still several times an hour
> We make heavy use of faceting and sorting, and the number of possible facets
> led to choosing a filterCache size of about 50,000
>
> The problem we have is that the default cache settings resulted in very low
> hit rates (less than 30% for documents, less than 1% for the filterCache), so we
> gradually increased the cache sizes until the hit rates were in the 80s-90s.
> Now we have the issue of commits being very slow (more than 5 seconds for a
> document), to the point where it causes timeouts elsewhere in our systems.
> This is made worse by the fact that committing seems to empty the caches;
> given that it takes about an hour to get the caches back to a good state, this
> is obviously very problematic.
>
> Is there a way for commits to selectively empty the cache? Any advice
> regarding the config would be appreciated. The server load is relatively
> low, ideally we're looking to minimize the response time rather than aim for
> CPU or memory efficiency.
>
> Regards,
> Andrew Ingram
>