Posted to solr-user@lucene.apache.org by smock <ha...@gmail.com> on 2009/12/16 23:22:23 UTC

Commits vs. Adds and solr caches

Hello,

I'm running a 12M-document index which I'd like to update frequently.  I'm
having problems doing so
(http://old.nabble.com/NullPointerException-thrown-during-updates-to-index-td26613309.html)
and am wondering now if it has to do with the way I'm structuring the
updates.  I have a few questions I'd appreciate some clarification on, if
anyone out there can offer it:

1) What is the difference between a commit and an add?  Is there a good rule
of thumb for how many docs one can add in one batch, and how many
uncommitted docs are too many?

2) At what stage during updates do the Solr caches get cleared out?  My
searches are facet-heavy and rely on the Solr caches to run quickly - I'd
like to minimize how often the caches are cleared out.
-- 
View this message in context: http://old.nabble.com/Commits-vs.-Adds-and-solr-caches-tp26819539p26819539.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Commits vs. Adds and solr caches

Posted by Jason Rutherglen <ja...@gmail.com>.
Harish,

Added documents are placed into an in-memory buffer (see
solrconfig.xml -> ramBufferSizeMB) held in Lucene's IndexWriter.
When the buffer reaches ramBufferSizeMB, it is flushed to disk.
Flushed documents still aren't searchable until a commit opens a
new searcher.

The caches are cleared on each commit.

If you're doing frequent commits, you'll probably want to lower
the cache sizes; otherwise the discarded caches could generate
excessive garbage.
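
To make the add/commit distinction concrete, here is a minimal Python
sketch that sends documents as batched <add> messages and issues a single
<commit/> at the end, so the caches are discarded once per run rather than
once per batch. The URL, the batch size and the helper names (add_xml,
batches, update_messages, post) are assumptions for illustration and do
not come from this thread; only the <add>/<doc>/<field> and <commit/>
messages are Solr's own XML update format.

```python
from urllib.request import Request, urlopen
from xml.sax.saxutils import escape, quoteattr

# Assumption: a local single-core Solr with the default update handler path.
UPDATE_URL = "http://localhost:8983/solr/update"


def add_xml(docs):
    """Build one <add> message holding every document in the batch."""
    parts = ["<add>"]
    for doc in docs:
        parts.append("<doc>")
        for name, value in doc.items():
            # Escape field names and values so the message stays valid XML.
            parts.append("<field name=%s>%s</field>"
                         % (quoteattr(name), escape(str(value))))
        parts.append("</doc>")
    parts.append("</add>")
    return "".join(parts)


def batches(docs, size):
    """Yield consecutive slices of at most `size` documents."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def update_messages(docs, batch_size):
    """Return the messages to send: one <add> per batch, then one <commit/>."""
    messages = [add_xml(batch) for batch in batches(docs, batch_size)]
    messages.append("<commit/>")  # a single commit, after all adds
    return messages


def post(xml, url=UPDATE_URL):
    """POST one XML update message to Solr (requires a running server)."""
    request = Request(url, data=xml.encode("utf-8"),
                      headers={"Content-Type": "text/xml; charset=utf-8"})
    with urlopen(request) as response:
        return response.status
```

Calling post(message) for each message in update_messages(docs, 1000)
would send the updates; the batch size of 1000 is arbitrary here, not a
recommendation.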

Jason
