Posted to solr-user@lucene.apache.org by "Barnett, Jeffrey" <je...@yale.edu> on 2008/10/30 20:49:30 UTC

Changing mergeFactor in mid-stream?

The http://wiki.apache.org/lucene-java/ImproveIndexingSpeed page suggests that indexing will be sped up by using higher values of mergeFactor, while search speed improves with lower values.  I need to create an index using multiple batches of documents.  My question is, can I begin building with a high mergeFactor for the bulk of the load and then switch to a lower value for the final batch?  I build the indices offline, and only swap them to online when complete.  The online index is never updated.

Re: Changing mergeFactor in mid-stream?

Posted by Mark Miller <ma...@gmail.com>.
Otis Gospodnetic wrote:
> Yes, you can change the mergeFactor.  More important than the mergeFactor is this:
>
> <ramBufferSizeMB>32</ramBufferSizeMB>
>
> Pump it up as much as your hardware/JVM allows.  And use appropriate -Xmx, of course.
>   
Is that true? I thought there was a sweet spot for the RAM buffer (and
not as high as you'd think)? You might want to test that out a bit before
riding it too high...


Re: Changing mergeFactor in mid-stream?

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Yes, you can change the mergeFactor.  More important than the mergeFactor is this:

<ramBufferSizeMB>32</ramBufferSizeMB>

Pump it up as much as your hardware/JVM allows.  And use appropriate -Xmx, of course.
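
For illustration, here is a sketch of where these two settings sit in solrconfig.xml, assuming the Solr 1.3-style <indexDefaults> section. The values are placeholders to show the idea, not tuned recommendations.

```xml
<!-- solrconfig.xml (fragment), assuming the Solr 1.3-style layout -->
<indexDefaults>
  <!-- Illustrative value for the bulk batches; lower it (for example
       back to 10) before indexing the final batch. -->
  <mergeFactor>25</mergeFactor>
  <!-- Illustrative value; size it against the heap given via -Xmx. -->
  <ramBufferSizeMB>64</ramBufferSizeMB>
</indexDefaults>
```

These settings are read when the core is loaded, so a changed mergeFactor should take effect only after a restart or core reload. If your solrconfig.xml also sets them under <mainIndex>, check those values too, since they may override the defaults.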


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: "Barnett, Jeffrey" <je...@yale.edu>
> To: "solr-user@lucene.apache.org" <so...@lucene.apache.org>
> Sent: Thursday, October 30, 2008 3:49:30 PM
> Subject: Changing mergeFactor in mid-stream?
> 
> The http://wiki.apache.org/lucene-java/ImproveIndexingSpeed page suggests that 
> indexing will be sped up by using higher values of mergeFactor, while search 
> speed improves with lower values.  I need to create an index using multiple 
> batches of documents.  My question is, can I begin building with a high 
> mergeFactor for the bulk of the load and then switch to a lower value for the 
> final batch?  I build the indices offline, and only swap them to online when 
> complete.  The online index is never updated.