Posted to java-user@lucene.apache.org by Eric Jain <Er...@isb-sib.ch> on 2006/02/25 14:20:51 UTC

Indexing performance with Lucene 1.9

After upgrading to Lucene 1.9, an index that used to take about 9h to build 
now requires 13h. Anyone else notice a decrease in performance?

This is how I configure the IndexWriter:

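   writer = new IndexWriter(dir, analyzer, false);  // false: open the existing index
   writer.mergeFactor = 100;                        // segments merged at a time
   writer.minMergeDocs = 100;                       // docs buffered before a segment is written
   writer.maxFieldLength = Integer.MAX_VALUE;       // no limit on terms indexed per field
   writer.setUseCompoundFile(false);                // keep separate index files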

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Indexing performance with Lucene 1.9

Posted by Eric Jain <Er...@isb-sib.ch>.
Eric Jain wrote:
> I'll rerun the indexing 
> procedure with the old version overnight, just to be sure.

Just to confirm: There no longer seems to be any difference in indexing 
performance between the nightly build and 1.4.3.



Re: Indexing performance with Lucene 1.9

Posted by Eric Jain <Er...@isb-sib.ch>.
Otis Gospodnetic wrote:
> Regarding performance fix - if you can be more precise (is it really 
> just more or less or is it as good as before), that would be great
> for those of us itching to use 1.9.

To be more precise: The patch reduced the time required to build one large 
index from 13 to 11 hours, though I vaguely remember it used to require no 
more than 9 hours. But there have also been some other changes that have 
nothing to do with Lucene, so I'll rerun the indexing procedure with the old 
version overnight, just to be sure.
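
To put the figures above in relative terms, here is a small sketch in plain Java. The class and method names (BuildTimes, percentSlower) are made up for illustration, and the 9 hour baseline is only a rough recollection, so the results are approximate: about 44% slower before the patch and about 22% slower after it.

```java
// Relative slowdown of a build compared with a baseline, in percent.
// BuildTimes and percentSlower are hypothetical names used for illustration.
class BuildTimes {

    static double percentSlower(double baselineHours, double currentHours) {
        return (currentHours - baselineHours) / baselineHours * 100.0;
    }

    public static void main(String[] args) {
        // Figures from this thread; the 9 h baseline is only a rough recollection.
        System.out.printf("1.9 before the fix: %.0f%% slower%n", percentSlower(9, 13));
        System.out.printf("1.9 after the fix:  %.0f%% slower%n", percentSlower(9, 11));
    }
}
```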




Re: Indexing performance with Lucene 1.9

Posted by Eric Jain <Er...@isb-sib.ch>.
Otis Gospodnetic wrote:
> Regarding performance fix - if you can be more precise (is it really 
> just more or less or is it as good as before), that would be great
> for those of us itching to use 1.9.

Yes, I can confirm that performance differs by no more than 3.1 fraggles.

;-)




Re: Indexing performance with Lucene 1.9

Posted by Otis Gospodnetic <ot...@yahoo.com>.
Hi Eric,
Regarding performance fix - if you can be more precise (is it really just more or less or is it as good as before), that would be great for those of us itching to use 1.9.
Thanks,
Otis

----- Original Message ----
From: Eric Jain <Er...@isb-sib.ch>
To: java-user@lucene.apache.org
Sent: Tue 28 Feb 2006 05:54:05 AM EST
Subject: Re: Indexing performance with Lucene 1.9

Daniel Naber wrote:
> A fix has now been committed to trunk in SVN, it should be part of the next 
> 1.9 release.

Performance seems to have recovered, more or less, thanks!




Re: Indexing performance with Lucene 1.9

Posted by Eric Jain <Er...@isb-sib.ch>.
Daniel Naber wrote:
> A fix has now been committed to trunk in SVN, it should be part of the next 
> 1.9 release.

Performance seems to have recovered, more or less, thanks!




Re: Indexing performance with Lucene 1.9

Posted by Daniel Naber <lu...@danielnaber.de>.
On Saturday 25 February 2006 14:20, Eric Jain wrote:

> After upgrading to Lucene 1.9, an index that used to take about 9h to
> build now requires 13h. Anyone else notice a decrease in performance?

A fix has now been committed to trunk in SVN, it should be part of the next 
1.9 release.

Regards
 Daniel

-- 
http://www.danielnaber.de



Re: Indexing performance with Lucene 1.9

Posted by Daniel Naber <lu...@danielnaber.de>.
On Saturday 25 February 2006 14:20, Eric Jain wrote:

> After upgrading to Lucene 1.9, an index that used to take about 9h to
> build now requires 13h. Anyone else notice a decrease in performance?

Yes, I can reproduce this with the Lucene demo on a much smaller index of 
2000 documents. It (partly?) seems to be caused by my patch here:

http://svn.apache.org/viewcvs.cgi/lucene/java/trunk/src/java/org/apache/lucene/index/IndexWriter.java?rev=372350&r1=216236&r2=372350&diff_format=h

This tries to fix an off-by-one bug with setMaxBufferedDocs, but it changes 
the way segments are merged, i.e. merging becomes slower. I guess this 
needs to be reverted. Maybe you can try doing that locally and see how it 
affects your performance.
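
To illustrate why the way segments are merged matters for build time, here is a minimal sketch of a logarithmic merge policy in plain Java. It is a simplified model under stated assumptions, not Lucene's IndexWriter code and not the patch linked above. The names MergeSimulation and docsCopied are hypothetical. The model assumes every added document starts as its own one-document segment, and it counts how many documents are rewritten by merges for a given minMergeDocs and mergeFactor.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a logarithmic segment merge policy.
// Assumption: this is NOT Lucene's implementation, only an illustration.
class MergeSimulation {

    // Returns the total number of documents rewritten by merges while
    // adding totalDocs documents one at a time.
    static long docsCopied(int totalDocs, int minMergeDocs, int mergeFactor) {
        if (minMergeDocs < 2 || mergeFactor < 2) {
            throw new IllegalArgumentException("need minMergeDocs >= 2 and mergeFactor >= 2");
        }
        List<Long> segments = new ArrayList<>(); // doc count per segment, oldest first
        long copied = 0;
        for (int doc = 0; doc < totalDocs; doc++) {
            segments.add(1L); // each new document starts as a 1-doc segment
            long target = minMergeDocs;
            while (true) {
                // Sum the trailing segments that are smaller than the target size.
                int first = segments.size();
                long sum = 0;
                while (first > 0 && segments.get(first - 1) < target) {
                    first--;
                    sum += segments.get(first);
                }
                if (sum < target) {
                    break; // not enough small segments to merge yet
                }
                // Merge segments[first..end) into one segment holding sum docs.
                segments.subList(first, segments.size()).clear();
                segments.add(sum);
                copied += sum;
                target *= mergeFactor; // try the next, larger level
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        // Settings from the original post: mergeFactor = 100, minMergeDocs = 100.
        System.out.println(docsCopied(10_000, 100, 100));
        // Smaller settings add merge levels for the same number of documents.
        System.out.println(docsCopied(10_000, 10, 10));
    }
}
```

In this model each extra merge level rewrites every document once more, so any change that causes merges to trigger more often or at more levels adds to the total build time.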

Regards
 Daniel

-- 
http://www.danielnaber.de
