Posted to java-user@lucene.apache.org by Simon Wistow <si...@thegestalt.org> on 2008/06/03 01:40:52 UTC

Typical Indexing performance

I know this is one of those "How long is a piece of string?" questions 
but I'm curious as to the order of magnitude of indexing performance.

http://lucene.apache.org/java/docs/benchmarks.html

seems to indicate that about 100-120 docs/s is pretty good for
average-sized documents (say, an email or something). Or is that
ludicrously out of date for 2.3.x?

Simon

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Typical Indexing performance

Posted by Konstantyn Smirnov <in...@yahoo.com>.
My 2 cents:

My indexing module handles documents with ~15 fields, most of which must
be both indexed and stored. Using the GermanAnalyzer, I saw the following times:

10 MB ~ 3400 docs --> 6-8 sec
70 MB ~ 50000 docs --> 65 sec

which works out to roughly 425-770 docs/s
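
The docs/s figure is simply document count divided by elapsed time. A
minimal Java sketch of that arithmetic, using only the numbers quoted above
(the class and method names are hypothetical, chosen for illustration). It
gives about 425 to 567 docs/s for the small run and about 769 docs/s for
the large one:

```java
// Hypothetical helper for illustration; not part of Lucene.
class IndexingThroughput {

    // Documents indexed per second for a run of the given size and duration.
    static double docsPerSecond(long docs, double seconds) {
        return docs / seconds;
    }

    public static void main(String[] args) {
        // 10 MB run: ~3400 docs in 6-8 seconds.
        System.out.printf("small run, 6 s: %.0f docs/s%n", docsPerSecond(3400, 6.0));
        System.out.printf("small run, 8 s: %.0f docs/s%n", docsPerSecond(3400, 8.0));
        // 70 MB run: ~50000 docs in 65 seconds.
        System.out.printf("large run, 65 s: %.0f docs/s%n", docsPerSecond(50000, 65.0));
    }
}
```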


Re: Typical Indexing performance

Posted by Grant Ingersoll <gs...@apache.org>.
It depends on analysis and so on, of course, but in my experience 2.3.x
is at least 2x faster than that, and up to 4-5x faster depending on the
documents. You can use the contrib/benchmark package to measure it
yourself.
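
For a quick number on your own documents, a hand-rolled timer can serve as
a rough first check. The sketch below is not contrib/benchmark and makes no
Lucene calls; IndexingTimer, measureDocsPerSecond and the indexOne callback
are hypothetical names, and the workload in main is only a stand-in for
whatever code adds one document to your index:

```java
import java.util.function.IntConsumer;

// Hypothetical helper for illustration; not part of Lucene or contrib/benchmark.
class IndexingTimer {

    // Calls indexOne for document numbers 0..docCount-1 and returns docs per second.
    static double measureDocsPerSecond(int docCount, IntConsumer indexOne) {
        long start = System.nanoTime();
        for (int i = 0; i < docCount; i++) {
            indexOne.accept(i);
        }
        // Guard against a zero interval on very fast runs.
        long elapsedNanos = Math.max(1L, System.nanoTime() - start);
        return docCount / (elapsedNanos / 1e9);
    }

    public static void main(String[] args) {
        StringBuilder sink = new StringBuilder();
        // Stand-in workload (assumption): replace with the code that indexes one document.
        double rate = measureDocsPerSecond(10_000, i -> sink.append(i % 10));
        System.out.printf("%.0f docs/s (stand-in workload)%n", rate);
    }
}
```

A single short run like this is noisy (JIT warm-up, disk caching), so treat
the result as an order-of-magnitude figure only.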

On Jun 2, 2008, at 7:40 PM, Simon Wistow wrote:

> I know this is one of those "How long is a piece of string?" questions
> but I'm curious as to the order of magnitude of indexing performance.
>
> http://lucene.apache.org/java/docs/benchmarks.html
>
> seems to indicate that about 100-120 docs/s is pretty good for
> average-sized documents (say, an email or something). Or is that
> ludicrously out of date for 2.3.x?
>
> Simon

--------------------------
Grant Ingersoll
http://www.lucidimagination.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ