Posted to dev@lucene.apache.org by John Wang <jo...@gmail.com> on 2009/10/12 07:19:33 UTC
new sorting api and some perf numbers
Hi guys:
The new FieldComparator api looks really scary :)
But after some perf testing with numbers I'd like to share, I guess it
is worth it:
HW: Mac Pro with 16G memory
jvm: 1.6.0_13
jvm arg: -Xms1g -Xmx1g -server
setup
index:
1M docs evenly split into 8 segments (to make sure the test is fair across
segment boundaries)
each doc has 3 fields:
1) id - stored
2) val - random number, indexed, not analyzed, no norms, omit tf
3) string - "even" or "odd" of the corresponding id, not analyzed, no norms,
omit tf
built with lucene 2.4.1 to keep the same index across lucene 2.4.1 and
lucene 2.9.0 search tests
Search:
query on the term: "even" (TermQuery, minimizes the overhead of the text
search), matches 500k docs across segment boundaries, sort by val, sort
type: string. Numhits, i.e. number of slots = 100.
ran 20 iterations of the same query for each test.
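The measurement scheme described here (time the first query on its own, since it includes loading, then average the remaining runs) can be sketched roughly as below. This is not the code used for the test: the class name QueryTimer, the Result holder, and the stand-in workload in main are all hypothetical, and the Runnable would wrap the actual Lucene search call in a real run.

```java
public class QueryTimer {
    /** Cold first-run time and mean of the remaining warm runs, both in milliseconds. */
    public static final class Result {
        public final double firstMs;
        public final double warmAvgMs;

        Result(double firstMs, double warmAvgMs) {
            this.firstMs = firstMs;
            this.warmAvgMs = warmAvgMs;
        }
    }

    /** Runs the query `iterations` times; reports the first run separately from the rest. */
    public static Result time(Runnable query, int iterations) {
        if (iterations < 2) {
            throw new IllegalArgumentException("need at least 2 iterations");
        }
        // First run: includes any one-time loading cost.
        long start = System.nanoTime();
        query.run();
        double firstMs = (System.nanoTime() - start) / 1e6;

        // Remaining runs: accumulate elapsed time, then average.
        long warmNanos = 0;
        for (int i = 1; i < iterations; i++) {
            start = System.nanoTime();
            query.run();
            warmNanos += System.nanoTime() - start;
        }
        double warmAvgMs = warmNanos / 1e6 / (iterations - 1);
        return new Result(firstMs, warmAvgMs);
    }

    public static void main(String[] args) {
        // Hypothetical stand-in workload; replace with the real sorted search.
        Runnable fakeQuery = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += i;
            }
            if (sum < 0) {
                System.out.println(sum); // keeps the loop from being trivially dropped
            }
        };
        Result r = time(fakeQuery, 20);
        System.out.printf("first: %.2f ms, avg of rest: %.2f ms%n", r.firstMs, r.warmAvgMs);
    }
}
```

Keeping the first run out of the average matters because it includes one-time loading work that later runs skip.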
First query (includes loading):
lucene 2.4.1: 4858ms, lucene 2.9.0: 816ms, a speedup of about 5.95x
avg of the remaining 19 queries:
lucene 2.4.1: 32ms, lucene 2.9.0: 17ms, a speedup of about 1.88x
I ran this test about 5 times, the findings are similar.
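For anyone checking the arithmetic: the gain figures above are the old time divided by the new time, which is a different number from the percentage of time saved. A small sketch using only the timings reported in this message (the class and method names are made up for illustration):

```java
public class Speedup {
    /** Old time divided by new time; 2.0 means the new version takes half as long. */
    public static double ratio(double oldMs, double newMs) {
        return oldMs / newMs;
    }

    /** Percentage of the old elapsed time that the new version saves. */
    public static double timeReductionPercent(double oldMs, double newMs) {
        return (oldMs - newMs) / oldMs * 100.0;
    }

    public static void main(String[] args) {
        // Timings taken from the results above (2.4.1 vs 2.9.0).
        double[][] runs = { {4858, 816}, {32, 17} };
        String[] labels = { "first query", "avg of rest" };
        for (int i = 0; i < runs.length; i++) {
            System.out.printf("%s: %.2fx faster, %.1f%% less time%n",
                    labels[i],
                    ratio(runs[i][0], runs[i][1]),
                    timeReductionPercent(runs[i][0], runs[i][1]));
        }
    }
}
```

So the first query is roughly 6 times faster and the warm queries are a bit under 2 times faster.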
The performance gain is significant!
Great job!
-John
Re: new sorting api and some perf numbers
Posted by Bradford Stephens <br...@gmail.com>.
Wow! This is awesome. Can't wait to see how it plays with Bobo :)
--
http://www.drawntoscaleconsulting.com - Scalability, Hadoop, HBase,
and Distributed Lucene Consulting
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science