Posted to java-user@lucene.apache.org by Stanislav Jordanov <st...@sirma.bg> on 2006/11/21 16:49:36 UTC

Querying performance decrease in 1.9.1 and 2.0.0

Hi guys,

We've identified a significant querying performance decrease after
switching from Lucene 1.4.3 to 1.9.1.
It shows up consistently, no matter whether the number of concurrent
querying threads is 1, 2, 4 or 8 (or even more):
if 1.9.1 executes N queries in a given time, then 1.4.3
executes approx. 1.5 * N queries in the same time.
Lucene 2.0.0 behaves just like 1.9.1 in terms of querying performance.
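
For anyone who wants to reproduce this kind of comparison, a fixed-time throughput measurement could be sketched roughly as below. This is only an illustration under assumptions: ThroughputSketch and countQueries are hypothetical names, and the Runnable stands in for whatever search call is being measured. It is not the load test used for the numbers above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical harness: counts how many queries finish in a fixed time window. */
final class ThroughputSketch {

    /**
     * Runs the given query repeatedly on `threads` worker threads for about
     * `windowMillis` milliseconds and returns the number of completed runs.
     */
    static long countQueries(Runnable query, int threads, long windowMillis) {
        AtomicLong completed = new AtomicLong();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(windowMillis);
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < threads; i++) {
            pool.execute(() -> {
                // Each worker keeps querying until the shared deadline passes.
                while (System.nanoTime() - deadline < 0) {
                    query.run();
                    completed.incrementAndGet();
                }
            });
        }
        pool.shutdown();
        try {
            // Wait for the workers to notice the deadline and exit.
            pool.awaitTermination(windowMillis + 10_000, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return completed.get();
    }
}
```

Running the same query mix for the same window against each Lucene version, and for each thread count, gives the N versus 1.5 * N style of comparison described above.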

Any idea what may be causing this behavior?
Or is it just the Windows (tm) syndrome - every newer version requires 
more powerful hardware to deliver the same performance?

Regards,
Stanislav

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Paul Elschot <pa...@xs4all.nl>.
Stanislav,

On Wednesday 22 November 2006 09:52, Stanislav Jordanov wrote:
> Paul,
> We are working on delivering the next release by the end of the week so 
> I have to take care of 2 or 3 issues before I try the nightly build.
> I promise to try it and report the results here.

I have made a first attempt at restoring the old query performance here:
http://issues.apache.org/jira/browse/LUCENE-730

Regards,
Paul Elschot



Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Paul Elschot <pa...@xs4all.nl>.
On Tuesday 28 November 2006 12:12, Stanislav Jordanov wrote:
> Paul,
> we are using a slightly modified version of Lucene,
> so in order to run the performance tests on a nightly build, I need 
> Lucene's sources, not the compiled classes.
> Is there a nice and easy way to get them?

The sources are also in the nightly build .tgz iirc.
In case you can compile your modifications into a separate
jar, the easiest way is to put that jar on the class path before
the lucene jar, and then you don't need the sources.
That compilation could also use the latest lucene jar, btw.

However, in case you want to try this patch:
http://issues.apache.org/jira/browse/LUCENE-730
it is just as much work to check out the source code from the svn trunk.
The checkout command is here:
http://lucene.apache.org/java/docs/releases.html
You'll also need ant then.

I think it is time to continue this on java-dev.

Regards,
Paul Elschot



Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Stanislav Jordanov <st...@sirma.bg>.
Paul,
we are using a slightly modified version of Lucene,
so in order to run the performance tests on a nightly build, I need 
Lucene's sources, not the compiled classes.
Is there a nice and easy way to get them?

Stanislav




Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Stanislav Jordanov <st...@sirma.bg>.
Paul,
We are working on delivering the next release by the end of the week so 
I have to take care of 2 or 3 issues before I try the nightly build.
I promise to try it and report the results here.

Best,
Stanislav



Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Chris Hostetter <ho...@fucit.org>.
: Could you also try a nightly build to test the later performance improvement
: on BooleanScorer2?  The nightly builds are here:
: http://people.apache.org/builds/lucene/java/nightly/
: The jar is called lucene-core-nightly.jar in the .tar.gz build.
:
: It's not likely that this is faster than the 1.4 BooleanScorer,
: but one never knows.

...and there have been other performance improvements that may make up for
it, MultiTermDocs.skipTo for example (if you use non-optimized indexes)



-Hoss




Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Paul Elschot <pa...@xs4all.nl>.
Stanislav,

Could you also try a nightly build to test the later performance improvement
on BooleanScorer2?  The nightly builds are here:
http://people.apache.org/builds/lucene/java/nightly/
The jar is called lucene-core-nightly.jar in the .tar.gz build.

It's not likely that this is faster than the 1.4 BooleanScorer, 
but one never knows.

Regards,
Paul Elschot




Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Yonik Seeley <yo...@apache.org>.
On 11/21/06, Stanislav Jordanov <st...@sirma.bg> wrote:
> Switch to the old scorer (via BooleanQuery.setUseScorer14(true) )
> solved the performance issue - now Lucene 1.9.1 & 2.0.0 perform on the
> same load test just as 1.4.3 does
>
> Thanks a lot Yonik!
>
> Any chance there exists a non-professional explanation what's the
> difference between old and new boolean scorers?

The original BooleanScorer was a bucket-based scorer that could
deliver docs out of order and thus restricted how it could be used.
It also had a limitation of 32 required or prohibited clauses (because
of an int bitmask).
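
As a rough illustration of where a 32-clause limit comes from, here is a toy model that keeps one int of flags per document, with one bit per required clause. It is a sketch under assumptions, not Lucene's actual BooleanScorer code: BucketScorerSketch and matchAllRequired are made-up names, and the sketch does not reproduce the out-of-order delivery mentioned above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of bitmask-based matching; not Lucene's actual implementation. */
final class BucketScorerSketch {

    /**
     * Returns the doc ids (below maxDoc) that match every required clause.
     * requiredPostings[i] holds the doc ids matching required clause i.
     */
    static List<Integer> matchAllRequired(int[][] requiredPostings, int maxDoc) {
        if (requiredPostings.length > 32) {
            // One bit per clause in a 32-bit int: a 33rd clause has no bit left.
            throw new IllegalArgumentException("more than 32 required clauses");
        }
        int[] bits = new int[maxDoc];           // one flag word per document
        for (int clause = 0; clause < requiredPostings.length; clause++) {
            for (int doc : requiredPostings[clause]) {
                bits[doc] |= 1 << clause;       // record that this clause matched
            }
        }
        // Mask with one bit set for every required clause.
        int requiredMask = requiredPostings.length == 32
                ? -1
                : (1 << requiredPostings.length) - 1;
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < maxDoc; doc++) {
            if ((bits[doc] & requiredMask) == requiredMask) {
                hits.add(doc);                  // all required bits are set
            }
        }
        return hits;
    }
}
```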

BooleanScorer2 removes these limitations, and uses skipTo() where
applicable on sub-scorers.
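
To give a feel for what using skipTo() on sub-scorers buys, here is a minimal sketch of a conjunction that leapfrogs over sorted doc id lists. It is an illustration only: PostingCursor and ConjunctionSketch are hypothetical names, the toy skipTo returns the doc id instead of mirroring Lucene's Scorer API, and a linear scan stands in for a real skip structure.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy doc id iterator with a skipTo operation; names are illustrative only. */
final class PostingCursor {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;
    private final int[] docs;   // doc ids, sorted ascending
    private int pos = 0;

    PostingCursor(int[] sortedDocs) { this.docs = sortedDocs; }

    /** Current doc id, or NO_MORE_DOCS when the list is exhausted. */
    int doc() { return pos < docs.length ? docs[pos] : NO_MORE_DOCS; }

    /** Advances to the first doc id >= target and returns it. */
    int skipTo(int target) {
        // Linear scan for simplicity; a real index would jump ahead here.
        while (pos < docs.length && docs[pos] < target) pos++;
        return doc();
    }
}

final class ConjunctionSketch {
    /** Returns, in increasing order, the doc ids present in every cursor. */
    static List<Integer> intersect(PostingCursor[] cursors) {
        List<Integer> hits = new ArrayList<>();
        if (cursors.length == 0) return hits;
        int candidate = cursors[0].doc();
        while (candidate != PostingCursor.NO_MORE_DOCS) {
            boolean allMatch = true;
            for (PostingCursor c : cursors) {
                int d = c.skipTo(candidate);
                if (d != candidate) {
                    // This cursor is already past the candidate: adopt its
                    // doc id as the new candidate and start the round again.
                    candidate = d;
                    allMatch = false;
                    break;
                }
            }
            if (allMatch) {
                hits.add(candidate);
                candidate = cursors[0].skipTo(candidate + 1);
            }
        }
        return hits;
    }
}
```

Docs come out strictly in increasing id order, and no fixed-width mask caps the number of clauses. Whether this is faster than bucket-style scoring depends on how much the lists allow skipping.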

-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server



Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Stanislav Jordanov <st...@sirma.bg>.
Switching to the old scorer (via BooleanQuery.setUseScorer14(true))
solved the performance issue: Lucene 1.9.1 & 2.0.0 now perform on the
same load test just as 1.4.3 does.

Thanks a lot Yonik!

Is there a layman's explanation of the difference between the
old and new boolean scorers?

Cheers,
Stenly



Re: Querying performance decrease in 1.9.1 and 2.0.0

Posted by Yonik Seeley <yo...@apache.org>.
On 11/21/06, Stanislav Jordanov <st...@sirma.bg> wrote:
> We've identified a significant querying performance decrease after
> switching from Lucene 1.4.3 to 1.9.1.
> It is steadily demonstrated no matter if the concurrent querying threads
> are 1, 2, 4 or 8 (or even more) -
> If N queries are executed against 1.9.1 for a given time, then 1.4.3
> executes approx. 1.5 * N queries for the same time.
> Lucene 2.0.0 behaves just like 1.9.1 in terms of querying performance.
>
> Any idea what may be causing this behavior?

BooleanScorer changed to BooleanScorer2, which can be faster or slower
depending on the nature of the queries.  There has been some
performance work lately in the trunk... you could try the latest
nightly builds to see if it improves things for you.

You could also verify that it's only the BooleanScorer by trying to
use the old scorer with the new code via
BooleanQuery.setUseScorer14(true)


-Yonik
http://incubator.apache.org/solr Solr, the open-source Lucene search server
