Posted to java-user@lucene.apache.org by Ion Barcan <io...@gmail.com> on 2009/10/13 15:23:22 UTC

PhraseQuery in BooleanQuery not working properly in 2.9.0

Hello,

With the new Lucene 2.9.0 (on a newly built index of approx. 30
million documents), BooleanQueries that contain a PhraseQuery do not
return the correct number of results. I've verified this on both
optimized and unoptimized versions of the index.

For example:

lucli> count field1:"john doe"
Searching for: field1:"john doe"
496 total documents

lucli> count +(field1:"john doe")
Searching for: +field1:"john doe"
496 total documents

lucli> count +(field1:"john doe" field1:"john doe")
Searching for: +(field1:"john doe" field1:"john doe")
5 total documents

lucli> count +(+field1:"john doe" field1:"john doe")
Searching for: +(+field1:"john doe" field1:"john doe")
496 total documents

lucli> count +(field1:"john doe" field2:UnmatchedValue)
Searching for: +(field1:"john doe" field2:UnmatchedValue)
5 total documents

lucli> count +(+field1:"john doe" field2:UnmatchedValue)
Searching for: +(+field1:"john doe" field2:UnmatchedValue)
496 total documents
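
For reference, the counts one would expect from the queries above can be
sketched in plain Java. This is only a model of boolean clause semantics
over sets of document ids, not Lucene code: the class name, the method
names and the document ids are all hypothetical. It assumes that a group
of optional clauses matches the union of its clauses, and that once a
required clause is present the optional clauses affect scoring only.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of boolean clause matching; it does not use Lucene.
class BooleanClauseModel {

    // Optional-only group, e.g. +(A B): a document matches if at least
    // one clause matches, so the result is the union of the clause sets.
    static Set<Integer> shouldOnly(List<Set<Integer>> clauses) {
        Set<Integer> result = new TreeSet<>();
        for (Set<Integer> clause : clauses) {
            result.addAll(clause);
        }
        return result;
    }

    // Required clause plus optional clauses, e.g. +(+A B): the optional
    // clauses are deliberately ignored here because, in this model, they
    // only influence ranking, never which documents match.
    static Set<Integer> mustWithOptional(Set<Integer> must,
                                         List<Set<Integer>> optional) {
        return new TreeSet<>(must);
    }

    public static void main(String[] args) {
        // Made-up ids standing in for the documents matching field1:"john doe".
        Set<Integer> phrase = new TreeSet<>(Arrays.asList(3, 17, 42, 99));
        // field2:UnmatchedValue matches nothing.
        Set<Integer> unmatched = new TreeSet<>();

        System.out.println(phrase.size());
        System.out.println(shouldOnly(Arrays.asList(phrase, phrase)).size());
        System.out.println(shouldOnly(Arrays.asList(phrase, unmatched)).size());
        System.out.println(mustWithOptional(phrase, Arrays.asList(unmatched)).size());
    }
}
```

Under this model every query above should report the same count as the
bare phrase query, which is why the two results of 5 look wrong.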

I could also reproduce this when searching with TopScoreDocCollector(N,
true|false): the call using docsScoredInOrder=false produces
incorrect results.

While debugging I noticed that when the BooleanQuery contains at
least one MUST clause, BooleanScorer2 is used and produces the
correct number of results. When the BooleanQuery contains no MUST
clause, BooleanScorer.score(Collector, int, int) collects up to a
certain number of docs and then exits prematurely.

Is this behaviour normal? This used to work in Lucene 2.4.x.

I've noticed another user mentioning a similar behaviour
(http://mail-archives.apache.org/mod_mbox/lucene-java-user/200910.mbox/%3C20091008121147.107a8589@pc-4176.kl.dfki.de%3E),
but in my case it's a newly built index, not one that was migrated
from 2.4 to 2.9.

Thanks,
Ionut

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: PhraseQuery in BooleanQuery not working properly in 2.9.0

Posted by Chris Hostetter <ho...@fucit.org>.
: With the new Lucene 2.9.0 (on a newly built index of approx. 30
: million documents) running BooleanQueries containing PhraseQuery does
: not work properly. I've verified this on both optimized and
: unoptimized index versions.

I suspect that this is the same problem as the one reported in
LUCENE-1974. A fix has already been identified, and a 2.9.1 release
will most likely happen very soon...

https://issues.apache.org/jira/browse/LUCENE-1974



-Hoss




Re: PhraseQuery in BooleanQuery not working properly in 2.9.0

Posted by Ion Barcan <io...@gmail.com>.
Yes, the fix in src/java/org/apache/lucene/search/Scorer.java solves
my problem, i.e. the queries return the correct number of results.

On Wed, Oct 14, 2009 at 12:29 PM, Michael McCandless
<lu...@mikemccandless.com> wrote:
> It sounds likely that this is https://issues.apache.org/jira/browse/LUCENE-1974
>
> Is it possible for you to test that patch and verify it resolves your problem?
>
> Mike



Re: PhraseQuery in BooleanQuery not working properly in 2.9.0

Posted by Michael McCandless <lu...@mikemccandless.com>.
It sounds likely that this is https://issues.apache.org/jira/browse/LUCENE-1974

Is it possible for you to test that patch and verify it resolves your problem?

Mike

