Posted to dev@lucene.apache.org by "Michael McCandless (JIRA)" <ji...@apache.org> on 2011/05/04 12:05:03 UTC
[jira] [Updated] (LUCENE-3068) The repeats mechanism in SloppyPhraseScorer is broken when doc has tokens at same position
[ https://issues.apache.org/jira/browse/LUCENE-3068?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Michael McCandless updated LUCENE-3068:
---------------------------------------
Attachment: LUCENE-3068.patch
Patch with a test case showing the problem.
If you set slop to 0 for the PhraseQuery, the test passes. The MultiPhraseQuery passes with or without slop because it handles the same-position case itself (Union*Enum).
That got me thinking... maybe any time a *PhraseQuery has overlapping positions, we should rewrite to a MultiPhraseQuery and let it handle the same positions...? Is there any downside to that?
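To make the rewrite idea concrete, here is a minimal sketch in plain Java (no Lucene dependency) of the grouping step it would need: detect whether any two query terms share a position and, if so, collect the terms per position. The class name PositionGrouping and both method names are hypothetical and are not part of the attached patch. Each resulting group is the kind of input one would pass to MultiPhraseQuery.add(Term[], int). The sketch covers only the grouping; whether the rewritten query matches the same documents as the original is exactly the open question above.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical helper, for illustration only; not code from LUCENE-3068.patch.
class PositionGrouping {

    /** Returns true if at least two query terms were added at the same position. */
    static boolean hasOverlappingPositions(int[] positions) {
        Set<Integer> seen = new HashSet<Integer>();
        for (int position : positions) {
            // Set.add returns false when the value is already present.
            if (!seen.add(position)) {
                return true;
            }
        }
        return false;
    }

    /**
     * Groups terms by position, in ascending position order.
     * terms[i] is assumed to sit at positions[i].
     */
    static TreeMap<Integer, List<String>> groupByPosition(String[] terms, int[] positions) {
        if (terms.length != positions.length) {
            throw new IllegalArgumentException("terms and positions must have the same length");
        }
        TreeMap<Integer, List<String>> groups = new TreeMap<Integer, List<String>>();
        for (int i = 0; i < terms.length; i++) {
            List<String> group = groups.get(positions[i]);
            if (group == null) {
                group = new ArrayList<String>();
                groups.put(positions[i], group);
            }
            group.add(terms[i]);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Made-up query: "a" and "b" share position 0, "c" is at position 1.
        String[] terms = {"a", "b", "c"};
        int[] positions = {0, 0, 1};
        if (hasOverlappingPositions(positions)) {
            // Each entry would become one MultiPhraseQuery.add(Term[], int) call.
            System.out.println(groupByPosition(terms, positions));
        }
    }
}
```

A query with no shared positions would skip the grouping entirely and stay a regular PhraseQuery.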
> The repeats mechanism in SloppyPhraseScorer is broken when doc has tokens at same position
> ------------------------------------------------------------------------------------------
>
> Key: LUCENE-3068
> URL: https://issues.apache.org/jira/browse/LUCENE-3068
> Project: Lucene - Java
> Issue Type: Bug
> Components: Search
> Affects Versions: 3.0.3, 3.1, 4.0
> Reporter: Michael McCandless
> Priority: Minor
> Fix For: 3.2, 4.0
>
> Attachments: LUCENE-3068.patch
>
>
> In LUCENE-736 we made fixes to SloppyPhraseScorer, because it was
> matching docs that it shouldn't; but I think those changes caused it
> to fail to match docs that it should, specifically when the doc itself
> has tokens at the same position.