
Posted to dev@lucene.apache.org by "Otis Gospodnetic (JIRA)" <ji...@apache.org> on 2008/05/27 18:12:09 UTC

[jira] Resolved: (LUCENE-1285) WeightedSpanTermExtractor incorrectly treats the same terms occurring in different query types

     [ https://issues.apache.org/jira/browse/LUCENE-1285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Otis Gospodnetic resolved LUCENE-1285.
--------------------------------------

       Resolution: Fixed
    Lucene Fields: [New, Patch Available]  (was: [New])

It looks like Mark already committed this but forgot to resolve the issue, so I'm marking it as Fixed.


> WeightedSpanTermExtractor incorrectly treats the same terms occurring in different query types
> ----------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1285
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1285
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>    Affects Versions: 2.4
>            Reporter: Andrzej Bialecki 
>            Assignee: Otis Gospodnetic
>             Fix For: 2.4
>
>         Attachments: highlighter-test.patch, highlighter.patch
>
>
> Given a BooleanQuery with multiple clauses, if a term occurs both in a Span / Phrase query and in a TermQuery, the results of term extraction are unpredictable and depend on the order of clauses. Consequently, the results of highlighting are incorrect.
> Example text: t1 t2 t3 t4 t2
> Example query: t2 t3 "t1 t2"
> Current highlighting: [t1 t2] [t3] t4 t2
> Correct highlighting: [t1 t2] [t3] t4 [t2]
> The problem comes from the fact that we keep a Map<termText, WeightedSpanTerm>, and if the same term occurs in a Phrase or Span query the resulting WeightedSpanTerm will have a positionSensitive=true, whereas terms added from TermQuery have positionSensitive=false. The end result for this particular term will depend on the order in which the clauses are processed.
> My fix is to use a subclass of Map, which on put() always sets the result to the most lax setting, i.e. if we already have a term with positionSensitive=true, and we try to put() a term with positionSensitive=false, we set the result positionSensitive=false, as it will match both cases.
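
The "most lax setting wins" rule described above can be sketched in a few lines. This is only an illustration, not the code from highlighter.patch: SpanTermSketch and LaxPositionMap are hypothetical names, SpanTermSketch models nothing but the positionSensitive flag, and extending HashMap is an assumption, since the description only says "a subclass of Map".

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for WeightedSpanTerm; only the flag discussed above is modelled.
class SpanTermSketch {
    final String text;
    boolean positionSensitive;

    SpanTermSketch(String text, boolean positionSensitive) {
        this.text = text;
        this.positionSensitive = positionSensitive;
    }
}

// Hypothetical map whose put() keeps the laxest positionSensitive setting seen for a key.
class LaxPositionMap extends HashMap<String, SpanTermSketch> {
    @Override
    public SpanTermSketch put(String key, SpanTermSketch value) {
        SpanTermSketch previous = super.put(key, value);
        // If the term was already stored as position-insensitive (e.g. from a
        // TermQuery), the newly stored entry must stay position-insensitive too.
        if (previous != null && !previous.positionSensitive) {
            value.positionSensitive = false;
        }
        return previous;
    }

    @Override
    public void putAll(Map<? extends String, ? extends SpanTermSketch> other) {
        // Route every entry through put() so the same rule applies to bulk inserts.
        for (Map.Entry<? extends String, ? extends SpanTermSketch> entry : other.entrySet()) {
            put(entry.getKey(), entry.getValue());
        }
    }
}
```

With this sketch, putting t2 first as position-sensitive (from the phrase "t1 t2") and then as position-insensitive (from the plain term t2), or in the opposite order, leaves the stored entry position-insensitive either way, so the outcome no longer depends on clause order.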

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org

Re: Mark Miller & Jira Committer Role --- was: Re: [jira] Resolved: (LUCENE-1285) ...

Posted by Chris Hostetter <ho...@fucit.org>.

: > I've added you to the committer group in Jira .. you should be able to
: > assign issues to yourself, and resolve issues now.
: 
: You mean 'role', right?  We don't use Jira groups much anymore.

correct ... i had it right in the wiki (which i apparently only previewed 
and didn't save until now) but i "synonym transposed" in my email.



-Hoss



Re: Mark Miller & Jira Committer Role --- was: Re: [jira] Resolved: (LUCENE-1285) ...

Posted by Doug Cutting <cu...@apache.org>.

Chris Hostetter wrote:
> I've added you to the committer group in Jira .. you should be able to 
> assign issues to yourself, and resolve issues now.

You mean 'role', right?  We don't use Jira groups much anymore.

Doug



Mark Miller & Jira Committer Role --- was: Re: [jira] Resolved: (LUCENE-1285) ...

Posted by Chris Hostetter <ho...@fucit.org>.

: Hey Otis, maybe I am missing something, but it didn't seem like I had the
: ability to resolve it. Hope it's not too obvious and I am just missing the
: link.

I've added you to the committer group in Jira .. you should be able to 
assign issues to yourself, and resolve issues now.



-Hoss



Re: [jira] Resolved: (LUCENE-1285) WeightedSpanTermExtractor incorrectly treats the same terms occurring in different query types

Posted by Mark Miller <ma...@gmail.com>.

Hey Otis, maybe I am missing something, but it didn't seem like I had 
the ability to resolve it. Hope it's not too obvious and I am just 
missing the link.


