Posted to dev@lucene.apache.org by "Koji Sekiguchi (JIRA)" <ji...@apache.org> on 2009/07/17 16:03:15 UTC

[jira] Commented: (LUCENE-1752) incorrect snippet returned with SpanScorer

    [ https://issues.apache.org/jira/browse/LUCENE-1752?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12732531#action_12732531 ] 

Koji Sekiguchi commented on LUCENE-1752:
----------------------------------------

The patch looks good! Thanks, Mark.

I think the customer will test the patch with their data in a uni-gram environment on Monday. I'll report back. Thanks again. :)

> incorrect snippet returned with SpanScorer
> ------------------------------------------
>
>                 Key: LUCENE-1752
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1752
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: contrib/highlighter
>    Affects Versions: 2.9
>            Reporter: Koji Sekiguchi
>            Assignee: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1752.patch
>
>
> This problem was reported by my customer. They are using Solr 1.3 and uni-gram, but it can be reproduced with Lucene 2.9 and WhitespaceAnalyzer.
> {panel:title=Query}
> (f1:"a b c d" OR f2:"a b c d") AND (f1:"b c g" OR f2:"b c g")
> {panel}
> The snippet we expected is:
> {panel}
> x y z <B>a</B> <B>b</B> <B>c</B> <B>d</B> e f g <B>b</B> <B>c</B> <B>g</B>
> {panel}
> but we got:
> {panel}
> x y z <B>a</B> b c <B>d</B> e f g <B>b</B> <B>c</B> <B>g</B>
> {panel}
> Program to reproduce the problem:
> {code}
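> import java.io.StringReader;
>
> import org.apache.lucene.analysis.Analyzer;
> import org.apache.lucene.analysis.CachingTokenFilter;
> import org.apache.lucene.analysis.WhitespaceAnalyzer;
> import org.apache.lucene.queryParser.QueryParser;
> import org.apache.lucene.search.Query;
> import org.apache.lucene.search.highlight.Highlighter;
> import org.apache.lucene.search.highlight.Scorer;
> import org.apache.lucene.search.highlight.SpanScorer;
>
> public class TestHighlighter {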
>   static final String CONTENT = "x y z a b c d e f g b c g";
>   static final String PH1 = "\"a b c d\"";
>   static final String PH2 = "\"b c g\"";
>   static final String F1 = "f1";
>   static final String F2 = "f2";
>   static final String F1C = F1 + ":";
>   static final String F2C = F2 + ":";
>   static final String QUERY_STRING =
>     "(" + F1C + PH1 + " OR " + F2C + PH1 + ") AND ("
>     + F1C + PH2 + " OR " + F2C + PH2 + ")";
>   static Analyzer analyzer = new WhitespaceAnalyzer();
>   
>   public static void main(String[] args) throws Exception {
>     QueryParser qp = new QueryParser( F1, analyzer );
>     Query query = qp.parse( QUERY_STRING );
>     CachingTokenFilter stream = new CachingTokenFilter( analyzer.tokenStream( F1, new StringReader( CONTENT ) ) );
>     Scorer scorer = new SpanScorer( query, F1, stream, false );
>     Highlighter h = new Highlighter( scorer );
>     System.out.println( "query : " + QUERY_STRING );
>     System.out.println( h.getBestFragment( analyzer, F1,  CONTENT ) );
>   }
> }
> {code}
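
To make the expected snippet above easy to check by hand, here is a minimal sketch in plain Java that marks every whitespace-separated token taking part in an exact, in-order match of either phrase. It does not use Lucene and is not how the highlighter or SpanScorer works internally; the class name ExpectedSnippet and the method highlight are hypothetical names chosen for illustration only.

```java
import java.util.Arrays;
import java.util.List;

public class ExpectedSnippet {

  // Wraps in <B>...</B> every token that is part of an exact, in-order
  // occurrence of any of the given phrases. Tokens are split on whitespace,
  // mirroring what WhitespaceAnalyzer would produce for this simple input.
  static String highlight(String content, List<String> phrases) {
    String[] tokens = content.split("\\s+");
    boolean[] hit = new boolean[tokens.length];
    for (String phrase : phrases) {
      String[] terms = phrase.split("\\s+");
      // Slide a window of the phrase length over the tokens.
      for (int start = 0; start + terms.length <= tokens.length; start++) {
        String[] window = Arrays.copyOfRange(tokens, start, start + terms.length);
        if (Arrays.equals(window, terms)) {
          Arrays.fill(hit, start, start + terms.length, true);
        }
      }
    }
    StringBuilder sb = new StringBuilder();
    for (int i = 0; i < tokens.length; i++) {
      if (i > 0) {
        sb.append(' ');
      }
      sb.append(hit[i] ? "<B>" + tokens[i] + "</B>" : tokens[i]);
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    // Same content and phrases as the reproduction program above.
    System.out.println(
        highlight("x y z a b c d e f g b c g", Arrays.asList("a b c d", "b c g")));
  }
}
```

For the content and phrases from the report, this prints the expected snippet, with all of a, b, c and d highlighted in the first phrase match.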

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org


Re: [jira] Commented: (LUCENE-1752) incorrect snippet returned with SpanScorer

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
Monday is a national holiday in Japan, so I think they will test on Tuesday.

Koji


