You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by "Michael Braun (JIRA)" <ji...@apache.org> on 2017/03/01 21:35:45 UTC

[jira] [Commented] (LUCENE-7682) UnifiedHighlighter not highlighting all terms relevant in SpanNearQuery

    [ https://issues.apache.org/jira/browse/LUCENE-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891101#comment-15891101 ] 

Michael Braun commented on LUCENE-7682:
---------------------------------------

Should this be marked as a bug for a module other than the highlighter then since it also affects scoring? 

> UnifiedHighlighter not highlighting all terms relevant in SpanNearQuery
> -----------------------------------------------------------------------
>
>                 Key: LUCENE-7682
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7682
>             Project: Lucene - Core
>          Issue Type: Bug
>          Components: modules/highlighter
>            Reporter: Michael Braun
>
> Original text: "Something for protecting wildlife feed in a feed thing."
> Query is:
>    SpanNearQuery with Slop 9 - in order - 
>       1. SpanTermQuery(wildlife)
>       2. SpanTermQuery(feed)
> This should highlight both instances of "feed" since they are both within slop of 9 of "wildlife". However, only the first instance is highlighted. This occurs with unordered SpanNearQuery as well.  Test below replicates. Affects both the current 6.x line and master.
> Test that fits within TestUnifiedHighlighterMTQ:
> {code}
>   public void testOrderedSpanNearQueryWithDupeTerms() throws Exception {
>     RandomIndexWriter iw = new RandomIndexWriter(random(), dir, indexAnalyzer);
>     Document doc = new Document();
>     doc.add(new Field("body", "Something for protecting wildlife feed in a feed thing.", fieldType));
>     doc.add(newTextField("id", "id", Field.Store.YES));
>     iw.addDocument(doc);
>     IndexReader ir = iw.getReader();
>     iw.close();
>     IndexSearcher searcher = newSearcher(ir);
>     UnifiedHighlighter highlighter = new UnifiedHighlighter(searcher, indexAnalyzer);
>     int docID = searcher.search(new TermQuery(new Term("id", "id")), 1).scoreDocs[0].doc;
>     SpanTermQuery termOne = new SpanTermQuery(new Term("body", "wildlife"));
>     SpanTermQuery termTwo = new SpanTermQuery(new Term("body", "feed"));
>     SpanNearQuery topQuery = new SpanNearQuery.Builder("body", true)
>         .setSlop(9)
>         .addClause(termOne)
>         .addClause(termTwo)
>         .build();
>     int[] docIds = new int[] {docID};
>     String snippets[] = highlighter.highlightFields(new String[] {"body"}, topQuery, docIds, new int[] {2}).get("body");
>     assertEquals(1, snippets.length);
>     assertEquals("Something for protecting <b>wildlife</b> <b>feed</b> in a <b>feed</b> thing.", snippets[0]);
>     ir.close();
>   }
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org