Posted to dev@lucene.apache.org by "Michael Braun (JIRA)" <ji...@apache.org> on 2017/03/01 21:35:45 UTC
[jira] [Commented] (LUCENE-7682) UnifiedHighlighter not highlighting all terms relevant in SpanNearQuery
[ https://issues.apache.org/jira/browse/LUCENE-7682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891101#comment-15891101 ]
Michael Braun commented on LUCENE-7682:
---------------------------------------
Should this be filed as a bug against a module other than the highlighter, then, since it also affects scoring?
> UnifiedHighlighter not highlighting all terms relevant in SpanNearQuery
> -----------------------------------------------------------------------
>
> Key: LUCENE-7682
> URL: https://issues.apache.org/jira/browse/LUCENE-7682
> Project: Lucene - Core
> Issue Type: Bug
> Components: modules/highlighter
> Reporter: Michael Braun
>
> Original text: "Something for protecting wildlife feed in a feed thing."
> Query is:
> SpanNearQuery with Slop 9 - in order -
> 1. SpanTermQuery(wildlife)
> 2. SpanTermQuery(feed)
> This should highlight both instances of "feed", since both are within a slop of 9 of "wildlife". However, only the first instance is highlighted. The same happens with an unordered SpanNearQuery. The test below reproduces the problem, which affects both the current 6.x line and master.
> Test that fits within TestUnifiedHighlighterMTQ:
> {code}
> public void testOrderedSpanNearQueryWithDupeTerms() throws Exception {
>   RandomIndexWriter iw = new RandomIndexWriter(random(), dir, indexAnalyzer);
>   Document doc = new Document();
>   doc.add(new Field("body", "Something for protecting wildlife feed in a feed thing.", fieldType));
>   doc.add(newTextField("id", "id", Field.Store.YES));
>   iw.addDocument(doc);
>   IndexReader ir = iw.getReader();
>   iw.close();
>
>   IndexSearcher searcher = newSearcher(ir);
>   UnifiedHighlighter highlighter = new UnifiedHighlighter(searcher, indexAnalyzer);
>   int docID = searcher.search(new TermQuery(new Term("id", "id")), 1).scoreDocs[0].doc;
>
>   SpanTermQuery termOne = new SpanTermQuery(new Term("body", "wildlife"));
>   SpanTermQuery termTwo = new SpanTermQuery(new Term("body", "feed"));
>   SpanNearQuery topQuery = new SpanNearQuery.Builder("body", true)
>       .setSlop(9)
>       .addClause(termOne)
>       .addClause(termTwo)
>       .build();
>
>   int[] docIds = new int[] {docID};
>   String[] snippets = highlighter.highlightFields(new String[] {"body"}, topQuery, docIds, new int[] {2}).get("body");
>   assertEquals(1, snippets.length);
>   assertEquals("Something for protecting <b>wildlife</b> <b>feed</b> in a <b>feed</b> thing.", snippets[0]);
>   ir.close();
> }
> {code}
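To make the slop arithmetic in the description concrete, here is a minimal standalone sketch that does not use Lucene at all. It assumes simple whitespace tokenization with lowercasing, no stopword removal, and one position per token. The class name SlopSketch and its helper methods are hypothetical, and the gap calculation is a simplified model of an ordered two-clause match, not Lucene's actual span-matching code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration only: a simplified model of term positions and
// the gap between two single-term spans. This is not Lucene's implementation.
public class SlopSketch {

    // Returns the token positions of `term`, assuming whitespace tokenization,
    // lowercasing, and stripping of non-alphanumeric characters.
    static List<Integer> positions(String text, String term) {
        List<Integer> out = new ArrayList<>();
        String[] tokens = text.toLowerCase(Locale.ROOT).split("\\s+");
        for (int i = 0; i < tokens.length; i++) {
            if (tokens[i].replaceAll("[^a-z0-9]", "").equals(term)) {
                out.add(i);
            }
        }
        return out;
    }

    // Number of positions between the end of the first single-term span
    // (firstPos + 1) and the start of the second.
    static int gap(int firstPos, int secondPos) {
        return secondPos - (firstPos + 1);
    }

    public static void main(String[] args) {
        String text = "Something for protecting wildlife feed in a feed thing.";
        int slop = 9;
        for (int w : positions(text, "wildlife")) {
            for (int f : positions(text, "feed")) {
                if (f > w) { // in-order match: "feed" must follow "wildlife"
                    int g = gap(w, f);
                    System.out.println("wildlife@" + w + " feed@" + f
                        + " gap=" + g + " withinSlop=" + (g <= slop));
                }
            }
        }
    }
}
```

Under these assumptions "wildlife" is at position 3 and "feed" at positions 4 and 7, giving gaps of 0 and 3. Both are well inside a slop of 9, which is why both occurrences would be expected to be highlighted.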