Posted to java-user@lucene.apache.org by Patricia Reddy <pa...@gmail.com> on 2020/07/21 20:54:33 UTC

Lucene Highlighting mergeContiguous

Hello All,

I am trying to highlight the phrase "John Doe" using the Lucene highlighter,
but the output highlights each term separately. Contiguous terms are not
merged together.

For example, <hl>John</hl><hl>Doe</hl> is returned instead of <hl>John Doe</hl>.

I have set the mergeContiguousFragments parameter of the getBestTextFragments
method to true. What could I be missing?

All help appreciated. Thanks!

StandardAnalyzer analyzer = new StandardAnalyzer();
QueryParser parser = new QueryParser("*", analyzer);
QueryScorer scorer = new QueryScorer(parser.parse(query), "*");
scorer.setExpandMultiTermQuery(true);
scorer.setMaxDocCharsToAnalyze(2000000);
Highlighter highlighter = new Highlighter(
    new SimpleHTMLFormatter("<hl>", "</hl>"), scorer);
highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));
highlighter.setMaxDocCharsToAnalyze(2000000);
TextFragment[] all = highlighter.getBestTextFragments(
    analyzer.tokenStream("*", new StringReader(str)), str, true, 100);
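
To make the desired output concrete, here is a minimal sketch that collapses adjacent highlight tags in an already-highlighted string, assuming post-processing the fragment text is acceptable. HighlightMerger and mergeAdjacent are hypothetical names used only for illustration; they are not part of Lucene, and this uses nothing but java.util.regex. It is a workaround sketch, not an explanation of what the highlighter does internally.

```java
import java.util.regex.Pattern;

class HighlightMerger {
    // Hypothetical helper (not a Lucene API): removes every
    // "closing tag, optional whitespace, opening tag" sequence and keeps
    // only the whitespace, so neighbouring highlights become one span.
    static String mergeAdjacent(String text, String pre, String post) {
        Pattern gap = Pattern.compile(
            Pattern.quote(post) + "(\\s*)" + Pattern.quote(pre));
        return gap.matcher(text).replaceAll("$1");
    }

    public static void main(String[] args) {
        String fragment = "<hl>John</hl> <hl>Doe</hl> was here";
        // Prints: <hl>John Doe</hl> was here
        System.out.println(mergeAdjacent(fragment, "<hl>", "</hl>"));
    }
}
```

The sketch merges any two highlights separated only by whitespace, whether or not they came from the same phrase match, so it may over-merge when unrelated terms happen to sit next to each other.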