Posted to user@lucenenet.apache.org by Michael Mitiaguin <mi...@gmail.com> on 2009/07/01 04:54:04 UTC

Re: Multi-color Highlighting: Term problem

I guess you could look for alternative highlighters among the contributions
for Java Lucene. I used one about 2 years ago which was faster (it required
indexing with term vectors) and highlighted phrase searches properly. As far
as I know, the most common highlighter doesn't handle phrases correctly: any
word from the phrase we searched for is highlighted. As for your problem, you
could try a stemming analyzer when indexing, but I am not sure whether it is
relevant or whether it will help.
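
Another option is to key the color on the original query clause ("merger",
"agree*") instead of on each expanded term, so that every variation of
"agree*" gets the same color. Below is a minimal sketch of that idea in plain
Java. It makes no Lucene calls: the class ClauseColorHighlighter, its methods,
and the color names are all hypothetical, it only understands a trailing "*",
and it splits text with a simple regex rather than the analyzer used for
indexing, so treat it as an illustration rather than a drop-in replacement for
the highlighter.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Lucene): one color per ORIGINAL query clause.
final class ClauseColorHighlighter {
    // Clause text (lower-cased) -> color, in the order the clauses were given.
    private final Map<String, String> colorByClause = new LinkedHashMap<>();

    public ClauseColorHighlighter(String[] clauses, String[] colors) {
        for (int i = 0; i < clauses.length; i++) {
            // Reuse colors cyclically if there are more clauses than colors.
            colorByClause.put(clauses[i].toLowerCase(), colors[i % colors.length]);
        }
    }

    // Returns the color of the first clause matching the word, or null if none does.
    public String colorFor(String word) {
        String w = word.toLowerCase();
        for (Map.Entry<String, String> e : colorByClause.entrySet()) {
            String clause = e.getKey();
            // Assumption: only a trailing '*' is treated as a wildcard (prefix match).
            boolean matches = clause.endsWith("*")
                    ? w.startsWith(clause.substring(0, clause.length() - 1))
                    : w.equals(clause);
            if (matches) {
                return e.getValue();
            }
        }
        return null;
    }

    // Wraps every matching word in a span whose color comes from its clause.
    // Note: the text is not HTML-escaped here; real code would need to do that.
    public String highlight(String text) {
        Matcher m = Pattern.compile("\\w+").matcher(text);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(text, last, m.start()); // copy text between words unchanged
            String color = colorFor(m.group());
            if (color == null) {
                out.append(m.group());
            } else {
                out.append("<span style=\"background:").append(color).append("\">")
                   .append(m.group()).append("</span>");
            }
            last = m.end();
        }
        out.append(text, last, text.length());
        return out.toString();
    }

    public static void main(String[] args) {
        ClauseColorHighlighter h = new ClauseColorHighlighter(
                new String[] {"merger", "agree*"},
                new String[] {"yellow", "lime"});
        System.out.println(h.highlight("The merger agreement was agreed."));
    }
}
```

With the clauses above, "agreement" and "agreed" both resolve to the color
assigned to "agree*", while "merger" keeps its own color. The point of the
design is only that the lookup goes from a word back to the clause that
matched it, so the number of colors equals the number of clauses the user
typed, not the number of terms the wildcard expands to.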

On Wed, Jun 24, 2009 at 4:36 PM, Nitin Shiralkar <ni...@coreobjects.com> wrote:

> Hi All,
>
> We are trying to implement multi-color highlighting in our Lucene.NET
> (v2.0) based search engine, using the "Lucene.Net.Highlight" library.
> Since the library has no support for multi-color highlighting, we do it
> indirectly by extracting each term in the search query and highlighting it
> individually with a separate formatter.
>
> For example:
>
> String strQuery = "merger agree*" (without quotes)
> ---
> WeightedTerm[] terms = QueryTermExtractor.GetTerms(strQuery, false);
> ---
> ---
> SimpleHTMLFormatter formatter = new
> SimpleHTMLFormatter(_strFormatterStartTag[nFormatter], _strFormatterEndTag);
> --- loop to traverse each term ---
> WeightedTerm term = terms[nTerm];
> ---
> TermQuery termQuery = new TermQuery (new Term (FIELDNAME, term.GetTerm()));
> ---
> Highlighter highlighterContent = ---
>
> Problem:
>
> The above implementation works. However, all variations matching the
> "agree*" query term, such as "agreements", "agreed", and "agreement", are
> highlighted in separate colors. I am not able to correlate these variations
> with the original term "agree*" so that they are highlighted in the same
> color.
>
> Can anyone suggest an alternative approach?