Posted to solr-user@lucene.apache.org by "O. Klein" <kl...@octoweb.nl> on 2011/09/05 16:09:42 UTC

Sentence aware Highlighter

Using the regex in the old highlighter I had reasonably sentence-aware
highlighting, but speed is an issue. So I tried to get this working with the
FVH, but that obviously didn't work with the regex.

So I am looking for ways to get the same behavior but with improved speed
and came across https://issues.apache.org/jira/browse/LUCENE-1824, which at
least would be a small improvement, but the last comment confused me, as I
thought FVH was going to be the new highlighter for Solr. So this patch
would make some sense, if I'm not mistaken.

Nonetheless, has anyone managed to make something like a
SentenceAwareFragmentsBuilder? Or does anyone have advice on how to realise
this?

--
View this message in context: http://lucene.472066.n3.nabble.com/Sentence-aware-Highlighter-tp3310982p3310982.html
Sent from the Solr - User mailing list archive at Nabble.com.

RE: Sentence aware Highlighter

Posted by "O. Klein" <kl...@octoweb.nl>.
I just stumbled upon https://issues.apache.org/jira/browse/LUCENE-1822.
Maybe this can help too?


RE: Sentence aware Highlighter

Posted by "O. Klein" <kl...@octoweb.nl>.
Not truncating terms anymore is a good first step. I'll be looking forward to
your solution, Koji.

A real SentenceAwareFragmentsBuilder is probably too difficult to make. But
maybe someone can implement the RegexFragmenter functionality for FVH.






RE: Sentence aware Highlighter

Posted by Bob Sandiford <bo...@sirsidynix.com>.
What if you made your field multi-valued and, at indexing time, split the text into sentences, storing each sentence in the Solr document as one value of the multi-valued field? Then I think the normal highlighting code can be used to pull the entire value (i.e. the sentence) of a matching instance within your document. In other words, put the 'overhead' into the index step rather than trying to do it at search time.
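
The index-time splitting step could be sketched roughly as below. This is only an illustration, not tested against Solr: the class and method names (SentenceSplitter, split) are hypothetical, and it assumes java.text.BreakIterator finds acceptable sentence boundaries for the content. Each returned string would then be added as one value of the multi-valued field.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: splits text into sentences before indexing.
final class SentenceSplitter {

    // Returns the trimmed, non-empty sentences of the text, in order.
    static List<String> split(String text, Locale locale) {
        List<String> sentences = new ArrayList<>();
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        int start = it.first();
        // Walk consecutive boundaries; each [start, end) span is one sentence.
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String sentence = text.substring(start, end).trim();
            if (!sentence.isEmpty()) {
                sentences.add(sentence);
            }
        }
        return sentences;
    }

    public static void main(String[] args) {
        String text = "Solr is a search server. It builds on Lucene. Highlighting can be slow!";
        // Each printed line would become one value of the multi-valued field.
        for (String sentence : split(text, Locale.ENGLISH)) {
            System.out.println(sentence);
        }
    }
}
```

The trade-off is the one described above: the splitting cost is paid once per document at index time, and the sentence boundaries are only as good as the BreakIterator rules for the chosen locale.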

Bob Sandiford | Lead Software Engineer | SirsiDynix
P: 800.288.8020 X6943 | Bob.Sandiford@sirsidynix.com
www.sirsidynix.com


> -----Original Message-----
> From: Koji Sekiguchi [mailto:koji@r.email.ne.jp]
> Sent: Monday, September 05, 2011 10:33 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Sentence aware Highlighter
> 
> (11/09/05 23:09), O. Klein wrote:
> > Using the regex in the old highlighter I had reasonably sentence-aware
> > highlighting, but speed is an issue. So I tried to get this working with
> > the FVH, but that obviously didn't work with the regex.
> >
> > So I am looking for ways to get the same behavior but with improved speed
> > and came across https://issues.apache.org/jira/browse/LUCENE-1824, which
> > at least would be a small improvement, but the last comment confused me,
> > as I thought FVH was going to be the new highlighter for Solr. So this
> > patch would make some sense, if I'm not mistaken.
> >
> > Nonetheless, has anyone managed to make something like a
> > SentenceAwareFragmentsBuilder? Or does anyone have advice on how to
> > realise this?
> 
> Sorry for the long delay on the issue!
> I'd like to take a look at it this week. Hopefully BreakIterator, which
> Robert mentioned in the JIRA, can be used.
> 
> Thank you for your patience!
> 
> koji
> --
> Check out "Query Log Visualizer" for Apache Solr
> http://www.rondhuit-demo.com/loganalyzer/loganalyzer.html
> http://www.rondhuit.com/en/



Re: Sentence aware Highlighter

Posted by Koji Sekiguchi <ko...@r.email.ne.jp>.
(11/09/05 23:09), O. Klein wrote:
> Using the regex in the old highlighter I had reasonably sentence-aware
> highlighting, but speed is an issue. So I tried to get this working with the
> FVH, but that obviously didn't work with the regex.
>
> So I am looking for ways to get the same behavior but with improved speed
> and came across https://issues.apache.org/jira/browse/LUCENE-1824, which at
> least would be a small improvement, but the last comment confused me, as I
> thought FVH was going to be the new highlighter for Solr. So this patch
> would make some sense, if I'm not mistaken.
>
> Nonetheless, has anyone managed to make something like a
> SentenceAwareFragmentsBuilder? Or does anyone have advice on how to realise
> this?

Sorry for the long delay on the issue!
I'd like to take a look at it this week. Hopefully BreakIterator, which
Robert mentioned in the JIRA, can be used.
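
As a rough illustration of the BreakIterator idea (not the eventual patch, and not FVH code), a matched term's character offsets could be widened to the enclosing sentence along these lines. The class and method names (SentenceWindow, enclosingSentence) are hypothetical, and the sketch assumes the offsets satisfy 0 <= matchStart < matchEnd <= text.length().

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper: widens a match to the sentence that contains it.
final class SentenceWindow {

    // Returns the trimmed sentence (or run of sentences) covering
    // the character range [matchStart, matchEnd) of the text.
    static String enclosingSentence(String text, int matchStart, int matchEnd, Locale locale) {
        BreakIterator it = BreakIterator.getSentenceInstance(locale);
        it.setText(text);
        int start = it.first();
        int end = text.length();
        // Simple forward scan over the sentence boundaries.
        for (int b = it.next(); b != BreakIterator.DONE; b = it.next()) {
            if (b <= matchStart) {
                start = b;   // latest boundary at or before the match start
            } else if (b >= matchEnd) {
                end = b;     // first boundary at or after the match end
                break;
            }
            // A boundary strictly inside the match is skipped, so a match
            // that spans two sentences yields both of them.
        }
        return text.substring(start, end).trim();
    }

    public static void main(String[] args) {
        String text = "Solr is fast. Highlighting can be slow. Lucene helps.";
        int at = text.indexOf("slow");
        System.out.println(enclosingSentence(text, at, at + "slow".length(), Locale.ENGLISH));
    }
}
```

The forward scan is linear in the number of sentences; BreakIterator also offers preceding(int) and following(int), which could replace the loop if that cost matters.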

Thank you for your patience!

koji
-- 
Check out "Query Log Visualizer" for Apache Solr
http://www.rondhuit-demo.com/loganalyzer/loganalyzer.html
http://www.rondhuit.com/en/