Posted to solr-user@lucene.apache.org by Dev Donomo <de...@donomo.com> on 2008/10/21 03:56:11 UTC

trying to make highlight fragmenter return entire HTML element

Hello,

I'm indexing HTML files and would like the highlighter to return the
entire <span> element that contains the highlight as the fragment.

For example, one of my documents is

<html><body>
...<span class="myfragment">here's some text</span>...
</body></html>

When the query is "some text" I would like the fragment to be

<span class="myfragment">here's <em>some text</em></span>
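
To make the desired behaviour concrete, here is a rough Python sketch of
what I'm after. It runs outside Solr and does not use the highlighter at
all; the function name span_fragment and the regex are just my own
illustration, and it assumes the <span> elements are not nested. Whether a
similar pattern is what hl.regex.pattern expects is exactly what I'm unsure
about.

```python
import re

# Illustrative pattern (an assumption, not a known-good hl.regex.pattern value):
# matches one whole <span ...>...</span> element. The non-greedy body stops at
# the first </span>, so nested <span> elements are not handled.
SPAN_RE = re.compile(r"<span\b[^>]*>.*?</span>", re.DOTALL)


def span_fragment(html, query):
    """Return the first <span> element containing `query`,
    with the first occurrence of `query` wrapped in <em>, or None."""
    for match in SPAN_RE.finditer(html):
        span = match.group(0)
        if query in span:
            return span.replace(query, "<em>" + query + "</em>", 1)
    return None


doc = (
    "<html><body>\n"
    '...<span class="myfragment">here\'s some text</span>...\n'
    "</body></html>"
)

print(span_fragment(doc, "some text"))
# prints: <span class="myfragment">here's <em>some text</em></span>
```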

I did see this thread http://markmail.org/message/mti6dn4waipx3fqw (Re:
trying to break up highlighted text on line boundaries), but I can't figure
out what kind of regex hl.regex.pattern really needs.

Can you help?
Thanks,
Donomo