Posted to java-user@lucene.apache.org by Michael Imbeault <mi...@sympatico.ca> on 2006/09/16 21:04:27 UTC
Better Highlighter fragmenter?
I'm now using the excellent Highlighter from within Solr and it works
very well, except that the generated fragments sometimes begin with
bad-looking characters (the "." from the end of the previous sentence, or a
")", "/10", etc.). The same is true for the fragment ends. I looked at both
the dev and user Lucene lists in search of a better Fragmenter class,
but it seems that there's none right now (just the simple and null
fragmenters).
To me the 'simple' fragmenter is a bit too simple; has anyone had success in
implementing a more intelligent one? I have no Java coding experience,
sadly, so I don't know where to begin on this one. I don't think fancy
sentence recognition is needed; just a better boundary algorithm (avoid
beginning / ending fragments with bad-looking characters) and the
addition of "..." at the beginning and end of the fragment if
a sentence was cut there.
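
To make the idea concrete, here is a rough sketch of the kind of boundary
cleanup I have in mind. It is not a Lucene Fragmenter; it is plain Java that
would post-process the fragment strings after highlighting. The class name
FragmentTidy, the two character sets, and the cutAtStart / cutAtEnd flags are
all assumptions made up for illustration, and the caller would have to know
whether a fragment was actually cut at either end.

```java
// Hypothetical helper, not part of Lucene or Solr: tidies the text of an
// already-generated fragment instead of changing how fragments are built.
final class FragmentTidy {
    // Assumed sets of "bad-looking" boundary characters; adjust to taste.
    private static final String LEADING_JUNK = ".,;:!?)]}/\\-";
    private static final String TRAILING_JUNK = ",;:([{/\\-";

    private FragmentTidy() {}

    private static boolean isJunk(char c, String junk) {
        return Character.isWhitespace(c) || junk.indexOf(c) >= 0;
    }

    // Strips junk characters from both ends, then adds "..." on each side
    // that the caller reports as cut.
    static String tidy(String fragment, boolean cutAtStart, boolean cutAtEnd) {
        int start = 0;
        int end = fragment.length();
        while (start < end && isJunk(fragment.charAt(start), LEADING_JUNK)) {
            start++;
        }
        while (end > start && isJunk(fragment.charAt(end - 1), TRAILING_JUNK)) {
            end--;
        }
        String core = fragment.substring(start, end);
        if (core.isEmpty()) {
            return core; // nothing readable left, so no ellipses either
        }
        StringBuilder sb = new StringBuilder();
        if (cutAtStart) {
            sb.append("...");
        }
        sb.append(core);
        if (cutAtEnd) {
            sb.append("...");
        }
        return sb.toString();
    }
}
```

Sentence-ending punctuation (".", "!", "?") is deliberately left alone at the
end of a fragment, and "<" is never stripped, so highlight tags at either edge
survive. This only removes single junk characters; a leftover such as "10"
from "/10" would still need smarter handling.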
Also, does the highlighted field have to be 'stored'? I'm pretty
sure it does, but I'd just like confirmation.
Thanks,
--
Michael Imbeault
CHUL Research Center (CHUQ)
2705 boul. Laurier
Ste-Foy, QC, Canada, G1V 4G2
Tel: (418) 654-2705, Fax: (418) 654-2212