Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2017/01/08 04:20:58 UTC

[jira] [Resolved] (LUCENE-7620) UnifiedHighlighter: add target character width BreakIterator wrapper

     [ https://issues.apache.org/jira/browse/LUCENE-7620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

David Smiley resolved LUCENE-7620.
----------------------------------
    Resolution: Fixed

Thanks for the review feedback, Jim & Tim!  6.4 is going to be a great release for the UnifiedHighlighter.  I hope this feature and the other improvements in this release get more folks using the UH.

> UnifiedHighlighter: add target character width BreakIterator wrapper
> --------------------------------------------------------------------
>
>                 Key: LUCENE-7620
>                 URL: https://issues.apache.org/jira/browse/LUCENE-7620
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: David Smiley
>            Assignee: David Smiley
>             Fix For: 6.4
>
>         Attachments: LUCENE_7620_UH_LengthGoalBreakIterator.patch, LUCENE_7620_UH_LengthGoalBreakIterator.patch, LUCENE_7620_UH_LengthGoalBreakIterator.patch
>
>
> The original Highlighter includes a {{SimpleFragmenter}} that delineates fragments (aka Passages) by a character width.  The default is 100 characters.
> It would be great to support something similar for the UnifiedHighlighter.  It's useful in its own right, and of course it helps users transition to the UH.  I'd like to do it as a wrapper around another BreakIterator, perhaps a sentence one.  That way you get back Passages made up of whole sentences, so they look nice instead of breaking midway through a sentence, and you get some control by specifying a target number of characters.  This BreakIterator wouldn't be a general-purpose java.text.BreakIterator, since it would assume it's called exactly the way the UnifiedHighlighter calls it.  It would probably be compatible with the PostingsHighlighter too.
> I don't propose making this the default; besides, it's easy enough to pick your own BreakIterator config.
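
The idea in the description (sentence-aligned passages with a character-length goal) can be illustrated with the JDK alone. The following is only a rough sketch, not the code from the attached patch: the class name LengthGoalSketch, the method passages, and the "at least targetLength characters" policy are all assumptions made for illustration. It walks the text sequentially, whereas the UnifiedHighlighter queries its BreakIterator around match offsets, so this shows the grouping idea rather than the wrapper itself.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical illustration class; not part of Lucene.
final class LengthGoalSketch {

    /**
     * Splits text into passages that always end on a sentence boundary and
     * are at least targetLength characters long. The final passage may be
     * shorter because the text simply runs out.
     */
    static List<String> passages(String text, int targetLength) {
        BreakIterator sentences = BreakIterator.getSentenceInstance(Locale.ROOT);
        sentences.setText(text);

        List<String> result = new ArrayList<>();
        int start = sentences.first();
        // Visit each sentence boundary in order; only cut a passage once the
        // accumulated sentences have reached the length goal.
        for (int end = sentences.next(); end != BreakIterator.DONE; end = sentences.next()) {
            if (end - start >= targetLength) {
                result.add(text.substring(start, end));
                start = end;
            }
        }
        // Keep whatever is left over as a final, possibly short, passage.
        if (start < text.length()) {
            result.add(text.substring(start));
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "The quick brown fox jumps. It lands softly. "
                + "Then it runs away to the forest. The end.";
        for (String passage : passages(text, 30)) {
            System.out.println("[" + passage + "]");
        }
    }
}
```

Because every cut happens at a boundary reported by the wrapped sentence iterator, no passage ends mid-sentence, and the target length only decides how many sentences are grouped together.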


