Posted to dev@lucene.apache.org by David Smiley <da...@gmail.com> on 2019/12/30 15:57:18 UTC

Highlighting and passage sizing backwards-compatibility

I want to draw some attention to a change coming in
LUCENE-9093 relating to the UnifiedHighlighter and how it sizes Passages.
I'll link to the pertinent summary comment:
https://issues.apache.org/jira/browse/LUCENE-9093?focusedCommentId=17005403&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17005403

The contributor and I are very happy with the improvements, and we think
they are good for basically everyone.  A new configuration option can be
set to approximate the previous behavior, but the result is not identical.
Consequently, if someone wrote highlighting tests in their app that assert
final passages and, let's say, configured the sizing alignment to be closer
to the current behavior, the passages are going to be different some of the
time.  Perhaps 5% of the time, as a very rough guess?  If the new "0.5"
default is chosen, then probably much higher at ~30% (another rough guess).
Nonetheless, we made these changes because we think the results are better.
So if it breaks someone's tests, well, they can and should update them
because the fragments will be sized better.  For users that demand the
utmost control, it remains possible for them to supply a BreakIterator impl
of their choosing and avoid LengthGoalBreakIterator.
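
For readers unfamiliar with what "supply a BreakIterator impl" means in
practice, here is a minimal sketch using only the JDK.  It shows how a
plain java.text.BreakIterator cuts text into candidate passages at sentence
boundaries, with no length goal applied.  The class and method names
(SentencePassages, split) are made up for illustration.  Wiring such an
iterator into the UnifiedHighlighter is not shown; check the highlighter's
Javadoc for the exact setter in your version.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene: lists the passages a plain
// sentence BreakIterator would produce for a piece of text.
final class SentencePassages {

    /** Splits text at sentence boundaries and trims surrounding whitespace. */
    static List<String> split(String text) {
        BreakIterator it = BreakIterator.getSentenceInstance(Locale.ROOT);
        it.setText(text);
        List<String> passages = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            passages.add(text.substring(start, end).trim());
        }
        return passages;
    }

    public static void main(String[] args) {
        String text = "Lucene is a search library. It supports highlighting.";
        for (String passage : split(text)) {
            System.out.println(passage);
        }
    }
}
```

Because passage boundaries then come solely from the iterator you pass in,
tests that assert exact passages would depend only on that iterator's
behavior rather than on the length-goal logic discussed above.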

Note:  The UnifiedHighlighter is labelled @lucene.experimental

Are others cool with this?  If we *had* to retain the old behavior, we
could in 8.x choose the pivot point based on the left edge of the first
match, as it did before.  That would still leave most of the good change
here, but some of the finesse would require users to wait until 9.
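
To make the pivot and alignment idea concrete, here is a toy sketch.  It is
not the LengthGoalBreakIterator code.  It assumes that the alignment value
is the fraction of the target length placed before the pivot offset, and it
works on raw character offsets without snapping to any break boundaries.
The names (PivotWindow, window) are hypothetical.

```java
// Toy illustration only: computes a character window around a pivot.
// Assumption: "alignment" in [0, 1] is the share of targetLength that
// goes before the pivot; 0 puts the pivot at the window's left edge.
final class PivotWindow {

    /** Returns {start, end} offsets, clamped to [0, textLength]. */
    static int[] window(int textLength, int pivot, int targetLength, float alignment) {
        int before = (int) (targetLength * alignment);
        int start = Math.max(0, pivot - before);
        int end = Math.min(textLength, pivot + (targetLength - before));
        return new int[] {start, end};
    }

    public static void main(String[] args) {
        int[] leftEdge = window(1000, 500, 100, 0.0f);
        int[] centered = window(1000, 500, 100, 0.5f);
        System.out.println(leftEdge[0] + ".." + leftEdge[1]);
        System.out.println(centered[0] + ".." + centered[1]);
    }
}
```

Under these assumptions, the same pivot and target length give different
windows for alignment 0 and 0.5, which is the kind of shift that would make
tests asserting exact passages start to fail.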

~ David Smiley
Apache Lucene/Solr Search Developer
http://www.linkedin.com/in/davidwsmiley