Posted to dev@lucene.apache.org by "David Smiley (JIRA)" <ji...@apache.org> on 2016/11/25 15:17:59 UTC

[jira] [Commented] (LUCENE-3080) cutover highlighter to BytesRef

    [ https://issues.apache.org/jira/browse/LUCENE-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15696088#comment-15696088 ] 

David Smiley commented on LUCENE-3080:
--------------------------------------

I'm tempted to close this as Won't-Fix.
The original Highlighter is what it is; it has only been updated to support different queries in WeightedSpanTermExtractor.

The notion of switching to byte-based offsets in the analysis chain, together with highlighting that uses them, is interesting!  It could speed up highlighting of massive docs, in part by avoiding reading massive Strings into memory.  *But I feel a separate issue should be filed for that*, starting just with changes to the analysis chain.  If we do that, I suppose the offset type would become metadata on the FieldInfo so it's understood what kind of offsets are stored: Java char offsets or byte offsets.  That way the highlighter could validate this assumption up front instead of having subtle errors.
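
To make the char-versus-byte distinction concrete, here is a rough sketch in plain Java.  Everything in it is hypothetical: OffsetKind, charToUtf8ByteOffset and requireOffsetKind are made-up names for illustration, not existing Lucene API, and nothing like this is on FieldInfo today.  It only shows that the two kinds of offsets diverge as soon as the text contains non-ASCII characters, and what an up-front check might look like.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Lucene.
final class OffsetKinds {

    /** Hypothetical per-field marker for the unit in which offsets are stored. */
    enum OffsetKind { JAVA_CHAR, UTF8_BYTE }

    /**
     * Converts a Java char (UTF-16 code unit) offset into a UTF-8 byte offset.
     * Assumes charOffset falls on a code point boundary, not inside a surrogate pair.
     */
    static int charToUtf8ByteOffset(String text, int charOffset) {
        return text.substring(0, charOffset).getBytes(StandardCharsets.UTF_8).length;
    }

    /** Fails fast when the stored offsets are not the kind the highlighter expects. */
    static void requireOffsetKind(OffsetKind stored, OffsetKind expected) {
        if (stored != expected) {
            throw new IllegalStateException(
                "Field stores " + stored + " offsets but highlighter expects " + expected);
        }
    }

    public static void main(String[] args) {
        // "naive cafe" with an i-diaeresis and an e-acute, written as escapes
        // so the source file encoding does not matter.
        String text = "na\u00efve caf\u00e9";
        int charStart = text.indexOf("caf\u00e9");
        int byteStart = charToUtf8ByteOffset(text, charStart);
        // The i-diaeresis takes two bytes in UTF-8, so the byte offset is one larger.
        System.out.println("char offset = " + charStart + ", byte offset = " + byteStart);

        // A matching kind passes silently; a mismatch would throw instead of
        // producing subtly misplaced highlights.
        requireOffsetKind(OffsetKind.UTF8_BYTE, OffsetKind.UTF8_BYTE);
    }
}
```

A highlighter that slices a String using byte offsets (or a byte[] using char offsets) would be off by one here, which is exactly the kind of subtle error an explicit check would catch.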

Highlighting numerics, IMO, *should also be a separate issue*.  It simply doesn't match the title of this issue, nor large parts of what is being discussed here.  It'd be cool to add support for highlighting numbers into the UnifiedHighlighter; I've thought through how to do that before.

> cutover highlighter to BytesRef
> -------------------------------
>
>                 Key: LUCENE-3080
>                 URL: https://issues.apache.org/jira/browse/LUCENE-3080
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/highlighter
>            Reporter: Michael McCandless
>
> Highlighter still uses char[] terms (consumes tokens from the analyzer as char[] not as BytesRef), which is causing problems for merging SOLR-2497 to trunk.



