You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/03/14 20:32:01 UTC

[GitHub] [lucene] zacharymorn commented on pull request #16: LUCENE-9634: Fix highlighting of extended intervals matched using offset

zacharymorn commented on pull request #16:
URL: https://github.com/apache/lucene/pull/16#issuecomment-798974587


   No problem Dawid and thanks for the feedback! I agree that re-analyzing the token stream does seems to be wasteful if the index already has position and offset information. I think the central issue here is that the current `Intervals#extend` API here only takes before / after values based on position instead of offset. But on the other hand, additional API that takes in / provides offset based before / after values may not be very meaningful to most client applications (I think), as offset varies based on each token length.  So highlighting offset range computed based on non-positional APIs may always require adjustment here if the before / after values are provided based on position (this may also include `OffsetsFromTokens` and `OffsetsFromTokens` strategies as well). 
   
   https://github.com/apache/lucene/blob/471f38c0310671f9856ad86ad98c4238d2b770dc/lucene/queries/src/java/org/apache/lucene/queries/intervals/Intervals.java#L251-L252
   
   However, I'm also wondering if the position to / from offset mappings were already computed / stored for other purposes and readily available in OffsetRetrievalStrategy classes? If so, we could skip the re-analysis as well.
   
   Alan, could you please let me know if you have any pointer here? I'm happy to try out different approaches to find the best solution here.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org