Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/06/13 22:59:25 UTC

[GitHub] [pinot] kkrugler commented on pull request #8878: Optimize the immutable STRING/BYTES dictionary lookup

kkrugler commented on PR #8878:
URL: https://github.com/apache/pinot/pull/8878#issuecomment-1154530171

   Re normalization: yes, you can do this, and if you do, something like Unicode Normalization Form C (NFC) would be a good choice. But then you also need to normalize all query text in the same way, of course. There's also the backwards-compatibility issue for existing segments, which were built without normalization.
   
   It's not really an issue for this optimization, though, if you assume users ingest text normalized (or not) as they see fit and the query text uses the same normalization (or not). If you do decide to always normalize, I'd suggest making that a separate PR.
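
   To illustrate the normalization point above, here is a minimal sketch using the JDK's java.text.Normalizer. It is not Pinot code: the class NfcLookupSketch and the helper toNfcUtf8 are hypothetical names made up for this example, and it assumes that values are compared as UTF-8 bytes. It shows why a byte-wise lookup only matches when the ingested value and the query text were normalized the same way.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;
import java.util.Arrays;

// Hypothetical illustration only; not part of Pinot.
class NfcLookupSketch {

    // Hypothetical helper: normalize to NFC, then encode as UTF-8 bytes.
    static byte[] toNfcUtf8(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC)
                .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String precomposed = "caf\u00E9";  // "é" as a single code point
        String decomposed = "cafe\u0301";  // "e" followed by a combining acute accent

        // Without normalization, the two spellings encode to different bytes,
        // so a byte-wise comparison treats them as different values.
        boolean rawMatch = Arrays.equals(
                precomposed.getBytes(StandardCharsets.UTF_8),
                decomposed.getBytes(StandardCharsets.UTF_8));

        // With NFC applied to both the stored value and the query text,
        // the encoded bytes are identical.
        boolean nfcMatch = Arrays.equals(toNfcUtf8(precomposed), toNfcUtf8(decomposed));

        System.out.println("raw bytes match: " + rawMatch);  // false
        System.out.println("NFC bytes match: " + nfcMatch);  // true
    }
}
```

   Normalizing only one side (ingestion or query) would leave the mismatch in place, which is why the two have to be changed together.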


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

