Posted to dev@lucenenet.apache.org by GitBox <gi...@apache.org> on 2020/07/29 17:46:07 UTC

[GitHub] [lucenenet] NightOwl888 commented on issue #296: IndexOutOfRangeException when searching

NightOwl888 commented on issue #296:
URL: https://github.com/apache/lucenenet/issues/296#issuecomment-665507225


   I traced another `IndexOutOfRangeException`, this one in the `ThaiTokenizer`, to an invalid cast from `int` to `char` that caused it to filter out surrogate pairs when it shouldn't have. This is the second such issue I have found this week, and a search of the analyzers for the string `(char)` suggests the problem affects several of them. This is definitely a bug that we will need to address.
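
To illustrate the class of bug, here is a minimal Java sketch of why casting a code point held in an `int` to `char` breaks surrogate pairs. It is not the actual `ThaiTokenizer` code; the class and method names are hypothetical and chosen only for this example. The cast keeps just the low 16 bits, so any code point above U+FFFF turns into a single unrelated `char` instead of a high/low surrogate pair.

```java
public class SurrogateCastDemo {

    // Buggy pattern: (char) keeps only the low 16 bits of the code point,
    // so a supplementary character collapses into one wrong UTF-16 unit.
    static String appendWithCast(int codePoint) {
        StringBuilder sb = new StringBuilder();
        sb.append((char) codePoint);
        return sb.toString();
    }

    // Safer pattern: appendCodePoint writes one char for BMP code points
    // and a surrogate pair for supplementary code points.
    static String appendAsCodePoint(int codePoint) {
        return new StringBuilder().appendCodePoint(codePoint).toString();
    }

    public static void main(String[] args) {
        int codePoint = 0x1F600; // a supplementary code point, outside the BMP

        String broken = appendWithCast(codePoint);
        String correct = appendAsCodePoint(codePoint);

        // The cast version is one char long and no longer holds the original value.
        System.out.println(broken.length() + " " + (broken.codePointAt(0) == codePoint));
        // The code point version is a two-char surrogate pair that round-trips.
        System.out.println(correct.length() + " " + (correct.codePointAt(0) == codePoint));
    }
}
```

On the .NET side, `char.ConvertFromUtf32(int)` plays a similar role to `appendCodePoint`. Whether that is the right fix for each `(char)` cast in the analyzers would need to be checked case by case.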
   
   It might also be useful to know whether the problem you are seeing is happening in all cultures. In Java, none of the methods are culture-sensitive, so to match the behavior we should be using the invariant culture. .NET has [several methods that are culture-sensitive by default](https://docs.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings). While we have gone through to ensure we are not calling any of them in places where we shouldn't be, there could be a case or two that were missed or were recently added. If you switch the current thread to the invariant culture, does it cause the problem to go away?

