lucenenet-dev mailing list archives

From GitBox <...@apache.org>
Subject [GitHub] [lucenenet] NightOwl888 commented on issue #296: IndexOutOfRangeException when searching
Date Wed, 29 Jul 2020 17:46:07 GMT

NightOwl888 commented on issue #296:
URL: https://github.com/apache/lucenenet/issues/296#issuecomment-665507225


   I traced another `IndexOutOfRangeException` in the `ThaiTokenizer` to an invalid cast
from `int` to `char`, which was causing it to filter out surrogate pairs when it shouldn't
have been. This is the second such issue I found this week, and a search through the
analyzers for the string `(char)` suggests the problem affects several of them. This is
definitely a bug that we will need to address.
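
   To illustrate the kind of cast involved, here is a minimal sketch in Java (the language
Lucene was ported from). It is not the `ThaiTokenizer` code; the class and method names are
hypothetical. It only shows that narrowing an `int` code point above U+FFFF to `char` keeps
the low 16 bits, so the surrogate pair is lost. An unchecked `(char)` cast in C# truncates in
the same way.

```java
public class SurrogateCastDemo {
    // Problematic pattern: narrowing an int code point to char keeps only
    // the low 16 bits, so supplementary characters are silently corrupted.
    public static String appendWithCast(int codePoint) {
        StringBuilder sb = new StringBuilder();
        sb.append((char) codePoint);
        return sb.toString();
    }

    // Safer pattern: appendCodePoint writes a surrogate pair when the
    // code point does not fit in a single UTF-16 code unit.
    public static String appendAsCodePoint(int codePoint) {
        return new StringBuilder().appendCodePoint(codePoint).toString();
    }

    public static void main(String[] args) {
        int codePoint = 0x1F600; // a supplementary code point (above U+FFFF)

        String truncated = appendWithCast(codePoint);
        String correct = appendAsCodePoint(codePoint);

        // The cast produced one unrelated char, U+F600.
        System.out.println(Integer.toHexString(truncated.charAt(0))); // f600

        // The code-point-aware version produced a two-char surrogate pair.
        System.out.println(correct.length());                     // 2
        System.out.println(correct.codePointAt(0) == codePoint);  // true
    }
}
```

   In .NET, `char.ConvertFromUtf32(codePoint)` plays the role that `appendCodePoint` plays
here, so a `(char)` cast applied to a value that may be a full code point is worth reviewing.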
   
   It would also be useful to know whether the problem you are seeing happens in all
cultures. In Java, none of the methods are culture-sensitive, so to match the behavior we
should be using the invariant culture. .NET has [several methods that are culture-sensitive
by default](https://docs.microsoft.com/en-us/dotnet/standard/base-types/best-practices-strings).
While we have gone through the code to ensure we are not calling any of them in places where
we shouldn't be, a case or two could have been missed or recently added. If you switch the
current thread to the invariant culture, does the problem go away?




