You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2020/04/29 13:03:10 UTC

[GitHub] [commons-lang] XenoAmess opened a new pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

XenoAmess opened a new pull request #529:
URL: https://github.com/apache/commons-lang/pull/529


   see details in tests.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] garydgregory commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621877668


   @XenoAmess It would be best to test what we know is wrong instead of some blanket "Something in here goes wrong and we don't know exactly where."


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess commented on a change in pull request #529: [LANG-1545] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess commented on a change in pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#discussion_r429722793



##########
File path: src/test/java/org/apache/commons/lang3/StringUtilsTest.java
##########
@@ -3301,4 +3301,22 @@ public void testToRootUpperCase() {
             Locale.setDefault(defaultLocales);
         }
     }
+
+    @Test
+    public void testGeorgianSample() {
+        char[] arrayI = new char[]{(char) 305, (char) 1012};

Review comment:
       @garydgregory done.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] garydgregory commented on a change in pull request #529: [LANG-1545] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on a change in pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#discussion_r429681039



##########
File path: src/test/java/org/apache/commons/lang3/StringUtilsTest.java
##########
@@ -3301,4 +3301,22 @@ public void testToRootUpperCase() {
             Locale.setDefault(defaultLocales);
         }
     }
+
+    @Test
+    public void testGeorgianSample() {
+        char[] arrayI = new char[]{(char) 305, (char) 1012};

Review comment:
       Please use hexadecimal for the values with a comment that shows the Unicode name for each character. Otherwise, there is not way to tell what this is doing without knowing the Unicode values from memory.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-622387620


   @garydgregory  Is the test now acceptable? Should I rebase the two commits to 1 commit? or any other thing I shall do?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-633215575


   @garydgregory , anything we can do to get this moving?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess edited a comment on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess edited a comment on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621862537


   > Hi @XenoAmess ,
   > Can the test be simplified to test a single character?
   
   Actually this test has already been simplified from[0 , Character.MAX_VALUE] to [0,1000]
   I'm not sure if it is good to do more simplity work as it might hide some errors.
   However if you really want to simplify it, I can try to write codes to generate a list of suspicious chars pairs, and only check them.
   
   > Is this related to https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ ?
   
   I know nothing about Turkey but it seems like similar type of problem (according to your descriptions in that url.)
   I'm not a philologist though, I cannot make it for sure.
   However this bug was found on a english-system-language machine, so I thought it might be far severe than the url you posted.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess edited a comment on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess edited a comment on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621862537


   > Hi @XenoAmess ,
   > Can the test be simplified to test a single character?
   
   Actually this test has already been simplified from[0 , Character.MAX_VALUE] to [0,1000]
   I'm not sure if it is good to do more simplity work as it might hide some errors.
   However if you really want to simplify it, I can try to write codes to generate a list of suspicious chars pairs, and only check them.
   
   > Is this related to https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ ?
   
   I know nothing about Turkey but it seems like similar type of problem (according to your descriptions in that url.)
   I'm not a philologist though, I cannot make it for sure.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess edited a comment on pull request #529: [LANG-1545] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess edited a comment on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-633563513






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] coveralls edited a comment on pull request #529: [LANG-1545] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621222482


   
   [![Coverage Status](https://coveralls.io/builds/31012644/badge)](https://coveralls.io/builds/31012644)
   
   Coverage increased (+0.01%) to 95.105% when pulling **f83d19802a7d70a539fab111f59d7e55ef1f5210 on XenoAmess:fix_bug_in_StringUtils_about_Georgian** into **f6923510352fc3fbfad68bc6c5ac5258a34671b7 on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess commented on pull request #529: [LANG-1545] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-633563513


   > On that page, the U+01D3 is "Latin Capital Letter U with caron" on this GitHub page it looks like "Greek theta symbol". Which one are you testing?
   @garydgregory originally I found it with something uncorrect about [Greek Capital Letter Theta] and [Greek Theta Symbol] .
   And then I started a fully test of every pair of unicode chars and found the bug also appears when [Latin Small Letter dotless I] and [Latin Capital Letter I with dot above].
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621881530


   > @XenoAmess It would be best to test what we know is wrong instead of some blanket "Something in here goes wrong and we don't know exactly where."
   
   Agreed. Test  tighten now.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] coveralls edited a comment on pull request #529: [LANG-1545] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621222482


   
   [![Coverage Status](https://coveralls.io/builds/31004634/badge)](https://coveralls.io/builds/31004634)
   
   Coverage increased (+0.01%) to 95.105% when pulling **da4a5d971392bc189c4dfaf6928db5e0eea8d192 on XenoAmess:fix_bug_in_StringUtils_about_Georgian** into **f6923510352fc3fbfad68bc6c5ac5258a34671b7 on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621862537


   > Hi @XenoAmess ,
   > Can the test be simplified to test a single character?
   Actually this test has already been simplified from[0 , Character.MAX_VALUE] to [0,1000]
   I'm not sure if it is good to do more simplity work as it might hide some errors.
   However if you really want to simplify it, I can try to write codes to generate a list of suspicious chars pairs, and only check them.
   
   > Is this related to https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ ?
   I know nothing about Turkey but it seems like similar type of problem (according to your descriptions in that url.)
   I'm not a philologist though, I cannot make it for sure.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] coveralls commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
coveralls commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621222482


   
   [![Coverage Status](https://coveralls.io/builds/30431448/badge)](https://coveralls.io/builds/30431448)
   
   Coverage increased (+0.01%) to 95.104% when pulling **9bcdd6225d1f6e82d73b0492879ed1f01d2b1a0d on XenoAmess:fix_bug_in_StringUtils_about_Georgian** into **f6923510352fc3fbfad68bc6c5ac5258a34671b7 on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] coveralls edited a comment on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
coveralls edited a comment on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621222482


   
   [![Coverage Status](https://coveralls.io/builds/30465166/badge)](https://coveralls.io/builds/30465166)
   
   Coverage increased (+0.01%) to 95.104% when pulling **b4556b0c845e49e64b759973332ee9f4faef4a51 on XenoAmess:fix_bug_in_StringUtils_about_Georgian** into **f6923510352fc3fbfad68bc6c5ac5258a34671b7 on apache:master**.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] garydgregory commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621832930


   Hi @XenoAmess ,
   Can the test be simplified to test a single character?
   Is this related to https://garygregory.wordpress.com/2015/11/03/java-lowercase-conversion-turkey/ ?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] garydgregory commented on a change in pull request #529: [LANG-1545] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
garydgregory commented on a change in pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#discussion_r429681039



##########
File path: src/test/java/org/apache/commons/lang3/StringUtilsTest.java
##########
@@ -3301,4 +3301,22 @@ public void testToRootUpperCase() {
             Locale.setDefault(defaultLocales);
         }
     }
+
+    @Test
+    public void testGeorgianSample() {
+        char[] arrayI = new char[]{(char) 305, (char) 1012};

Review comment:
       Hi @XenoAmess
   Please use hexadecimal for the values with a comment that shows the Unicode name for each character. Otherwise, there is not way to tell what this is doing without knowing the Unicode values from memory.
   




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [commons-lang] XenoAmess commented on pull request #529: [bugfix] CharSequenceUtils.regionMatches is wrong dealing with Georgian.

Posted by GitBox <gi...@apache.org>.
XenoAmess commented on pull request #529:
URL: https://github.com/apache/commons-lang/pull/529#issuecomment-621574010


   @garydgregory Hi gary


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org