You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2021/01/25 08:22:14 UTC

[GitHub] [lucene-solr] dweiss commented on a change in pull request #2238: LUCENE-9693: Hunspell: check that all flags are > 0 and fit char range

dweiss commented on a change in pull request #2238:
URL: https://github.com/apache/lucene-solr/pull/2238#discussion_r563528897



##########
File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java
##########
@@ -1299,7 +1314,12 @@ void appendFlag(char flag, StringBuilder to) {
         if (replacement.isEmpty()) {
           continue;
         }
-        flags[upto++] = (char) Integer.parseInt(replacement);
+        int flag = Integer.parseInt(replacement);
+        if (flag == 0 || flag >= Character.MAX_VALUE) { // read default flags as well
+          throw new IllegalArgumentException(

Review comment:
       It'd be great to be consistent with exceptions when parsing input - sometimes it's ParsingException, here it's IAE. 

##########
File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Dictionary.java
##########
@@ -1310,10 +1330,8 @@ void appendFlag(char flag, StringBuilder to) {
 
     @Override
     void appendFlag(char flag, StringBuilder to) {

Review comment:
       Is this intentional? Because it changes the logic of concatenation (always leaving the trailing comma). I liked the previous version better (always leaving the output neat).

##########
File path: lucene/analysis/common/src/java/org/apache/lucene/analysis/hunspell/Stemmer.java
##########
@@ -588,7 +577,7 @@ private boolean checkCondition(
   }
 
   private boolean isFlagAppendedByAffix(int affixId, char flag) {
-    if (affixId < 0) return false;
+    if (affixId < 0 || flag == 0) return false;

Review comment:
       Wouldn't it be cleaner to add a constant alias (static variable) FLAG_UNSET for 0 and replace it throughout the code where it compares to zero? You've changed it from -1 to 0 but it really doesn't make it any clearer that it's a "default" unset state. I think it would benefit from being more verbose here.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org