You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@lucene.apache.org by GitBox <gi...@apache.org> on 2022/04/30 17:37:18 UTC

[GitHub] [lucene] mikemccand opened a new pull request, #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

mikemccand opened a new pull request, #858:
URL: https://github.com/apache/lucene/pull/858

   # Description
   
   Just improving testing based on the user-reported `IllegalArgumentException`, but the new tests seem to pass in my few runs ... maybe Jenkins CI builds will uncover an interesting failing seed?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand merged pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
mikemccand merged PR #858:
URL: https://github.com/apache/lucene/pull/858


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a diff in pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
mikemccand commented on code in PR #858:
URL: https://github.com/apache/lucene/pull/858#discussion_r862855939


##########
lucene/core/src/java/org/apache/lucene/util/compress/LowercaseAsciiCompression.java:
##########
@@ -111,14 +111,13 @@ public static boolean compress(byte[] in, int len, byte[] tmp, DataOutput out)
           numExceptions2++;
         }
       }
+
+      // TODO: shouldn't this really be an assert instead?  but then this real "if" triggered
+      // LUCENE-10551 so maybe it should remain a real "if":

Review Comment:
   OK lemme make a quick PR -- I'll reword the comment to reference PUAFIF and switch to `AssertionError`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a diff in pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
mikemccand commented on code in PR #858:
URL: https://github.com/apache/lucene/pull/858#discussion_r862846952


##########
lucene/core/src/java/org/apache/lucene/util/compress/LowercaseAsciiCompression.java:
##########
@@ -111,14 +111,13 @@ public static boolean compress(byte[] in, int len, byte[] tmp, DataOutput out)
           numExceptions2++;
         }
       }
+
+      // TODO: shouldn't this really be an assert instead?  but then this real "if" triggered
+      // LUCENE-10551 so maybe it should remain a real "if":

Review Comment:
   +1, that logic makes sense to me -- it's an `assert` at heart (its purpose is to find bugs in our code, not let the caller know that something went wrong with their inputs), but for (clearly well-founded!) paranoia and low-cost we upgraded it to a real `if`.
   
   But maybe to make it clear that it is a "paranoid upgraded `assert` flavored `if`" (a PUAFIF) we should throw `AssertionError` instead?  Because the user who opened the issue thought this was a case of not being able to compress the string and we should return `false`, but rather it is a case of bug-in-something.  If they saw the `AssertionError` maybe they would realize it's a bug-in-something (a BIS)?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] rmuir commented on a diff in pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
rmuir commented on code in PR #858:
URL: https://github.com/apache/lucene/pull/858#discussion_r862383315


##########
lucene/core/src/test/org/apache/lucene/util/compress/TestLowercaseAsciiCompression.java:
##########
@@ -118,4 +134,12 @@ public void testRandom() throws IOException {
       doTestCompress(bytes, len);
     }
   }
+
+  public void testAsciiCompressionRandom2() throws IOException {
+    for (int iter = 0; iter < 1000000; ++iter) {

Review Comment:
   Thanks for doing this! Maybe run this many iterations with Nightly, but use a smaller amount otherwise? Test is slow as it is:
   ```
   > Task :lucene:core:test
   :lucene:core:test (SUCCESS): 5456 test(s), 245 skipped
   
   > Task :lucene:core:wipeTaskTemp
   The slowest tests (exceeding 500 ms) during this run:
     64.55s TestLowercaseAsciiCompression.testAsciiCompressionRandom2 (:lucene:core)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a diff in pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #858:
URL: https://github.com/apache/lucene/pull/858#discussion_r862837603


##########
lucene/core/src/java/org/apache/lucene/util/compress/LowercaseAsciiCompression.java:
##########
@@ -111,14 +111,13 @@ public static boolean compress(byte[] in, int len, byte[] tmp, DataOutput out)
           numExceptions2++;
         }
       }
+
+      // TODO: shouldn't this really be an assert instead?  but then this real "if" triggered
+      // LUCENE-10551 so maybe it should remain a real "if":

Review Comment:
   Indeed this was my reasoning when I added this exception: since it should be cheap to check anyway, the exception has the benefit of letting us know about the problem even when assertions are disabled.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a diff in pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
mikemccand commented on code in PR #858:
URL: https://github.com/apache/lucene/pull/858#discussion_r862869056


##########
lucene/core/src/java/org/apache/lucene/util/compress/LowercaseAsciiCompression.java:
##########
@@ -111,14 +111,13 @@ public static boolean compress(byte[] in, int len, byte[] tmp, DataOutput out)
           numExceptions2++;
         }
       }
+
+      // TODO: shouldn't this really be an assert instead?  but then this real "if" triggered
+      // LUCENE-10551 so maybe it should remain a real "if":

Review Comment:
   https://github.com/apache/lucene/pull/861



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on a diff in pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
mikemccand commented on code in PR #858:
URL: https://github.com/apache/lucene/pull/858#discussion_r862846952


##########
lucene/core/src/java/org/apache/lucene/util/compress/LowercaseAsciiCompression.java:
##########
@@ -111,14 +111,13 @@ public static boolean compress(byte[] in, int len, byte[] tmp, DataOutput out)
           numExceptions2++;
         }
       }
+
+      // TODO: shouldn't this really be an assert instead?  but then this real "if" triggered
+      // LUCENE-10551 so maybe it should remain a real "if":

Review Comment:
   +1, that logic makes sense to me -- it's an `assert` at heart (its purpose is to find bugs in our code, not let the caller know that something went wrong with their inputs), but for (clearly well-founded!) paranoia and low-cost we upgraded it to a real `if`.
   
   But maybe to make it clear that it is a "paranoid upgraded `assert` flavored `if`" we should throw `AssertionError` instead?  Because the user who opened the issue thought this was a case of not being able to compress the string and we should return `false`, but rather it is a case of bug-in-something.  If they saw the `AssertionError` maybe they would realize it's a bug-in-something (a BIS)?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] mikemccand commented on pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
mikemccand commented on PR #858:
URL: https://github.com/apache/lucene/pull/858#issuecomment-1114025936

   And to improve testing we should add a higher level test indexing "difficult" terms through IW/blocktree so that the invocations in the test (via blocktree and not the direct low-level API) match what we saw LUCENE-10551.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org


[GitHub] [lucene] jpountz commented on a diff in pull request #858: LUCENE-10551: try to improve testing of LowercaseAsciiCompression

Posted by GitBox <gi...@apache.org>.
jpountz commented on code in PR #858:
URL: https://github.com/apache/lucene/pull/858#discussion_r862850093


##########
lucene/core/src/java/org/apache/lucene/util/compress/LowercaseAsciiCompression.java:
##########
@@ -111,14 +111,13 @@ public static boolean compress(byte[] in, int len, byte[] tmp, DataOutput out)
           numExceptions2++;
         }
       }
+
+      // TODO: shouldn't this really be an assert instead?  but then this real "if" triggered
+      // LUCENE-10551 so maybe it should remain a real "if":

Review Comment:
   This would work for me, I'm often torn between ISE and AssertionError.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@lucene.apache.org
For additional commands, e-mail: issues-help@lucene.apache.org