You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/07/21 18:41:18 UTC

[GitHub] [iceberg] huaxingao opened a new pull request, #5329: Improve bloom filter test for false positive case

huaxingao opened a new pull request, #5329:
URL: https://github.com/apache/iceberg/pull/5329

   I saw bloom filter test `Assert.assertFalse("Should not read: ...", shouldRead)` failed with false positive.
   
   This PR is to make the bloom filter test less flaky by taking consideration of fpp. 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue merged pull request #5329: Improve bloom filter test

Posted by GitBox <gi...@apache.org>.
rdblue merged PR #5329:
URL: https://github.com/apache/iceberg/pull/5329


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5329: Improve bloom filter test for false positive case

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5329:
URL: https://github.com/apache/iceberg/pull/5329#issuecomment-1192060669

   While this makes sense, I wonder if its even worth to assert that number of false positives is less than some number?  Each test asserting there's no false negatives may be good enough?
   
   I assume that there is still a failure chance, and it may not really be suitable for unit test to be run each time?  Curious what other parquet or other projects do.  Also wondering what others think.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #5329: Improve bloom filter test for false positive case

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #5329:
URL: https://github.com/apache/iceberg/pull/5329#issuecomment-1194439521

   Rather than adding code for false positives, wouldn't it be better to make the test use a consistent random seed that doesn't have a false positive? You'd need to replace `UUID.randomUUID` with a UUID built from random longs instead, but I think that would be better.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] huaxingao commented on pull request #5329: Improve bloom filter test for false positive case

Posted by GitBox <gi...@apache.org>.
huaxingao commented on PR #5329:
URL: https://github.com/apache/iceberg/pull/5329#issuecomment-1194708702

   @rdblue Thanks for the suggestion! Done.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on pull request #5329: Improve bloom filter test

Posted by GitBox <gi...@apache.org>.
rdblue commented on PR #5329:
URL: https://github.com/apache/iceberg/pull/5329#issuecomment-1195763173

   Thanks, @huaxingao!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] szehon-ho commented on pull request #5329: Improve bloom filter test for false positive case

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on PR #5329:
URL: https://github.com/apache/iceberg/pull/5329#issuecomment-1193071836

   Yea I'm not a huge fan of having random failures still possible in build tests but could go either way on it.  Also curious with your buffer what is the estimated chance of failure ?  (like is it astronomically small )
   
   Wonder any thoughts from @kbendick @rdblue @RussellSpitzer 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] huaxingao commented on pull request #5329: Improve bloom filter test for false positive case

Posted by GitBox <gi...@apache.org>.
huaxingao commented on PR #5329:
URL: https://github.com/apache/iceberg/pull/5329#issuecomment-1192716851

   I think besides asserting no false negative, we probably still need to have a negative test for each of the data types? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] huaxingao commented on pull request #5329: Improve bloom filter test

Posted by GitBox <gi...@apache.org>.
huaxingao commented on PR #5329:
URL: https://github.com/apache/iceberg/pull/5329#issuecomment-1195768193

   Thank you very much! @rdblue @szehon-ho 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org