Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2022/09/09 22:12:37 UTC

[GitHub] [commons-collections] aherbert commented on pull request #331: Collections 763: Remove BloomFilter constructors that create initial entry

aherbert commented on PR #331:
URL: https://github.com/apache/commons-collections/pull/331#issuecomment-1242519168

   @Claudenw I merged this locally and was doing some minor code clean-up. It seems that there is a flaky test:
   
   DefaultIndexProducerTest.testAsIndexArray
   
   In AbstractIndexProducerTest.testAsIndexArray the producer is written to a list using asIndexArray. Then the producer's forEach is called to remove entries from the list. However, the default implementation of IndexProducer.asIndexArray removes duplicates (as it uses a BitSet to collate the indices). So the List<Integer> can have fewer entries than the number of indices produced by forEach (which includes the duplicates).
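
   To illustrate the mismatch, here is a minimal standalone sketch. It does not use the real IndexProducer interface or the actual test code. The class and method names (AsIndexArrayDuplicates, collateWithBitSet, failedRemovals) are hypothetical, and the BitSet collation only mimics the behaviour described above.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for the scenario in the test; not the library code.
class AsIndexArrayDuplicates {

    /** Collates indices through a BitSet, which drops duplicates (and sorts). */
    static int[] collateWithBitSet(int[] indices) {
        BitSet bits = new BitSet();
        for (int i : indices) {
            bits.set(i);
        }
        // BitSet.stream() yields the index of every set bit in ascending order.
        return bits.stream().toArray();
    }

    /**
     * Fills a list from the collated array, then removes each produced index
     * once, as a forEach over the raw (duplicated) indices would do.
     * Returns how many removals found nothing to remove.
     */
    static int failedRemovals(int[] produced) {
        List<Integer> list = new ArrayList<>();
        for (int i : collateWithBitSet(produced)) {
            list.add(i);
        }
        int failures = 0;
        for (int i : produced) {
            // Integer.valueOf selects remove(Object) rather than remove(int index).
            if (!list.remove(Integer.valueOf(i))) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        int[] produced = {7, 42, 7, 300}; // index 7 is produced twice
        System.out.println("collated size   = " + collateWithBitSet(produced).length);
        System.out.println("failed removals = " + failedRemovals(produced));
    }
}
```

   With one repeated index the collated array has 3 entries but 4 removals are attempted, so one removal fails. A test that asserts every removal succeeds will only pass when the producer happens to emit no duplicates.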
   
   You are generating 10 values in the range [0, 512). The probability of no duplicates is (512)(511)(510)...(504)(503) / 512^10 = 0.915, so the probability of at least one duplicate is around 8.5%. If you check out the current PR branch and run DefaultIndexProducerTest repeatedly, you will need on the order of 10 runs (1 / 0.085 is about 12) to see a failure. This is infrequent enough that the 4 CI builds may all pass.
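
   The arithmetic can be checked with a short sketch. The class and method names are hypothetical, and the 4-build figure assumes each CI build runs the test once with independent random values, which may not match the real setup.

```java
// Hypothetical helper to check the duplicate probability quoted above.
class DuplicateProbability {

    /** Probability that n uniform draws from [0, range) are all distinct. */
    static double probabilityNoDuplicates(int n, int range) {
        double p = 1.0;
        for (int i = 0; i < n; i++) {
            // The i-th draw must avoid the i values already taken.
            p *= (double) (range - i) / range;
        }
        return p;
    }

    public static void main(String[] args) {
        double pUnique = probabilityNoDuplicates(10, 512); // roughly 0.915
        double pDuplicate = 1.0 - pUnique;                 // roughly 0.085
        System.out.printf("P(no duplicates)       = %.4f%n", pUnique);
        System.out.printf("P(at least one dup)    = %.4f%n", pDuplicate);
        System.out.printf("expected runs to fail  = %.1f%n", 1.0 / pDuplicate);
        // Assumption: 4 independent builds, one test run each.
        System.out.printf("P(all 4 builds pass)   = %.2f%n", Math.pow(pUnique, 4));
    }
}
```

   Under that independence assumption, all 4 builds pass most of the time, which is consistent with the failure not showing up in CI.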
   
   Q. Should asIndexArray eliminate duplicates? If this is the correct functionality then I can fix this test to handle the duplicates in forEach and complete the merge.
   
    

