Posted to issues@commons.apache.org by GitBox <gi...@apache.org> on 2022/09/09 22:29:44 UTC

[GitHub] [commons-collections] Claudenw commented on pull request #331: Collections 763: Remove BloomFilter constructors that create initial entry

Claudenw commented on PR #331:
URL: https://github.com/apache/commons-collections/pull/331#issuecomment-1242528177

   There is a method in DefaultIndexProducerTest that removes duplicates from
   the original array and sorts it so that assertArrayEquals can work.
   
   In other tests I ran the main test in a loop 5k times to try to hit such
   problems. I must have missed this one.
   
   
   On Fri, Sep 9, 2022, 23:12 Alex Herbert ***@***.***> wrote:
   
   > @Claudenw <https://github.com/Claudenw> I merged this locally and was
   > doing some minor code clean-up. It seems that there is a flaky test:
   >
   > DefaultIndexProducerTest.testAsIndexArray
   >
   > In AbstractIndexProducerTest.testAsIndexArray the producer is written to a
   > list using asIndexArray. Then the producer forEach is called to remove
   > entries from the list. However the default implementation of
   > IndexProducer.asIndexArray removes duplicates (as it uses a BitSet to
   > collate the indices). So the List can have fewer entries than the number
   > produced by forEach (which contains the duplicates).
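
The mismatch described above can be reproduced without the library. The following is a minimal sketch, not the project's test code: the class, the method names, and the sample indices are all hypothetical, and only java.util classes are used. It collates indices through a BitSet, fills a list from the result, then removes one entry per produced index, as the test is described as doing.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-alone demo; it does not use the commons-collections API.
class DuplicateIndexDemo {
    // Stand-in for the raw indices a producer emits; index 7 appears twice.
    static final int[] PRODUCED = {3, 7, 7, 42};

    // Collates indices through a BitSet, which drops duplicates and
    // returns the remaining indices in ascending order.
    static int[] asIndexArrayViaBitSet(int[] produced) {
        BitSet bits = new BitSet();
        for (int index : produced) {
            bits.set(index);
        }
        return bits.stream().toArray();
    }

    // Fills a list from the de-duplicated array, then removes one entry
    // for every produced index. Returns false if any removal finds nothing.
    static boolean removalSucceedsForEveryIndex(int[] produced) {
        List<Integer> list = new ArrayList<>();
        for (int index : asIndexArrayViaBitSet(produced)) {
            list.add(index);
        }
        boolean allRemoved = true;
        for (int index : produced) {
            // Integer.valueOf selects remove(Object), not remove(int position).
            allRemoved &= list.remove(Integer.valueOf(index));
        }
        return allRemoved;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(asIndexArrayViaBitSet(PRODUCED))); // [3, 7, 42]
        System.out.println(removalSucceedsForEveryIndex(PRODUCED));           // false
    }
}
```

With no repeated index in the input the removal check returns true, which is why the failure only shows up on runs that happen to generate a duplicate.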
   >
   > You are generating 10 values in the range [0, 512). The likelihood of no
   > duplicates is (512)(511)(510)...(504)(503) / 512^10 = 0.915. So the
   > probability of a duplicate is around 8.5%. If you check out this current PR
   > branch and run the DefaultIndexProducerTest repeatedly you will need to do
   > it about 10 times to see a failure. This is low enough that the 4 CI builds
   > may not hit a failure.
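
The figures above can be checked with a few lines of code. This is a sketch that assumes the 10 values are drawn independently and uniformly from [0, 512); the class and method names are hypothetical.

```java
// Hypothetical helper for checking the estimate; not part of the project.
class DuplicateProbability {
    // Probability that k independent uniform draws from n values are all
    // distinct: (n/n) * ((n-1)/n) * ... * ((n-k+1)/n).
    static double probabilityAllDistinct(int n, int k) {
        double p = 1.0;
        for (int i = 0; i < k; i++) {
            p *= (double) (n - i) / n;
        }
        return p;
    }

    public static void main(String[] args) {
        double noDuplicates = probabilityAllDistinct(512, 10);
        double atLeastOne = 1.0 - noDuplicates;
        System.out.printf("no duplicates:          %.3f%n", noDuplicates);
        System.out.printf("at least one duplicate: %.3f%n", atLeastOne);
        // Mean of a geometric distribution: expected runs until first failure.
        System.out.printf("expected runs to fail:  %.1f%n", 1.0 / atLeastOne);
    }
}
```

Under that assumption the expected number of runs before the first failure is 1 / 0.085, or roughly a dozen, which agrees with the estimate of about 10 runs.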
   >
   > Q. Should asIndexArray eliminate duplicates? If this is the correct
   > functionality then I can fix this test to handle the duplicates in forEach
   > and complete the merge.
   

