Posted to dev@pinot.apache.org by GitBox <gi...@apache.org> on 2018/11/29 04:42:46 UTC

[GitHub] snleee commented on issue #3528: Adding support for bloom filter

URL: https://github.com/apache/incubator-pinot/pull/3528#issuecomment-442704522
 
 
   When I tried to cap the bloom filter size at 1MB (by computing the maximum false positive probability from the formula), I found that the clearspring implementation does not behave as expected for high-cardinality cases, while the Guava implementation does. Please compare the bloom filter sizes below: Guava's size is correctly capped at <1MB, while the clearspring filter can grow well beyond that. It seems that Guava's implementation is more robust when capping the maximum size of the bloom filter. @kishoreg 
   
   ```
   cardinality: 1,000,000
   maxFalsePosProbability: 0.05
   
   
   numBitsRequired (Estimated): 6235225
   requiredSize (Estimated): 779403
   clear spring size: 875085
   Guava size: 779414
   Roaring bitmap size: 507068
   
   ---------------------------------------------
   
   cardinality: 3,000,000
   numHashFunction: 2
   maxFalsePosProbability: 0.2610525068636746
   
   numBitsRequired (Estimated): 8386047
   requiredSize (Estimated): 1048255
   clear spring size: 1125085
   Guava size: 1048262
   Roaring bitmap size: 742946
   
   ---------------------------------------------
   
   cardinality: 5,000,000
   numHashFunction: 1
   maxFalsePosProbability: 0.4490143136505804
   
   
   numBitsRequired (Estimated): 8332767
   requiredSize (Estimated): 1041595
   clear spring size: 625085
   Guava size: 1041606
   Roaring bitmap size: 151554
   
   ---------------------------------------------
   
   cardinality: 10,000,000
   numHashFunction: 1
   maxFalsePosProbability: 0.696414773438059
   
   numBitsRequired (Estimated): 7530599
   requiredSize (Estimated): 941324
   clear spring size: 1250085
   Guava size: 941334
   
   ---------------------------------------------
   
   cardinality: 30,000,000
   numHashFunction: 1
   maxFalsePosProbability: 0.9720203742797628
   
   numBitsRequired (Estimated): 1771985
   requiredSize (Estimated): 221498
   clear spring size: 3750085
   Guava size: 221510
   
   ---------------------------------------------
   
   cardinality: 50,000,000
   numHashFunction: 1
   maxFalsePosProbability: 0.9974212860608853
   
   numBitsRequired (Estimated): 268710
   requiredSize (Estimated): 33588
   clear spring size: 6250085
   Guava size: 33598
   ```
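
The comment does not spell out the formula it used, so the following is only a sketch of one way to derive such numbers, using the standard bloom filter relations. It assumes a 1MB budget of 8 * 1024 * 1024 bits, picks the hash-function count k = max(1, round(m/n * ln 2)), and computes the false positive probability as (1 - e^(-k*n/m))^k. These relations appear to reproduce the `numHashFunction` and `maxFalsePosProbability` values listed above, but that match is an observation, not something the comment confirms. The class and method names (`BloomSizing`, `numHashFunctions`, `falsePositiveProbability`, `numBitsRequired`) are hypothetical and belong to neither Guava nor clearspring.

```java
// Hypothetical helper; not part of Guava, clearspring, or Pinot.
final class BloomSizing {
    private static final double LN2 = Math.log(2.0);

    /** Hash-function count that minimises the false positive rate for m bits and n entries (at least 1). */
    static int numHashFunctions(long n, long mBits) {
        return Math.max(1, (int) Math.round((double) mBits / n * LN2));
    }

    /** Expected false positive probability for n entries, m bits and k hash functions. */
    static double falsePositiveProbability(long n, long mBits, int k) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / mBits), k);
    }

    /** Bits needed for n entries at probability p, assuming an optimal k: -n * ln(p) / (ln 2)^2. */
    static long numBitsRequired(long n, double p) {
        return (long) Math.ceil(-n * Math.log(p) / (LN2 * LN2));
    }

    public static void main(String[] args) {
        long budgetBits = 8L * 1024 * 1024; // assumed 1MB cap, expressed in bits
        long[] cardinalities = {3_000_000L, 5_000_000L, 10_000_000L, 30_000_000L, 50_000_000L};
        for (long n : cardinalities) {
            int k = numHashFunctions(n, budgetBits);
            double p = falsePositiveProbability(n, budgetBits, k);
            long bits = numBitsRequired(n, p);
            System.out.printf("cardinality=%d k=%d maxFpp=%.6f bits=%d bytes=%d%n",
                    n, k, p, bits, bits / 8);
        }
    }
}
```

Under these assumptions, once the cardinality exceeds the bit budget the probability approaches 1 and the estimated bit count shrinks, which would be consistent with the Guava sizes dropping in the last two cases.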
