Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/05/07 14:48:10 UTC

[GitHub] [incubator-hudi] vinothchandar commented on issue #666: Add support for dynamic bloom filter to increase efficiency of bloom filter for static sizing

vinothchandar commented on issue #666: Add support for dynamic bloom filter to increase efficiency of bloom filter for static sizing
URL: https://github.com/apache/incubator-hudi/pull/666#issuecomment-490112102
 
 
   >there is one on size and both of them are close to 2MB, I actually rounded them off to the near megabyte, there may be differences in kilobytes.
   
   Can we test with N=`500000` and fp=`0.000000001`, and then at 10x/100x that? I think that will produce larger sizes and more false positives. I would be surprised if dynamic produces far fewer false positives with the same number of bits; all it should be doing is using more bits as more entries come in. You can use something like https://krisives.github.io/bloom-calculator/ to design a case around this.
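
For designing such a case without the calculator, here is a rough sketch of the standard sizing formulas for a classic (static) Bloom filter: m = -n ln(p) / (ln 2)^2 bits and k = (m/n) ln 2 hash functions. The function name `bloom_size` is hypothetical and this is not Hudi's implementation; reading "10x/100x" as scaling N while holding fp fixed is also an assumption.

```python
import math


def bloom_size(n, p):
    """Return (bits, hash_count) for a classic Bloom filter.

    n: expected number of entries; p: target false-positive probability.
    Uses the textbook optimum, not any particular library's rounding.
    """
    if n <= 0 or not (0.0 < p < 1.0):
        raise ValueError("need n > 0 and 0 < p < 1")
    # Optimal number of bits for n entries at false-positive rate p.
    bits = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    # Optimal number of hash functions for that bits-per-entry ratio.
    hashes = max(1, round(bits / n * math.log(2)))
    return bits, hashes


if __name__ == "__main__":
    # Assumption: 10x/100x means scaling N with fp unchanged.
    for factor in (1, 10, 100):
        n = 500_000 * factor
        bits, hashes = bloom_size(n, 1e-9)
        print(f"N={n:,} bits={bits:,} (~{bits / 8 / 1e6:.1f} MB) k={hashes}")
```

With fp held fixed, the bit count grows linearly in N, so a static filter sized for the base N will be under-provisioned at 10x/100x; that is the regime where a dynamic filter would have to spend more bits to hold its false-positive rate.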
   
   I think we have to do option 1, right? In option 2 we'd also be reading old and new files with different filter formats, right? Do we handle an exception and detect dynamic vs. normal bf?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services