Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/16 13:04:01 UTC

[GitHub] [hudi] nsivabalan edited a comment on issue #2178: [SUPPORT] Hudi writing 10MB worth of org.apache.hudi.bloomfilter data in each of the parquet files produced

nsivabalan edited a comment on issue #2178:
URL: https://github.com/apache/hudi/issues/2178#issuecomment-710031904


   If you want a dynamic bloom filter that scales its size as the number of entries increases, you can try it out.
   Note that this is different from hoodie.index.type, which selects the index type (BLOOM, GLOBAL_BLOOM, etc.).
   The config of interest is 
   hoodie.bloom.index.filter.type = SIMPLE/DYNAMIC_V0
   
   For DYNAMIC_V0, you need to set an extra config:
   hoodie.bloom.index.filter.dynamic.max.entries
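
As a minimal sketch of how these two settings could be passed together, the snippet below builds a plain options dict. The keys are the ones named above; the value for the max-entries setting is purely illustrative, and `check_bloom_options` is a hypothetical helper written for this example, not part of Hudi.

```python
# Illustrative values only; tune them for your own table.
hudi_bloom_options = {
    "hoodie.bloom.index.filter.type": "DYNAMIC_V0",
    "hoodie.bloom.index.filter.dynamic.max.entries": "1000000",
}


def check_bloom_options(opts):
    """Return a list of problems found in the options (hypothetical helper, not a Hudi API)."""
    problems = []
    filter_type = opts.get("hoodie.bloom.index.filter.type")
    if filter_type is None:
        # Nothing to check if the filter type is left at its default.
        return problems
    if filter_type not in ("SIMPLE", "DYNAMIC_V0"):
        problems.append("unknown filter type: " + filter_type)
    if (filter_type == "DYNAMIC_V0"
            and "hoodie.bloom.index.filter.dynamic.max.entries" not in opts):
        problems.append("DYNAMIC_V0 needs hoodie.bloom.index.filter.dynamic.max.entries")
    return problems


# With a Spark DataFrame `df` (assumed to exist), the options would typically be
# passed alongside the usual Hudi write options, for example:
#   df.write.format("hudi").options(**hudi_bloom_options)...
```

Keeping the options in one dict makes it easy to confirm that the extra config is present whenever DYNAMIC_V0 is selected.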
   
   Basically, the bloom filter is initialized based on hoodie.index.bloom.num_entries. As the number of entries added reaches this value, the bloom filter dynamically scales and increases its bit size. This continues up to "hoodie.bloom.index.filter.dynamic.max.entries". Until that limit is reached, the fpp (false positive probability) will be honored. Beyond it, the fpp may not be honored, as the bloom filter may not grow any further.
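
To see why the fpp can only be honored while the filter is able to grow, here is a rough sketch using the standard textbook bloom filter formulas. It assumes an idealized filter; Hudi's actual sizing and growth logic may differ, and the function names are made up for this illustration.

```python
import math


def bloom_bits(num_entries, fpp):
    """Bits needed by an ideal bloom filter for num_entries at the target fpp."""
    return math.ceil(-num_entries * math.log(fpp) / (math.log(2) ** 2))


def optimal_hashes(bits, num_entries):
    """Number of hash functions that minimizes the false positive rate."""
    return max(1, round(bits / num_entries * math.log(2)))


def actual_fpp(bits, hashes, inserted):
    """Approximate false positive rate after `inserted` keys have been added."""
    return (1 - math.exp(-hashes * inserted / bits)) ** hashes


# Illustrative numbers only, not recommendations.
num_entries = 60000
target_fpp = 1e-9

bits = bloom_bits(num_entries, target_fpp)
hashes = optimal_hashes(bits, num_entries)

# At the configured capacity the target fpp is roughly met; once a fixed-size
# filter holds far more entries than it was sized for, the rate degrades sharply.
print("size in KB:", bits / 8 / 1024)
print("fpp at capacity:", actual_fpp(bits, hashes, num_entries))
print("fpp at 10x capacity:", actual_fpp(bits, hashes, 10 * num_entries))
```

The same arithmetic shows the trade-off in the other direction: a larger entry count or a smaller fpp means more bits stored per file.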
   

