Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2019/10/28 06:08:27 UTC

[GitHub] [incubator-hudi] nsivabalan opened a new pull request #976: [HUDI-106] Adding support for DynamicBloomFilter

nsivabalan opened a new pull request #976: [HUDI-106] Adding support for DynamicBloomFilter
URL: https://github.com/apache/incubator-hudi/pull/976
 
 
   - Adding support for DynamicBloomFilter ([link](https://hadoop.apache.org/docs/r2.4.1/api/org/apache/hadoop/util/bloom/DynamicBloomFilter.html)) so that the bloom filter size can be tuned based on the total number of entries.
     - Added a BloomFilter interface and two implementations: SimpleBloomFilter (the existing one) and HudiDynamicBloomFilter (the new one).
     - Added a BloomFilterFactory that creates the right BloomFilter implementation based on the version.
     - The version is stored in the Parquet footer metadata. If no version is found, a SimpleBloomFilter is created.
     - Introduced a config named "hoodie.bloom.index.auto.tune.enable" in HoodieIndexConfig. When it is enabled, new bloom filters are created as HudiDynamicBloomFilter.
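
To illustrate the design described above, here is a minimal, self-contained sketch of a filter interface, a fixed-size and a growing implementation, and a version-based factory. It is not the code from this pull request: the names `BloomFilterSketch`, `FixedFilter`, `GrowingFilter`, `VERSION_KEY`, the version strings, and all sizes are hypothetical, and the hashing is a toy stand-in for the Hadoop classes. The growing filter assumes the usual dynamic bloom filter behaviour of starting a new row once the current one has recorded a set number of keys.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.Map;

// Illustrative only: none of these names are taken from the actual pull request.
class BloomFilterSketch {

  interface BloomFilter {
    void add(String key);
    boolean mightContain(String key);
  }

  /** Fixed-size filter: the bit array never grows. */
  static class FixedFilter implements BloomFilter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;

    FixedFilter(int numBits, int numHashes) {
      this.bits = new BitSet(numBits);
      this.numBits = numBits;
      this.numHashes = numHashes;
    }

    // Toy double hashing: derive the i-th bit position from two base hashes.
    private int position(String key, int i) {
      int h1 = key.hashCode();
      int h2 = (h1 >>> 16) | 1;
      return Math.floorMod(h1 + i * h2, numBits);
    }

    public void add(String key) {
      for (int i = 0; i < numHashes; i++) {
        bits.set(position(key, i));
      }
    }

    public boolean mightContain(String key) {
      for (int i = 0; i < numHashes; i++) {
        if (!bits.get(position(key, i))) {
          return false;
        }
      }
      return true;
    }
  }

  /** Growing filter: opens a new fixed-size row once the current row holds maxKeysPerRow keys. */
  static class GrowingFilter implements BloomFilter {
    private final List<FixedFilter> rows = new ArrayList<>();
    private final int bitsPerRow;
    private final int numHashes;
    private final int maxKeysPerRow;
    private int keysInCurrentRow = 0;

    GrowingFilter(int bitsPerRow, int numHashes, int maxKeysPerRow) {
      this.bitsPerRow = bitsPerRow;
      this.numHashes = numHashes;
      this.maxKeysPerRow = maxKeysPerRow;
      rows.add(new FixedFilter(bitsPerRow, numHashes));
    }

    public void add(String key) {
      if (keysInCurrentRow == maxKeysPerRow) {
        rows.add(new FixedFilter(bitsPerRow, numHashes));
        keysInCurrentRow = 0;
      }
      rows.get(rows.size() - 1).add(key);
      keysInCurrentRow++;
    }

    // A key may be present if any row reports it.
    public boolean mightContain(String key) {
      for (FixedFilter row : rows) {
        if (row.mightContain(key)) {
          return true;
        }
      }
      return false;
    }

    int rowCount() {
      return rows.size();
    }
  }

  // Hypothetical footer key and version strings.
  static final String VERSION_KEY = "bloom.filter.version";
  static final String SIMPLE = "simple";
  static final String DYNAMIC = "dynamic";

  /** Version to record for a newly written filter, driven by the auto-tune flag. */
  static String versionForNewFilter(boolean autoTuneEnabled) {
    return autoTuneEnabled ? DYNAMIC : SIMPLE;
  }

  /** Picks the implementation from footer metadata; a missing version falls back to the fixed filter. */
  static BloomFilter fromFooter(Map<String, String> footer) {
    String version = footer.get(VERSION_KEY);
    if (version == null || SIMPLE.equals(version)) {
      return new FixedFilter(1024, 3);
    }
    if (DYNAMIC.equals(version)) {
      return new GrowingFilter(1024, 3, 10);
    }
    throw new IllegalArgumentException("Unknown bloom filter version: " + version);
  }
}
```

Falling back to the fixed filter when the version is absent is what would keep files written before this change readable, since they carry no version entry in their footer.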
   
