Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/05/20 17:43:44 UTC

[GitHub] [iceberg] kbendick commented on issue #4813: [FEATURE REQUEST] The Bloom Filter for Parquet formats is necessary

kbendick commented on issue #4813:
URL: https://github.com/apache/iceberg/issues/4813#issuecomment-1133160841

   Hi @Zhangg7723! You are right that a bloom filter in the data files would be useful.
   
   It is, however, somewhat difficult to get right: it takes a lot of tuning, and the NDV (number of distinct values) count would potentially need to be known ahead of time. Otherwise the bloom filter could waste a significant amount of space.
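
To make the sizing trade-off concrete, here is a minimal sketch using the textbook formulas for a classic Bloom filter. It is an illustration only: Parquet uses split-block Bloom filters, whose exact sizing differs, and the function names (`bloom_filter_bits`, `optimal_hash_count`, `actual_fpp`) are hypothetical helpers, not Iceberg or Parquet APIs.

```python
import math


def bloom_filter_bits(ndv: int, fpp: float) -> int:
    """Bits needed by a classic Bloom filter for `ndv` distinct values
    at a target false-positive probability `fpp` (hypothetical helper)."""
    if ndv <= 0:
        raise ValueError("ndv must be positive")
    if not 0.0 < fpp < 1.0:
        raise ValueError("fpp must be strictly between 0 and 1")
    # m = -n * ln(p) / (ln 2)^2
    return math.ceil(-ndv * math.log(fpp) / (math.log(2) ** 2))


def optimal_hash_count(bits: int, ndv: int) -> int:
    """Hash function count that minimizes the false-positive rate: k = (m / n) * ln 2."""
    return max(1, round(bits / ndv * math.log(2)))


def actual_fpp(bits: int, hashes: int, actual_ndv: int) -> float:
    """Approximate false-positive rate once `actual_ndv` values are inserted:
    (1 - e^(-k * n / m))^k."""
    return (1.0 - math.exp(-hashes * actual_ndv / bits)) ** hashes


# Filter sized for an estimated 1,000,000 distinct values at a 1% target.
planned_bits = bloom_filter_bits(1_000_000, 0.01)
planned_k = optimal_hash_count(planned_bits, 1_000_000)

# NDV overestimated by 10x: the filter is far more accurate than needed,
# which means most of its space is wasted.
overestimated = actual_fpp(planned_bits, planned_k, 100_000)

# NDV underestimated by 10x: the filter saturates and filters almost nothing.
small_bits = bloom_filter_bits(100_000, 0.01)
small_k = optimal_hash_count(small_bits, 100_000)
underestimated = actual_fpp(small_bits, small_k, 1_000_000)

print(planned_bits, planned_k, overestimated, underestimated)
```

Under these formulas a 1% target costs a little under 10 bits per distinct value, so a wrong NDV guess in either direction is expensive: too high and the file carries unused filter bytes, too low and the filter stops pruning anything.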
   
   I can say, however, that this issue is being worked on.
   
   @huaxingao from Apple is working on this and has reached out to the original PR author. I believe they are going to merge Apple's code with that of @jshmchenxi. The two of them would know more about it than I would.
   
   **TLDR** - This is an area of active work and not something that has been forgotten 🙂 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org
