Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2020/10/16 07:25:07 UTC

[GitHub] [hudi] KarthickAN edited a comment on issue #2178: [SUPPORT] Hudi writing 10MB worth of org.apache.hudi.bloomfilter data in each of the parquet files produced

KarthickAN edited a comment on issue #2178:
URL: https://github.com/apache/hudi/issues/2178#issuecomment-709875380


   @bvaradar @nsivabalan I ran some tests around this issue. First I ran the job with hoodie.index.bloom.num_entries set to 1500000 and inspected the file it produced. It contained 1000 records totalling 165381 bytes, plus 10.2MB of bloom filter data, and the total file size was 10.2MB.
   
   I then removed the hoodie.index.bloom.num_entries setting and ran the job with the default value. This time the file held the same 1000 records (165381 bytes) but only 422KB of bloom filter data, and the total file size was 428KB.
   
   So the issue occurs when hoodie.index.bloom.num_entries is set to 1500000.
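
As a rough sanity check on these sizes, here is a minimal sketch using the textbook Bloom filter sizing formula m = -n * ln(p) / (ln 2)^2 bits. It rests on several assumptions that are not confirmed in this thread: that the filter is sized up front from the configured entry count rather than from the records actually written, that the false-positive rate is 1e-9, that the default entry count is 60000, and that the filter is stored Base64-encoded. The function names are hypothetical and are not Hudi APIs.

```python
import math


def bloom_filter_bytes(num_entries: int, fpp: float) -> int:
    """Size in bytes of an optimally sized Bloom filter (textbook formula)."""
    bits = math.ceil(-num_entries * math.log(fpp) / (math.log(2) ** 2))
    return math.ceil(bits / 8)


def base64_len(n_bytes: int) -> int:
    """Length of the padded Base64 encoding of n_bytes raw bytes."""
    return 4 * math.ceil(n_bytes / 3)


ASSUMED_FPP = 1e-9                 # assumption: false-positive rate in effect
ASSUMED_DEFAULT_ENTRIES = 60_000   # assumption: default num_entries

for n in (ASSUMED_DEFAULT_ENTRIES, 1_500_000):
    raw = bloom_filter_bytes(n, ASSUMED_FPP)
    enc = base64_len(raw)
    print(f"num_entries={n}: raw={raw} bytes, base64={enc} bytes "
          f"({enc / 1024:.0f} KiB)")
```

Under those assumptions the estimate lands close to both figures reported above (a few hundred KB for the default, roughly 10MB for 1500000), which would suggest the bloom filter size follows the configured entry count rather than the 1000 records actually in the file.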


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org