You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@parquet.apache.org by "gszadovszky (via GitHub)" <gi...@apache.org> on 2023/03/17 10:18:38 UTC

[GitHub] [parquet-format] gszadovszky commented on pull request #195: PARQUET-2256: Add BloomFilter Compression

gszadovszky commented on PR #195:
URL: https://github.com/apache/parquet-format/pull/195#issuecomment-1473613246

   @mapleFU, I have discovered two unfortunate issues with the format definition of bloom filters that would be nice to be corrected before adding this change. (I am also fine solving these inside this PR.):
   * We should not copy-paste parts of the thift file in the documentation. Why whould we have them in two places? I would suggest only referencing the related thrift part from the bloom filter spec or simply remove the related part.
   * Since `CompressionCodec` already has a value of `UNCOMPRESSED` the enum `BloomFilterCompression` looks wierd. I do not have a much better solution for this since we must keep backward compatibility. What do you think about renaming `COMPRESSED` to `COMPRESSION_CODEC` and deprecate `BloomFilterCompression.UNCOMPRESSED` with a note to use `COMPRESSION_CODEC = UNCOMPRESSED` instead. (Of course from the implementation read point of view we need to handle both.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@parquet.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org