Posted to commits@hudi.apache.org by "alexeykudinkin (via GitHub)" <gi...@apache.org> on 2023/02/06 17:32:20 UTC

[GitHub] [hudi] alexeykudinkin commented on issue #7817: Hudi table compression encoding problem

alexeykudinkin commented on issue #7817:
URL: https://github.com/apache/hudi/issues/7817#issuecomment-1419464212

   @DavidZ1 historically, Hudi has not infixed the compression codec name into the filename. While this is technically feasible, it would be a considerable change to Hudi's filename format, and there would need to be a very clear benefit to warrant the required migration.
   
   A few things to note:
    - Tables are not required to be homogeneous with respect to compression codec (the codec is actually bound at the level of an individual Parquet column, so you can even have different columns compressed with different codecs). This is especially handy when migrating from one codec to another.
   
    - It's actually very easy to determine the compression codec of a given Parquet file using the `parquet-tools meta` utility.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org