Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2022/04/29 23:01:04 UTC

[GitHub] [hudi] yihua commented on issue #5081: [SUPPORT] flink hudi produce parquet file size more than 128M

yihua commented on issue #5081:
URL: https://github.com/apache/hudi/issues/5081#issuecomment-1113850850

   @Guanpx thanks for helping.
   
   As @Guanpx mentioned, `write.parquet.max.file.size` provides the approximate target size for the Parquet files Hudi produces, and you can make it smaller. Are you using Flink SQL to write the Hudi table? `write.parquet.max.file.size` is a Flink SQL-specific config; `hoodie.parquet.max.file.size` achieves the same goal for the other write flows, e.g., Spark.
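   
   For reference, here is a minimal Flink SQL sketch (not from this issue; the table name, schema, path, and size value are all hypothetical) showing where the option goes. Note that this Flink option is expressed in MB, while the Spark-side `hoodie.parquet.max.file.size` is expressed in bytes:
   
   ```sql
   -- Hypothetical sink table; only the WITH options matter here.
   CREATE TABLE hudi_sink (
     id   BIGINT,
     name STRING,
     ts   TIMESTAMP(3)
   ) WITH (
     'connector' = 'hudi',
     'path' = 'file:///tmp/hudi/hudi_sink',
     -- Target roughly 64 MB Parquet files; the value is in MB
     -- for this Flink config.
     'write.parquet.max.file.size' = '64'
   );
   ```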
   
   In general, the file size does not affect whether the queried data contains duplicates.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@hudi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org