Posted to dev@parquet.apache.org by GitBox <gi...@apache.org> on 2020/05/20 05:07:55 UTC

[GitHub] [parquet-mr] ggershinsky commented on a change in pull request #776: PARQUET-1229: Parquet MR encryption

ggershinsky commented on a change in pull request #776:
URL: https://github.com/apache/parquet-mr/pull/776#discussion_r427743861



##########
File path: parquet-hadoop/src/main/java/org/apache/parquet/crypto/AesCipher.java
##########
@@ -68,19 +67,32 @@
 
   public static byte[] createModuleAAD(byte[] fileAAD, ModuleType moduleType, 
       short rowGroupOrdinal, short columnOrdinal, short pageOrdinal) {

Review comment:
   Links to the earlier discussion on this:
   https://github.com/apache/parquet-format/pull/114#discussion_r234022579
   https://github.com/apache/parquet-format/pull/114#discussion_r232941138
   http://mail-archives.apache.org/mod_mbox/parquet-dev/201901.mbox/%3CCAO4re1kM4xGMNT4CGrjvA43t-QgUmUwLMskTJfd8ivgCfF8rSw%40mail.gmail.com%3E
   
   The parquet-cpp approach is to allow any number of row groups in unencrypted files, and to limit encrypted files to 32K row groups (the range of the short ordinals used in the module AAD):
   https://github.com/apache/arrow/commit/0c5168cf203ddf94dff01c178d649853323acbb8
   "While writing files with so many row groups is a bad idea, people will still do it... This .. enables reading the many-row-group files again. Files with encrypted row group metadata with that many row groups cannot be read"
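
   To illustrate the limit, here is a minimal sketch of the kind of range check an encrypting writer could apply before passing a row group index to `createModuleAAD`. The class and method names are hypothetical and are not part of this PR or of parquet-cpp; the only assumption taken from the code above is that the ordinal is a Java `short`.

```java
// Hypothetical helper for illustration only; not part of parquet-mr.
class RowGroupOrdinalCheck {

    // Converts a row group index to the short ordinal used in the module AAD.
    // Rejects indexes that do not fit, instead of letting the cast wrap around.
    static short toShortOrdinal(int rowGroupIndex) {
        if (rowGroupIndex < 0 || rowGroupIndex > Short.MAX_VALUE) {
            throw new IllegalArgumentException(
                "Encrypted files support row group ordinals up to "
                    + Short.MAX_VALUE + ", got " + rowGroupIndex);
        }
        return (short) rowGroupIndex;
    }

    public static void main(String[] args) {
        // Largest ordinal that fits in a short.
        System.out.println(toShortOrdinal(32767));

        // Without the check, a plain cast silently wraps to a negative value.
        System.out.println((short) 32768); // prints -32768

        try {
            toShortOrdinal(32768);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

   Failing fast on the write path avoids producing AADs from wrapped (negative) ordinals. Unencrypted files never compute an AAD, so they would not need this check.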



