Posted to commits@parquet.apache.org by zi...@apache.org on 2018/09/27 15:27:12 UTC

[parquet-format] branch encryption updated (ea74204 -> afc943e)

This is an automated email from the ASF dual-hosted git repository.

zivanfi pushed a change to branch encryption
in repository https://gitbox.apache.org/repos/asf/parquet-format.git.


    from ea74204  [maven-release-plugin] prepare for next development iteration
     new 7db3f45  PARQUET-1227: Thrift crypto metadata structures (#94)
     new 52d896e  PARQUET-1398: move iv_prefix to Algorithms (#103)
     new afc943e  PARQUET-1401: optional RowGroup fields for handling hidden columns (#104)

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 src/main/thrift/parquet.thrift | 66 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 66 insertions(+)


[parquet-format] 03/03: PARQUET-1401: optional RowGroup fields for handling hidden columns (#104)

Posted by zi...@apache.org.

zivanfi pushed a commit to branch encryption
in repository https://gitbox.apache.org/repos/asf/parquet-format.git

commit afc943ef3f0d5aa45c0b6817e81ea4a93281d898
Author: ggershinsky <gg...@users.noreply.github.com>
AuthorDate: Tue Aug 28 15:56:59 2018 +0300

    PARQUET-1401: optional RowGroup fields for handling hidden columns (#104)
---
 src/main/thrift/parquet.thrift | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index e3857aa..c05e871 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -725,6 +725,13 @@ struct RowGroup {
    * The sorting columns can be a subset of all the columns.
    */
   4: optional list<SortingColumn> sorting_columns
+
+  /** Byte offset from beginning of file to first page (data or dictionary)
+   * in this row group **/
+  5: optional i64 file_offset
+
+  /** Total byte size of all compressed column data in this row group **/
+  6: optional i64 total_compressed_size
 }
 
 /** Empty struct to signal the order defined by the physical or logical type */
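
A minimal sketch of how a reader might use the two new RowGroup fields, for
example to locate a row group's bytes when per-column metadata is not
readable. RowGroupInfo and row_group_byte_range are hypothetical names, not
part of parquet-format or any Parquet library, and the sketch assumes the
row group's compressed column data is stored contiguously starting at
file_offset.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RowGroupInfo:
    """Hypothetical stand-in for the two optional RowGroup fields added here."""
    file_offset: Optional[int] = None            # field 5 in the diff
    total_compressed_size: Optional[int] = None  # field 6 in the diff


def row_group_byte_range(rg: RowGroupInfo) -> Optional[Tuple[int, int]]:
    """Return (start, end) byte offsets, end exclusive.

    Returns None when either optional field is unset, e.g. for files
    written before these fields existed.
    Assumption: the column data is contiguous from file_offset.
    """
    if rg.file_offset is None or rg.total_compressed_size is None:
        return None
    return rg.file_offset, rg.file_offset + rg.total_compressed_size
```

Because both fields are optional, a caller has to handle the None case and
fall back to whatever per-column offsets it can read.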


[parquet-format] 01/03: PARQUET-1227: Thrift crypto metadata structures (#94)

Posted by zi...@apache.org.

zivanfi pushed a commit to branch encryption
in repository https://gitbox.apache.org/repos/asf/parquet-format.git

commit 7db3f451c84203d9801ef394a55202755906019a
Author: ggershinsky <gg...@users.noreply.github.com>
AuthorDate: Mon Jul 23 15:48:06 2018 +0300

    PARQUET-1227: Thrift crypto metadata structures (#94)
    
    New Thrift structures for Parquet modular encryption.
---
 src/main/thrift/parquet.thrift | 53 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 53 insertions(+)

diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index 6c9011b..788c55e 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -662,6 +662,22 @@ struct ColumnMetaData {
   13: optional list<PageEncodingStats> encoding_stats;
 }
 
+struct EncryptionWithFooterKey {
+}
+
+struct EncryptionWithColumnKey {
+  /** Column path in schema **/
+  1: required list<string> path_in_schema
+  
+  /** Retrieval metadata of the column-specific key **/
+  2: optional binary column_key_metadata
+}
+
+union ColumnCryptoMetaData {
+  1: EncryptionWithFooterKey ENCRYPTION_WITH_FOOTER_KEY
+  2: EncryptionWithColumnKey ENCRYPTION_WITH_COLUMN_KEY
+}
+
 struct ColumnChunk {
   /** File where column data is stored.  If not set, assumed to be same file as
     * metadata.  This path is relative to the current file.
@@ -688,6 +704,9 @@ struct ColumnChunk {
 
   /** Size of ColumnChunk's ColumnIndex, in bytes **/
   7: optional i32 column_index_length
+  
+  /** Crypto metadata of encrypted columns **/
+  8: optional ColumnCryptoMetaData crypto_meta_data
 }
 
 struct RowGroup {
@@ -879,3 +898,37 @@ struct FileMetaData {
   7: optional list<ColumnOrder> column_orders;
 }
 
+struct AesGcmV1 {
+  /** Retrieval metadata of AAD used for encryption of pages and structures **/
+  1: optional binary aad_metadata
+}
+
+struct AesGcmCtrV1 {
+  /** Retrieval metadata of AAD used for encryption of structures **/
+  1: optional binary aad_metadata
+}
+
+union EncryptionAlgorithm {
+  1: AesGcmV1 AES_GCM_V1
+  2: AesGcmCtrV1 AES_GCM_CTR_V1
+}
+
+struct FileCryptoMetaData {
+  1: required EncryptionAlgorithm encryption_algorithm
+  
+  /** Parquet footer can be encrypted, or left as plaintext **/
+  2: required bool encrypted_footer
+    
+  /** Retrieval metadata of key used for encryption of footer, 
+   *  and (possibly) columns **/
+  3: optional binary footer_key_metadata
+
+  /** Offset of Parquet footer (encrypted, or plaintext) **/
+  4: required i64 footer_offset
+  
+  /** If file IVs are comprised of a fixed part,
+   *  and variable parts (random or counter), keep the fixed
+   *  part here **/
+  5: optional binary iv_prefix
+}
+
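
A rough sketch of how a reader could dispatch on the ColumnCryptoMetaData
union to decide which key retrieval metadata applies to a column chunk. The
classes below are hand-written stand-ins for Thrift-generated code, and
select_key_metadata is a hypothetical helper. Treating an unset
crypto_meta_data as "column not encrypted" is an assumption based on the
field comment, not something the diff states outright.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class EncryptionWithFooterKey:
    """Empty struct: the column is encrypted with the footer key."""


@dataclass
class EncryptionWithColumnKey:
    path_in_schema: List[str]
    column_key_metadata: Optional[bytes] = None


@dataclass
class ColumnCryptoMetaData:
    """Union: exactly one member is expected to be set."""
    ENCRYPTION_WITH_FOOTER_KEY: Optional[EncryptionWithFooterKey] = None
    ENCRYPTION_WITH_COLUMN_KEY: Optional[EncryptionWithColumnKey] = None


def select_key_metadata(
    crypto: Optional[ColumnCryptoMetaData],
    footer_key_metadata: Optional[bytes],
) -> Tuple[str, Optional[bytes]]:
    """Return (mode, key retrieval metadata) for one column chunk."""
    if crypto is None:
        # Assumption: no crypto_meta_data means the column is plaintext.
        return ("plaintext", None)
    if crypto.ENCRYPTION_WITH_COLUMN_KEY is not None:
        return ("column_key",
                crypto.ENCRYPTION_WITH_COLUMN_KEY.column_key_metadata)
    if crypto.ENCRYPTION_WITH_FOOTER_KEY is not None:
        # The footer key metadata lives in FileCryptoMetaData (field 3).
        return ("footer_key", footer_key_metadata)
    raise ValueError("ColumnCryptoMetaData union has no member set")
```

The metadata returned here is opaque to the format; how it maps to an actual
key is left to the key management layer.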


[parquet-format] 02/03: PARQUET-1398: move iv_prefix to Algorithms (#103)

Posted by zi...@apache.org.

zivanfi pushed a commit to branch encryption
in repository https://gitbox.apache.org/repos/asf/parquet-format.git

commit 52d896e6ad2e6c66e4f0ab7892a7ca96f4215cf5
Author: ggershinsky <gg...@users.noreply.github.com>
AuthorDate: Tue Aug 28 15:56:23 2018 +0300

    PARQUET-1398: move iv_prefix to Algorithms (#103)
---
 src/main/thrift/parquet.thrift | 16 +++++++++++-----
 1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/src/main/thrift/parquet.thrift b/src/main/thrift/parquet.thrift
index 788c55e..e3857aa 100644
--- a/src/main/thrift/parquet.thrift
+++ b/src/main/thrift/parquet.thrift
@@ -901,11 +901,22 @@ struct FileMetaData {
 struct AesGcmV1 {
   /** Retrieval metadata of AAD used for encryption of pages and structures **/
   1: optional binary aad_metadata
+
+  /** If file IVs are comprised of a fixed part, and variable parts
+   *  (e.g. counter), keep the fixed part here **/
+  2: optional binary iv_prefix
+ 
 }
 
 struct AesGcmCtrV1 {
   /** Retrieval metadata of AAD used for encryption of structures **/
   1: optional binary aad_metadata
+
+  /** If file IVs are comprised of a fixed part, and variable parts
+   *  (e.g. counter), keep the fixed part here **/
+  2: optional binary gcm_iv_prefix
+
+  3: optional binary ctr_iv_prefix
 }
 
 union EncryptionAlgorithm {
@@ -925,10 +936,5 @@ struct FileCryptoMetaData {
 
   /** Offset of Parquet footer (encrypted, or plaintext) **/
   4: required i64 footer_offset
-  
-  /** If file IVs are comprised of a fixed part,
-   *  and variable parts (random or counter), keep the fixed
-   *  part here **/
-  5: optional binary iv_prefix
 }
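
The iv_prefix comments describe an IV made of a fixed part plus a variable
part (e.g. a counter). The sketch below shows one way such an IV could be
assembled. build_iv is a hypothetical helper; the 12-byte total length (the
size commonly used with AES-GCM) and the big-endian counter encoding are
assumptions, since the diff does not specify either.

```python
def build_iv(iv_prefix: bytes, counter: int, iv_length: int = 12) -> bytes:
    """Concatenate a fixed IV prefix with a big-endian counter.

    Assumptions (not specified by the diff): iv_length defaults to 12
    bytes, and the variable part is an unsigned big-endian counter that
    fills the bytes left over after the prefix.
    """
    counter_length = iv_length - len(iv_prefix)
    if counter_length <= 0:
        raise ValueError("iv_prefix leaves no room for the variable part")
    # int.to_bytes raises OverflowError if the counter is negative or no
    # longer fits, which prevents silently wrapping around and reusing an IV.
    return iv_prefix + counter.to_bytes(counter_length, "big")
```

For AesGcmCtrV1, which carries separate gcm_iv_prefix and ctr_iv_prefix
fields, the same helper could be called once per prefix with the IV length
appropriate to each mode.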