Posted to dev@iceberg.apache.org by GitBox <gi...@apache.org> on 2018/12/08 00:38:14 UTC

[GitHub] rdblue opened a new issue #44: Support cryptographic integrity

rdblue opened a new issue #44: Support cryptographic integrity
URL: https://github.com/apache/incubator-iceberg/issues/44
 
 
   Parquet encryption protects the integrity of individual data files. However, in untrusted storage, the removal of one or more data files from a table might go unnoticed. Likewise, replacing the contents of one file with those of another will go unnoticed unless the user has provided a unique Parquet AAD prefix for each file.
   
   The snapshot integrity mechanism provides cryptographic integrity protection for data sets composed of multiple Parquet files.
   
   The mechanism works by creating a small signature file that contains the table URI / snapshot ID and the number of files. It can also contain an explicit list of file names (with or without the full path). The file contents are signed (and can also be encrypted, e.g. with AES GCM).
   
   On the writer side, the mechanism creates an AAD prefix for every data file and creates the signature file itself. The input is the snapshot URI, the number of files (N), and the encryption/signature key, plus (optionally) the list of file names.
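
A minimal sketch of the writer side, under stated assumptions: the JSON layout, the field names, the `aad_prefix` scheme and the function names are all hypothetical, and HMAC-SHA256 from the Python standard library stands in for the signature step (the issue does not specify an algorithm, and optional AES GCM encryption is not shown).

```python
import hashlib
import hmac
import json


def aad_prefix(snapshot_uri, index):
    # Hypothetical scheme: the snapshot URI plus the file's ordinal position,
    # so that every data file gets a distinct prefix.
    return f"{snapshot_uri}/{index:08d}"


def build_signature_file(snapshot_uri, num_files, key, file_names=None):
    """Return (signature_file_bytes, aad_prefixes) for one snapshot."""
    if file_names is not None and len(file_names) != num_files:
        raise ValueError("file_names must contain exactly num_files entries")

    prefixes = [aad_prefix(snapshot_uri, i) for i in range(num_files)]

    payload = {"snapshot_uri": snapshot_uri, "num_files": num_files}
    if file_names is not None:
        payload["file_names"] = list(file_names)

    # Serialize deterministically so the signed bytes are reproducible.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))

    # Assumption: a keyed MAC is used as the "signature".
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()

    envelope = {"payload": body, "hmac_sha256": tag}
    return json.dumps(envelope).encode("utf-8"), prefixes
```

The writer would then pass each returned prefix to the Parquet writer for the corresponding data file and store the signature file alongside the snapshot.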
   
   On the reader side, the mechanism parses and verifies the signature file, and provides the framework with the verified table URI / snapshot ID, the number of files that must be accounted for, and the Parquet AAD prefix for each file, plus (optionally) the list of file names. The input is the signature file, the encryption/signature key, and (optionally) the expected table URI / snapshot ID.
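
The reader-side checks could look roughly like the sketch below. It assumes a hypothetical signature file format (a JSON envelope holding a `payload` string and an `hmac_sha256` tag over it) and a hypothetical AAD prefix scheme; none of these names come from the issue, and HMAC-SHA256 is used here only as a stand-in for whatever signature algorithm is eventually chosen.

```python
import hashlib
import hmac
import json


class IntegrityError(Exception):
    """Raised when the signature file or the file set fails verification."""


def verify_signature_file(signature_file, key, expected_snapshot_uri=None):
    """Return (snapshot_uri, num_files, aad_prefixes, file_names_or_None)."""
    envelope = json.loads(signature_file)
    body = envelope["payload"].encode("utf-8")

    # Recompute the tag and compare in constant time.
    expected_tag = hmac.new(key, body, hashlib.sha256).hexdigest()
    actual_tag = str(envelope.get("hmac_sha256", ""))
    if not hmac.compare_digest(expected_tag.encode("utf-8"),
                               actual_tag.encode("utf-8")):
        raise IntegrityError("signature mismatch")

    # Only parse the payload after the tag has been verified.
    payload = json.loads(body)
    uri = payload["snapshot_uri"]
    if expected_snapshot_uri is not None and uri != expected_snapshot_uri:
        raise IntegrityError("signature file belongs to a different snapshot")

    n = payload["num_files"]
    # Hypothetical prefix scheme; it must match whatever the writer used.
    prefixes = [f"{uri}/{i:08d}" for i in range(n)]
    return uri, n, prefixes, payload.get("file_names")


def check_file_set(found_files, num_files, file_names=None):
    """Fail if data files were removed from, or added to, the snapshot."""
    if len(found_files) != num_files:
        raise IntegrityError(
            f"expected {num_files} files, found {len(found_files)}")
    if file_names is not None and sorted(found_files) != sorted(file_names):
        raise IntegrityError("file names do not match the signed list")
```

With this split, a missing data file is caught by `check_file_set`, while a swapped file would be caught when Parquet decryption is attempted with the wrong AAD prefix.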
   
   This issue was originally posted to the Netflix repository by @ggershinsky.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services