Posted to dev@iceberg.apache.org by GitBox <gi...@apache.org> on 2018/11/30 22:05:21 UTC

[GitHub] omalley commented on issue #20: Encryption in Data Files

URL: https://github.com/apache/incubator-iceberg/issues/20#issuecomment-443353851
 
 
   You certainly want to use a KeyManager/Provider. I'd suggest that you take a look at the API that I made for ORC. 
   
   https://github.com/apache/orc/blob/280cb5a6a6122ad79d04dc5ca9c76b2281f6153a/java/shims/src/java/org/apache/orc/impl/HadoopShims.java#L129
   
   You are going to want to support the cloud Key Management Services (KMS), and thus you need to be compatible with their APIs.
   
   Given that you have a KeyManager, you want to record the name of the key to encrypt with. For the read side to work, you pretty much need to put the encrypted local key and the IV into the file's metadata. You absolutely can't have a fixed IV for the table, since reusing an IV under the same key leaks information about the data, and you shouldn't have a fixed local key for the table either.
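
To make that flow concrete, here is a minimal sketch of it, not the ORC API linked above. Everything named here is hypothetical: `ToyKeyManager` is an in-memory stand-in for a KMS-backed KeyManager, `FileKeyMetadata` stands for whatever the file's metadata would hold, and AES/CTR with 128-bit keys is an assumed cipher choice. A real KMS would wrap the local key on its side, and an authenticated mode would likely be preferable to the plain CTR wrapping used here.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.SecureRandom;
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

final class EnvelopeSketch {
  private static final SecureRandom RANDOM = new SecureRandom();

  static byte[] randomBytes(int length) {
    byte[] bytes = new byte[length];
    RANDOM.nextBytes(bytes);
    return bytes;
  }

  // AES in CTR mode; the cipher mode is an assumption, not something the comment prescribes.
  static byte[] aesCtr(int mode, byte[] key, byte[] iv, byte[] input) {
    try {
      Cipher cipher = Cipher.getInstance("AES/CTR/NoPadding");
      cipher.init(mode, new SecretKeySpec(key, "AES"), new IvParameterSpec(iv));
      return cipher.doFinal(input);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  // Hypothetical in-memory stand-in for a KMS-backed KeyManager.
  static final class ToyKeyManager {
    private final Map<String, byte[]> masterKeys = new HashMap<>();

    void createKey(String keyName) {
      masterKeys.put(keyName, randomBytes(16));
    }

    // Returns wrapIv || ciphertext so that unwrap needs no extra state.
    byte[] wrap(String keyName, byte[] localKey) {
      byte[] wrapIv = randomBytes(16);
      byte[] ciphertext = aesCtr(Cipher.ENCRYPT_MODE, masterKeys.get(keyName), wrapIv, localKey);
      byte[] wrapped = Arrays.copyOf(wrapIv, wrapIv.length + ciphertext.length);
      System.arraycopy(ciphertext, 0, wrapped, wrapIv.length, ciphertext.length);
      return wrapped;
    }

    byte[] unwrap(String keyName, byte[] wrapped) {
      byte[] wrapIv = Arrays.copyOfRange(wrapped, 0, 16);
      byte[] ciphertext = Arrays.copyOfRange(wrapped, 16, wrapped.length);
      return aesCtr(Cipher.DECRYPT_MODE, masterKeys.get(keyName), wrapIv, ciphertext);
    }
  }

  // What the file's metadata would carry: key name, encrypted local key, and IV.
  static final class FileKeyMetadata {
    final String keyName;
    final byte[] encryptedLocalKey;
    final byte[] iv;

    FileKeyMetadata(String keyName, byte[] encryptedLocalKey, byte[] iv) {
      this.keyName = keyName;
      this.encryptedLocalKey = encryptedLocalKey;
      this.iv = iv;
    }
  }

  static final class EncryptedFile {
    final FileKeyMetadata metadata;
    final byte[] body;

    EncryptedFile(FileKeyMetadata metadata, byte[] body) {
      this.metadata = metadata;
      this.body = body;
    }
  }

  static EncryptedFile writeFile(ToyKeyManager keys, String keyName, byte[] plaintext) {
    byte[] localKey = randomBytes(16); // fresh local key for every file
    byte[] iv = randomBytes(16);       // fresh IV for every file
    byte[] body = aesCtr(Cipher.ENCRYPT_MODE, localKey, iv, plaintext);
    return new EncryptedFile(new FileKeyMetadata(keyName, keys.wrap(keyName, localKey), iv), body);
  }

  static byte[] readFile(ToyKeyManager keys, EncryptedFile file) {
    FileKeyMetadata meta = file.metadata;
    byte[] localKey = keys.unwrap(meta.keyName, meta.encryptedLocalKey);
    return aesCtr(Cipher.DECRYPT_MODE, localKey, meta.iv, file.body);
  }

  public static void main(String[] args) {
    ToyKeyManager keys = new ToyKeyManager();
    keys.createKey("pii");
    EncryptedFile file = writeFile(keys, "pii", "hello".getBytes(StandardCharsets.UTF_8));
    System.out.println(new String(readFile(keys, file), StandardCharsets.UTF_8));
  }
}
```

The point of the shape is that the reader only needs the file's metadata plus access to the named key: the local key and IV differ for every file, and the plaintext local key is never written anywhere.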
   
   For column encryption, it becomes more interesting. There you have:
   * a key name
   * which columns to encrypt with which key
   * how to mask the unencrypted version of the data (defaulting to nullify)
   * a local key per file / encrypted column, which will be stored encrypted in the file's metadata
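
The pieces in that list could be modeled roughly as follows. This is a sketch under assumptions, not an existing Iceberg or ORC API: `ColumnKeySpec`, `ColumnKeyEntry`, `KeyWrapper`, and the `REDACT` mask are hypothetical names introduced for illustration, and "per file / encrypted column" is read as one local key for each encrypted column in each file.

```java
import java.security.SecureRandom;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ColumnEncryptionSketch {
  // NULLIFY is the default named above; REDACT is a hypothetical second option.
  enum Mask { NULLIFY, REDACT }

  // Table-level setting: which columns are encrypted with which named key.
  static final class ColumnKeySpec {
    final String keyName;
    final List<String> columns;
    final Mask mask;

    ColumnKeySpec(String keyName, List<String> columns) {
      this(keyName, columns, Mask.NULLIFY);
    }

    ColumnKeySpec(String keyName, List<String> columns, Mask mask) {
      this.keyName = keyName;
      this.columns = columns;
      this.mask = mask;
    }
  }

  // Hypothetical stand-in for the KeyManager call that encrypts a local key.
  interface KeyWrapper {
    byte[] wrap(String keyName, byte[] localKey);
  }

  // Per-file metadata entry: the local key is stored only in encrypted form.
  static final class ColumnKeyEntry {
    final String column;
    final String keyName;
    final byte[] encryptedLocalKey;

    ColumnKeyEntry(String column, String keyName, byte[] encryptedLocalKey) {
      this.column = column;
      this.keyName = keyName;
      this.encryptedLocalKey = encryptedLocalKey;
    }
  }

  // Called once per file: generates a fresh local key for every encrypted column.
  static List<ColumnKeyEntry> newFileKeys(List<ColumnKeySpec> specs, KeyWrapper wrapper) {
    SecureRandom random = new SecureRandom();
    List<ColumnKeyEntry> entries = new ArrayList<>();
    for (ColumnKeySpec spec : specs) {
      for (String column : spec.columns) {
        byte[] localKey = new byte[16];
        random.nextBytes(localKey);
        entries.add(new ColumnKeyEntry(column, spec.keyName, wrapper.wrap(spec.keyName, localKey)));
      }
    }
    return entries;
  }

  // What a reader without access to the key sees in place of the real value.
  static String maskValue(Mask mask, String value) {
    switch (mask) {
      case NULLIFY:
        return null;
      case REDACT:
        return value == null ? null : value.replaceAll("[A-Za-z]", "X").replaceAll("[0-9]", "9");
      default:
        throw new IllegalArgumentException("unknown mask: " + mask);
    }
  }

  public static void main(String[] args) {
    List<ColumnKeySpec> specs = Arrays.asList(new ColumnKeySpec("pii", Arrays.asList("ssn", "email")));
    // Identity wrapper for demonstration only; a real one would call the KeyManager.
    List<ColumnKeyEntry> entries = newFileKeys(specs, (keyName, localKey) -> localKey.clone());
    System.out.println(entries.size() + " wrapped column keys for this file");
  }
}
```

Keeping the spec (key name, columns, mask) separate from the per-file entries mirrors the split in the list: the first three items are table-level configuration, while the wrapped local keys are regenerated for every file.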
   
