Posted to dev@parquet.apache.org by GitBox <gi...@apache.org> on 2021/02/02 07:55:05 UTC

[GitHub] [parquet-format] ggershinsky commented on pull request #164: PARQUET-1950: Define core features

ggershinsky commented on pull request #164:
URL: https://github.com/apache/parquet-format/pull/164#issuecomment-771442874


Adding the Parquet encryption angle to this discussion: this feature protects the confidentiality and integrity of Parquet files (when they have columns with sensitive data). These security layers will make it difficult to support many of the legacy features mentioned above, such as external chunks or merging multiple files into a single master file (which interferes with the definition of file integrity). Reading encrypted data before file writing has finished is also difficult. None of this is impossible, but it is challenging and would require explicit scaffolding plus some Thrift format changes. If there is strong demand for using encryption with these legacy features, despite their being deprecated (or with some of the new features mentioned), we can plan this for future versions of parquet-format, parquet-mr, etc.
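
To illustrate why merging interferes with file integrity, here is a minimal sketch, assuming each module's integrity tag is bound to an identifier of the file it was written in. It uses HMAC-SHA256 from the Python standard library as a stand-in for an authenticated cipher; it is not the Parquet encryption implementation, and the names tag_module, verify_module, and the file identifiers are hypothetical.

```python
import hashlib
import hmac
import os


def tag_module(key: bytes, file_id: bytes, module: bytes) -> bytes:
    """Return an integrity tag that binds `module` to the file `file_id`."""
    # Length-prefix the file id so the (file_id, module) boundary is unambiguous.
    message = len(file_id).to_bytes(4, "big") + file_id + module
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_module(key: bytes, file_id: bytes, module: bytes, tag: bytes) -> bool:
    """Check that `module` carries a valid tag for the file `file_id`."""
    return hmac.compare_digest(tag_module(key, file_id, module), tag)


# Hypothetical key, file identifiers, and module contents (illustration only).
key = os.urandom(32)
file_a, file_b = b"file-A", b"file-B"
chunk = b"column chunk bytes"

# The tag is computed when the chunk is written into file A.
tag = tag_module(key, file_a, chunk)

# Reading the chunk from its own file verifies.
ok_in_place = verify_module(key, file_a, chunk, tag)

# Copying the same bytes and tag into a merged file B does not verify,
# because the tag was bound to file A's identifier.
ok_after_merge = verify_module(key, file_b, chunk, tag)
```

Under this assumption, a merged master file or an externally stored chunk would need its modules re-tagged, or the format would need an explicit way to carry the original file identity, which is the kind of scaffolding mentioned above.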


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org