Posted to dev@parquet.apache.org by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2021/08/05 10:38:00 UTC
[jira] [Commented] (PARQUET-2071) Encryption translation tool
[ https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393788#comment-17393788 ]
Gabor Szadovszky commented on PARQUET-2071:
-------------------------------------------
I think it is a great idea to skip unnecessary deserialization/serialization steps in such cases. We already have some tools with a similar approach, such as trans-compression and prune columns. What do you think of implementing a more universal tool where you can configure the projection schema and the configuration of the target file? The tool could then decide which level of deserialization/serialization is required. For example, for trans-compression you need to decompress the pages, while for encryption you don't.
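To make the "decide which level is required" idea a bit more concrete, here is a rough sketch. It is written in Python for brevity and is not parquet-mr code: RewriteLevel, ColumnConfig, required_level and plan are all hypothetical names, and the three levels are only an assumption about how such a tool might rank the work it has to do.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Dict, List, Optional


class RewriteLevel(IntEnum):
    """Hypothetical processing depths, ordered from cheapest to most expensive."""
    COPY_CHUNK = 0       # copy the column chunk bytes unchanged (e.g. prune columns)
    ENCRYPT_PAGE = 1     # change encryption on page bytes without decompressing them
    RECOMPRESS_PAGE = 2  # decompress and recompress pages (trans-compression)


@dataclass(frozen=True)
class ColumnConfig:
    """Hypothetical per-column settings of a source or target file."""
    codec: str
    encryption_key_id: Optional[str] = None  # None means the column is not encrypted


def required_level(source: ColumnConfig, target: ColumnConfig) -> RewriteLevel:
    """Pick the cheapest level that can turn `source` into `target`."""
    if source.codec != target.codec:
        # A codec change forces decompression, whatever else changes.
        return RewriteLevel.RECOMPRESS_PAGE
    if source.encryption_key_id != target.encryption_key_id:
        # Only the encryption differs, so pages can stay compressed.
        return RewriteLevel.ENCRYPT_PAGE
    return RewriteLevel.COPY_CHUNK


def plan(source: Dict[str, ColumnConfig],
         projection: List[str],
         target: Dict[str, ColumnConfig]) -> Dict[str, RewriteLevel]:
    """Map each projected column to its level; columns outside the projection are dropped."""
    return {name: required_level(source[name], target[name]) for name in projection}
```

With a plan like this, the tool would only decompress the columns whose codec actually changes, and could pass the remaining page or chunk bytes through untouched.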
> Encryption translation tool
> ----------------------------
>
> Key: PARQUET-2071
> URL: https://issues.apache.org/jira/browse/PARQUET-2071
> Project: Parquet
> Issue Type: New Feature
> Components: parquet-mr
> Reporter: Xinli Shang
> Assignee: Xinli Shang
> Priority: Major
>
> When translating existing data to an encrypted state, we could develop a tool like TransCompression that translates the data at page level without decoding it into records and rewriting them. This would speed up the process considerably.