Posted to dev@parquet.apache.org by "Gabor Szadovszky (Jira)" <ji...@apache.org> on 2021/08/05 10:38:00 UTC

[jira] [Commented] (PARQUET-2071) Encryption translation tool

    [ https://issues.apache.org/jira/browse/PARQUET-2071?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17393788#comment-17393788 ] 

Gabor Szadovszky commented on PARQUET-2071:
-------------------------------------------

I think it is a great idea to skip unnecessary deserialization/serialization steps in such cases. We already have some tools that take a similar approach, such as trans-compression and prune columns. What do you think of implementing a more universal tool where you can configure the projection schema and the configuration of the target file? The tool could then decide which level of deserialization/serialization is required. For example, for trans-compression you need to decompress the pages, while for encryption you don't.
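
To make the "decide which level is required" idea concrete, here is a minimal sketch of how such a decision could look. It is only an illustration under stated assumptions: RewritePlanner, Config and Level are hypothetical names that do not exist in parquet-mr, settings are tracked per file rather than per column, and it assumes that a pure projection (pruning) needs no page-level work at all.

```java
import java.util.Objects;

// Hypothetical sketch: RewritePlanner, Config and Level are illustrative
// names and are not part of parquet-mr.
final class RewritePlanner {

    // Depth of processing, ordered from cheapest to most expensive.
    enum Level {
        COPY_COLUMN_CHUNKS,  // bytes copied unchanged (assumed enough for pruning)
        REWRAP_PAGES,        // page bytes encrypted or decrypted, still compressed
        DECOMPRESS_PAGES     // pages decompressed and recompressed
    }

    // Simplified per-file settings; a real tool would likely track these per column.
    static final class Config {
        final String codec;
        final boolean encrypted;

        Config(String codec, boolean encrypted) {
            this.codec = Objects.requireNonNull(codec);
            this.encrypted = encrypted;
        }
    }

    // Picks the shallowest level that can turn the source settings into the target settings.
    static Level requiredLevel(Config source, Config target) {
        if (!source.codec.equals(target.codec)) {
            return Level.DECOMPRESS_PAGES;   // trans-compression case
        }
        if (source.encrypted != target.encrypted) {
            return Level.REWRAP_PAGES;       // encryption translation case
        }
        return Level.COPY_COLUMN_CHUNKS;     // nothing changes below chunk level
    }
}
```

In this sketch, a SNAPPY source with an encrypted SNAPPY target yields REWRAP_PAGES, while a ZSTD target yields DECOMPRESS_PAGES regardless of the encryption setting.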

> Encryption translation tool 
> ----------------------------
>
>                 Key: PARQUET-2071
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2071
>             Project: Parquet
>          Issue Type: New Feature
>          Components: parquet-mr
>            Reporter: Xinli Shang
>            Assignee: Xinli Shang
>            Priority: Major
>
> When translating existing data to an encrypted state, we could develop a tool like TransCompression that translates the data at the page level, without reading it into records and rewriting it. This will speed up the process a lot.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)