Posted to dev@parquet.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/05/04 06:40:00 UTC

[jira] [Commented] (PARQUET-2297) Encrypted files should not be checked for delta encoding problem

    [ https://issues.apache.org/jira/browse/PARQUET-2297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17719170#comment-17719170 ] 

ASF GitHub Bot commented on PARQUET-2297:
-----------------------------------------

ggershinsky opened a new pull request, #1089:
URL: https://github.com/apache/parquet-mr/pull/1089

   https://issues.apache.org/jira/browse/PARQUET-2297




> Encrypted files should not be checked for delta encoding problem
> ----------------------------------------------------------------
>
>                 Key: PARQUET-2297
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2297
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>    Affects Versions: 1.13.0
>            Reporter: Gidon Gershinsky
>            Assignee: Gidon Gershinsky
>            Priority: Major
>             Fix For: 1.14.0, 1.13.1
>
>
> The delta encoding problem (https://issues.apache.org/jira/browse/PARQUET-246) has been fixed in writers since parquet-mr-1.8. That fix also added a `checkDeltaByteArrayProblem` method to readers, which runs over all columns and checks older files for this problem.
> This check now triggers an unrelated exception when reading encrypted files in the following situation: reading an unencrypted column without having the keys for the encrypted columns (see https://issues.apache.org/jira/browse/PARQUET-2193). This happens in Spark with nested columns (files with regular columns are OK).
> Possible solution: don't call the `checkDeltaByteArrayProblem` method for encrypted files, because these files can be written only with parquet-mr-1.12 and newer, where the delta encoding problem is already fixed.
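
For illustration, the proposed guard might look roughly like the sketch below. This is not the parquet-mr code or the change in the pull request: `DeltaCheckSketch`, `Column`, `Encoding`, and `needsDeltaCheck` are hypothetical stand-ins, and the exception thrown by `getEncodings` only imitates the failure described above (touching metadata of an encrypted column without its key).

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Set;

// Hypothetical, simplified model; none of these types are parquet-mr classes.
class DeltaCheckSketch {
    enum Encoding { PLAIN, RLE, DELTA_BYTE_ARRAY }

    // Stand-in for column chunk metadata.
    static final class Column {
        private final boolean encryptedWithoutKey;
        private final Set<Encoding> encodings;

        Column(boolean encryptedWithoutKey, Set<Encoding> encodings) {
            this.encryptedWithoutKey = encryptedWithoutKey;
            this.encodings = encodings;
        }

        // Imitates the failure mode: metadata of an encrypted column
        // cannot be read when the reader has no key for it.
        Set<Encoding> getEncodings() {
            if (encryptedWithoutKey) {
                throw new IllegalStateException("no key for encrypted column");
            }
            return encodings;
        }
    }

    // Returns true if the file should go through the legacy delta check.
    static boolean needsDeltaCheck(boolean fileEncrypted, List<Column> columns) {
        // Proposed guard: encrypted files can only come from writers that
        // already contain the fix, so skip the column scan entirely.
        if (fileEncrypted) {
            return false;
        }
        for (Column column : columns) {
            if (column.getEncodings().contains(Encoding.DELTA_BYTE_ARRAY)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Column> encryptedFile = List.of(
            new Column(false, EnumSet.of(Encoding.PLAIN)),
            new Column(true, EnumSet.of(Encoding.DELTA_BYTE_ARRAY)));
        // With the guard in place, the inaccessible column is never touched.
        System.out.println(needsDeltaCheck(true, encryptedFile));
    }
}
```

In this model, dropping the `fileEncrypted` guard would make the scan reach the column without a key and throw, even when the caller only wants the unencrypted column.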



--
This message was sent by Atlassian Jira
(v8.20.10#820010)