Posted to issues-all@impala.apache.org by "Micah Kornfield (Jira)" <ji...@apache.org> on 2023/01/11 18:12:00 UTC

[jira] [Created] (IMPALA-11838) Relax Parquet Version Check

Micah Kornfield created IMPALA-11838:
----------------------------------------

             Summary: Relax Parquet Version Check
                 Key: IMPALA-11838
                 URL: https://issues.apache.org/jira/browse/IMPALA-11838
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Micah Kornfield


Impala currently has a check that verifies that the version number of [Parquet files is equal to 1](https://github.com/apache/impala/blob/1e30ca228d683821e42e51f94478c77642f5331a/be/src/exec/parquet/parquet-metadata-utils.cc#L256).

This seems overly strict, because the version field written by producers isn't necessarily [reliable](https://github.com/apache/arrow/blob/a580f2711750ef507cc57ce48cb431dd700a6166/cpp/src/parquet/metadata.h#L326).

Many files marked as v2 are also likely still readable, even by a reader that doesn't support any v2 features.
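
To make the proposal concrete, here is a minimal sketch of the difference between a strict and a relaxed version check. It is not Impala's actual code: `FileMetaDataSketch`, `kExpectedVersion`, `ValidateVersionStrict`, `ValidateVersionRelaxed`, and `RelaxedResult` are all hypothetical names chosen for illustration, and the exact relaxed behavior (accept and warn) is an assumption about one way the check could be loosened.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical stand-in for the footer metadata; only the version field matters here.
struct FileMetaDataSketch {
  int32_t version;
};

// Assumed value the strict check compares against.
constexpr int32_t kExpectedVersion = 1;

// Strict variant: any version other than kExpectedVersion is an error.
// Returns an error message, or std::nullopt when the file is accepted.
std::optional<std::string> ValidateVersionStrict(const FileMetaDataSketch& md) {
  if (md.version != kExpectedVersion) {
    return std::string("unsupported file version: ") + std::to_string(md.version);
  }
  return std::nullopt;
}

// Result of the relaxed variant: the file is accepted, possibly with a warning.
struct RelaxedResult {
  bool accepted;
  std::string warning;  // empty when there is nothing to report
};

// Relaxed variant: never rejects on the version number alone. An unexpected
// version only produces a warning; unsupported features would have to be
// detected later, when the reader actually encounters them.
RelaxedResult ValidateVersionRelaxed(const FileMetaDataSketch& md) {
  RelaxedResult result{true, ""};
  if (md.version != kExpectedVersion) {
    result.warning = "unexpected file version " + std::to_string(md.version) +
                     "; attempting to read anyway";
  }
  return result;
}
```

With this sketch, a file whose footer reports version 2 is rejected by `ValidateVersionStrict` but accepted with a warning by `ValidateVersionRelaxed`. The trade-off is that a file which really does use unsupported v2 features would fail later, at the point where the feature is encountered, rather than up front.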


