Posted to issues-all@impala.apache.org by "Micah Kornfield (Jira)" <ji...@apache.org> on 2023/01/11 18:12:00 UTC
[jira] [Created] (IMPALA-11838) Relax Parquet Version Check
Micah Kornfield created IMPALA-11838:
----------------------------------------
Summary: Relax Parquet Version Check
Key: IMPALA-11838
URL: https://issues.apache.org/jira/browse/IMPALA-11838
Project: IMPALA
Issue Type: Improvement
Components: Backend
Reporter: Micah Kornfield
There is currently a check that verifies that the version number of [Parquet files is equal to 1](https://github.com/apache/impala/blob/1e30ca228d683821e42e51f94478c77642f5331a/be/src/exec/parquet/parquet-metadata-utils.cc#L256).
This seems overly strict, because the version field in the file metadata isn't necessarily [reliable](https://github.com/apache/arrow/blob/a580f2711750ef507cc57ce48cb431dd700a6166/cpp/src/parquet/metadata.h#L326).
There are also many v2 files that are likely still readable even if the reader doesn't support any v2 features.
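A rough sketch of what a relaxed check could look like, for discussion only. This is not Impala's actual code: `FooterInfo`, `VersionCheck`, `CheckFooter`, and `kHighestKnownVersion` are hypothetical names, and the `uses_unsupported_feature` flag stands in for whatever per-feature validation the reader already performs. The idea is to reject a file only when it actually uses something the reader cannot handle, and to downgrade an unexpected version number to a warning.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for the footer fields relevant to this check.
// This is NOT Impala's parquet::FileMetaData type.
struct FooterInfo {
  int32_t version;                // value of the footer's version field
  bool uses_unsupported_feature;  // e.g. an encoding the reader lacks
};

enum class VersionCheck { kOk, kOkWithWarning, kReject };

// Assumption: the highest version number the reader was written against.
constexpr int32_t kHighestKnownVersion = 1;

// Rejects only on an unsupported feature. An unexpected version number
// alone yields a warning, since the field may not be reliable.
VersionCheck CheckFooter(const FooterInfo& footer, std::string* message) {
  if (footer.uses_unsupported_feature) {
    *message = "file uses a feature this reader does not support";
    return VersionCheck::kReject;
  }
  if (footer.version > kHighestKnownVersion) {
    *message = "unexpected file version " + std::to_string(footer.version) +
               "; attempting to read anyway";
    return VersionCheck::kOkWithWarning;
  }
  message->clear();
  return VersionCheck::kOk;
}
```

Under this sketch, a v2 file that uses no v2 features would be read with a warning rather than failing outright.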
--
This message was sent by Atlassian Jira
(v8.20.10#820010)