Posted to dev@parquet.apache.org by "Marc Vertes (JIRA)" <ji...@apache.org> on 2017/02/23 13:31:44 UTC
[jira] [Created] (PARQUET-895) Reading of nested columns is broken
Marc Vertes created PARQUET-895:
-----------------------------------
Summary: Reading of nested columns is broken
Key: PARQUET-895
URL: https://issues.apache.org/jira/browse/PARQUET-895
Project: Parquet
Issue Type: Bug
Components: parquet-cpp
Reporter: Marc Vertes
The problem occurs when reading a nested column with repeated values, especially when that column contains many more levels than the total number of rows.
Citing @peshopetrov, who filed a GitHub pull request identifying the problem and proposing a fix:
Nested repeated columns' count is incorrectly read from row group's metadata. That's correct in cases where there aren't any nested repeated fields but is generally not correct. Instead the num_values from the column's metadata should be used.
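To illustrate the mismatch described above, here is a minimal sketch of why the two counts diverge for a repeated column. It does not use the parquet-cpp API: the structs `RowGroupMeta` and `ColumnChunkMeta` and all function names are hypothetical stand-ins for the row group's `num_rows` and the column's `num_values`. The only assumption about the format is that a repetition level of 0 marks the start of a new record.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for the metadata fields named in the report.
// These are NOT the parquet-cpp classes.
struct ColumnChunkMeta {
  int64_t num_values;  // number of levels/values stored for this column
};

struct RowGroupMeta {
  int64_t num_rows;  // number of top-level records in the row group
  std::vector<ColumnChunkMeta> columns;
};

// Count records from repetition levels: a level of 0 starts a new record.
int64_t CountRecords(const std::vector<int16_t>& rep_levels) {
  int64_t records = 0;
  for (int16_t level : rep_levels) {
    if (level == 0) ++records;
  }
  return records;
}

// Build metadata for a row group holding one repeated column.
RowGroupMeta DescribeRowGroup(const std::vector<int16_t>& rep_levels) {
  RowGroupMeta rg;
  rg.num_rows = CountRecords(rep_levels);
  rg.columns.push_back(
      ColumnChunkMeta{static_cast<int64_t>(rep_levels.size())});
  return rg;
}

// The count the report describes as wrong for nested repeated columns.
int64_t ValuesToReadFromRowGroup(const RowGroupMeta& rg) {
  return rg.num_rows;
}

// The count the report proposes: the column's own num_values.
int64_t ValuesToReadFromColumn(const RowGroupMeta& rg, std::size_t column) {
  assert(column < rg.columns.size());
  return rg.columns[column].num_values;
}
```

For the three records `[1, 2, 3]`, `[4]`, `[5, 6]`, the repetition levels are `0, 1, 1, 0, 0, 1`. Calling `DescribeRowGroup` on them gives `num_rows == 3` but `num_values == 6`, so a reader that sizes its read from the row count stops after half of the column's values. For a flat column every level is 0, and the two counts agree.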
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)