Posted to dev@parquet.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/02/23 17:37:44 UTC

[jira] [Resolved] (PARQUET-895) Reading of nested columns is broken

     [ https://issues.apache.org/jira/browse/PARQUET-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved PARQUET-895.
----------------------------------
    Resolution: Fixed

Resolved in https://github.com/apache/parquet-cpp/commit/98f5fa144f7d1e219c37ebc56e69bc0543457d78

> Reading of nested columns is broken
> -----------------------------------
>
>                 Key: PARQUET-895
>                 URL: https://issues.apache.org/jira/browse/PARQUET-895
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Marc Vertes
>            Assignee: Marc Vertes
>
> The problem occurs when reading a nested column with repeated values, especially when that column has many more levels than the number of global rows.
> Citing @peshopetrov, who filed a GitHub pull request identifying the problem and proposing a fix:
> Nested repeated columns' count is incorrectly read from row group's metadata. That's correct in cases where there aren't any nested repeated fields but is generally not correct. Instead the num_values from the column's metadata should be used.
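
The mismatch described above can be illustrated with a small toy model. This is only a sketch, not the parquet-cpp code or its API: the names shred and assemble are hypothetical, only non-empty lists are modelled, and definition levels are ignored. It assumes the usual Parquet convention that repetition level 0 starts a new row, so a repeated column holds one level per value rather than one per row.

```python
# Toy model of a single repeated column; hypothetical helpers, not parquet-cpp.
rows = [[1, 2, 3], [4], [5, 6]]


def shred(rows):
    """Flatten non-empty lists into (repetition_level, value) pairs."""
    levels = []
    for row in rows:
        for i, value in enumerate(row):
            # Level 0 marks the first value of a row, level 1 a continuation.
            levels.append((0 if i == 0 else 1, value))
    return levels


def assemble(levels):
    """Rebuild rows from (repetition_level, value) pairs."""
    out = []
    for rep, value in levels:
        if rep == 0:
            out.append([value])
        else:
            out[-1].append(value)
    return out


levels = shred(rows)
num_rows = len(rows)      # analogous to the row group's row count
num_values = len(levels)  # analogous to the column's num_values

# Reading only num_rows levels stops early and loses data.
truncated = assemble(levels[:num_rows])
# Reading num_values levels recovers every row.
complete = assemble(levels[:num_values])
```

In this model the row count is 3 while the column holds 6 levels, so a reader that stops after 3 levels reconstructs only the first row. For a flat column the two counts coincide, which matches the observation that the row group count is correct when there are no nested repeated fields.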



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)