Posted to dev@parquet.apache.org by "Micah Kornfield (Jira)" <ji...@apache.org> on 2021/07/16 06:57:00 UTC

[jira] [Moved] (PARQUET-2066) [C++][Parquet] num_rows is incorrect for nested types

     [ https://issues.apache.org/jira/browse/PARQUET-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Micah Kornfield moved ARROW-13349 to PARQUET-2066:
--------------------------------------------------

    Component/s:     (was: Parquet)
                     (was: C++)
                 parquet-cpp
            Key: PARQUET-2066  (was: ARROW-13349)
       Workflow: patch-available, re-open possible  (was: jira)
        Project: Parquet  (was: Apache Arrow)

> [C++][Parquet] num_rows is incorrect for nested types
> -----------------------------------------------------
>
>                 Key: PARQUET-2066
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2066
>             Project: Parquet
>          Issue Type: Bug
>          Components: parquet-cpp
>            Reporter: Jorge Leitão
>            Priority: Major
>
> Data pages v2 have:
> * num_rows
> * num_values
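> We currently write num_rows equal to num_values, but the two fields count different things: num_rows is the number of top-level rows in the page, while num_values is the number of values in the page, including nulls.
> For example, for a list column such as "[[0, 1], None, [2, None, 3]]", the correct counts are num_rows = 3 and num_values = 6, but we currently write 6 for both.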
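
The difference between the two counts can be illustrated with a minimal Python sketch. It is not parquet-cpp code: the helper names `repetition_levels` and `page_counts` are hypothetical, and it assumes a single column of optional lists of optional integers. Each entry in the page carries a repetition level, and a level of 0 marks the start of a new top-level row, so num_rows is the number of zeros while num_values is the total number of entries.

```python
def repetition_levels(rows):
    """Repetition levels for a column of optional lists (hypothetical helper)."""
    levels = []
    for row in rows:
        if not row:
            # A null or empty list still occupies one entry, starting a new row.
            levels.append(0)
        else:
            # The first element starts a new row (0); the rest continue the list (1).
            levels.extend([0] + [1] * (len(row) - 1))
    return levels


def page_counts(rows):
    """Return (num_rows, num_values) as a data page v2 header should report them."""
    levels = repetition_levels(rows)
    num_rows = levels.count(0)   # one zero per top-level row
    num_values = len(levels)     # every entry counts, including nulls
    return num_rows, num_values


rows = [[0, 1], None, [2, None, 3]]
print(repetition_levels(rows))  # [0, 1, 0, 0, 1, 1]
print(page_counts(rows))        # (3, 6)
```

Under these assumptions the reported bug corresponds to writing `len(levels)` into both fields instead of counting the zeros for num_rows.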
