Posted to dev@parquet.apache.org by "Micah Kornfield (Jira)" <ji...@apache.org> on 2021/07/16 06:57:00 UTC
[jira] [Moved] (PARQUET-2066) [C++][Parquet] num_rows is incorrect for nested types
[ https://issues.apache.org/jira/browse/PARQUET-2066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Micah Kornfield moved ARROW-13349 to PARQUET-2066:
--------------------------------------------------
Component/s: parquet-cpp (was: Parquet, C++)
Key: PARQUET-2066 (was: ARROW-13349)
Workflow: patch-available, re-open possible (was: jira)
Project: Parquet (was: Apache Arrow)
> [C++][Parquet] num_rows is incorrect for nested types
> -----------------------------------------------------
>
> Key: PARQUET-2066
> URL: https://issues.apache.org/jira/browse/PARQUET-2066
> Project: Parquet
> Issue Type: Bug
> Components: parquet-cpp
> Reporter: Jorge Leitão
> Priority: Major
>
> Data pages v2 have:
> * num_rows
> * num_values
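> We currently write num_rows equal to num_values, but the two fields measure different things: num_rows counts the top-level rows in the page, while num_values counts the values in the page, including nulls.
> Given a list column such as "[[0, 1], None, [2, None, 3]]", num_rows = 3 and num_values = 6. We currently report 6 for both fields.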
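
The distinction can be sketched in a few lines of Python. This is only an illustration of the counting for a single-level list column, not the parquet-cpp writer code; the function names `repetition_levels` and `count_rows_and_values` are hypothetical. It assumes a null or empty list occupies one level entry, and that a row starts wherever the repetition level is 0.

```python
from typing import List, Optional, Tuple

# A single-level list column: each row is a list of nullable ints, or None.
Row = Optional[List[Optional[int]]]


def repetition_levels(column: List[Row]) -> List[int]:
    """Return one repetition level per entry in the page (hypothetical helper).

    Level 0 marks the first entry of a new row; level 1 marks a further
    element of the same list. A null or empty list still takes one entry.
    """
    levels: List[int] = []
    for row in column:
        if not row:  # None or []: one placeholder entry that starts a row
            levels.append(0)
        else:
            levels.append(0)
            levels.extend([1] * (len(row) - 1))
    return levels


def count_rows_and_values(column: List[Row]) -> Tuple[int, int]:
    """Return (num_rows, num_values) as they should differ for nested data."""
    levels = repetition_levels(column)
    num_rows = sum(1 for level in levels if level == 0)  # row boundaries
    num_values = len(levels)  # every entry, nulls included
    return num_rows, num_values


column = [[0, 1], None, [2, None, 3]]
print(repetition_levels(column))      # [0, 1, 0, 0, 1, 1]
print(count_rows_and_values(column))  # (3, 6)
```

For a flat (non-nested) column every repetition level is 0, so the two counts coincide, which is presumably why the discrepancy only appears for nested types.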
--
This message was sent by Atlassian Jira
(v8.3.4#803005)