Posted to issues@arrow.apache.org by "albertoramon (JIRA)" <ji...@apache.org> on 2019/08/06 08:15:00 UTC

[jira] [Closed] (ARROW-6129) Row_groups duplicate Rows

     [ https://issues.apache.org/jira/browse/ARROW-6129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

albertoramon closed ARROW-6129.
-------------------------------
    Resolution: Not A Problem

This is the expected behavior.
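
The attached test01.py is not reproduced in this message, so the following is only a sketch of one plausible cause, assuming the script wrote the full table once per row group. In pyarrow, each ParquetWriter.write_table call appends the table it is given as new row group(s), so writing the same 10-row table twice yields 20 rows. The toy class below is a hypothetical stand-in for such an append-style writer (ToyRowGroupWriter and both helper functions are illustrative names, not pyarrow API) and contrasts that pattern with slicing the rows across groups.

```python
# Toy stand-in for an append-style Parquet writer; NOT the pyarrow API.
class ToyRowGroupWriter:
    def __init__(self):
        self.row_groups = []  # each entry is one row group (a list of rows)

    def write_table(self, rows):
        # Every call appends a new row group holding all rows passed in.
        self.row_groups.append(list(rows))

    @property
    def num_rows(self):
        return sum(len(group) for group in self.row_groups)


def write_full_table_per_group(rows, n_groups):
    """Assumed pattern: the whole table is written once per row group."""
    writer = ToyRowGroupWriter()
    for _ in range(n_groups):
        writer.write_table(rows)
    return writer


def write_sliced(rows, n_groups):
    """Split the rows so that each row lands in exactly one row group."""
    writer = ToyRowGroupWriter()
    size = max(1, -(-len(rows) // n_groups))  # ceiling division, at least 1
    for start in range(0, len(rows), size):
        writer.write_table(rows[start:start + size])
    return writer


rows = list(range(10))  # stands in for the 10-row CSV
print(write_full_table_per_group(rows, 1).num_rows)  # 10
print(write_full_table_per_group(rows, 2).num_rows)  # 20
print(write_sliced(rows, 2).num_rows)                # 10
```

Under this assumption the writer is doing what it was asked to do: the row count doubles because the same rows were handed to it twice, not because row groups themselves duplicate data.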

> Row_groups duplicate Rows
> -------------------------
>
>                 Key: ARROW-6129
>                 URL: https://issues.apache.org/jira/browse/ARROW-6129
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 0.14.1
>            Reporter: albertoramon
>            Priority: Major
>              Labels: parquetWriter
>         Attachments: tes_output.png, test01.py, top10.csv
>
>
> Writing Parquet with more than one row group duplicates rows:
>     Input: CSV with 10 rows
>     Row_Groups=1 --> Output: 10 rows
>     Row_Groups=2 --> Output: 20 rows
>   !tes_output.png!
> Is this the expected behavior?
> Code snippet and CSV are attached.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)