Posted to jira@arrow.apache.org by "Matthias Rosenthaler (Jira)" <ji...@apache.org> on 2021/02/16 10:23:00 UTC

[jira] [Comment Edited] (ARROW-11629) [C++] Writing float32 values makes parquet files not readable for some tools

    [ https://issues.apache.org/jira/browse/ARROW-11629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17285123#comment-17285123 ] 

Matthias Rosenthaler edited comment on ARROW-11629 at 2/16/21, 10:22 AM:
-------------------------------------------------------------------------

# I tried it with pyarrow 0.17.0 too, same problem
 # The parquet file is generated with [https://github.com/G-Research/ParquetSharp] (latest version)
 # Same problem for your foo.parquet file

As I said, writing the file with pandas.to_parquet and engine fastparquet works. Diffing the two output files (pyarrow vs. fastparquet) might reveal what differs.


was (Author: matthros):
# I tried it with pyarrow 0.17.0 too, same problem
 # The parquet file is generated with [https://github.com/G-Research/ParquetSharp] (latest version)
 # Same problem for your foo.parquet file

> [C++] Writing float32 values makes parquet files not readable for some tools
> ----------------------------------------------------------------------------
>
>                 Key: ARROW-11629
>                 URL: https://issues.apache.org/jira/browse/ARROW-11629
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 3.0.0
>            Reporter: Matthias Rosenthaler
>            Priority: Major
>         Attachments: foo.parquet, image-2021-02-15-15-49-41-908.png, output.csv, output.parquet
>
>
> If I read the attached CSV file with pyarrow, change the float64 columns to float32, and export it to Parquet, the resulting Parquet file is corrupted. It can no longer be read by Apache Drill or Parquet.Net.


