Posted to jira@arrow.apache.org by "Fleur Kelpin (Jira)" <ji...@apache.org> on 2020/11/16 20:41:00 UTC
[jira] [Created] (ARROW-10623) Arrow 1.0.1 cannot read parquet file written by arrow 2.0.0
Fleur Kelpin created ARROW-10623:
------------------------------------
Summary: Arrow 1.0.1 cannot read parquet file written by arrow 2.0.0
Key: ARROW-10623
URL: https://issues.apache.org/jira/browse/ARROW-10623
Project: Apache Arrow
Issue Type: Bug
Components: R
Affects Versions: 2.0.0, 1.0.1
Reporter: Fleur Kelpin
h4. How to reproduce
* Create a data frame
* Write it to a parquet file using arrow 2.0.0. The demo uses R 3.6, but the same happens with R 4.0
* Read the parquet file using arrow 1.0.1. I only tried that in R 3.6
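A minimal sketch of these steps in R, assuming two separate R sessions (one with arrow 2.0.0 installed, one with arrow 1.0.1) that share a working directory. The file name {{demo.parquet}} is a placeholder and is not taken from the demo project:
{noformat}
# Session 1: arrow 2.0.0 is assumed to be installed here
library(arrow)
stopifnot(packageVersion("arrow") == "2.0.0")
df <- data.frame(col1 = 1:100)
write_parquet(df, "demo.parquet")  # placeholder file name

# Session 2: arrow 1.0.1 is assumed to be installed here
library(arrow)
stopifnot(packageVersion("arrow") == "1.0.1")
df <- read_parquet("demo.parquet")
dput(df)  # print the structure so it can be compared with the expected one
{noformat}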
h4. Expected
The data frame is the same as it was before:
{noformat}
structure(list(col1 = 1:100), row.names = c(NA, 100L), class = "data.frame"){noformat}
h4. Actual
The data frame has lost some information:
{noformat}
structure(list(1:100), class = "data.frame"){noformat}
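One way to see which attributes differ, as a sketch in base R that rebuilds both structures from the literals above rather than from a parquet file (the variable names are illustrative):
{noformat}
expected <- structure(list(col1 = 1:100), row.names = c(NA, 100L), class = "data.frame")
actual <- structure(list(1:100), class = "data.frame")
# Attributes present on the expected data frame but missing from the actual one
setdiff(names(attributes(expected)), names(attributes(actual)))
{noformat}
Going by the literals, the column name and the row names are what is missing.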
h4. Demo
I'm not sure what the easiest way is to put up a demo project for this, since you need to switch between arrow installations, but I've created this Docker-based demo:
[https://github.com/fdlk/arrow2/]
--
This message was sent by Atlassian Jira
(v8.3.4#803005)