Posted to jira@arrow.apache.org by "Fleur Kelpin (Jira)" <ji...@apache.org> on 2020/11/16 20:41:00 UTC

[jira] [Created] (ARROW-10623) Arrow 1.0.1 cannot read parquet file written by arrow 2.0.0

Fleur Kelpin created ARROW-10623:
------------------------------------

             Summary: Arrow 1.0.1 cannot read parquet file written by arrow 2.0.0
                 Key: ARROW-10623
                 URL: https://issues.apache.org/jira/browse/ARROW-10623
             Project: Apache Arrow
          Issue Type: Bug
          Components: R
    Affects Versions: 2.0.0, 1.0.1
            Reporter: Fleur Kelpin


h4. How to reproduce
 * Create a data frame
 * Write it to a parquet file using arrow 2.0.0. The demo uses R 3.6, but the same happens with R 4.0
 * Read the parquet file using arrow 1.0.1. I only tried that in R 3.6
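
The steps above can be sketched in R roughly as follows. This is a minimal sketch rather than the code from the demo project: the file name {{demo.parquet}} and the helper names {{make_demo_frame}}, {{write_demo}} and {{read_demo}} are made up for illustration, and the write and read steps are assumed to run in two separate R sessions with different arrow versions installed.

```r
# Sketch only: the file name and function names are hypothetical.
# Calls are namespace-qualified, so sourcing this file does not load arrow.

make_demo_frame <- function() {
  # Step 1: a data frame with one integer column and 100 rows
  data.frame(col1 = 1:100)
}

write_demo <- function(path = "demo.parquet") {
  # Step 2: run in a session where arrow 2.0.0 is installed
  message("writing with arrow ", as.character(utils::packageVersion("arrow")))
  arrow::write_parquet(make_demo_frame(), path)
  invisible(path)
}

read_demo <- function(path = "demo.parquet") {
  # Step 3: run in a separate session where arrow 1.0.1 is installed
  message("reading with arrow ", as.character(utils::packageVersion("arrow")))
  df <- arrow::read_parquet(path)
  dput(df)  # print the structure so it can be compared with the original
  invisible(df)
}
```

Calling {{write_demo()}} in the first session and {{read_demo()}} in the second should print a structure that can be compared with the expected one below.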

h4. Expected

The data frame is the same as it was before:
{noformat}
structure(list(col1 = 1:100), row.names = c(NA, 100L), class = "data.frame"){noformat}
h4. Actual

The data frame has lost its column name and its row names:
{noformat}
structure(list(1:100), class = "data.frame"){noformat}
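
To make the difference explicit, the two structures can be compared attribute by attribute in base R. This is only an illustrative sketch: it rebuilds both objects from the {{dput}} output quoted in this report, and the variable names are made up.

```r
# Rebuild both objects from the dput output quoted in this report.
expected <- structure(list(col1 = 1:100), row.names = c(NA, 100L),
                      class = "data.frame")
actual <- structure(list(1:100), class = "data.frame")

# Attributes present on the expected object but missing on the actual one.
lost <- setdiff(names(attributes(expected)), names(attributes(actual)))
print(lost)  # the "names" and "row.names" attributes are the ones missing
```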
h4. Demo

I'm not sure what the easiest way to put up a demo project for this is, since you need to switch between arrow installations, but I've created this Docker-based demo:

[https://github.com/fdlk/arrow2/]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)