Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/06/09 17:46:46 UTC

[GitHub] [arrow] nealrichardson commented on issue #7385: bad handling of Int64 column of dataframe when reading in R with read_feather

nealrichardson commented on issue #7385:
URL: https://github.com/apache/arrow/issues/7385#issuecomment-641472127


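   This appears to be a feature of how `data.matrix()` interacts with `bit64::integer64` class objects. An `integer64` vector stores its 64-bit integers in the storage of a double vector, so when the class is dropped the same bits get read as (very small) doubles. Here's a reprex without involving `arrow`: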
   
   ```
   > df <- data.frame(a=bit64::as.integer64(1:2), b=bit64::as.integer64(3:4))
   > df
     a b
   1 1 3
   2 2 4
   > data.matrix(df)
                    a             b
   [1,] 4.940656e-324 1.482197e-323
   [2,] 9.881313e-324 1.976263e-323
   ```
   
   You could fix this in your example either by providing a schema in Python with int32 types, or by calling `as.integer` on the columns of your data.frame before calling `data.matrix()`.
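
   A minimal sketch of the second option, assuming the `bit64` package is installed and that every value fits in a 32-bit integer. The `df` below is the same toy data frame as in the reprex, and `df_int` is just an illustrative name:

   ```r
   # Same toy data as in the reprex above (requires the bit64 package).
   df <- data.frame(a = bit64::as.integer64(1:2), b = bit64::as.integer64(3:4))

   # Convert only the integer64 columns to base R integers and leave
   # every other column untouched. Values outside the int32 range would
   # become NA here, so this assumes the data fits.
   df_int <- df
   df_int[] <- lapply(df_int, function(col) {
     if (inherits(col, "integer64")) as.integer(col) else col
   })

   # data.matrix() now sees ordinary integer columns.
   m <- data.matrix(df_int)
   print(m)
   ```

   Using `df_int[] <- lapply(...)` keeps the data.frame structure and column names while replacing the column contents.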
   
   One could argue that we should downcast int64 to int32 when there are no out-of-bounds values, since int32 is what R can natively handle. I made https://issues.apache.org/jira/browse/ARROW-9083 to consider that.
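
   To illustrate what such a check could look like on the R side, here is a rough sketch. `fits_int32()` and `maybe_downcast()` are hypothetical helpers written for this comment, not functions from `arrow` or `bit64`, and this is not necessarily how ARROW-9083 would be implemented:

   ```r
   # Hypothetical helper: TRUE if every non-missing value of an integer64
   # vector lies within the range of a base R (32-bit) integer.
   fits_int32 <- function(x) {
     hi <- bit64::as.integer64(.Machine$integer.max)
     lo <- bit64::as.integer64(-.Machine$integer.max)
     all(x >= lo & x <= hi, na.rm = TRUE)
   }

   # Hypothetical helper: downcast only when it is safe, otherwise
   # return the integer64 vector unchanged.
   maybe_downcast <- function(x) {
     if (inherits(x, "integer64") && fits_int32(x)) as.integer(x) else x
   }

   small <- bit64::as.integer64(1:2)
   big <- bit64::as.integer64(2^31)  # one past the largest 32-bit integer

   small_out <- maybe_downcast(small)  # becomes a plain integer vector
   big_out <- maybe_downcast(big)      # stays integer64
   ```

   The lower bound is `-.Machine$integer.max` rather than `-2^31` because R reserves the smallest 32-bit value for `NA_integer_`.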


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org