Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2021/06/01 16:51:00 UTC

[jira] [Assigned] (ARROW-12917) [R][pyarrow] py_to_r error reading decimal(5,0) from pyarrow.Table

     [ https://issues.apache.org/jira/browse/ARROW-12917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Antoine Pitrou reassigned ARROW-12917:
--------------------------------------

    Assignee: Antoine Pitrou

> [R][pyarrow] py_to_r error reading decimal(5,0) from pyarrow.Table
> ------------------------------------------------------------------
>
>                 Key: ARROW-12917
>                 URL: https://issues.apache.org/jira/browse/ARROW-12917
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python, R
>    Affects Versions: 3.0.0
>            Reporter: Andreas Frieed
>            Assignee: Antoine Pitrou
>            Priority: Major
>
> In my R notebook, I try to read data from a Db2 database ("SELECT CAST (15 as decimal(5,0)) FROM sysibm.sysdummy1") into an R dataframe, leveraging arrow/flight.
> Integers, varchars, etc. can be loaded without issues, but when I use a decimal type, an error is thrown.
>  
> Here is the code I'm running:
>  
> {code:java}
> library("reticulate")
> library("arrow")
> itcfs <- import("...")
> readClient <- itcfs$get_flight_client()
> Manual_data_request = ....
> flightInfo <- itcfs$get_flight_info(readClient,data_request=Manual_data_request)
> tables <- itcfs$read_tables(readClient, flightInfo)
> ...
>  
> {code}
>  
> The itcfs package is implemented in Python, and here is the read_tables method:
>  
> {code:python}
> def read_tables(read_client, flight_info):
>     """Read a list of pyarrow.Table"""
>     tables = []
>     for endpoint in flight_info.endpoints:
>         reader = read_client.do_get(endpoint.ticket)
>         batches = [b.data for b in reader]
>         tables.append(pa.Table.from_batches(batches))
>     return tables
> {code}
>  
> This is the error message:
>  
> {code:java}
> Error: Invalid: Invalid or unsupported format string: 'd:5,0'
> Traceback:
> 1. itcfs$read_tables(readClient, flightInfo)
> 2. py_to_r(result)
> 3. py_to_r.python.builtin.list(result)
> 4. lapply(converted, function(object) {
>  . if (inherits(object, "python.builtin.object")) 
>  . py_to_r(object)
>  . else object
>  . })
> 5. FUN(X[[i]], ...)
> 6. py_to_r(object)
> 7. py_to_r.pyarrow.lib.Table(object)
> 8. maybe_py_to_r(x$columns)
> 9. x$columns
> 10. `$.python.builtin.object`(x, "columns")
> 11. py_get_attr_or_item(x, name, TRUE)
> 12. py_maybe_convert(object, py_has_convert(x))
> 13. py_to_r(x)
> 14. py_to_r.python.builtin.list(x)
> 15. lapply(converted, function(object) {
>  . if (inherits(object, "python.builtin.object")) 
>  . py_to_r(object)
>  . else object
>  . })
> 16. FUN(X[[i]], ...)
> 17. py_to_r(object)
> 18. py_to_r.pyarrow.lib.ChunkedArray(object)
> 19. ChunkedArray$create(!!!maybe_py_to_r(x$chunks))
> 20. ChunkedArray__from_list(list2(...), type)
> 21. list2(...)
> 22. maybe_py_to_r(x$chunks)
> 23. x$chunks
> 24. `$.python.builtin.object`(x, chunks)
> 25. py_get_attr_or_item(x, name, TRUE)
> 26. py_maybe_convert(object, py_has_convert(x))
> 27. py_to_r(x)
> 28. py_to_r.python.builtin.list(x)
> 29. lapply(converted, function(object) {
>  . if (inherits(object, "python.builtin.object")) 
>  . py_to_r(object)
>  . else object
>  . })
> 30. FUN(X[[i]], ...)
> 31. py_to_r(object)
> 32. py_to_r.pyarrow.lib.Array(object)
> 33. ImportArray(array_ptr, schema_ptr)
> {code}
>  
>  
> In a pure Python environment, decimal data can be read without issues.
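
For readers unfamiliar with the string in the error above: 'd:5,0' appears to follow the Arrow C data interface convention for decimal types, "d:precision,scale" for 128-bit decimals, with an optional third field for other bit widths. The sketch below only illustrates what such a string encodes, under that assumption. It is not Arrow's importer and says nothing about the cause of this bug; parse_decimal_format is a hypothetical helper introduced here for illustration.

```python
import re

# Assumption: the layout follows the Arrow C data interface format strings,
# "d:precision,scale" (128-bit) or "d:precision,scale,bitwidth".
_DECIMAL_FORMAT = re.compile(r"d:(\d+),(-?\d+)(?:,(\d+))?")


def parse_decimal_format(fmt):
    """Hypothetical helper: split a decimal format string into its parts."""
    match = _DECIMAL_FORMAT.fullmatch(fmt)
    if match is None:
        raise ValueError(f"not a decimal format string: {fmt!r}")
    precision = int(match.group(1))
    scale = int(match.group(2))
    # When no bit width is given, the decimal is assumed to be 128-bit.
    bit_width = int(match.group(3)) if match.group(3) else 128
    return precision, scale, bit_width


print(parse_decimal_format("d:5,0"))  # (5, 0, 128)
```

Read this way, the string from the error message describes the decimal(5,0) column from the query: precision 5, scale 0.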



--
This message was sent by Atlassian Jira
(v8.3.4#803005)