Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/06/25 15:06:00 UTC

[jira] [Updated] (ARROW-13186) [R] Implement type determination more cleanly

     [ https://issues.apache.org/jira/browse/ARROW-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ian Cook updated ARROW-13186:
-----------------------------
    Description: 
In the R package, there are several improvements in data type determination in the 5.0.0 release. The implementation of these improvements relies on a kludge: a {{Schema}} can now be stored in an {{Expression}} object in the R package; when set, this {{Schema}} is retained in derivative {{Expression}} objects. This was the most convenient way to make the {{Schema}} available to pass to the {{type_id()}} method, which requires it. But this makes the R package's {{Expression}} object deviate from the C++ library's {{Expression}} object, and it makes our type determination functions work differently from the other R functions in {{nse_funcs}}.

The Jira issues in which these somewhat kludgy improvements were made are:
 * allowing a schema to be stored in the {{Expression}} object, and implementing type determination functions in a way that uses that schema (ARROW-12781)
 * retaining a schema in derivative {{Expression}} objects (ARROW-13117)
 * setting an empty schema in scalar literal {{Expression}} objects (ARROW-13119)
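
The three changes listed above can be pictured with a small toy model. This is only a sketch in plain Python, not the R package's or the C++ library's code: the {{Expression}} class, {{field_ref}}, {{scalar}}, {{call}}, and {{type_of}} below are hypothetical stand-ins, the schema is a plain dict of field name to type name, and the rule that a call takes the type of its first argument is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

Schema = Dict[str, str]  # toy schema: field name -> type name


@dataclass
class Expression:
    """Hypothetical stand-in for an expression that carries a schema."""
    kind: str      # "field", "scalar", or "call"
    value: object  # field name, literal value, or function name
    schema: Schema  # the extra attribute that the issue calls a kludge
    args: Tuple["Expression", ...] = ()


def field_ref(name: str, schema: Schema) -> Expression:
    # A schema is stored in the expression (compare ARROW-12781).
    return Expression("field", name, dict(schema))


def scalar(value: object) -> Expression:
    # Scalar literals get an empty schema (compare ARROW-13119).
    return Expression("scalar", value, {})


def call(func_name: str, *args: Expression) -> Expression:
    # Derivative expressions retain the schema of their arguments
    # (compare ARROW-13117). Merging the dicts is an assumption of this toy.
    merged: Schema = {}
    for arg in args:
        merged.update(arg.schema)
    return Expression("call", func_name, merged, args)


def type_of(expr: Expression) -> str:
    # Type determination needs the schema to resolve field references.
    if expr.kind == "field":
        return expr.schema[expr.value]
    if expr.kind == "scalar":
        return type(expr.value).__name__
    # Toy rule (assumption): a call takes the type of its first argument.
    return type_of(expr.args[0])


schema = {"x": "int32", "y": "double"}
derived = call("add", field_ref("x", schema), scalar(1))
print(derived.schema)    # {'x': 'int32', 'y': 'double'}
print(type_of(derived))  # int32
```

In this toy the schema travels with every expression, which is what lets {{type_of}} work without further input, and it is also what makes this {{Expression}} differ from one that holds only the expression itself.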

From the perspective of the R package, an ideal way to implement type determination functions would be to call a {{type_id}} kernel through the {{call_function}} interface, but this was rejected in ARROW-13167. Consider other ways that we might improve this implementation.

  was:
In the R package, there are several improvements in data type determination in the 5.0.0 release. The implementation of these improvements relies on a kludge: a {{Schema}} can now be stored in an {{Expression}} object in the R package; when set, this {{Schema}} is retained in derivative {{Expression}}s. This was the most convenient way to make the {{Schema}} available to pass to the {{type_id()}} method, which requires it. But this makes the R package's {{Expression}} object deviate from the C++ library's {{Expression}} object, and it makes our type determination functions work differently from the other R functions in {{nse_funcs}}.

The Jira issues in which these somewhat kludgy improvements were made are:
 * allowing a schema to be stored in the {{Expression}} object, and implementing type determination functions in a way that uses that schema (ARROW-12781)
 * retaining a schema in derivative {{Expression}} objects (ARROW-13117)
 * setting an empty schema in scalar literal {{Expression}} objects (ARROW-13119)

From the perspective of the R package, an ideal way to implement type determination functions would be to call a {{type_id}} kernel through the {{call_function}} interface, but this was rejected in ARROW-13167. Consider other ways that we might improve this implementation.


> [R] Implement type determination more cleanly
> ---------------------------------------------
>
>                 Key: ARROW-13186
>                 URL: https://issues.apache.org/jira/browse/ARROW-13186
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: R
>    Affects Versions: 5.0.0
>            Reporter: Ian Cook
>            Priority: Major
>
> In the R package, there are several improvements in data type determination in the 5.0.0 release. The implementation of these improvements relies on a kludge: a {{Schema}} can now be stored in an {{Expression}} object in the R package; when set, this {{Schema}} is retained in derivative {{Expression}} objects. This was the most convenient way to make the {{Schema}} available to pass to the {{type_id()}} method, which requires it. But this makes the R package's {{Expression}} object deviate from the C++ library's {{Expression}} object, and it makes our type determination functions work differently from the other R functions in {{nse_funcs}}.
> The Jira issues in which these somewhat kludgy improvements were made are:
>  * allowing a schema to be stored in the {{Expression}} object, and implementing type determination functions in a way that uses that schema (ARROW-12781)
>  * retaining a schema in derivative {{Expression}} objects (ARROW-13117)
>  * setting an empty schema in scalar literal {{Expression}} objects (ARROW-13119)
> From the perspective of the R package, an ideal way to implement type determination functions would be to call a {{type_id}} kernel through the {{call_function}} interface, but this was rejected in ARROW-13167. Consider other ways that we might improve this implementation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)