Posted to jira@arrow.apache.org by "Ian Cook (Jira)" <ji...@apache.org> on 2021/06/25 15:06:00 UTC
[jira] [Updated] (ARROW-13186) [R] Implement type determination more cleanly
[ https://issues.apache.org/jira/browse/ARROW-13186?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ian Cook updated ARROW-13186:
-----------------------------
Description:
In the R package, there are several improvements in data type determination in the 5.0.0 release. The implementation of these improvements used a kludge: They made it possible to store a {{Schema}} in an {{Expression}} object in the R package; when set, this {{Schema}} is retained in derivative {{Expression}} objects. This was the most convenient way to make the {{Schema}} available for passing it to the {{type_id()}} method, which requires it. But this introduces a deviation of the R package's {{Expression}} object from the C++ library's {{Expression}} object, and it makes our type determination functions work differently than the other R functions in {{nse_funcs}}.
The Jira issues in which these somewhat kludgy improvements were made are:
* allowing a schema to be stored in the {{Expression}} object, and implementing type determination functions in a way that uses that schema (ARROW-12781)
* retaining a schema in derivative {{Expression}} objects (ARROW-13117)
* setting an empty schema in scalar literal {{Expression}} objects (ARROW-13119)
From the perspective of the R package, an ideal way to implement type determination functions would be to call a {{type_id}} kernel through the {{call_function}} interface, but this was rejected in ARROW-13167. Consider other ways that we might improve this implementation.
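The schema-carrying pattern described above can be sketched with a small toy model. This is only an illustration written in Python, not the arrow R package code: the {{Expression}} class, {{field_ref}}, {{literal}}, {{call}} and {{type_name}} below are hypothetical names, the schema is modeled as a plain dict of field name to type name, and the "result takes the type of its first argument" rule is a simplifying assumption.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Toy stand-in for a Schema: maps field names to type names.
Schema = Dict[str, str]


@dataclass
class Expression:
    """Hypothetical toy expression node, not the arrow R package class."""
    kind: str                           # "field", "literal", or "call"
    value: object = None                # field name, literal value, or function name
    args: Tuple["Expression", ...] = ()
    schema: Optional[Schema] = None     # the extra slot the description calls a kludge


def field_ref(name, schema=None):
    # A field reference may optionally have a schema stored on it (cf. ARROW-12781).
    return Expression("field", name, schema=schema)


def literal(value):
    # Scalar literals carry an empty schema (cf. ARROW-13119).
    return Expression("literal", value, schema={})


def call(func, *args):
    # Derivative expressions retain a schema from their arguments (cf. ARROW-13117):
    # the first non-empty one, else an empty one, else none at all.
    known = [a.schema for a in args if a.schema is not None]
    if not known:
        schema = None
    else:
        schema = next((s for s in known if s), {})
    return Expression("call", func, args=tuple(args), schema=schema)


def _resolve(expr, schema):
    if expr.kind == "field":
        return schema[expr.value]
    if expr.kind == "literal":
        return type(expr.value).__name__
    # Toy rule (an assumption): a call has the type of its first argument.
    return _resolve(expr.args[0], schema)


def type_name(expr):
    # Type determination needs a schema, so it fails when none was stored.
    if expr.schema is None:
        raise ValueError("no schema stored on this expression")
    return _resolve(expr, expr.schema)


schema = {"x": "int32", "y": "string"}
derived = call("add", field_ref("x", schema), literal(1))
print(type_name(derived))
```

The sketch also shows the cost noted in the description: type determination here depends on state stored on the expression itself, so it behaves differently from functions that only need the expression and its arguments.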
was:
In the R package, there are several improvements in data type determination in the 5.0.0 release. The implementation of these improvements used a kludge: They made it possible to store a {{Schema}} in an {{Expression}} object in the R package; when set, this {{Schema}} is retained in derivative {{Expression}}s. This was the most convenient way to make the {{Schema}} available for passing it to the {{type_id()}} method, which requires it. But this introduces a deviation of the R package's {{Expression}} object from the C++ library's {{Expression}} object, and it makes our type determination functions work differently than the other R functions in {{nse_funcs}}.
The Jira issues in which these somewhat kludgy improvements were made are:
* allowing a schema to be stored in the {{Expression}} object, and implementing type determination functions in a way that uses that schema (ARROW-12781)
* retaining a schema in derivative {{Expression}} objects (ARROW-13117)
* setting an empty schema in scalar literal {{Expression}} objects (ARROW-13119)
From the perspective of the R package, an ideal way to implement type determination functions would be to call a {{type_id}} kernel through the {{call_function}} interface, but this was rejected in ARROW-13167. Consider other ways that we might improve this implementation.
> [R] Implement type determination more cleanly
> ---------------------------------------------
>
> Key: ARROW-13186
> URL: https://issues.apache.org/jira/browse/ARROW-13186
> Project: Apache Arrow
> Issue Type: Improvement
> Components: R
> Affects Versions: 5.0.0
> Reporter: Ian Cook
> Priority: Major