Posted to issues@arrow.apache.org by "Neal Richardson (Jira)" <ji...@apache.org> on 2022/09/04 13:21:00 UTC
[jira] [Created] (ARROW-17609) [R] Streamline some C++ calls
Neal Richardson created ARROW-17609:
---------------------------------------
Summary: [R] Streamline some C++ calls
Key: ARROW-17609
URL: https://issues.apache.org/jira/browse/ARROW-17609
Project: Apache Arrow
Issue Type: New Feature
Components: R
Reporter: Neal Richardson
Assignee: Neal Richardson
While looking at profiling data for the TPC-H queries on ARROW-17462, I noticed some added overhead from the extra expression type calculation (not a ton: tens of ms, but enough to trigger benchmark regressions on small data). It's not a huge deal, but I saw a few places where we could avoid doing unnecessary work:
* Memoize Expression$type calculation
* Defer Expression$schema determination (calls UnifySchema on expression args' schemas)--most expressions don't ever need it (ARROW-13186)
* Set Expression$scalar type at creation so we don't have to query it
* Eliminate the .fields() R function and move its logic into the Schema constructor--it creates a bunch of Field R6 objects that are immediately dropped
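
The first three items above amount to memoizing, deferring, and pre-setting values. Here is a rough, language-neutral sketch of that pattern in Python. It is not the arrow R package's code: the Expression class, compute_type, unify_schemas, field, scalar, and the CALLS counter are all hypothetical stand-ins, and the real change would be made in the R6 bindings.

```python
from functools import cached_property

# Counts calls into the (hypothetical) expensive helpers so the savings are visible.
CALLS = {"type": 0, "schema": 0}


def compute_type(op, args):
    """Hypothetical stand-in for the C++ call that derives an expression's type."""
    CALLS["type"] += 1
    return args[0].type  # toy rule: the result has the type of the first argument


def unify_schemas(args):
    """Hypothetical stand-in for unifying the schemas of an expression's arguments."""
    CALLS["schema"] += 1
    merged = {}
    for arg in args:
        merged.update(arg.schema)
    return merged


class Expression:
    def __init__(self, op, args=(), schema=None, known_type=None):
        self.op = op
        self.args = tuple(args)
        self._own_schema = schema
        if known_type is not None:
            # Type known at creation: store it where cached_property looks first,
            # so compute_type is never called for this object.
            self.__dict__["type"] = known_type

    @cached_property
    def type(self):
        # Memoized: computed on first access, then reused.
        return compute_type(self.op, self.args)

    @cached_property
    def schema(self):
        # Deferred: nothing is unified unless a caller actually asks for the schema.
        if self._own_schema is not None:
            return self._own_schema
        return unify_schemas(self.args)


def field(name, type_name):
    """A column reference whose type and schema are known up front."""
    return Expression("field", schema={name: type_name}, known_type=type_name)


def scalar(value, type_name):
    """A literal whose type is set at creation instead of being queried later."""
    return Expression("scalar", schema={}, known_type=type_name)
```

With this sketch, building Expression("add", [field("x", "int64"), scalar(1, "int64")]) does no type or schema work at construction time. Reading .type repeatedly calls compute_type once, and unify_schemas runs only if .schema is read.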