Posted to jira@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2020/09/17 15:04:00 UTC
[jira] [Comment Edited] (ARROW-10026) [C++] Improve kernel performance on small batches
[ https://issues.apache.org/jira/browse/ARROW-10026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17197739#comment-17197739 ]
Wes McKinney edited comment on ARROW-10026 at 9/17/20, 3:03 PM:
----------------------------------------------------------------
IMHO we should consider a slimmed-down data structure for the implementation of {{ExecBatch}} that does not use {{arrow::util::variant}}, given that the only value types we will ever have are {{ArrayData}} and {{Scalar}}. The overhead of slicing {{ArrayData}} objects is also non-trivial.
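
To make the suggestion above concrete, here is a minimal sketch of what a two-alternative value type could look like. It is not Arrow code: {{ArrayData}} and {{Scalar}} are reduced to hypothetical stand-in structs so the example compiles on its own, and {{SlimValue}} and {{SlimExecBatch}} are names invented for illustration. The sketch only shows the dispatch side (a null check instead of a generic variant visit); it does not address the cost of slicing {{ArrayData}}, and whether it is actually faster would have to be measured.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-ins for arrow::ArrayData and arrow::Scalar,
// reduced to the fields this sketch needs.
struct ArrayData {
  int64_t length = 0;
  int64_t offset = 0;
};
struct Scalar {
  int64_t value = 0;
};

// Hypothetical slim value type: exactly two alternatives, so a pair of
// pointers and a null check stand in for a general-purpose variant.
class SlimValue {
 public:
  explicit SlimValue(std::shared_ptr<ArrayData> array) : array_(std::move(array)) {
    assert(array_ != nullptr);
  }
  explicit SlimValue(std::shared_ptr<Scalar> scalar) : scalar_(std::move(scalar)) {
    assert(scalar_ != nullptr);
  }

  bool is_array() const { return array_ != nullptr; }
  bool is_scalar() const { return scalar_ != nullptr; }

  const ArrayData& array() const {
    assert(is_array());
    return *array_;
  }
  const Scalar& scalar() const {
    assert(is_scalar());
    return *scalar_;
  }

  // In this sketch a scalar reports length 1; an array reports its own length.
  int64_t length() const { return is_array() ? array_->length : 1; }

 private:
  std::shared_ptr<ArrayData> array_;  // set only for the array alternative
  std::shared_ptr<Scalar> scalar_;    // set only for the scalar alternative
};

// Hypothetical batch: a flat vector of slim values plus the batch length.
struct SlimExecBatch {
  std::vector<SlimValue> values;
  int64_t length = 0;
};
```

A caller would build a batch with something like {{batch.values.emplace_back(std::make_shared<ArrayData>())}} and branch on {{is_array()}} inside the kernel dispatch.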
was (Author: wesmckinn):
IMHO we should consider a slimmed-down data structure for {{ExecBatch}} that does not use {{arrow::util::variant}}, given that the only value types we will ever have are {{ArrayData}} and {{Scalar}}. The overhead of slicing {{ArrayData}} objects is also non-trivial.
> [C++] Improve kernel performance on small batches
> -------------------------------------------------
>
> Key: ARROW-10026
> URL: https://issues.apache.org/jira/browse/ARROW-10026
> Project: Apache Arrow
> Issue Type: Task
> Components: C++
> Reporter: Antoine Pitrou
> Priority: Major
>
> It seems that invoking some kernels on smallish batches has quite an overhead:
> {code}
> ArrayArrayKernel<Add, Int32Type>/32768/100     2860 ns    2859 ns   245195 bytes_per_second=10.6727G/s items_per_second=2.86494G/s null_percent=1 size=32.768k
> ArrayArrayKernel<Add, Int32Type>/32768/0       2752 ns    2751 ns   249316 bytes_per_second=11.093G/s items_per_second=2.97775G/s null_percent=0 size=32.768k
> ArrayArrayKernel<Add, Int32Type>/524288/100   18633 ns   18630 ns    36548 bytes_per_second=26.2097G/s items_per_second=7.03561G/s null_percent=1 size=524.288k
> ArrayArrayKernel<Add, Int32Type>/524288/0     18260 ns   18257 ns    38245 bytes_per_second=26.7451G/s items_per_second=7.17933G/s null_percent=0 size=524.288k
> {code}
> We should investigate and try to lighten the overhead.
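
One rough way to put a number on the overhead in the benchmark output quoted above is to fit a straight line, time = fixed cost + per-byte cost * size, through the two null-free measurements. This is an assumption made for illustration, not something stated in the issue: a linear model ignores cache effects, which may well differ between a 32 KiB and a 512 KiB input. The {{size}} argument is treated as bytes, which is consistent with the reported {{bytes_per_second}} values. {{LinearCost}} and {{FitTwoPoints}} are hypothetical helper names.

```cpp
#include <cassert>

// Result of the fit: time_ns = fixed_ns + ns_per_byte * bytes.
struct LinearCost {
  double ns_per_byte;
  double fixed_ns;
};

// Fit a straight line through two (bytes, nanoseconds) measurements.
// The intercept is a crude estimate of the size-independent cost per call.
inline LinearCost FitTwoPoints(double bytes_a, double ns_a,
                               double bytes_b, double ns_b) {
  assert(bytes_a != bytes_b);  // two distinct sizes are required
  const double slope = (ns_b - ns_a) / (bytes_b - bytes_a);
  const double intercept = ns_a - slope * bytes_a;
  return LinearCost{slope, intercept};
}
```

Calling {{FitTwoPoints(32768, 2752, 524288, 18260)}} gives a slope of roughly 0.032 ns per byte and an intercept of roughly 1.7 microseconds. Under this model, well over half of the 2.75 microseconds spent on the 32768-byte batch would be size-independent cost, though the caveat about cache effects means the figure should be read as an order of magnitude only.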
--
This message was sent by Atlassian Jira
(v8.3.4#803005)