Posted to jira@arrow.apache.org by "Antoine Pitrou (Jira)" <ji...@apache.org> on 2022/10/21 17:40:00 UTC

[jira] [Comment Edited] (ARROW-17783) [C++] Aggregate kernel should not mandate alignment

    [ https://issues.apache.org/jira/browse/ARROW-17783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17622379#comment-17622379 ] 

Antoine Pitrou edited comment on ARROW-17783 at 10/21/22 5:39 PM:
------------------------------------------------------------------

The idiomatic way to work around the alignment issue is to use {{memcpy}}. We have dedicated functions for that:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/ubsan.h

Note that modern CPUs generally don't mind unaligned loads of scalar types. The problem is only that dereferencing a misaligned pointer is UB in C/C++.
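
As a rough illustration of the {{memcpy}} approach, here is a minimal standalone sketch. The names {{LoadUnaligned}} and {{SumInt64Unaligned}} are hypothetical and made up for this example; they are not the helpers defined in ubsan.h. The idea is to copy the bytes into a properly aligned local instead of dereferencing a possibly misaligned pointer:

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical helper (not Arrow's actual API): load a T from a possibly
// misaligned address by copying its bytes into a properly aligned local.
template <typename T>
T LoadUnaligned(const void* src) {
  static_assert(std::is_trivially_copyable<T>::value,
                "memcpy-based loads require a trivially copyable type");
  T value;
  std::memcpy(&value, src, sizeof(T));
  return value;
}

// Hypothetical example: sum `count` int64 values starting at `data`,
// which does not need to be 8-byte aligned.
int64_t SumInt64Unaligned(const uint8_t* data, std::size_t count) {
  int64_t total = 0;
  for (std::size_t i = 0; i < count; ++i) {
    total += LoadUnaligned<int64_t>(data + i * sizeof(int64_t));
  }
  return total;
}
```

Optimizing compilers typically turn a fixed-size {{memcpy}} like this into a plain load instruction, so there is usually little or no runtime cost compared to a pointer cast.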



was (Author: pitrou):
The idiomatic to work around the alignment issue is to use {{memcpy}}. We have dedicated functions for that:
https://github.com/apache/arrow/blob/master/cpp/src/arrow/util/ubsan.h

Note that modern CPUs don't mind unaligned loads of scalar types. It's just that it's UB in C/C++.


> [C++] Aggregate kernel should not mandate alignment
> ---------------------------------------------------
>
>                 Key: ARROW-17783
>                 URL: https://issues.apache.org/jira/browse/ARROW-17783
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>    Affects Versions: 6.0.0, 8.0.0
>            Reporter: Yifei Yang
>            Assignee: Weston Pace
>            Priority: Major
>         Attachments: flight-alignment-test.zip
>
>
> When using Arrow's aggregate kernel with a table transferred via Arrow Flight (DoGet), it may crash at arrow::util::CheckAlignment(). With the original data it works fine. It also works if I first serialize the transferred table into bytes and then recreate an Arrow table from those bytes.
> The attached "flight-alignment-test" is a minimal test that reproduces the issue. It computes "sum(total_revenue) group by l_suppkey" on the table returned by "DoGet()". ("DummyNode" serves only as the producer for the aggregate node, since a producer is required to create one.)



--
This message was sent by Atlassian Jira
(v8.20.10#820010)