Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/03/20 01:41:42 UTC

[jira] [Commented] (ARROW-664) Make C++ Arrow serialization deterministic

    [ https://issues.apache.org/jira/browse/ARROW-664?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15932056#comment-15932056 ] 

Wes McKinney commented on ARROW-664:
------------------------------------

[~pcmoritz] Independent of the discussion about the standard memory format in the linked PR, I agree that we should have an option (perhaps the default and only option) to always zero out memory in the C++ implementation.

From a quick scan of the array builders, it appears we are not zeroing the memory for the offsets in {{ListBuilder::Resize}}. I don't see anything else at a glance. This should be reproducible in a test case that fails with some probability (a deterministic failure might be difficult to construct because reading the uninitialized memory is UB).
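
To make the idea concrete, here is a minimal sketch of what zeroing on resize could look like. This is not the actual Arrow code: {{OffsetBuffer}}, its methods, and {{PaddingIsZero}} are hypothetical stand-ins for a builder's offsets buffer, and the growth policy is an assumption made only for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <new>

// Hypothetical stand-in for a builder's offsets buffer (not Arrow's real class).
class OffsetBuffer {
 public:
  OffsetBuffer() = default;
  OffsetBuffer(const OffsetBuffer&) = delete;
  OffsetBuffer& operator=(const OffsetBuffer&) = delete;
  ~OffsetBuffer() { std::free(data_); }

  // Grow to hold new_capacity offsets and zero the newly added tail, so that
  // no uninitialized bytes can later be copied into an output buffer.
  void Resize(int64_t new_capacity) {
    if (new_capacity <= capacity_) return;
    void* p = std::realloc(
        data_, static_cast<std::size_t>(new_capacity) * sizeof(int32_t));
    if (p == nullptr) throw std::bad_alloc();
    data_ = static_cast<int32_t*>(p);
    // realloc preserves the old contents; only the new region needs zeroing.
    std::memset(data_ + capacity_, 0,
                static_cast<std::size_t>(new_capacity - capacity_) *
                    sizeof(int32_t));
    capacity_ = new_capacity;
  }

  // Append one offset, doubling the capacity when full (assumed policy).
  void Append(int32_t offset) {
    if (length_ == capacity_) Resize(capacity_ == 0 ? 8 : capacity_ * 2);
    data_[length_++] = offset;
  }

  const int32_t* data() const { return data_; }
  int64_t length() const { return length_; }
  int64_t capacity() const { return capacity_; }

 private:
  int32_t* data_ = nullptr;
  int64_t length_ = 0;
  int64_t capacity_ = 0;
};

// Returns true if every slot past the logical length is zero. Because Resize
// zeroes the tail, this check never reads uninitialized memory.
bool PaddingIsZero(const OffsetBuffer& buf) {
  for (int64_t i = buf.length(); i < buf.capacity(); ++i) {
    if (buf.data()[i] != 0) return false;
  }
  return true;
}
```

With the {{memset}} in place, a check like {{PaddingIsZero}} is well defined and always passes. Without it, the same check would read indeterminate values, which is why a test for the missing zeroing would likely only fail with some probability.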

> Make C++ Arrow serialization deterministic
> ------------------------------------------
>
>                 Key: ARROW-664
>                 URL: https://issues.apache.org/jira/browse/ARROW-664
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++
>    Affects Versions: 0.2.0
>            Reporter: Philipp Moritz
>            Assignee: Philipp Moritz
>            Priority: Minor
>             Fix For: 0.3.0
>
>
> In the C++ Arrow implementation, uninitialized data created by the builders can currently be written to IPC memory. The goal of this issue is to identify these cases and set the memory to zero.
> Note that most of these cases have already been identified by valgrind and fixed in the past.
> Some of the considerations and benefits are discussed in this GitHub PR:
> https://github.com/apache/arrow/pull/397


