Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/04/08 13:56:41 UTC

[jira] [Commented] (ARROW-788) Possible nondeterminism in Tensor serialization code

    [ https://issues.apache.org/jira/browse/ARROW-788?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961843#comment-15961843 ] 

Wes McKinney commented on ARROW-788:
------------------------------------

I'm looking at the serialization code. A couple of ideas:

* The amount of padding in the metadata header depends on the starting byte offset. The {{WriteTensor}} code does not guarantee that it writes a multiple of 8 bytes, but we could fix this.
* Is it possible that any of your arrays are not contiguous? For example, you previously had https://github.com/ray-project/ray/pull/436/files#diff-17aeecc6d41bcd220496c0d5211cf58fL80 in that PR. We aren't checking in {{WriteTensor}} whether the data is contiguous, but we probably should (ARROW-794).
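
To make the two ideas concrete, here is a minimal sketch in Python with NumPy. It is not Arrow's {{WriteTensor}} code; the helper names ({{padding_needed}}, {{ensure_contiguous}}, {{write_padded}}) are hypothetical, and the 8-byte alignment is only assumed from the remark above.

```python
import numpy as np

ALIGNMENT = 8  # assumption: taken from the "multiple of 8 bytes" remark


def padding_needed(offset, alignment=ALIGNMENT):
    """Zero bytes needed to advance `offset` to the next multiple of `alignment`."""
    return (-offset) % alignment


def ensure_contiguous(arr):
    """Return `arr` itself if it is C-contiguous, otherwise a C-contiguous copy."""
    if arr.flags["C_CONTIGUOUS"]:
        return arr
    return np.ascontiguousarray(arr)


def write_padded(buf, payload):
    """Append `payload` to a bytearray, then zero-pad to the alignment boundary.

    Padding with explicit zeros (rather than skipping bytes) keeps the output
    identical from run to run. Returns the new length of `buf`.
    """
    buf += payload
    buf += b"\x00" * padding_needed(len(buf))
    return len(buf)
```

If every write ends on an 8-byte boundary, the next write starts at an aligned offset, so the padding no longer depends on where the previous message happened to end. A transposed array such as {{np.arange(12).reshape(3, 4).T}} is an example of non-contiguous input that {{ensure_contiguous}} would copy before writing.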

> Possible nondeterminism in Tensor serialization code
> ----------------------------------------------------
>
>                 Key: ARROW-788
>                 URL: https://issues.apache.org/jira/browse/ARROW-788
>             Project: Apache Arrow
>          Issue Type: Improvement
>          Components: C++, Python
>            Reporter: Philipp Moritz
>            Priority: Minor
>
> The Ray nondeterminism tests are failing on
> https://github.com/ray-project/ray/pull/436 (moving to Arrow's Tensor serialization code).
> This might mean that there is some nondeterminism (like uninitialized memory) in the IPC file written by the Arrow Tensor serializer. I'm investigating it now, please let me know if you have an idea what the problem could be.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)