Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/03/30 19:03:42 UTC

[jira] [Resolved] (ARROW-740) FileReader fails for large objects

     [ https://issues.apache.org/jira/browse/ARROW-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wes McKinney resolved ARROW-740.
--------------------------------
    Resolution: Fixed

Resolved in https://github.com/apache/arrow/commit/642b753a49a3fcb5d53946c773cd70ab2a3ece88

> FileReader fails for large objects
> ----------------------------------
>
>                 Key: ARROW-740
>                 URL: https://issues.apache.org/jira/browse/ARROW-740
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++
>            Reporter: Philipp Moritz
>             Fix For: 0.3.0
>
>
> When I serialize a large Arrow array (around 2**30 entries) and then use the FileReader to read it back, I get a non-success status:
> "Bad status: Invalid: flatbuffer size 0 invalid. File offset: 660, metadata length: 0"
> How to reproduce:
> Check out the branch arrow-large-objects from https://github.com/pcmoritz/ray-1, and follow http://ray.readthedocs.io/en/latest/install-on-ubuntu.html with that branch.
> Then run
> {{python test/jenkins_tests/multi_node_tests/large_memory_test.py}}
> in the ray root directory.
> Most likely there is an int32_t somewhere that overflows, but I haven't been able to track it down. The only int32_t values used by the FileReader seem to be for the flatbuffer metadata size, which should be small.
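
To make the suspected overflow concrete, here is a minimal sketch of how a byte count for roughly 2**30 entries stops fitting in an int32_t. It is illustrative only: the 8-byte entry width and the helper name as_int32 are assumptions that do not appear in the report, and nothing here shows where the actual bug in the FileReader was.

```python
import ctypes

INT32_MAX = 2**31 - 1


def as_int32(value: int) -> int:
    """Wrap a Python int to a signed 32-bit value (hypothetical helper).

    ctypes does no overflow checking, so this keeps only the low 32 bits,
    which mimics storing a larger value in an int32_t on common platforms.
    """
    return ctypes.c_int32(value).value


num_entries = 2**30   # "around 2**30 entries", as in the report
bytes_per_entry = 8   # assumption: 64-bit values; not stated in the report
total_bytes = num_entries * bytes_per_entry

# The true size needs more than 32 bits ...
print(total_bytes > INT32_MAX)
# ... so a 32-bit field holding it would contain a wrapped value instead.
print(as_int32(total_bytes))
```

Under these assumptions, any length or offset derived from the body size would have to be carried in 64-bit integers to survive the round trip.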


