Posted to issues@arrow.apache.org by "Wes McKinney (JIRA)" <ji...@apache.org> on 2017/03/30 19:03:42 UTC
[jira] [Resolved] (ARROW-740) FileReader fails for large objects
[ https://issues.apache.org/jira/browse/ARROW-740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wes McKinney resolved ARROW-740.
--------------------------------
Resolution: Fixed
Resolved in commit https://github.com/apache/arrow/commit/642b753a49a3fcb5d53946c773cd70ab2a3ece88
> FileReader fails for large objects
> ----------------------------------
>
> Key: ARROW-740
> URL: https://issues.apache.org/jira/browse/ARROW-740
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++
> Reporter: Philipp Moritz
> Fix For: 0.3.0
>
>
> When I serialize a large Arrow array (around 2**30 entries) and then use the FileReader to read it back, I get a non-success status:
> "Bad status: Invalid: flatbuffer size 0 invalid. File offset: 660, metadata length: 0"
> How to reproduce:
> Check out the branch arrow-large-objects from https://github.com/pcmoritz/ray-1, and follow http://ray.readthedocs.io/en/latest/install-on-ubuntu.html with that branch.
> Then run
> {{python test/jenkins_tests/multi_node_tests/large_memory_test.py}}
> in the ray root directory.
> Most likely an int32_t overflows somewhere, but I haven't been able to track it down. The only int32_t values used by the FileReader seem to be for the flatbuffer metadata size, which should be small.
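
To illustrate the kind of overflow suspected above, here is a minimal Python sketch of the arithmetic. It is not a diagnosis of the actual bug and does not use any Arrow API. The helper `wrap_int32` and the 8-byte entry width are assumptions made for illustration only; the report does not state the element type.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Simulate two's-complement wraparound of a value stored in 32 signed bits.

    Hypothetical helper for illustration; not part of Arrow.
    """
    return (value + 2**31) % 2**32 - 2**31


num_entries = 2**30      # array length from the report
bytes_per_entry = 8      # assumption: 64-bit values
total_bytes = num_entries * bytes_per_entry

# True when the byte count can be represented as an int32_t.
fits_in_int32 = total_bytes <= INT32_MAX

print("total bytes:", total_bytes)
print("fits in int32_t:", fits_in_int32)
print("value after 32-bit wraparound:", wrap_int32(total_bytes))
```

Under these assumptions the buffer is 2**33 bytes, which does not fit in an int32_t, so any size or offset along the write or read path that is narrowed to 32 bits could end up with an unexpected value.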