Posted to dev@arrow.apache.org by Bryant Menn <br...@gmail.com> on 2018/04/05 04:17:03 UTC

Troubleshooting large number of nested items

I am attempting to troubleshoot ARROW-2367
(https://issues.apache.org/jira/browse/ARROW-2367) and, if I am able, provide
a patch. From what I can tell from gdb on a debug build of master, I believe
the issue is that lists in individual rows of a pandas DataFrame/Series are
being stored as a single BinaryArray instead of a ChunkedArray when the total
size of the column data exceeds the max int32 value.
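
To make the suspected limit concrete, here is a minimal sketch in plain Python (no Arrow calls). It assumes only that a BinaryArray addresses its data with signed 32-bit offsets, so a single array cannot hold more than 2**31 - 1 bytes. The helper name first_overflowing_row and the row sizes are hypothetical, chosen for illustration; this is not code from the Arrow source.

```python
INT32_MAX = 2**31 - 1  # largest value a signed 32-bit offset can hold


def first_overflowing_row(row_sizes):
    """Return the index of the first row whose end offset no longer
    fits in a signed 32-bit integer, or None if every offset fits."""
    offset = 0
    for i, size in enumerate(row_sizes):
        offset += size  # running end offset of row i within one array
        if offset > INT32_MAX:
            return i
    return None


# Hypothetical column: three rows of 1 GiB each.
row_sizes = [2**30, 2**30, 2**30]
overflow_row = first_overflowing_row(row_sizes)
print(overflow_row)
```

If the hunch is right, the row this reports is roughly where a converter would need to start a new chunk rather than keep appending to the same array.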

How would I confirm this hunch? Apologies if this is something
straightforward; I am new to the project and this is my first time debugging
a Python C/C++ extension.

Thanks,

Bryant