Posted to github@arrow.apache.org by "danepitkin (via GitHub)" <gi...@apache.org> on 2023/04/07 17:36:01 UTC

[GitHub] [arrow] danepitkin commented on issue #34886: `np.asarray(parrow_table)` returns a transposed representation of the data

danepitkin commented on issue #34886:
URL: https://github.com/apache/arrow/issues/34886#issuecomment-1500494106

   Hi @thomasjpfan,
   
   This isn't a bug. It follows from how the two objects lay out their data in memory, and from the limitations that layout imposes.
   
   Arrow supports interoperability with numpy at the array level (https://arrow.apache.org/docs/python/numpy.html). What you are seeing is each column of Arrow's columnar storage converted, zero-copy, into its own numpy array (https://arrow.apache.org/docs/python/pandas.html#zero-copy-series-conversions), so the columns end up along the first axis. Getting a row-oriented view of the data requires making a copy, which is inefficient and usually not the desired behavior. If you want that layout without using pandas, you'll need to implement the copying outside of pyarrow.
   
   For more complex datatypes (e.g. dataframes), you'll need to use pyarrow's pandas interoperability like in your example (https://arrow.apache.org/docs/python/pandas.html#pandas-integration).

