Posted to github@arrow.apache.org by "AlenkaF (via GitHub)" <gi...@apache.org> on 2023/02/22 11:46:57 UTC

[GitHub] [arrow] AlenkaF commented on issue #34283: [Python] `Table.to_pandas` fails to convert index `dtype` with a custom type mapper

AlenkaF commented on issue #34283:
URL: https://github.com/apache/arrow/issues/34283#issuecomment-1439886052

   Yes, I _think_ that the dtype of an index is never converted according to the `types_mapper` keyword when converting a `pa.Table`, or a single array for that matter.
   
   A single array is converted to a pandas Series in `_array_like_to_pandas` by calling `pd.Series`, which does not take the dtype of the index into account.
   
   https://github.com/apache/arrow/blob/958f63502679c7074ff60c0d06dbb1a0be0a5cfb/python/pyarrow/array.pxi#L1649
   
   This would be a good addition. What would be needed is to use the pandas API to cast the Series index with the `.astype()` method after the line linked above in [arrow/python/pyarrow/array.pxi](https://github.com/apache/arrow/blob/958f63502679c7074ff60c0d06dbb1a0be0a5cfb/python/pyarrow/array.pxi#L1649).
   
   As for the Table, it is converted using the pandas BlockManager, and I am not sure how the desired dtype could be passed to the BlockManager axes in that case:
   
   https://github.com/apache/arrow/blob/7828165f185b5ea2a3e76c06f9d4ba44263fd6dc/python/pyarrow/table.pxi#L4001-L4008
   
   https://github.com/apache/arrow/blob/45918a90a6ca1cf3fd67c256a7d6a240249e555a/python/pyarrow/pandas_compat.py#L797-L823
   
   

