Posted to issues@arrow.apache.org by "jorisvandenbossche (via GitHub)" <gi...@apache.org> on 2023/06/15 13:31:55 UTC
[GitHub] [arrow] jorisvandenbossche opened a new issue, #36096: [Python] Array vs ChunkedArray to_pandas discrepancy for pandas extension dtypes implementing __from_arrow__
jorisvandenbossche opened a new issue, #36096:
URL: https://github.com/apache/arrow/issues/36096
When you have an Arrow type that maps to a pandas extension dtype implementing `__from_arrow__`, the `ChunkedArray.to_pandas` conversion calls `pd_dtype.__from_arrow__`, but the `Array.to_pandas` version does not.
Typically, you only encounter such dtypes if you have a pyarrow ExtensionArray as well, and in `ExtensionArray.to_pandas` we also check whether the dtype has `__from_arrow__`:
https://github.com/apache/arrow/blob/475b5b9463b64bec4e03a47e3277076db246bd35/python/pyarrow/array.pxi#L3094-L3104
But the base class `Array.to_pandas` doesn't do this. Recently, one of the pandas dtypes that maps to a non-extension array on the pyarrow side (i.e. DatetimeTZDtype, mapping to timestamp with tz) added `__from_arrow__`. This means that for this dtype, the conversion takes a different code path depending on whether you start from an Array or a ChunkedArray:
```
from datetime import datetime
import pyarrow as pa
arr = pa.array([datetime(1, 1, 1)], pa.timestamp("s", tz="America/New_York"))
table = pa.table({'a': arr})
# doesn't call DatetimeTZDtype.__from_arrow__
arr.to_pandas()
# ChunkedArray does call that
table["a"].to_pandas()
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] jorisvandenbossche closed issue #36096: [Python] Array vs ChunkedArray to_pandas discrepancy for pandas extension dtypes implementing __from_arrow__
Posted by "jorisvandenbossche (via GitHub)" <gi...@apache.org>.
jorisvandenbossche closed issue #36096: [Python] Array vs ChunkedArray to_pandas discrepancy for pandas extension dtypes implementing __from_arrow__
URL: https://github.com/apache/arrow/issues/36096