Posted to jira@arrow.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2022/10/18 11:31:00 UTC

[jira] [Updated] (ARROW-16547) [Python] to_pandas fails with FixedOffset timezones when timestamp_as_object is used

     [ https://issues.apache.org/jira/browse/ARROW-16547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated ARROW-16547:
-----------------------------------
    Labels: pull-request-available python-conversion  (was: python-conversion)

> [Python] to_pandas fails with FixedOffset timezones when timestamp_as_object is used
> ------------------------------------------------------------------------------------
>
>                 Key: ARROW-16547
>                 URL: https://issues.apache.org/jira/browse/ARROW-16547
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Sander Goos
>            Priority: Major
>              Labels: pull-request-available, python-conversion
>         Attachments: pyarrow_to_pandas_repro.py
>
>   Original Estimate: 24h
>          Time Spent: 10m
>  Remaining Estimate: 23h 50m
>
> The `to_pandas` method fails with "ValueError: fromutc: dt.tzinfo is not self" when `timestamp_as_object=True` is passed and the timestamp column uses a timezone with a fixed offset, e.g. "+08:00".
> A repro script is attached.
>  
> The problem seems to be that `fromutc` is called on the tzinfo object here, which does not work when that object is a `pytz._FixedOffset`: [https://github.com/apache/arrow/blob/90aac16761b7dbf5fe931bc8837cad5116939270/cpp/src/arrow/python/arrow_to_pandas.cc#L1068]
> {code:python}
> import pyarrow as pa
> import datetime as dt
> import pytz
> # Fixed offset of 120 minutes, i.e. UTC+02:00
> tz = pytz.FixedOffset(120)
> ts = tz.localize(dt.datetime(2022, 5, 12, 16, 57))
> timestamps = pa.array([ts])
> names = ["timestamp_col"]
> table = pa.Table.from_arrays([timestamps], names=names)
> print(table.schema)
> # Works fine
> print(table.to_pandas())
> # Fails with "ValueError: fromutc: dt.tzinfo is not self"
> table.to_pandas(timestamp_as_object=True)
> {code}
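
The error message itself can be reproduced with the standard library alone, without pyarrow or pytz. The sketch below rests on two assumptions that the issue does not confirm: that the conversion code ends up calling `fromutc` with a datetime whose `tzinfo` is not the timezone object itself (for instance a naive UTC datetime), and that the fixed-offset class relies on the default `datetime.tzinfo.fromutc` rather than overriding it. `FixedOffsetSketch` is a hypothetical stand-in written for illustration, not pytz's class.

```python
import datetime as dt


class FixedOffsetSketch(dt.tzinfo):
    """Hypothetical fixed-offset tzinfo that does NOT override fromutc."""

    def __init__(self, minutes):
        self._offset = dt.timedelta(minutes=minutes)

    def utcoffset(self, when):
        return self._offset

    def dst(self, when):
        return dt.timedelta(0)

    def tzname(self, when):
        return None


tz = FixedOffsetSketch(120)
naive_utc = dt.datetime(2022, 5, 12, 14, 57)

# The default tzinfo.fromutc requires that dt.tzinfo is the tzinfo object
# itself, so a naive datetime is rejected.
try:
    tz.fromutc(naive_utc)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

# Attaching the timezone first satisfies that check (assumed workaround,
# not necessarily what Arrow does).
attached = tz.fromutc(naive_utc.replace(tzinfo=tz))

print(error_message)
print(attached.isoformat())
```

Attaching the timezone before the call is only one way to satisfy the default implementation's check; whether the eventual fix in Arrow takes this route is not stated in the issue.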


