Posted to issues@arrow.apache.org by "Wes McKinney (Jira)" <ji...@apache.org> on 2019/08/22 22:45:00 UTC

[jira] [Commented] (ARROW-5450) [Python] TimestampArray.to_pylist() fails with OverflowError: Python int too large to convert to C long

    [ https://issues.apache.org/jira/browse/ARROW-5450?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16913766#comment-16913766 ] 

Wes McKinney commented on ARROW-5450:
-------------------------------------

Added to 0.15.0. I think we should return {{datetime.datetime}} objects, except for nanosecond timestamps.
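
As an illustration of the suggestion above, here is a minimal sketch of converting microsecond epoch values to {{datetime.datetime}} using only the standard library, so the values never pass through a nanosecond-based type. The helper names ({{micros_to_datetime}}, {{datetime_to_micros}}) are hypothetical and are not part of pyarrow; this is not how pyarrow implements {{as_py()}}, only a sketch of the idea.

```python
import datetime

# Hypothetical helpers for illustration only; these are NOT pyarrow APIs.
_EPOCH_NAIVE = datetime.datetime(1970, 1, 1)
_EPOCH_UTC = datetime.datetime(1970, 1, 1, tzinfo=datetime.timezone.utc)
_ONE_MICROSECOND = datetime.timedelta(microseconds=1)


def datetime_to_micros(dt):
    """Return microseconds since the Unix epoch as a Python int (None passes through)."""
    if dt is None:
        return None
    epoch = _EPOCH_UTC if dt.tzinfo is not None else _EPOCH_NAIVE
    # timedelta // timedelta yields an exact integer, with no float rounding.
    return (dt - epoch) // _ONE_MICROSECOND


def micros_to_datetime(value, utc=False):
    """Return a datetime.datetime for microseconds since the epoch (None passes through)."""
    if value is None:
        return None
    epoch = _EPOCH_UTC if utc else _EPOCH_NAIVE
    # Python ints and timedelta cover the full datetime range (years 1 to 9999).
    return epoch + datetime.timedelta(microseconds=value)


# The same naive values used in the issue's reproduction.
rows = [
    datetime.datetime(1, 1, 1, 0, 0, 0),
    None,
    datetime.datetime(9999, 12, 31, 23, 59, 59, 999999),
    datetime.datetime(1970, 1, 1, 0, 0, 0),
]
micros = [datetime_to_micros(r) for r in rows]
roundtrip = [micros_to_datetime(m) for m in micros]
```

Nanosecond timestamps are left out of this sketch because {{datetime.datetime}} only has microsecond resolution, which matches the exception noted in the comment.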

> [Python] TimestampArray.to_pylist() fails with OverflowError: Python int too large to convert to C long
> -------------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-5450
>                 URL: https://issues.apache.org/jira/browse/ARROW-5450
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Tim Swast
>            Priority: Major
>             Fix For: 0.15.0
>
>
> When I attempt to round-trip a list of datetime objects with moderately large values (outside the range representable at nanosecond precision, but within the range representable at microsecond precision) through pyarrow and back, I get an OverflowError: Python int too large to convert to C long.
> pyarrow version:
> {noformat}
> $ pip freeze | grep pyarrow
> pyarrow==0.13.0{noformat}
>  
> Reproduction:
> {code:python}
> import datetime
> import pandas
> import pyarrow
> import pytz
> timestamp_rows = [
>     datetime.datetime(1, 1, 1, 0, 0, 0, tzinfo=pytz.utc),
>     None,
>     datetime.datetime(9999, 12, 31, 23, 59, 59, 999999, tzinfo=pytz.utc),
>     datetime.datetime(1970, 1, 1, 0, 0, 0, tzinfo=pytz.utc),
> ]
> timestamp_array = pyarrow.array(timestamp_rows, pyarrow.timestamp("us", tz="UTC"))
> timestamp_roundtrip = timestamp_array.to_pylist()
> # ---------------------------------------------------------------------------
> # OverflowError Traceback (most recent call last)
> # <ipython-input-25-4a798e917c20> in <module>
> # ----> 1 timestamp_roundtrip = timestamp_array.to_pylist()
> #
> # ~/.pyenv/versions/3.6.4/envs/scratch/lib/python3.6/site-packages/pyarrow/array.pxi in __iter__()
> #
> # ~/.pyenv/versions/3.6.4/envs/scratch/lib/python3.6/site-packages/pyarrow/scalar.pxi in pyarrow.lib.TimestampValue.as_py()
> #
> # ~/.pyenv/versions/3.6.4/envs/scratch/lib/python3.6/site-packages/pyarrow/scalar.pxi in pyarrow.lib._datetime_conversion_functions.lambda5()
> #
> # pandas/_libs/tslibs/timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()
> #
> # pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()
> #
> # OverflowError: Python int too large to convert to C long
> {code}
> For good measure, I also tested with timezone-naive timestamps with the same error:
> {code:python}
> naive_rows = [
>     datetime.datetime(1, 1, 1, 0, 0, 0),
>     None,
>     datetime.datetime(9999, 12, 31, 23, 59, 59, 999999),
>     datetime.datetime(1970, 1, 1, 0, 0, 0),
> ]
> naive_array = pyarrow.array(naive_rows, pyarrow.timestamp("us", tz=None))
> naive_roundtrip = naive_array.to_pylist()
> # ---------------------------------------------------------------------------
> # OverflowError Traceback (most recent call last)
> # <ipython-input-27-0c32e563d44a> in <module>
> # ----> 1 naive_roundtrip = naive_array.to_pylist()
> #
> # ~/.pyenv/versions/3.6.4/envs/scratch/lib/python3.6/site-packages/pyarrow/array.pxi in __iter__()
> #
> # ~/.pyenv/versions/3.6.4/envs/scratch/lib/python3.6/site-packages/pyarrow/scalar.pxi in pyarrow.lib.TimestampValue.as_py()
> #
> # ~/.pyenv/versions/3.6.4/envs/scratch/lib/python3.6/site-packages/pyarrow/scalar.pxi in pyarrow.lib._datetime_conversion_functions.lambda5()
> #
> # pandas/_libs/tslibs/timestamps.pyx in pandas._libs.tslibs.timestamps.Timestamp.__new__()
> #
> # pandas/_libs/tslibs/conversion.pyx in pandas._libs.tslibs.conversion.convert_to_tsobject()
> #
> # OverflowError: Python int too large to convert to C long
> {code}
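
For context on the range mentioned in the description, the following sketch computes the span that a signed 64-bit count of nanoseconds since the Unix epoch can cover, using only the standard library. The tracebacks above pass through {{pandas.Timestamp}}, which, as far as I know, stored values as int64 nanoseconds in the pandas versions of that time. The names below ({{ns_lower}}, {{ns_upper}}, {{fits_in_int64_nanoseconds}}) are hypothetical and exist only for this illustration.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)

# int64 nanosecond limits, truncated toward zero to whole microseconds,
# because datetime.timedelta only has microsecond resolution.
MAX_MICROS = (2**63 - 1) // 1000
MIN_MICROS = -(2**63 // 1000)

ns_upper = EPOCH + datetime.timedelta(microseconds=MAX_MICROS)
ns_lower = EPOCH + datetime.timedelta(microseconds=MIN_MICROS)


def fits_in_int64_nanoseconds(dt):
    """Return True if a naive datetime lies inside the int64-nanosecond span."""
    return ns_lower <= dt <= ns_upper


# The first and third values from the reproduction fall outside the span,
# while the epoch itself is inside it.
checks = {
    "year 1": fits_in_int64_nanoseconds(datetime.datetime(1, 1, 1)),
    "year 9999": fits_in_int64_nanoseconds(
        datetime.datetime(9999, 12, 31, 23, 59, 59, 999999)
    ),
    "epoch": fits_in_int64_nanoseconds(EPOCH),
}
```

The span works out to roughly the years 1677 through 2262, which is why the year 1 and year 9999 values fit in a microsecond-precision timestamp but not in a nanosecond-based one.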


