Posted to jira@arrow.apache.org by "David Li (Jira)" <ji...@apache.org> on 2021/05/18 12:00:00 UTC

[jira] [Commented] (ARROW-12818) [Python] Int64Array can not be casted to DoubleArray?

    [ https://issues.apache.org/jira/browse/ARROW-12818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346845#comment-17346845 ] 

David Li commented on ARROW-12818:
----------------------------------

Hey [~lf-shaw], you can do this if you pass {{safe=False}}:
{noformat}
>>> ts.cast(pa.int64()).cast(pa.float64(), safe=False)
<pyarrow.lib.DoubleArray object at 0x7f25a084c820>
[
  1.60946e+18,
  1.60955e+18,
  1.60963e+18,
  1.60972e+18,
  1.6098e+18,
  1.60989e+18,
  1.60998e+18,
  1.61006e+18,
  1.61015e+18,
  1.61024e+18
]
{noformat}
This will give you the same result as NumPy. The cast fails by default because, outside the range [-9007199254740992, 9007199254740992] (that is, ±2^53), not all integers can be represented exactly as a double. PyArrow checks for this by default to be safe, but you can override the check with {{safe=False}}.
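
To make the precision limit concrete, here is a rough pure-Python sketch (no PyArrow needed). The helper names {{in_safe_range}} and {{survives_round_trip}} are made up for illustration and are not part of any library; the range test only mirrors the bounds quoted in the error message and is not PyArrow's actual implementation.

```python
# Illustrative only: these helpers are hypothetical, not PyArrow APIs.
DOUBLE_EXACT_LIMIT = 2 ** 53  # 9007199254740992, the bound from the error message


def in_safe_range(n: int) -> bool:
    """Range test using the same bounds as the error message."""
    return -DOUBLE_EXACT_LIMIT <= n <= DOUBLE_EXACT_LIMIT


def survives_round_trip(n: int) -> bool:
    """True if converting n to a double and back returns n unchanged."""
    return int(float(n)) == n


ts_value = 1609459200000000000  # the nanosecond timestamp from the traceback

# Just past 2**53, neighbouring integers start collapsing onto the same double.
print(survives_round_trip(DOUBLE_EXACT_LIMIT))      # True
print(survives_round_trip(DOUBLE_EXACT_LIMIT + 1))  # False

# The timestamp is outside the range, although this particular value
# happens to be exactly representable as a double.
print(in_safe_range(ts_value))        # False
print(survives_round_trip(ts_value))  # True
```

So a range test like this one is conservative: it rejects any value beyond ±2^53, including ones such as this timestamp that would convert without loss.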

> [Python] Int64Array can not be casted to DoubleArray?
> -----------------------------------------------------
>
>                 Key: ARROW-12818
>                 URL: https://issues.apache.org/jira/browse/ARROW-12818
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 4.0.0
>            Reporter: lf-shaw
>            Priority: Major
>
> In numpy, we can cast int64 to float64. But in pyarrow, we can't.
> ```python
> import numpy as np
> import pandas as pd
> import pyarrow as pa
> # timestamp
> dt = pd.date_range('2021-01-01', periods=10)
> # int64
> arr = dt.asi8
> # cast to float64
> arr_double = arr.astype(np.float64)
> # to arrow array
> ts = pa.array(dt.asi8, type=pa.timestamp('ns'))
> # to int64 array
> ts_int64 = ts.cast(pa.int64())
> # cast to float64
> ts_double = ts_int64.cast(pa.float64())
> ```
> The last line raises an exception:
> ```
> ---------------------------------------------------------------------------
> ArrowInvalid Traceback (most recent call last)
> <ipython-input-89-cc7cafd418c4> in <module>
> ----> 1 pa.array(dt.asi8, type=pa.timestamp('ns')).cast(pa.int64()).cast(pa.float64())
> /opt/anaconda3/lib/python3.8/site-packages/pyarrow/array.pxi in pyarrow.lib.Array.cast()
> /opt/anaconda3/lib/python3.8/site-packages/pyarrow/compute.py in cast(arr, target_type, safe)
>  287 else:
>  288 options = CastOptions.unsafe(target_type)
> --> 289 return call_function("cast", [arr], options)
>  290 
>  291
> /opt/anaconda3/lib/python3.8/site-packages/pyarrow/_compute.pyx in pyarrow._compute.call_function()
> /opt/anaconda3/lib/python3.8/site-packages/pyarrow/_compute.pyx in pyarrow._compute.Function.call()
> /opt/anaconda3/lib/python3.8/site-packages/pyarrow/error.pxi in pyarrow.lib.pyarrow_internal_check_status()
> /opt/anaconda3/lib/python3.8/site-packages/pyarrow/error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: Integer value 1609459200000000000 not in range: -9007199254740992 to 9007199254740992
> ```


