Posted to jira@arrow.apache.org by "Lawrence Ling (Jira)" <ji...@apache.org> on 2020/06/23 05:39:00 UTC

[jira] [Created] (ARROW-9211) [Python] ArrowInvalid error raised when deserialising pandas with pd.NaT values in object column

Lawrence Ling created ARROW-9211:
------------------------------------

             Summary: [Python] ArrowInvalid error raised when deserialising pandas with pd.NaT values in object column
                 Key: ARROW-9211
                 URL: https://issues.apache.org/jira/browse/ARROW-9211
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.17.1, 0.17.0
            Reporter: Lawrence Ling


In pyarrow 0.17.x, deserialising a pandas DataFrame that has pd.NaT values in an object column raises an ArrowInvalid error:

 
{code:java}
pyarrow.lib.ArrowInvalid: Casting from timestamp[us] to timestamp[ns] would result in out of bounds timestamp: -62135596800000000
{code}
 

Code to reproduce (using pyarrow==0.17.1 and pandas==1.0.3):

 
{code:java}
import pandas as pd
import pyarrow.ipc as ipc
import pyarrow as pa
v = pd.DataFrame({
    "bar": [1592808896000000000, pd.NaT]
})
# works fine as datetime64[ns] but not as object type
v = v.astype({"bar": "datetime64[ns]"}).astype({"bar": "object"})
bs = ipc.serialize_pandas(v).to_pybytes()
df = ipc.deserialize_pandas(bs)  # raises ArrowInvalid in 0.17.x
{code}
In pyarrow 0.16.0, no error occurs and df is returned as:

 
{code:java}
                            bar
0 2020-06-22 06:54:56.000000000
1 1754-08-30 22:43:41.128654848
{code}
 

Was the change in 0.17.x to raise an error intentional? The previous behaviour in 0.16.0 already looked like undefined behaviour: it converted NaT to 1754-08-30, which seems to come from the -62135596800000000 timestamp mentioned in the error above.
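
The link between the two numbers can be checked with plain integer arithmetic. The sketch below is a guess at the mechanism, not something confirmed from the pyarrow source: it assumes 0.16.0 multiplied the microsecond value by 1000 in a signed 64-bit integer without an overflow check. wrap_int64 is a hypothetical helper that emulates that wraparound; only the standard library is used.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

# Microsecond value quoted in the ArrowInvalid message.
SENTINEL_US = -62135596800000000


def wrap_int64(x):
    """Hypothetical helper: emulate signed 64-bit wraparound."""
    return (x + 2**63) % 2**64 - 2**63


# The value decodes to the smallest datetime Python can represent.
as_datetime = EPOCH + timedelta(microseconds=SENTINEL_US)
print(as_datetime)  # 0001-01-01 00:00:00

# Scaling us -> ns does not fit in int64, which matches the 0.17.x error.
exact_ns = SENTINEL_US * 1000
in_range = INT64_MIN <= exact_ns <= INT64_MAX
print(in_range)  # False

# Assumption: an unchecked int64 multiply wraps around instead.
wrapped_ns = wrap_int64(exact_ns)
secs, nanos = divmod(wrapped_ns, 10**9)
wrapped_dt = EPOCH + timedelta(seconds=secs)
print(f"{wrapped_dt}.{nanos:09d}")  # 1754-08-30 22:43:41.128654848
```

Under that assumption the wrapped value reproduces the 0.16.0 output exactly, which suggests the NaT in the object column is being stored as 0001-01-01 in timestamp[us] rather than as a null.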

Also note that when the column is serialised as datetime64[ns] rather than object, the code works fine in both 0.17.x and 0.16.0, returning:
{code:java}
                  bar
0 2020-06-22 06:54:56
1                 NaT
{code}
 
--
This message was sent by Atlassian Jira
(v8.3.4#803005)