Posted to issues@arrow.apache.org by "Dave Challis (JIRA)" <ji...@apache.org> on 2018/04/06 11:45:00 UTC

[jira] [Updated] (ARROW-2391) [Python] Segmentation fault from PyArrow when mapping Pandas datetime column to pyarrow.date64

     [ https://issues.apache.org/jira/browse/ARROW-2391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dave Challis updated ARROW-2391:
--------------------------------
    Description: 
When calling `pyarrow.Table.from_pandas` with a `pandas.DataFrame` and an explicit `pyarrow.Schema`, the call results in a segmentation fault if a Pandas `datetime64[ns]` column is to be converted to the `pyarrow.date64` type.

A minimal example which shows this is:
{code:python}
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({'created': ['2018-05-10T10:24:01']})
df['created'] = pd.to_datetime(df['created'])
schema = pa.schema([pa.field('created', pa.date64())])
pa.Table.from_pandas(df, schema=schema)
{code}

Executing the above causes the Python interpreter to exit with "Segmentation fault: 11".

Attempting to convert into various other datatypes (by specifying different schemas) either succeeds, or raises an exception if the conversion is invalid.
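
Because the crash takes down the whole interpreter, it can be handy to run the reproduction in a child process and inspect how it exited. The following is only a sketch of that idea, not part of the original report: the helper names `run_isolated` and `describe_exit` are hypothetical, and it assumes a POSIX system, where `subprocess` reports a child killed by a signal as a negative return code.

```python
import signal
import subprocess
import sys


def run_isolated(code: str) -> int:
    """Run `code` in a fresh Python interpreter and return its exit status."""
    result = subprocess.run([sys.executable, "-c", code])
    return result.returncode


def describe_exit(returncode: int) -> str:
    """Turn a POSIX exit status into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # subprocess reports death-by-signal as the negated signal number.
        return "killed by " + signal.Signals(-returncode).name
    return "exited with status %d" % returncode


# The reproduction from this report, kept as a string so that a crash
# only takes down the child interpreter, not the calling process.
REPRO = """
import pandas as pd
import pyarrow as pa

df = pd.DataFrame({'created': ['2018-05-10T10:24:01']})
df['created'] = pd.to_datetime(df['created'])
schema = pa.schema([pa.field('created', pa.date64())])
pa.Table.from_pandas(df, schema=schema)
"""
```

On an affected install, `describe_exit(run_isolated(REPRO))` would be expected to report `killed by SIGSEGV`, while a conversion that merely raises an exception would show up as `exited with status 1`.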

  was:
When trying to call `pyarrow.Table.from_pandas` with a `pandas.DataFrame` and a `pyarrow.Schema` provided, the function call results in a segmentation fault if Pandas `datetime64[ns]` column tries to be converted to a `pyarrow.date64` type.

 

A minimal example which shows this is:

{{import pandas as pd}}
{{import pyarrow as pa}}

{{df = pd.DataFrame(\{'created': ['2018-05-10T10:24:01']})}}
{{df['created'] = pd.to_datetime(df['created'])}}
{{schema = pa.schema([pa.field('created', pa.date64())])}}
{{pa.Table.from_pandas(df, schema=schema)}}

 

Executing the above causes the python interpreter to exit with "Segmentation fault: 11".

 

Attempting to convert into various other datatypes (by specifying different schemas) either succeeds, or raises an exception if the conversion is invalid.

        Summary: [Python] Segmentation fault from PyArrow when mapping Pandas datetime column to pyarrow.date64  (was: Segmentation fault from PyArrow when mapping Pandas datetime column to pyarrow.date64)

> [Python] Segmentation fault from PyArrow when mapping Pandas datetime column to pyarrow.date64
> ----------------------------------------------------------------------------------------------
>
>                 Key: ARROW-2391
>                 URL: https://issues.apache.org/jira/browse/ARROW-2391
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.9.0
>         Environment: Mac OS High Sierra
> Python 3.6
>            Reporter: Dave Challis
>            Priority: Major
>
> When calling `pyarrow.Table.from_pandas` with a `pandas.DataFrame` and an explicit `pyarrow.Schema`, the call results in a segmentation fault if a Pandas `datetime64[ns]` column is to be converted to the `pyarrow.date64` type.
> A minimal example which shows this is:
> {code:python}
> import pandas as pd
> import pyarrow as pa
> df = pd.DataFrame({'created': ['2018-05-10T10:24:01']})
> df['created'] = pd.to_datetime(df['created'])
> schema = pa.schema([pa.field('created', pa.date64())])
> pa.Table.from_pandas(df, schema=schema)
> {code}
> Executing the above causes the Python interpreter to exit with "Segmentation fault: 11".
> Attempting to convert into various other datatypes (by specifying different schemas) either succeeds, or raises an exception if the conversion is invalid.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)