Posted to dev@arrow.apache.org by "Bryan Cutler (JIRA)" <ji...@apache.org> on 2018/10/03 23:31:00 UTC

[jira] [Created] (ARROW-3428) [Python] from_pandas gives incorrect results when converting floating point to bool

Bryan Cutler created ARROW-3428:
-----------------------------------

             Summary: [Python] from_pandas gives incorrect results when converting floating point to bool
                 Key: ARROW-3428
                 URL: https://issues.apache.org/jira/browse/ARROW-3428
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
            Reporter: Bryan Cutler
            Assignee: Bryan Cutler


When converting pandas data that contains floating point values to boolean with Array.from_pandas, the results are incorrect: every element comes back as False, including the nonzero values.
{noformat}
In [2]: import pyarrow as pa
   ...: import pandas as pd
   ...: a = [0.0, 1.0, 2.0, None, float('NaN')]
   ...: 

In [3]: s = pd.Series(a)

In [4]: pa.Array.from_pandas(s, type=pa.bool_())
Out[4]: 
<pyarrow.lib.BooleanArray object at 0x7f1bfd099e68>
[
  False,
  False,
  False,
  False,
  False
]
{noformat}

The expected output is True wherever value != 0.
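
To make that expectation concrete, here is a minimal pure-Python sketch of the intended element-wise result for the input above. The helper name expected_bool is hypothetical and not part of pyarrow. Mapping None and NaN to null (None here) is an assumption, chosen because from_pandas normally treats NaN as a missing value; the issue text itself only specifies the nonzero case.

```python
import math


def expected_bool(values):
    """Hypothetical reference conversion (not a pyarrow API)."""
    result = []
    for v in values:
        # Assumption: None and NaN are treated as missing and stay null.
        if v is None or (isinstance(v, float) and math.isnan(v)):
            result.append(None)
        else:
            # From the issue: the result should be True when value != 0.
            result.append(v != 0)
    return result


a = [0.0, 1.0, 2.0, None, float('NaN')]
expected = expected_bool(a)
print(expected)  # [False, True, True, None, None]
```

Under these assumptions, only the first element would be False, in contrast to the all-False array shown in the report.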

This issue originated from SPARK-25461.


