Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2017/03/14 09:53:41 UTC

[jira] [Resolved] (SPARK-13411) change in null aggregation behavior between Spark 1.5.2 and 1.6.0

     [ https://issues.apache.org/jira/browse/SPARK-13411?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-13411.
----------------------------------
    Resolution: Invalid

`null` is not `NaN`. The typo in the error message ("value in null") seems to have been fixed in https://github.com/apache/spark/pull/14569
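
For illustration, a minimal sketch of the distinction (assuming a Spark 2.x spark-shell, where `spark` is the ambient SparkSession; the all-null data is made up for the example):

import org.apache.spark.sql.functions.{min, max, count, col}
import spark.implicits._

// A DataFrame whose only column, "foo", is a double column of all nulls.
val df = Seq[Option[Double]](None, None).toDF("foo")

// min/max over an all-null column return null (not NaN);
// count skips nulls and returns 0.
df.agg(min("foo"), max("foo"), count(col("foo"))).first()
// => [null,null,0]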

> change in null aggregation behavior between Spark 1.5.2 and 1.6.0
> -----------------------------------------------------------------
>
>                 Key: SPARK-13411
>                 URL: https://issues.apache.org/jira/browse/SPARK-13411
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.6.0
>            Reporter: Barry Becker
>
> I don't know whether the behavior in 1.5.2 or 1.6.0 is correct, but it's definitely different.
> Suppose I have a DataFrame with a double column, "foo", whose values are all null.
> If I do
> import org.apache.spark.sql.functions.{min, max, count, col}
> val ext: DataFrame = df.agg(min("foo"), max("foo"), count(col("foo")).alias("nonNullCount"))
> In 1.5.2 I could do ext.first().getDouble(0) and get Double.NaN.
> In 1.6.0, when I try this I get "value in null at index 0". Maybe the new behavior is correct, but I think there is a typo in the message. It should say "value is null at index 0".
> Which behavior is correct? If 1.6.0 is correct, then it looks like I will need to add isNull checks everywhere when retrieving values.
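> For comparison, if 1.6.0's behavior is intended, a guarded retrieval might look like this sketch (Row.isNullAt is the standard check; wrapping the result in Option is just one convention):
>
> val row = ext.first()
> // getDouble throws when the value at the index is null,
> // so check isNullAt first.
> val minFoo: Option[Double] =
>   if (row.isNullAt(0)) None else Some(row.getDouble(0))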


