You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2016/03/22 00:29:21 UTC

[GitHub] spark pull request: [SPARK-14051] Implement `Double.NaN==Float.NaN...

GitHub user dongjoon-hyun opened a pull request:

    https://github.com/apache/spark/pull/11868

    [SPARK-14051] Implement `Double.NaN==Float.NaN` in `row.equals` for consistency.

    ## What changes were proposed in this pull request?
    
    Since [SPARK-9079](https://issues.apache.org/jira/browse/SPARK-9079) and [SPARK-9145](https://issues.apache.org/jira/browse/SPARK-9145), `NaN = NaN` returns true and works well. The only exception case is direct comparison between `Row(Float.NaN)` and `Row(Double.NaN)`. The following is the example: the last expression had better be `true` for consistency.
    
    ```scala
    scala> Seq((1d,1f),(Double.NaN,Float.NaN)).toDF("a","b").registerTempTable("tmp")
    scala> sql("select a,b,a=b from tmp").collect()
    res1: Array[org.apache.spark.sql.Row] = Array([1.0,1.0,true], [NaN,NaN,true])
    scala> val row_a = sql("select a from tmp").collect()
    row_a: Array[org.apache.spark.sql.Row] = Array([1.0], [NaN])
    scala> val row_b = sql("select b from tmp").collect()
    row_b: Array[org.apache.spark.sql.Row] = Array([1.0], [NaN])
    scala> row_a(0) == row_b(0)
    res2: Boolean = true
    scala> row_a(1) == row_b(1)
    res3: Boolean = false
    ```
    
    ## How was this patch tested?
    
    Pass the Jenkins tests including new testcases.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dongjoon-hyun/spark SPARK-14051

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11868.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11868
    
----
commit 6ff7ade00aa1c7b099433344eb3484b27579a17b
Author: Dongjoon Hyun <do...@apache.org>
Date:   2016-03-21T23:23:55Z

    [SPARK-14051] Implement `Double.NaN==Float.NaN` in `row.equals` for consistency.
    
    ```
    scala> Seq((1d,1f),(Double.NaN,Float.NaN)).toDF("a","b").registerTempTable("tmp")
    scala> sql("select a,b,a=b from tmp").collect()
    res12: Array[org.apache.spark.sql.Row] = Array([1.0,1.0,true], [NaN,NaN,true])
    scala> val row_a = sql("select a from tmp").collect()
    row_a: Array[org.apache.spark.sql.Row] = Array([1.0], [NaN])
    scala> val row_b = sql("select b from tmp").collect()
    row_b: Array[org.apache.spark.sql.Row] = Array([1.0], [NaN])
    scala> row_a(0) == row_b(0)
    res13: Boolean = true
    scala> row_a(1) == row_b(1)
    res14: Boolean = false
    ```

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199874683
  
    Oh, thank you for pointing out that. I missed that part. Let me check that again. I guess we can change `Row[NaN].hashCode` together in this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199570349
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53726/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-201086373
  
    **[Test build #54124 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54124/consoleFull)** for PR 11868 at commit [`bafb5ad`](https://github.com/apache/spark/commit/bafb5ad4505a29d46a852034394037d83d28c0d6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-208117787
  
    **[Test build #55494 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55494/consoleFull)** for PR 11868 at commit [`1609112`](https://github.com/apache/spark/commit/1609112f05bdce6ea99d276eed4173d4e9db5922).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by srowen <gi...@git.apache.org>.

Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11868#discussion_r59145545
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
    @@ -402,11 +402,13 @@ trait Row extends Serializable {
                   return false
                 }
               case f1: Float if java.lang.Float.isNaN(f1) =>
    -            if (!o2.isInstanceOf[Float] || ! java.lang.Float.isNaN(o2.asInstanceOf[Float])) {
    +            if (!(o2.isInstanceOf[Float] && java.lang.Float.isNaN(o2.asInstanceOf[Float]) ||
    --- End diff --
    
    Hm, in general NaN never equals NaN. There might be some reason to treat it differently here. On the one hand I tend to agree with this change anyway, on the grounds that it implements something like automatic promotion in Scala/Java. But clearly we're already not implementing the language's semantics, and trying to achieve something more like bitwise-equal semantics. In that case this wouldn't quite be right. Ask the author of the original change why it was made?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-201024823
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/54055/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-201024200
  
    **[Test build #54055 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54055/consoleFull)** for PR 11868 at commit [`a0e4ebe`](https://github.com/apache/spark/commit/a0e4ebeb746acb7742fdaad62237d9e43e9d8055).
     * This patch **fails from timeout after a configured wait of \`250m\`**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-201024818
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-208101127
  
    **[Test build #55494 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/55494/consoleFull)** for PR 11868 at commit [`1609112`](https://github.com/apache/spark/commit/1609112f05bdce6ea99d276eed4173d4e9db5922).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199960241
  
    For example, [Oracle](https://docs.oracle.com/cd/B12037_01/server.101/b10759/sql_elements001.htm) orders NaN greatest with respect to all other values, and evaluates NaN equal to NaN.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199956737
  
    Hi, @rxin . Thank you again!
    
    I made a big mistake in this PR and now I fixed it due to your advice. Now, the followings are true.
    ```
    scala> row_a(1) == row_b(1)
    res4: Boolean = true
    scala> List(row_a(1),row_b(1)).distinct
    res5: List[org.apache.spark.sql.Row] = List([NaN])
    ```
    Also, I applied this to `BaseGenericInternalRow`, too. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-200007579
  
    **[Test build #53799 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53799/consoleFull)** for PR 11868 at commit [`5e0a19c`](https://github.com/apache/spark/commit/5e0a19cb640196e8166e61198ddc5f768915111d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199961154
  
    [IBM DB2](https://www.ibm.com/support/knowledgecenter/SSEPEK_10.0.0/com.ibm.db2z10.doc.sqlref/src/tpc/db2z_numericcomparisions.dita) also says "From an SQL perspective, infinity = infinity, NaN = NaN, and sNaN = sNaN."


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199570345
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-208118250
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199960521
  
    Yea but if they do
    
    row1.getFloat(1) == row2.getDouble(2), it'd ...



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-200008340
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-212020852
  
    Hi, @JoshRosen .
    Could you take a look at this PR about Row(Double.NaN) == Row(Float.NaN) for a second when you have some time today? I'm going to close this PR and JIRA (as WontFix) since it's stale and Spark works fine without this. However, before closing this, I really appreciate if you give some opinion about what the original intention was.
    
    According to the initial commit in `RowSuite.scala`, `Row` class is designed(tested) to return true for 'NaN' equality test since 'NaN' is defined a normal value. So, I thought Double.NaN also needs to be equal to Float.NaN and made this PR. It's not interesting feature, just an attempt to complete `Row`-level NaN operations.
    ```
    test("float NaN == NaN") {
        val r1 = Row(Float.NaN)
        val r2 = Row(Float.NaN)
        assert(r1 === r2)
    }
    
    test("double NaN == NaN") {
        val r1 = Row(Double.NaN)
        val r2 = Row(Double.NaN)
        assert(r1 === r2)
    }
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-212150790
  
    It's just for a record.
    In case there is no comment from @JoshRosen , I'll close this PR at tonight.
    Silence means many things and seems enough for this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-212147570
  
    Sure. As I mentioned today, I'm going to close this PR since Spark don't want this.
    I'm wondering what the intention of those testcases was.
    Don't worry, @rxin . I know that any -1 reject PRs. 
    I think this PR and JIRA at least prevents the same misunderstanding for the future like me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199954892
  
    **[Test build #53799 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53799/consoleFull)** for PR 11868 at commit [`5e0a19c`](https://github.com/apache/spark/commit/5e0a19cb640196e8166e61198ddc5f768915111d).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-208118258
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/55494/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-212247762
  
    Thank you for your kind reviews during last month, @rxin and @srowen .
    I'm happily closing this PR. It's my bad to take so much time to close this PR.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-212143922
  
    Yea unfortunately the behavior is already pretty weird, and I'm not sure adding this would actually make it less weird, so I'm in favor of just not doing anything here.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199959344
  
    It's because Scala uses the standard way of Java and IEEE floating point. I also know that NaN is always false with even other NaN.
    However, it's about `Row`. I mean SQL. In SQL and Database world, `NaN` == `NaN`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11868#discussion_r59162191
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
    @@ -402,11 +402,13 @@ trait Row extends Serializable {
                   return false
                 }
               case f1: Float if java.lang.Float.isNaN(f1) =>
    -            if (!o2.isInstanceOf[Float] || ! java.lang.Float.isNaN(o2.asInstanceOf[Float])) {
    +            if (!(o2.isInstanceOf[Float] && java.lang.Float.isNaN(o2.asInstanceOf[Float]) ||
    --- End diff --
    
    Thank you for attention, @srowen . I agree with you and @rxin in the current Spark status view point.
    
    Today, when I'm reading [Spark SQL, DataFrames and Datasets Guide: NaN Semantics](http://spark.apache.org/docs/latest/sql-programming-guide.html), I just suddenly want to update this PR.
    > **NaN Semantics**
    > There is specially handling for not-a-number (NaN) when dealing with float or double types that does not exactly match standard floating point semantics. Specifically:
    > - NaN = NaN returns true.
    
    I'm still digging to find some useful cases for this PR outside SQL layers.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-200008345
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/53799/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199982128
  
    Oh, I see what is the point here now. @rxin , may I explain a little bit more?
    
    Mathematically, `NaN` equality is defined `false`. The followings are all **false** in Scala (and Java).
    * row_a(1).getDouble(0) == row_a(1).getDouble(0)
    * row_b(1).getFloat(0) == row_b(1).getFloat(0)
    * row_a(1).getDouble(0) == row_b(1).getFloat(0)
    * Double.NaN == Double.NaN
    * Float.NaN == Float.NaN
    * Double.NaN == Float.NaN
    
    However, Spark `Row` on master branch already returns **true** for the followings.
    * Row(Double.NaN) == Row(Double.NaN)
    * Row(Float.NaN) == Row(Float.NaN)
    
    As you guess easily, the followings are still **false**.
    * Row(Float.NaN).getFloat(0) == Row(Float.NaN).getFloat(0)
    * Row(Double.NaN).getDouble(0) == Row(Double.NaN).getDouble(0)



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199957564
  
    I spent a bit of time on this -- I'm actually not sure we want to change this anymore, because Scala itsefl doesn't do this and users can always screw up if they do field comparison themselves.
    
    ```
    scala> Float.NaN == Double.NaN
    res1: Boolean = false
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-201112087
  
    **[Test build #54124 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54124/consoleFull)** for PR 11868 at commit [`bafb5ad`](https://github.com/apache/spark/commit/bafb5ad4505a29d46a852034394037d83d28c0d6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199536386
  
    **[Test build #53726 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53726/consoleFull)** for PR 11868 at commit [`6ff7ade`](https://github.com/apache/spark/commit/6ff7ade00aa1c7b099433344eb3484b27579a17b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by AmplabJenkins <gi...@git.apache.org>.

Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-201112254
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199569904
  
    **[Test build #53726 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/53726/consoleFull)** for PR 11868 at commit [`6ff7ade`](https://github.com/apache/spark/commit/6ff7ade00aa1c7b099433344eb3484b27579a17b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by rxin <gi...@git.apache.org>.

Github user rxin commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199678899
  
    What do we do for hash code?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by SparkQA <gi...@git.apache.org>.

Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-200929579
  
    **[Test build #54055 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/54055/consoleFull)** for PR 11868 at commit [`a0e4ebe`](https://github.com/apache/spark/commit/a0e4ebeb746acb7742fdaad62237d9e43e9d8055).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun closed the pull request at:

    https://github.com/apache/spark/pull/11868


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] spark pull request: [SPARK-14051][SQL] Implement `Double.NaN==Floa...

Posted by dongjoon-hyun <gi...@git.apache.org>.

Github user dongjoon-hyun commented on the pull request:

    https://github.com/apache/spark/pull/11868#issuecomment-199990025
  
    I think this PR makes Spark users feel less confused by completing the missing part of `NaN` in Spark SQL(Row).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org