You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/17 16:30:46 UTC

[GitHub] [spark] AngersZhuuuu opened a new pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

AngersZhuuuu opened a new pull request #34033:
URL: https://github.com/apache/spark/pull/34033


   ### What changes were proposed in this pull request?
   InSet should handle NaN
   
   ### Why are the changes needed?
   InSet should handle NaN
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Added UT


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922176279


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143431/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925534851


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48041/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925637444


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48050/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925745599


   **[Test build #143541 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143541/testReport)** for PR 34033 at commit [`4878d5c`](https://github.com/apache/spark/commit/4878d5c178f223dc1782fc0f72c20ff9ad59b8c0).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925508882






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922414280


   **[Test build #143440 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143440/testReport)** for PR 34033 at commit [`141dc5f`](https://github.com/apache/spark/commit/141dc5f40a2d81aeee034dfef661db1618bb8e1a).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925591794


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143528/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925534878


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48041/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714515848



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -554,6 +554,12 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   }
 
   @transient private[this] lazy val hasNull: Boolean = hset.contains(null)
+  @transient private[this] lazy val isNaN: Any => Boolean = child.dataType match {
+    case DoubleType => (value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double])
+    case FloatType => (value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float])
+    case _ => (_: Any) => false
+  }
+  @transient private[this] lazy val hasNaN = set.exists(isNaN)

Review comment:
       > nit: we can avoid iterating the set if type is not float/double.
   
   DOne




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925554173


   **[Test build #143541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143541/testReport)** for PR 34033 at commit [`4878d5c`](https://github.com/apache/spark/commit/4878d5c178f223dc1782fc0f72c20ff9ad59b8c0).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925477202


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143527/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921957975


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143421/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925504422


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48036/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714938713



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +572,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {

Review comment:
       Codegen seems fine as we will do a null check at the very end.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926152482


   **[Test build #143554 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143554/testReport)** for PR 34033 at commit [`d6d517c`](https://github.com/apache/spark/commit/d6d517cced0f531529bf2177dd5b46e09f25d028).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925591794


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143528/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925592404


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48050/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925475024


   **[Test build #143527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143527/testReport)** for PR 34033 at commit [`87df7b0`](https://github.com/apache/spark/commit/87df7b0af3b6e3e7d8b55d9c30891bca4202a862).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926349537


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48093/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926345014


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48093/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922415249


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143440/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921976885


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47928/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926308397


   **[Test build #143583 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143583/testReport)** for PR 34033 at commit [`ef0e81f`](https://github.com/apache/spark/commit/ef0e81f8e8e5872c4402aee1525a27febefd7292).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714913128



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +572,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {

Review comment:
       shall we check nan before calling `set.contains(value)`?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925488150


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48035/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714502997



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -554,6 +554,12 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   }
 
   @transient private[this] lazy val hasNull: Boolean = hset.contains(null)
+  @transient private[this] lazy val isNaN: Any => Boolean = child.dataType match {
+    case DoubleType => (value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double])
+    case FloatType => (value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float])
+    case _ => (_: Any) => false
+  }
+  @transient private[this] lazy val hasNaN = set.exists(isNaN)

Review comment:
       nit: we can avoid iterating the set if type is not float/double.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714950029



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +572,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {

Review comment:
       > is it a bug? this means we return null if `value` is null, no matter `hasNull` is true or false.
   
   `In` have same behavior,  
   ```
     override def eval(input: InternalRow): Any = {
       val evaluatedValue = value.eval(input)
       if (evaluatedValue == null) {
         null
       } else {
         var hasNull = false
         list.foreach { e =>
           val v = e.eval(input)
           if (v == null) {
             hasNull = true
           } else if (ordering.equiv(v, evaluatedValue)) {
             return true
           }
         }
         if (hasNull) {
           null
         } else {
           false
         }
       }
     }
   ```
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925957849


   **[Test build #143554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143554/testReport)** for PR 34033 at commit [`d6d517c`](https://github.com/apache/spark/commit/d6d517cced0f531529bf2177dd5b46e09f25d028).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925534878


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48041/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926308397


   **[Test build #143583 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143583/testReport)** for PR 34033 at commit [`ef0e81f`](https://github.com/apache/spark/commit/ef0e81f8e8e5872c4402aee1525a27febefd7292).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926433388


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143583/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921942398


   **[Test build #143421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143421/testReport)** for PR 34033 at commit [`a74a310`](https://github.com/apache/spark/commit/a74a3104e66dcfbfe7561e524a2b75555addfbba).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921957907


   **[Test build #143421 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143421/testReport)** for PR 34033 at commit [`a74a310`](https://github.com/apache/spark/commit/a74a3104e66dcfbfe7561e524a2b75555addfbba).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922176279


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143431/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922169332


   **[Test build #143431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143431/testReport)** for PR 34033 at commit [`c3addf6`](https://github.com/apache/spark/commit/c3addf6471b6f660f7e37a698b1a2988abce7126).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922411019


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47948/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922407063


   **[Test build #143440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143440/testReport)** for PR 34033 at commit [`141dc5f`](https://github.com/apache/spark/commit/141dc5f40a2d81aeee034dfef661db1618bb8e1a).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925590462


   **[Test build #143528 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143528/testReport)** for PR 34033 at commit [`174ac71`](https://github.com/apache/spark/commit/174ac717066e5fce2dcb5c0cd50c8d9149fe5580).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925527421


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48042/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925527536


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48042/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926349537


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48093/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926158589


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143554/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925497597


   **[Test build #143533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143533/testReport)** for PR 34033 at commit [`293daea`](https://github.com/apache/spark/commit/293daea9674bb06606dbdd188b6730797de2f617).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925497597


   **[Test build #143533 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143533/testReport)** for PR 34033 at commit [`293daea`](https://github.com/apache/spark/commit/293daea9674bb06606dbdd188b6730797de2f617).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925987512


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48063/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926018928


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48063/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922169332


   **[Test build #143431 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143431/testReport)** for PR 34033 at commit [`c3addf6`](https://github.com/apache/spark/commit/c3addf6471b6f660f7e37a698b1a2988abce7126).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925477420


   ping @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925957849


   **[Test build #143554 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143554/testReport)** for PR 34033 at commit [`d6d517c`](https://github.com/apache/spark/commit/d6d517cced0f531529bf2177dd5b46e09f25d028).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922410836


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47948/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922174399


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47939/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922176226


   **[Test build #143431 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143431/testReport)** for PR 34033 at commit [`c3addf6`](https://github.com/apache/spark/commit/c3addf6471b6f660f7e37a698b1a2988abce7126).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925477170


   **[Test build #143527 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143527/testReport)** for PR 34033 at commit [`87df7b0`](https://github.com/apache/spark/commit/87df7b0af3b6e3e7d8b55d9c30891bca4202a862).
    * This patch **fails to build**.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926019019


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48063/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921976853


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47928/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925747275


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143541/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925637484


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48050/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714938038



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +572,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {

Review comment:
       is it a bug? this means we return null if `value` is null, no matter `hasNull` is true or false.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926019019


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48063/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925512991


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48042/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925475024


   **[Test build #143527 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143527/testReport)** for PR 34033 at commit [`87df7b0`](https://github.com/apache/spark/commit/87df7b0af3b6e3e7d8b55d9c30891bca4202a862).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714446914



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +567,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {
+      set.exists(isNaN(_))

Review comment:
       can we have a `hasNaN` variable to avoid repeated computing?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan closed pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #34033:
URL: https://github.com/apache/spark/pull/34033


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926158589


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143554/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926324867


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48093/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921972020


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47928/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925477202


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143527/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925671393


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143533/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921942398


   **[Test build #143421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143421/testReport)** for PR 34033 at commit [`a74a310`](https://github.com/apache/spark/commit/a74a3104e66dcfbfe7561e524a2b75555addfbba).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926432107


   **[Test build #143583 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143583/testReport)** for PR 34033 at commit [`ef0e81f`](https://github.com/apache/spark/commit/ef0e81f8e8e5872c4402aee1525a27febefd7292).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714931949



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +572,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {

Review comment:
       > shall we check nan before calling `set.contains(value)`?
   
   Notice this, since it's `nullSafeEval `,  `null` has been checked




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922410241


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47948/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925666997


   **[Test build #143533 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143533/testReport)** for PR 34033 at commit [`293daea`](https://github.com/apache/spark/commit/293daea9674bb06606dbdd188b6730797de2f617).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714911970



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -593,15 +605,34 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   private def genCodeWithSet(ctx: CodegenContext, ev: ExprCode): ExprCode = {
     nullSafeCodeGen(ctx, ev, c => {
       val setTerm = ctx.addReferenceObj("set", set)
+      val hasNaNValue = ctx.addReferenceObj("hasNaN", hasNaN)

Review comment:
       `hasNaN` is a boolean, we can just use it to generate code
   ```
             |} else if (${isNaN(c)}) {
             |  ${ev.value} =  $hasNaN;
             |}
   ```




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922415249


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143440/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925671393


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143533/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926433388


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143583/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921976885


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47928/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922407063


   **[Test build #143440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143440/testReport)** for PR 34033 at commit [`141dc5f`](https://github.com/apache/spark/commit/141dc5f40a2d81aeee034dfef661db1618bb8e1a).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922174407


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47939/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714931949



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +572,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {

Review comment:
       > shall we check nan before calling `set.contains(value)`?
   
   Have noticed this, since it's `nullSafeEval `,  `null` has been checked




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714932964



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -593,15 +605,34 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   private def genCodeWithSet(ctx: CodegenContext, ev: ExprCode): ExprCode = {
     nullSafeCodeGen(ctx, ev, c => {
       val setTerm = ctx.addReferenceObj("set", set)
+      val hasNaNValue = ctx.addReferenceObj("hasNaN", hasNaN)

Review comment:
       Done




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925489997


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48036/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan edited a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan edited a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926439975


   thanks, merging to master/3.2/3.1/3.0!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925479253


   **[Test build #143528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143528/testReport)** for PR 34033 at commit [`174ac71`](https://github.com/apache/spark/commit/174ac717066e5fce2dcb5c0cd50c8d9149fe5580).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925527536


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48042/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925479253


   **[Test build #143528 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143528/testReport)** for PR 34033 at commit [`174ac71`](https://github.com/apache/spark/commit/174ac717066e5fce2dcb5c0cd50c8d9149fe5580).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925554173


   **[Test build #143541 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143541/testReport)** for PR 34033 at commit [`4878d5c`](https://github.com/apache/spark/commit/4878d5c178f223dc1782fc0f72c20ff9ad59b8c0).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925637484


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48050/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925508882






-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #34033:
URL: https://github.com/apache/spark/pull/34033#discussion_r714460173



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala
##########
@@ -562,6 +567,8 @@ case class InSet(child: Expression, hset: Set[Any]) extends UnaryExpression with
   protected override def nullSafeEval(value: Any): Any = {
     if (set.contains(value)) {
       true
+    } else if (isNaN(value)) {
+      set.exists(isNaN(_))

Review comment:
       How about current?




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925513635


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48041/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-921957975


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143421/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922411019


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47948/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925502753


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48035/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-925747275


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143541/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] cloud-fan commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-926439975


   thanks, merging to master/3.2!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922173669


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47939/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #34033: [SPARK-36792][SQL] InSet should handle NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34033:
URL: https://github.com/apache/spark/pull/34033#issuecomment-922174407


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47939/
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org