Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/14 14:06:21 UTC

[GitHub] [spark] AngersZhuuuu opened a new pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

AngersZhuuuu opened a new pull request #33994:
URL: https://github.com/apache/spark/pull/33994


   ### What changes were proposed in this pull request?
   For the query
   ```
   select array_except(array(cast('nan' as double), 1d), array(cast('nan' as double)))
   ```
   The result is [NaN, 1d], but it should be [1d].
   The issue arises because `OpenHashSet` cannot handle `Double.NaN` and `Float.NaN` either.
   This PR fixes it, building on https://github.com/apache/spark/pull/33955
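
   The behaviour can be sketched in plain Scala without Spark. This is only an illustration, assuming the set compares primitive doubles with `==` (IEEE 754 semantics, under which `NaN` never equals itself); `NaNExceptSketch`, `naiveExcept` and `nanAwareExcept` are hypothetical names, not Spark APIs, and the flag-based handling is a simplified stand-in rather than the code in this PR.

   ```scala
   import scala.collection.mutable

   object NaNExceptSketch {
     // IEEE 754 comparison: NaN is never equal to itself under primitive ==.
     val primitiveEq: Boolean = Double.NaN == Double.NaN

     // Naive "except" using primitive ==: a NaN in `right` never matches a NaN in `left`.
     def naiveExcept(left: Array[Double], right: Array[Double]): Array[Double] = {
       val out = mutable.ArrayBuffer.empty[Double]
       left.foreach { x =>
         if (!right.exists(_ == x) && !out.exists(_ == x)) out += x
       }
       out.toArray
     }

     // NaN-aware variant (hypothetical): NaN is tracked with a flag
     // instead of being looked up in the set.
     def nanAwareExcept(left: Array[Double], right: Array[Double]): Array[Double] = {
       val seen = mutable.HashSet.empty[Double]
       var seenNaN = false
       right.foreach { x =>
         if (x.isNaN) seenNaN = true
         else seen += x
       }
       val out = mutable.ArrayBuffer.empty[Double]
       left.foreach { x =>
         if (x.isNaN) {
           // Emit NaN at most once, and only if `right` did not contain it.
           if (!seenNaN) { out += x; seenNaN = true }
         } else if (seen.add(x)) {
           out += x
         }
       }
       out.toArray
     }
   }
   ```

   With the inputs from the query, `naiveExcept(Array(Double.NaN, 1d), Array(Double.NaN))` keeps the `NaN`, while `nanAwareExcept` returns only `1.0`.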
   
   
   ### Why are the changes needed?
   Fix bug
   
   ### Does this PR introduce _any_ user-facing change?
   `ArrayExcept` now handles `NaN` correctly: it no longer returns duplicated `NaN` values, or a `NaN` that also appears in the second array.
   
   
   ### How was this patch tested?
   Added UT
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919226534


   **[Test build #143266 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143266/testReport)** for PR 33994 at commit [`d73557c`](https://github.com/apache/spark/commit/d73557c9220fc2df2b2890cf8bbed73c2374d81d).




[GitHub] [spark] AngersZhuuuu commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920606433


   ping @cloud-fan




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922042506


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143416/
   




[GitHub] [spark] AngersZhuuuu commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923495139


   retest this please




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921908710


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47923/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922867292


   **[Test build #143452 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143452/testReport)** for PR 33994 at commit [`797089d`](https://github.com/apache/spark/commit/797089d0b705b83745c71fa19fbbb8d036e087f0).
    * This patch **fails to build**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] dongjoon-hyun commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-925267985


   Thank you, @AngersZhuuuu and @cloud-fan.




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919437718


   **[Test build #143264 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143264/testReport)** for PR 33994 at commit [`e9cb989`](https://github.com/apache/spark/commit/e9cb989bd50a7442d56adc12b755a13c44cac146).
    * This patch **fails SparkR unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919263300


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47769/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919460971


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143266/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921549759


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47905/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920609155


   **[Test build #143338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143338/testReport)** for PR 33994 at commit [`662bf05`](https://github.com/apache/spark/commit/662bf059bd9576bc5873b976fc57f035055edc74).




[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r709254703



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4094,9 +4100,16 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
             }
           } else {
             val elem = array1.get(i, elementType)
-            if (!hs.contains(elem)) {
-              arrayBuffer += elem
-              hs.add(elem)
+            if (isNaN(elem)) {
+              if (notFoundNaNElement) {
+                arrayBuffer += elem

Review comment:
       Done






[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923026730


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47964/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920801275


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143338/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921983533


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143413/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921982270


   **[Test build #143413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143413/testReport)** for PR 33994 at commit [`77b1661`](https://github.com/apache/spark/commit/77b16615f7fd02484e8e1ed1667220aec16e8e28).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AngersZhuuuu commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921777113


   ping @cloud-fan 




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921975855


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143411/
   




[GitHub] [spark] cloud-fan closed pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #33994:
URL: https://github.com/apache/spark/pull/33994


   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921864750


   **[Test build #143416 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143416/testReport)** for PR 33994 at commit [`cd3641d`](https://github.com/apache/spark/commit/cd3641dd63d5b6eb67c2e29ef88f6394f7620b7b).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920326530


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143306/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921983533


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143413/
   




[GitHub] [spark] AngersZhuuuu commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923986755


   ping @cloud-fan 




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923517573


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47980/
   




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r712663805



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4203,10 +4208,9 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
       val ptName = CodeGenerator.primitiveTypeName(jt)
 
       nullSafeCodeGen(ctx, ev, (array1, array2) => {
-        val notFoundNullElement = ctx.freshName("notFoundNullElement")
         val nullElementIndex = ctx.freshName("nullElementIndex")
         val builder = ctx.freshName("builder")
-        val openHashSet = classOf[OpenHashSet[_]].getName
+        val openHashSet = classOf[SQLOpenHashSet[_]].getName

Review comment:
       Just a question. Do we keep `openHashSet` to reduce the number of changed lines?
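
       For context on the line under review: the `openHashSet` value appears to be just a fully qualified class name that is interpolated into generated Java source. A minimal sketch of that pattern, using `scala.collection.mutable.HashSet` as a stand-in (an assumption; `SQLOpenHashSet` is not reproduced here) and the hypothetical names `ClassNameSpliceSketch` and `generated`:

       ```scala
       object ClassNameSpliceSketch {
         // Stand-in for classOf[SQLOpenHashSet[_]].getName in the diff above.
         val openHashSet: String = classOf[scala.collection.mutable.HashSet[_]].getName

         // The class name is spliced into a source string; the name of the
         // Scala val itself never appears in the generated text.
         val generated: String = s"$openHashSet hashSet = new $openHashSet();"
       }
       ```

       In this sketch, renaming the Scala variable would leave the generated string unchanged, since only its value is interpolated.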






[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919275290








[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919250484


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47767/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922042506


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143416/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921809100


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47920/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922975697


   **[Test build #143455 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143455/testReport)** for PR 33994 at commit [`6eaa96a`](https://github.com/apache/spark/commit/6eaa96a801a3660d854f21e0c7337eb89ca26163).




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920143681


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47808/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920143719


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47808/
   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920609155


   **[Test build #143338 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143338/testReport)** for PR 33994 at commit [`662bf05`](https://github.com/apache/spark/commit/662bf059bd9576bc5873b976fc57f035055edc74).




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921816329


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47920/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923496807


   **[Test build #143469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143469/testReport)** for PR 33994 at commit [`6eaa96a`](https://github.com/apache/spark/commit/6eaa96a801a3660d854f21e0c7337eb89ca26163).




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920088716


   **[Test build #143306 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143306/testReport)** for PR 33994 at commit [`061132f`](https://github.com/apache/spark/commit/061132fdf8b6bb396af1d2379d2824b29a3cbf63).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919438847


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143264/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919190472


   **[Test build #143264 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143264/testReport)** for PR 33994 at commit [`e9cb989`](https://github.com/apache/spark/commit/e9cb989bd50a7442d56adc12b755a13c44cac146).




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919438847


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143264/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920088716


   **[Test build #143306 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143306/testReport)** for PR 33994 at commit [`061132f`](https://github.com/apache/spark/commit/061132fdf8b6bb396af1d2379d2824b29a3cbf63).




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920326530


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143306/
   




[GitHub] [spark] AngersZhuuuu commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920079225


   ping @cloud-fan




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919460971


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143266/
   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921518491


   **[Test build #143397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143397/testReport)** for PR 33994 at commit [`3ef22d7`](https://github.com/apache/spark/commit/3ef22d7d607a7b226e3755a072de2ed1f291386e).




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921773591


   **[Test build #143411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143411/testReport)** for PR 33994 at commit [`82640f5`](https://github.com/apache/spark/commit/82640f557d368d32d7d560b84e67eb7a7eca4de1).




[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r712675564



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4203,10 +4208,9 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
       val ptName = CodeGenerator.primitiveTypeName(jt)
 
       nullSafeCodeGen(ctx, ev, (array1, array2) => {
-        val notFoundNullElement = ctx.freshName("notFoundNullElement")
         val nullElementIndex = ctx.freshName("nullElementIndex")
         val builder = ctx.freshName("builder")
-        val openHashSet = classOf[OpenHashSet[_]].getName
+        val openHashSet = classOf[SQLOpenHashSet[_]].getName

Review comment:
       > Just a question. Do we keep `openHashSet` to reduce the number of changed lines?
   
   We can discuss this; if it needs to change, I will raise a follow-up PR covering all of these changes.






[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921519609


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47897/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922870307


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47961/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920801275


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143338/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921676295


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143389/
   




[GitHub] [spark] cloud-fan commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r711049759



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4238,28 +4245,36 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
             body
           }
 
-        val processArray1 = withArray1NullAssignment(
+        val body =
           s"""
-             |$jt $value = ${genGetValue(array1, i)};
              |if (!$hashSet.contains($hsValueCast$value)) {
              |  if (++$size > ${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}) {
              |    break;
              |  }
              |  $hashSet.add$hsPostFix($hsValueCast$value);
              |  $builder.$$plus$$eq($value);
              |}
-           """.stripMargin)
+           """.stripMargin
+
+        val processArray1 = withArray1NullAssignment(
+          s"$jt $value = ${genGetValue(array1, i)};" +
+            SQLOpenHashSet.withNaNCheckCode(elementType, value, hashSet, body,
+              (valueNaN: String) =>
+                s"""
+                   |$size++;
+                   |$builder.$$plus$$eq($valueNaN);
+                 """.stripMargin))
 
         // Only need to track null element index when array1's element is nullable.
         val declareNullTrackVariables = if (left.dataType.asInstanceOf[ArrayType].containsNull) {
           s"""
-             |boolean $notFoundNullElement = true;
              |int $nullElementIndex = -1;
            """.stripMargin
         } else {
           ""
         }
 
+

Review comment:
       unnecessary change.
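
For readers following the hunk above: the generated code routes each value through a NaN check before consulting the hash set, because `NaN == NaN` is false under primitive comparison and a set lookup based on `==` never finds a previously added NaN. The following is only a minimal sketch of that idea in plain Scala, not Spark's `SQLOpenHashSet` or its codegen; `NaNAwareExcept` and `arrayExcept` are hypothetical names, and a boolean flag stands in for whatever NaN bookkeeping the real class does.

```scala
import scala.collection.mutable

// Hypothetical illustration only; this is not Spark's SQLOpenHashSet.
object NaNAwareExcept {
  // Returns the distinct elements of `left` that do not occur in `right`,
  // treating every NaN as equal to every other NaN.
  def arrayExcept(left: Array[Double], right: Array[Double]): Array[Double] = {
    val seen = mutable.HashSet.empty[Double]
    // NaN is tracked with a flag and never stored in the set,
    // since `==` based lookups cannot match it.
    var seenNaN = false
    right.foreach { v => if (v.isNaN) seenNaN = true else seen += v }

    val out = mutable.ArrayBuffer.empty[Double]
    left.foreach { v =>
      if (v.isNaN) {
        // Emit NaN at most once, and only if `right` had none.
        if (!seenNaN) { seenNaN = true; out += v }
      } else if (!seen.contains(v)) {
        seen += v
        out += v
      }
    }
    out.toArray
  }
}
```

With this sketch, `NaNAwareExcept.arrayExcept(Array(Double.NaN, 1d), Array(Double.NaN))` yields an array containing only `1.0`, which matches the result the PR description says `array_except` should return.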






[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920136406


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47808/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921676295


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143389/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919271012


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47769/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923026730


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47964/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922867340


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143452/
   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919190472


   **[Test build #143264 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143264/testReport)** for PR 33994 at commit [`e9cb989`](https://github.com/apache/spark/commit/e9cb989bd50a7442d56adc12b755a13c44cac146).




[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r712094629



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4197,7 +4202,9 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
             if (left.dataType.asInstanceOf[ArrayType].containsNull) {
               s"""
                  |if ($array2.isNullAt($i)) {
-                 |  $notFoundNullElement = false;
+                 |  if (!$hashSet.containsNull()) {
+                 |    $hashSet.addNull();

Review comment:
       Done
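
As a rough illustration of the change in this hunk, the separate `notFoundNullElement` boolean is replaced by asking the set itself whether a null has been recorded. The sketch below assumes a small wrapper class; `NullTrackingSet` and `NullTrackingExcept` are hypothetical names, only `containsNull` and `addNull` are taken from the diff, and `Option[Int]` stands in for a nullable array element.

```scala
import scala.collection.mutable

// Hypothetical stand-in for illustration; not Spark's SQLOpenHashSet.
final class NullTrackingSet[T] {
  private val values = mutable.HashSet.empty[T]
  private var hasNull = false

  def addNull(): Unit = hasNull = true
  def containsNull(): Boolean = hasNull
  def add(v: T): Unit = values += v
  def contains(v: T): Boolean = values.contains(v)
}

object NullTrackingExcept {
  // None models a null array element.
  def arrayExcept(left: Seq[Option[Int]], right: Seq[Option[Int]]): Seq[Option[Int]] = {
    val set = new NullTrackingSet[Int]
    // First pass: record everything in `right`, including null.
    right.foreach {
      case None    => if (!set.containsNull()) set.addNull()
      case Some(v) => set.add(v)
    }
    // Second pass: keep each element of `left` the first time it is seen.
    val out = mutable.ArrayBuffer.empty[Option[Int]]
    left.foreach {
      case None    => if (!set.containsNull()) { set.addNull(); out += None }
      case Some(v) => if (!set.contains(v)) { set.add(v); out += Some(v) }
    }
    out.toSeq
  }
}
```

Keeping the null state inside the set means the same object answers both "seen this value?" and "seen a null?", which is the simplification the diff appears to be after.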






[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921901146


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47923/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921549705


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47905/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921519570


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47897/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923650396


   **[Test build #143469 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143469/testReport)** for PR 33994 at commit [`6eaa96a`](https://github.com/apache/spark/commit/6eaa96a801a3660d854f21e0c7337eb89ca26163).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921908790


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47923/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923653719


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143469/
   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921777416


   **[Test build #143413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143413/testReport)** for PR 33994 at commit [`77b1661`](https://github.com/apache/spark/commit/77b16615f7fd02484e8e1ed1667220aec16e8e28).




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921707449


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143397/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921864750


   **[Test build #143416 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143416/testReport)** for PR 33994 at commit [`cd3641d`](https://github.com/apache/spark/commit/cd3641dd63d5b6eb67c2e29ef88f6394f7620b7b).




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919456357


   **[Test build #143266 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143266/testReport)** for PR 33994 at commit [`d73557c`](https://github.com/apache/spark/commit/d73557c9220fc2df2b2890cf8bbed73c2374d81d).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920320093


   **[Test build #143306 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143306/testReport)** for PR 33994 at commit [`061132f`](https://github.com/apache/spark/commit/061132fdf8b6bb396af1d2379d2824b29a3cbf63).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919275290








[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921777416


   **[Test build #143413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143413/testReport)** for PR 33994 at commit [`77b1661`](https://github.com/apache/spark/commit/77b16615f7fd02484e8e1ed1667220aec16e8e28).




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921908790


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47923/
   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923496807


   **[Test build #143469 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143469/testReport)** for PR 33994 at commit [`6eaa96a`](https://github.com/apache/spark/commit/6eaa96a801a3660d854f21e0c7337eb89ca26163).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922870307


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47961/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923653719


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143469/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923018465


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47964/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921518491


   **[Test build #143397 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143397/testReport)** for PR 33994 at commit [`3ef22d7`](https://github.com/apache/spark/commit/3ef22d7d607a7b226e3755a072de2ed1f291386e).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921816396


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47920/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921707449


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143397/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922038287


   **[Test build #143416 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143416/testReport)** for PR 33994 at commit [`cd3641d`](https://github.com/apache/spark/commit/cd3641dd63d5b6eb67c2e29ef88f6394f7620b7b).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923562099


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47980/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919240512


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47767/
   




[GitHub] [spark] cloud-fan commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-925058483


   Thanks, merging to master/3.2/3.1/3.0!




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920670326


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47844/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920799569


   **[Test build #143338 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143338/testReport)** for PR 33994 at commit [`662bf05`](https://github.com/apache/spark/commit/662bf059bd9576bc5873b976fc57f035055edc74).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920638432


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47844/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922867340


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143452/
   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922862882


   **[Test build #143452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143452/testReport)** for PR 33994 at commit [`797089d`](https://github.com/apache/spark/commit/797089d0b705b83745c71fa19fbbb8d036e087f0).




[GitHub] [spark] cloud-fan commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r711979124



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4197,7 +4202,9 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
             if (left.dataType.asInstanceOf[ArrayType].containsNull) {
               s"""
                  |if ($array2.isNullAt($i)) {
-                 |  $notFoundNullElement = false;
+                 |  if (!$hashSet.containsNull()) {
+                 |    $hashSet.addNull();

Review comment:
       We can remove the `if` and just call `$hashSet.addNull();`.
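
       The suggestion holds whenever adding null is idempotent. Below is a minimal Java sketch of that argument, assuming the set records null with a boolean flag. `NullTrackingSet`, `guarded`, and `unguarded` are hypothetical names for illustration only and are not Spark's actual classes or methods.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a hash set that tracks null with a flag.
// This is NOT Spark's implementation; it only illustrates the idempotency argument.
final class NullTrackingSet {
    private final Set<Double> values = new HashSet<>();
    private boolean hasNull = false;

    void add(double v) { values.add(v); }
    boolean contains(double v) { return values.contains(v); }

    // Setting the flag twice leaves the same state as setting it once.
    void addNull() { hasNull = true; }
    boolean containsNull() { return hasNull; }

    // Guarded form, mirroring the diff under review.
    static NullTrackingSet guarded(int nullCount) {
        NullTrackingSet s = new NullTrackingSet();
        for (int i = 0; i < nullCount; i++) {
            if (!s.containsNull()) {
                s.addNull();
            }
        }
        return s;
    }

    // Unguarded form, mirroring the review suggestion.
    static NullTrackingSet unguarded(int nullCount) {
        NullTrackingSet s = new NullTrackingSet();
        for (int i = 0; i < nullCount; i++) {
            s.addNull();
        }
        return s;
    }
}
```

       Under this assumption both forms end in the same state for any number of null elements, so the guard only adds a branch to the generated code.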






[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922862882


   **[Test build #143452 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143452/testReport)** for PR 33994 at commit [`797089d`](https://github.com/apache/spark/commit/797089d0b705b83745c71fa19fbbb8d036e087f0).






[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921549759


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47905/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921674785


   **[Test build #143389 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143389/testReport)** for PR 33994 at commit [`5f4d1bb`](https://github.com/apache/spark/commit/5f4d1bb4db6200bc2a1b36fbefa6eb5883f0982f).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921816396


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47920/
   




[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r708571904



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4094,9 +4100,16 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
             }
           } else {
             val elem = array1.get(i, elementType)
-            if (!hs.contains(elem)) {
-              arrayBuffer += elem
-              hs.add(elem)
+            if (isNaN(elem)) {
+              if (notFoundNaNElement) {
+                arrayBuffer += elem

Review comment:
       For this, let's wait a bit for the decision on the first PR:
   - https://github.com/apache/spark/pull/33955/files#r708570515
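
For context, the hunk above special-cases NaN because a set that compares elements with primitive `==` never matches NaN against NaN. The sketch below mirrors that idea with a separate flag for NaN. It is an illustration only: the class and method names are hypothetical, and `java.util.HashSet<Double>` would already treat NaNs as equal, so the explicit flag stands in for a set with primitive comparison semantics.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ArrayExceptSketch {
    // Returns the distinct elements of left that do not appear in right,
    // treating every NaN as equal to every other NaN.
    static List<Double> except(double[] left, double[] right) {
        Set<Double> seen = new HashSet<>();
        boolean nanSeen = false; // tracks NaN separately from the set

        for (double v : right) {
            if (Double.isNaN(v)) {
                nanSeen = true;
            } else {
                seen.add(v);
            }
        }

        List<Double> result = new ArrayList<>();
        for (double v : left) {
            if (Double.isNaN(v)) {
                if (!nanSeen) {
                    result.add(v);
                    nanSeen = true; // emit NaN at most once
                }
            } else if (seen.add(v)) {
                // add() returns true only for values not seen before
                result.add(v);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        double[] left = {Double.NaN, 1.0};
        double[] right = {Double.NaN};
        System.out.println(except(left, right)); // prints [1.0]
    }
}
```

This corresponds to the query in the PR description: `array_except(array(NaN, 1d), array(NaN))` is expected to return `[1d]`.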






[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921516289


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47897/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920143719


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47808/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921773591


   **[Test build #143411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143411/testReport)** for PR 33994 at commit [`82640f5`](https://github.com/apache/spark/commit/82640f557d368d32d7d560b84e67eb7a7eca4de1).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921975855


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143411/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921519609


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47897/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921706168


   **[Test build #143397 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143397/testReport)** for PR 33994 at commit [`3ef22d7`](https://github.com/apache/spark/commit/3ef22d7d607a7b226e3755a072de2ed1f291386e).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921966151


   **[Test build #143411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143411/testReport)** for PR 33994 at commit [`82640f5`](https://github.com/apache/spark/commit/82640f557d368d32d7d560b84e67eb7a7eca4de1).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `case class WriterBucketSpec(`




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920649081


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47844/
   




[GitHub] [spark] SparkQA removed a comment on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-919226534


   **[Test build #143266 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143266/testReport)** for PR 33994 at commit [`d73557c`](https://github.com/apache/spark/commit/d73557c9220fc2df2b2890cf8bbed73c2374d81d).




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921544919


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47905/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923026700


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47964/
   






[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33994:
URL: https://github.com/apache/spark/pull/33994#discussion_r711104608



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -4238,28 +4245,36 @@ case class ArrayExcept(left: Expression, right: Expression) extends ArrayBinaryL
             body
           }
 
-        val processArray1 = withArray1NullAssignment(
+        val body =
           s"""
-             |$jt $value = ${genGetValue(array1, i)};
              |if (!$hashSet.contains($hsValueCast$value)) {
              |  if (++$size > ${ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH}) {
              |    break;
              |  }
              |  $hashSet.add$hsPostFix($hsValueCast$value);
              |  $builder.$$plus$$eq($value);
              |}
-           """.stripMargin)
+           """.stripMargin
+
+        val processArray1 = withArray1NullAssignment(
+          s"$jt $value = ${genGetValue(array1, i)};" +
+            SQLOpenHashSet.withNaNCheckCode(elementType, value, hashSet, body,
+              (valueNaN: String) =>
+                s"""
+                   |$size++;
+                   |$builder.$$plus$$eq($valueNaN);
+                 """.stripMargin))
 
         // Only need to track null element index when array1's element is nullable.
         val declareNullTrackVariables = if (left.dataType.asInstanceOf[ArrayType].containsNull) {
           s"""
-             |boolean $notFoundNullElement = true;
              |int $nullElementIndex = -1;
            """.stripMargin
         } else {
           ""
         }
 
+

Review comment:
       Remove this extra blank line.
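
The hunk above wraps the per-element body in `SQLOpenHashSet.withNaNCheckCode`, passing the normal body plus a callback that builds the NaN branch. The sketch below shows one way such a wrapper could assemble generated Java source. It is not Spark's implementation: the helper `withNaNCheck`, its parameters, and the `containsNaN()`/`addNaN()` method names in the emitted text are assumptions made for illustration.

```java
import java.util.function.Function;

final class NaNCheckCodegenSketch {
    // Hypothetical helper: returns source text that runs nanBody once for the
    // first NaN and notNaNBody for every other value.
    static String withNaNCheck(
            String javaType,
            String value,
            String hashSet,
            String notNaNBody,
            Function<String, String> nanBody) {
        String boxed;
        if (javaType.equals("double")) {
            boxed = "java.lang.Double";
        } else if (javaType.equals("float")) {
            boxed = "java.lang.Float";
        } else {
            // Other element types have no NaN, so emit the body unchanged.
            return notNaNBody;
        }
        return "if (" + boxed + ".isNaN(" + value + ")) {\n"
            + "  if (!" + hashSet + ".containsNaN()) {\n"
            + "    " + hashSet + ".addNaN();\n"
            + "    " + nanBody.apply(boxed + ".NaN") + "\n"
            + "  }\n"
            + "} else {\n"
            + "  " + notNaNBody + "\n"
            + "}\n";
    }

    public static void main(String[] args) {
        String code = withNaNCheck(
            "double", "value", "hashSet",
            "builder.add(value);",
            nan -> "builder.add(" + nan + ");");
        System.out.println(code);
    }
}
```

Passing the NaN branch as a function lets each caller decide what to do with the canonical NaN value (here, append it to the result) while the membership check stays in one place.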






[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-920670326


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47844/
   




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-921470327


   **[Test build #143389 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143389/testReport)** for PR 33994 at commit [`5f4d1bb`](https://github.com/apache/spark/commit/5f4d1bb4db6200bc2a1b36fbefa6eb5883f0982f).




[GitHub] [spark] SparkQA commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-922870035


   Kubernetes integration test unable to build dist.
   
   exiting with code: 1
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47961/
   




[GitHub] [spark] AmplabJenkins commented on pull request #33994: [SPARK-36753][SQL] ArrayExcept handle duplicated Double.NaN and Float.NaN

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33994:
URL: https://github.com/apache/spark/pull/33994#issuecomment-923568237


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47980/
   

