Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/16 03:03:29 UTC

[GitHub] [spark] AngersZhuuuu opened a new pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

AngersZhuuuu opened a new pull request #34011:
URL: https://github.com/apache/spark/pull/34011


   ### What changes were proposed in this pull request?
   For the query
   ```
   select array_union(array(cast('nan' as double), cast('nan' as double)), array())
   ```
   This returns [NaN, NaN], but it should return [NaN].
   This happens because `OpenHashSet` can't handle `Double.NaN` and `Float.NaN` either.
   In this PR we add a wrapper for `OpenHashSet` that handles `null`, `Double.NaN`, and `Float.NaN` together.
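The idea can be illustrated with a minimal sketch in Python. This is not the Spark implementation (which is Scala; the build report in this thread lists the new class as `SQLOpenHashSet`). It assumes the duplication comes from `NaN != NaN` under IEEE 754 comparison, and every name below (`EqualityOnlySet`, `NaNAwareSet`, `naive_union`, `array_union`) is hypothetical and exists only for illustration.

```python
import math
from typing import List, Optional, Sequence


class EqualityOnlySet:
    """Toy set that looks up keys with `==` only.

    Hypothetical stand-in for a primitive-specialized hash set;
    this is NOT Spark's OpenHashSet.
    """

    def __init__(self) -> None:
        self._items: List[float] = []

    def contains(self, value: float) -> bool:
        # NaN == NaN is False, so a stored NaN is never found again.
        return any(existing == value for existing in self._items)

    def add(self, value: float) -> None:
        if not self.contains(value):
            self._items.append(value)


class NaNAwareSet:
    """Hypothetical wrapper: null and NaN are tracked by flags, not stored."""

    def __init__(self) -> None:
        self._inner = EqualityOnlySet()
        self._contains_null = False
        self._contains_nan = False

    def add(self, value: Optional[float]) -> bool:
        """Record the value; return True only if it was not seen before."""
        if value is None:
            is_new = not self._contains_null
            self._contains_null = True
            return is_new
        if math.isnan(value):
            is_new = not self._contains_nan
            self._contains_nan = True
            return is_new
        is_new = not self._inner.contains(value)
        self._inner.add(value)
        return is_new


def naive_union(left: Sequence[float], right: Sequence[float]) -> List[float]:
    """Union that relies on the equality-only set, so NaN is duplicated."""
    seen = EqualityOnlySet()
    result: List[float] = []
    for value in list(left) + list(right):
        if not seen.contains(value):
            seen.add(value)
            result.append(value)
    return result


def array_union(
    left: Sequence[Optional[float]], right: Sequence[Optional[float]]
) -> List[Optional[float]]:
    """Union that deduplicates null and NaN through the wrapper's flags."""
    seen = NaNAwareSet()
    result: List[Optional[float]] = []
    for value in list(left) + list(right):
        if seen.add(value):
            result.append(value)
    return result
```

With these definitions, `naive_union([nan, nan], [])` keeps both `NaN` entries, while `array_union([nan, nan], [])` keeps only one, where `nan = float("nan")`. Tracking `null` and `NaN` with flags keeps the inner set free of values it cannot compare reliably.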
   
   
   ### Why are the changes needed?
   Fixes a bug: `array_union` could return duplicated `NaN` values.
   
   ### Does this PR introduce _any_ user-facing change?
   `ArrayUnion` no longer returns duplicated `NaN` values.
   
   
   ### How was this patch tested?
   Added unit test coverage.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920539601


   **[Test build #143329 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143329/testReport)** for PR 34011 at commit [`07774ff`](https://github.com/apache/spark/commit/07774ffa2ee742601a6103b59b32f6c50624e296).




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920638849


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143329/
   




[GitHub] [spark] cloud-fan commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920568770


   thanks, merging to 3.0




[GitHub] [spark] AmplabJenkins commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920557495


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47834/
   




[GitHub] [spark] SparkQA commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920555607


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47834/
   




[GitHub] [spark] SparkQA commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920557479


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47834/
   




[GitHub] [spark] SparkQA commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920636923


   **[Test build #143329 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143329/testReport)** for PR 34011 at commit [`07774ff`](https://github.com/apache/spark/commit/07774ffa2ee742601a6103b59b32f6c50624e296).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds the following public classes _(experimental)_:
     * `class SQLOpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](`




[GitHub] [spark] cloud-fan closed pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #34011:
URL: https://github.com/apache/spark/pull/34011


   




[GitHub] [spark] AmplabJenkins commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920638849


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143329/
   




[GitHub] [spark] AmplabJenkins removed a comment on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920557495


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47834/
   




[GitHub] [spark] AngersZhuuuu commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920538286


   ping @cloud-fan 




[GitHub] [spark] SparkQA commented on pull request #34011: [SPARK-36702][SQL][3.0] ArrayUnion handle duplicated Double.NaN and F…

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #34011:
URL: https://github.com/apache/spark/pull/34011#issuecomment-920539601


   **[Test build #143329 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143329/testReport)** for PR 34011 at commit [`07774ff`](https://github.com/apache/spark/commit/07774ffa2ee742601a6103b59b32f6c50624e296).

