Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/09/14 13:26:14 UTC
[GitHub] [spark] AngersZhuuuu opened a new pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
AngersZhuuuu opened a new pull request #33993:
URL: https://github.com/apache/spark/pull/33993
### What changes were proposed in this pull request?
For the query
```
select array_distinct(array(cast('nan' as double), cast('nan' as double)))
```
This returns `[NaN, NaN]`, but it should return `[NaN]`.
The issue occurs because `OpenHashSet` cannot handle `Double.NaN` and `Float.NaN` either.
This PR fixes it, building on https://github.com/apache/spark/pull/33955
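
As a rough illustration of the symptom (a standalone sketch, not Spark's `OpenHashSet` or `ArrayDistinct` code): under IEEE 754 primitive comparison, `NaN == NaN` is false, so a deduplication step that compares elements this way can never recognise a second NaN as a duplicate. That is presumably what goes wrong here. The names `NaNDistinctSketch` and `distinctDoubles` are hypothetical and exist only for this example.

```scala
import scala.collection.mutable

// Hypothetical illustration only; none of these names exist in Spark.
object NaNDistinctSketch {
  // Primitive comparison follows IEEE 754: NaN is never equal to itself.
  val primitiveNaNEqual: Boolean = Double.NaN == Double.NaN

  // Order-preserving distinct that treats every NaN as the same element.
  def distinctDoubles(values: Seq[Double]): Seq[Double] = {
    val seen = mutable.HashSet.empty[Double]
    var seenNaN = false
    val out = mutable.ArrayBuffer.empty[Double]
    values.foreach { v =>
      if (v.isNaN) {
        // Track NaN with a separate flag instead of relying on set equality.
        if (!seenNaN) {
          seenNaN = true
          out += v
        }
      } else if (seen.add(v)) {
        // add returns true only when the value was not already present.
        out += v
      }
    }
    out.toSeq
  }
}
```

Tracking NaN with a dedicated flag sidesteps the equality question entirely, which is one simple way to get a single `NaN` in the result.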
### Why are the changes needed?
Fix bug
### Does this PR introduce _any_ user-facing change?
`ArrayDistinct` no longer returns duplicated `NaN` values.
### How was this patch tested?
Added UT
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919206586
Kubernetes integration test unable to build dist.
exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47765/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919420666
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143262/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920793709
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47851/
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r708454521
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
##########
@@ -2326,4 +2326,13 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
         Literal.create(Seq(Float.NaN, null, 1f), ArrayType(FloatType))),
       Seq(Float.NaN, null, 1f))
   }
+
+  test("SPARK-36740: ArrayDistinct should handle duplicated Double.NaN and Float.Nan") {
Review comment:
SPARK-36741 instead of SPARK-36740?
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920683880
**[Test build #143342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143342/testReport)** for PR 33993 at commit [`2478eb4`](https://github.com/apache/spark/commit/2478eb446f8a47bc983661dbfd7801c7f6fe2230).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921475884
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47895/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921420738
**[Test build #143385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143385/testReport)** for PR 33993 at commit [`d8e80fc`](https://github.com/apache/spark/commit/d8e80fc567f2524075cd6ee23787e47bca480c4f).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921544035
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47902/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921085976
**[Test build #143360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143360/testReport)** for PR 33993 at commit [`389c9fd`](https://github.com/apache/spark/commit/389c9fd70b6cdff1df09aa90735844558ddef35b).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920797913
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47852/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920331170
**[Test build #143307 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143307/testReport)** for PR 33993 at commit [`f5c5452`](https://github.com/apache/spark/commit/f5c54527905343d599d760e76a03a435962f8d1a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920803835
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47852/
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r710753893
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3410,32 +3410,59 @@ case class ArrayDistinct(child: Expression)
   }
   override def nullSafeEval(array: Any): Any = {
-    val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+    val data = array.asInstanceOf[ArrayData]
     doEvaluation(data)
   }
   @transient private lazy val doEvaluation = if (TypeUtils.typeWithProperEquals(elementType)) {
-    (data: Array[AnyRef]) => new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+    (array: ArrayData) =>
+      val arrayBuffer = new scala.collection.mutable.ArrayBuffer[Any]
+      val hs = new SQLOpenHashSet[Any]()
+      val withNaNCheckFunc = SQLOpenHashSet.withNaNCheckFunc(elementType, hs,
+        (value: Any) =>
+          if (!hs.contains(value)) {
+            if (arrayBuffer.size > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+              ArrayBinaryLike.throwUnionLengthOverflowException(arrayBuffer.size)
+            }
+            arrayBuffer += value
+            hs.add(value)
+          },
+        (value: Any) => arrayBuffer += value)
Review comment:
Done
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921672405
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143391/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921497259
**[Test build #143391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143391/testReport)** for PR 33993 at commit [`434a2be`](https://github.com/apache/spark/commit/434a2beb096025db1e9363d73b1ff4a7ffab7c61).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921441449
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47892/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921085976
**[Test build #143360 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143360/testReport)** for PR 33993 at commit [`389c9fd`](https://github.com/apache/spark/commit/389c9fd70b6cdff1df09aa90735844558ddef35b).
[GitHub] [spark] cloud-fan commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921773222
thanks, merging to master/3.2/3.1/3.0!
[GitHub] [spark] AngersZhuuuu commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920069452
ping @cloud-fan @dongjoon-hyun
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920757770
**[Test build #143346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143346/testReport)** for PR 33993 at commit [`5546a55`](https://github.com/apache/spark/commit/5546a55bdba6855c3a229a2b44778c1240b71bc1).
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r709947109
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/SQLOpenHashSet.scala
##########
@@ -60,21 +60,52 @@ class SQLOpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
 }
 object SQLOpenHashSet {
-  def isNaN(dataType: DataType): Any => Boolean = {
+  def isNaNFuncAndValueNaN(dataType: DataType): (Any => Boolean, Any) = {
     dataType match {
       case DoubleType =>
-        (value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double])
+        ((value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double]),
+          java.lang.Double.NaN)
       case FloatType =>
-        (value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float])
-      case _ => (_: Any) => false
+        ((value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float]),
+          java.lang.Float.NaN)
+      case _ => ((_: Any) => false, null)
     }
   }
-  def valueNaN(dataType: DataType): Any = {
+  def isNaNFuncAndValueNaN(dataType: DataType, valueName: String): Option[(String, String)] = {
Review comment:
Yea, Done
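
For readers without the full diff at hand, the tuple-returning helper in the hunk above can be approximated without a Spark dependency. This is only a sketch: `ElemType`, `DoubleElem`, `FloatElem`, `OtherElem` and `NaNPairSketch` are hypothetical stand-ins for Spark's `DataType` objects, and the real helper may differ in detail.

```scala
// Hypothetical stand-ins for Spark's DataType objects; not Spark code.
object NaNPairSketch {
  sealed trait ElemType
  case object DoubleElem extends ElemType
  case object FloatElem extends ElemType
  case object OtherElem extends ElemType

  // Returns the NaN predicate and the NaN value for a type in a single call,
  // mirroring the shape (Any => Boolean, Any) shown in the diff.
  def isNaNFuncAndValueNaN(t: ElemType): (Any => Boolean, Any) = t match {
    case DoubleElem =>
      ((v: Any) => java.lang.Double.isNaN(v.asInstanceOf[Double]), Double.NaN)
    case FloatElem =>
      ((v: Any) => java.lang.Float.isNaN(v.asInstanceOf[Float]), Float.NaN)
    case OtherElem =>
      // Types without a NaN value never match and carry no NaN constant.
      ((_: Any) => false, null)
  }
}
```

Returning both pieces together means callers resolve the element type once instead of calling two separate helpers that each repeat the same match.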
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920971148
**[Test build #143347 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143347/testReport)** for PR 33993 at commit [`72870f6`](https://github.com/apache/spark/commit/72870f6350d870b4c2eb71c70be962affd5a237a).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921112877
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47867/
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r710700103
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3410,32 +3410,60 @@ case class ArrayDistinct(child: Expression)
   }
   override def nullSafeEval(array: Any): Any = {
-    val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+    val data = array.asInstanceOf[ArrayData]
     doEvaluation(data)
   }
   @transient private lazy val doEvaluation = if (TypeUtils.typeWithProperEquals(elementType)) {
-    (data: Array[AnyRef]) => new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+    (array: ArrayData) =>
+      val arrayBuffer = new scala.collection.mutable.ArrayBuffer[Any]
+      val hs = new SQLOpenHashSet[Any]()
+      val (isNaN, valueNaN) = SQLOpenHashSet.isNaNFuncAndValueNaN(elementType)
Review comment:
> `isNaN` and `valueNaN` are only used by the next function call `withNaNCheckFunc`, we can calculate them in `withNaNCheckFunc`
Yea, after making it return a partial function, the values will be calculated only once. Done
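
The "calculated once" point can be sketched as follows, assuming the wrapper resolves its NaN predicate when it is built and then hands back a plain function value that is applied per element. Everything here (`NaNCheckSketch`, `withNaNCheck`, `setupCalls`, the `isDouble` flag) is hypothetical and only loosely modelled on the `withNaNCheckFunc` call visible in the diff.

```scala
import scala.collection.mutable

// Hypothetical sketch; not the SQLOpenHashSet.withNaNCheckFunc implementation.
object NaNCheckSketch {
  // Counts how often the per-type setup runs; for demonstration only.
  var setupCalls = 0

  // Resolves the NaN predicate once, then returns a function that routes
  // each value to one of the two handlers.
  def withNaNCheck(
      isDouble: Boolean,
      handleNotNaN: Any => Unit,
      handleNaN: Any => Unit): Any => Unit = {
    setupCalls += 1
    val isNaN: Any => Boolean =
      if (isDouble) (v: Any) => v.asInstanceOf[Double].isNaN
      else (_: Any) => false
    var seenNaN = false
    (value: Any) =>
      if (isNaN(value)) {
        // Only the first NaN reaches the handler; later ones are dropped.
        if (!seenNaN) {
          seenNaN = true
          handleNaN(value)
        }
      } else {
        handleNotNaN(value)
      }
  }

  // Order-preserving distinct built on top of the wrapper.
  def distinct(values: Seq[Any], isDouble: Boolean): Seq[Any] = {
    val out = mutable.ArrayBuffer.empty[Any]
    val seen = mutable.HashSet.empty[Any]
    val process = withNaNCheck(
      isDouble,
      (v: Any) => if (seen.add(v)) out += v,
      (v: Any) => out += v)
    values.foreach(process)
    out.toSeq
  }
}
```

Because `process` is built before the loop, the type-dependent setup runs once per array rather than once per element.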
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921588012
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143385/
[GitHub] [spark] cloud-fan commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r709247445
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3491,17 +3521,41 @@ case class ArrayDistinct(child: Expression)
body
}
- val processArray = withArrayNullAssignment(
+ def withNaNCheck(body: String): String = {
Review comment:
shall we move these codegen utils to `SQLOpenHashSet` as well to reduce duplicated code?
[GitHub] [spark] cloud-fan closed pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
cloud-fan closed pull request #33993:
URL: https://github.com/apache/spark/pull/33993
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920972806
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143347/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920771530
**[Test build #143347 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143347/testReport)** for PR 33993 at commit [`72870f6`](https://github.com/apache/spark/commit/72870f6350d870b4c2eb71c70be962affd5a237a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920837653
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143346/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920139835
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47809/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919166595
**[Test build #143262 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143262/testReport)** for PR 33993 at commit [`0ac9924`](https://github.com/apache/spark/commit/0ac9924723944acf1b0e320e6bda5be264e98651).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921670874
**[Test build #143391 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143391/testReport)** for PR 33993 at commit [`434a2be`](https://github.com/apache/spark/commit/434a2beb096025db1e9363d73b1ff4a7ffab7c61).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920836860
**[Test build #143346 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143346/testReport)** for PR 33993 at commit [`5546a55`](https://github.com/apache/spark/commit/5546a55bdba6855c3a229a2b44778c1240b71bc1).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920332893
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143307/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920734541
**[Test build #143345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143345/testReport)** for PR 33993 at commit [`202cf4e`](https://github.com/apache/spark/commit/202cf4e966d24b063e2fe2d7c9179c42eb6257c3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921588012
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143385/
[GitHub] [spark] cloud-fan commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r710697090
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3410,32 +3410,60 @@ case class ArrayDistinct(child: Expression)
   }
   override def nullSafeEval(array: Any): Any = {
-    val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+    val data = array.asInstanceOf[ArrayData]
     doEvaluation(data)
   }
   @transient private lazy val doEvaluation = if (TypeUtils.typeWithProperEquals(elementType)) {
-    (data: Array[AnyRef]) => new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+    (array: ArrayData) =>
+      val arrayBuffer = new scala.collection.mutable.ArrayBuffer[Any]
+      val hs = new SQLOpenHashSet[Any]()
+      val (isNaN, valueNaN) = SQLOpenHashSet.isNaNFuncAndValueNaN(elementType)
Review comment:
`isNaN` and `valueNaN` are only used by the next function call, `withNaNCheckFunc`, so we can calculate them inside `withNaNCheckFunc`.
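For readers following the thread, the duplicated-NaN problem this PR addresses can be illustrated outside Spark. Under IEEE 754, `Double.NaN == Double.NaN` is false, so a de-duplication step that relies on primitive equality never recognises a second NaN as a repeat. The sketch below is a minimal standalone illustration, not the PR's implementation: `NaNAwareDistinct` and `distinctDoubles` are hypothetical names, and it uses a plain `scala.collection.mutable.HashSet` rather than Spark's `SQLOpenHashSet`.

```scala
import scala.collection.mutable

// Hypothetical standalone helper (not part of Spark): de-duplicates doubles
// while treating every NaN as one and the same value.
object NaNAwareDistinct {
  def distinctDoubles(values: Seq[Double]): List[Double] = {
    // Non-NaN values are tracked by their bit pattern. Note that this sketch
    // therefore keeps 0.0 and -0.0 as two separate values.
    val seenBits = mutable.HashSet.empty[Long]
    var seenNaN = false
    val out = mutable.ArrayBuffer.empty[Double]
    values.foreach { v =>
      if (v.isNaN) {
        // NaN never compares equal to itself, so remember it with a flag.
        if (!seenNaN) {
          seenNaN = true
          out += Double.NaN
        }
      } else if (seenBits.add(java.lang.Double.doubleToLongBits(v))) {
        // add returns true only the first time a bit pattern is inserted.
        out += v
      }
    }
    out.toList
  }
}
```

Calling `NaNAwareDistinct.distinctDoubles(Seq(Double.NaN, 1.0, Double.NaN, 1.0))` yields a list with a single NaN and a single `1.0`, in first-seen order. Keeping the NaN check and its flag together in one helper is in the spirit of the review comment above, which asks for `isNaN` and `valueNaN` to be computed where they are used.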
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921628160
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143387/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919420666
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143262/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920771959
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47850/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921445437
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47892/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920874071
**[Test build #143342 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143342/testReport)** for PR 33993 at commit [`2478eb4`](https://github.com/apache/spark/commit/2478eb446f8a47bc983661dbfd7801c7f6fe2230).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920147834
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47809/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920723076
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47848/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920730752
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47848/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920773099
**[Test build #143345 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143345/testReport)** for PR 33993 at commit [`202cf4e`](https://github.com/apache/spark/commit/202cf4e966d24b063e2fe2d7c9179c42eb6257c3).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920773210
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143345/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921123904
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47867/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921626552
**[Test build #143387 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143387/testReport)** for PR 33993 at commit [`21e9422`](https://github.com/apache/spark/commit/21e9422d9cd83590ba3b9935ef6596b82d5f9b4e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921123855
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47867/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921282041
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143360/
[GitHub] [spark] AngersZhuuuu commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921675421
ping @cloud-fan
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921461082
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47895/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921534626
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47902/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920332893
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143307/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921475961
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47895/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921544035
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47902/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921497259
**[Test build #143391 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143391/testReport)** for PR 33993 at commit [`434a2be`](https://github.com/apache/spark/commit/434a2beb096025db1e9363d73b1ff4a7ffab7c61).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921451536
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47894/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921434154
**[Test build #143387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143387/testReport)** for PR 33993 at commit [`21e9422`](https://github.com/apache/spark/commit/21e9422d9cd83590ba3b9935ef6596b82d5f9b4e).
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921628160
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143387/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919535649
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143267/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920788615
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47851/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921280951
**[Test build #143360 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143360/testReport)** for PR 33993 at commit [`389c9fd`](https://github.com/apache/spark/commit/389c9fd70b6cdff1df09aa90735844558ddef35b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921123904
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47867/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921475961
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47895/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920875649
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143342/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921445098
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47894/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921467329
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47894/
[GitHub] [spark] cloud-fan commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r710752639
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3410,32 +3410,59 @@ case class ArrayDistinct(child: Expression)
}
override def nullSafeEval(array: Any): Any = {
- val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+ val data = array.asInstanceOf[ArrayData]
doEvaluation(data)
}
@transient private lazy val doEvaluation = if (TypeUtils.typeWithProperEquals(elementType)) {
- (data: Array[AnyRef]) => new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+ (array: ArrayData) =>
+ val arrayBuffer = new scala.collection.mutable.ArrayBuffer[Any]
+ val hs = new SQLOpenHashSet[Any]()
+ val withNaNCheckFunc = SQLOpenHashSet.withNaNCheckFunc(elementType, hs,
+ (value: Any) =>
+ if (!hs.contains(value)) {
+ if (arrayBuffer.size > ByteArrayMethods.MAX_ROUNDED_ARRAY_LENGTH) {
+ ArrayBinaryLike.throwUnionLengthOverflowException(arrayBuffer.size)
+ }
+ arrayBuffer += value
+ hs.add(value)
+ },
+ (value: Any) => arrayBuffer += value)
Review comment:
nit: name it `valueNaN` to be clear
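For readers following the naming nit, here is a minimal, Spark-free sketch of the two-handler pattern the diff above uses. The names `NaNAwareDistinct` and `withNaNCheck` below are hypothetical stand-ins written for illustration; they are not the actual `SQLOpenHashSet.withNaNCheckFunc` signature. The second handler's parameter is named `valueNaN`, as the review suggests, so it is obvious which branch receives NaN.

```scala
import scala.collection.mutable

object NaNAwareDistinct {
  // Hypothetical helper (not Spark's API): routes each value to one of two handlers.
  def withNaNCheck(
      handleNotNaN: Double => Unit,
      handleNaN: Double => Unit): Double => Unit =
    (value: Double) => if (value.isNaN) handleNaN(value) else handleNotNaN(value)

  def distinct(input: Seq[Double]): Seq[Double] = {
    val out = mutable.ArrayBuffer.empty[Double]
    val seen = mutable.HashSet.empty[Double]
    // NaN never compares equal to itself with primitive ==, so track it with a flag.
    var seenNaN = false
    val add = withNaNCheck(
      (value: Double) => if (seen.add(value)) out += value,
      (valueNaN: Double) => if (!seenNaN) { seenNaN = true; out += valueNaN })
    input.foreach(add)
    out.toSeq
  }
}
```

Calling `NaNAwareDistinct.distinct(Seq(Double.NaN, 1.0, Double.NaN))` keeps a single NaN because the NaN branch is guarded by the flag rather than by a hash lookup.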
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921434154
**[Test build #143387 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143387/testReport)** for PR 33993 at commit [`21e9422`](https://github.com/apache/spark/commit/21e9422d9cd83590ba3b9935ef6596b82d5f9b4e).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919366104
Kubernetes integration test unable to build dist.
exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47770/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919374713
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47770/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920765589
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47850/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920972806
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143347/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920837653
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143346/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920757770
**[Test build #143346 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143346/testReport)** for PR 33993 at commit [`5546a55`](https://github.com/apache/spark/commit/5546a55bdba6855c3a229a2b44778c1240b71bc1).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920793747
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47851/
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r709250978
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3491,17 +3521,41 @@ case class ArrayDistinct(child: Expression)
body
}
- val processArray = withArrayNullAssignment(
+ def withNaNCheck(body: String): String = {
Review comment:
> shall we move these codegen utils to `SQLOpenHashSet` as well to reduce duplicated code?
How about doing this after everything else is done? This is not the only place that can be refactored.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920088524
**[Test build #143307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143307/testReport)** for PR 33993 at commit [`f5c5452`](https://github.com/apache/spark/commit/f5c54527905343d599d760e76a03a435962f8d1a).
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920773210
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143345/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920734541
**[Test build #143345 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143345/testReport)** for PR 33993 at commit [`202cf4e`](https://github.com/apache/spark/commit/202cf4e966d24b063e2fe2d7c9179c42eb6257c3).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36742][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919166595
**[Test build #143262 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143262/testReport)** for PR 33993 at commit [`0ac9924`](https://github.com/apache/spark/commit/0ac9924723944acf1b0e320e6bda5be264e98651).
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919535649
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143267/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921445457
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47892/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920803835
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47852/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920875649
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143342/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920771917
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47850/
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r709243579
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3410,32 +3410,63 @@ case class ArrayDistinct(child: Expression)
}
override def nullSafeEval(array: Any): Any = {
- val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+ val data = array.asInstanceOf[ArrayData]
doEvaluation(data)
}
@transient private lazy val doEvaluation = if (TypeUtils.typeWithProperEquals(elementType)) {
- (data: Array[AnyRef]) => new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+ (array: ArrayData) =>
+ val arrayBuffer = new scala.collection.mutable.ArrayBuffer[Any]
+ val hs = new SQLOpenHashSet[Any]()
+ val isNaN = SQLOpenHashSet.isNaN(elementType)
+ var i = 0
+ while (i < array.numElements()) {
+ if (array.isNullAt(i)) {
+ if (!hs.containsNull) {
+ hs.addNull
+ arrayBuffer += null
+ }
+ } else {
+ val elem = array.get(i, elementType)
+ if (isNaN(elem)) {
+ if (!hs.containsNaN) {
+ arrayBuffer += elem
Review comment:
> For this, let's wait for the decision at the first PR.
>
> * https://github.com/apache/spark/pull/33955/files#r708570515
Done
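As a rough illustration of the loop shape in the quoted diff, the sketch below tracks null and NaN with flags and everything else with an ordinary hash set. `FlaggedDoubleSet` and `LoopDistinct` are hypothetical, simplified stand-ins, not Spark's `SQLOpenHashSet`; `addNaN` in particular is an assumed counterpart to the `containsNaN` call visible in the diff, and a boxed `java.lang.Double` array stands in for `ArrayData` so that null elements can be represented.

```scala
import scala.collection.mutable

// Hypothetical stand-in: a set that records null and NaN via flags.
final class FlaggedDoubleSet {
  private val values = mutable.HashSet.empty[Double]
  private var hasNull = false
  private var hasNaN = false
  def containsNull: Boolean = hasNull
  def addNull(): Unit = hasNull = true
  def containsNaN: Boolean = hasNaN
  def addNaN(): Unit = hasNaN = true // assumed counterpart to containsNaN
  def contains(v: Double): Boolean = values.contains(v)
  def add(v: Double): Unit = values += v
}

object LoopDistinct {
  def distinct(array: Array[java.lang.Double]): Seq[java.lang.Double] = {
    val out = mutable.ArrayBuffer.empty[java.lang.Double]
    val hs = new FlaggedDoubleSet
    var i = 0
    while (i < array.length) {
      val elem = array(i)
      if (elem == null) {
        // Keep only the first null.
        if (!hs.containsNull) { hs.addNull(); out += null }
      } else if (elem.isNaN()) {
        // Keep only the first NaN; a hash lookup would not find it reliably.
        if (!hs.containsNaN) { hs.addNaN(); out += elem }
      } else if (!hs.contains(elem)) {
        hs.add(elem)
        out += elem
      }
      i += 1
    }
    out.toSeq
  }
}
```

The first null, the first NaN, and the first occurrence of each ordinary value are kept in input order; later repeats are dropped.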
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921282041
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143360/
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r708571227
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3410,32 +3410,63 @@ case class ArrayDistinct(child: Expression)
}
override def nullSafeEval(array: Any): Any = {
- val data = array.asInstanceOf[ArrayData].toArray[AnyRef](elementType)
+ val data = array.asInstanceOf[ArrayData]
doEvaluation(data)
}
@transient private lazy val doEvaluation = if (TypeUtils.typeWithProperEquals(elementType)) {
- (data: Array[AnyRef]) => new GenericArrayData(data.distinct.asInstanceOf[Array[Any]])
+ (array: ArrayData) =>
+ val arrayBuffer = new scala.collection.mutable.ArrayBuffer[Any]
+ val hs = new SQLOpenHashSet[Any]()
+ val isNaN = SQLOpenHashSet.isNaN(elementType)
+ var i = 0
+ while (i < array.numElements()) {
+ if (array.isNullAt(i)) {
+ if (!hs.containsNull) {
+ hs.addNull
+ arrayBuffer += null
+ }
+ } else {
+ val elem = array.get(i, elementType)
+ if (isNaN(elem)) {
+ if (!hs.containsNaN) {
+ arrayBuffer += elem
Review comment:
For this, let's wait for the decision at the first PR.
- https://github.com/apache/spark/pull/33955/files#r708570515
[GitHub] [spark] cloud-fan commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r709942031
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/SQLOpenHashSet.scala
##########
@@ -60,21 +60,52 @@ class SQLOpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
}
object SQLOpenHashSet {
- def isNaN(dataType: DataType): Any => Boolean = {
+ def isNaNFuncAndValueNaN(dataType: DataType): (Any => Boolean, Any) = {
dataType match {
case DoubleType =>
- (value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double])
+ ((value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double]),
+ java.lang.Double.NaN)
case FloatType =>
- (value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float])
- case _ => (_: Any) => false
+ ((value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float]),
+ java.lang.Float.NaN)
+ case _ => ((_: Any) => false, null)
}
}
- def valueNaN(dataType: DataType): Any = {
+ def isNaNFuncAndValueNaN(dataType: DataType, valueName: String): Option[(String, String)] = {
Review comment:
It seems we can remove this.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920771530
**[Test build #143347 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143347/testReport)** for PR 33993 at commit [`72870f6`](https://github.com/apache/spark/commit/72870f6350d870b4c2eb71c70be962affd5a237a).
[GitHub] [spark] cloud-fan commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
cloud-fan commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r710270899
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/SQLOpenHashSet.scala
##########
@@ -60,21 +60,59 @@ class SQLOpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
}
object SQLOpenHashSet {
- def isNaN(dataType: DataType): Any => Boolean = {
+ def isNaNFuncAndValueNaN(dataType: DataType): (Any => Boolean, Any) = {
dataType match {
case DoubleType =>
- (value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double])
+ ((value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double]),
+ java.lang.Double.NaN)
case FloatType =>
- (value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float])
- case _ => (_: Any) => false
+ ((value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float]),
+ java.lang.Float.NaN)
+ case _ => ((_: Any) => false, null)
}
}
- def valueNaN(dataType: DataType): Any = {
- dataType match {
- case DoubleType => java.lang.Double.NaN
- case FloatType => java.lang.Float.NaN
- case _ => null
+ def withNaNCheckFunc(
+ isNaN: Any => Boolean,
+ valueNaN: Any,
+ value: Any,
+ hashSet: SQLOpenHashSet[Any],
+ handleNotNaN: () => Unit,
+ handleNaN: Any => Unit): Unit = {
Review comment:
Can we make it return a function?
```
def withNaNCheckFunc(
dataType: DataType,
hashSet: SQLOpenHashSet[Any],
handleNotNaN: Any => Unit,
handleNaN: Any => Unit): Any => Unit = {
(dataType match {
case FloatType => ...
case DoubleType => ...
}).map { case (isNaN, valueNaN) =>
(value: Any) => ...
}.getOrElse {
(value: Any) => handleNotNaN(value)
}
}
```
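
To make the suggestion above concrete, here is one way the function-returning shape could look. This is only a sketch under stated assumptions: `ElemType`, `DoubleElem`, `FloatElem`, `OtherElem` and `NaNCheckSketch` are hypothetical stand-ins so the block compiles without Spark, the `hashSet` parameter is left out, and passing `valueNaN` to `handleNaN` is a guess at the parts elided with `...`. The point it illustrates is that the type match runs once, when the function is built, rather than once per element.

```scala
// Hypothetical stand-ins for Spark's DataType hierarchy; not the real API.
sealed trait ElemType
case object DoubleElem extends ElemType
case object FloatElem extends ElemType
case object OtherElem extends ElemType

object NaNCheckSketch {
  def withNaNCheckFunc(
      elemType: ElemType,
      handleNotNaN: Any => Unit,
      handleNaN: Any => Unit): Any => Unit = {
    // Resolve the type-specific NaN test and canonical NaN value once.
    val nanCheck: Option[(Any => Boolean, Any)] = elemType match {
      case DoubleElem =>
        Some(((v: Any) => java.lang.Double.isNaN(v.asInstanceOf[Double]), Double.NaN))
      case FloatElem =>
        Some(((v: Any) => java.lang.Float.isNaN(v.asInstanceOf[Float]), Float.NaN))
      case OtherElem => None
    }
    // Build the per-element function only for types that can hold NaN.
    val checked: Option[Any => Unit] = nanCheck.map { case (isNaN, valueNaN) =>
      (value: Any) => if (isNaN(value)) handleNaN(valueNaN) else handleNotNaN(value)
    }
    // Other types skip the NaN test entirely.
    checked.getOrElse(handleNotNaN)
  }
}
```

A caller would build the function once per expression, for example `val f = NaNCheckSketch.withNaNCheckFunc(DoubleElem, onValue, onNaN)`, and then apply `f` to each element inside the loop.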
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r710317852
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/SQLOpenHashSet.scala
##########
@@ -60,21 +60,59 @@ class SQLOpenHashSet[@specialized(Long, Int, Double, Float) T: ClassTag](
}
object SQLOpenHashSet {
- def isNaN(dataType: DataType): Any => Boolean = {
+ def isNaNFuncAndValueNaN(dataType: DataType): (Any => Boolean, Any) = {
dataType match {
case DoubleType =>
- (value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double])
+ ((value: Any) => java.lang.Double.isNaN(value.asInstanceOf[java.lang.Double]),
+ java.lang.Double.NaN)
case FloatType =>
- (value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float])
- case _ => (_: Any) => false
+ ((value: Any) => java.lang.Float.isNaN(value.asInstanceOf[java.lang.Float]),
+ java.lang.Float.NaN)
+ case _ => ((_: Any) => false, null)
}
}
- def valueNaN(dataType: DataType): Any = {
- dataType match {
- case DoubleType => java.lang.Double.NaN
- case FloatType => java.lang.Float.NaN
- case _ => null
+ def withNaNCheckFunc(
+ isNaN: Any => Boolean,
+ valueNaN: Any,
+ value: Any,
+ hashSet: SQLOpenHashSet[Any],
+ handleNotNaN: () => Unit,
+ handleNaN: Any => Unit): Unit = {
Review comment:
Good suggestion. Done.
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r709883506
##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala
##########
@@ -3491,17 +3521,41 @@ case class ArrayDistinct(child: Expression)
body
}
- val processArray = withArrayNullAssignment(
+ def withNaNCheck(body: String): String = {
Review comment:
@cloud-fan Updated. How does the current version look?
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919220064
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47765/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920683880
**[Test build #143342 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143342/testReport)** for PR 33993 at commit [`2478eb4`](https://github.com/apache/spark/commit/2478eb446f8a47bc983661dbfd7801c7f6fe2230).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921586672
**[Test build #143385 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143385/testReport)** for PR 33993 at commit [`d8e80fc`](https://github.com/apache/spark/commit/d8e80fc567f2524075cd6ee23787e47bca480c4f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920730752
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47848/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920803802
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47852/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920793747
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47851/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919418628
**[Test build #143262 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143262/testReport)** for PR 33993 at commit [`0ac9924`](https://github.com/apache/spark/commit/0ac9924723944acf1b0e320e6bda5be264e98651).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921420738
**[Test build #143385 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143385/testReport)** for PR 33993 at commit [`d8e80fc`](https://github.com/apache/spark/commit/d8e80fc567f2524075cd6ee23787e47bca480c4f).
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919338900
**[Test build #143267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143267/testReport)** for PR 33993 at commit [`63763df`](https://github.com/apache/spark/commit/63763dfa8518d4cdd3b90755f2502354363a2493).
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921530458
Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47902/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919374713
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47770/
[GitHub] [spark] AmplabJenkins commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920147939
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47809/
[GitHub] [spark] AngersZhuuuu commented on a change in pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AngersZhuuuu commented on a change in pull request #33993:
URL: https://github.com/apache/spark/pull/33993#discussion_r708461715
##########
File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollectionExpressionsSuite.scala
##########
@@ -2326,4 +2326,13 @@ class CollectionExpressionsSuite extends SparkFunSuite with ExpressionEvalHelper
Literal.create(Seq(Float.NaN, null, 1f), ArrayType(FloatType))),
Seq(Float.NaN, null, 1f))
}
+
+ test("SPARK-36740: ArrayDistinct should handle duplicated Double.NaN and Float.Nan") {
Review comment:
Done.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919534589
**[Test build #143267 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143267/testReport)** for PR 33993 at commit [`63763df`](https://github.com/apache/spark/commit/63763dfa8518d4cdd3b90755f2502354363a2493).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919338900
**[Test build #143267 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143267/testReport)** for PR 33993 at commit [`63763df`](https://github.com/apache/spark/commit/63763dfa8518d4cdd3b90755f2502354363a2493).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-919220064
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47765/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921672405
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/143391/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920771959
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47850/
[GitHub] [spark] SparkQA removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920088524
**[Test build #143307 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/143307/testReport)** for PR 33993 at commit [`f5c5452`](https://github.com/apache/spark/commit/f5c54527905343d599d760e76a03a435962f8d1a).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921467329
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47894/
[GitHub] [spark] SparkQA commented on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920730673
Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47848/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-920147939
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47809/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #33993: [SPARK-36741][SQL] ArrayDistinct handle duplicated Double.NaN and Float.Nan
Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #33993:
URL: https://github.com/apache/spark/pull/33993#issuecomment-921445457
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/47892/