You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by kiszk <gi...@git.apache.org> on 2018/08/08 18:49:24 UTC
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
GitHub user kiszk opened a pull request:
https://github.com/apache/spark/pull/22044
[SPARK-23912][SQL][Followup] Refactor ArrayDistinct
## What changes were proposed in this pull request?
This PR simplified code generation for `ArrayDistinct`. #21966 enabled code generation only if the type can be specialized by the hash set. This PR follows this strategy.
Optimization of null handling will be implemented in #21912.
## How was this patch tested?
Existing UTs
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/kiszk/spark SPARK-23912-follow
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22044.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22044
----
commit 9d0cb100a85dc77569e1a62b83e0461ee2c33ddb
Author: Kazuaki Ishizaki <is...@...>
Date: 2018-08-08T18:43:51Z
stop code generation when elementTypeSupportEquals is false
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/22044
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94461/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94487/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94475/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/22044
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22044#discussion_r208855525
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3410,6 +3410,28 @@ case class ArrayDistinct(child: Expression)
case _ => false
}
+ @transient protected lazy val canUseSpecializedHashSet = elementType match {
--- End diff --
Can we extract those common methods?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22044
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94475 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94475/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/22044
Thanks! merging to master.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1995/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1994/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94450/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94450/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94450/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on the issue:
https://github.com/apache/spark/pull/22044
LGTM pending Jenkins.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1981/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94488 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94488/testReport)** for PR 22044 at commit [`ec262ab`](https://github.com/apache/spark/commit/ec262abf69fccdc8d922b6643a0fc8605b5d3294).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94475 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94475/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94488/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94471 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94471/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1972/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94461 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94461/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/22044
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94487/testReport)** for PR 22044 at commit [`89d0664`](https://github.com/apache/spark/commit/89d0664684875deeb8da29c4d45a7076a5e07bb4).
* This patch **fails MiMa tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94471/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1984/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94488 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94488/testReport)** for PR 22044 at commit [`ec262ab`](https://github.com/apache/spark/commit/ec262abf69fccdc8d922b6643a0fc8605b5d3294).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94471 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94471/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22044#discussion_r208854615
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3517,56 +3510,24 @@ case class ArrayDistinct(child: Expression)
""".stripMargin
}
- private def setNotNullValue(isPrimitive: Boolean,
+ private def setNotNullValue(
--- End diff --
We can inline this?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22044#discussion_r208856510
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3517,56 +3510,24 @@ case class ArrayDistinct(child: Expression)
""".stripMargin
}
- private def setNotNullValue(isPrimitive: Boolean,
+ private def setNotNullValue(
distinctArray: String,
pos: String,
getValue1: String,
primitiveValueTypeName: String): String = {
- if (!isPrimitive) {
- s"$distinctArray[$pos] = $getValue1";
- } else {
- s"$distinctArray.set$primitiveValueTypeName($pos, $getValue1)";
- }
+ s"$distinctArray.set$primitiveValueTypeName($pos, $getValue1)"
}
private def setValueForFastEval(
--- End diff --
Maybe we should rename this.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22044#discussion_r208873100
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3442,17 +3464,15 @@ case class ArrayDistinct(child: Expression)
}
override def doGenCode(ctx: CodegenContext, ev: ExprCode): ExprCode = {
- nullSafeCodeGen(ctx, ev, (array) => {
- val i = ctx.freshName("i")
- val j = ctx.freshName("j")
- val sizeOfDistinctArray = ctx.freshName("sizeOfDistinctArray")
- val getValue1 = CodeGenerator.getValue(array, elementType, i)
- val getValue2 = CodeGenerator.getValue(array, elementType, j)
- val foundNullElement = ctx.freshName("foundNullElement")
- val openHashSet = classOf[OpenHashSet[_]].getName
- val hs = ctx.freshName("hs")
- val classTag = s"scala.reflect.ClassTag$$.MODULE$$.Object()"
- if (elementTypeSupportEquals) {
+ if (canUseSpecializedHashSet) {
+ nullSafeCodeGen(ctx, ev, (array) => {
+ val i = ctx.freshName("i")
+ val sizeOfDistinctArray = ctx.freshName("sizeOfDistinctArray")
+ val foundNullElement = ctx.freshName("foundNullElement")
+ val openHashSet = classOf[OpenHashSet[_]].getName
+ val hs = ctx.freshName("hs")
+ val classTag = s"scala.reflect.ClassTag$$.MODULE$$.$hsTypeName()"
+ val getValue1 = CodeGenerator.getValue(array, elementType, i)
--- End diff --
nit: `val getValue`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on a diff in the pull request:
https://github.com/apache/spark/pull/22044#discussion_r208862917
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3410,6 +3410,28 @@ case class ArrayDistinct(child: Expression)
case _ => false
}
+ @transient protected lazy val canUseSpecializedHashSet = elementType match {
--- End diff --
We can do so. To minimize the changes due to remaining time for cutting, I would like to do this in another PR #21912.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94487/testReport)** for PR 22044 at commit [`89d0664`](https://github.com/apache/spark/commit/89d0664684875deeb8da29c4d45a7076a5e07bb4).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistin...
Posted by ueshin <gi...@git.apache.org>.
Github user ueshin commented on a diff in the pull request:
https://github.com/apache/spark/pull/22044#discussion_r208849675
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collectionOperations.scala ---
@@ -3410,6 +3410,28 @@ case class ArrayDistinct(child: Expression)
case _ => false
}
+ @transient protected lazy val canUseSpecializedHashSet = elementType match {
+ case ByteType | ShortType | IntegerType | LongType | FloatType | DoubleType => true
+ case _ => false
+ }
+
+ @transient protected lazy val (hsPostFix, hsTypeName) = {
+ val ptName = CodeGenerator.primitiveTypeName (elementType)
--- End diff --
nit: extra space after `primitiveTypeName`.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22044
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1966/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22044: [SPARK-23912][SQL][Followup] Refactor ArrayDistinct
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22044
**[Test build #94461 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94461/testReport)** for PR 22044 at commit [`9d0cb10`](https://github.com/apache/spark/commit/9d0cb100a85dc77569e1a62b83e0461ee2c33ddb).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org