You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by cloud-fan <gi...@git.apache.org> on 2017/11/04 00:27:08 UTC
[GitHub] spark pull request #19656: [SPARK-22445][SQL] move CodegenContext.copyResult...
GitHub user cloud-fan opened a pull request:
https://github.com/apache/spark/pull/19656
[SPARK-22445][SQL] move CodegenContext.copyResult to CodegenSupport
## What changes were proposed in this pull request?
`CodegenContext.copyResult` is kind of a global status for whole stage codegen. But the tricky part is, it is only used to transfer an information from child to parent when calling the `consume` chain. We have to be super careful in `produce`/`consume`, to set it to true when producing multiple result rows, and set it to false in operators that start new pipeline(like sort).
This PR moves the `copyResult` to `CodegenSupport`, and call it at `WholeStageCodegenExec`. This is much easier to reason about.
## How was this patch tested?
existing tests
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/cloud-fan/spark whole-sage
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19656.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19656
----
commit 94beff0e229da4c5dd3cb3ddb677e5fe6e42499c
Author: Wenchen Fan <we...@databricks.com>
Date: 2017-11-04T00:14:22Z
move CodegenContext.copyResult to CodegenSupport
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83459/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83426/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83429/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83450/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83429/testReport)** for PR 19656 at commit [`94beff0`](https://github.com/apache/spark/commit/94beff0e229da4c5dd3cb3ddb677e5fe6e42499c).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83444 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83444/testReport)** for PR 19656 at commit [`d1de300`](https://github.com/apache/spark/commit/d1de3002f0fad391cdea44011ce1458ca0348956).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83459 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83459/testReport)** for PR 19656 at commit [`35c38d0`](https://github.com/apache/spark/commit/35c38d04a1ed7fbc0f637a54c797fc3b26e871a4).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83450 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83450/testReport)** for PR 19656 at commit [`7ccc3ea`](https://github.com/apache/spark/commit/7ccc3ea3207a89b8c56d224094e926ffeb800f20).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83444/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83450 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83450/testReport)** for PR 19656 at commit [`7ccc3ea`](https://github.com/apache/spark/commit/7ccc3ea3207a89b8c56d224094e926ffeb800f20).
* This patch **fails Spark unit tests**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19656: [SPARK-22445][SQL] move CodegenContext.copyResult...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19656#discussion_r148917096
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala ---
@@ -213,19 +213,32 @@ trait CodegenSupport extends SparkPlan {
}
/**
- * For optimization to suppress shouldStop() in a loop of WholeStageCodegen.
- * Returning true means we need to insert shouldStop() into the loop producing rows, if any.
+ * Whether or not the result rows of this operator should be copied before putting into a buffer.
+ *
+ * If any operator inside WholeStageCodegen generate multiple rows from a single row (for
+ * example, Join), this should be true.
+ *
+ * If an operator starts a new pipeline, this should be false.
*/
- def isShouldStopRequired: Boolean = {
- return shouldStopRequired && (this.parent == null || this.parent.isShouldStopRequired)
+ protected def needCopyResult: Boolean = {
+ if (children.isEmpty) {
+ false
+ } else if (children.length == 1) {
+ children.head.asInstanceOf[CodegenSupport].needStopCheck
+ } else {
+ throw new UnsupportedOperationException
+ }
}
/**
- * Set to false if this plan consumes all rows produced by children but doesn't output row
- * to buffer by calling append(), so the children don't require shouldStop()
- * in the loop of producing rows.
+ * Whether or not the children of this operator should generate a stop check when consuming input
+ * rows. This is used to suppress shouldStop() in a loop of WholeStageCodegen.
+ *
+ * This should be false if an operator starts a new pipeline, which means it consumes all rows
+ * produced by children but doesn't output row to buffer by calling append(), so the children
+ * don't require shouldStop() in the loop of producing rows.
*/
- protected def shouldStopRequired: Boolean = true
+ protected def needStopCheck: Boolean = parent.needStopCheck
--- End diff --
it's unrelated but a simple clean up: we only need one method instead of 2.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19656
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19656
A rough glance looks good. Will review carefully after solving the build.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on the issue:
https://github.com/apache/spark/pull/19656
LGTM except for two comments.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83449/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83444 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83444/testReport)** for PR 19656 at commit [`d1de300`](https://github.com/apache/spark/commit/d1de3002f0fad391cdea44011ce1458ca0348956).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19656: [SPARK-22445][SQL] move CodegenContext.copyResult...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/19656#discussion_r148933391
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala ---
@@ -213,19 +213,32 @@ trait CodegenSupport extends SparkPlan {
}
/**
- * For optimization to suppress shouldStop() in a loop of WholeStageCodegen.
- * Returning true means we need to insert shouldStop() into the loop producing rows, if any.
+ * Whether or not the result rows of this operator should be copied before putting into a buffer.
+ *
+ * If any operator inside WholeStageCodegen generate multiple rows from a single row (for
+ * example, Join), this should be true.
+ *
+ * If an operator starts a new pipeline, this should be false.
*/
- def isShouldStopRequired: Boolean = {
- return shouldStopRequired && (this.parent == null || this.parent.isShouldStopRequired)
+ def needCopyResult: Boolean = {
+ if (children.isEmpty) {
+ false
+ } else if (children.length == 1) {
+ children.head.asInstanceOf[CodegenSupport].needStopCheck
--- End diff --
`needCopyResult` here?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19656
@juliuszsompolski @rednaxelafx @kiszk @viirya @gatorsmile
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19656
thanks, merging to master! @kiszk I'll address your comments in follow-up if you have any.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83452/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83443/
Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83429/testReport)** for PR 19656 at commit [`94beff0`](https://github.com/apache/spark/commit/94beff0e229da4c5dd3cb3ddb677e5fe6e42499c).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83462 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83462/testReport)** for PR 19656 at commit [`35c38d0`](https://github.com/apache/spark/commit/35c38d04a1ed7fbc0f637a54c797fc3b26e871a4).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19656
I will look at this on Monday.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83443 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83443/testReport)** for PR 19656 at commit [`011be50`](https://github.com/apache/spark/commit/011be50028d769d6018a8e5d50a16a9e40950cc1).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by juliuszsompolski <gi...@git.apache.org>.
Github user juliuszsompolski commented on the issue:
https://github.com/apache/spark/pull/19656
`isShouldStopRequired` is currently not respected by most operators (they insert `shouldStop()` code regardless of this setting). If you're refactoring this, maybe make sure that all places that do `shouldStop()` use it?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83443 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83443/testReport)** for PR 19656 at commit [`011be50`](https://github.com/apache/spark/commit/011be50028d769d6018a8e5d50a16a9e40950cc1).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83459 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83459/testReport)** for PR 19656 at commit [`35c38d0`](https://github.com/apache/spark/commit/35c38d04a1ed7fbc0f637a54c797fc3b26e871a4).
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19656: [SPARK-22445][SQL] move CodegenContext.copyResult...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/19656
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test FAILed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/19656
retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19656
**[Test build #83462 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83462/testReport)** for PR 19656 at commit [`35c38d0`](https://github.com/apache/spark/commit/35c38d04a1ed7fbc0f637a54c797fc3b26e871a4).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Merged build finished. Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19656: [SPARK-22445][SQL] move CodegenContext.copyResult...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/19656#discussion_r148934478
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/WholeStageCodegenExec.scala ---
@@ -213,19 +213,32 @@ trait CodegenSupport extends SparkPlan {
}
/**
- * For optimization to suppress shouldStop() in a loop of WholeStageCodegen.
- * Returning true means we need to insert shouldStop() into the loop producing rows, if any.
+ * Whether or not the result rows of this operator should be copied before putting into a buffer.
+ *
+ * If any operator inside WholeStageCodegen generate multiple rows from a single row (for
+ * example, Join), this should be true.
+ *
+ * If an operator starts a new pipeline, this should be false.
*/
- def isShouldStopRequired: Boolean = {
- return shouldStopRequired && (this.parent == null || this.parent.isShouldStopRequired)
+ def needCopyResult: Boolean = {
+ if (children.isEmpty) {
+ false
+ } else if (children.length == 1) {
+ children.head.asInstanceOf[CodegenSupport].needStopCheck
--- End diff --
oops
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #19656: [SPARK-22445][SQL] move CodegenContext.copyResult...
Posted by viirya <gi...@git.apache.org>.
Github user viirya commented on a diff in the pull request:
https://github.com/apache/spark/pull/19656#discussion_r148933507
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ---
@@ -76,6 +76,20 @@ case class BroadcastHashJoinExec(
streamedPlan.asInstanceOf[CodegenSupport].inputRDDs()
}
+ override def needCopyResult: Boolean = joinType match {
+ case _: InnerLike | LeftOuter | RightOuter =>
+ // For inner and outer joins, one row from the streamed side may produce multiple result rows,
+ // if the build side has duplicated keys. Then we need to copy the result rows before putting
+ // them in a buffer, because these result rows share one UnsafeRow instance. Note that here
+ // we wait for the broadcast to be finished, which is a no-op because it's already finished
+ // when we wait it in `doProduce`.
+ buildPlan.executeBroadcast[HashedRelation]().value.keyIsUnique
--- End diff --
Seems to be `!buildPlan.executeBroadcast[HashedRelation]().value.keyIsUnique`?
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19656
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83462/
Test PASSed.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19656
Sorry for being late to look at this. LGTM
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19656
Jenkins, retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19656: [SPARK-22445][SQL] move CodegenContext.copyResult to Cod...
Posted by kiszk <gi...@git.apache.org>.
Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19656
Jenkins, retest this please
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org