Posted to issues@spark.apache.org by "xiangxiang Shen (Jira)" <ji...@apache.org> on 2022/07/10 12:32:00 UTC
[jira] [Commented] (SPARK-39729) Why generate WholeStagecodegen for single operator?
[ https://issues.apache.org/jira/browse/SPARK-39729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17564676#comment-17564676 ]
xiangxiang Shen commented on SPARK-39729:
-----------------------------------------
CC [~tdas] [~dongjoon]
Thanks
> Why generate WholeStagecodegen for single operator?
> ---------------------------------------------------
>
> Key: SPARK-39729
> URL: https://issues.apache.org/jira/browse/SPARK-39729
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 3.3.0
> Reporter: xiangxiang Shen
> Priority: Major
>
> WholeStageCodegen gives better performance in many cases, but it should not be applied to a stage that contains only a single operator.
> Below is a simple experiment.
> {code:java}
> test("range/filter should be combined") {
>   val df = spark.range(10).filter("id = 1").selectExpr("id + 1")
>   val plan = df.queryExecution.executedPlan
>   assert(plan.find(_.isInstanceOf[WholeStageCodegenExec]).isDefined)
>   assert(df.collect() === Array(Row(2)))
>   df.explain(false)
>   df.queryExecution.debug.codegen
> }{code}
>
> If I add
> {code:java}
> override def supportCodegen: Boolean = false{code}
> in FilterExec.
>
> The physical plan is
> {code:java}
> == Physical Plan ==
> *(2) Project [(id#0L + 1) AS (id + 1)#4L]
> +- Filter (id#0L = 1)
>    +- *(1) Range (0, 10, step=1, splits=2){code}
>
> The performance is not good in this case.
> How can I disable WholeStageCodegen in cases like this?
>
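For reference, the question above can be explored with the session-level setting `spark.sql.codegen.wholeStage`. The sketch below is only an illustration, not something proposed in the issue. It assumes a local Spark 3.3 session, and the object name and app name are made up for the example. Note that this switch turns whole-stage codegen off for every operator in the session, so it does not by itself answer the narrower request of skipping codegen only for single-operator stages.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.WholeStageCodegenExec

// Hypothetical driver object, used only for this illustration.
object DisableWholeStageCodegen {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("disable-wholestage-codegen") // hypothetical app name
      .getOrCreate()

    // Session-wide switch: affects all operators, not just
    // stages that would contain a single operator.
    spark.conf.set("spark.sql.codegen.wholeStage", "false")

    // Same query as in the experiment from the issue description.
    val df = spark.range(10).filter("id = 1").selectExpr("id + 1")
    val plan = df.queryExecution.executedPlan

    // Collect whatever WholeStageCodegenExec nodes remain in the executed plan.
    val codegenStages = plan.collect { case w: WholeStageCodegenExec => w }
    println(s"WholeStageCodegenExec nodes: ${codegenStages.size}")

    df.explain(false)
    spark.stop()
  }
}
```

Comparing the `explain` output with the setting on and off shows which operators carry the `*(n)` codegen marker in each case.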