Posted to issues@spark.apache.org by "Kapil Singh (Jira)" <ji...@apache.org> on 2022/07/06 06:02:00 UTC

[jira] [Updated] (SPARK-39690) Reuse exchange across subqueries is broken with AQE if subquery side exchange materialized first

     [ https://issues.apache.org/jira/browse/SPARK-39690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Kapil Singh updated SPARK-39690:
--------------------------------
    Description: 
When the main plan should reuse an Exchange from a subquery, and the Exchange inside the subquery materializes first, the main AdaptiveSparkPlanExec (ASPE) node does not have that stage's info (in [stageToReplace|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala#L243]) to replace in its current logical plan. As a result, AQE produces a new candidate physical plan that does not reuse the exchange present inside the subquery. Depending on how complex the inner plan is (number of exchanges), AQE can then choose a plan without ReusedExchange.

We have seen this with multiple queries on our private build. It can also happen with dynamic partition pruning (DPP).

  was:
When trying to reuse Exchange of a subquery in main plan, if the Exchange inside subquery materialize first then main ASPE node won't have that stage info (in [stageToReplace|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala#L243]) to replace in current logical plan. This will cause AQE to produce new candidate physical plan without reusing the exchange present inside subquery. And depending on how complex the inner plan is (no. of exchanges) AQE could choose plan without ReusedExchange. 

We have seen in with multiple queries with our private build. This can happen in DPP also.


> Reuse exchange across subqueries is broken with AQE if subquery side exchange materialized first
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-39690
>                 URL: https://issues.apache.org/jira/browse/SPARK-39690
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.3.0
>            Reporter: Kapil Singh
>            Priority: Major
>
> When the main plan should reuse an Exchange from a subquery, and the Exchange inside the subquery materializes first, the main AdaptiveSparkPlanExec (ASPE) node does not have that stage's info (in [stageToReplace|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/AdaptiveSparkPlanExec.scala#L243]) to replace in its current logical plan. As a result, AQE produces a new candidate physical plan that does not reuse the exchange present inside the subquery. Depending on how complex the inner plan is (number of exchanges), AQE can then choose a plan without ReusedExchange.
> We have seen this with multiple queries on our private build. It can also happen with dynamic partition pruning (DPP).
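
The ordering dependence described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's implementation: ToyAdaptivePlan, own_stages, materialize, replan, simulate, and the exchange keys "shuffle(t1)" and "shuffle(t2)" are hypothetical names made up for illustration. The one behavior it borrows from the report is that each adaptive plan node only records the stages it created itself.

```python
from dataclasses import dataclass, field


@dataclass
class ToyAdaptivePlan:
    """Toy stand-in for one adaptive plan node (main query or subquery).

    Hypothetical model for illustration only; it does not mirror Spark's classes.
    """
    exchanges: list                                  # exchange keys this plan needs
    own_stages: set = field(default_factory=set)     # stages this node created itself

    def materialize(self, key):
        # Each node records only the stages it created itself.
        self.own_stages.add(key)

    def replan(self):
        # Exchanges with a recorded stage stay as stage references;
        # all others are planned again as fresh exchanges.
        return [("stage" if key in self.own_stages else "new_exchange", key)
                for key in self.exchanges]


def simulate(subquery_first):
    """Return the main plan's re-planned exchanges for a given materialization order."""
    main = ToyAdaptivePlan(["shuffle(t1)", "shuffle(t2)"])
    sub = ToyAdaptivePlan(["shuffle(t1)"])
    # Whichever node materializes shuffle(t1) first is the one that records it.
    (sub if subquery_first else main).materialize("shuffle(t1)")
    main.materialize("shuffle(t2)")
    return main.replan()


print(simulate(subquery_first=True))
print(simulate(subquery_first=False))
```

In this toy model, the shared exchange comes back as a fresh exchange in the main plan only when the subquery side materializes it first, which corresponds to the missing ReusedExchange in the report.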


