Posted to issues@spark.apache.org by "Kapil Singh (Jira)" <ji...@apache.org> on 2022/01/25 06:13:00 UTC

[jira] [Comment Edited] (SPARK-37995) TPCDS 1TB q72 fails when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false

    [ https://issues.apache.org/jira/browse/SPARK-37995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481561#comment-17481561 ] 

Kapil Singh edited comment on SPARK-37995 at 1/25/22, 6:12 AM:
---------------------------------------------------------------

It seems to be related to this part of [PlanAdaptiveDynamicPruningFilters:79|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala#L79]:

 
{code:java}
// Here we can't call the QueryExecution.prepareExecutedPlan() method to
// get the sparkPlan as Non-AQE use case, which will cause the physical
// plan optimization rules be inserted twice, once in AQE framework and
// another in prepareExecutedPlan() method.
val sparkPlan = QueryExecution.createSparkPlan(session, planner, aggregate) {code}
In the non-AQE path, *prepareExecutedPlan* is also used to plan the subquery inside the current dynamic pruning expression, but *createSparkPlan* does not do this. That leaves the inner subquery in the logical phase, which causes the ClassCastException later.
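
To make this concrete, here is a rough spark-shell sketch (not from the original report; table and column names are illustrative, and the fact table is assumed to be partitioned on the join key) that produces a subquery-based DPP filter and lets you check whether the nested subquery was actually planned:
{code:scala}
// Sketch only: assumes fact_sales is a partitioned table (partitioned by
// sold_date_sk) and dim_date is a small dimension table; names are illustrative.
spark.conf.set("spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly", "false")

val df = spark.sql(
  """SELECT f.*
    |FROM fact_sales f
    |JOIN dim_date d ON f.sold_date_sk = d.d_date_sk
    |WHERE d.d_year = 2000""".stripMargin)

// In the physical plan, look for the partition filter wrapped in
// DynamicPruningExpression: via prepareExecutedPlan its subquery is a planned
// SparkPlan, while the createSparkPlan path can leave it as a logical Project,
// which SparkPlanInfo.fromSparkPlan later fails to cast.
df.explain(true)
df.collect()
{code}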

 


> TPCDS 1TB q72 fails when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false
> ------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-37995
>                 URL: https://issues.apache.org/jira/browse/SPARK-37995
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 3.2.0
>            Reporter: Kapil Singh
>            Priority: Major
>         Attachments: full-stacktrace.txt
>
>
> TPCDS 1TB q72 fails on Spark 3.2 when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false. We have been running with this config on 3.1 as well, and it worked fine in that version. Setting it to false adds a subquery-based DPP filter to q72.
> Relevant stack trace:
> {code:java}
> Error: java.lang.ClassCastException: org.apache.spark.sql.catalyst.plans.logical.Project cannot be cast to org.apache.spark.sql.execution.SparkPlan
>   at scala.collection.immutable.List.map(List.scala:293)
>   at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:75)
>   at org.apache.spark.sql.execution.SparkPlanInfo$.$anonfun$fromSparkPlan$3(SparkPlanInfo.scala:75)
>   at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
> ....
> ....
>   at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:75)
>   at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.onUpdatePlan(AdaptiveSparkPlanExec.scala:708)
>   at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$2(AdaptiveSparkPlanExec.scala:239)
>   at scala.runtime.java8.JFunction1$mcVJ$sp.apply(JFunction1$mcVJ$sp.java:23)
>   at scala.Option.foreach(Option.scala:407)
>   at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$1(AdaptiveSparkPlanExec.scala:239)
>   at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
>   at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.getFinalPhysicalPlan(AdaptiveSparkPlanExec.scala:226)
>   at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.withFinalPlanUpdate(AdaptiveSparkPlanExec.scala:365)
>   at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:338)
> {code}
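
For reference, a minimal session setup matching the configuration described above could look like the sketch below; the database name and query-file path are placeholders, not taken from this report.
{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch of the session configuration described in this issue; the database
// name and query-file location are placeholders.
val spark = SparkSession.builder()
  .appName("tpcds-q72-dpp")
  // AQE is enabled by default in 3.2.0; shown explicitly for clarity.
  .config("spark.sql.adaptive.enabled", "true")
  // Allowing DPP filters that are not backed by a reused broadcast adds a
  // subquery-based DPP filter to q72, which triggers the failure above.
  .config("spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly", "false")
  .getOrCreate()

spark.sql("USE tpcds_1tb")                               // placeholder database
val q72 = scala.io.Source.fromFile("q72.sql").mkString   // placeholder query file
spark.sql(q72).collect()                                 // fails with the ClassCastException above
{code}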



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org