Posted to issues@spark.apache.org by "Kapil Singh (Jira)" <ji...@apache.org> on 2022/01/25 06:13:00 UTC
[jira] [Comment Edited] (SPARK-37995) TPCDS 1TB q72 fails when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false
[ https://issues.apache.org/jira/browse/SPARK-37995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17481561#comment-17481561 ]
Kapil Singh edited comment on SPARK-37995 at 1/25/22, 6:12 AM:
---------------------------------------------------------------
This seems related to this part of [PlanAdaptiveDynamicPruningFilters:79|https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/PlanAdaptiveDynamicPruningFilters.scala#L79]
{code:java}
// Here we can't call the QueryExecution.prepareExecutedPlan() method to
// get the sparkPlan as Non-AQE use case, which will cause the physical
// plan optimization rules be inserted twice, once in AQE framework and
// another in prepareExecutedPlan() method.
val sparkPlan = QueryExecution.createSparkPlan(session, planner, aggregate) {code}
In the non-AQE path, *prepareExecutedPlan* also plans the subquery inside the current dynamic pruning expression, but *createSparkPlan* does not. This leaves the inner subquery in the logical phase, which causes the cast exception later.
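A minimal, self-contained sketch (not actual Spark code; all class and method names here are hypothetical) of the failure mode described above: a tree walker that assumes every child is already a physical node blows up with a ClassCastException the moment a logical node was left behind in the tree, analogous to SparkPlanInfo.fromSparkPlan mapping over the plan's children.

```scala
// Hypothetical stand-ins for logical vs. physical plan nodes.
sealed trait TreeNode { def children: Seq[TreeNode] }
case class PhysicalNode(name: String, children: Seq[TreeNode] = Nil) extends TreeNode
case class LogicalNode(name: String, children: Seq[TreeNode] = Nil) extends TreeNode

// Walks the tree assuming every child is physical, as SparkPlanInfo does.
// A LogicalNode left in the tree fails the cast inside the map, mirroring
// the List.map frame at the top of the reported stack trace.
def describe(plan: PhysicalNode): String = {
  val childInfo = plan.children.map(_.asInstanceOf[PhysicalNode]).map(describe)
  if (childInfo.isEmpty) plan.name
  else s"${plan.name}(${childInfo.mkString(",")})"
}
```

With a fully planned tree this walks cleanly; with an unplanned subquery node inside, the cast throws, which is the shape of the q72 failure.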
> TPCDS 1TB q72 fails when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false
> ------------------------------------------------------------------------------------------------
>
> Key: SPARK-37995
> URL: https://issues.apache.org/jira/browse/SPARK-37995
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.2.0
> Reporter: Kapil Singh
> Priority: Major
> Attachments: full-stacktrace.txt
>
>
> TPCDS 1TB q72 fails on Spark 3.2 when spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly is false. We have been running with this config in 3.1 as well, where it worked fine. Disabling reuseBroadcastOnly used to add a subquery-based DPP filter in q72.
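A minimal sketch of the configuration under which the failure reproduces. Only the config key and value come from this report; the session setup around it is illustrative:

```scala
import org.apache.spark.sql.SparkSession

// Illustrative session setup; the config key below is the one this
// report says triggers the q72 failure on Spark 3.2 when set to false.
val spark = SparkSession.builder()
  .appName("tpcds-q72-repro")
  .config("spark.sql.optimizer.dynamicPartitionPruning.reuseBroadcastOnly", "false")
  .getOrCreate()
```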
> Relevant stack trace
> {code:java}
> Error: java.lang.ClassCastException: org.apache.spark.sql.catalyst.plans.logical.Project cannot be cast to org.apache.spark.sql.execution.SparkPlan
>     at scala.collection.immutable.List.map(List.scala:293)
>     at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:75)
>     at org.apache.spark.sql.execution.SparkPlanInfo$.$anonfun$fromSparkPlan$3(SparkPlanInfo.scala:75)
>     at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:286)
> ....
> ....
>     at org.apache.spark.sql.execution.SparkPlanInfo$.fromSparkPlan(SparkPlanInfo.scala:75)
>     at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.onUpdatePlan(AdaptiveSparkPlanExec.scala:708)
>     at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$2(AdaptiveSparkPlanExec.scala:239)
>     at scala.runtime.java8.JFunction1$mcVJ$sp.apply(JFunction1$mcVJ$sp.java:23)
>     at scala.Option.foreach(Option.scala:407)
>     at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.$anonfun$getFinalPhysicalPlan$1(AdaptiveSparkPlanExec.scala:239)
>     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
>     at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.getFinalPhysicalPlan(AdaptiveSparkPlanExec.scala:226)
>     at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.withFinalPlanUpdate(AdaptiveSparkPlanExec.scala:365)
>     at org.apache.spark.sql.execution.adaptive.AdaptiveSparkPlanExec.executeCollect(AdaptiveSparkPlanExec.scala:338) {code}
--
This message was sent by Atlassian Jira
(v8.20.1#820001)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org