Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2021/12/29 10:26:00 UTC

[jira] [Commented] (SPARK-37779) Make ColumnarToRowExec plan canonicalizable after (de)serialization

    [ https://issues.apache.org/jira/browse/SPARK-37779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17466391#comment-17466391 ] 

Apache Spark commented on SPARK-37779:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/35058

> Make ColumnarToRowExec plan canonicalizable after (de)serialization
> -------------------------------------------------------------------
>
>                 Key: SPARK-37779
>                 URL: https://issues.apache.org/jira/browse/SPARK-37779
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.0.3, 3.1.2, 3.2.0, 3.3.0
>            Reporter: Hyukjin Kwon
>            Priority: Minor
>
> SPARK-23731 made these plans serializable by leveraging lazy vals, but SPARK-28213 introduced a new code path that calls the lazy val, which triggers a NullPointerException at https://github.com/apache/spark/blob/77b164aac9764049a4820064421ef82ec0bc14fb/sql/core/src/main/scala/org/apache/spark/sql/execution/Columnar.scala#L68
> This can fail during canonicalization, for example during subexpression elimination:
> {code}
> java.lang.NullPointerException
>     at org.apache.spark.sql.execution.FileSourceScanExec.supportsColumnar$lzycompute(DataSourceScanExec.scala:280)
>     at org.apache.spark.sql.execution.FileSourceScanExec.supportsColumnar(DataSourceScanExec.scala:279)
>     at org.apache.spark.sql.execution.InputAdapter.supportsColumnar(WholeStageCodegenExec.scala:509)
>     at org.apache.spark.sql.execution.ColumnarToRowExec.<init>(Columnar.scala:67)
>     ...
>     at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:581)
>     at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:580)
>     at org.apache.spark.sql.execution.ScalarSubquery.canonicalized$lzycompute(subquery.scala:110)
>     ...
>     at org.apache.spark.sql.catalyst.expressions.ExpressionEquals.hashCode(EquivalentExpressions.scala:275)
>     ...
>     at scala.collection.mutable.HashTable.findEntry$(HashTable.scala:135)
>     at scala.collection.mutable.HashMap.findEntry(HashMap.scala:44)
>     at scala.collection.mutable.HashMap.get(HashMap.scala:74)
>     at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExpr(EquivalentExpressions.scala:46)
>     at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExprTreeHelper$1(EquivalentExpressions.scala:147)
>     at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExprTree(EquivalentExpressions.scala:170)
>     at org.apache.spark.sql.catalyst.expressions.SubExprEvaluationRuntime.$anonfun$proxyExpressions$1(SubExprEvaluationRuntime.scala:89)
>     at org.apache.spark.sql.catalyst.expressions.SubExprEvaluationRuntime.$anonfun$proxyExpressions$1$adapted(SubExprEvaluationRuntime.scala:89)
>     at scala.collection.immutable.List.foreach(List.scala:392)
> {code}
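>
> As a minimal, standalone sketch of the failure mode (the class and member names below are hypothetical stand-ins, not Spark's actual ones): a lazy val that dereferences a @transient field serializes fine as long as it is never forced beforehand, but forcing it for the first time on the deserialized copy throws a NullPointerException because the transient field has come back as null:
> {code}
> import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
>
> // Hypothetical stand-in for a physical plan node: `meta` plays the role of
> // FileSourceScanExec's @transient relation, and `supportsColumnar` the lazy
> // val that dereferences it.
> class PlanNode(@transient val meta: java.util.Map[String, String]) extends Serializable {
>   lazy val supportsColumnar: Boolean = meta.containsKey("columnar")
> }
>
> object Repro {
>   // Java-serialization round trip, standing in for sending a plan to an executor.
>   def roundTrip[T <: java.io.Serializable](obj: T): T = {
>     val buf = new ByteArrayOutputStream()
>     val out = new ObjectOutputStream(buf)
>     out.writeObject(obj)
>     out.close()
>     new ObjectInputStream(new ByteArrayInputStream(buf.toByteArray))
>       .readObject().asInstanceOf[T]
>   }
>
>   def main(args: Array[String]): Unit = {
>     val node = new PlanNode(java.util.Collections.singletonMap("columnar", "true"))
>     // The lazy val is deliberately not forced before serialization, so the
>     // round trip itself succeeds -- this is what making the field lazy bought.
>     val copy = roundTrip(node)
>     // `meta` was @transient and is now null; forcing the lazy val for the
>     // first time here throws a NullPointerException, mirroring the
>     // supportsColumnar$lzycompute frame in the trace above.
>     copy.supportsColumnar
>   }
> }
> {code}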



--
This message was sent by Atlassian Jira
(v8.20.1#820001)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org