Posted to issues@spark.apache.org by "jiahong.li (Jira)" <ji...@apache.org> on 2022/02/25 22:35:00 UTC

[jira] [Created] (SPARK-38333) DPP cause DataSourceScanExec java.lang.NullPointerException

jiahong.li created SPARK-38333:
----------------------------------

             Summary: DPP cause DataSourceScanExec java.lang.NullPointerException
                 Key: SPARK-38333
                 URL: https://issues.apache.org/jira/browse/SPARK-38333
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 3.1.2
            Reporter: jiahong.li


With dynamic partition pruning (DPP), we hit an NPE like the one below:

Caused by: java.lang.NullPointerException
    at org.apache.spark.sql.execution.DataSourceScanExec.$init$(DataSourceScanExec.scala:57)
    at org.apache.spark.sql.execution.FileSourceScanExec.<init>(DataSourceScanExec.scala:172)

...

    at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:56)
    at org.apache.spark.sql.catalyst.expressions.Predicate$.create(predicates.scala:101)
    at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2(basicPhysicalOperators.scala:246)
    at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2$adapted(basicPhysicalOperators.scala:245)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:885)

The root cause is the addExprTree function in EquivalentExpressions:

```

def addExprTree(
    expr: Expression,
    addFunc: Expression => Boolean = addExpr): Unit = {
  val skip = expr.isInstanceOf[LeafExpression] ||
    // `LambdaVariable` is usually used as a loop variable, which can't be evaluated ahead of the
    // loop. So we can't evaluate sub-expressions containing `LambdaVariable` at the beginning.
    expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
    // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
    // can cause error like NPE.
    (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)

  if (!skip && !addFunc(expr)) {
    childrenToRecurse(expr).foreach(addExprTree(_, addFunc))
    commonChildrenToRecurse(expr).filter(_.nonEmpty).foreach(addCommonExprs(_, addFunc))
  }
}

```

Maybe we should change the PlanExpression check like this:
```

(expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

```

This is because, in DPP, the filter expression looks like this:

DynamicPruningExpression(InSubqueryExec(value, broadcastValues, exprId))

So we should walk the children: if a PlanExpression such as InSubqueryExec is found anywhere in the tree, addExprTree should skip the expression, and the NPE will no longer occur.
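
To illustrate the difference between the two checks, here is a minimal sketch that does not depend on Spark. All of the types below (Expr, PlanExpr, Attr, InSubquery, DynamicPruning) are hypothetical toy stand-ins for the Catalyst classes, and the find method only imitates the pre-order search of the real one. The TaskContext condition is left out because it is the same in both versions.

```scala
// Toy stand-ins for Catalyst expression classes; these are NOT the real Spark types.
sealed trait Expr {
  def children: Seq[Expr]

  // Pre-order search: test this node first, then each child subtree in order.
  def find(p: Expr => Boolean): Option[Expr] =
    if (p(this)) Some(this)
    else children.foldLeft(Option.empty[Expr])((acc, c) => acc.orElse(c.find(p)))
}

// Marker for expressions that wrap a query plan (stand-in for PlanExpression).
trait PlanExpr extends Expr

case class Attr(name: String) extends Expr {
  def children: Seq[Expr] = Nil
}

// Stand-in for InSubqueryExec: a plan-wrapping expression with one child.
case class InSubquery(value: Expr) extends PlanExpr {
  def children: Seq[Expr] = Seq(value)
}

// Stand-in for DynamicPruningExpression: a plain wrapper around its child.
case class DynamicPruning(child: Expr) extends Expr {
  def children: Seq[Expr] = Seq(child)
}

object SkipCheckDemo {
  // Shape of the current check: only looks at the root node.
  def topLevelCheck(e: Expr): Boolean = e.isInstanceOf[PlanExpr]

  // Shape of the proposed check: looks at every node in the tree.
  def subtreeCheck(e: Expr): Boolean = e.find(_.isInstanceOf[PlanExpr]).isDefined

  // Same nesting as the DPP filter: the plan expression sits below the root.
  val filter: Expr = DynamicPruning(InSubquery(Attr("value")))

  def main(args: Array[String]): Unit = {
    println(topLevelCheck(filter)) // false: the root is the wrapper, not a PlanExpr
    println(subtreeCheck(filter))  // true: the nested PlanExpr is found
  }
}
```

In this toy model the root-only check misses the nested plan expression, so the wrapper would not be skipped. The tree-wide check catches it.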


