Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/02/26 00:22:31 UTC

[GitHub] [spark] monkeyboy123 opened a new pull request #35662: [SPARK-38333][SQL] DPP cause DataSourceScanExec java.lang.NullPointer…

monkeyboy123 opened a new pull request #35662:
URL: https://github.com/apache/spark/pull/35662


   …Exception
   
   ### What changes were proposed in this pull request?
   Fixes a canonicalization NPE. This may be further evidence for [SPARK-35742](https://github.com/apache/spark/pull/32885), in line with what [SPARK-23731](https://github.com/apache/spark/pull/21815) describes.
   
   ### Why are the changes needed?
   This is a bug. Running the following SQL in YARN client mode generates a `DynamicPruningExpression`:
   ```
   drop table if exists test_b;
   create table test_b as select 1 as `uv` ,2 as `gmv`, 'gogo' as scenes, '20210101' as date1;
   drop table if exists dest;
   create table dest as 
   SELECT a.pt,
          a.scenes
   FROM    (
               SELECT    "20220101" as pt 
                        ,'comeon' AS scenes
               FROM    test_b where scenes='gogo' and exists(array(date1),x-> x =='20210101')
               UNION ALL
               SELECT  pt
                        ,'comeon'
               FROM    (
                           SELECT  pt,COUNT( distinct col2) AS buy_tab_uv
                           FROM    test_a_pt
                           where pt='20220101'
                           GROUP BY pt 
                       ) 
           ) a
   JOIN    (
               SELECT  pt ,COUNT(distinct col2) AS buy_tab_uv
                       FROM  test_a_pt
                       where pt='20220101'
                       GROUP BY pt 
           ) b
   ON      a.pt = b.pt
   ;
   ```
   Note that the `exists` function extends `CodegenFallback`, and DPP produces expressions of the form `DynamicPruningExpression(InSubqueryExec(value, broadcastValues, exprId))`.
   We should therefore traverse all children of an expression and, if a `PlanExpression` such as `InSubqueryExec` is found anywhere in the tree, skip `addExprTree`. With that change the NPE no longer occurs.
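   
   The difference between checking only the top-level node and searching the whole tree can be sketched on a toy expression tree. This is an illustration only: `Expr`, `Leaf`, `PlanLike`, `Wrapper`, `And` and `SkipCheckDemo` are hypothetical stand-ins rather than Spark classes, and the `find` here only mimics the pre-order search that `expr.find` performs in the diff.
   
   ```scala
   // Toy expression tree; hypothetical stand-ins, not Spark's Expression classes.
   sealed trait Expr {
     def children: Seq[Expr]
   
     // Pre-order search: test this node first, then each child subtree in order.
     def find(p: Expr => Boolean): Option[Expr] =
       if (p(this)) Some(this)
       else children.foldLeft(Option.empty[Expr])((acc, c) => acc.orElse(c.find(p)))
   }
   case class Leaf(name: String) extends Expr { val children: Seq[Expr] = Nil }
   // Plays the role of a PlanExpression such as InSubqueryExec.
   case class PlanLike(name: String) extends Expr { val children: Seq[Expr] = Nil }
   // Plays the role of DynamicPruningExpression wrapping the plan expression.
   case class Wrapper(child: Expr) extends Expr { val children: Seq[Expr] = Seq(child) }
   case class And(left: Expr, right: Expr) extends Expr {
     val children: Seq[Expr] = Seq(left, right)
   }
   
   object SkipCheckDemo {
     // Old-style check: only looks at the root node.
     def topLevelOnly(e: Expr): Boolean = e.isInstanceOf[PlanLike]
   
     // New-style check: looks at the root and every descendant.
     def anywhere(e: Expr): Boolean = e.find(_.isInstanceOf[PlanLike]).isDefined
   
     def main(args: Array[String]): Unit = {
       val pruning: Expr = Wrapper(PlanLike("in-subquery"))
       println(topLevelOnly(pruning)) // false: the root is a Wrapper
       println(anywhere(pruning))     // true: the nested PlanLike is found
     }
   }
   ```
   
   In this sketch the wrapped node is missed by the root-only check, which corresponds to the situation described above, where the `PlanExpression` sits below the root of the filter condition.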
   
   ### Does this PR introduce _any_ user-facing change?
   Yes.
   Before this PR, an NPE is thrown, like this:
   ```
   Caused by: java.lang.NullPointerException
    at org.apache.spark.sql.execution.DataSourceScanExec.$init$(DataSourceScanExec.scala:57)
    at org.apache.spark.sql.execution.FileSourceScanExec.<init>(DataSourceScanExec.scala:172)
    at org.apache.spark.sql.execution.FileSourceScanExec.doCanonicalize(DataSourceScanExec.scala:635)
    at org.apache.spark.sql.execution.FileSourceScanExec.doCanonicalize(DataSourceScanExec.scala:162)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.execution.exchange.ReusedExchangeExec.doCanonicalize(Exchange.scala:57)
    at org.apache.spark.sql.execution.exchange.ReusedExchangeExec.doCanonicalize(Exchange.scala:51)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.$anonfun$doCanonicalize$1(QueryPlan.scala:387)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.doCanonicalize(QueryPlan.scala:387)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doCanonicalize(BroadcastExchangeExec.scala:89)
    at org.apache.spark.sql.execution.exchange.BroadcastExchangeExec.doCanonicalize(BroadcastExchangeExec.scala:72)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.execution.exchange.ReusedExchangeExec.doCanonicalize(Exchange.scala:57)
    at org.apache.spark.sql.execution.exchange.ReusedExchangeExec.doCanonicalize(Exchange.scala:51)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.execution.SubqueryBroadcastExec.doCanonicalize(SubqueryBroadcastExec.scala:66)
    at org.apache.spark.sql.execution.SubqueryBroadcastExec.doCanonicalize(SubqueryBroadcastExec.scala:41)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized$lzycompute(QueryPlan.scala:373)
    at org.apache.spark.sql.catalyst.plans.QueryPlan.canonicalized(QueryPlan.scala:372)
    at org.apache.spark.sql.execution.InSubqueryExec.canonicalized$lzycompute(subquery.scala:165)
    at org.apache.spark.sql.execution.InSubqueryExec.canonicalized(subquery.scala:162)
    at org.apache.spark.sql.execution.InSubqueryExec.canonicalized(subquery.scala:113)
    at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$canonicalized$1(Expression.scala:229)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized$lzycompute(Expression.scala:229)
    at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized(Expression.scala:228)
    at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$canonicalized$1(Expression.scala:229)
    at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:238)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at scala.collection.TraversableLike.map(TraversableLike.scala:238)
    at scala.collection.TraversableLike.map$(TraversableLike.scala:231)
    at scala.collection.immutable.List.map(List.scala:298)
    at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized$lzycompute(Expression.scala:229)
    at org.apache.spark.sql.catalyst.expressions.Expression.canonicalized(Expression.scala:228)
    at org.apache.spark.sql.catalyst.expressions.Expression.semanticHash(Expression.scala:248)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions$Expr.hashCode(EquivalentExpressions.scala:41)
    at scala.runtime.Statics.anyHash(Statics.java:122)
    at scala.collection.mutable.HashTable$HashUtils.elemHashCode(HashTable.scala:416)
    at scala.collection.mutable.HashTable$HashUtils.elemHashCode$(HashTable.scala:416)
    at scala.collection.mutable.HashMap.elemHashCode(HashMap.scala:44)
    at scala.collection.mutable.HashTable.findEntry(HashTable.scala:136)
    at scala.collection.mutable.HashTable.findEntry$(HashTable.scala:135)
    at scala.collection.mutable.HashMap.findEntry(HashMap.scala:44)
    at scala.collection.mutable.HashMap.get(HashMap.scala:74)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExpr(EquivalentExpressions.scala:55)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$addExprTree$default$2$1(EquivalentExpressions.scala:143)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.$anonfun$addExprTree$default$2$1$adapted(EquivalentExpressions.scala:143)
    at org.apache.spark.sql.catalyst.expressions.EquivalentExpressions.addExprTree(EquivalentExpressions.scala:152)
    at org.apache.spark.sql.catalyst.expressions.SubExprEvaluationRuntime.$anonfun$proxyExpressions$1(SubExprEvaluationRuntime.scala:89)
    at org.apache.spark.sql.catalyst.expressions.SubExprEvaluationRuntime.$anonfun$proxyExpressions$1$adapted(SubExprEvaluationRuntime.scala:89)
    at scala.collection.immutable.List.foreach(List.scala:392)
    at org.apache.spark.sql.catalyst.expressions.SubExprEvaluationRuntime.proxyExpressions(SubExprEvaluationRuntime.scala:89)
    at org.apache.spark.sql.catalyst.expressions.InterpretedPredicate.<init>(predicates.scala:53)
    at org.apache.spark.sql.catalyst.expressions.Predicate$.createInterpretedObject(predicates.scala:92)
    at org.apache.spark.sql.catalyst.expressions.Predicate$.createInterpretedObject(predicates.scala:85)
    at org.apache.spark.sql.catalyst.expressions.CodeGeneratorWithInterpretedFallback.createObject(CodeGeneratorWithInterpretedFallback.scala:56)
    at org.apache.spark.sql.catalyst.expressions.Predicate$.create(predicates.scala:101)
    at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2(basicPhysicalOperators.scala:246)
    at org.apache.spark.sql.execution.FilterExec.$anonfun$doExecute$2$adapted(basicPhysicalOperators.scala:245)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2(RDD.scala:885)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsWithIndexInternal$2$adapted(RDD.scala:885)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.UnionRDD.compute(UnionRDD.scala:106)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$simpleRun$3(Executor.scala:514)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1439)
    at org.apache.spark.executor.Executor$TaskRunner.simpleRun(Executor.scala:517)
    at org.apache.spark.executor.Executor$TaskRunner$$anon$1.run(Executor.scala:442)
    at org.apache.spark.executor.Executor$TaskRunner$$anon$1.run(Executor.scala:441)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:441)
    ... 3 more (state=,code=0)
   ```
   After this PR, the query runs without the NPE.
   
   ### How was this patch tested?
   No tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1054200620


   > I think we need to find which PR actually fixed this issue; backporting that one to Spark 3.1 would be the best way. Or is the code path different between 3.1 and master?
   
   It was fixed by https://github.com/apache/spark/pull/32947 together with DPP-related code changes: in Spark master/3.2 this SQL is translated to a normal join instead of using DPP.
   Also, the error looks more like [SPARK-29239](https://github.com/apache/spark/pull/25925).
   



[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r815950849



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       > We can also add a UT in `SubexpressionEliminationSuite`, which tests `addExprTree` directly.
   
   I don't think adding a UT in `SubexpressionEliminationSuite` is feasible, because `InSubqueryExec` lives in the Spark SQL module and cannot be accessed from the Spark Catalyst module.





[GitHub] [spark] monkeyboy123 edited a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 edited a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053498753


   > On the other hand, if we backport [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) to branch-3.1, can this issue be solved?
   
   After backporting [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798),
   a new NullPointerException is thrown:
   ```
   case class FileSourceScanExec(
       @transient relation: HadoopFsRelation,
       output: Seq[Attribute],
       requiredSchema: StructType,
       partitionFilters: Seq[Expression],
       optionalBucketSet: Option[BitSet],
       optionalNumCoalescedBuckets: Option[Int],
       dataFilters: Seq[Expression],
       tableIdentifier: Option[TableIdentifier],
       disableBucketedScan: Boolean = false)
     extends DataSourceScanExec {
   
     // Note that some vals referring the file-based relation are lazy intentionally
     // so that this plan can be canonicalized on executor side too. See SPARK-23731.
     override lazy val supportsColumnar: Boolean = {
       relation.fileFormat.supportBatch(relation.sparkSession, schema)
     }
   ```
   because `relation` is null.
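   
   A minimal sketch of how a `@transient` constructor field can end up null, assuming the plan reaches the executor through Java serialization. `Relation`, `Scan` and `TransientDemo` are hypothetical stand-ins for `HadoopFsRelation` and `FileSourceScanExec`; this is not the Spark code path itself.
   
   ```scala
   import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}
   
   // Hypothetical stand-in for HadoopFsRelation; it does not need to be Serializable
   // because the only reference to it is transient.
   class Relation(val supportsBatch: Boolean)
   
   // Hypothetical stand-in for FileSourceScanExec.
   case class Scan(@transient relation: Relation, name: String) {
     // Lazy, so building or deserializing a Scan does not touch `relation`;
     // the dereference only happens when the value is first requested.
     lazy val supportsColumnar: Boolean = relation.supportsBatch
   }
   
   object TransientDemo {
     // Serialize and deserialize a value, as a rough model of shipping it to an executor.
     def roundTrip[T <: java.io.Serializable](value: T): T = {
       val buffer = new ByteArrayOutputStream()
       val out = new ObjectOutputStream(buffer)
       out.writeObject(value)
       out.close()
       val in = new ObjectInputStream(new ByteArrayInputStream(buffer.toByteArray))
       try in.readObject().asInstanceOf[T] finally in.close()
     }
   
     def main(args: Array[String]): Unit = {
       val onDriver = Scan(new Relation(supportsBatch = true), "t")
       // Serialize before the lazy val has been evaluated on the driver side.
       val onExecutor = roundTrip(onDriver)
   
       println(onDriver.supportsColumnar)   // true: relation is still set
       println(onExecutor.relation == null) // true: transient field was not serialized
       // Evaluating the lazy val on the copy dereferences null.
       println(scala.util.Try(onExecutor.supportsColumnar).isFailure) // true
     }
   }
   ```
   
   The lazy val keeps deserialization itself safe, but any code that later forces it on the deserialized copy still hits the null field.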
   



[GitHub] [spark] monkeyboy123 edited a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 edited a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053498753


   > On the other hand, if we backport [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) to branch-3.1, can this issue be solved?
   
   I will try backporting [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) and report back.
   A new NullPointerException is thrown:
   ```
   case class FileSourceScanExec(
       @transient relation: HadoopFsRelation,
       output: Seq[Attribute],
       requiredSchema: StructType,
       partitionFilters: Seq[Expression],
       optionalBucketSet: Option[BitSet],
       optionalNumCoalescedBuckets: Option[Int],
       dataFilters: Seq[Expression],
       tableIdentifier: Option[TableIdentifier],
       disableBucketedScan: Boolean = false)
     extends DataSourceScanExec {
   
     // Note that some vals referring the file-based relation are lazy intentionally
     // so that this plan can be canonicalized on executor side too. See SPARK-23731.
     override lazy val supportsColumnar: Boolean = {
       relation.fileFormat.supportBatch(relation.sparkSession, schema)
     }
   ```
   because `relation` is null.
   



[GitHub] [spark] monkeyboy123 closed pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 closed pull request #35662:
URL: https://github.com/apache/spark/pull/35662


   



[GitHub] [spark] monkeyboy123 removed a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 removed a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053513496


   > FileSourceScanExec
   
   



[GitHub] [spark] AngersZhuuuu commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053952820


   > > On the other hand, if we backport [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) to branch-3.1, can this issue be solved?
   > 
   > After backport [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) , new NullPointerException will throws:
   > 
   > ```
   > case class FileSourceScanExec(
   >     @transient relation: HadoopFsRelation,
   >     output: Seq[Attribute],
   >     requiredSchema: StructType,
   >     partitionFilters: Seq[Expression],
   >     optionalBucketSet: Option[BitSet],
   >     optionalNumCoalescedBuckets: Option[Int],
   >     dataFilters: Seq[Expression],
   >     tableIdentifier: Option[TableIdentifier],
   >     disableBucketedScan: Boolean = false)
   >   extends DataSourceScanExec {
   > 
   >   // Note that some vals referring the file-based relation are lazy intentionally
   >   // so that this plan can be canonicalized on executor side too. See SPARK-23731.
   >   override lazy val supportsColumnar: Boolean = {
   >     relation.fileFormat.supportBatch(relation.sparkSession, schema)
   >   }
   > ```
   > 
   > because `relation` is null
   
   I think we need to find out which PR actually fixes this issue. Is backporting it to Spark 3.1 the best way, or is the code path different between 3.1 and master?
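
A minimal sketch of how a `@transient` constructor field can end up null, assuming the plan node reaches the executor through Java serialization. `Relation`, `Scan`, and `roundTrip` are hypothetical stand-ins for illustration, not Spark classes. The transient field is skipped during serialization, so a lazy val that dereferences it after deserialization throws a NullPointerException.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, ObjectInputStream, ObjectOutputStream}

object TransientSketch {
  // Hypothetical stand-in for a driver-side relation; deliberately not Serializable.
  class Relation(val supportsBatch: Boolean)

  // Hypothetical stand-in for a scan node. Case classes are Serializable,
  // and @transient excludes `relation` from the serialized form.
  case class Scan(@transient relation: Relation, name: String) {
    // Evaluated on first access, which may happen after deserialization.
    lazy val supportsColumnar: Boolean = relation.supportsBatch
  }

  // Serialize and deserialize a value in memory, as a stand-in for shipping it to an executor.
  def roundTrip[T <: AnyRef](value: T): T = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    out.writeObject(value)
    out.close()
    val in = new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray))
    try in.readObject().asInstanceOf[T] finally in.close()
  }
}
```

Calling `supportsColumnar` on the original `Scan` works, while calling it on the result of `roundTrip` fails because `relation` was not serialized.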



[GitHub] [spark] weixiuli commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

weixiuli commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053780581


   > It is hard to add a unit test, as it only happens at runtime
   
   Good catch, we may add a small end-to-end test?
   



[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053823953


   > Good catch, we may add a small end-to-end test?
   
   Agree with @weixiuli +1



[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053597756


   cc @cloud-fan 



[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053969288


   So we should add a new test case; then we can use it to find out faster which patches need to be backported.
   
   



[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r838406159



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       > Do we have a master branch PR now?
   
   Done, new PR: [SPARK-38333](https://github.com/apache/spark/pull/36012)





[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r815950849



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       > We can also add a UT in `SubexpressionEliminationSuite`, which tests `addExprTree` directly.
   
   I think adding a UT in `SubexpressionEliminationSuite` is not reasonable, as we cannot access `InSubqueryExec` from the Spark Catalyst module.





[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1055095432


   > So I think we should also find out which PR stops the query from being translated to DPP, and backport it to branch-3.1. Then we can fix this PR's issue in master/3.2/3.1. WDYT? @LuciferYang @cloud-fan
   
   +1 agree with you



[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r816590264



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       unit test added





[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r838368603



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       > Do we have a master branch PR now?
   
   Sorry for the late reply, I will open a PR right now.





[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053197830


   For a bug-fix PR, we need to add at least one UT. The new UT should fail before this PR and pass after it, which also helps ensure that future changes in other PRs will not break this fix.
   
   



[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1052126484


   Does this issue only exist in Spark 3.1? If yes, please add `[3.1]` to the PR title. If not, please send a PR to the master branch first.
   
   



[GitHub] [spark] monkeyboy123 edited a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 edited a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053498753


   > On the other hand, if we backport [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) to branch-3.1, can this issue be solved?
   
   I will try backporting [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798), then report back.
   



[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r815935904



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       OK, I will open another PR to fix it on the master branch and add a UT.






[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1085236868


   Thanks to all of you for the review.



[GitHub] [spark] AngersZhuuuu commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1055067849


   > It is fixed by #32947 and the DPP-related code: in Spark master/3.2 the SQL is translated to a normal join instead of DPP. Also, the error is more like [SPARK-29239](https://github.com/apache/spark/pull/25925); in Spark master/3.2 the skip function in `addExprTree` should have the same problem, but the demo SQL I pasted does not trigger it.
   
   So I think we should also find out which PR stops the query from being translated to DPP, and backport it to branch-3.1. Then we can fix this PR's issue in master/3.2/3.1. WDYT? @LuciferYang @cloud-fan 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053170908


   > > Yes, it only exists in Spark 3.1. Updated.
   > 3.
   > Could you explain why master/3.2 does not have this issue?
   
   It is fixed by [SPARK-35798](https://github.com/apache/spark/pull/32947) and the DPP-related code: in Spark master/3.2 the SQL is translated to a normal join instead of DPP.



[GitHub] [spark] LuciferYang edited a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang edited a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053197830


   For a bug-fix PR, we need to add at least one UT. The new UT should fail before this PR and pass after it, which also helps ensure that future changes in other PRs will not break this fix.
   
   



[GitHub] [spark] AmplabJenkins commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053569344


   Can one of the admins verify this patch?



[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r835928857



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       @cloud-fan  How should this pr title be named? Maybe this is a potential problem.





[GitHub] [spark] LuciferYang commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r836081212



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       @monkeyboy123 we can replace `expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined` with `TreeNode.exists` api now
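
A small sketch of the equivalence behind this suggestion, on a toy tree rather than Catalyst's `TreeNode`. The `Node` class and its `find`/`exists` methods are hypothetical. The sketch assumes `TreeNode.exists(f)` returns true when any node in the tree satisfies `f`, which is what `find(f).isDefined` computes.

```scala
object ExistsSketch {
  // Hypothetical toy tree, not Catalyst's TreeNode.
  case class Node(label: String, children: Seq[Node] = Nil) {
    // Pre-order search for the first node matching the predicate.
    def find(f: Node => Boolean): Option[Node] =
      if (f(this)) Some(this)
      else children.foldLeft(Option.empty[Node])((acc, c) => acc.orElse(c.find(f)))

    // True if this node or any descendant matches; returns a Boolean directly.
    def exists(f: Node => Boolean): Boolean =
      f(this) || children.exists(_.exists(f))
  }

  // A tree whose matching node sits two levels below the root.
  val tree: Node = Node("and", Seq(Node("eq"), Node("in", Seq(Node("subquery")))))
}
```

On this toy tree, `tree.find(p).isDefined` and `tree.exists(p)` agree for any predicate `p`, so the replacement would be a readability change rather than a behavioral one.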





[GitHub] [spark] cloud-fan commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

cloud-fan commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r836079372



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       Do we have a master branch PR now?





[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053204655


   On the other hand, if we backport SPARK-35798 to branch-3.1, can this issue be solved?
   
   



[GitHub] [spark] cloud-fan commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

cloud-fan commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r815912254



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       I think we should fix it at the master branch as well, as the code does not match the comment.
   
   We can also add a UT in `SubexpressionEliminationSuite`, which tests `addExprTree` directly.
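
To illustrate the mismatch between the comment and the code, here is a minimal sketch on a toy expression tree. `Expr`, `Literal`, `Add`, and `PlanLike` are hypothetical stand-ins (the last one playing the role of `PlanExpression`), and the `TaskContext` condition is left out. A top-level `isInstanceOf` check misses a plan-wrapping expression nested under another expression, while a `find` over the tree detects it.

```scala
object SkipCheckSketch {
  // Hypothetical stand-ins for Catalyst expressions.
  sealed trait Expr {
    def children: Seq[Expr]
    // Pre-order search for the first node matching the predicate.
    def find(f: Expr => Boolean): Option[Expr] =
      if (f(this)) Some(this)
      else children.foldLeft(Option.empty[Expr])((acc, c) => acc.orElse(c.find(f)))
  }
  case class Literal(value: Int) extends Expr { def children: Seq[Expr] = Nil }
  case class PlanLike(name: String) extends Expr { def children: Seq[Expr] = Nil }
  case class Add(left: Expr, right: Expr) extends Expr {
    def children: Seq[Expr] = Seq(left, right)
  }

  // Shape of the check before the change: looks only at the root node.
  def rootOnly(e: Expr): Boolean = e.isInstanceOf[PlanLike]

  // Shape of the check after the change: looks at every node in the tree.
  def anywhere(e: Expr): Boolean = e.find(_.isInstanceOf[PlanLike]).isDefined

  // The plan-wrapping node is a child, not the root.
  val nested: Expr = Add(Literal(1), PlanLike("subquery"))
}
```

Here `rootOnly(nested)` is false while `anywhere(nested)` is true. A test that calls `addExprTree` directly could use an input shaped like `nested`.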





[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r838378581



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       > @monkeyboy123 we can replace `expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined` with `TreeNode.exists` api now
   
   OK
   
   





[GitHub] [spark] monkeyboy123 edited a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 edited a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1054200620


   > I think we need to find out which PR actually fixes this issue. Is backporting it to Spark 3.1 the best way, or is the code path different between 3.1 and master?
   
   It is fixed by https://github.com/apache/spark/pull/32947 and the DPP-related code: in Spark master/3.2 the SQL is translated to a normal join instead of DPP.
   Also, the error is more like [SPARK-29239](https://github.com/apache/spark/pull/25925); in Spark master/3.2 the skip function in `addExprTree` should hit the same problem, but the demo SQL I pasted does not trigger it.
   



[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053513496


   > FileSourceScanExec
   
   



[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053894420


   > Good catch, we may add a small end-to-end test?
   
   I will add a test later



[GitHub] [spark] monkeyboy123 edited a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 edited a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1054200620


   > I think we need to find which PR fixed this issue correctly. Then backporting it to Spark 3.1 is the best way? Or is the code path not the same between 3.1 and master?
   
   It is fixed by https://github.com/apache/spark/pull/32947 and by the DPP-related code: in Spark master/3.2 the SQL is translated to a normal join instead of DPP.
   Also, the error is more like [SPARK-29239](https://github.com/apache/spark/pull/25925). In Spark master/3.2 the skip function in addExprTree should have the same problem, but the demo SQL I pasted does not trigger it.
   



[GitHub] [spark] monkeyboy123 commented on a change in pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on a change in pull request #35662:
URL: https://github.com/apache/spark/pull/35662#discussion_r838406159



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/EquivalentExpressions.scala
##########
@@ -147,7 +147,7 @@ class EquivalentExpressions {
       expr.find(_.isInstanceOf[LambdaVariable]).isDefined ||
       // `PlanExpression` wraps query plan. To compare query plans of `PlanExpression` on executor,
       // can cause error like NPE.
-      (expr.isInstanceOf[PlanExpression[_]] && TaskContext.get != null)
+      (expr.find(_.isInstanceOf[PlanExpression[_]]).isDefined && TaskContext.get != null)

Review comment:
       > potential
   
   Done. New PR: [SPARK-38333](https://github.com/apache/spark/pull/36012)





[GitHub] [spark] LuciferYang commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1052978689


   > Yes, only exists in Spark 3.1. Updated
   
   Could you explain why master/3.2 does not have this issue?



[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053528888


   > EquivalentExpressions
   
   
   
   > For a bug-fix PR, we need to add at least one UT. The new UT should fail before this PR and pass after it, which also helps ensure that future changes from other PRs will not break this fix.
   
   It is hard to add a unit test, as the issue only happens at runtime.



[GitHub] [spark] LuciferYang edited a comment on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

LuciferYang edited a comment on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053204655


   On the other hand, if we backport SPARK-35798 to branch-3.1, can this issue be solved?
   
   



[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1053498753


   > On the other hand, if we backport [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) to branch-3.1, can this issue be solved?
   
   I will try backporting [SPARK-35798](https://issues.apache.org/jira/browse/SPARK-35798) and then report the result.
   



[GitHub] [spark] monkeyboy123 commented on pull request #35662: [SPARK-38333][SQL] [3.1]DPP cause DataSourceScanExec java.lang.NullPointer…

Posted by GitBox <gi...@apache.org>.

monkeyboy123 commented on pull request #35662:
URL: https://github.com/apache/spark/pull/35662#issuecomment-1052883422


   > Does this issue only exist in Spark 3.1? If yes, please add `[3.1]` to the pr title. If not, please send pr to the master branch first.
   
   Yes, it only exists in Spark 3.1. Updated.

