Posted to issues@spark.apache.org by "L. C. Hsieh (Jira)" <ji...@apache.org> on 2022/01/15 22:38:00 UTC

[jira] [Commented] (SPARK-37897) Filter with subexpression elimination may cause query failed

    [ https://issues.apache.org/jira/browse/SPARK-37897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17476700#comment-17476700 ] 

L. C. Hsieh commented on SPARK-37897:
-------------------------------------

SQL is a declarative language; please don't reason about it in terms of imperative short-circuit evaluation. How the predicate is evaluated is implementation-dependent, so a query should not rely on conjuncts being evaluated in the order they are written.
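
To illustrate the point, here is a minimal sketch in plain Scala (no Spark dependency) of a predicate that does not depend on evaluation order. The names `OrderIndependentPredicate`, `plusOneSafe`, `conjuncts` and `keep` are hypothetical and only model the reporter's filter; this is not how Spark evaluates predicates internally. The assumption is that the function is made total (returning `None` instead of throwing for negative input), so every ordering of the conjuncts gives the same rows.

```scala
// Sketch only: models the filter `c1 >= 0 and plusOne(c1) > 1 and plusOne(c1) < 3`
// with plain Scala collections. All names here are hypothetical.
object OrderIndependentPredicate {
  // Total variant of the reporter's `plusOne`: returns None for negative
  // input instead of throwing, so it is safe to call on any row.
  def plusOneSafe(n: Int): Option[Int] =
    if (n >= 0) Some(n + 1) else None

  // The three conjuncts of the filter, each safe to evaluate on its own.
  val conjuncts: Seq[Int => Boolean] = Seq(
    n => n >= 0,
    n => plusOneSafe(n).exists(_ > 1),
    n => plusOneSafe(n).exists(_ < 3)
  )

  // Evaluate the conjunction in whatever order the caller supplies.
  def keep(order: Seq[Int => Boolean])(n: Int): Boolean =
    order.forall(p => p(n))

  def main(args: Array[String]): Unit = {
    val input = Seq(-1, 1, 2)
    // Try every ordering of the conjuncts and collect the distinct results.
    val results = conjuncts.permutations
      .map(order => input.filter(keep(order)))
      .toSet
    // All orderings agree, and none of them throws.
    assert(results.size == 1)
    println(results.head)
  }
}
```

With the input from the reporter's test, every ordering keeps only the row `1`. The same idea would presumably carry over to a Spark UDF that returns null (or an `Option`) for invalid input rather than throwing, though that is an assumption and not something stated in this thread.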

> Filter with subexpression elimination may cause query failed
> ------------------------------------------------------------
>
>                 Key: SPARK-37897
>                 URL: https://issues.apache.org/jira/browse/SPARK-37897
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.2.0
>            Reporter: hujiahua
>            Priority: Major
>         Attachments: image-2022-01-13-20-22-09-055.png
>
>
>  
> The following test fails. The root cause is that the execution order of the filter predicates changes after subexpression elimination, so I think we should preserve the predicate execution order after subexpression elimination.
> {code:java}
> test("filter with subexpression elimination may cause query failed.") {
>   withSQLConf((SQLConf.WHOLESTAGE_CODEGEN_ENABLED.key, "false")) {
>     val df = Seq(-1, 1, 2).toDF("c1")
>     // Register the `plusOne` UDF; it throws if the input is negative.
>     spark.sqlContext.udf.register("plusOne",
>       (n: Int) => { if (n >= 0) n + 1 else throw new SparkException("Must be positive number.") })
>     val result = df.filter("c1 >= 0 and plusOne(c1) > 1 and plusOne(c1) < 3").collect()
>     assert(result.size === 1)
>   }
> } 
> Caused by: org.apache.spark.SparkException: Must be positive number.
>     at org.apache.spark.sql.DataFrameSuite.$anonfun$new$3(DataFrameSuite.scala:67)
>     at scala.runtime.java8.JFunction1$mcII$sp.apply(JFunction1$mcII$sp.java:23)
>     ... 20 more{code}
>  
> https://github.com/apache/spark/blob/0e186e8a19926f91810f3eaf174611b71e598de6/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/codegen/GeneratePredicate.scala#L63
> !image-2022-01-13-20-22-09-055.png!
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
