Posted to issues@spark.apache.org by "Apache Spark (Jira)" <ji...@apache.org> on 2022/04/05 17:28:00 UTC

[jira] [Assigned] (SPARK-38666) Missing aggregate filter checks

     [ https://issues.apache.org/jira/browse/SPARK-38666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-38666:
------------------------------------

    Assignee: Apache Spark

> Missing aggregate filter checks
> -------------------------------
>
>                 Key: SPARK-38666
>                 URL: https://issues.apache.org/jira/browse/SPARK-38666
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Bruce Robbins
>            Assignee: Apache Spark
>            Priority: Major
>
> h3. Window function in filter
> {noformat}
> select sum(a) filter (where nth_value(a, 2) over (order by b) > 1)
> from (select 1 a, '2' b);
> {noformat}
> This query should produce an analysis error, but instead produces a stack overflow:
> {noformat}
> java.lang.StackOverflowError: null
> 	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$collect$1(TreeNode.scala:305) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
> 	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$collect$1$adapted(TreeNode.scala:305) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
> 	at org.apache.spark.sql.catalyst.trees.TreeNode.foreach(TreeNode.scala:264) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
> 	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1(TreeNode.scala:265) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
> 	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$foreach$1$adapted(TreeNode.scala:265) ~[spark-catalyst_2.12-3.4.0-SNAPSHOT.jar:3.4.0-SNAPSHOT]
> 	at scala.collection.Iterator.foreach(Iterator.scala:943) ~[scala-library.jar:?]
> ...
> {noformat}
> h3. Non-boolean filter expression
> {noformat}
> select sum(a) filter (where a) from (select 1 a, '2' b);
> {noformat}
> This query should produce an analysis error, but instead causes a projection compilation error or whole-stage codegen error (depending on the datatype of the expression):
> {noformat}
> 22/03/26 17:19:03 ERROR CodeGenerator: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 50, Column 6: Not a boolean expression
> org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 50, Column 6: Not a boolean expression
> 	at org.codehaus.janino.UnitCompiler.compileError(UnitCompiler.java:12021) ~[janino-3.0.16.jar:?]
> 	at org.codehaus.janino.UnitCompiler.compileBoolean2(UnitCompiler.java:4049) ~[janino-3.0.16.jar:?]
> 	at org.codehaus.janino.UnitCompiler.access$6300(UnitCompiler.java:226) ~[janino-3.0.16.jar:?]
> 	at org.codehaus.janino.UnitCompiler$14.visitIntegerLiteral(UnitCompiler.java:4016) ~[janino-3.0.16.jar:?]
> ...
> 22/03/26 17:19:05 WARN MutableProjection: Expr codegen error and falling back to interpreter mode
> java.util.concurrent.ExecutionException: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 15: failed to compile: org.codehaus.commons.compiler.CompileException: File 'generated.java', Line 40, Column 15: Not a boolean expression
> 	at com.google.common.util.concurrent.AbstractFuture$Sync.getValue(AbstractFuture.java:306) ~[guava-14.0.1.jar:?]
> 	at com.google.common.util.concurrent.AbstractFuture$Sync.get(AbstractFuture.java:293) ~[guava-14.0.1.jar:?]
> 	at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116) ~[guava-14.0.1.jar:?]
> 	at com.google.common.util.concurrent.Uninterruptibles.getUninterruptibly(Uninterruptibles.java:135) ~[guava-14.0.1.jar:?]
> 	at com.google.common.cache.LocalCache$Segment.getAndRecordStats(LocalCache.java:2410) ~[guava-14.0.1.jar:?]
> 	at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2380) ~[guava-14.0.1.jar:?]
> 	at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2342) ~[guava-14.0.1.jar:?]
> 	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2257) ~[guava-14.0.1.jar:?]
> 	at com.google.common.cache.LocalCache.get(LocalCache.java:4000) ~[guava-14.0.1.jar:?]
> 	at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:4004) ~[guava-14.0.1.jar:?]
> ...
> NULL
> Time taken: 5.397 seconds, Fetched 1 row(s)
> {noformat}
> Interestingly, despite the compilation errors, the query still returns a result (NULL) rather than failing.
> h3. Aggregate expression in filter expression
> {noformat}
> select max(b) filter (where max(a) > 1) from (select 1 a, '2' b);
> {noformat}
> This query should produce an analysis error, but instead causes a projection compilation error or whole-stage codegen error (depending on the datatype of the expression being aggregated):
> {noformat}
> 22/03/26 17:26:38 ERROR TaskSetManager: Task 0 in stage 3.0 failed 1 times; aborting job
> org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 3.0 failed 1 times, most recent failure: Lost task 0.0 in stage 3.0 (TID 2) (10.0.0.106 executor driver): org.apache.spark.SparkUnsupportedOperationException: Cannot evaluate expression: max(1)
> 	at org.apache.spark.sql.errors.QueryExecutionErrors$.cannotEvaluateExpressionError(QueryExecutionErrors.scala:79)
> 	at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval(Expression.scala:344)
> 	at org.apache.spark.sql.catalyst.expressions.Unevaluable.eval$(Expression.scala:343)
> 	at org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression.eval(interfaces.scala:99)
> 	at org.apache.spark.sql.catalyst.expressions.BinaryExpression.eval(Expression.scala:593)
> 	at org.apache.spark.sql.catalyst.expressions.If.eval(conditionalExpressions.scala:68)
> ...
> {noformat}
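
The three cases in the description share one pattern: the FILTER predicate of an aggregate is not validated before execution. As a rough illustration only, the sketch below models the three missing checks on a toy expression tree in Python. It is not Spark's analyzer code. The `Expr` class, the `check_aggregate_filter` function, and the error strings are all hypothetical names made up for this example, and the order in which the checks run is an arbitrary choice.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Optional


@dataclass
class Expr:
    """Toy expression node (hypothetical; not a Spark Catalyst class)."""
    name: str
    data_type: str = "int"
    is_aggregate: bool = False
    is_window: bool = False
    children: List["Expr"] = field(default_factory=list)

    def walk(self) -> Iterator["Expr"]:
        """Yield this node and all of its descendants, depth first."""
        yield self
        for child in self.children:
            yield from child.walk()


def check_aggregate_filter(filter_expr: Expr) -> Optional[str]:
    """Return an illustrative error message for an invalid FILTER
    predicate, or None if the predicate passes all three checks."""
    # Check 1: the predicate itself must be boolean.
    if filter_expr.data_type != "boolean":
        return "FILTER expression is not of type boolean"
    nodes = list(filter_expr.walk())
    # Check 2: no window function anywhere inside the predicate.
    if any(n.is_window for n in nodes):
        return "FILTER expression contains a window function"
    # Check 3: no aggregate function anywhere inside the predicate.
    if any(n.is_aggregate for n in nodes):
        return "FILTER expression contains an aggregate function"
    return None


# Toy versions of the three filter predicates from the report,
# plus one valid predicate (a > 1) for comparison.
a = Expr("a")
window_filter = Expr(">", "boolean", children=[
    Expr("nth_value(a, 2) over (order by b)", is_window=True, children=[a]),
    Expr("1"),
])
non_boolean_filter = a
aggregate_filter = Expr(">", "boolean", children=[
    Expr("max(a)", is_aggregate=True, children=[a]),
    Expr("1"),
])
valid_filter = Expr(">", "boolean", children=[a, Expr("1")])

for label, expr in [
    ("window", window_filter),
    ("non-boolean", non_boolean_filter),
    ("aggregate", aggregate_filter),
    ("valid", valid_filter),
]:
    print(label, "->", check_aggregate_filter(expr))
```

In this toy model each of the three reported predicates is rejected up front with a message, which is the behavior the report asks for (an analysis error) in place of a stack overflow, a codegen failure, or a runtime evaluation error.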



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
