Posted to issues@spark.apache.org by "Mingcong Han (JIRA)" <ji...@apache.org> on 2019/03/23 02:51:00 UTC

[jira] [Created] (SPARK-27255) Aggregate functions should not be allowed in WHERE

Mingcong Han created SPARK-27255:
------------------------------------

             Summary: Aggregate functions should not be allowed in WHERE
                 Key: SPARK-27255
                 URL: https://issues.apache.org/jira/browse/SPARK-27255
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 2.4.0
            Reporter: Mingcong Han


Aggregate functions should not be allowed in a WHERE clause. However, Spark SQL currently lets such a query pass parsing and analysis and only throws an exception during code generation. It should instead throw an exception during parsing or analysis.
Here is an example:

{code:scala}
val df = spark.sql("select * from t where sum(ta) > 0")
df.explain(true)
df.show()
{code}

Spark SQL explains it as:
{noformat}
== Parsed Logical Plan ==
'Project [*]
+- 'Filter ('sum('ta) > 0)
   +- 'UnresolvedRelation `t`

== Analyzed Logical Plan ==
ta: int, tb: int
Project [ta#5, tb#6]
+- Filter (sum(cast(ta#5 as bigint)) > cast(0 as bigint))
   +- SubqueryAlias `t`
      +- Project [ta#5, tb#6]
         +- SubqueryAlias `as`
            +- LocalRelation [ta#5, tb#6]

== Optimized Logical Plan ==
Filter (sum(cast(ta#5 as bigint)) > 0)
+- LocalRelation [ta#5, tb#6]

== Physical Plan ==
*(1) Filter (sum(cast(ta#5 as bigint)) > 0)
+- LocalTableScan [ta#5, tb#6]
{noformat}

But executing `df.show()` fails with:
{noformat}
Exception in thread "main" java.lang.UnsupportedOperationException: Cannot generate code for expression: sum(cast(input[0, int, false] as bigint))
	at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode(Expression.scala:291)
	at org.apache.spark.sql.catalyst.expressions.Unevaluable.doGenCode$(Expression.scala:290)
	at org.apache.spark.sql.catalyst.expressions.aggregate.AggregateExpression.doGenCode(interfaces.scala:87)
	at org.apache.spark.sql.catalyst.expressions.Expression.$anonfun$genCode$3(Expression.scala:138)
	at scala.Option.getOrElse(Option.scala:138)
{noformat}

I tried the same query in PostgreSQL, which rejects it immediately with an error:
{noformat}
ERROR: Aggregate functions are not allowed in WHERE. 
{noformat}

Spark should throw an AnalysisException here instead.
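
To illustrate the kind of analysis-time check being asked for, here is a minimal, self-contained sketch over a toy plan model. It is not Spark's Catalyst code: every name in it (Expr, Plan, ToyAnalysisException, ToyCheckAnalysis) is hypothetical, and the error message is simply borrowed from the PostgreSQL output above. It also assumes that every Filter comes from a WHERE clause; a real check would additionally have to leave HAVING alone.

{code:scala}
// Toy expression tree (hypothetical; not Spark's Catalyst classes).
sealed trait Expr { def children: Seq[Expr] }
final case class Column(name: String) extends Expr { val children: Seq[Expr] = Nil }
final case class Literal(value: Long) extends Expr { val children: Seq[Expr] = Nil }
final case class Sum(child: Expr) extends Expr { val children: Seq[Expr] = Seq(child) }
final case class GreaterThan(left: Expr, right: Expr) extends Expr {
  val children: Seq[Expr] = Seq(left, right)
}

// Toy logical plan: a relation, optionally wrapped in filters.
sealed trait Plan
final case class Relation(name: String) extends Plan
final case class Filter(condition: Expr, child: Plan) extends Plan

// Hypothetical stand-in for org.apache.spark.sql.AnalysisException.
final class ToyAnalysisException(message: String) extends Exception(message)

object ToyCheckAnalysis {
  // True if the expression tree contains an aggregate at any depth.
  def containsAggregate(e: Expr): Boolean = e match {
    case _: Sum => true
    case other  => other.children.exists(containsAggregate)
  }

  // Reject any filter whose condition contains an aggregate, then recurse
  // into the child plan. Assumption: every Filter here models a WHERE clause.
  def check(plan: Plan): Unit = plan match {
    case Filter(condition, child) =>
      if (containsAggregate(condition)) {
        throw new ToyAnalysisException("Aggregate functions are not allowed in WHERE")
      }
      check(child)
    case Relation(_) => ()
  }
}
{code}

With this model, checking Filter(GreaterThan(Sum(Column("ta")), Literal(0)), Relation("t")) throws before any execution is attempted, while the same filter on a plain Column("ta") passes.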
