Posted to issues@spark.apache.org by "Shixiong Zhu (JIRA)" <ji...@apache.org> on 2014/11/07 12:20:33 UTC

[jira] [Created] (SPARK-4296) Throw "Expression not in GROUP BY" when using same expression in group by clause and select clause

Shixiong Zhu created SPARK-4296:
-----------------------------------

             Summary: Throw "Expression not in GROUP BY" when using same expression in group by clause and select clause
                 Key: SPARK-4296
                 URL: https://issues.apache.org/jira/browse/SPARK-4296
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 1.1.0
            Reporter: Shixiong Zhu


When the input data has a complex (nested) structure, using the same expression in both the GROUP BY clause and the SELECT clause throws "Expression not in GROUP BY".

{code:java}
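// `sc` is an existing SparkContext (e.g. the one provided by spark-shell).
val sqlContext = new org.apache.spark.sql.SQLContext(sc)
import sqlContext.createSchemaRDD  // implicit conversion from an RDD of case classes to a SchemaRDD
// Nested structure: each Person contains a Birthday struct.
case class Birthday(date: String)
case class Person(name: String, birthday: Birthday)
val people = sc.parallelize(List(Person("John", Birthday("1990-01-22")), Person("Jim", Birthday("1980-02-28"))))
people.registerTempTable("people")
// The same expression, upper(birthday.date), appears in both SELECT and GROUP BY.
val year = sqlContext.sql("select count(*), upper(birthday.date) from people group by upper(birthday.date)")
year.collect  // fails with "Expression not in GROUP BY"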
{code}

Here is the query plan for `year`:
{code:java}
SchemaRDD[3] at RDD at SchemaRDD.scala:105
== Query Plan ==
== Physical Plan ==
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Expression not in GROUP BY: Upper(birthday#1.date AS date#9) AS c1#3, tree:
Aggregate [Upper(birthday#1.date)], [COUNT(1) AS c0#2L,Upper(birthday#1.date AS date#9) AS c1#3]
 Subquery people
  LogicalRDD [name#0,birthday#1], MapPartitionsRDD[1] at mapPartitions at ExistingRDD.scala:36
{code}

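The bug is in the equality test between `Upper(birthday#1.date)` (the grouping expression) and `Upper(birthday#1.date AS date#9)` (the select expression). The two differ only in the Alias wrapped around the nested field access, so they are not treated as equal.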

Maybe Spark SQL needs a mechanism for comparing aliased and non-aliased expressions.
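
To illustrate the kind of comparison meant here, below is a minimal, self-contained sketch. It does not use Spark's Catalyst classes: `Expr`, `Attr`, `GetField`, `Upper`, `Alias`, `stripAliases` and `sameModuloAlias` are all hypothetical names for a toy expression tree. It only shows one possible approach, which is to strip every Alias wrapper before testing for equality.

```scala
// Toy expression tree. These are NOT Spark's Catalyst classes; all names are hypothetical.
sealed trait Expr
case class Attr(name: String, id: Int) extends Expr                 // e.g. birthday#1
case class GetField(child: Expr, field: String) extends Expr        // e.g. birthday#1.date
case class Upper(child: Expr) extends Expr                          // e.g. Upper(...)
case class Alias(child: Expr, name: String, id: Int) extends Expr   // e.g. ... AS date#9

object AliasInsensitive {
  // Recursively remove every Alias wrapper, keeping the aliased child.
  def stripAliases(e: Expr): Expr = e match {
    case Alias(child, _, _)     => stripAliases(child)
    case GetField(child, field) => GetField(stripAliases(child), field)
    case Upper(child)           => Upper(stripAliases(child))
    case a: Attr                => a
  }

  // Two expressions are considered equal if they match once aliases are ignored.
  def sameModuloAlias(a: Expr, b: Expr): Boolean =
    stripAliases(a) == stripAliases(b)
}

object AliasDemo {
  def main(args: Array[String]): Unit = {
    // Mirrors the two expressions from the plan above.
    val grouping = Upper(GetField(Attr("birthday", 1), "date"))
    val selected = Upper(Alias(GetField(Attr("birthday", 1), "date"), "date", 9))

    println(grouping == selected)                                  // false: plain structural equality
    println(AliasInsensitive.sameModuloAlias(grouping, selected))  // true: aliases ignored
  }
}
```

Whether aliases should be stripped before the GROUP BY check, or never introduced inside the expression in the first place, is a design question for the actual fix. The sketch only demonstrates the comparison itself.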


