Posted to user@spark.apache.org by Martin Gammelsæter <ma...@gmail.com> on 2014/07/18 12:11:42 UTC

TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0 on GROUP BY

Hi again!

I am having problems when using GROUP BY on both SQLContext and
HiveContext (same problem).

My code (simplified as much as possible) can be seen here:
http://pastebin.com/33rjW67H

In short, I'm getting data from a Cassandra store with Datastax's new
driver (which works great by the way, recommended!), and mapping it to
a Spark SQL table through a Product class (Dokument in the source).
Regular SELECTs and the like work fine, but once I try to do a GROUP BY,
I get the following error:

Exception in thread "main" org.apache.spark.SparkException: Job
aborted due to stage failure: Task 0.0:25 failed 4 times, most recent
failure: Exception failure in TID 63 on host 192.168.121.132:
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: No
function to evaluate expression. type: AttributeReference, tree: id#0
        org.apache.spark.sql.catalyst.expressions.AttributeReference.eval(namedExpressions.scala:158)
        org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:64)
        org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:195)
        org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7$$anon$1.next(Aggregate.scala:174)
        scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        scala.collection.Iterator$class.foreach(Iterator.scala:727)
        scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
        scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
        scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
        scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
        scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
        scala.collection.AbstractIterator.to(Iterator.scala:1157)
        scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
        scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
        scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
        scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
        org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
        org.apache.spark.rdd.RDD$$anonfun$15.apply(RDD.scala:750)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
        org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1096)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:112)
        org.apache.spark.scheduler.Task.run(Task.scala:51)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:187)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:745)

What am I doing wrong?

-- 
Best regards,
Martin Gammelsæter

Re: TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0 on GROUP BY

Posted by Martin Gammelsæter <ma...@gmail.com>.
Aha, that makes sense. Thanks for the response! I guess error
messages are one of the areas where Spark could use some love (:

On Fri, Jul 18, 2014 at 9:41 PM, Michael Armbrust
<mi...@databricks.com> wrote:
> Sorry for the non-obvious error message.  It is not valid SQL to include
> attributes in the select clause unless they are also in the group by clause
> or are inside of an aggregate function.



-- 
Best regards,
Martin Gammelsæter
92209139

Re: TreeNodeException: No function to evaluate expression. type: AttributeReference, tree: id#0 on GROUP BY

Posted by Michael Armbrust <mi...@databricks.com>.
Sorry for the non-obvious error message.  It is not valid SQL to include
attributes in the select clause unless they are also in the group by clause
or are inside of an aggregate function.
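For illustration, the rule Michael describes can be sketched against an ordinary SQL engine. The snippet below uses an in-memory SQLite table; the table and column names (dokument, id, author) are hypothetical stand-ins for the Dokument class in the pastebin. Note that SQLite itself is permissive about bare columns in aggregate queries, whereas Spark SQL (like standard SQL) rejects them, so only the valid form is executed here.

```python
import sqlite3

# Hypothetical miniature of the Cassandra-backed table in the thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dokument (id INTEGER, author TEXT)")
conn.executemany(
    "INSERT INTO dokument VALUES (?, ?)",
    [(1, "a"), (2, "a"), (3, "b")],
)

# Invalid in Spark SQL: `id` appears in SELECT but is neither in the
# GROUP BY clause nor wrapped in an aggregate function:
#   SELECT id, author FROM dokument GROUP BY author
#
# Valid: every selected attribute is either grouped or aggregated.
rows = conn.execute(
    "SELECT author, COUNT(id) FROM dokument "
    "GROUP BY author ORDER BY author"
).fetchall()
print(rows)  # [('a', 2), ('b', 1)]
```

In the pastebin's terms, the fix is to either add the offending column (id#0 in the error) to the GROUP BY clause or wrap it in an aggregate such as COUNT or MAX.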