You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2016/07/10 15:17:10 UTC

[jira] [Commented] (SPARK-16474) Global Aggregation doesn't seem to work at all

    [ https://issues.apache.org/jira/browse/SPARK-16474?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15369686#comment-15369686 ] 

Sean Owen commented on SPARK-16474:
-----------------------------------

I am not sure that is expected to work. You have defined an aggregation over Ints but applied it to Rows, as the exception says. Do you see an implicit you would expect to do the conversion for you? I don't from just inspecting the class.

> Global Aggregation doesn't seem to work at all 
> -----------------------------------------------
>
>                 Key: SPARK-16474
>                 URL: https://issues.apache.org/jira/browse/SPARK-16474
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 1.6.2, 2.0.0
>            Reporter: Amit Sela
>
> Executing a global aggregation (not grouped by key) fails.
> Take the following code for example:
> {code}
>     val session = SparkSession.builder()
>                               .appName("TestGlobalAggregator")
>                               .master("local[*]")
>                               .getOrCreate()
>     import session.implicits._
>     val ds1 = List(1, 2, 3).toDS
>     val ds2 = ds1.agg(
>           new Aggregator[Int, Int, Int]{
>       def zero: Int = 0
>       def reduce(b: Int, a: Int): Int = b + a
>       def merge(b1: Int, b2: Int): Int = b1 + b2
>       def finish(reduction: Int): Int = reduction
>       def bufferEncoder: Encoder[Int] = implicitly[Encoder[Int]]
>       def outputEncoder: Encoder[Int] = implicitly[Encoder[Int]]
>     }.toColumn)
>     ds2.printSchema
>     ds2.show
> {code}
> I would expect the result to be 6, but instead I get the following exception:
> {noformat}
> java.lang.ClassCastException: org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be cast to java.lang.Integer 
> at scala.runtime.BoxesRunTime.unboxToInt(BoxesRunTime.java:101)
> .........
> {noformat}
> Trying the same code on DataFrames in 1.6.2 results in:
> {noformat}
> Exception in thread "main" org.apache.spark.sql.AnalysisException: unresolved operator 'Aggregate [(anon$1(),mode=Complete,isDistinct=false) AS anon$1()#8]; 
> at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)
> ..........
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org