Posted to dev@spark.apache.org by Jacek Laskowski <ja...@japila.pl> on 2016/04/03 21:23:05 UTC

[SQL] Dataset.map gives error: missing parameter type for expanded function?

Hi,

(since this concerns 2.0.0-SNAPSHOT, it's a question for dev@ rather than user@)

With today's master I'm getting the following:

scala> ds
res14: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]

// WHY?!
scala> ds.groupBy(_._1)
<console>:26: error: missing parameter type for expanded function
((x$1) => x$1._1)
       ds.groupBy(_._1)
                  ^

scala> ds.filter(_._1.size > 10)
res23: org.apache.spark.sql.Dataset[(String, Int)] = [_1: string, _2: int]

It's even shown on Michael's slide from Spark Summit East
(https://youtu.be/i7l3JQRx7Qw?t=7m38s)?! Am I doing something wrong?
Please guide.

Pozdrawiam,
Jacek Laskowski
----
https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@spark.apache.org
For additional commands, e-mail: dev-help@spark.apache.org


Re: [SQL] Dataset.map gives error: missing parameter type for expanded function?

Posted by Michael Armbrust <mi...@databricks.com>.
It is called groupByKey now.  Similar to joinWith, the schema produced by
relational joins and aggregations is different from what you would expect
when working with objects.  So, when combining DataFrame and Dataset we
renamed these functions to make the distinction clearer.
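
For instance (a minimal spark-shell sketch, assuming a small
Dataset[(String, Int)] built with toDS, like the one in your snippet; the
sample data here is made up for illustration), the typed lambda now goes to
groupByKey, while the untyped groupBy still takes column names:

scala> val ds = Seq(("a", 1), ("a", 2), ("b", 3)).toDS()

// typed API: the function-based grouping moved to groupByKey
scala> ds.groupByKey(_._1).count().show()

// untyped (relational) API: groupBy still works with column names
scala> ds.groupBy("_1").count().show()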
