Posted to user@spark.apache.org by Michael Armbrust <mi...@databricks.com> on 2017/03/01 02:46:24 UTC

Re: Why Spark cannot get the derived field of case class in Dataset?

We only serialize fields that are in the constructor.  You would still have
access to it in the typed API (df.map(_.day)).  I'd suggest making a
factory method that fills these in and puts them in the constructor if you
need to get to them from other dataframe operations.
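A minimal sketch of the factory-method approach described above, under the assumption that a companion-object `apply` overload is used as the factory (the companion object and its `apply` are illustrative, not from the original thread):

```scala
import java.text.SimpleDateFormat

// Put the derived field in the constructor so Spark's encoder serializes it.
case class Test(time: Long, day: String)

object Test {
  // Factory overload that derives `day` from `time` at construction time.
  // Note: SimpleDateFormat is not thread-safe, so create it per call here.
  def apply(time: Long): Test = {
    val fmt = new SimpleDateFormat("yyyy-MM-dd")
    Test(time, fmt.format(time))
  }
}

// In a Spark shell or with spark.implicits._ in scope:
// val ds = Seq(Test(1487185076410L)).toDS()
// ds.show()   // `day` now appears as a regular column alongside `time`
```

Because `day` is now a constructor parameter, it is visible both to the untyped DataFrame operations and to the typed API.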

On Tue, Feb 28, 2017 at 12:03 PM, Yong Zhang <ja...@hotmail.com> wrote:

> In the following example, the "day" value is computed in the case class, but
> I cannot get it in the Spark Dataset, even though I would like to use it at
> runtime. Any idea? Do I have to force it to be present in the case class
> constructor? I'd like to derive it automatically and use it in the Dataset
> or DataFrame.
>
>
> Thanks
>
>
> scala> spark.version
> res12: String = 2.1.0
>
> scala> import java.text.SimpleDateFormat
> import java.text.SimpleDateFormat
>
> scala> val dateFormat = new SimpleDateFormat("yyyy-MM-dd")
> dateFormat: java.text.SimpleDateFormat = java.text.SimpleDateFormat@f67a0200
>
> scala> case class Test(time: Long) {
>      |   val day = dateFormat.format(time)
>      | }
> defined class Test
>
> scala> val t = Test(1487185076410L)
> t: Test = Test(1487185076410)
>
> scala> t.time
> res13: Long = 1487185076410
>
> scala> t.day
> res14: String = 2017-02-15
>
> scala> val ds = Seq(t).toDS()
> ds: org.apache.spark.sql.Dataset[Test] = [time: bigint]
>
> scala> ds.show
> +-------------+
> |         time|
> +-------------+
> |1487185076410|
> +-------------+
>
>
>