Posted to issues@spark.apache.org by "Hyukjin Kwon (JIRA)" <ji...@apache.org> on 2019/05/21 04:12:23 UTC

[jira] [Resolved] (SPARK-22351) Support user-created custom Encoders for Datasets

     [ https://issues.apache.org/jira/browse/SPARK-22351?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Hyukjin Kwon resolved SPARK-22351.
----------------------------------
    Resolution: Incomplete

> Support user-created custom Encoders for Datasets
> -------------------------------------------------
>
>                 Key: SPARK-22351
>                 URL: https://issues.apache.org/jira/browse/SPARK-22351
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 2.2.0
>            Reporter: Adamos Loizou
>            Priority: Minor
>              Labels: bulk-closed
>
> It would be very helpful if we could easily support creating custom encoders for classes in Spark SQL.
> This is to allow a user to properly define a business model using types of their choice. They can then map them to Spark SQL types without being forced to pollute their model with the built-in mappable types (e.g. {{java.sql.Timestamp}}).
> Specifically, in our case we tend to use either the Java 8 time API or the Joda-Time API for dates instead of {{java.sql.Timestamp}}, whose API is quite limited compared to the others.
> Ideally we would like to be able to have a dataset of such a class:
> {code:java}
> import org.apache.spark.sql.{Dataset, Encoder}
> import org.joda.time.LocalDate
>
> case class Person(name: String, dateOfBirth: LocalDate)
> implicit def localDateTimeEncoder: Encoder[LocalDate] = ??? // we define something that maps to Spark SQL TimestampType
> ...
> // read csv and map it to model
> val people: Dataset[Person] = spark.read.csv("/my/path/file.csv").as[Person]
> {code}
> While this was possible in Spark 1.6, it is no longer the case in Spark 2.x.
> It is also not straightforward how to support this using an {{ExpressionEncoder}} (any tips would be much appreciated).
> Thanks.
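
Until such encoders are supported, one possible workaround is to keep a separate, Spark-friendly row type next to the business model and convert between the two at the edges of the job. The following is only a minimal sketch under stated assumptions, not the approach requested in the issue: {{PersonRow}}, {{fromDomain}} and {{toDomain}} are hypothetical names, {{java.time.LocalDate}} stands in for the Joda-Time type so the example needs no extra dependency, and {{java.sql.Date}} is used instead of {{java.sql.Timestamp}} because the field is a date without a time component.

```scala
import java.sql.Date
import java.time.LocalDate

// Business model: uses the richer date type.
// (java.time is assumed here; a Joda-Time LocalDate would need its own conversion.)
final case class Person(name: String, dateOfBirth: LocalDate)

// Hypothetical storage-side row: only field types Spark SQL can encode out of the box.
final case class PersonRow(name: String, dateOfBirth: Date)

object PersonRow {
  // Business model -> encodable row
  def fromDomain(p: Person): PersonRow =
    PersonRow(p.name, Date.valueOf(p.dateOfBirth))

  // Encodable row -> business model
  def toDomain(r: PersonRow): Person =
    Person(r.name, r.dateOfBirth.toLocalDate)
}
```

With this split, the dataset itself would be a {{Dataset[PersonRow]}}, and {{PersonRow.toDomain}} would be called inside the functions that need the business type. A {{Dataset[Person]}} would still require an encoder for {{Person}}, which this sketch does not provide.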


