Posted to issues@spark.apache.org by "Adamos Loizou (JIRA)" <ji...@apache.org> on 2017/10/25 09:45:01 UTC

[jira] [Created] (SPARK-22351) Support user-created custom Encoders for Datasets

Adamos Loizou created SPARK-22351:
-------------------------------------

             Summary: Support user-created custom Encoders for Datasets
                 Key: SPARK-22351
                 URL: https://issues.apache.org/jira/browse/SPARK-22351
             Project: Spark
          Issue Type: New Feature
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Adamos Loizou


It would be very helpful if we could easily support creating custom encoders for classes in Spark SQL.

This would allow users to define their business model with types of their choice and then map those types to Spark SQL types, without being forced to pollute the model with the built-in mappable types (e.g. {{java.sql.Timestamp}}).

Specifically, in our case we tend to use either the Java 8 time API or the Joda-Time API for dates instead of {{java.sql.Timestamp}}, whose API is quite limited by comparison.

Ideally, we would like to be able to have a {{Dataset}} of such a class:


{code:java}
import org.joda.time.LocalDate
import org.apache.spark.sql.{Dataset, Encoder}

case class Person(name: String, dateOfBirth: LocalDate)
// we define something that maps LocalDate to Spark SQL's TimestampType
implicit def localDateEncoder: Encoder[LocalDate] = ???
...
// read the CSV and map it to the model
val people: Dataset[Person] = spark.read.csv("/my/path/file.csv").as[Person]
{code}
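
For reference, the workaround we're left with today is roughly the following sketch (assuming {{spark}} is a {{SparkSession}}; the {{PersonRow}} name is just illustrative): keep a separate Spark-friendly case class built from the supported {{java.sql.Date}} type and convert to the business model at the boundary.

{code:java}
import java.sql.Date
import org.apache.spark.sql.Dataset
import org.joda.time.LocalDate

// Spark-friendly shape of the record, using the built-in mappable java.sql.Date
case class PersonRow(name: String, dateOfBirth: Date)

import spark.implicits._
val rows: Dataset[PersonRow] = spark.read.csv("/my/path/file.csv").as[PersonRow]

// Converting to the Joda-based Person has to happen outside the Dataset
// (e.g. after collect()), because Person itself still has no Encoder.
val people: Seq[Person] =
  rows.collect().toSeq.map(r => Person(r.name, LocalDate.fromDateFields(r.dateOfBirth)))
{code}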

Defining such a custom encoder was possible in Spark 1.6, but it is no longer the case in Spark 2.x.
It's also not straightforward how to support this using an {{ExpressionEncoder}} (any tips would be much appreciated).
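
For what it's worth, the only generic workaround I'm aware of is the Kryo-based encoder ({{Encoders.kryo}}), which at least gives a {{Dataset[Person]}} but serialises the whole object into a single binary column, losing the columnar schema. A rough sketch, reusing {{rows}} from the {{PersonRow}} example above:

{code:java}
import org.apache.spark.sql.{Dataset, Encoder, Encoders}
import org.joda.time.LocalDate

// Kryo fallback: the whole Person is stored as one opaque binary column instead
// of (name: string, dateOfBirth: timestamp), so Spark SQL can no longer see or
// filter on the individual fields, but Dataset[Person] at least works.
val personEncoder: Encoder[Person] = Encoders.kryo[Person]

// The encoder is passed explicitly to avoid clashing with spark.implicits._
val people: Dataset[Person] =
  rows.map(r => Person(r.name, LocalDate.fromDateFields(r.dateOfBirth)))(personEncoder)
{code}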

Thanks.


