Posted to issues@spark.apache.org by "Chris Bannister (JIRA)" <ji...@apache.org> on 2016/08/31 11:26:20 UTC

[jira] [Commented] (SPARK-12787) Dataset to support custom encoder

    [ https://issues.apache.org/jira/browse/SPARK-12787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15451964#comment-15451964 ] 

Chris Bannister commented on SPARK-12787:
-----------------------------------------

We would like to use the schema mapping provided by spark-avro to read Avro files via Datasets; this would make working with Avro files very easy. Providing an Avro Encoder appears to be very simple, but this is currently explicitly forbidden in [0]. Is it possible to relax the API around Encoders to allow experimental implementations?

[0] https://github.com/apache/spark/blob/12fd0cd615683cd4c3e9094ce71a1e6fc33b8d6a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/encoders/package.scala#L33 

> Dataset to support custom encoder
> ---------------------------------
>
>                 Key: SPARK-12787
>                 URL: https://issues.apache.org/jira/browse/SPARK-12787
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 1.6.0
>            Reporter: Muthu Jayakumar
>
> The current Dataset API allows a Dataset to be loaded using a case class, but it requires the attribute names and types to match up precisely.
> It would be nicer if a partial function could be provided as a parameter to transform the DataFrame-like schema into a Dataset.
> Something like...
> test_dataframe.as[TestCaseClass](partial_function)
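
For illustration only, the idea in the description could be sketched in plain Scala without Spark. Everything here is an assumption made for the example: a row is modelled as a Map from column name to value, the column names user_id and user_name are invented, and convert is a hypothetical helper. It is not the Spark Dataset API and not a proposed implementation.

```scala
// A minimal sketch, assuming rows are modelled as Map[String, Any].
// None of these names come from Spark; they are hypothetical.
object PartialMappingSketch {
  // Stand-in for a DataFrame row: column name -> untyped value.
  type Row = Map[String, Any]

  // Target type whose field names differ from the column names.
  final case class TestCaseClass(id: Long, name: String)

  // Partial function from a loosely typed row to the case class.
  // It is defined only for rows that carry both columns with the
  // expected runtime types; other rows are outside its domain.
  val toTestCaseClass: PartialFunction[Row, TestCaseClass] =
    Function.unlift { (row: Row) =>
      (row.get("user_id"), row.get("user_name")) match {
        case (Some(id: Int), Some(name: String)) =>
          Some(TestCaseClass(id.toLong, name))
        case _ =>
          None
      }
    }

  // Keeps only the rows the partial function is defined for.
  def convert(rows: Seq[Row]): Seq[TestCaseClass] =
    rows.collect(toTestCaseClass)
}
```

In this sketch, rows that do not match are silently dropped by collect; a real API would have to decide whether a mismatch should instead fail the conversion.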



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
