Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2017/01/04 07:35:58 UTC

[jira] [Comment Edited] (SPARK-7768) Make user-defined type (UDT) API public

    [ https://issues.apache.org/jira/browse/SPARK-7768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15797443#comment-15797443 ] 

Liang-Chi Hsieh edited comment on SPARK-7768 at 1/4/17 7:35 AM:
----------------------------------------------------------------

I would like to push this forward and make UDTRegistration public, and I would like to hear opinions from others. cc [~rxin] [~mengxr]: I would appreciate any insights you can provide.

I noticed that in SPARK-14155 we hid UserDefinedType in Spark 2.0. One reason is that UDTs do not work with Datasets.

In fact, UserDefinedType can currently work with Datasets if you define an implicit encoder for the UDT object, i.e., something like {{implicit val encoder: Encoder[UDT.MyDenseVector] = ExpressionEncoder()}}. ScalaReflection will take care of encoding the UDT object. So I am not sure whether "making UDTs actually work with Datasets" is still a problem.
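
To make the encoder point concrete, here is a minimal sketch, not a definitive implementation. It assumes Spark 2.x on the classpath and that the code is compiled inside an org.apache.spark.* package, because UserDefinedType is private[spark] after SPARK-14155. The package name, the MyDenseVector and MyDenseVectorUDT classes (loosely modelled on the UDT.MyDenseVector mentioned above), and UdtDatasetSketch are hypothetical names for illustration only.

```scala
// Assumption: this file lives under org.apache.spark so that the
// private[spark] UserDefinedType class is accessible (Spark 2.x).
package org.apache.spark.sql.udtsketch // hypothetical package name

import org.apache.spark.sql.{Encoder, SparkSession}
import org.apache.spark.sql.catalyst.encoders.ExpressionEncoder
import org.apache.spark.sql.catalyst.util.{ArrayData, GenericArrayData}
import org.apache.spark.sql.types._

// Hypothetical user class, tied to its UDT through the annotation.
@SQLUserDefinedType(udt = classOf[MyDenseVectorUDT])
class MyDenseVector(val data: Array[Double]) extends Serializable

// Hypothetical UDT that stores the vector as an array of doubles.
class MyDenseVectorUDT extends UserDefinedType[MyDenseVector] {
  override def sqlType: DataType = ArrayType(DoubleType, containsNull = false)

  override def serialize(v: MyDenseVector): ArrayData =
    new GenericArrayData(v.data.map(_.asInstanceOf[Any]))

  override def deserialize(datum: Any): MyDenseVector = datum match {
    case a: ArrayData => new MyDenseVector(a.toDoubleArray())
  }

  override def userClass: Class[MyDenseVector] = classOf[MyDenseVector]
}

object UdtDatasetSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("udt-dataset-sketch")
      .getOrCreate()

    // The implicit encoder from the comment above; ScalaReflection is
    // expected to detect the annotation and delegate to the UDT.
    implicit val encoder: Encoder[MyDenseVector] = ExpressionEncoder()

    val ds = spark.createDataset(Seq(new MyDenseVector(Array(1.0, 2.0))))
    ds.collect().foreach(v => println(v.data.mkString(",")))

    spark.stop()
  }
}
```

The only Dataset-specific piece is the implicit val; everything else is an ordinary UDT definition, which is why the encoder alone may be enough for the Dataset use case.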



> Make user-defined type (UDT) API public
> ---------------------------------------
>
>                 Key: SPARK-7768
>                 URL: https://issues.apache.org/jira/browse/SPARK-7768
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>            Reporter: Xiangrui Meng
>            Priority: Critical
>
> As the demand for UDTs increases beyond sparse/dense vectors in MLlib, it would be nice to make the UDT API public in 1.5.


