You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@spark.apache.org by "Nicholas Chammas (JIRA)" <ji...@apache.org> on 2016/07/07 20:21:11 UTC

[jira] [Updated] (SPARK-16074) Expose VectorUDT/MatrixUDT in a public API

     [ https://issues.apache.org/jira/browse/SPARK-16074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Nicholas Chammas updated SPARK-16074:
-------------------------------------
    Component/s:     (was: MLilb)
                 MLlib

> Expose VectorUDT/MatrixUDT in a public API
> ------------------------------------------
>
>                 Key: SPARK-16074
>                 URL: https://issues.apache.org/jira/browse/SPARK-16074
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>    Affects Versions: 2.0.0
>            Reporter: Xiangrui Meng
>            Assignee: Xiangrui Meng
>            Priority: Critical
>             Fix For: 2.0.0
>
>
> Both VectorUDT and MatrixUDT are private APIs, because UserDefinedType itself is private in Spark. However, in order to let developers implement their own transformers and estimators, we should expose both types in a public API to simply the implementation of transformSchema, transform, etc. Otherwise, they need to get the data types using reflection.
> Note that this doesn't mean to expose VectorUDT/MatrixUDT classes. We can just have a method or a static value that returns VectorUDT/MatrixUDT instance with DataType as the return type. There are two ways to implement this:
> 1. following DataTypes.java in SQL, so Java users doesn't need the extra "()".
> 2. Define DataTypes in Scala.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org