Posted to issues@spark.apache.org by "Yanbo Liang (JIRA)" <ji...@apache.org> on 2015/12/01 10:39:11 UTC

[jira] [Comment Edited] (SPARK-11939) PySpark support model export/import for Pipeline API

    [ https://issues.apache.org/jira/browse/SPARK-11939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15033425#comment-15033425 ] 

Yanbo Liang edited comment on SPARK-11939 at 12/1/15 9:38 AM:
--------------------------------------------------------------

I'm working on this feature, but I think we should discuss the design before starting the implementation:
* Supporting save for Estimator/Transformer is simple: we can directly call the corresponding function of the Java object.
* Supporting load for Transformer (Model) is not very hard, because we can load the Java model and pass it to the PySpark model constructor. What we need to do is make the model constructor public on the Scala side.
* Supporting load for Estimator is not easy, because the constructor of a Python Estimator cannot accept a Java estimator as an argument. Here I have two proposals:
** Re-implement load for Estimator on the Python side, following the Scala code.
** Add a param named "javaObj" so that we can pass the Java estimator object to the constructor of the Python estimator.
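
To make the "javaObj" proposal concrete, here is a minimal sketch in plain Python. It does not use Spark or Py4J: FakeJavaEstimator is a hypothetical stand-in for the JVM-side estimator, and PyEstimator, max_iter, and round_trip are illustrative names only. The JSON file format is likewise an assumption for the sketch, not how Spark persists estimators.

```python
import json
import os
import tempfile


class FakeJavaEstimator:
    """Hypothetical stand-in for a JVM-side estimator (not a Spark class)."""

    def __init__(self, max_iter=10):
        self.max_iter = max_iter

    def save(self, path):
        # Persist the estimator's params; JSON is an assumption of this sketch.
        with open(path, "w") as f:
            json.dump({"max_iter": self.max_iter}, f)

    @staticmethod
    def load(path):
        with open(path) as f:
            return FakeJavaEstimator(**json.load(f))


class PyEstimator:
    """Hypothetical Python wrapper illustrating the "javaObj" constructor param."""

    def __init__(self, max_iter=10, javaObj=None):
        # Wrap an already-loaded Java-side object if one is given;
        # otherwise create a fresh one from the Python-side params.
        if javaObj is not None:
            self._java_obj = javaObj
        else:
            self._java_obj = FakeJavaEstimator(max_iter)

    @property
    def max_iter(self):
        return self._java_obj.max_iter

    def save(self, path):
        # Save simply delegates to the Java-side object.
        self._java_obj.save(path)

    @classmethod
    def load(cls, path):
        # Load on the Java side, then hand the result to the constructor.
        return cls(javaObj=FakeJavaEstimator.load(path))


def round_trip(max_iter):
    """Save an estimator to a temp file, load it back, and return its param."""
    path = os.path.join(tempfile.mkdtemp(), "estimator.json")
    PyEstimator(max_iter=max_iter).save(path)
    return PyEstimator.load(path).max_iter
```

With this shape, load needs no Python-side re-implementation of the Scala reader; the trade-off is an extra constructor argument that ordinary users would never set.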

Any suggestions? Looking forward to your comments. [~mengxr] [~josephkb]



> PySpark support model export/import for Pipeline API
> ----------------------------------------------------
>
>                 Key: SPARK-11939
>                 URL: https://issues.apache.org/jira/browse/SPARK-11939
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML, PySpark
>            Reporter: Yanbo Liang
>
> SPARK-6725 provides model export/import for the Pipeline API on the Scala side; we should also support it on the Python side.


