Posted to issues@spark.apache.org by "Yanbo Liang (JIRA)" <ji...@apache.org> on 2017/03/21 15:17:41 UTC

[jira] [Comment Edited] (SPARK-17136) Design optimizer interface for ML algorithms

    [ https://issues.apache.org/jira/browse/SPARK-17136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15934736#comment-15934736 ] 

Yanbo Liang edited comment on SPARK-17136 at 3/21/17 3:17 PM:
--------------------------------------------------------------

[~sethah] Thanks for the design doc.
One quick question: in your design, if we set the parameters on the optimizer, do we still support setting the same parameters on the estimator?
If yes, why do we need to support two entry points for the same set of params? I saw your reply in the design doc, where you propose that the params set on the optimizer take precedence over the ones set on the estimator. Does this cause confusion for users and add maintenance cost?
Would grid search-based model selection in the current framework (such as CrossValidator) still work well? Thanks.
I would prefer to keep these params in the estimators, make the optimizer layer an internal API, and let users register their own optimizer implementations, similar to how data source support works. I find this more aligned with the original [ML pipeline design|https://docs.google.com/document/d/1rVwXRjWKfIb-7PI6b86ipytwbUH7irSNLF1_6dLmh8o/edit#], which stores params outside a pipeline component.
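
A rough sketch of the alternative I have in mind, written in plain Python rather than against Spark. Every name here (register_optimizer, GradientDescent, MeanEstimator, the "gd" key) is hypothetical and is not part of Spark or of the design doc. It only illustrates the shape: params live on the estimator, the optimizer is looked up by name and built internally, and a grid search varies estimator params only.

```python
from typing import Callable, Dict, List

# Hypothetical registry mapping an optimizer name to a factory (not a Spark API).
_OPTIMIZERS: Dict[str, Callable[..., object]] = {}


def register_optimizer(name: str, factory: Callable[..., object]) -> None:
    """Let users plug in their own optimizer implementation by name."""
    _OPTIMIZERS[name] = factory


class GradientDescent:
    """Plain fixed-step gradient descent on a scalar."""

    def __init__(self, max_iter: int, step_size: float) -> None:
        self.max_iter = max_iter
        self.step_size = step_size

    def minimize(self, grad: Callable[[float], float], x0: float) -> float:
        x = x0
        for _ in range(self.max_iter):
            x -= self.step_size * grad(x)
        return x


register_optimizer("gd", GradientDescent)


class MeanEstimator:
    """Toy estimator: all params are set here, never on the optimizer."""

    def __init__(self, optimizer: str = "gd", max_iter: int = 100,
                 step_size: float = 0.1) -> None:
        self.optimizer = optimizer
        self.max_iter = max_iter
        self.step_size = step_size

    def fit(self, ys: List[float]) -> float:
        # The optimizer is an internal detail, built from the estimator's params.
        opt = _OPTIMIZERS[self.optimizer](max_iter=self.max_iter,
                                          step_size=self.step_size)
        mean = sum(ys) / len(ys)
        # Gradient of f(x) = (x - mean)^2, which is minimized at x = mean.
        return opt.minimize(lambda x: 2.0 * (x - mean), 0.0)


# A grid search only needs to vary estimator params: one entry point per param.
grid = [{"max_iter": m, "step_size": s} for m in (1, 200) for s in (0.1, 0.25)]
fits = [MeanEstimator(**params).fit([1.0, 2.0, 3.0]) for params in grid]
```

With this layout there is no precedence rule to define between optimizer params and estimator params, because the optimizer never carries user-facing params of its own.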



> Design optimizer interface for ML algorithms
> --------------------------------------------
>
>                 Key: SPARK-17136
>                 URL: https://issues.apache.org/jira/browse/SPARK-17136
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Seth Hendrickson
>
> We should consider designing an interface that allows users to use their own optimizers in some of the ML algorithms, similar to MLlib. 


