Posted to issues@spark.apache.org by "Nick Pentreath (JIRA)" <ji...@apache.org> on 2017/08/03 12:00:01 UTC

[jira] [Commented] (SPARK-21086) CrossValidator, TrainValidationSplit should preserve all models after fitting

    [ https://issues.apache.org/jira/browse/SPARK-21086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16112623#comment-16112623 ] 

Nick Pentreath commented on SPARK-21086:
----------------------------------------

I'd like to understand _why_ folks want to keep all the models. Is it actually the models (and model data) they want, or an easier, "official API" way to link the param permutations with the cross-validation scores, so they can see which param combinations produce which scores? (In that case, https://issues.apache.org/jira/browse/SPARK-18704 is actually the solution.)
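To make the second option concrete, here is a minimal sketch of linking param combinations to their cross-validation scores without keeping any fitted models. It is plain Python over stand-in data: the helper `rank_param_maps` and the example values are hypothetical, and plain dicts stand in for Spark param maps. In PySpark the two input lists would presumably come from `cvModel.getEstimatorParamMaps()` and `cvModel.avgMetrics`, which are index-aligned; treat that mapping as an assumption to check against your Spark version.

```python
from typing import Any, Dict, List, Tuple

# Stand-in for a Spark ParamMap: parameter name -> value (assumption for this sketch).
ParamMap = Dict[str, Any]


def rank_param_maps(
    param_maps: List[ParamMap],
    avg_metrics: List[float],
    larger_is_better: bool = True,
) -> List[Tuple[ParamMap, float]]:
    """Pair each param combination with its average metric and sort best-first."""
    if len(param_maps) != len(avg_metrics):
        raise ValueError("param_maps and avg_metrics must have the same length")
    pairs = list(zip(param_maps, avg_metrics))
    return sorted(pairs, key=lambda pair: pair[1], reverse=larger_is_better)


# Hypothetical grid and scores, index-aligned like the validator's outputs.
param_maps = [
    {"regParam": 0.01, "maxIter": 10},
    {"regParam": 0.1, "maxIter": 10},
    {"regParam": 1.0, "maxIter": 10},
]
avg_metrics = [0.81, 0.86, 0.74]

ranked = rank_param_maps(param_maps, avg_metrics)
for params, score in ranked:
    print(params, score)
```

For metrics where smaller is better (RMSE, for example), pass `larger_is_better=False`.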

> CrossValidator, TrainValidationSplit should preserve all models after fitting
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-21086
>                 URL: https://issues.apache.org/jira/browse/SPARK-21086
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>    Affects Versions: 2.2.0
>            Reporter: Joseph K. Bradley
>
> I've heard multiple requests for having CrossValidatorModel and TrainValidationSplitModel preserve the full list of fitted models.  This sounds very valuable.
> One decision should be made before we do this: Should we save and load the models in ML persistence?  That could blow up the size of a saved Pipeline if the models are large.
> * I suggest *not* saving the models by default but allowing saving if specified.  We could specify whether to save the models via an extra Param on CrossValidatorModelWriter, but we would have to expose CrossValidatorModelWriter as a public API and change the return type of CrossValidatorModel.write to CrossValidatorModelWriter (this would not be a breaking change).
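The opt-in persistence suggested in the description could look roughly like the sketch below. This is a design illustration in plain Python, not Spark's API: the class `ValidatorModelWriter`, its `persist_sub_models` method, the file names, and the JSON layout are all hypothetical, and dicts stand in for fitted models. The point is only that the best model is always written, while the full list of fitted models is written only when explicitly requested.

```python
import json
import os
import tempfile
from typing import Any, Dict, List


class ValidatorModelWriter:
    """Hypothetical writer: always saves the best model, sub-models only on request."""

    def __init__(self, best_model: Dict[str, Any], sub_models: List[Dict[str, Any]]):
        self._best_model = best_model
        self._sub_models = sub_models
        self._persist_sub_models = False  # default: keep the saved output small

    def persist_sub_models(self, value: bool) -> "ValidatorModelWriter":
        """Opt in (or out) of saving every fitted model; returns self for chaining."""
        self._persist_sub_models = value
        return self

    def save(self, path: str) -> List[str]:
        """Write JSON files under `path` and return the sorted file names written."""
        os.makedirs(path, exist_ok=True)
        to_write = {"bestModel.json": self._best_model}
        if self._persist_sub_models:
            for i, model in enumerate(self._sub_models):
                to_write[f"subModel_{i}.json"] = model
        for name, payload in to_write.items():
            with open(os.path.join(path, name), "w") as f:
                json.dump(payload, f)
        return sorted(to_write)


# Stand-in "models": plain dicts instead of fitted estimators.
best = {"regParam": 0.1, "coefficients": [0.5, -1.2]}
subs = [
    {"regParam": 0.01, "coefficients": [0.7, -1.9]},
    {"regParam": 0.1, "coefficients": [0.5, -1.2]},
]

with tempfile.TemporaryDirectory() as tmp:
    default_files = ValidatorModelWriter(best, subs).save(os.path.join(tmp, "default"))
    full_files = (
        ValidatorModelWriter(best, subs)
        .persist_sub_models(True)
        .save(os.path.join(tmp, "full"))
    )

print(default_files)
print(full_files)
```

Returning a dedicated writer type from `write` is what makes a chained option like this possible, which is why the description proposes exposing the writer class publicly.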



