Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2016/01/05 20:29:39 UTC

[jira] [Commented] (SPARK-12098) Cross validator with multi-arm bandit search

    [ https://issues.apache.org/jira/browse/SPARK-12098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15083610#comment-15083610 ] 

Joseph K. Bradley commented on SPARK-12098:
-------------------------------------------

[~yinxusen] Thanks for your work on this, but I think we need to delay this feature.  It's something we'll probably want to add in the future, but we just don't have the bandwidth right now for it.  Could you publish your work as a Spark package for the time being?  It would be great if you could get some feedback about the package from users, so that we can get more info about how much it improves on CrossValidator.  Thanks for your understanding.

> Cross validator with multi-arm bandit search
> --------------------------------------------
>
>                 Key: SPARK-12098
>                 URL: https://issues.apache.org/jira/browse/SPARK-12098
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML, MLlib
>            Reporter: Xusen Yin
>
> Classic cross-validation requires every inner classifier to run for a fixed number of iterations or until it converges, which is costly, especially on massive data sets. The paper Non-stochastic Best Arm Identification and Hyperparameter Optimization (http://arxiv.org/pdf/1502.07943v1.pdf) describes a promising way to reduce the total number of cross-validation iterations using multi-armed bandit search.
> Multi-armed bandit search for cross-validation (bandit search for short) requires warm starts for ML algorithms and fine-grained control over the inner behavior of the cross validator.
> Since there are many bandit search algorithms for finding the best parameter set, we intend to provide only a few of them at first, to reduce the testing and perf-testing work and keep the feature stable.
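
To illustrate the idea in the description, here is a minimal sketch of successive halving, one of the bandit strategies discussed in the cited paper. It is not the code proposed for SPARK-12098 and does not use any Spark API. The names `successive_halving` and `train_more`, the toy loss curve, and the `regParam` labels are all hypothetical and exist only for this example. Each "arm" is a hyperparameter setting whose model can be warm-started, that is, trained for a few more iterations from its current state.

```python
import math

def successive_halving(arms, train_more, budget):
    """Return (best arm name, total iterations spent).

    arms: dict mapping an arm name to its initial training state.
    train_more: hypothetical callback (state, n_iters) -> (new_state, loss)
        that resumes training from `state` (a warm start).
    budget: total number of training iterations to spend across all arms.
    """
    survivors = dict(arms)
    rounds = max(1, math.ceil(math.log2(len(arms))))
    used = 0
    best = list(survivors)
    for _ in range(rounds):
        # Split this round's share of the budget evenly among survivors.
        iters = max(1, budget // (len(survivors) * rounds))
        states, losses = {}, {}
        for name, state in survivors.items():
            states[name], losses[name] = train_more(state, iters)
            used += iters
        # Keep the better half (rounded up) and drop the rest.
        keep = math.ceil(len(survivors) / 2)
        best = sorted(losses, key=losses.get)[:keep]
        survivors = {name: states[name] for name in best}
    return best[0], used

# Toy stand-in for a trainable model (an assumption for this sketch):
# state is (limit, iterations_done) and the loss decays toward `limit`.
def train_more(state, n_iters):
    limit, done = state
    done += n_iters
    return (limit, done), limit + 1.0 / done

arms = {
    "regParam=0.01": (0.20, 0),
    "regParam=0.1": (0.10, 0),
    "regParam=1.0": (0.35, 0),
    "regParam=10.0": (0.50, 0),
}
best, used = successive_halving(arms, train_more, budget=80)
print(best, used)  # regParam=0.1 80
```

With these toy arms the search spends 80 iterations in total, whereas training all four settings for the 30 iterations the winner receives would cost 120. The saving comes from abandoning weak settings early, which is why warm starts matter: surviving models continue from where they stopped instead of retraining from scratch.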



