Posted to reviews@spark.apache.org by yinxusen <gi...@git.apache.org> on 2015/12/02 18:42:26 UTC

[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

GitHub user yinxusen opened a pull request:

    https://github.com/apache/spark/pull/10105

    [SPARK-12098] Cross validator with multi-arm bandit search

    https://issues.apache.org/jira/browse/SPARK-12098
    
    Classic cross-validation requires every inner classifier to run for a fixed number of iterations or until it converges. This is costly, especially on massive data. The paper [Non-stochastic Best Arm Identification and Hyperparameter Optimization](http://arxiv.org/pdf/1502.07943v1.pdf) suggests a promising way to reduce the total number of iterations in cross-validation by using multi-armed bandit search.
    
    Multi-armed bandit search for cross-validation (bandit search for short) requires warm start of ML algorithms and fine-grained control over the inner behavior of the cross validator.
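
To illustrate why warm start matters, here is a minimal Python sketch of successive halving, the procedure studied in the paper cited above. It is not the code in this pull request: `WarmStartArm`, its `train` method, and the synthetic loss curve are hypothetical stand-ins for an estimator that can resume training from its current state.

```python
import math


class WarmStartArm:
    """Hypothetical hyperparameter setting whose training can be resumed."""

    def __init__(self, name, floor):
        self.name = name
        self.floor = floor      # loss this setting converges to (synthetic)
        self.iterations = 0     # training iterations spent so far

    def train(self, extra_iterations):
        """Resume training for extra_iterations and return the validation loss."""
        self.iterations += extra_iterations
        return self.floor + (1.0 - self.floor) / (1.0 + self.iterations)


def successive_halving(arms, budget):
    """Split the budget over rounds and drop the worse half after each round."""
    rounds = max(1, math.ceil(math.log2(len(arms))))
    survivors = list(arms)
    for _ in range(rounds):
        # Each surviving arm gets an equal share of this round's budget.
        per_arm = max(1, budget // (len(survivors) * rounds))
        scored = [(arm.train(per_arm), arm) for arm in survivors]
        scored.sort(key=lambda pair: pair[0])  # lowest loss first
        keep = math.ceil(len(survivors) / 2)
        survivors = [arm for _, arm in scored[:keep]]
    return survivors[0]


arms = [WarmStartArm(n, f) for n, f in zip("abcd", (0.1, 0.2, 0.3, 0.4))]
best = successive_halving(arms, budget=80)
print(best.name, [arm.iterations for arm in arms])
```

With four arms and a budget of 80 iterations, the two weaker settings stop after 10 iterations each while the two survivors continue to 30. Training all four settings to 30 iterations would cost 120. Without warm start, each round would have to retrain the survivors from scratch.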
    
    Because there are many bandit search algorithms for finding the best parameter set, we intend to provide only a few of them at first, to reduce the testing and perf-testing work and to keep the feature stable.
    
    This version provides only StaticSearch and ExponentialWeightsSearch (see chapter 3 of [Regret Analysis of Stochastic and Nonstochastic Multi-armed Bandit Problems](http://arxiv.org/abs/1204.5721)). For more search strategies and perf tests, see https://github.com/yinxusen/spark/tree/bandit.
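
The exponential-weights idea can be sketched as follows. This is an Exp3-style loop in Python under stated assumptions, not the PR's `ExponentialWeightsSearch`: `ToyArm`, `exp3_probabilities`, the learning rate `eta`, and the synthetic loss curve are all hypothetical.

```python
import math
import random


class ToyArm:
    """Hypothetical stand-in for one hyperparameter setting with warm start."""

    def __init__(self, floor):
        self.floor = floor      # loss this setting converges to (synthetic)
        self.iterations = 0     # training iterations spent so far

    def pull(self):
        """Train one more iteration and return a validation loss in [0, 1]."""
        self.iterations += 1
        return self.floor + (1.0 - self.floor) / (1.0 + self.iterations)


def exp3_probabilities(cumulative_losses, eta):
    """Sampling distribution proportional to exp(-eta * estimated loss)."""
    low = min(cumulative_losses)  # shift by the minimum for numerical stability
    weights = [math.exp(-eta * (loss - low)) for loss in cumulative_losses]
    total = sum(weights)
    return [w / total for w in weights]


def exponential_weights_search(arms, budget, eta, rng):
    """Spend `budget` single-iteration pulls; return (best index, last losses)."""
    estimated = [0.0] * len(arms)
    last_loss = [float("inf")] * len(arms)
    for _ in range(budget):
        probs = exp3_probabilities(estimated, eta)
        i = rng.choices(range(len(arms)), weights=probs)[0]
        loss = arms[i].pull()
        last_loss[i] = loss
        # Importance weighting keeps the loss estimate unbiased for every arm.
        estimated[i] += loss / probs[i]
    best = min(range(len(arms)), key=lambda k: last_loss[k])
    return best, last_loss


arms = [ToyArm(f) for f in (0.1, 0.3, 0.5)]
best, losses = exponential_weights_search(
    arms, budget=60, eta=0.1, rng=random.Random(0))
print(best, [arm.iterations for arm in arms])
```

Arms with a higher estimated loss are sampled less often, so most of the iteration budget goes to the settings that look promising, while every arm keeps a nonzero chance of being revisited.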

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yinxusen/spark SPARK-12098

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/10105.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #10105
    
----
commit fa728e8068fd4ec732bf5d10bf91d9ef4c5444b2
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-21T20:37:24Z

    add bandit search

commit 85e0de61dddde37afc2253c52284c2f7b43d1ac4
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-21T20:51:06Z

    refine arm

commit 73835cc887b15f5474e4e346c515940d08df7a55
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-21T21:14:47Z

    refine search

commit 32362334ada727afc7f6b68c55fd2f2e660cd4b5
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-21T21:16:16Z

    refine search 2

commit 5f812bf8b20b7c9f9aece198d915a183442148dc
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-21T21:37:18Z

    refine search

commit 30a948ec6c327db3035133c377ff25cb46138a92
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-22T18:53:24Z

    refine imports

commit 68543d83baaa38c4e111ae14f00a901310824c89
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-22T19:02:30Z

    add bandit test

commit 77c50a943931e8ed53e12c492f0cfd4866f238ca
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-22T20:09:32Z

    fix style

commit 92aee07e6e0876e1ad7c97d58c83eba0634a9aab
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-10-22T20:26:39Z

    add bandit example

commit 1db3b251ac0dab91172a35854b52ce5875333397
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-20T06:13:50Z

    push with errors, talk with someone

commit 8d760c4d88fa65a629f227a74f34bdc5a9ec138b
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T09:01:19Z

    fix type errors

commit 0bc4ebfcefd78f01162acc184adc56608082fa76
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T09:02:58Z

    Merge branch 'master' into bandit

commit f7d7f53b24f388966735bf5cedda6122e40420e6
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T10:11:33Z

    add package object

commit 9285367f03c88cb71190fa483402c1f3cd7ab6bb
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T15:00:57Z

    a runnable strong type check version

commit e240b3ecf92f45f9b9912d3e64bf7a824670ebda
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T16:07:10Z

    change arm into common class

commit 4b8c441ffaa633b8a46a564e17e7b7feb65f28e7
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T16:14:44Z

    fix all search strategies

commit 36302898d86d691b653440476387809f842c4a84
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T16:18:21Z

    fix code style

commit 49d13a6863f11382a6e16cae39f4e65741e90d1c
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T16:23:36Z

    fix style

commit 988c27e0eca0c920019a2ef929365e54b75f31d6
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-24T16:39:02Z

    add comments

commit e97236623e0566f654f28e3ac220429154076988
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T07:23:20Z

    fix errors

commit ec21779ec7783e4d05163daae5c3d6a828df9b7e
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T07:41:58Z

    fix small bugs

commit 98ea474dc5b5d812ed1ddda006a7a6e3d2395408
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T14:48:39Z

    add perf test

commit 47647ede5b5e9341c61086f1b4699f45107c6050
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T15:23:04Z

    fix search and num of iterations

commit 4e6f6dc8761e62feeb3290ef19987b94a375fc53
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T16:11:08Z

    get records of each iteration

commit 370193a3907d123d3b676da7b3dd3c020b46fcaa
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T16:41:20Z

    add summary

commit 0112d800d8b57bd137e836372e20340f2fef411d
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T17:13:03Z

    refine it

commit b63b0c32a4487a014838f7333714288f00608906
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-11-29T17:25:37Z

    fix errors

commit e54926a880d3c9aab4fc7afb0becfee8ec137336
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-12-02T10:29:03Z

    add for painting

commit be270a7d5b87ae198234f9ebb2aa37ee0030de67
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-12-02T14:58:43Z

    add more comments

commit 2b7064a59c261972e6bea73decf10c713faebef4
Author: Xusen Yin <yi...@gmail.com>
Date:   2015-12-02T15:09:51Z

    fix typo

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by yinxusen <gi...@git.apache.org>.
Github user yinxusen commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-169250966
  
    @jkbradley Got it, I'll close it.




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-161392711
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/47075/




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-161392614
  
    **[Test build #47075 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47075/consoleFull)** for PR 10105 at commit [`e261872`](https://github.com/apache/spark/commit/e26187206e73f2e98ad38e31cf89b119fa2b5390).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `trait BanditValidatorParams extends ValidatorParams with HasMaxIter`
       * `class BanditValidator(override val uid: String)`
       * `class Arm(`
       * `trait Controllable extends Params with HasMaxIter`
       * `abstract class Search`
       * `class StaticSearch extends Search`
       * `class ExponentialWeightsSearch extends Search`




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by yinxusen <gi...@git.apache.org>.
Github user yinxusen commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-169251108
  
    Closing it for now to gather more feedback.




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by yinxusen <gi...@git.apache.org>.
Github user yinxusen commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-161497171
  
    The error is caused by my modification to `LogisticRegression`. I added `Controllable` to it without implementing save/load for the new params.
    
    However, this PR is not the right place to add that functionality. I plan to add warm start to estimators in a separate JIRA issue, then come back and fix the error here.




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-161392709
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by yinxusen <gi...@git.apache.org>.
Github user yinxusen closed the pull request at:

    https://github.com/apache/spark/pull/10105




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by jkbradley <gi...@git.apache.org>.
Github user jkbradley commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-169104918
  
    @yinxusen I just commented on the JIRA about this, but could we please close this issue for now?  I'd like to postpone this feature due to limited review bandwidth.  But definitely post it as a Spark package, and see if you can get some feedback from users.  Thanks very much.




[GitHub] spark pull request: [SPARK-12098] Cross validator with multi-arm b...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/10105#issuecomment-161381057
  
    **[Test build #47075 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/47075/consoleFull)** for PR 10105 at commit [`e261872`](https://github.com/apache/spark/commit/e26187206e73f2e98ad38e31cf89b119fa2b5390).

