Posted to reviews@spark.apache.org by shahidki31 <gi...@git.apache.org> on 2018/10/06 18:18:50 UTC
[GitHub] spark pull request #22660: [SPARK-25624][TEST] Reduce test time of LogisticR...
GitHub user shahidki31 opened a pull request:
https://github.com/apache/spark/pull/22660
[SPARK-25624][TEST] Reduce test time of LogisticRegressionSuite.multinomial logistic regression…
… with intercept with elasticnet regularization
## What changes were proposed in this pull request?
In the "LogisticRegressionSuite", the test "multinomial logistic regression with intercept with elasticnet regularization" takes around 1 minute to train 2 logistic regression models.
However, after analyzing the training cost per iteration, we can reduce the computation time by about 50%.
Training cost vs iteration for model 1:
![image](https://user-images.githubusercontent.com/23054875/46574488-ca050d80-c9c1-11e8-98e5-2206e4db5106.png)
So, model1 converges after iteration 200.
Training cost vs iteration for model 2:
![image](https://user-images.githubusercontent.com/23054875/46574505-0173ba00-c9c2-11e8-98c4-f62a19274ce8.png)
Model2 converges after around 50 iterations.
So, if we set the maximum number of iterations for model1 and model2 to 250 and 75 respectively, we can reduce the computation time by half.
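The reasoning above (find where the training cost flattens, then cap the iterations slightly beyond that point) can be sketched in a few lines of Python. This is only an illustration, not code from the patch: the function names `convergence_iteration` and `suggest_max_iter`, the tolerance, and the sample cost list are all hypothetical, and the list stands in for whatever per-iteration cost history was plotted in the images.

```python
import math
from typing import Sequence


def convergence_iteration(costs: Sequence[float], rel_tol: float = 1e-6) -> int:
    """Return the index of the last iteration whose relative cost change
    is still at least rel_tol. After that index the curve is flat.

    Hypothetical helper: costs[i] is the training cost after iteration i.
    """
    last_significant = 0
    for i in range(1, len(costs)):
        prev, cur = costs[i - 1], costs[i]
        # Guard against division by zero when the previous cost is 0.
        rel_change = abs(prev - cur) / max(abs(prev), 1e-12)
        if rel_change >= rel_tol:
            last_significant = i
    return last_significant


def suggest_max_iter(converged_at: int, margin: float = 1.25) -> int:
    """Pick an iteration cap a bit above the observed convergence point.

    The margin is an assumption; it simply leaves headroom so the model
    still reaches the same solution.
    """
    return math.ceil(converged_at * margin)


if __name__ == "__main__":
    # Made-up cost curve for illustration only (not data from the PR).
    sample_costs = [1.0, 0.5, 0.3, 0.25, 0.25, 0.25]
    print(convergence_iteration(sample_costs))
    # Margins chosen to mirror the 200 -> 250 and 50 -> 75 choices above.
    print(suggest_max_iter(200, 1.25))
    print(suggest_max_iter(50, 1.5))
```

With margins of 1.25 and 1.5, the convergence points of 200 and 50 map to the caps of 250 and 75 proposed here; any similar headroom would serve the same purpose.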
## How was this patch tested?
Computation time in a local setup:
Before change: ~54 sec
After change: ~28 sec
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/shahidki31/spark SPARK-25624_LR
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22660.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22660
----
commit 02d721dfe9d98efc54aa19c0d11904b17141b53b
Author: Shahid <sh...@...>
Date: 2018-10-06T18:05:16Z
[SPARK-25624]LogisticRegressionSuite.multinomial logistic regression with intercept with elasticnet regularization 56 seconds
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22660: [SPARK-25624][TEST] Reduce test time of LogisticR...
Posted by shahidki31 <gi...@git.apache.org>.
Github user shahidki31 closed the pull request at:
https://github.com/apache/spark/pull/22660
---
[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22660
Can one of the admins verify this patch?
---
[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...
Posted by shahidki31 <gi...@git.apache.org>.
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/22660
Thanks for the suggestion. I will close this and amend in the other PR.
---
[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...
Posted by shahidki31 <gi...@git.apache.org>.
Github user shahidki31 commented on the issue:
https://github.com/apache/spark/pull/22660
cc @srowen Kindly review.
---
[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...
Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:
https://github.com/apache/spark/pull/22660
This kind of thing looks OK, but please make one PR. There's no point in opening lots of JIRAs and PRs for the same change in N places.
---