Posted to reviews@spark.apache.org by shahidki31 <gi...@git.apache.org> on 2018/10/06 18:18:50 UTC

[GitHub] spark pull request #22660: [SPARK-25624][TEST] Reduce test time of LogisticR...

GitHub user shahidki31 opened a pull request:

    https://github.com/apache/spark/pull/22660

    [SPARK-25624][TEST] Reduce test time of LogisticRegressionSuite.multinomial logistic regression with intercept with elasticnet regularization
    
    ## What changes were proposed in this pull request?
    In the test "multinomial logistic regression with intercept with elasticnet regularization" in "LogisticRegressionSuite", training the 2 logistic regression models takes around 1 minute.
    However, after analyzing the training cost per iteration, we can reduce the computation time by about 50%.
    Training cost vs iteration for model 1:
    ![image](https://user-images.githubusercontent.com/23054875/46574488-ca050d80-c9c1-11e8-98e5-2206e4db5106.png)
    
    So, model1 converges after iteration 200.
    
    Training cost vs iteration for model 2:
    ![image](https://user-images.githubusercontent.com/23054875/46574505-0173ba00-c9c2-11e8-98c4-f62a19274ce8.png)
    Model2 converges after around 50 iterations.
    So, if we set the maximum number of iterations for model1 and model2 to 250 and 75 respectively, we can reduce the computation time by about half.
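
The reasoning above (read the convergence point off the cost curve, then cap the iterations a little above it) can be sketched in a few lines. This is only an illustrative sketch, not the code from the PR: the function name, the `rel_tol` and `patience` parameters, and the synthetic cost curve are all assumptions, and the per-iteration costs are assumed to be available as a plain list.

```python
import math
from typing import Optional, Sequence


def first_converged_iteration(
    costs: Sequence[float],
    rel_tol: float = 1e-3,   # hypothetical threshold on relative change
    patience: int = 5,       # hypothetical number of consecutive "flat" steps
) -> Optional[int]:
    """Return the first iteration that starts a run of `patience` consecutive
    steps whose relative cost change is below `rel_tol`, or None if no such
    run exists."""
    streak = 0
    for i in range(1, len(costs)):
        prev, cur = costs[i - 1], costs[i]
        # Relative change between consecutive iterations; guard against prev == 0.
        rel_change = abs(prev - cur) / max(abs(prev), 1e-12)
        if rel_change < rel_tol:
            streak += 1
            if streak >= patience:
                return i - patience + 1
        else:
            streak = 0
    return None


if __name__ == "__main__":
    # Synthetic, made-up cost curve (NOT the data behind the plots above).
    synthetic_costs = [0.5 + math.exp(-i / 20.0) for i in range(300)]
    converged_at = first_converged_iteration(synthetic_costs)
    print("cost flattens around iteration:", converged_at)
```

A cap chosen this way should keep some margin above the detected point, as the PR does with 250 for a curve that flattens near 200 and 75 for one that flattens near 50.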
    
    ## How was this patch tested?
    Computation time in a local setup:
    Before change: ~54 sec
    After change: ~28 sec
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/shahidki31/spark SPARK-25624_LR

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22660.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22660
    
----
commit 02d721dfe9d98efc54aa19c0d11904b17141b53b
Author: Shahid <sh...@...>
Date:   2018-10-06T18:05:16Z

     [SPARK-25624]LogisticRegressionSuite.multinomial logistic regression with intercept with elasticnet regularization 56 seconds

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22660: [SPARK-25624][TEST] Reduce test time of LogisticR...

Posted by shahidki31 <gi...@git.apache.org>.
Github user shahidki31 closed the pull request at:

    https://github.com/apache/spark/pull/22660


---



[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22660
  
    Can one of the admins verify this patch?


---



[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...

Posted by shahidki31 <gi...@git.apache.org>.
Github user shahidki31 commented on the issue:

    https://github.com/apache/spark/pull/22660
  
    Thanks for the suggestion. I will close this and amend in the other PR. 


---



[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...

Posted by shahidki31 <gi...@git.apache.org>.
Github user shahidki31 commented on the issue:

    https://github.com/apache/spark/pull/22660
  
    cc @srowen Kindly review.


---



[GitHub] spark issue #22660: [SPARK-25624][TEST] Reduce test time of LogisticRegressi...

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/22660
  
    This kind of thing looks OK, but please make one PR. There's no point in opening lots of JIRAs and PRs for the same change in N places.


---


