Posted to reviews@spark.apache.org by HyukjinKwon <gi...@git.apache.org> on 2017/11/11 13:31:02 UTC

[GitHub] spark pull request #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations i...

GitHub user HyukjinKwon opened a pull request:

    https://github.com/apache/spark/pull/19722

    [WIP][SPARK-21693][R][ML] Reduce max iterations in Linear SVM test in R to speed up AppVeyor build

    ## What changes were proposed in this pull request?
    
    This PR proposes to reduce the max iterations in the Linear SVM test in SparkR. This particular test took roughly 5 mins, and 20+ mins on Windows.
    
    The root cause is that it triggers roughly 2,500 jobs with the default 100 iterations. On Linux, `daemon.R` is forked, but on Windows another process is launched, which is extremely slow.
    
    So, from what I observed, many processes are launched on Windows, which accounts for the difference in elapsed time.
    
    After reducing the max iterations, the total number of jobs in this single test drops to roughly 550.
    
    ## How was this patch tested?
    
    Manually tested.
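
As a rough illustration of the job counts quoted in the description above, the following Python sketch works through the arithmetic. The constants are the approximate figures from the description; the function names are hypothetical and not part of Spark or SparkR, and the per-iteration figure is only an average that ignores any fixed per-test overhead.

```python
# Approximate figures quoted in the PR description ("ish" values, not exact).
JOBS_BEFORE = 2500       # jobs triggered with the default 100 iterations
JOBS_AFTER = 550         # jobs after reducing the max iterations
DEFAULT_MAX_ITER = 100   # default iteration count named in the description


def jobs_per_iteration(total_jobs: int, iterations: int) -> float:
    """Average number of jobs per iteration (hypothetical helper)."""
    return total_jobs / iterations


def percent_reduction(before: int, after: int) -> int:
    """Whole-number percentage by which the job count dropped."""
    return (before - after) * 100 // before


if __name__ == "__main__":
    print(jobs_per_iteration(JOBS_BEFORE, DEFAULT_MAX_ITER))
    print(percent_reduction(JOBS_BEFORE, JOBS_AFTER))
```

Under these assumptions the test averages about 25 jobs per iteration before the change, and the job count falls by about 78%. Since each job may launch a new worker process on Windows, a drop of that size plausibly explains most of the time saved.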

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HyukjinKwon/spark SPARK-21693-test

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19722.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19722
    
----
commit c359644d8d52ceddbbdd0e62bdcf85f95787850d
Author: hyukjinkwon <gu...@gmail.com>
Date:   2017-11-11T13:13:21Z

    Reduce max iterations in Linear SVM test in R to speed up AppVeyor build

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83725/
    Test PASSed.


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    **[Test build #83725 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83725/testReport)** for PR 19722 at commit [`4ee95d4`](https://github.com/apache/spark/commit/4ee95d472140deb4019797b53f6bfb34047b294b).


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    I checked that it sped up in https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1909-master but the test seems to hang in `SparkSQL functions`, which seems unrelated. I triggered three more builds above to verify it.


---



[GitHub] spark issue #19722: [SPARK-21693][R][ML] Reduce max iterations in Linear SVM...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    cc @felixcheung and @shivaram too. This one finally reduces the time!


---



[GitHub] spark issue #19722: [SPARK-21693][R][ML] Reduce max iterations in Linear SVM...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Thank you @srowen and @felixcheung for the review, and thanks @mgaido91 and @dongjoon-hyun for your thumbs up :D.


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    I think I need to make sure it reduces the time significantly before cc'ing someone to review. I expect a decrease of roughly 15 mins.


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Build started: [SparkR] `ALL` [![PR-19722](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=FCAF9D64-08C7-4199-8986-F2A6DD76A906&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/FCAF9D64-08C7-4199-8986-F2A6DD76A906)
    Build started: [SparkR] `ALL` [![PR-19722](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=930D0519-A427-4D8D-9372-757B31429E45&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/930D0519-A427-4D8D-9372-757B31429E45)
    Build started: [SparkR] `ALL` [![PR-19722](https://ci.appveyor.com/api/projects/status/github/spark-test/spark?branch=F80F59CA-9CC2-4273-82B6-6A0F95724E6A&svg=true)](https://ci.appveyor.com/project/spark-test/spark/branch/F80F59CA-9CC2-4273-82B6-6A0F95724E6A)


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    **[Test build #83725 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83725/testReport)** for PR 19722 at commit [`4ee95d4`](https://github.com/apache/spark/commit/4ee95d472140deb4019797b53f6bfb34047b294b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Yes, it looks so. Let me give it a try locally and push a commit to check.


---



[GitHub] spark issue #19722: [SPARK-21693][R][ML] Reduce max iterations in Linear SVM...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Merged to master and cherry-picked to 2.2.


---



[GitHub] spark pull request #19722: [SPARK-21693][R][ML] Reduce max iterations in Lin...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/19722


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    **[Test build #83724 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83724/testReport)** for PR 19722 at commit [`c359644`](https://github.com/apache/spark/commit/c359644d8d52ceddbbdd0e62bdcf85f95787850d).


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83724/
    Test PASSed.


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    **[Test build #83724 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83724/testReport)** for PR 19722 at commit [`c359644`](https://github.com/apache/spark/commit/c359644d8d52ceddbbdd0e62bdcf85f95787850d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19722: [SPARK-21693][R][ML] Reduce max iterations in Linear SVM...

Posted by felixcheung <gi...@git.apache.org>.
Github user felixcheung commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    LGTM, cool stuff, if we get the same result in fewer iterations.
    
    Let’s get this into 2.2 as well
    



---



[GitHub] spark issue #19722: [SPARK-21693][R][ML] Reduce max iterations in Linear SVM...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Yea, the first build sped up by roughly 30 minutes: https://ci.appveyor.com/project/ApacheSoftwareFoundation/spark/build/1909-master. I triggered more builds here with my account to be sure.


---



[GitHub] spark issue #19722: [WIP][SPARK-21693][R][ML] Reduce max iterations in Linea...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19722
  
    Merged build finished. Test PASSed.


---
