Posted to reviews@spark.apache.org by WeichenXu123 <gi...@git.apache.org> on 2017/12/01 07:00:33 UTC

[GitHub] spark pull request #19857: [SPARK-22667][ML] Fix model-specific optimization...

GitHub user WeichenXu123 opened a pull request:

    https://github.com/apache/spark/pull/19857

    [SPARK-22667][ML] Fix model-specific optimization support for ML tuning: Python API

    ## What changes were proposed in this pull request?
    
    Python CrossValidator/TrainValidationSplit:
    With base Estimator implemented in Scala/Java
    → Convert base Estimator to Scala/Java object, and call the JVM fit() (as in Weichen’s comment)
    With base Estimator implemented in Python
    → Python needs the same machinery for multi-model fitting and parallelism as Scala.  We can call directly into it. New API added:
    ```
    class Estimator:
      def parallelFit(self, dataset, paramMaps, threadPool, modelCallback):
    ```
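
The PR body gives only the signature of `parallelFit`. The following is a minimal sketch of how such a method might behave, assuming it fits one model per param map on the supplied thread pool and passes each finished model, together with its index, to `modelCallback`. The toy `Estimator` class, its `fit` method, the `"scale"` parameter, and the `(model, index)` callback arguments are all hypothetical and are not taken from the patch or from pyspark.

```python
from multiprocessing.pool import ThreadPool


class Estimator:
    """Toy stand-in for a Python-implemented estimator (not pyspark.ml.Estimator)."""

    def fit(self, dataset, params):
        # Hypothetical single-model fit: the "model" is the dataset mean
        # multiplied by a made-up "scale" hyperparameter.
        return {"params": params,
                "prediction": params["scale"] * sum(dataset) / len(dataset)}

    def parallelFit(self, dataset, paramMaps, threadPool, modelCallback):
        # Assumed semantics: fit one model per param map on the pool and
        # report each finished model to the callback along with its index.
        def fit_one(index):
            model = self.fit(dataset, paramMaps[index])
            modelCallback(model, index)

        # ThreadPool.map blocks until every task has completed.
        threadPool.map(fit_one, range(len(paramMaps)))


# Usage: collect the models by index as they are reported.
models = {}


def collect(model, index):
    models[index] = model


pool = ThreadPool(processes=2)
try:
    Estimator().parallelFit([1.0, 2.0, 3.0],
                            [{"scale": 1.0}, {"scale": 2.0}],
                            pool, collect)
finally:
    pool.close()
    pool.join()
```

Reporting models through a callback, instead of returning a list, lets a caller such as CrossValidator evaluate and discard each model as soon as it is ready. Whether the patch relies on this is not stated in the thread.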
    
    **Note:** This PR also fixes the `# TODO: persist average metrics as well` in CV/TVS. The test suite needs to check the consistency of `avgMetrics`, so this has to be fixed here.
    If this needs to be backported to older Spark versions, I can split it into a separate PR.
    
    ## How was this patch tested?
    
    Existing unit tests already cover every code path that needs testing.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/WeichenXu123/spark fix_model_spec_optim_py

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19857.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19857
    
----
commit 980c8ec87ddbc9f938942e78bb4cfe9753722bd2
Author: WeichenXu <we...@databricks.com>
Date:   2017-11-30T10:08:55Z

    init pr

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    **[Test build #84373 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84373/testReport)** for PR 19857 at commit [`c6f2250`](https://github.com/apache/spark/commit/c6f225025a1ba002b6aa4ce83fb67dbe742395b1).


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by WeichenXu123 <gi...@git.apache.org>.
Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    @MrBago @jkbradley I think this PR needs to be reviewed and merged before #19627 is reviewed,
    because this PR changes some critical code paths.


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    **[Test build #84373 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84373/testReport)** for PR 19857 at commit [`c6f2250`](https://github.com/apache/spark/commit/c6f225025a1ba002b6aa4ce83fb67dbe742395b1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84372/
    Test FAILed.


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/84373/
    Test PASSed.


---



[GitHub] spark issue #19857: [SPARK-22667][ML][WIP] Fix model-specific optimization s...

Posted by WeichenXu123 <gi...@git.apache.org>.
Github user WeichenXu123 commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    The design of this issue changed. @MrBago will take this over.


---



[GitHub] spark pull request #19857: [SPARK-22667][ML][WIP] Fix model-specific optimiz...

Posted by WeichenXu123 <gi...@git.apache.org>.
Github user WeichenXu123 closed the pull request at:

    https://github.com/apache/spark/pull/19857


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    **[Test build #84372 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84372/testReport)** for PR 19857 at commit [`980c8ec`](https://github.com/apache/spark/commit/980c8ec87ddbc9f938942e78bb4cfe9753722bd2).


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #19857: [SPARK-22667][ML] Fix model-specific optimization suppor...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19857
  
    **[Test build #84372 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/84372/testReport)** for PR 19857 at commit [`980c8ec`](https://github.com/apache/spark/commit/980c8ec87ddbc9f938942e78bb4cfe9753722bd2).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
