You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by bravo-zhang <gi...@git.apache.org> on 2017/08/03 00:41:46 UTC

[GitHub] spark pull request #18826: LogisticRegressionModel.toString should summarize...

GitHub user bravo-zhang opened a pull request:

    https://github.com/apache/spark/pull/18826

    LogisticRegressionModel.toString should summarize model

    ## What changes were proposed in this pull request?
    
    [SPARK-14712](https://issues.apache.org/jira/browse/SPARK-14712)
    spark.mllib LogisticRegressionModel overrides toString to print a little model info. We should do the same in spark.ml and override repr in pyspark.
    
    ## How was this patch tested?
    
    LogisticRegressionSuite.scala
    Python doctest in pyspark.ml.classification.py

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/bravo-zhang/spark spark-14712

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18826.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18826
    
----
commit a71869472c974b6270d7360726f05e8f83e3cfc3
Author: bravo-zhang <mz...@gmail.com>
Date:   2017-08-03T00:31:21Z

    LogisticRegressionModel.toString should summarize model

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #4183 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4183/testReport)** for PR 18826 at commit [`a718694`](https://github.com/apache/spark/commit/a71869472c974b6270d7360726f05e8f83e3cfc3).
     * This patch **fails PySpark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #92157 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92157/testReport)** for PR 18826 at commit [`96d9430`](https://github.com/apache/spark/commit/96d9430b42b4d8f18f6b9b835032fdf6c6783df1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #91531 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91531/testReport)** for PR 18826 at commit [`96d9430`](https://github.com/apache/spark/commit/96d9430b42b4d8f18f6b9b835032fdf6c6783df1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    retest this please


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #92240 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92240/testReport)** for PR 18826 at commit [`189acfb`](https://github.com/apache/spark/commit/189acfb7a32d545e74ef660a316084739f32f88d).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #92240 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92240/testReport)** for PR 18826 at commit [`189acfb`](https://github.com/apache/spark/commit/189acfb7a32d545e74ef660a316084739f32f88d).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/91531/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by bravo-zhang <gi...@git.apache.org>.
Github user bravo-zhang commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    This PR recently got tested so it draws my attention. Is this something we want to proceed? @holdenk @yanboliang @jkbradley @dbtsai I don't see how the test failures relate to LogisticRegression. Might it be some other flaky tests? Also tagging @HyukjinKwon who is active in maintaining stale PRs.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92157/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    ok to test


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #4187 has finished](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4187/testReport)** for PR 18826 at commit [`a718694`](https://github.com/apache/spark/commit/a71869472c974b6270d7360726f05e8f83e3cfc3).
     * This patch **fails Spark unit tests**.
     * This patch **does not merge cleanly**.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by bravo-zhang <gi...@git.apache.org>.
Github user bravo-zhang commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Hi @holdenk , I'm opening this PR to continue the effort in https://github.com/apache/spark/pull/12491
    When adding doctest, I noticed that the `uid` part of the string is always a random UUID, for example `LogisticRegression_42a29165218fb4d4f81d22`, generated in Scala even after I call `_resetUid`. So I re-construct the `__repr__` string in Python code.
    Really appreciate it if @yanboliang can also take a look.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by bravo-zhang <gi...@git.apache.org>.
Github user bravo-zhang commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    @HyukjinKwon Can you recommend someone to take a look at this PR or maybe you can take a look?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    @bravo-zhang, mind if I ask to rebase it and see if the tests pass? BTW, let's fix the PR title to link the JIRA.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    On the Python side using __repr__ looks reasonable (although I would like to see a doctest for this). But we really should get @jkbradley or maybe @dbtsai to take a look on the ML side.
    
    Jenkins ok to test.
    
    Sorry for the delay on getting to this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    I think you cc'ed right ones .. 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #4187 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4187/testReport)** for PR 18826 at commit [`a718694`](https://github.com/apache/spark/commit/a71869472c974b6270d7360726f05e8f83e3cfc3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    LGTM merging to master.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92240/
    Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Merged build finished. Test PASSed.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Thanks for the improvement :)


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18826: [SPARK-14712][ML] LogisticRegressionModel.toStrin...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18826


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #92157 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92157/testReport)** for PR 18826 at commit [`96d9430`](https://github.com/apache/spark/commit/96d9430b42b4d8f18f6b9b835032fdf6c6783df1).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: LogisticRegressionModel.toString should summarize model

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #4183 has started](https://amplab.cs.berkeley.edu/jenkins/job/NewSparkPullRequestBuilder/4183/testReport)** for PR 18826 at commit [`a718694`](https://github.com/apache/spark/commit/a71869472c974b6270d7360726f05e8f83e3cfc3).


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    **[Test build #91531 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/91531/testReport)** for PR 18826 at commit [`96d9430`](https://github.com/apache/spark/commit/96d9430b42b4d8f18f6b9b835032fdf6c6783df1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    Can one of the admins verify this patch?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18826: [SPARK-14712][ML] LogisticRegressionModel.toStrin...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18826#discussion_r197496908
  
    --- Diff: python/pyspark/ml/classification.py ---
    @@ -562,6 +564,11 @@ def evaluate(self, dataset):
             java_blr_summary = self._call_java("evaluate", dataset)
             return BinaryLogisticRegressionSummary(java_blr_summary)
     
    +    def __repr__(self):
    --- End diff --
    
    So my question here is why we aren't calling the Java/Scala toString method directly as we do in the mllib one and in many of the other models in `regression.py` for the ml one?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by bravo-zhang <gi...@git.apache.org>.
Github user bravo-zhang commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    @HyukjinKwon It's ready to test.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18826: [SPARK-14712][ML] LogisticRegressionModel.toString shoul...

Posted by bravo-zhang <gi...@git.apache.org>.
Github user bravo-zhang commented on the issue:

    https://github.com/apache/spark/pull/18826
  
    @holdenk @HyukjinKwon I updated `__repr__` by calling `toString`. I also added class name `LogisticRegressionModel: ` to `toString`. Otherwise `uid` alone is a bit confusing.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org