Posted to reviews@spark.apache.org by BryanCutler <gi...@git.apache.org> on 2016/02/27 01:27:38 UTC

[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

GitHub user BryanCutler opened a pull request:

    https://github.com/apache/spark/pull/11404

    [SPARK-12633][PYSPARK] [DOC] PySpark regression parameter desc to consistent format

    Part of task for [SPARK-11219](https://issues.apache.org/jira/browse/SPARK-11219) to make PySpark MLlib parameter description formatting consistent. This is for the regression module.  Also, updated 2 params in classification to read as `Supported values:` to be consistent.
    
    closes #10600 
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/BryanCutler/spark param-desc-consistent-regression-SPARK-12633

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/11404.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #11404
    
----
commit 146c2a8b340f2269612a11469205aa10414cbb13
Author: vijaykiran <ma...@vijaykiran.com>
Date:   2016-01-05T11:08:34Z

    [SPARK-12633][DOC] Update param descriptions
    
    Updates the param descriptions to be consistent. See [SPARK-11219] for
    more details.

commit d361d70806a9e758a9ee2986c144a89f6a0c7b63
Author: vijaykiran <ma...@vijaykiran.com>
Date:   2016-01-06T10:30:18Z

    Style Fixes
    
    Change fill-column to 100.

commit 45bec55b2f6bb165a0491e71bff6f2341a58b744
Author: vijaykiran <ma...@vijaykiran.com>
Date:   2016-01-22T14:21:51Z

    Limit parameter descriptions to col 74

commit 5feecbad219895696709d804facfb8c575d1d5b4
Author: vijaykiran <ma...@vijaykiran.com>
Date:   2016-01-23T07:13:15Z

    Fix indentation

commit 6aae1ba047f8126eb8b41838ab356078051f043d
Author: Bryan Cutler <cu...@gmail.com>
Date:   2016-02-26T23:00:45Z

    Merge remote-tracking branch 'upstream/master' into pr-10600

commit 2e535424dae80fad627c6c23965046f8680139f6
Author: Bryan Cutler <cu...@gmail.com>
Date:   2016-02-27T00:19:28Z

    [SPARK-12633] Fixed allowed values, cleanup, and sync with Scala API

commit 94d532dbb410f2a5b96a563f38e543edea66eb98
Author: Bryan Cutler <cu...@gmail.com>
Date:   2016-02-27T00:20:09Z

    Changed 'Allowed values:' -> 'Supported values:' to be consistent

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/11404#issuecomment-190609932
  
    Thank you for the PR - will do
    
    > On 29 Feb 2016, at 19:53, Bryan Cutler <no...@github.com> wrote:
    > 
    > Thanks for the help getting this in, @MLnick. If you come across any other issues I could help out on, feel free to ping me :)





[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on the pull request:

    https://github.com/apache/spark/pull/11404#issuecomment-190308961
  
    Thanks for the help getting this in, @MLnick. If you come across any other issues I could help out on, feel free to ping me :)




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11404#issuecomment-189541980
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11404#issuecomment-189541910
  
    **[Test build #52090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52090/consoleFull)** for PR 11404 at commit [`94d532d`](https://github.com/apache/spark/commit/94d532dbb410f2a5b96a563f38e543edea66eb98).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/11404#issuecomment-189538253
  
    **[Test build #52090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52090/consoleFull)** for PR 11404 at commit [`94d532d`](https://github.com/apache/spark/commit/94d532dbb410f2a5b96a563f38e543edea66eb98).




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/11404




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11404#discussion_r54321225
  
    --- Diff: python/pyspark/mllib/regression.py ---
    @@ -368,56 +365,53 @@ def load(cls, sc, path):
     
     class LassoWithSGD(object):
         """
    -    Train a regression model with L1-regularization using Stochastic Gradient Descent.
    -    This solves the L1-regularized least squares regression formulation
    -
    -        f(weights) = 1/2n ||A weights-y||^2  + regParam ||weights||_1
    -
    -    Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
    -    its corresponding right hand side label y.
    -    See also the documentation for the precise formulation.
    -
    --- End diff --
    
    Same here
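
For reference, the L1-regularized objective in the removed docstring can be sketched in plain Python. This is only an illustration, not the MLlib implementation: the function name `lasso_objective` and its arguments are hypothetical, and `1/2n` is assumed to mean `1/(2n)`.

```python
def lasso_objective(rows, labels, weights, reg_param):
    """Evaluate 1/(2n) * ||A w - y||^2 + reg_param * ||w||_1 (illustrative only)."""
    n = len(rows)
    # Sum of squared residuals over all rows of A.
    squared_error = sum(
        (sum(w * x for w, x in zip(weights, row)) - y) ** 2
        for row, y in zip(rows, labels)
    )
    # L1 penalty: sum of the absolute values of the weights.
    l1_norm = sum(abs(w) for w in weights)
    return squared_error / (2 * n) + reg_param * l1_norm
```

Here `rows` stands in for the rows of A and `labels` for the right-hand-side values y; in MLlib these would come from an RDD rather than Python lists.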




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11404#discussion_r54321208
  
    --- Diff: python/pyspark/mllib/regression.py ---
    @@ -217,19 +220,8 @@ def _regression_train_wrapper(train_func, modelClass, data, initial_weights):
     
     class LinearRegressionWithSGD(object):
         """
    -    Train a linear regression model with no regularization using Stochastic Gradient Descent.
    -    This solves the least squares regression formulation
    -
    -        f(weights) = 1/n ||A weights-y||^2
    -
    -    which is the mean squared error.
    -    Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
    -    its corresponding right hand side label y.
    -    See also the documentation for the precise formulation.
    -
    --- End diff --
    
    This comment block is repeated right below, so I thought it would be fine to remove it here
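
For reference, the unregularized least squares objective in the removed docstring, `f(weights) = 1/n ||A weights-y||^2`, can be sketched in plain Python. This is only an illustration, not the MLlib implementation: the function name `mean_squared_error` and its arguments are hypothetical.

```python
def mean_squared_error(rows, labels, weights):
    """Evaluate 1/n * ||A w - y||^2 for rows of A and labels y (illustrative only)."""
    n = len(rows)
    total = 0.0
    for row, label in zip(rows, labels):
        # Prediction is the dot product of the weights with one row of A.
        prediction = sum(w * x for w, x in zip(weights, row))
        total += (prediction - label) ** 2
    return total / n
```

Here `rows` stands in for the rows of A and `labels` for the right-hand-side values y; in MLlib these would come from an RDD rather than Python lists.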




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/11404#issuecomment-189541982
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52090/




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:

    https://github.com/apache/spark/pull/11404#discussion_r54321241
  
    --- Diff: python/pyspark/mllib/regression.py ---
    @@ -508,56 +502,53 @@ def load(cls, sc, path):
     
     class RidgeRegressionWithSGD(object):
         """
    -    Train a regression model with L2-regularization using Stochastic Gradient Descent.
    -    This solves the L2-regularized least squares regression formulation
    -
    -          f(weights) = 1/2n ||A weights-y||^2  + regParam/2 ||weights||^2
    -
    -    Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
    -    its corresponding right hand side label y.
    -    See also the documentation for the precise formulation.
    -
    --- End diff --
    
    This is also repeated below
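
For reference, the L2-regularized objective in the removed docstring can be sketched in plain Python. This is only an illustration, not the MLlib implementation: the function name `ridge_objective` and its arguments are hypothetical, and `1/2n` is assumed to mean `1/(2n)`.

```python
def ridge_objective(rows, labels, weights, reg_param):
    """Evaluate 1/(2n) * ||A w - y||^2 + reg_param/2 * ||w||^2 (illustrative only)."""
    n = len(rows)
    # Sum of squared residuals over all rows of A.
    squared_error = sum(
        (sum(w * x for w, x in zip(weights, row)) - y) ** 2
        for row, y in zip(rows, labels)
    )
    # L2 penalty: squared Euclidean norm of the weights.
    l2_norm_squared = sum(w * w for w in weights)
    return squared_error / (2 * n) + (reg_param / 2) * l2_norm_squared
```

Here `rows` stands in for the rows of A and `labels` for the right-hand-side values y; in MLlib these would come from an RDD rather than Python lists.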




[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/11404#issuecomment-190221851
  
    LGTM, merged into master. Thanks!

