Posted to reviews@spark.apache.org by BryanCutler <gi...@git.apache.org> on 2016/02/27 01:27:38 UTC
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
GitHub user BryanCutler opened a pull request:
https://github.com/apache/spark/pull/11404
[SPARK-12633][PYSPARK] [DOC] PySpark regression parameter desc to consistent format
Part of the task [SPARK-11219](https://issues.apache.org/jira/browse/SPARK-11219) to make PySpark MLlib parameter description formatting consistent. This PR covers the regression module. It also updates two params in classification to read `Supported values:` for consistency.
closes #10600
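For readers unfamiliar with the convention, here is a minimal sketch of what a consistently formatted parameter description might look like. The function `train_sketch` and its validation logic are hypothetical and exist only to carry the docstring; the layout (description on an indented line under each `:param:`, defaults in parentheses, and a `Supported values:` list) is an assumption modeled on the wording discussed in this PR, not a copy of the merged code.

```python
def train_sketch(data, iterations=100, step=1.0, regType=None):
    """
    Hypothetical training entry point, used only to illustrate the
    docstring layout. It does not train anything.

    :param data:
      The training data.
    :param iterations:
      The number of iterations.
      (default: 100)
    :param step:
      The step parameter used in SGD.
      (default: 1.0)
    :param regType:
      The type of regularizer used for training the model.
      Supported values:

        - "l1" for using L1 regularization
        - "l2" for using L2 regularization
        - None for no regularization (default)
    """
    # Reject anything outside the documented set of supported values.
    if regType not in ("l1", "l2", None):
        raise ValueError("Unsupported regType: %r" % (regType,))
    # Echo the resolved settings so the defaults are easy to inspect.
    return {"iterations": iterations, "step": step, "regType": regType}
```

Keeping the default and the supported values on their own lines makes each description easy to scan and to compare against the Scala API.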
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/BryanCutler/spark param-desc-consistent-regression-SPARK-12633
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11404.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11404
----
commit 146c2a8b340f2269612a11469205aa10414cbb13
Author: vijaykiran <ma...@vijaykiran.com>
Date: 2016-01-05T11:08:34Z
[SPARK-12633][DOC] Update param descriptions
Updates the param descriptions to be consistent. See [SPARK-11219] for
more details.
commit d361d70806a9e758a9ee2986c144a89f6a0c7b63
Author: vijaykiran <ma...@vijaykiran.com>
Date: 2016-01-06T10:30:18Z
Style Fixes
Change fill-column to 100.
commit 45bec55b2f6bb165a0491e71bff6f2341a58b744
Author: vijaykiran <ma...@vijaykiran.com>
Date: 2016-01-22T14:21:51Z
Limit parameter descriptions to col 74
commit 5feecbad219895696709d804facfb8c575d1d5b4
Author: vijaykiran <ma...@vijaykiran.com>
Date: 2016-01-23T07:13:15Z
Fix indentation
commit 6aae1ba047f8126eb8b41838ab356078051f043d
Author: Bryan Cutler <cu...@gmail.com>
Date: 2016-02-26T23:00:45Z
Merge remote-tracking branch 'upstream/master' into pr-10600
commit 2e535424dae80fad627c6c23965046f8680139f6
Author: Bryan Cutler <cu...@gmail.com>
Date: 2016-02-27T00:19:28Z
[SPARK-12633] Fixed allowed values, cleanup, and sync with Scala API
commit 94d532dbb410f2a5b96a563f38e543edea66eb98
Author: Bryan Cutler <cu...@gmail.com>
Date: 2016-02-27T00:20:09Z
Changed 'Allowed values:' -> 'Supported values:' to be consistent
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:
https://github.com/apache/spark/pull/11404#issuecomment-190609932
Thank you for the PR - will do
> On 29 Feb 2016, at 19:53, Bryan Cutler <no...@github.com> wrote:
>
> Thanks for the help getting this in, @MLnick. If you come across any other issues I could help out on, feel free to ping me :)
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on the pull request:
https://github.com/apache/spark/pull/11404#issuecomment-190308961
Thanks for the help getting this in, @MLnick. If you come across any other issues I could help out on, feel free to ping me :)
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11404#issuecomment-189541980
Merged build finished. Test PASSed.
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11404#issuecomment-189541910
**[Test build #52090 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52090/consoleFull)** for PR 11404 at commit [`94d532d`](https://github.com/apache/spark/commit/94d532dbb410f2a5b96a563f38e543edea66eb98).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/11404#issuecomment-189538253
**[Test build #52090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/52090/consoleFull)** for PR 11404 at commit [`94d532d`](https://github.com/apache/spark/commit/94d532dbb410f2a5b96a563f38e543edea66eb98).
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/11404
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/11404#discussion_r54321225
--- Diff: python/pyspark/mllib/regression.py ---
@@ -368,56 +365,53 @@ def load(cls, sc, path):
class LassoWithSGD(object):
"""
- Train a regression model with L1-regularization using Stochastic Gradient Descent.
- This solves the L1-regularized least squares regression formulation
-
- f(weights) = 1/2n ||A weights-y||^2 + regParam ||weights||_1
-
- Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
- its corresponding right hand side label y.
- See also the documentation for the precise formulation.
-
--- End diff --
Same here
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/11404#discussion_r54321208
--- Diff: python/pyspark/mllib/regression.py ---
@@ -217,19 +220,8 @@ def _regression_train_wrapper(train_func, modelClass, data, initial_weights):
class LinearRegressionWithSGD(object):
"""
- Train a linear regression model with no regularization using Stochastic Gradient Descent.
- This solves the least squares regression formulation
-
- f(weights) = 1/n ||A weights-y||^2
-
- which is the mean squared error.
- Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
- its corresponding right hand side label y.
- See also the documentation for the precise formulation.
-
--- End diff --
This comment block is repeated right below, so I thought it would be fine to remove it.
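As a rough illustration of the formulation quoted in the removed docstring, f(weights) = 1/n ||A weights-y||^2, the sketch below evaluates that mean squared error in plain Python. The function name `least_squares_objective` and the list-of-lists data layout are assumptions made for this example; this is not how `LinearRegressionWithSGD` computes or minimizes the objective.

```python
def least_squares_objective(rows, labels, weights):
    """Return 1/n * ||A weights - y||^2 for rows of A and labels y."""
    n = len(rows)
    total = 0.0
    for row, y in zip(rows, labels):
        # Dot product of one row of A with the weight vector.
        prediction = sum(a * w for a, w in zip(row, weights))
        # Accumulate the squared residual for this row.
        total += (prediction - y) ** 2
    return total / n


# Two rows of A with their right-hand-side labels y.
rows = [[1.0, 0.0], [0.0, 1.0]]
labels = [1.0, 2.0]
print(least_squares_objective(rows, labels, [1.0, 1.0]))
```

With weights `[1.0, 1.0]` the residuals are 0 and -1, so the objective is (0 + 1) / 2 = 0.5.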
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:
https://github.com/apache/spark/pull/11404#issuecomment-189541982
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/52090/
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by BryanCutler <gi...@git.apache.org>.
Github user BryanCutler commented on a diff in the pull request:
https://github.com/apache/spark/pull/11404#discussion_r54321241
--- Diff: python/pyspark/mllib/regression.py ---
@@ -508,56 +502,53 @@ def load(cls, sc, path):
class RidgeRegressionWithSGD(object):
"""
- Train a regression model with L2-regularization using Stochastic Gradient Descent.
- This solves the L2-regularized least squares regression formulation
-
- f(weights) = 1/2n ||A weights-y||^2 + regParam/2 ||weights||^2
-
- Here the data matrix has n rows, and the input RDD holds the set of rows of A, each with
- its corresponding right hand side label y.
- See also the documentation for the precise formulation.
-
--- End diff --
This is also repeated below.
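To make the two regularized formulations quoted in these diffs concrete, here is a small plain-Python sketch of the L1 objective from the LassoWithSGD docstring and the L2 objective from the RidgeRegressionWithSGD docstring. It assumes that "1/2n" means 1/(2n); the function names and data layout are hypothetical, and the code only evaluates the objectives rather than training a model.

```python
def _squared_residual_sum(rows, labels, weights):
    """Return ||A weights - y||^2 for rows of A and labels y."""
    total = 0.0
    for row, y in zip(rows, labels):
        prediction = sum(a * w for a, w in zip(row, weights))
        total += (prediction - y) ** 2
    return total


def lasso_objective(rows, labels, weights, reg_param):
    """1/(2n) ||A weights - y||^2 + regParam * ||weights||_1"""
    n = len(rows)
    l1_norm = sum(abs(w) for w in weights)
    return _squared_residual_sum(rows, labels, weights) / (2.0 * n) + reg_param * l1_norm


def ridge_objective(rows, labels, weights, reg_param):
    """1/(2n) ||A weights - y||^2 + regParam/2 * ||weights||^2"""
    n = len(rows)
    l2_norm_squared = sum(w * w for w in weights)
    return (_squared_residual_sum(rows, labels, weights) / (2.0 * n)
            + reg_param / 2.0 * l2_norm_squared)


rows = [[1.0, 0.0], [0.0, 1.0]]
labels = [1.0, 2.0]
weights = [1.0, 1.0]
print(lasso_objective(rows, labels, weights, 0.1))
print(ridge_objective(rows, labels, weights, 0.1))
```

The data term is identical in both; only the penalty differs, with the absolute values of the weights for L1 and half the sum of their squares for L2.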
[GitHub] spark pull request: [SPARK-12633][PYSPARK] [DOC] PySpark regressio...
Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:
https://github.com/apache/spark/pull/11404#issuecomment-190221851
LGTM, merged into master. Thanks!