Posted to commits@spark.apache.org by me...@apache.org on 2015/10/09 07:21:13 UTC

spark git commit: [SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters

Repository: spark
Updated Branches:
  refs/heads/master 67fbecbf3 -> 5410747a8


[SPARK-10959] [PYSPARK] StreamingLogisticRegressionWithSGD does not train with given regParam and convergenceTol parameters

These parameters were passed to the StreamingLogisticRegressionWithSGD constructor but never forwarded to the call that trains the model. StreamingLinearRegressionWithSGD had the same problem. I added the parameters as named arguments to the training call and also fixed the intercept argument, which was being passed positionally and was therefore interpreted as the regularization value.
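
The intercept mix-up can be illustrated without Spark. The sketch below uses a hypothetical stand-in named train whose parameter order is assumed to match LinearRegressionWithSGD.train at the time of this commit; the stand-in, its defaults, and the sample values are illustrative assumptions, not the Spark implementation.

```python
# Hypothetical stand-in: the parameter order is an assumption modelled on
# LinearRegressionWithSGD.train; it only reports how its arguments were bound.
def train(data, iterations=100, step=1.0, miniBatchFraction=1.0,
          initialWeights=None, regParam=0.0, regType=None, intercept=False,
          validateData=True, convergenceTol=0.001):
    return {
        "regParam": regParam,
        "intercept": intercept,
        "convergenceTol": convergenceTol,
    }

model_intercept = 0.5   # illustrative value for self._model.intercept
user_tol = 1e-6         # illustrative value for self.convergenceTol

# Before the fix: the sixth positional argument binds to regParam,
# and convergenceTol silently keeps its default.
before = train([], 50, 0.1, 1.0, [0.0], model_intercept)

# After the fix: keyword arguments bind to the intended parameters.
after = train([], 50, 0.1, 1.0, [0.0],
              intercept=model_intercept, convergenceTol=user_tol)

print(before)
print(after)
```

Passing trailing optional arguments by keyword avoids depending on the callee's parameter order, which is the approach the patch takes.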

Author: Bryan Cutler <bj...@us.ibm.com>

Closes #9002 from BryanCutler/StreamingSGD-convergenceTol-bug-10959.


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/5410747a
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/5410747a
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/5410747a

Branch: refs/heads/master
Commit: 5410747a84e9be1cea44159dfc2216d5e0728ab4
Parents: 67fbecb
Author: Bryan Cutler <bj...@us.ibm.com>
Authored: Thu Oct 8 22:21:07 2015 -0700
Committer: Xiangrui Meng <me...@databricks.com>
Committed: Thu Oct 8 22:21:07 2015 -0700

----------------------------------------------------------------------
 python/pyspark/mllib/classification.py | 3 ++-
 python/pyspark/mllib/regression.py     | 2 +-
 2 files changed, 3 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/5410747a/python/pyspark/mllib/classification.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/classification.py b/python/pyspark/mllib/classification.py
index cb4ee83..b777545 100644
--- a/python/pyspark/mllib/classification.py
+++ b/python/pyspark/mllib/classification.py
@@ -639,7 +639,8 @@ class StreamingLogisticRegressionWithSGD(StreamingLinearAlgorithm):
             if not rdd.isEmpty():
                 self._model = LogisticRegressionWithSGD.train(
                     rdd, self.numIterations, self.stepSize,
-                    self.miniBatchFraction, self._model.weights)
+                    self.miniBatchFraction, self._model.weights,
+                    regParam=self.regParam, convergenceTol=self.convergenceTol)
 
         dstream.foreachRDD(update)
 

http://git-wip-us.apache.org/repos/asf/spark/blob/5410747a/python/pyspark/mllib/regression.py
----------------------------------------------------------------------
diff --git a/python/pyspark/mllib/regression.py b/python/pyspark/mllib/regression.py
index 256b753..961b5e8 100644
--- a/python/pyspark/mllib/regression.py
+++ b/python/pyspark/mllib/regression.py
@@ -679,7 +679,7 @@ class StreamingLinearRegressionWithSGD(StreamingLinearAlgorithm):
                 self._model = LinearRegressionWithSGD.train(
                     rdd, self.numIterations, self.stepSize,
                     self.miniBatchFraction, self._model.weights,
-                    self._model.intercept)
+                    intercept=self._model.intercept, convergenceTol=self.convergenceTol)
 
         dstream.foreachRDD(update)
 

