Posted to issues@spark.apache.org by "Colin Beckingham (JIRA)" <ji...@apache.org> on 2016/07/29 02:09:20 UTC
[jira] [Comment Edited] (SPARK-16768) pyspark calls incorrect version of logistic regression

    [ https://issues.apache.org/jira/browse/SPARK-16768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15398571#comment-15398571 ] 

Colin Beckingham edited comment on SPARK-16768 at 7/29/16 2:08 AM:
-------------------------------------------------------------------

This is very strange, then. I can launch Spark 2.1.0 with pyspark and run "from pyspark.mllib.classification import LogisticRegressionWithLBFGS"; the import succeeds, and calling help on the imported class returns a description of what it does. If there is no longer an LBFGS version, shouldn't the import fail, or at least warn that the class is deprecated? I see from http://spark.apache.org/docs/latest/mllib-optimization.html that the implementation of LBFGS is an issue that is "being worked on". This raises the question of whether the version currently working in 1.6.2 is reliable; right now, running the same problem on both 1.6.2 and 2.1.0 produces a much faster and more accurate result on the former.


> pyspark calls incorrect version of logistic regression
> ------------------------------------------------------
>
>                 Key: SPARK-16768
>                 URL: https://issues.apache.org/jira/browse/SPARK-16768
>             Project: Spark
>          Issue Type: Bug
>          Components: MLlib, PySpark
>         Environment: Linux openSUSE Leap 42.1 Gnome
>            Reporter: Colin Beckingham
>             Fix For: 2.1.0
>
>
> In PySpark with Spark 1.6.2, calling "LogisticRegressionWithLBFGS.train()" runs "treeAggregate at LBFGS.scala:218", but the same call in PySpark with Spark 2.1 runs "treeAggregate at LogisticRegression.scala:1092". This non-optimized version is much slower and produces a different answer from LBFGS.
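
The comparison described above could be reproduced with something like the following sketch. It is not the reporter's actual script: the synthetic data, the helper names make_rows and train_and_time, and the hyperparameter values are all assumptions made for illustration. The hyperparameters are passed explicitly so that any difference in defaults between releases does not affect the comparison.

```python
import random
import time


def make_rows(n=1000, dim=3, seed=0):
    """Build synthetic (label, features) rows; the data is purely illustrative."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        features = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        # Label is 1.0 when the features sum to a positive value, else 0.0.
        label = 1.0 if sum(features) > 0 else 0.0
        rows.append((label, features))
    return rows


def train_and_time(sc, rows, iterations=100, reg_param=0.0, tolerance=1e-6):
    """Train with the RDD-based API and return (weights, seconds elapsed)."""
    # Imported lazily so make_rows() can be used without a Spark installation.
    from pyspark.mllib.classification import LogisticRegressionWithLBFGS
    from pyspark.mllib.regression import LabeledPoint

    points = sc.parallelize(
        [LabeledPoint(label, features) for label, features in rows]
    ).cache()
    points.count()  # materialize the cache so that only training is timed

    start = time.time()
    model = LogisticRegressionWithLBFGS.train(
        points,
        iterations=iterations,
        regParam=reg_param,   # assumed value, fixed for both releases
        tolerance=tolerance,  # assumed value, fixed for both releases
    )
    elapsed = time.time() - start
    return model.weights, elapsed
```

In the pyspark shell, where sc already exists, running train_and_time(sc, make_rows()) on each release and comparing the returned weights and timings alongside sc.version would show whether the two releases diverge on identical input. The stage names in the Spark UI should indicate which code path was taken.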


