Posted to issues@spark.apache.org by "DB Tsai (JIRA)" <ji...@apache.org> on 2016/11/12 01:43:58 UTC

[jira] [Resolved] (SPARK-18060) Avoid unnecessary standardization in multinomial logistic regression training

     [ https://issues.apache.org/jira/browse/SPARK-18060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DB Tsai resolved SPARK-18060.
-----------------------------
       Resolution: Fixed
    Fix Version/s: 2.1.0

Issue resolved by pull request 15593
[https://github.com/apache/spark/pull/15593]

> Avoid unnecessary standardization in multinomial logistic regression training
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-18060
>                 URL: https://issues.apache.org/jira/browse/SPARK-18060
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Seth Hendrickson
>            Assignee: Seth Hendrickson
>             Fix For: 2.1.0
>
>
> The MLOR (multinomial logistic regression) implementation in spark.ml trains the model in the standardized feature space by dividing the feature values by the column standard deviation in each iteration. We perform this computation many more times than necessary in order to achieve a sequential memory access pattern when computing the gradients. We can have both - sequential access patterns and reduced computation - if we use a column-major layout for the coefficients.
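
A minimal sketch of the idea described above, written in plain Python for illustration. It is not taken from the Spark source or from pull request 15593. The function names, the flat-list coefficient layout, and the division counter are all assumptions made for this example. It compares computing per-class margins with a row-major coefficient array (class index outermost) against a column-major one (feature index outermost). In both versions the inner loop walks the coefficient array sequentially, but only the column-major version can compute each standardized feature value once and reuse it for every class.

```python
def margins_row_major(coef, x, std, num_classes):
    """Hypothetical row-major layout: coef[k * num_features + j].

    Looping over classes first keeps coefficient access sequential,
    but x[j] / std[j] is recomputed once per class.
    """
    num_features = len(x)
    margins = [0.0] * num_classes
    divisions = 0  # counts standardization operations, for comparison only
    for k in range(num_classes):
        base = k * num_features
        for j in range(num_features):
            if std[j] != 0.0 and x[j] != 0.0:
                margins[k] += coef[base + j] * (x[j] / std[j])
                divisions += 1
    return margins, divisions


def margins_col_major(coef, x, std, num_classes):
    """Hypothetical column-major layout: coef[j * num_classes + k].

    Looping over features first lets us standardize each feature once,
    and the inner loop over classes still reads coef sequentially.
    """
    margins = [0.0] * num_classes
    divisions = 0  # counts standardization operations, for comparison only
    for j in range(len(x)):
        if std[j] != 0.0 and x[j] != 0.0:
            scaled = x[j] / std[j]  # computed once per feature
            divisions += 1
            base = j * num_classes
            for k in range(num_classes):
                margins[k] += coef[base + k] * scaled
    return margins, divisions


# Example data (made up): 3 classes, 2 features.
# Logical coefficient matrix, one row per class: [[1, 2], [3, 4], [5, 6]]
coef_row = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # row-major flattening
coef_col = [1.0, 3.0, 5.0, 2.0, 4.0, 6.0]  # column-major flattening
x = [2.0, 9.0]
std = [2.0, 3.0]

row_margins, row_divs = margins_row_major(coef_row, x, std, 3)
col_margins, col_divs = margins_col_major(coef_col, x, std, 3)
print(row_margins, row_divs)
print(col_margins, col_divs)
```

Both functions produce the same margins. The row-major version performs one division per (class, feature) pair, while the column-major version performs one per feature, so the saving grows with the number of classes.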



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
