Posted to issues@spark.apache.org by "Apache Spark (JIRA)" <ji...@apache.org> on 2016/10/21 23:33:58 UTC

[jira] [Commented] (SPARK-18060) Avoid unnecessary standardization in multinomial logistic regression training

    [ https://issues.apache.org/jira/browse/SPARK-18060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15596679#comment-15596679 ] 

Apache Spark commented on SPARK-18060:
--------------------------------------

User 'sethah' has created a pull request for this issue:
https://github.com/apache/spark/pull/15593

> Avoid unnecessary standardization in multinomial logistic regression training
> -----------------------------------------------------------------------------
>
>                 Key: SPARK-18060
>                 URL: https://issues.apache.org/jira/browse/SPARK-18060
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Seth Hendrickson
>
> The MLOR implementation in spark.ml trains the model in the standardized feature space by dividing the feature values by the column standard deviation in each iteration. This computation is currently performed many more times than necessary in order to achieve a sequential memory access pattern when computing the gradients. We can have both sequential access patterns and reduced computation if we use a column-major layout for the coefficients.
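
The trade-off described above can be illustrated with a minimal sketch. This is not the code from the pull request; the function names, the flat-list coefficient storage, and the division counter are assumptions made purely for illustration. With a row-major layout (all features of one class stored contiguously), sequential access means looping over classes in the outer loop, so each feature value is divided by its standard deviation once per class. With a column-major layout (all classes of one feature stored contiguously), the feature loop can be the outer loop, so each value is standardized once and reused for every class.

```python
def margins_row_major(coef, features, features_std, num_classes):
    """Margins with coef[k * num_features + j] = weight of class k, feature j.

    Hypothetical helper: walks the coefficients sequentially, but divides
    each feature by its standard deviation once per class.
    """
    num_features = len(features)
    margins = [0.0] * num_classes
    divisions = 0  # counts standardization operations, for comparison only
    for k in range(num_classes):
        for j in range(num_features):
            if features[j] != 0.0 and features_std[j] != 0.0:
                std_value = features[j] / features_std[j]
                divisions += 1
                margins[k] += coef[k * num_features + j] * std_value
    return margins, divisions


def margins_col_major(coef, features, features_std, num_classes):
    """Margins with coef[j * num_classes + k] = weight of class k, feature j.

    Hypothetical helper: still walks the coefficients sequentially, but
    standardizes each feature value only once.
    """
    margins = [0.0] * num_classes
    divisions = 0  # counts standardization operations, for comparison only
    for j, value in enumerate(features):
        if value != 0.0 and features_std[j] != 0.0:
            std_value = value / features_std[j]
            divisions += 1
            for k in range(num_classes):
                margins[k] += coef[j * num_classes + k] * std_value
    return margins, divisions
```

Both functions return the same margins for the same logical coefficient matrix; under these assumptions the column-major version performs one division per non-zero feature, rather than one per non-zero feature per class.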



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
