Posted to issues@spark.apache.org by "DB Tsai (JIRA)" <ji...@apache.org> on 2015/06/13 03:33:00 UTC

[jira] [Resolved] (SPARK-8314) improvement in performance of MLUtils.appendBias

     [ https://issues.apache.org/jira/browse/SPARK-8314?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DB Tsai resolved SPARK-8314.
----------------------------
    Resolution: Fixed

Merged into master
https://github.com/apache/spark/commit/6e9c3ff1ecaf12a0126d83f27f5a4153ae420a34

> improvement in performance of MLUtils.appendBias
> ------------------------------------------------
>
>                 Key: SPARK-8314
>                 URL: https://issues.apache.org/jira/browse/SPARK-8314
>             Project: Spark
>          Issue Type: Improvement
>          Components: MLlib
>    Affects Versions: 1.4.0
>            Reporter: Roger Menezes
>             Fix For: 1.5.0
>
>
> The MLUtils.appendBias method is heavily used to create intercepts for linear models. It currently uses Breeze's vector concatenation, which is very slow compared to a plain System.arraycopy. This improvement changes the implementation to use System.arraycopy.
> We saw the following performance improvements after the change.
> Benchmark on the MNIST dataset, repeated 50 times:
> MLUtils.appendBias (SparseVector Before): 47320 ms
> MLUtils.appendBias (SparseVector After): 1935 ms
> MLUtils.appendBias (DenseVector Before): 5340 ms
> MLUtils.appendBias (DenseVector After): 4080 ms
> This is roughly a 24x speedup for SparseVectors (47320 ms / 1935 ms is about 24.5) and about a 1.3x speedup for DenseVectors (5340 ms / 4080 ms).
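
For readers who want to see the idea concretely, here is a minimal sketch of appending a bias term with System.arraycopy instead of a generic vector concatenation. It is not the code from the merged commit. The class AppendBiasSketch, the Sparse holder, and both method names are hypothetical stand-ins; the sketch only assumes that a sparse vector is a logical size plus parallel index and value arrays, and that the bias is the value 1.0 appended at the last position.

```java
import java.util.Arrays;

// Hypothetical illustration only; these names do not come from Spark.
class AppendBiasSketch {

    /** Stand-in for a sparse vector: logical size plus parallel index/value arrays. */
    static final class Sparse {
        final int size;
        final int[] indices;
        final double[] values;

        Sparse(int size, int[] indices, double[] values) {
            this.size = size;
            this.indices = indices;
            this.values = values;
        }
    }

    /** Dense case: copy the values into an array one slot longer, then set the bias. */
    static double[] appendBiasDense(double[] values) {
        double[] out = new double[values.length + 1];
        System.arraycopy(values, 0, out, 0, values.length);
        out[values.length] = 1.0;
        return out;
    }

    /** Sparse case: copy only the stored entries, then add one entry at the new last index. */
    static Sparse appendBiasSparse(Sparse v) {
        int nnz = v.values.length;
        int[] indices = new int[nnz + 1];
        double[] values = new double[nnz + 1];
        System.arraycopy(v.indices, 0, indices, 0, nnz);
        System.arraycopy(v.values, 0, values, 0, nnz);
        indices[nnz] = v.size; // the bias sits at the old size, i.e. the new last position
        values[nnz] = 1.0;
        return new Sparse(v.size + 1, indices, values);
    }

    public static void main(String[] args) {
        double[] dense = appendBiasDense(new double[] {0.5, 2.0});
        System.out.println(Arrays.toString(dense)); // prints [0.5, 2.0, 1.0]

        Sparse s = appendBiasSparse(new Sparse(4, new int[] {1, 3}, new double[] {7.0, 9.0}));
        // prints 5 [1, 3, 4] [7.0, 9.0, 1.0]
        System.out.println(s.size + " " + Arrays.toString(s.indices) + " " + Arrays.toString(s.values));
    }
}
```

In the sparse case the work is proportional to the number of stored entries rather than the full vector length, which may be part of why the reported gain is so much larger for SparseVectors than for DenseVectors.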


