Posted to issues@spark.apache.org by "Roger Menezes (JIRA)" <ji...@apache.org> on 2015/06/12 00:34:01 UTC

[jira] [Created] (SPARK-8314) improvement in performance of MLUtils.appendBias

Roger Menezes created SPARK-8314:
------------------------------------

             Summary: improvement in performance of MLUtils.appendBias
                 Key: SPARK-8314
                 URL: https://issues.apache.org/jira/browse/SPARK-8314
             Project: Spark
          Issue Type: Improvement
          Components: MLlib
    Affects Versions: 1.4.0
            Reporter: Roger Menezes
             Fix For: 1.5.0


The MLUtils.appendBias method is heavily used to add intercept terms for linear models. It currently uses Breeze's vector concatenation, which is very slow compared to a plain System.arraycopy. This improvement changes the implementation to use System.arraycopy.
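To illustrate the idea, here is a minimal sketch in plain Java; it is not the actual patch. The class and method names (AppendBiasSketch, SparseVec, appendBiasDense, appendBiasSparse) are hypothetical stand-ins, not MLlib types, and the bias value of 1.0 is an assumption. For a sparse vector, only the stored indices and values are copied, and the bias becomes one extra entry at the old logical size.

```java
import java.util.Arrays;

final class AppendBiasSketch {

    // Hypothetical stand-in for a sparse vector: logical size plus
    // parallel arrays of non-zero indices and their values.
    static final class SparseVec {
        final int size;
        final int[] indices;
        final double[] values;

        SparseVec(int size, int[] indices, double[] values) {
            this.size = size;
            this.indices = indices;
            this.values = values;
        }
    }

    // Dense case: copy every entry into an array one slot longer,
    // then write the bias (assumed to be 1.0) into the last slot.
    static double[] appendBiasDense(double[] values) {
        double[] out = new double[values.length + 1];
        System.arraycopy(values, 0, out, 0, values.length);
        out[values.length] = 1.0;
        return out;
    }

    // Sparse case: copy only the stored entries, then add one entry
    // whose index is the old logical size.
    static SparseVec appendBiasSparse(SparseVec v) {
        int nnz = v.values.length;
        int[] indices = new int[nnz + 1];
        double[] values = new double[nnz + 1];
        System.arraycopy(v.indices, 0, indices, 0, nnz);
        System.arraycopy(v.values, 0, values, 0, nnz);
        indices[nnz] = v.size;
        values[nnz] = 1.0;
        return new SparseVec(v.size + 1, indices, values);
    }

    public static void main(String[] args) {
        double[] dense = appendBiasDense(new double[] {0.5, 0.0, 2.0});
        System.out.println(Arrays.toString(dense));

        SparseVec sparse = appendBiasSparse(
            new SparseVec(5, new int[] {1, 3}, new double[] {4.0, 7.0}));
        System.out.println(sparse.size + " "
            + Arrays.toString(sparse.indices) + " "
            + Arrays.toString(sparse.values));
    }
}
```

In the sparse case the work is proportional to the number of stored entries rather than the vector's logical size, which fits the much larger gain reported for SparseVectors.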

We saw the following performance improvements after the change:
Benchmark on the MNIST dataset, repeated 50 times:
MLUtils.appendBias (SparseVector Before): 47320 ms
MLUtils.appendBias (SparseVector After): 1935 ms

MLUtils.appendBias (DenseVector Before): 5340 ms
MLUtils.appendBias (DenseVector After): 4080 ms

This is roughly a 24x speedup for SparseVectors (47320 ms / 1935 ms ≈ 24.5) and about a 1.3x speedup for DenseVectors (5340 ms / 4080 ms ≈ 1.3).



