Posted to issues@mahout.apache.org by "Pat Ferrel (JIRA)" <ji...@apache.org> on 2017/10/03 20:58:00 UTC

[jira] [Issue Comment Deleted] (MAHOUT-2019) SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized

     [ https://issues.apache.org/jira/browse/MAHOUT-2019?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Pat Ferrel updated MAHOUT-2019:
-------------------------------
    Comment: was deleted

(was: This may be a non-issue: 

Trevor said in email:

{quote}The spark is included via maven classifier-

the sbt line should be

libraryDependencies += "org.apache.mahout" % "mahout-spark_2.11" % "0.13.1-SNAPSHOT" classifier "spark_2.1"


{quote})

> SparseRowMatrix assign ops use for loops instead of iterateNonZero and so can be optimized
> ------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-2019
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-2019
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.13.0
>            Reporter: Pat Ferrel
>            Assignee: Pat Ferrel
>             Fix For: 0.13.1
>
>
> DRMs get blockified into SparseRowMatrix instances if the density is low. But SRM inherits the implementation of methods like "assign" from AbstractMatrix, which uses nested for loops to traverse every cell of every row. When multiplying two matrices that are extremely sparse, the kind of data you see in collaborative filtering, this is extremely wasteful of execution time. It is better to use a sparse vector's iterateNonZero Iterator for the function types where skipping zero cells does not change the result.
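
The contrast described in the issue can be sketched roughly as follows. This is only an illustration under stated assumptions, not Mahout's code: a sparse row is modeled as a plain java.util.Map from column index to non-zero value, and the class and method names (SparseAssignSketch, assignDense, assignNonZero) are hypothetical. It also assumes that "some function types" means functions that map zero to zero, since only those leave the unstored cells unchanged.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

// Illustrative only: a sparse row is modeled as a map from column index to
// non-zero value. This is NOT Mahout's SparseRowMatrix or Vector API.
final class SparseAssignSketch {

    // Dense pattern: nested loops touch every cell, zero or not.
    // Returns the number of cells visited.
    static long assignDense(List<Map<Integer, Double>> rows, int numCols,
                            DoubleUnaryOperator f) {
        long visited = 0;
        for (Map<Integer, Double> row : rows) {
            for (int col = 0; col < numCols; col++) {
                double updated = f.applyAsDouble(row.getOrDefault(col, 0.0));
                if (updated != 0.0) {
                    row.put(col, updated);   // store the non-zero result
                } else {
                    row.remove(col);         // keep zeros unstored
                }
                visited++;
            }
        }
        return visited;
    }

    // Sparse pattern: visit only the stored entries. This is valid only when
    // f(0) == 0; otherwise the zero cells would change too, so fall back to
    // the dense pass.
    static long assignNonZero(List<Map<Integer, Double>> rows, int numCols,
                              DoubleUnaryOperator f) {
        if (f.applyAsDouble(0.0) != 0.0) {
            return assignDense(rows, numCols, f);
        }
        long visited = 0;
        for (Map<Integer, Double> row : rows) {
            Iterator<Map.Entry<Integer, Double>> it = row.entrySet().iterator();
            while (it.hasNext()) {
                Map.Entry<Integer, Double> entry = it.next();
                double updated = f.applyAsDouble(entry.getValue());
                if (updated == 0.0) {
                    it.remove();             // result became zero: drop the entry
                } else {
                    entry.setValue(updated);
                }
                visited++;
            }
        }
        return visited;
    }

    // Builds a 3 x 1000 matrix with three non-zero cells (hypothetical data).
    static List<Map<Integer, Double>> sampleRows() {
        List<Map<Integer, Double>> rows = new ArrayList<>();
        for (int i = 0; i < 3; i++) {
            rows.add(new HashMap<>());
        }
        rows.get(0).put(1, 1.0);
        rows.get(0).put(500, 2.0);
        rows.get(2).put(999, 3.0);
        return rows;
    }

    public static void main(String[] args) {
        DoubleUnaryOperator timesTwo = x -> 2.0 * x;
        long dense = assignDense(sampleRows(), 1000, timesTwo);
        long sparse = assignNonZero(sampleRows(), 1000, timesTwo);
        System.out.println("cells visited, dense pass:  " + dense);
        System.out.println("cells visited, sparse pass: " + sparse);
    }
}
```

With the sample data the dense pass visits all 3 x 1000 cells, while the sparse pass visits only the three stored entries. The gap grows with the number of columns and shrinks as density rises, which is why the sparse path matters most for collaborative-filtering style data. A function such as x -> x + 1 does not map zero to zero, so the sketch falls back to the dense pass for it.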



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)