Posted to issues@spark.apache.org by "Gaurav Mishra (JIRA)" <ji...@apache.org> on 2014/09/19 10:10:33 UTC

[jira] [Commented] (SPARK-3434) Distributed block matrix

    [ https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140160#comment-14140160 ] 

Gaurav Mishra commented on SPARK-3434:
--------------------------------------

Representing a matrix by multiple RDDs of sub-matrices may help when an operation needs to compute over only a small subset of those sub-matrices. However, operations like matrix multiplication require computation over all elements of the matrix (i.e. every element has to be read). Therefore, at least for matrix multiplication, keeping a single RDD seems to be the better idea. Keeping multiple RDDs in that case would only add the burden of keeping track of all the sub-matrices.
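
To make the point about multiplication concrete, here is a minimal sketch in plain Python (no Spark involved). It assumes each matrix is held as one keyed collection of blocks, a dict from (block_row, block_col) to a dense sub-matrix, which stands in for a single RDD of keyed sub-matrices. All function names are hypothetical and chosen for illustration only; this is not how MLlib implements anything.

```python
from collections import defaultdict


def split_into_blocks(matrix, block_size):
    """Cut a dense list-of-lists matrix into blocks keyed by (block_row, block_col)."""
    blocks = {}
    for i in range(0, len(matrix), block_size):
        for j in range(0, len(matrix[0]), block_size):
            blocks[(i // block_size, j // block_size)] = [
                row[j:j + block_size] for row in matrix[i:i + block_size]
            ]
    return blocks


def multiply_dense(a, b):
    """Plain dense product of two list-of-lists matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add_dense(a, b):
    """Element-wise sum of two equally sized list-of-lists matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def multiply_blocked(a_blocks, b_blocks):
    """Compute C[i, j] = sum over k of A[i, k] * B[k, j] on keyed blocks.

    Returns the product blocks plus the sets of A and B block keys that
    were read, so the caller can see how much of each input was touched.
    """
    # Group B's blocks by block row k so they can be matched with A[i, k].
    b_by_row = defaultdict(list)
    for (k, j), block in b_blocks.items():
        b_by_row[k].append((j, block))

    c_blocks = {}
    read_a, read_b = set(), set()
    for (i, k), a_block in a_blocks.items():
        for j, b_block in b_by_row[k]:
            read_a.add((i, k))
            read_b.add((k, j))
            product = multiply_dense(a_block, b_block)
            if (i, j) in c_blocks:
                c_blocks[(i, j)] = add_dense(c_blocks[(i, j)], product)
            else:
                c_blocks[(i, j)] = product
    return c_blocks, read_a, read_b
```

For two matrices split into a 2 x 2 grid of blocks, read_a and read_b come back equal to the full key sets of the inputs: every sub-matrix is read, so there is no subset of sub-matrices that the operation could skip.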

> Distributed block matrix
> ------------------------
>
>                 Key: SPARK-3434
>                 URL: https://issues.apache.org/jira/browse/SPARK-3434
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: Xiangrui Meng
>
> This JIRA is for discussing distributed matrices stored in block sub-matrices. The main challenge is the partitioning scheme to allow adding linear algebra operations in the future, e.g.:
> 1. matrix multiplication
> 2. matrix factorization (QR, LU, ...)
> Let's discuss the partitioning and storage and how they fit into the above use cases.
> Questions:
> 1. Should it be backed by a single RDD that contains all of the sub-matrices, or by many RDDs, each containing only one sub-matrix?


