Posted to dev@spark.apache.org by liaoyuxi <li...@huawei.com> on 2014/11/18 07:50:23 UTC

Re: matrix computation in spark

Hi,
I looked at ml-matrix. For now, it doesn't include matrix multiplication or LU decomposition. What is your plan for these? Could we contribute our work on these parts?
Also, the number of row/column blocks there is chosen manually. As we mentioned, the CARMA method in the paper is communication-optimal.
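
To illustrate how block counts could be derived rather than chosen by hand, here is a minimal sketch of the splitting rule CARMA is built on: at each step, halve the largest of the three dimensions. It is a simplified illustration only. It assumes the worker count is a power of two and ignores the memory-dependent choice between BFS and DFS steps that the paper describes. The function name `carma_style_splits` and its parameters are hypothetical, not part of MLlib or ml-matrix.

```python
def carma_style_splits(m, k, n, num_workers):
    """Count how many pieces each dimension is cut into for A (m x k) times B (k x n).

    Simplified assumption: num_workers is a power of two, and every halving
    step hands one half of the problem to one half of the workers.
    """
    pieces = {"m": 1, "k": 1, "n": 1}
    dims = {"m": float(m), "k": float(k), "n": float(n)}
    workers = num_workers
    while workers > 1:
        # Pick the currently largest dimension (ties resolve in m, k, n order).
        largest = max(dims, key=dims.get)
        dims[largest] /= 2
        pieces[largest] *= 2
        workers //= 2
    return pieces


if __name__ == "__main__":
    # A tall-skinny product is only ever cut along its long dimension.
    print(carma_style_splits(8, 2, 2, 4))
    # A cube-shaped product is cut once along each dimension.
    print(carma_style_splits(4, 4, 4, 8))
```

The point of the rule is that the block grid follows the shape of the matrices, so a tall-skinny product and a square product end up with different partitions for the same number of workers.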

From: Zongheng Yang [mailto:zongheng.y@gmail.com]
Sent: November 18, 2014 11:37
To: liaoyuxi; dev@spark.incubator.apache.org
Cc: Shivaram Venkataraman
Subject: Re: matrix computation in spark

There's been some work at the AMPLab on a distributed matrix library on top of Spark; see here [1]. In particular, the repo contains a couple of factorization algorithms.

[1] https://github.com/amplab/ml-matrix

Zongheng

On Mon Nov 17 2014 at 7:34:17 PM liaoyuxi <li...@huawei.com> wrote:
Hi,
Matrix computation is critical to the efficiency of algorithms such as least squares, the Kalman filter, and so on.
For now, the MLlib module offers only limited linear algebra on matrices, especially on distributed matrices.

We have been working on distributed matrix computation APIs built on the data structures in MLlib.
The main idea is to partition the matrix into sub-blocks, following the strategy in this paper:
http://www.cs.berkeley.edu/~odedsc/papers/bfsdfs-mm-ipdps13.pdf
In our experiments, this partitioning is communication-optimal.
However, operations such as factorization may not be well suited to block-wise computation.
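
As a rough illustration of the sub-block idea, the sketch below partitions two small dense matrices into blocks and multiplies them block by block on a single machine. It is not the MLlib-based implementation described above. All function names are hypothetical, the matrices are plain Python lists, and the block sizes are picked by hand so that the column blocks of A line up with the row blocks of B.

```python
def split_blocks(mat, block_rows, block_cols):
    """Partition a dense matrix (list of rows) into a dict keyed by (block_row, block_col)."""
    blocks = {}
    for bi, r0 in enumerate(range(0, len(mat), block_rows)):
        for bj, c0 in enumerate(range(0, len(mat[0]), block_cols)):
            blocks[(bi, bj)] = [row[c0:c0 + block_cols] for row in mat[r0:r0 + block_rows]]
    return blocks


def matmul(a, b):
    """Plain dense product of two small local matrices."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def block_matmul(a_blocks, b_blocks):
    """Compute C[i, j] = sum over k of A[i, k] * B[k, j], one block product at a time."""
    c_blocks = {}
    for (i, k), a_blk in a_blocks.items():
        for (k2, j), b_blk in b_blocks.items():
            if k != k2:
                continue  # only blocks sharing the inner index contribute
            prod = matmul(a_blk, b_blk)
            c_blocks[(i, j)] = add(c_blocks[(i, j)], prod) if (i, j) in c_blocks else prod
    return c_blocks


def assemble(blocks):
    """Stitch a block dict back into one dense matrix."""
    n_bi = max(i for i, _ in blocks) + 1
    n_bj = max(j for _, j in blocks) + 1
    out = []
    for i in range(n_bi):
        for r in range(len(blocks[(i, 0)])):
            out.append([x for j in range(n_bj) for x in blocks[(i, j)][r]])
    return out


if __name__ == "__main__":
    A = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]]   # 3 x 4
    B = [[1, 0], [0, 1], [2, 3], [4, 5]]                # 4 x 2
    C = assemble(block_matmul(split_blocks(A, 2, 2), split_blocks(B, 2, 1)))
    print(C == matmul(A, B))
```

In a distributed setting each block product would run on a different worker and the partial sums for the same (i, j) would be combined in a reduce step, which is where the choice of partition drives communication cost.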

Any suggestions and guidance are welcome.

Thanks,
Yuxi