Posted to reviews@spark.apache.org by hhbyyh <gi...@git.apache.org> on 2016/03/22 15:36:31 UTC

[GitHub] spark pull request: [SPARK-11507] [MLlib] add compact in Matrices ...

GitHub user hhbyyh reopened a pull request:

    https://github.com/apache/spark/pull/9520

    [SPARK-11507] [MLlib] add compact in Matrices fromBreeze

    jira: https://issues.apache.org/jira/browse/SPARK-11507
    "In certain situations when adding two block matrices, I get an error regarding colPtr and the operation fails. External issue URL includes full error and code for reproducing the problem."
    
    root cause: colPtr.last does NOT always equal values.length in breeze CSCMatrix, which fails the require in SparseMatrix.
    
    easy step to repro:
    ```
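    import breeze.linalg.{CSCMatrix, Matrix => BM}
    import org.apache.spark.mllib.linalg.Matrices

    val m1: BM[Double] = new CSCMatrix[Double](Array(1.0, 1, 1), 3, 3, Array(0, 1, 2, 3), Array(0, 1, 2))
    val m2: BM[Double] = new CSCMatrix[Double](Array(1.0, 2, 2, 4), 3, 3, Array(0, 0, 2, 4), Array(1, 2, 1, 2))
    val sum = m1 + m2
    // Without this change, the call below fails the require in SparseMatrix.
    Matrices.fromBreeze(sum)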
    ```
    
    Solution: According to the code in [CSCMatrix](https://github.com/scalanlp/breeze/blob/28000a7b901bc3cfbbbf5c0bce1d0a5dda8281b0/math/src/main/scala/breeze/linalg/CSCMatrix.scala), a CSCMatrix in breeze can have extra zeros at the end of its data array. Invoking compact makes sure the matrix satisfies the require in SparseMatrix. This should add limited overhead, as the actual compact operation is only performed when necessary.
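    
    To illustrate the invariant and the "compact only when necessary" idea, here is a minimal, dependency-free sketch. `Csc`, `isCompact`, and `compacted` are hypothetical names for illustration only; they are not the breeze or Spark classes, and the padded capacity of 7 is an arbitrary assumption, not what breeze actually allocates. The stored entries correspond to the sum of m1 and m2 from the repro above.
    
    ```scala
    // Hypothetical stand-in for a CSC matrix (NOT breeze's CSCMatrix or Spark's SparseMatrix).
    final case class Csc(
        numRows: Int,
        numCols: Int,
        colPtrs: Array[Int],
        rowIndices: Array[Int],
        values: Array[Double]) {

      // Number of entries actually in use, as recorded by the last column pointer.
      def used: Int = colPtrs.last

      // The condition described above: colPtrs.last must equal values.length
      // (the row index array is checked the same way here).
      def isCompact: Boolean =
        used == values.length && used == rowIndices.length

      // Copy the arrays only when there is trailing slack to drop;
      // an already compact matrix is returned as-is.
      def compacted: Csc =
        if (isCompact) this
        else copy(rowIndices = rowIndices.take(used), values = values.take(used))
    }

    object CompactSketch {
      def main(args: Array[String]): Unit = {
        // 3x3 matrix with 5 stored entries in backing arrays of (assumed) capacity 7.
        val padded = Csc(
          3, 3,
          Array(0, 1, 3, 5),
          Array(0, 1, 2, 1, 2, 0, 0),
          Array(1.0, 2.0, 2.0, 2.0, 5.0, 0.0, 0.0))

        println(s"padded is compact: ${padded.isCompact}")
        val fixed = padded.compacted
        println(s"compacted is compact: ${fixed.isCompact}")
        println(s"values after compaction: ${fixed.values.mkString(", ")}")
      }
    }
    ```
    
    In this sketch, calling `compacted` on an already compact instance returns the same object, which mirrors why the extra call should be cheap in the common case.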
    


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/hhbyyh/spark matricesFromBreeze

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9520.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9520
    
----
commit 50eee83d39d146df83d4aa9d76a1cad49669f9b1
Author: Yuhao Yang <hh...@gmail.com>
Date:   2015-11-06T09:32:37Z

    add compact in Matrices fromBreeze

----

