Posted to reviews@spark.apache.org by dusenberrymw <gi...@git.apache.org> on 2015/11/03 19:58:51 UTC

[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

GitHub user dusenberrymw opened a pull request:

    https://github.com/apache/spark/pull/9441

    [WIP] [SPARK-9656] [MLlib] [Python] Add missing methods to PySpark's Distributed Linear Algebra Classes

    This PR adds the remaining group of methods to PySpark's distributed linear algebra classes as follows:
    
    * `RowMatrix` <sup>**[1]**</sup>
      1. `computeGramianMatrix`
      2. `computeCovariance`
      3. `computeColumnSummaryStatistics`
      4. `columnSimilarities`
      5. `tallSkinnyQR` <sup>**[2]**</sup>
    * `IndexedRowMatrix` <sup>**[3]**</sup>
      1. `computeGramianMatrix`
    * `CoordinateMatrix`
      1. `transpose`
    * `BlockMatrix`
      1. `validate`
      2. `cache`
      3. `persist`
      4. `transpose`
    
    **[1]**: Note: `multiply`, `computeSVD`, and `computePrincipalComponents` are already part of PR #7963 for SPARK-6227.
    **[2]**: Implementing `tallSkinnyQR` uncovered a bug with our PySpark `RowMatrix` constructor.  As discussed on the dev list [here](http://apache-spark-developers-list.1001551.n3.nabble.com/K-Means-And-Class-Tags-td10038.html), there appears to be an issue with type erasure with RDDs coming from Java, and by extension from PySpark.  Although we are attempting to construct a `RowMatrix` from an `RDD[Vector]` in [PythonMLLibAPI](https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/api/python/PythonMLLibAPI.scala#L1115), the `Vector` type is erased, resulting in an `RDD[Object]`.  Thus, when calling Scala's `tallSkinnyQR` from PySpark, we get a Java `ClassCastException` in which an `Object` cannot be cast to a Spark `Vector`.  As noted in the aforementioned dev list thread, this issue was also encountered with `DecisionTrees`, and the fix involved an explicit `retag` of the RDD with a `Vector` type.  Thus, this PR currently contains that fix applied to the `createRowMatrix` helper function in `PythonMLLibAPI`.  `IndexedRowMatrix` and `CoordinateMatrix` do not appear to have this issue, likely because their related helper functions in `PythonMLLibAPI` create the RDDs explicitly from DataFrames with pattern matching, thus preserving the types.  However, this fix may be out of scope for this single PR, and it may be better suited to a separate JIRA/PR.  Therefore, I have marked this PR as WIP and am open to discussion.
    **[3]**: Note: `multiply` and `computeSVD` are already part of PR #7963 for SPARK-6227.
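
For readers unfamiliar with the operations listed above, here is a minimal local sketch of what two of the `RowMatrix` methods compute mathematically. It uses plain Python lists as a stand-in for a distributed matrix and is not the PySpark API added by this PR; the helper names `gramian` and `covariance` are hypothetical, and the covariance shown is the sample covariance (dividing by m - 1), which is an assumption about the normalization.

```python
def gramian(rows):
    """Return A^T A for a matrix given as a list of equal-length rows."""
    n = len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) for j in range(n)]
            for i in range(n)]


def covariance(rows):
    """Return the sample covariance matrix of the columns (divides by m - 1)."""
    m, n = len(rows), len(rows[0])
    # Column means, treating each row as one observation.
    means = [sum(r[j] for r in rows) / m for j in range(n)]
    return [[sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (m - 1)
             for j in range(n)]
            for i in range(n)]


rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
g = gramian(rows)      # 2 x 2 symmetric matrix
c = covariance(rows)   # 2 x 2 symmetric matrix
```

Both results are n x n for an m x n input, which is why these methods return local matrices rather than distributed ones when n is small.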

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/dusenberrymw/spark SPARK-9656_Add_Missing_Methods_to_PySpark_Distributed_Linear_Algebra

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9441.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9441
    
----
commit 7a98f55c60bfb90cda76ca1104db0486924e3667
Author: Mike Dusenberry <mw...@us.ibm.com>
Date:   2015-10-30T21:35:15Z

    Adding remaining methods to PySpark BlockMatrix: cache, persist, validate, transpose.

commit c713a27e3952bc3b533f129fb32a1e698af7bc13
Author: Mike Dusenberry <mw...@us.ibm.com>
Date:   2015-10-30T21:56:19Z

    Adding remaining method to PySpark CoordinateMatrix: transpose.

commit 0532f12dbb5bc136f231a6028446f17ea90b7bb0
Author: Mike Dusenberry <mw...@us.ibm.com>
Date:   2015-10-30T22:27:37Z

    Adding remaining method to PySpark IndexedRowMatrix: computeGramianMatrix. Note that 'multiply' and 'computeSVD' are part of the SPARK-6227 PR.

commit cbddf10e717e74508ba512c8f303106959439d17
Author: Mike Dusenberry <mw...@us.ibm.com>
Date:   2015-11-02T21:53:15Z

    Adding remaining methods to PySpark RowMatrix: computeGramianMatrix, computeCovariance, computeColumnSummaryStatistics, columnSimilarities, tallSkinnyQR.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-212527427
  
    @MLnick Thoughts on merging this?  It's been sitting for quite some time now, and is just a followup to a few previous commits.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213145616
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/9441


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213147708
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56598/
    Test PASSed.


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43831697
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -297,6 +444,20 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def computeGramianMatrix(self):
    +        """
    +        Computes the Gramian matrix `A^T A`. Note that this cannot be
    +        computed on matrices with more than 65535 columns.
    --- End diff --
    
    That's a good question. I think it's totally reasonable to do this in a follow-up PR, since it's pretty unrelated; it's just good to unify things while we're working on it. Just create a follow-up JIRA to do this :)
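
The thread does not say where the 65535 figure in the docstring comes from. One plausible explanation, stated here as an assumption rather than a confirmed fact, is that a symmetric n x n Gramian stored as a packed upper triangle needs n(n+1)/2 entries, and a JVM array is indexed by a signed 32-bit integer. The arithmetic below checks where that count crosses the limit; `packed_upper_triangle_size` is a hypothetical helper.

```python
INT_MAX = 2**31 - 1  # largest signed 32-bit integer (JVM array index limit)


def packed_upper_triangle_size(n):
    """Number of entries in the upper triangle (including diagonal) of an n x n matrix."""
    return n * (n + 1) // 2


# Compare the entry count against the index limit on either side of 65535.
fits_65535 = packed_upper_triangle_size(65535) <= INT_MAX
fits_65536 = packed_upper_triangle_size(65536) <= INT_MAX
```

Under that assumption, 65535 columns is the largest size whose packed triangle still fits in a single array.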


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43829680
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -500,6 +661,25 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def transpose(self):
    +        """
    +        Transpose this CoordinateMatrix.
    +
    +        >>> entries = sc.parallelize([MatrixEntry(0, 0, 1.2),
    +        ...                           MatrixEntry(1, 0, 2),
    +        ...                           MatrixEntry(2, 1, 3.7)])
    +        >>> mat = CoordinateMatrix(entries)
    +        >>> mat_transposed = mat.transpose()
    +
    --- End diff --
    
    Is this blank line intentional?


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153457314
  
    **[Test build #44943 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44943/consoleFull)** for PR 9441 at commit [`cbddf10`](https://github.com/apache/spark/commit/cbddf10e717e74508ba512c8f303106959439d17).


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213145622
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56597/
    Test PASSed.


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153477100
  
    Merged build started.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r60621403
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -297,6 +444,20 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def computeGramianMatrix(self):
    +        """
    +        Computes the Gramian matrix `A^T A`. Note that this cannot be
    +        computed on matrices with more than 65535 columns.
    --- End diff --
    
    Great, added.


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153466701
  
    Merged build finished. Test FAILed.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-212974570
  
    **[Test build #56549 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56549/consoleFull)** for PR 9441 at commit [`9c530f6`](https://github.com/apache/spark/commit/9c530f69220050e116054306b6196ed12aa2eee6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213041589
  
    **[Test build #56565 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56565/consoleFull)** for PR 9441 at commit [`0f82902`](https://github.com/apache/spark/commit/0f829022e56500bc45e9042887aaaca03fa562eb).


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213036217
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56560/
    Test FAILed.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213130260
  
    **[Test build #56598 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56598/consoleFull)** for PR 9441 at commit [`c0c9565`](https://github.com/apache/spark/commit/c0c9565706c148cb7dd64250630931ab41838d3b).


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213147514
  
    **[Test build #56598 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56598/consoleFull)** for PR 9441 at commit [`c0c9565`](https://github.com/apache/spark/commit/c0c9565706c148cb7dd64250630931ab41838d3b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-212965382
  
    jenkins retest this please


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43831207
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -500,6 +661,25 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def transpose(self):
    +        """
    +        Transpose this CoordinateMatrix.
    --- End diff --
    
    I think that is just the case for the `BlockMatrix` type.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213147704
  
    Merged build finished. Test PASSed.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213059645
  
    **[Test build #56565 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56565/consoleFull)** for PR 9441 at commit [`0f82902`](https://github.com/apache/spark/commit/0f829022e56500bc45e9042887aaaca03fa562eb).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43831148
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -297,6 +444,20 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def computeGramianMatrix(self):
    +        """
    +        Computes the Gramian matrix `A^T A`. Note that this cannot be
    +        computed on matrices with more than 65535 columns.
    --- End diff --
    
    Agreed.  Would it be reasonable to include that in this PR?


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-212974716
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56549/
    Test PASSed.


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213128867
  
    **[Test build #56597 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56597/consoleFull)** for PR 9441 at commit [`c98f6eb`](https://github.com/apache/spark/commit/c98f6eb4d9658e731018a5506cbe946495cf02f0).


[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-215168988
  
    LGTM, thanks! Merged to master.


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153455109
  
     Merged build triggered.


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43830079
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -500,6 +661,25 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def transpose(self):
    +        """
    +        Transpose this CoordinateMatrix.
    --- End diff --
    
    Maybe mention it shares the same underlying data as mentioned in the scaladoc.
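
Conceptually, transposing a matrix in coordinate form only swaps the row and column index of each entry and leaves the values untouched. A minimal pure-Python sketch follows, using plain `(i, j, value)` tuples as a stand-in for `MatrixEntry`; the helper `transpose_entries` is hypothetical and is not the PySpark implementation.

```python
def transpose_entries(entries):
    """Swap the row and column index of every (i, j, value) entry."""
    return [(j, i, value) for (i, j, value) in entries]


# Same entries as the doctest in the diff, written as plain tuples.
entries = [(0, 0, 1.2), (1, 0, 2.0), (2, 1, 3.7)]
transposed = transpose_entries(entries)
```

Applying the swap twice returns the original entries, which is a quick sanity check for any transpose implementation.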


[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43831342
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -500,6 +661,25 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def transpose(self):
    +        """
    +        Transpose this CoordinateMatrix.
    +
    +        >>> entries = sc.parallelize([MatrixEntry(0, 0, 1.2),
    +        ...                           MatrixEntry(1, 0, 2),
    +        ...                           MatrixEntry(2, 1, 3.7)])
    +        >>> mat = CoordinateMatrix(entries)
    +        >>> mat_transposed = mat.transpose()
    +
    --- End diff --
    
    Yeah, I like the visual clarity when viewing these tests on the Python docs, as it helps indicate that the following two tests rely on the data structures formed above.  This is generally the pattern I've followed with these classes for cases with >1 test.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153553660
  
    Talking with @holdenk, I've decided to pull the `retag` fix out into a separate JIRA/PR that blocks this.  I've opened #9458 to address that issue, so once that is merged, this can be rebased.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153493085
  
    **[Test build #44954 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44954/consoleFull)** for PR 9441 at commit [`9b5b7ae`](https://github.com/apache/spark/commit/9b5b7ae79bafdd9c655e42da6685d3299e144b0b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `class QRDecomposition(object):`




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-170744026
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-212974711
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-170744335
  
    @jkbradley Now that #9458 has been merged, this is ready for review.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153454390
  
    @holdenk Could you review this and provide any thoughts you may have?




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153479763
  
    **[Test build #44954 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44954/consoleFull)** for PR 9441 at commit [`9b5b7ae`](https://github.com/apache/spark/commit/9b5b7ae79bafdd9c655e42da6685d3299e144b0b).




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-214859175
  
    jenkins retest this please




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r60622845
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -151,6 +153,151 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def computeColumnSummaryStatistics(self):
    --- End diff --
    
    Yeah probably, although they would have been a little outdated if I had originally added them. :D




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153493249
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213141092
  
    @MLnick I've addressed the comments and added the `subtract(...)` method.  Thanks!




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-212968302
  
    **[Test build #56549 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56549/consoleFull)** for PR 9441 at commit [`9c530f6`](https://github.com/apache/spark/commit/9c530f69220050e116054306b6196ed12aa2eee6).




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213145373
  
    **[Test build #56597 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56597/consoleFull)** for PR 9441 at commit [`c98f6eb`](https://github.com/apache/spark/commit/c98f6eb4d9658e731018a5506cbe946495cf02f0).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-214860515
  
    **[Test build #57019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57019/consoleFull)** for PR 9441 at commit [`c0c9565`](https://github.com/apache/spark/commit/c0c9565706c148cb7dd64250630931ab41838d3b).




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-212965269
  
    @dusenberrymw I made a high-level pass, and this generally looks good. I'll go through it again in more detail soon, in particular checking the test cases.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r60595201
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -151,6 +153,151 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def computeColumnSummaryStatistics(self):
    --- End diff --
    
    Do these need `@since` annotations?




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-214447540
  
    @MLnick Any additional thoughts on this, or is it ready to merge?




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43831883
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -500,6 +661,25 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def transpose(self):
    +        """
    +        Transpose this CoordinateMatrix.
    --- End diff --
    
    Ah ok, looking at Matrices.scala (the root class) it indicates it shares the same data type but I forgot to look at the Coordinate matrix underneath. Sorry about that.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r60596372
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -297,6 +444,20 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def computeGramianMatrix(self):
    +        """
    +        Computes the Gramian matrix `A^T A`. Note that this cannot be
    +        computed on matrices with more than 65535 columns.
    --- End diff --
    
    I think since it's a small doc consistency thing, add it to this PR.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153477070
  
    Merged build triggered.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153539972
  
    @holdenk Great, thanks for the feedback!




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213059786
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/56565/
    Test PASSed.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153466627
  
    **[Test build #44943 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44943/consoleFull)** for PR 9441 at commit [`cbddf10`](https://github.com/apache/spark/commit/cbddf10e717e74508ba512c8f303106959439d17).
     * This patch **fails PySpark unit tests**.
     * This patch merges cleanly.
     * This patch adds the following public classes _(experimental)_:
       * `class QRDecomposition(object):`




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153466705
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44943/
    Test FAILed.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by MLnick <gi...@git.apache.org>.
Github user MLnick commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r60598056
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -789,6 +969,30 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def cache(self):
    +        """
    +        Caches the underlying RDD.
    +        """
    +        self._java_matrix_wrapper.call("cache")
    +        return self
    +
    +    def persist(self, storageLevel):
    +        """
    +        Persists the underlying RDD with the specified storage level.
    +        """
    +        if not isinstance(storageLevel, StorageLevel):
    +            raise TypeError("`storageLevel` should be a StorageLevel, got %s" % type(storageLevel))
    +        javaStorageLevel = self._java_matrix_wrapper._sc._getJavaStorageLevel(storageLevel)
    +        self._java_matrix_wrapper.call("persist", javaStorageLevel)
    +        return self
    +
    +    def validate(self):
    +        """
    +        Validates the block matrix info against the matrix data (`blocks`)
    +        and throws an exception if any error is found.
    +        """
    +        self._java_matrix_wrapper.call("validate")
    +
    --- End diff --
    
    `subtract` now exists on the Scala side, so it can be added here.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
Github user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-215170024
  
    Awesome, thanks!




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213035265
  
    **[Test build #56560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56560/consoleFull)** for PR 9441 at commit [`9e05eba`](https://github.com/apache/spark/commit/9e05ebac1780501308b564cbf45785521b2c01b7).




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by holdenk <gi...@git.apache.org>.
Github user holdenk commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9441#discussion_r43829896
  
    --- Diff: python/pyspark/mllib/linalg/distributed.py ---
    @@ -297,6 +444,20 @@ def numCols(self):
             """
             return self._java_matrix_wrapper.call("numCols")
     
    +    def computeGramianMatrix(self):
    +        """
    +        Computes the Gramian matrix `A^T A`. Note that this cannot be
    +        computed on matrices with more than 65535 columns.
    --- End diff --
    
    We should maybe also add this note about max columns to the IndexedRowMatrix.scala for consistency.
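
As background on the docstring being discussed, here is a minimal plain-Python sketch of what a Gramian `A^T A` is, together with some arithmetic on the column limit. It is not Spark's implementation, and the names `gramian` and `packed_upper_size` are hypothetical. The link to the 65535 limit is an assumption made for illustration: if the upper triangle of the symmetric `n x n` result is stored in a single packed array, it needs `n * (n + 1) / 2` entries, and 65535 is the largest `n` for which that count still fits in a signed 32-bit index.

```python
def gramian(rows):
    """Compute A^T A for a small dense matrix given as a list of rows."""
    n = len(rows[0])
    g = [[0.0] * n for _ in range(n)]
    # A^T A is the sum over rows r of the outer product r^T r.
    for row in rows:
        for a in range(n):
            for b in range(n):
                g[a][b] += row[a] * row[b]
    return g


def packed_upper_size(n):
    """Number of entries in the upper triangle of an n x n matrix."""
    return n * (n + 1) // 2


INT32_MAX = 2**31 - 1

A = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
G = gramian(A)  # 2x2 result for a 3x2 input

fits_at_limit = packed_upper_size(65535) <= INT32_MAX
overflows_past_limit = packed_upper_size(65536) > INT32_MAX
```

The result has one row and one column per input column, which is why the restriction is stated in terms of columns rather than rows.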




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153493254
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/44954/
    Test PASSed.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213036192
  
    **[Test build #56560 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/56560/consoleFull)** for PR 9441 at commit [`9e05eba`](https://github.com/apache/spark/commit/9e05ebac1780501308b564cbf45785521b2c01b7).
     * This patch **fails Python style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213059784
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-213036211
  
    Merged build finished. Test FAILed.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-170741144
  
    **[Test build #49192 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49192/consoleFull)** for PR 9441 at commit [`9c530f6`](https://github.com/apache/spark/commit/9c530f69220050e116054306b6196ed12aa2eee6).




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-153455136
  
    Merged build started.




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-170744030
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/49192/




[GitHub] spark pull request: [WIP] [SPARK-9656] [MLlib] [Python] Add missin...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-170743821
  
    **[Test build #49192 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/49192/consoleFull)** for PR 9441 at commit [`9c530f6`](https://github.com/apache/spark/commit/9c530f69220050e116054306b6196ed12aa2eee6).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by dusenberrymw <gi...@git.apache.org>.
GitHub user dusenberrymw commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-173673834
  
    ping @jkbradley




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by SparkQA <gi...@git.apache.org>.
GitHub user SparkQA commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-214871278
  
    **[Test build #57019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/57019/consoleFull)** for PR 9441 at commit [`c0c9565`](https://github.com/apache/spark/commit/c0c9565706c148cb7dd64250630931ab41838d3b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-214871421
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request: [SPARK-9656] [MLlib] [Python] Add missing meth...

Posted by AmplabJenkins <gi...@git.apache.org>.
GitHub user AmplabJenkins commented on the pull request:

    https://github.com/apache/spark/pull/9441#issuecomment-214871424
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/57019/

