You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by mpjlu <gi...@git.apache.org> on 2017/08/10 02:05:24 UTC

[GitHub] spark pull request #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

GitHub user mpjlu opened a pull request:

    https://github.com/apache/spark/pull/18899

    [SPARK-21680][ML][MLLIB]optimzie Vector coompress

    ## What changes were proposed in this pull request?
    
    When use Vector.compressed to change a Vector to SparseVector, the performance is very low comparing with Vector.toSparse.
    This is because you have to scan the value three times using Vector.compressed, but you just need two times when use Vector.toSparse.
    When the length of the vector is large, there is significant performance difference between this two method.
    
    ## How was this patch tested?
    
    The existing UT


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mpjlu/spark optVectorCompress

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18899.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18899
    
----
commit 5dc5c89242a0c2a5ac6a693c3703eef8ee160616
Author: Peng Meng <pe...@intel.com>
Date:   2017-08-10T01:59:17Z

    optimzie Vector coompress

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    First suggestion is that there must be unit tests :)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80485/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80529/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    This isn't a public method though. The dv.toSparseWithSize(2) error will never come up unless Spark causes it, and there's no contract for its behavior in that case. It's probably over-specifying things to require it to throw AIOOBE, for example; nothing depends on that nor should. 
    
    It doesn't hurt to unit test though and the additional test seems fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80600 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80600/testReport)** for PR 18899 at commit [`d50de99`](https://github.com/apache/spark/commit/d50de9961f78c8d259b9167081c2d9529ce91a63).
     * This patch **fails to generate documentation**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80517 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80517/testReport)** for PR 18899 at commit [`4404b10`](https://github.com/apache/spark/commit/4404b1050770c5d9206d00e42cb80cd61ab826ef).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80523/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Hi @sethah , the unit test is added.  Thanks


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80529 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80529/testReport)** for PR 18899 at commit [`4404b10`](https://github.com/apache/spark/commit/4404b1050770c5d9206d00e42cb80cd61ab826ef).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80485 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80485/testReport)** for PR 18899 at commit [`cebe600`](https://github.com/apache/spark/commit/cebe600358d31c18bd282a4be8999805fc609249).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Hi @srowen; how about using our first version? though duplicate some code, but change is small.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18899#discussion_r132610049
  
    --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ---
    @@ -635,8 +642,9 @@ class SparseVector @Since("2.0.0") (
         nnz
       }
     
    -  override def toSparse: SparseVector = {
    -    val nnz = numNonzeros
    +  override def toSparse: SparseVector = toSparse(numNonzeros)
    --- End diff --
    
    If define 
    `def toSparse: SparseVector = toSparse(numNonzeros)`
    in the superclass, when call dv.toSparse (there are this kinds of call in the code), there will be error message:
    Both toSparse in the DenseVector of type (nnz:Int) org.apache.spark.ml.linalg.SparseVector and toSparse in trait Vector of type =>org.apache.spark.ml.linalg.SparseVector match  .
    So we should change the name of toSparse(nnz: Int), maybe toSparseWithSize(nnz: Int).



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    @mpjlu sorry which benchmark are you referring to? PR 18904 doesn't seem to benchmark just this in isolation. I just want to be sure the gain is significant


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    This isn't what was proposed in the JIRA?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    I think there _is_ new functionality, a new method that needs its functionality defined. One specific example, we need a test like:
    
    ````scala
      test("toSparseWithSize") {
        val dv = Vectors.dense(1, 2, 3)
        withClue("toSparseWithSize fails on the wrong number of non-zeros") {
          intercept[java.lang.ArrayIndexOutOfBoundsException] {
            dv.toSparseWithSize(2)
          }
        }
      }
    ```` 
    
    This is evidence to future developers that the potential failure of this method is known and intended.  Also, we need a test for when we specify it incorrectly the other way. i.e. what is the expected outcome of:
    
    ````scala
    val dv = Vectors.dense(1, 2, 3)
    val sv = dv.toSparseWithSize(6)
    ````
    
    Right now, I get the error `java.lang.IllegalArgumentException: requirement failed: You provided 6 indices and values, which exceeds the specified vector size 3.` These behaviors are undesirable IMO, but should at least be tested.
    
    I don't believe there's any way to find a better solution without at least adding an O(nnz) operation. Honestly, some more specific performance results would be great to have here.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18899#discussion_r132437157
  
    --- Diff: mllib-local/src/main/scala/org/apache/spark/ml/linalg/Vectors.scala ---
    @@ -635,8 +642,9 @@ class SparseVector @Since("2.0.0") (
         nnz
       }
     
    -  override def toSparse: SparseVector = {
    -    val nnz = numNonzeros
    +  override def toSparse: SparseVector = toSparse(numNonzeros)
    --- End diff --
    
    This doesn't need to be overridden. Just define it in the superclass


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Check what? you're just saving the extra call to numNonZeroes. Change the declaration like so:
    
    ```
      def toSparse: SparseVector = toSparse(numNonzeros)
    
      private[linalg] def toSparse(nnz: Int): SparseVector
    ```
    
    Then make the implementations override the new private method and use the given nnz arg, and change `compressed` to pass the `nnz` value it has already computed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    OK maybe include some of this text in the scaladoc for it, to make it clear it is always intended to be called with the value of `numNonZeroes`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Btw, I think the compile error is because `v.toSparse(2)` could resolve to either `v.toSparse(nnz = 2)` OR `v.toSparse.apply(2)`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    For PR-18904, before this change, one iteration is about 58s, after this change, one iteration is about:40s


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80602/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80487 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80487/testReport)** for PR 18899 at commit [`d4f9e51`](https://github.com/apache/spark/commit/d4f9e516391d2f890c877aad1de295c84aff0635).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Thanks @srowen. 
    I will revise the code per your suggestion.
    when I wrote the code, I just concerned user call toSparse(size) and give a very small size.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimize Vector compress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    I have tested the performance of toSparse and toSparseWithSize separately.  There is about 35% performance improvement for this change. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    The new method is private. Certainly the user is not intended to call it and supply nnz. This change shouldn't alter any semantics or functionality. It's just trying to avoid calculating nnz twice: to figure out if the vector is sparse, and then to convert it to sparse.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80602 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80602/testReport)** for PR 18899 at commit [`83ac893`](https://github.com/apache/spark/commit/83ac8933a14a01919f27797ad9b13376977f5a98).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80600/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80602/testReport)** for PR 18899 at commit [`83ac893`](https://github.com/apache/spark/commit/83ac8933a14a01919f27797ad9b13376977f5a98).
     * This patch **fails to generate documentation**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18899#discussion_r132561864
  
    --- Diff: project/MimaExcludes.scala ---
    @@ -1012,6 +1012,10 @@ object MimaExcludes {
           ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.classification.RandomForestClassificationModel.setFeatureSubsetStrategy"),
           ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.regression.RandomForestRegressionModel.numTrees"),
           ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.regression.RandomForestRegressionModel.setFeatureSubsetStrategy")
    +    ) ++ Seq(
    +      // [SPARK-21680][ML][MLLIB]optimzie Vector coompress
    --- End diff --
    
    Hm, does this really cause a MiMa failure? what's the message, is it about adding the new method to the interface? I think it could be OK because it's a sealed trait that user code can't implement. CC maybe @MLnick or @sethah or @jkbradley for a thought on that


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18899#discussion_r132437879
  
    --- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala ---
    @@ -823,8 +831,10 @@ class SparseVector @Since("1.0.0") (
       }
     
       @Since("1.4.0")
    -  override def toSparse: SparseVector = {
    -    val nnz = numNonzeros
    +  override def toSparse: SparseVector = toSparse(numNonzeros)
    +
    +  @Since("2.3.0")
    --- End diff --
    
    Does not need Since because it is private 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80517 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80517/testReport)** for PR 18899 at commit [`4404b10`](https://github.com/apache/spark/commit/4404b1050770c5d9206d00e42cb80cd61ab826ef).
     * This patch **fails PySpark pip packaging tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80485 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80485/testReport)** for PR 18899 at commit [`cebe600`](https://github.com/apache/spark/commit/cebe600358d31c18bd282a4be8999805fc609249).
     * This patch **fails MiMa tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Ok, it's fairly safe since it's limited to `private[linalg]`. The confusion for me is that this method introduces all sorts of edge cases which have behavior that is not at all obvious or clear. If we don't think we need to test all of these edge cases, I'd like to at least be very explicit in the method doc, e.g.:
    
    ````scala
      /**
       * This method is used to avoid re-computing the number of non-zero elements when it is
       * already known. This method should only be called after computing the number of non-zero
       * elements via [[numNonZeros]]. e.g.
       *   {{{
       *     val nnz = this.numNonZeros
       *     val sv = toSparse(nnz)
       *   }}}
       *
       * If `nnz` is under-specified, a [[java.lang.ArrayIndexOutOfBoundsException]] is thrown.
       */
    ````


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18899#discussion_r132610165
  
    --- Diff: project/MimaExcludes.scala ---
    @@ -1012,6 +1012,10 @@ object MimaExcludes {
           ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.classification.RandomForestClassificationModel.setFeatureSubsetStrategy"),
           ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.regression.RandomForestRegressionModel.numTrees"),
           ProblemFilters.exclude[IncompatibleResultTypeProblem]("org.apache.spark.ml.regression.RandomForestRegressionModel.setFeatureSubsetStrategy")
    +    ) ++ Seq(
    +      // [SPARK-21680][ML][MLLIB]optimzie Vector coompress
    --- End diff --
    
    The error message is"method toSparse(nnz: Int) in trait is present only in current version"


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Yes, I just concern if add toSparse(size) we should check the size in the code, there will be no performance gain. If we don't need to check the "size" (comparing size with numNonZero) in the code, add toSparse(size) is ok. 
    Thanks.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18899: [SPARK-21680][ML][MLLIB]optimize Vector compress

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18899


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80473 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80473/testReport)** for PR 18899 at commit [`5dc5c89`](https://github.com/apache/spark/commit/5dc5c89242a0c2a5ac6a693c3703eef8ee160616).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80487 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80487/testReport)** for PR 18899 at commit [`d4f9e51`](https://github.com/apache/spark/commit/d4f9e516391d2f890c877aad1de295c84aff0635).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimize Vector compress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    merged to master


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    The user can't call toSparse(nnz). It will be private.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Ok, yes, I see it now. Though, the point remains but to a lesser degree. We still have a method, albeit private, that indexes the array at potentially unsafe locations. It's probably ok, but at the very least we need a unit test to document the behavior. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80517/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    In theory there's no new functionality here so nothing new to test, but more tests never hurt. 
    
    This seems OK. Is there any other call site where nnz is already known?
    
    It is a nontrivial bit of change though. How much does this speed things up, do you have any benchmark, for the record?



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by sethah <gi...@git.apache.org>.
Github user sethah commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    This approach doesn't feel right to me. The goal of the change is to avoid making a pass over the values to find out if there are any explicit zeros that need to be eliminated, which is fine. Instead of allowing the user to specify how man non-zero elements there are, we should instead allow them to specify a Boolean value on whether or not we should bother removing explicit zeros. Here's a small example to demonstrate why:
    
    ````scala
    val v = Vectors.dense(1, 2, 3)
    val sv = v.toSparse(2)
    ````
    
    This raises a `java.lang.ArrayIndexOutOfBoundsException`. Since I incorrectly specified the number of non-zero elements. If instead we use the boolean like `removeExplicitZeros` then when that is `false` we can just make the `values` array the size of the dense `values in the dense case, and just return `this` in the Sparse case. So I'm suggesting a method like:
    
    ````scala
    def toSparse(removeExplicitZeros: Boolean): SparseVector
    ````
    
    That won't work anyway because of ambiguous reference compile errors (another reason unit tests are so important). I ran into that problem before, and never found a good solution, and so you'll have to come up with a way around that. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    I did not only test this PR. Only work for PR 18904 and find this performance difference. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80473 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80473/testReport)** for PR 18899 at commit [`5dc5c89`](https://github.com/apache/spark/commit/5dc5c89242a0c2a5ac6a693c3703eef8ee160616).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80606 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80606/testReport)** for PR 18899 at commit [`7ab264d`](https://github.com/apache/spark/commit/7ab264d848f1690c6af316cb9940687d81db360b).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80473/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80606 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80606/testReport)** for PR 18899 at commit [`7ab264d`](https://github.com/apache/spark/commit/7ab264d848f1690c6af316cb9940687d81db360b).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80600 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80600/testReport)** for PR 18899 at commit [`d50de99`](https://github.com/apache/spark/commit/d50de9961f78c8d259b9167081c2d9529ce91a63).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80606/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by srowen <gi...@git.apache.org>.
Github user srowen commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    No, duplicate code like that is bad.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Thanks @sethah @srowen . The comment is added.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    **[Test build #80529 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/80529/testReport)** for PR 18899 at commit [`4404b10`](https://github.com/apache/spark/commit/4404b1050770c5d9206d00e42cb80cd61ab826ef).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/80487/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18899: [SPARK-21680][ML][MLLIB]optimzie Vector coompress

Posted by mpjlu <gi...@git.apache.org>.
Github user mpjlu commented on the issue:

    https://github.com/apache/spark/pull/18899
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org