Posted to issues@spark.apache.org by "Jon Zhong (JIRA)" <ji...@apache.org> on 2016/08/18 13:39:20 UTC

[jira] [Created] (SPARK-17130) SparseVector.apply and SparseVector.toArray return different results when created with illegal indices

Jon Zhong created SPARK-17130:
---------------------------------

             Summary: SparseVector.apply and SparseVector.toArray return different results when created with illegal indices
                 Key: SPARK-17130
                 URL: https://issues.apache.org/jira/browse/SPARK-17130
             Project: Spark
          Issue Type: Bug
          Components: ML, MLlib
    Affects Versions: 2.0.0, 1.6.2
         Environment: spark 1.6.1 + scala
            Reporter: Jon Zhong
            Priority: Minor


One of my colleagues ran into a bug with SparseVector. He called Vectors.sparse(size: Int, indices: Array[Int], values: Array[Double]) without noticing that the indices are assumed to be in ascending order.

The vector he created returns 0.0 for every position (without any warning) when values are read via the apply method, because apply relies on the indices being sorted. However, SparseVector.toArray builds the array in a way that is insensitive to index order. Hence you get 0.0 from apply while toArray and toDense give the expected values. The result of toArray is actually misleading, since the vector looks correct even though element access via apply is broken.
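
A minimal reproduction sketch in Scala, assuming the spark.mllib linalg API; the size, indices, and values here are illustrative and not taken from the original report:

{code}
import org.apache.spark.mllib.linalg.Vectors

// indices are NOT in ascending order; no warning or error is raised at construction
val v = Vectors.sparse(5, Array(3, 1), Array(7.0, 9.0))

v(1)        // apply assumes sorted indices, so this can silently return 0.0 instead of 9.0
v(3)        // likewise, 0.0 instead of 7.0
v.toArray   // Array(0.0, 9.0, 0.0, 7.0, 0.0): toArray scatters values by index, so it is order-insensitive
v.toDense   // built from toArray, so it also shows the expected values
{code}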



It would be safer to validate the indices in the constructor, or at least to make the apply and toArray methods return consistent results.
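
As a sketch of what such validation could look like (a hypothetical helper, not the actual Spark implementation), the constructor could reject out-of-range or non-increasing indices up front:

{code}
// Hypothetical validation helper: fail fast on indices that would
// silently break apply, instead of accepting them.
def validateIndices(size: Int, indices: Array[Int]): Unit = {
  var i = 0
  while (i < indices.length) {
    require(indices(i) >= 0 && indices(i) < size,
      s"index ${indices(i)} is out of bounds for a vector of size $size")
    require(i == 0 || indices(i - 1) < indices(i),
      "indices must be strictly increasing")
    i += 1
  }
}
{code}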



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org