Posted to issues@spark.apache.org by "Miao Wang (JIRA)" <ji...@apache.org> on 2017/01/31 22:51:51 UTC

[jira] [Commented] (SPARK-19382) Test sparse vectors in LinearSVCSuite

    [ https://issues.apache.org/jira/browse/SPARK-19382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847683#comment-15847683 ] 

Miao Wang commented on SPARK-19382:
-----------------------------------

[~josephkb] If I understand correctly, I think we have to create separate tests for SparseVector. For example, consider assert(model.numFeatures === 2) in test("linear svc: default params").
In the DenseVector case, each vector has size 2, which gives model.numFeatures = summarizer.mean.size = n = instance.size = 2.

However, if I create a SparseVector of size 20 with the same non-zero values as the DenseVector (i.e., 2 non-zero values and 18 zeros), then model.numFeatures = 20 by the same logic.

Therefore, we should either create a separate test case for SparseVector or remove that assertion.
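
To make the size issue concrete, here is a minimal sketch (not taken from the suite; it only shows that numFeatures follows the declared vector size):

  import org.apache.spark.ml.linalg.Vectors

  // Dense case: the vector declares 2 features, so numFeatures would be 2.
  val dense = Vectors.dense(1.0, 2.0)
  assert(dense.size == 2)

  // Sparse case: the same 2 non-zero values, but the declared size is 20,
  // so the summarizer (and hence model.numFeatures) would see 20 features,
  // and assert(model.numFeatures === 2) would fail.
  val sparse = Vectors.sparse(20, Array(0, 1), Array(1.0, 2.0))
  assert(sparse.size == 20)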

test("linearSVC comparison with R e1071 and scikit-learn") also fails for all SparseVector case. 

The other tests pass in the all-SparseVector case.

I am generating a test with a mix of dense and sparse vectors now, roughly along the lines of the sketch below.
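
Something like this hypothetical helper (not the actual generateSVMInput, and the label rule is only illustrative); both representations keep size 2, so numFeatures stays 2:

  import scala.util.Random
  import org.apache.spark.ml.feature.LabeledPoint
  import org.apache.spark.ml.linalg.Vectors

  def generateMixedInput(n: Int, seed: Long): Seq[LabeledPoint] = {
    val rnd = new Random(seed)
    (0 until n).map { i =>
      val x = Array(rnd.nextGaussian(), rnd.nextGaussian())
      // Alternate between dense and sparse encodings of the same size-2 vector.
      val features =
        if (i % 2 == 0) Vectors.dense(x)
        else Vectors.sparse(2, Array(0, 1), x)
      // Illustrative labeling only, to keep the sketch self-contained.
      LabeledPoint(if (x.sum > 0.0) 1.0 else 0.0, features)
    }
  }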




> Test sparse vectors in LinearSVCSuite
> -------------------------------------
>
>                 Key: SPARK-19382
>                 URL: https://issues.apache.org/jira/browse/SPARK-19382
>             Project: Spark
>          Issue Type: Test
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Priority: Minor
>
> Currently, LinearSVCSuite does not test sparse vectors.  We should.  I recommend that generateSVMInput be modified to create a mix of dense and sparse vectors, rather than adding an additional test.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org