Posted to issues@spark.apache.org by "Miao Wang (JIRA)" <ji...@apache.org> on 2017/01/31 22:51:51 UTC
[jira] [Commented] (SPARK-19382) Test sparse vectors in LinearSVCSuite
[ https://issues.apache.org/jira/browse/SPARK-19382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847683#comment-15847683 ]
Miao Wang commented on SPARK-19382:
-----------------------------------
[~josephkb] If I understand correctly, we have to create separate tests for SparseVector. Take, for example, assert(model.numFeatures === 2) in test("linear svc: default params").
In the DenseVector case, each Vector has size 2, which determines model.numFeatures = summarizer.mean.size = n = instance.size = 2.
However, if I create a SparseVector of size 20 with the same non-zero values as the DenseVector (i.e., 2 non-zero values and 18 zero values), then model.numFeatures = 20 by the logic above.
Therefore, we should create a separate test case for SparseVector, or else remove the test above.
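The size argument above can be checked directly against the vector API. The following is a minimal sketch, assuming spark-mllib-local is on the classpath and the snippet is pasted into spark-shell or a Scala script; the values 1.5 and -0.5 are illustrative and do not come from the test suite. It also shows that re-encoding a size-2 dense vector with toSparse keeps the size at 2, so whether numFeatures changes depends on the declared dimension, not on the storage format.

```scala
import org.apache.spark.ml.linalg.{Vector, Vectors}

// Dense vector with 2 entries: its size is 2.
val dense: Vector = Vectors.dense(1.5, -0.5)

// Sparse vector declared with dimension 20 and the same 2 non-zero values.
val wideSparse: Vector = Vectors.sparse(20, Array(0, 1), Array(1.5, -0.5))

// Sparse encoding of the same 2-dimensional point: the dimension stays 2.
val sameSizeSparse: Vector = dense.toSparse

println(dense.size)              // 2
println(wideSparse.size)         // 20
println(wideSparse.numNonzeros)  // 2
println(sameSizeSparse.size)     // 2
```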
test("linearSVC comparison with R e1071 and scikit-learn") also fails in the all-SparseVector case.
The other tests pass in the all-SparseVector case.
I am generating a mixed test now.
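One way a mixed input could be produced is sketched below. This is only an illustration under stated assumptions: mixDenseAndSparse is a hypothetical helper, not the actual generateSVMInput, and the sample points are made up. It alternates the encoding by index so that both vector types are guaranteed to appear, and it leaves dimension and values unchanged.

```scala
import org.apache.spark.ml.linalg.{DenseVector, SparseVector, Vector, Vectors}

// Hypothetical helper, not part of LinearSVCSuite: re-encodes every other
// vector as sparse while leaving its dimension and values unchanged.
def mixDenseAndSparse(points: Seq[Vector]): Seq[Vector] =
  points.zipWithIndex.map { case (v, i) =>
    if (i % 2 == 0) v.toSparse else v.toDense
  }

// Illustrative 2-dimensional points; the values are assumptions.
val points: Seq[Vector] =
  Seq.tabulate(10)(i => Vectors.dense(i + 1.0, -(i + 1.0)))

val mixed: Seq[Vector] = mixDenseAndSparse(points)

// Every vector still has size 2, regardless of how it is stored.
println(mixed.map(_.size).distinct)
```

Because the size of each vector is preserved, an assertion such as model.numFeatures === 2 would not be affected by this kind of mixing; only a sparse vector declared with a larger dimension changes the feature count.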
> Test sparse vectors in LinearSVCSuite
> -------------------------------------
>
> Key: SPARK-19382
> URL: https://issues.apache.org/jira/browse/SPARK-19382
> Project: Spark
> Issue Type: Test
> Components: ML
> Reporter: Joseph K. Bradley
> Priority: Minor
>
> Currently, LinearSVCSuite does not test sparse vectors. We should. I recommend that generateSVMInput be modified to create a mix of dense and sparse vectors, rather than adding an additional test.