Posted to issues@spark.apache.org by "zhengruifeng (JIRA)" <ji...@apache.org> on 2016/03/15 02:57:33 UTC

[jira] [Issue Comment Deleted] (SPARK-13712) Add OneVsOne to ML

     [ https://issues.apache.org/jira/browse/SPARK-13712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

zhengruifeng updated SPARK-13712:
---------------------------------
    Comment: was deleted

(was: OK, I have closed the PR.
I had also planned to implement ECC after this PR.
In general, OneVsOne is the slowest of the three methods, but it gives the highest accuracy. ECC is the fastest (about log(num_class) submodels) but has the lowest accuracy. OneVsRest falls between the two in both speed and accuracy.
In most cases num_class is small, so OneVsOne is practical.
With 3 classes, OneVsOne is even faster than OneVsRest, so I think it may be a useful choice for users.)
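
The 3-class claim can be checked with a back-of-the-envelope count. This is only a sketch under two assumptions that the issue does not state: the N training rows are split evenly across the K classes, and training cost grows roughly linearly with the number of rows a model sees.

```latex
\begin{align*}
\text{OneVsRest:}\quad & K \ \text{models} \times N \ \text{rows each} = KN \ \text{rows in total} \\
\text{OneVsOne:}\quad  & \frac{K(K-1)}{2} \ \text{models} \times \frac{2N}{K} \ \text{rows each} = (K-1)N \ \text{rows in total} \\
K = 3:\quad            & 3 \ \text{models on } N \ \text{rows} \quad \text{vs.} \quad 3 \ \text{models on } \tfrac{2}{3}N \ \text{rows}
\end{align*}
```

With K = 3 both methods train three models, but each OneVsOne model sees about two thirds of the rows. The count ignores any fixed per-model overhead; since the number of OneVsOne models grows quadratically with K, such overhead matters more as K grows.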

> Add OneVsOne to ML
> ------------------
>
>                 Key: SPARK-13712
>                 URL: https://issues.apache.org/jira/browse/SPARK-13712
>             Project: Spark
>          Issue Type: New Feature
>          Components: ML
>            Reporter: zhengruifeng
>            Priority: Minor
>
> Another meta-method for multi-class classification.
> Most classification algorithms are designed for balanced data.
> The OneVsRest method trains K models, each on imbalanced data.
> OneVsOne trains K*(K-1)/2 models, each on balanced data.
> OneVsOne is less sensitive to imbalanced datasets and can usually achieve higher precision.
> But it is much more computationally expensive, although each model is trained on a much smaller dataset (about 2/K of the total).
> OneVsOne is implemented the same way as OneVsRest:
> val classifier = new LogisticRegression()
> val ovo = new OneVsOne()
> ovo.setClassifier(classifier)
> val ovoModel = ovo.fit(data)
> val predictions = ovoModel.transform(data)
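
To make the pairwise scheme described above concrete, here is a minimal plain-Scala sketch of one-vs-one training and majority voting. It is not the code from the PR and uses no Spark API: every name in it (OneVsOneSketch, Example, BinaryModel, trainCentroid) is hypothetical, and a toy nearest-centroid rule on a single numeric feature stands in for the base classifier. It assumes labels run from 0 until k and that every class has at least one row.

```scala
// Plain-Scala sketch of one-vs-one training and voting (no Spark dependency).
// All names below are hypothetical and chosen for illustration only.
object OneVsOneSketch {
  // A labeled example: one numeric feature and a class label in 0 until k.
  final case class Example(feature: Double, label: Int)

  // A binary model that answers with one of its two classes.
  final case class BinaryModel(classA: Int, classB: Int,
                               centroidA: Double, centroidB: Double) {
    def predict(x: Double): Int =
      if (math.abs(x - centroidA) <= math.abs(x - centroidB)) classA else classB
  }

  // All unordered class pairs (a, b) with a < b; there are k * (k - 1) / 2.
  def pairs(k: Int): Seq[(Int, Int)] =
    for (a <- 0 until k; b <- (a + 1) until k) yield (a, b)

  // Toy stand-in for the base classifier: nearest class centroid.
  def trainCentroid(data: Seq[Example], a: Int, b: Int): BinaryModel = {
    def mean(label: Int): Double = {
      val xs = data.filter(_.label == label).map(_.feature)
      xs.sum / xs.size
    }
    BinaryModel(a, b, mean(a), mean(b))
  }

  // Train one model per pair, each on only the rows of its two classes.
  def fit(data: Seq[Example], k: Int): Seq[BinaryModel] =
    pairs(k).map { case (a, b) =>
      val subset = data.filter(e => e.label == a || e.label == b)
      trainCentroid(subset, a, b)
    }

  // Predict by majority vote; ties go to the smallest class index.
  def predict(models: Seq[BinaryModel], x: Double, k: Int): Int = {
    val votes = Array.fill(k)(0)
    models.foreach(m => votes(m.predict(x)) += 1)
    votes.indexOf(votes.max)
  }
}
```

Calling `OneVsOneSketch.fit(data, 3)` on a three-class dataset yields three pairwise models, and `OneVsOneSketch.predict(models, x, 3)` returns the class with the most pairwise wins. The tie-breaking rule is an arbitrary choice made for this sketch.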


