Posted to issues@flink.apache.org by "hoa nguyen (JIRA)" <ji...@apache.org> on 2016/04/22 03:00:23 UTC

[jira] [Commented] (FLINK-1729) Assess performance of classification algorithms

    [ https://issues.apache.org/jira/browse/FLINK-1729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253113#comment-15253113 ] 

hoa nguyen commented on FLINK-1729:
-----------------------------------

Hi [~till.rohrmann], is there an update on this? To confirm, this would provide an example implementation of, say, SVMs on publicly available datasets to validate the algorithm. Would it be possible for this to be assigned to me? Many thanks,
Hoa

> Assess performance of classification algorithms
> -----------------------------------------------
>
>                 Key: FLINK-1729
>                 URL: https://issues.apache.org/jira/browse/FLINK-1729
>             Project: Flink
>          Issue Type: New Feature
>          Components: Machine Learning Library
>            Reporter: Till Rohrmann
>              Labels: ML
>
> In order to validate Flink's classification algorithms (in terms of performance and accuracy), we should run them on publicly available classification data sets. This will not only serve as proof of the correctness of the implementations but will also show how easily the machine learning library can be used.
> Bottou [1] published some results for the RCV1 dataset using SVMs for classification. The SVMs are trained using stochastic gradient descent. Thus, they would be a good comparison for the CoCoA-trained SVMs.
> Some more benchmark results and publicly available data sets can be found here [2].
> Resources:
> [1] [http://leon.bottou.org/projects/sgd]
> [2] [https://github.com/BIDData/BIDMach/wiki/Benchmarks]
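
For illustration only, here is a minimal sketch of the kind of SGD-trained linear SVM baseline the description refers to, with a held-out accuracy check. It is not Flink's CoCoA implementation and not Bottou's code. The function names (make_blobs, train_sgd_svm, accuracy), the step-size schedule, and the hyperparameters are assumptions chosen for the example, and a small synthetic two-class data set stands in for RCV1.

```python
import random


def make_blobs(n, rng):
    """Synthetic stand-in data: two Gaussian blobs labelled +1 and -1."""
    data = []
    for _ in range(n):
        y = rng.choice([-1, 1])
        x = [rng.gauss(2.0 * y, 1.0), rng.gauss(2.0 * y, 1.0)]
        data.append((x, y))
    return data


def train_sgd_svm(data, epochs=20, lam=0.01, eta0=0.1, seed=0):
    """Linear SVM trained by SGD on the L2-regularized hinge loss."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    t = 0
    for _ in range(epochs):
        order = list(range(len(data)))
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            t += 1
            eta = eta0 / (1.0 + eta0 * lam * t)  # decaying step size (assumed schedule)
            margin = y * (sum(wj * xj for wj, xj in zip(w, x)) + b)
            # Gradient step on the regularizer: shrink the weights.
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Gradient step on the hinge loss, active only inside the margin.
            if margin < 1.0:
                w = [wj + eta * y * xj for wj, xj in zip(w, x)]
                b += eta * y
    return w, b


def accuracy(w, b, data):
    """Fraction of examples whose predicted sign matches the label."""
    correct = 0
    for x, y in data:
        score = sum(wj * xj for wj, xj in zip(w, x)) + b
        pred = 1 if score >= 0 else -1
        correct += pred == y
    return correct / len(data)


rng = random.Random(42)
train, test = make_blobs(200, rng), make_blobs(100, rng)
w, b = train_sgd_svm(train)
acc = accuracy(w, b, test)
print("held-out accuracy:", acc)
```

For the actual assessment, the same train/evaluate loop would be run on a public data set such as RCV1, and the resulting accuracy and training time compared against the published figures in [1].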



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)