Posted to issues@spark.apache.org by "Alexander Ulanov (JIRA)" <ji...@apache.org> on 2014/06/30 14:54:25 UTC

[jira] [Created] (SPARK-2329) Add multi-label evaluation metrics

Alexander Ulanov created SPARK-2329:
---------------------------------------

             Summary: Add multi-label evaluation metrics
                 Key: SPARK-2329
                 URL: https://issues.apache.org/jira/browse/SPARK-2329
             Project: Spark
          Issue Type: New Feature
          Components: MLlib
    Affects Versions: 1.0.0
            Reporter: Alexander Ulanov
             Fix For: 1.1.0


Spark MLlib has no class for measuring the performance of multi-label classifiers. In multi-label classification, each document can be assigned several labels (classes) at once.

This task involves adding a class for multi-label evaluation, together with unit tests. The following measures are to be implemented: (1) document-based Precision, Recall and F1-measure, averaged over the number of documents; (2) Precision, Recall and F1-measure per label; (3) label-based Precision, Recall and F1-measure, micro- and macro-averaged; (4) Hamming loss. Reference: Tsoumakas, Grigorios, Ioannis Katakis, and Ioannis Vlahavas. "Mining multi-label data." Data mining and knowledge discovery handbook. Springer US, 2010. 667-685.
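
For illustration only, the measures listed above can be sketched in plain Python. This is not the proposed MLlib class or its API: the names multilabel_metrics and safe_div are hypothetical, each document is assumed to be a (predicted, actual) pair of label sets, the input is assumed to be non-empty, a ratio with a zero denominator is assumed to count as 0.0, and Hamming loss is assumed to be normalized by the number of documents times the number of distinct labels observed.

```python
def safe_div(num, den):
    # Assumption: a ratio with an empty denominator counts as 0.0.
    return num / den if den else 0.0


def multilabel_metrics(pairs):
    """pairs: non-empty list of (predicted, actual) label sets, one per document."""
    n = len(pairs)
    labels = set()
    for predicted, actual in pairs:
        labels |= predicted | actual

    # (1) Document-based measures, averaged over the number of documents.
    doc_precision = sum(safe_div(len(p & t), len(p)) for p, t in pairs) / n
    doc_recall = sum(safe_div(len(p & t), len(t)) for p, t in pairs) / n
    doc_f1 = sum(safe_div(2 * len(p & t), len(p) + len(t)) for p, t in pairs) / n

    # (4) Hamming loss: wrongly assigned or missed labels, normalized by
    # documents times distinct labels (normalization is an assumption).
    hamming_loss = safe_div(sum(len(p ^ t) for p, t in pairs), n * len(labels))

    # (2) Per-label measures from true/false positive and false negative counts.
    per_label = {}
    tp_sum = fp_sum = fn_sum = 0
    for label in sorted(labels):
        tp = sum(1 for p, t in pairs if label in p and label in t)
        fp = sum(1 for p, t in pairs if label in p and label not in t)
        fn = sum(1 for p, t in pairs if label not in p and label in t)
        per_label[label] = (
            safe_div(tp, tp + fp),               # precision
            safe_div(tp, tp + fn),               # recall
            safe_div(2 * tp, 2 * tp + fp + fn),  # F1
        )
        tp_sum, fp_sum, fn_sum = tp_sum + tp, fp_sum + fp, fn_sum + fn

    # (3) Label-based averages: macro averages the per-label scores,
    # micro pools the counts over all labels first.
    k = len(labels)
    macro = tuple(safe_div(sum(v[i] for v in per_label.values()), k) for i in range(3))
    micro = (
        safe_div(tp_sum, tp_sum + fp_sum),
        safe_div(tp_sum, tp_sum + fn_sum),
        safe_div(2 * tp_sum, 2 * tp_sum + fp_sum + fn_sum),
    )

    return {
        "doc": (doc_precision, doc_recall, doc_f1),
        "per_label": per_label,
        "macro": macro,
        "micro": micro,
        "hamming_loss": hamming_loss,
    }
```

A call such as multilabel_metrics([({0, 1}, {0, 2}), ({1}, {1})]) returns a dictionary in which the "doc", "macro" and "micro" entries are (precision, recall, F1) triples. Unit tests for an actual implementation could compare against small hand-computed cases of this kind.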



--
This message was sent by Atlassian JIRA
(v6.2#6252)