Posted to issues@commons.apache.org by "Chen Tao (Jira)" <ji...@apache.org> on 2020/03/09 17:34:00 UTC

[jira] [Created] (MATH-1520) A interface to implements various of clusters internal measurers

Chen Tao created MATH-1520:
------------------------------

             Summary: A interface to implements various of clusters internal measurers
                 Key: MATH-1520
                 URL: https://issues.apache.org/jira/browse/MATH-1520
             Project: Commons Math
          Issue Type: New Feature
            Reporter: Chen Tao


There are many cluster evaluation algorithms:
[scikit-learn clustering-performance-evaluation|https://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation]

They can be divided into two categories: "External Measurers" and "Internal Measurers".

The existing "ClusterEvaluator" declaration is a good fit for "Internal Measurers", but it is an abstract class and has some useless default properties and methods.
The "ClusterRanking" is designed around "the higher the rank, the better", which contradicts the original reference papers (e.g. the "Davies-Bouldin Index", where a lower value is better) and may mislead anyone who wants to compare scores across different ML libraries.

In contrast to "External Measurers", an interface for "Internal Measurers" could be:

{code:java}
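import java.util.List;

// Cluster and Clusterable are the existing Commons Math clustering types.
public interface ClusterEvaluator {
    /**
     * Computes the score of a clustering result.
     *
     * @param cList List of clusters.
     * @param <T> Type of the points in the clusters.
     * @return the score attributed by the evaluator.
     */
    <T extends Clusterable> double score(List<? extends Cluster<T>> cList);

    /**
     * Compares two scores in the direction defined by this evaluator.
     *
     * @param a Score computed by this evaluator.
     * @param b Score computed by this evaluator.
     * @return true if the evaluator considers score {@code a}
     * better than score {@code b}.
     */
    boolean isBetterScore(double a, double b);
}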
{code}
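
For illustration only, here is a minimal sketch of how an internal measurer might implement such an interface. It is not part of the proposal. The evaluator "SumOfSquaredErrors" and the point class "SimplePoint" are hypothetical names. "Clusterable" and "Cluster" are reduced stand-ins for the Commons Math types so that the example compiles on its own. The measure (sum of squared distances from each point to its cluster mean) is chosen only because it is short and because a lower score is better, which is the case that "isBetterScore" is meant to express.

```java
import java.util.ArrayList;
import java.util.List;

// Stand-in for the Commons Math Clusterable type (assumption: only getPoint() is needed).
interface Clusterable {
    double[] getPoint();
}

// Stand-in for the Commons Math Cluster type (assumption: only the point list is needed).
class Cluster<T extends Clusterable> {
    private final List<T> points = new ArrayList<>();
    public void addPoint(T p) { points.add(p); }
    public List<T> getPoints() { return points; }
}

// The interface proposed in this issue.
interface ClusterEvaluator {
    <T extends Clusterable> double score(List<? extends Cluster<T>> cList);
    boolean isBetterScore(double a, double b);
}

// Hypothetical point type used only by this example.
class SimplePoint implements Clusterable {
    private final double[] coords;
    public SimplePoint(double... coords) { this.coords = coords.clone(); }
    @Override
    public double[] getPoint() { return coords; }
}

// Hypothetical evaluator: sum of squared distances of each point to its cluster mean.
class SumOfSquaredErrors implements ClusterEvaluator {
    @Override
    public <T extends Clusterable> double score(List<? extends Cluster<T>> cList) {
        double total = 0;
        for (Cluster<T> cluster : cList) {
            List<T> points = cluster.getPoints();
            if (points.isEmpty()) {
                continue; // an empty cluster contributes nothing
            }
            int dim = points.get(0).getPoint().length;
            // Mean of the cluster, computed from its points.
            double[] mean = new double[dim];
            for (T p : points) {
                double[] x = p.getPoint();
                for (int i = 0; i < dim; i++) {
                    mean[i] += x[i] / points.size();
                }
            }
            // Accumulate squared distances to the mean.
            for (T p : points) {
                double[] x = p.getPoint();
                for (int i = 0; i < dim; i++) {
                    double d = x[i] - mean[i];
                    total += d * d;
                }
            }
        }
        return total;
    }

    @Override
    public boolean isBetterScore(double a, double b) {
        // For this measure a lower score means tighter clusters.
        return a < b;
    }
}
```

With this shape, code that compares clusterings only calls "isBetterScore" and never needs to know whether a particular measure is "higher is better" or "lower is better", while "score" can return the same value as the reference paper.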



