Posted to issues@commons.apache.org by "Chen Tao (Jira)" <ji...@apache.org> on 2020/03/09 17:39:00 UTC
[jira] [Updated] (MATH-1520) A interface to implements various of clusters internal measurers

     [ https://issues.apache.org/jira/browse/MATH-1520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chen Tao updated MATH-1520:
---------------------------
    Description: 
There are many cluster evaluation algorithms; see:
[scikit-learn clustering-performance-evaluation|https://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation]

They can be divided into two categories: "External Measurers" and "Internal Measurers".

The "ClusterEvaluator" declaration is a good fit for "Internal Measurers", but it is an abstract class and has some useless default properties and methods.
"ClusterRanking" is designed on the principle "the higher the rank, the better", which departs from the original reference papers (e.g. the "Davies-Bouldin Index", where a lower score is better) and may mislead anyone who wants to compare scores between different ML libraries.

In contrast to "External Measurers", an interface for "Internal Measurers" could be:

{code:java}
public interface ClusterInternalEvaluator {
    /**
     * @param cList List of clusters.
     * @return the score attributed by the evaluator.
     */
    <T extends Clusterable> double score(List<? extends Cluster<T>> cList);
    /**
     * @param a Score computed by this evaluator.
     * @param b Score computed by this evaluator.
     * @return true if the evaluator considers score {@code a}
     * better than score {@code b}.
     */
    boolean isBetterScore(double a, double b);
}
{code}
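
To illustrate how an implementation could use the interface above, here is a minimal sketch. It is not part of the proposal: "SumOfSquaresEvaluator" and "EvaluatorDemo" are hypothetical names, and "Clusterable" and "Cluster" are redefined here as simplified stand-ins for the Commons Math types so that the example compiles on its own. The evaluator is a "lower is better" measure (like the Davies-Bouldin Index), so isBetterScore() returns true when the first score is smaller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Simplified stand-ins (assumption) for the Commons Math types of the same name.
interface Clusterable {
    double[] getPoint();
}

class Cluster<T extends Clusterable> {
    private final List<T> points = new ArrayList<>();
    void addPoint(T point) { points.add(point); }
    List<T> getPoints() { return points; }
}

// The interface proposed in this issue.
interface ClusterInternalEvaluator {
    <T extends Clusterable> double score(List<? extends Cluster<T>> cList);
    boolean isBetterScore(double a, double b);
}

// Hypothetical evaluator: sum of squared distances of each point to the
// mean of its cluster. A smaller value means more compact clusters.
class SumOfSquaresEvaluator implements ClusterInternalEvaluator {
    @Override
    public <T extends Clusterable> double score(List<? extends Cluster<T>> cList) {
        double total = 0;
        for (Cluster<T> cluster : cList) {
            List<T> points = cluster.getPoints();
            if (points.isEmpty()) {
                continue; // An empty cluster contributes nothing.
            }
            int dim = points.get(0).getPoint().length;
            double[] mean = new double[dim];
            for (T p : points) {
                double[] x = p.getPoint();
                for (int i = 0; i < dim; i++) {
                    mean[i] += x[i] / points.size();
                }
            }
            for (T p : points) {
                double[] x = p.getPoint();
                for (int i = 0; i < dim; i++) {
                    double d = x[i] - mean[i];
                    total += d * d;
                }
            }
        }
        return total;
    }

    @Override
    public boolean isBetterScore(double a, double b) {
        return a < b; // Lower is better for this measure.
    }
}

class EvaluatorDemo {
    public static void main(String[] args) {
        Cluster<Clusterable> tight = new Cluster<>();
        tight.addPoint(() -> new double[] {0.0, 0.0});
        tight.addPoint(() -> new double[] {2.0, 0.0});

        Cluster<Clusterable> loose = new Cluster<>();
        loose.addPoint(() -> new double[] {0.0, 0.0});
        loose.addPoint(() -> new double[] {4.0, 0.0});

        List<Cluster<Clusterable>> first = Collections.singletonList(tight);
        List<Cluster<Clusterable>> second = Collections.singletonList(loose);

        ClusterInternalEvaluator evaluator = new SumOfSquaresEvaluator();
        double firstScore = evaluator.score(first);
        double secondScore = evaluator.score(second);
        System.out.println("first is better: "
            + evaluator.isBetterScore(firstScore, secondScore));
    }
}
```

Because the direction of the comparison is encapsulated in isBetterScore(), callers can compare two clusterings without knowing whether a given measure is "lower is better" or "higher is better", and score() can still return the same value as the reference paper.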


  was:
There are many cluster evaluation algorithms; see:
[scikit-learn clustering-performance-evaluation|https://scikit-learn.org/stable/modules/clustering.html#clustering-performance-evaluation]

They can be divided into two categories: "External Measurers" and "Internal Measurers".

The "ClusterEvaluator" declaration is a good fit for "Internal Measurers", but it is an abstract class and has some useless default properties and methods.
"ClusterRanking" is designed on the principle "the higher the rank, the better", which departs from the original reference papers (e.g. the "Davies-Bouldin Index", where a lower score is better) and may mislead anyone who wants to compare scores between different ML libraries.

In contrast to "External Measurers", an interface for "Internal Measurers" could be:

{code:java}
public interface ClusterEvaluator {
    /**
     * @param cList List of clusters.
     * @return the score attributed by the evaluator.
     */
    <T extends Clusterable> double score(List<? extends Cluster<T>> cList);
    /**
     * @param a Score computed by this evaluator.
     * @param b Score computed by this evaluator.
     * @return true if the evaluator considers score {@code a}
     * better than score {@code b}.
     */
    boolean isBetterScore(double a, double b);
}
{code}



> A interface to implements various of clusters internal measurers
> ----------------------------------------------------------------
>
>                 Key: MATH-1520
>                 URL: https://issues.apache.org/jira/browse/MATH-1520
>             Project: Commons Math
>          Issue Type: New Feature
>            Reporter: Chen Tao
>            Priority: Critical


