Posted to reviews@spark.apache.org by mgaido91 <gi...@git.apache.org> on 2017/11/06 10:54:11 UTC

[GitHub] spark pull request #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to ...

GitHub user mgaido91 opened a pull request:

    https://github.com/apache/spark/pull/19668

    [SPARK-22440][ML] Add Calinski-Harabasz index to ClusteringEvaluator

    ## What changes were proposed in this pull request?
    
    sklearn provides two metrics for evaluating unsupervised clustering: the silhouette score, which has already been added to Spark, and the Calinski-Harabasz index.
    This PR adds the Calinski-Harabasz index to ClusteringEvaluator in order to reach feature parity with sklearn.
    
    ## How was this patch tested?
    
    Added unit tests.
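
For readers unfamiliar with the metric named in the description above, here is a minimal sketch of the Calinski-Harabasz index (also called the variance ratio criterion) in plain Python. It is not the code from this pull request, and it does not use Spark or sklearn; the function name `calinski_harabasz` and the handling of the zero-dispersion case are assumptions made for illustration. The index is the ratio of between-cluster dispersion B to within-cluster dispersion W, each normalised by its degrees of freedom: (B / (k - 1)) / (W / (n - k)), for n points in k clusters.

```python
from collections import defaultdict
from typing import Sequence


def calinski_harabasz(points: Sequence[Sequence[float]],
                      labels: Sequence[int]) -> float:
    """Illustrative Calinski-Harabasz index: (B / (k - 1)) / (W / (n - k))."""
    n = len(points)

    # Group the points by their cluster label.
    clusters = defaultdict(list)
    for point, label in zip(points, labels):
        clusters[label].append(point)

    k = len(clusters)
    if k < 2 or k >= n:
        raise ValueError("the index is only defined for 2 <= k < n")

    dim = len(points[0])
    # Centroid of the whole data set.
    overall = [sum(p[d] for p in points) / n for d in range(dim)]

    between = 0.0  # B: size-weighted squared distance of centroids to overall
    within = 0.0   # W: squared distance of points to their own centroid
    for members in clusters.values():
        size = len(members)
        centroid = [sum(p[d] for p in members) / size for d in range(dim)]
        between += size * sum((c - o) ** 2 for c, o in zip(centroid, overall))
        within += sum(
            sum((x - c) ** 2 for x, c in zip(p, centroid)) for p in members
        )

    if within == 0.0:
        # Assumption of this sketch: report perfectly tight clusters as infinity.
        return float("inf")
    return (between / (k - 1)) / (within / (n - k))


if __name__ == "__main__":
    pts = [(0.0, 0.0), (0.0, 2.0), (10.0, 0.0), (10.0, 2.0)]
    print(calinski_harabasz(pts, [0, 0, 1, 1]))  # prints 50.0
```

Higher values indicate clusters that are compact and well separated. A distributed implementation would compute the same two sums with aggregations rather than Python loops.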


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mgaido91/spark SPARK-22440

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/19668.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #19668
    
----
commit a4c4ff190235489c080fa69b62264fe73fbc4833
Author: Marco Gaido <mg...@hortonworks.com>
Date:   2017-11-03T17:54:12Z

    initial impl

commit 95a6e4ec6743536a7a1c0f9effd3628cd2b1adef
Author: Marco Gaido <mg...@hortonworks.com>
Date:   2017-11-05T11:07:04Z

    adding doc

commit ca7797f3c73678b316c7630639826952910f5b8e
Author: Marco Gaido <mg...@hortonworks.com>
Date:   2017-11-06T09:58:25Z

    added ut

commit 8a8d016599b3e940934cc4a53d93cbfa081f6279
Author: Marco Gaido <mg...@hortonworks.com>
Date:   2017-11-06T09:58:48Z

    fixes

commit 633b21d2762877c89486b9b9e2e2ec029830bfaa
Author: Marco Gaido <mg...@hortonworks.com>
Date:   2017-11-06T10:53:49Z

    minor

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83684 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83684/testReport)** for PR 19668 at commit [`05c77b2`](https://github.com/apache/spark/commit/05c77b2d8af86554e0adfc0cac0723eae3922547).


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83489 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83489/testReport)** for PR 19668 at commit [`3e5e8c5`](https://github.com/apache/spark/commit/3e5e8c5f9c24c0185802f031f80de57b0d79c84c).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83489 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83489/testReport)** for PR 19668 at commit [`3e5e8c5`](https://github.com/apache/spark/commit/3e5e8c5f9c24c0185802f031f80de57b0d79c84c).


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83483/
    Test FAILed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Merged build finished. Test FAILed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83684/
    Test PASSed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83489/
    Test PASSed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/83683/
    Test FAILed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    cc @yanboliang 


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83483 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83483/testReport)** for PR 19668 at commit [`633b21d`](https://github.com/apache/spark/commit/633b21d2762877c89486b9b9e2e2ec029830bfaa).


---



[GitHub] spark pull request #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to ...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 closed the pull request at:

    https://github.com/apache/spark/pull/19668


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83483 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83483/testReport)** for PR 19668 at commit [`633b21d`](https://github.com/apache/spark/commit/633b21d2762877c89486b9b9e2e2ec029830bfaa).
     * This patch **fails Scala style tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by mgaido91 <gi...@git.apache.org>.
Github user mgaido91 commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    I am closing this, as it seems that nobody is interested in it. If it is considered useful later, I can reopen it. Thanks.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83683 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83683/testReport)** for PR 19668 at commit [`c2fa615`](https://github.com/apache/spark/commit/c2fa615ea867442bf20bac06b87ac83452b99724).


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83683 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83683/testReport)** for PR 19668 at commit [`c2fa615`](https://github.com/apache/spark/commit/c2fa615ea867442bf20bac06b87ac83452b99724).
     * This patch **fails to build**.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #19668: [SPARK-22440][ML] Add Calinski-Harabasz index to Cluster...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/19668
  
    **[Test build #83684 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/83684/testReport)** for PR 19668 at commit [`05c77b2`](https://github.com/apache/spark/commit/05c77b2d8af86554e0adfc0cac0723eae3922547).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
