Posted to issues@commons.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2020/01/22 08:20:00 UTC

[jira] [Work logged] (MATH-1374) KMeansPlusPlusClusterer unable to converge having repeatable points in input dataset

     [ https://issues.apache.org/jira/browse/MATH-1374?focusedWorklogId=375466&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-375466 ]

ASF GitHub Bot logged work on MATH-1374:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Jan/20 08:19
            Start Date: 22/Jan/20 08:19
    Worklog Time Spent: 10m 
      Work Description: C0rWin commented on pull request #37: [MATH-1374]: Add center as first point of the cluster while seeding.
URL: https://github.com/apache/commons-math/pull/37
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 375466)
    Remaining Estimate: 0h
            Time Spent: 10m

> KMeansPlusPlusClusterer unable to converge having repeatable points in input dataset
> ------------------------------------------------------------------------------------
>
>                 Key: MATH-1374
>                 URL: https://issues.apache.org/jira/browse/MATH-1374
>             Project: Commons Math
>          Issue Type: Bug
>            Reporter: Artem Barger
>            Assignee: Artem Barger
>            Priority: Major
>             Fix For: 4.0
>
>         Attachments: MATH-1374.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> If the input list of {{Clusterable}} points has more than {{k}} elements but fewer than {{k}} unique points, the algorithm fails to converge. This was tested with different {{EmptyClusterStrategy}} options; here is an example using the default one:
> {code}
>     @Test
>     public void testNumberOfRequestedClustersSameAsInputSize() {
>         final RandomVectorGenerator rng = new UncorrelatedRandomVectorGenerator(10,
>                 new GaussianRandomGenerator(RandomSource.create(RandomSource.MT)));
>         List<DoublePoint> points = new ArrayList<>();
>         for (int i = 0; i < 10; i++) {
>             final DoublePoint point = new DoublePoint(rng.nextVector());
>             for (int j = 0; j < 3; j++) {
>                 points.add(point);
>             }
>         }
>         final KMeansPlusPlusClusterer<DoublePoint> clusterer = new KMeansPlusPlusClusterer<>(12);
>         clusterer.cluster(points);
>     }
> {code}
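
The failing condition described above (more than k input entries, fewer than k distinct points) can be checked before clustering. The following is only a minimal sketch of such a check, not the fix proposed in the patch or pull request. It uses plain double[] coordinates in place of DoublePoint so that it runs without Commons Math, and the class and method names (DistinctPointCheck, countDistinct, hasEnoughDistinctPoints) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Commons Math.
class DistinctPointCheck {

    /**
     * Counts distinct coordinate vectors, comparing by value.
     * Values are boxed, so Double.equals semantics apply
     * (NaN equals NaN, and 0.0 differs from -0.0).
     */
    static int countDistinct(List<double[]> points) {
        Set<List<Double>> seen = new HashSet<>();
        for (double[] p : points) {
            List<Double> key = new ArrayList<>(p.length);
            for (double v : p) {
                key.add(v);
            }
            seen.add(key);
        }
        return seen.size();
    }

    /** True if the input holds at least k distinct points. */
    static boolean hasEnoughDistinctPoints(List<double[]> points, int k) {
        return countDistinct(points) >= k;
    }

    public static void main(String[] args) {
        // Same shape as the test in the issue: 10 points, each added 3 times.
        List<double[]> points = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            double[] point = {i, 2.0 * i};
            for (int j = 0; j < 3; j++) {
                points.add(point);
            }
        }
        System.out.println("entries:  " + points.size());
        System.out.println("distinct: " + countDistinct(points));
        System.out.println("enough for k=12: " + hasEnoughDistinctPoints(points, 12));
    }
}
```

With the data from the issue, the list holds 30 entries but only 10 distinct points, so a request for 12 clusters falls into the case the report describes.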



--
This message was sent by Atlassian Jira
(v8.3.4#803005)