Posted to dev@mahout.apache.org by "Benson Margulies (JIRA)" <ji...@apache.org> on 2009/05/29 19:00:45 UTC

[jira] Created: (MAHOUT-129) kmeans sample makes one cluster

kmeans sample makes one cluster
-------------------------------

                 Key: MAHOUT-129
                 URL: https://issues.apache.org/jira/browse/MAHOUT-129
             Project: Mahout
          Issue Type: Bug
          Components: Collaborative Filtering
    Affects Versions: 0.2
            Reporter: Benson Margulies
         Attachments: kmeans-patch.diff

The kmeans sample job uses '1' for the number of clusters desired. I don't think this is intended.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (MAHOUT-129) Kmeans sample does not expose numIterations control from KMeansDriver

Posted by "Jeff Eastman (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714632#action_12714632 ] 

Jeff Eastman commented on MAHOUT-129:
-------------------------------------

The KMeansDriver numCentroids argument is incorrectly named and should be changed to numReduceTasks. r 780137 committed that change. All tests run.

> Kmeans sample does not expose numIterations control from KMeansDriver
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-129
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-129
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.2
>            Reporter: Benson Margulies
>         Attachments: kmeans-patch.diff
>
>
> The KMeans driver forces the numReduceTasks parameter of KMeans to 1, and there are javadoc/param naming problems in KMeansDriver.



[jira] Updated: (MAHOUT-129) Kmeans sample does not expose numIterations control from KMeansDriver

Posted by "Benson Margulies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benson Margulies updated MAHOUT-129:
------------------------------------

    Description: 
The KMeans driver forces the numReduceTasks parameter of KMeans to 1, and there are javadoc/param naming problems in KMeansDriver.



  was:
The kmeans sample job uses '1' for the number of clusters desired. I don't think this is intended.




> Kmeans sample does not expose numIterations control from KMeansDriver
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-129
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-129
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.2
>            Reporter: Benson Margulies
>         Attachments: kmeans-patch.diff
>
>
> The KMeans driver forces the numReduceTasks parameter of KMeans to 1, and there are javadoc/param naming problems in KMeansDriver.



[jira] Updated: (MAHOUT-129) Kmeans sample does not expose numIterations control from KMeansDriver

Posted by "Benson Margulies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benson Margulies updated MAHOUT-129:
------------------------------------

    Component/s:     (was: Collaborative Filtering)
                 Clustering
        Summary: Kmeans sample does not expose numIterations control from KMeansDriver  (was: kmeans sample makes one cluster)

> Kmeans sample does not expose numIterations control from KMeansDriver
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-129
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-129
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.2
>            Reporter: Benson Margulies
>         Attachments: kmeans-patch.diff
>
>
> The kmeans sample job uses '1' for the number of clusters desired. I don't think this is intended.



[jira] Resolved: (MAHOUT-129) Kmeans sample does not expose numIterations control from KMeansDriver

Posted by "Jeff Eastman (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jeff Eastman resolved MAHOUT-129.
---------------------------------

    Resolution: Fixed

Rename completed

> Kmeans sample does not expose numIterations control from KMeansDriver
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-129
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-129
>             Project: Mahout
>          Issue Type: Bug
>          Components: Clustering
>    Affects Versions: 0.2
>            Reporter: Benson Margulies
>         Attachments: kmeans-patch.diff
>
>
> The KMeans driver forces the numReduceTasks parameter of KMeans to 1, and there are javadoc/param naming problems in KMeansDriver.



[jira] Updated: (MAHOUT-129) kmeans sample makes one cluster

Posted by "Benson Margulies (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAHOUT-129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benson Margulies updated MAHOUT-129:
------------------------------------

    Attachment: kmeans-patch.diff

> kmeans sample makes one cluster
> -------------------------------
>
>                 Key: MAHOUT-129
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-129
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>            Reporter: Benson Margulies
>         Attachments: kmeans-patch.diff
>
>
> The kmeans sample job uses '1' for the number of clusters desired. I don't think this is intended.



[jira] Commented: (MAHOUT-129) kmeans sample makes one cluster

Posted by "Benson Margulies (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/MAHOUT-129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12714518#action_12714518 ] 

Benson Margulies commented on MAHOUT-129:
-----------------------------------------

This patch isn't quite right, as follows.

KMeansDriver.runJob has an undocumented parameter named numCentroids.

I thought this meant 'k'. It can't be that, because it's passed to KMeansDriver.runIteration, where the (undocumented) parameter is called numReduceTasks.

Whatever it is, it isn't k, because that's implicitly the number of input clusters.
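
To illustrate the distinction, here is a minimal in-memory sketch, not Mahout's actual code. The class KMeansSketch, its runJob signature, and the helper nearest are hypothetical names invented for this example. It assumes that k is simply the number of initial centroids supplied, and that numReduceTasks only decides how the centroid updates are split across (simulated) reducers.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for illustration only; this is NOT Mahout's KMeansDriver.
class KMeansSketch {

    // k is never passed in: it is implicitly initialCentroids.size().
    // numReduceTasks only controls how centroid updates are partitioned.
    static List<double[]> runJob(List<double[]> points, List<double[]> initialCentroids,
                                 int maxIterations, int numReduceTasks) {
        if (numReduceTasks < 1) {
            throw new IllegalArgumentException("numReduceTasks must be >= 1");
        }
        List<double[]> centroids = new ArrayList<>();
        for (double[] c : initialCentroids) {
            centroids.add(c.clone());
        }
        int k = centroids.size();
        for (int iter = 0; iter < maxIterations; iter++) {
            // "Map" step: assign every point to its nearest centroid.
            int[] assignment = new int[points.size()];
            for (int p = 0; p < points.size(); p++) {
                assignment[p] = nearest(points.get(p), centroids);
            }
            // "Reduce" step: simulated reducer r handles clusters r, r + numReduceTasks, ...
            for (int r = 0; r < numReduceTasks; r++) {
                for (int c = r; c < k; c += numReduceTasks) {
                    double[] sum = new double[centroids.get(c).length];
                    int count = 0;
                    for (int p = 0; p < points.size(); p++) {
                        if (assignment[p] != c) {
                            continue;
                        }
                        double[] pt = points.get(p);
                        for (int d = 0; d < sum.length; d++) {
                            sum[d] += pt[d];
                        }
                        count++;
                    }
                    if (count > 0) { // leave an empty cluster's centroid unchanged
                        for (int d = 0; d < sum.length; d++) {
                            sum[d] /= count;
                        }
                        centroids.set(c, sum);
                    }
                }
            }
        }
        return centroids;
    }

    // Index of the centroid with the smallest squared Euclidean distance to pt.
    static int nearest(double[] pt, List<double[]> centroids) {
        int best = 0;
        double bestDist = Double.MAX_VALUE;
        for (int c = 0; c < centroids.size(); c++) {
            double dist = 0.0;
            for (int d = 0; d < pt.length; d++) {
                double diff = pt[d] - centroids.get(c)[d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        List<double[]> points = List.of(
                new double[] {0, 0}, new double[] {0, 1},
                new double[] {10, 10}, new double[] {10, 11});
        List<double[]> initial = List.of(new double[] {0, 0}, new double[] {10, 10});
        // Two initial centroids give two clusters, whatever numReduceTasks is.
        System.out.println(runJob(points, initial, 5, 1).size());
        System.out.println(runJob(points, initial, 5, 2).size());
    }
}
```

In this sketch, changing numReduceTasks from 1 to 2 changes only which simulated reducer updates which centroid; the number of clusters stays equal to the number of initial centroids.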


> kmeans sample makes one cluster
> -------------------------------
>
>                 Key: MAHOUT-129
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-129
>             Project: Mahout
>          Issue Type: Bug
>          Components: Collaborative Filtering
>    Affects Versions: 0.2
>            Reporter: Benson Margulies
>         Attachments: kmeans-patch.diff
>
>
> The kmeans sample job uses '1' for the number of clusters desired. I don't think this is intended.
