Posted to dev@mahout.apache.org by "Paritosh Ranjan (Updated) (JIRA)" <ji...@apache.org> on 2012/03/17 09:13:48 UTC

[jira] [Updated] (MAHOUT-991) Convert Canopy, MeanShift, K-means, Dirichlet, Fuzzy KMeans and Other Tools to emit ClusterWritable

     [ https://issues.apache.org/jira/browse/MAHOUT-991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Paritosh Ranjan updated MAHOUT-991:
-----------------------------------

    Description: 
Adjust the Canopy, MeanShift, K-means, Dirichlet and Fuzzy KMeans implementations to emit ClusterWritables instead of Clusters. Adjust the other clustering tools (ClusterDumper and ClusterEvaluators) to accept ClusterWritables produced by these algorithms.

The new ClusterIterator and ClusterClassifier use an expanded sequence file representation that stores Clusters as self-describing ClusterWritable objects. Once all of these algorithms emit ClusterWritables, KMeans, Dirichlet and FuzzyK will be able to use ClusterIterator and ClusterClassifier for the buildClusters phase.

  was:The new ClusterIterator and ClusterClassifier use an expanded sequence file representation that stores Clusters as self-describing ClusterWritable objects. Adjust the Canopy and MeanShift implementations, which do not use this approach, to emit ClusterWritables instead of Clusters. Adjust the other clustering tools (ClusterDumper and ClusterEvaluators) to accept ClusterWritables produced by these algorithms.

        Summary: Convert Canopy, MeanShift, K-means, Dirichlet, Fuzzy KMeans and Other Tools to emit ClusterWritable  (was: Convert Canopy, MeanShift and Other Tools to Use ClusterWritable)
    
> Convert Canopy, MeanShift, K-means, Dirichlet, Fuzzy KMeans and Other Tools to emit ClusterWritable
> ---------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-991
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-991
>             Project: Mahout
>          Issue Type: Sub-task
>          Components: Clustering
>    Affects Versions: 0.6
>            Reporter: Jeff Eastman
>            Assignee: Jeff Eastman
>             Fix For: 0.7
>
>
> Adjust the Canopy, MeanShift, K-means, Dirichlet and Fuzzy KMeans implementations to emit ClusterWritables instead of Clusters. Adjust the other clustering tools (ClusterDumper and ClusterEvaluators) to accept ClusterWritables produced by these algorithms.
> The new ClusterIterator and ClusterClassifier use an expanded sequence file representation that stores Clusters as self-describing ClusterWritable objects. Once all of these algorithms emit ClusterWritables, KMeans, Dirichlet and FuzzyK will be able to use ClusterIterator and ClusterClassifier for the buildClusters phase.
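
To illustrate what "self-describing" could mean for a stored cluster record, here is a minimal sketch using only java.io. It assumes the record writes the concrete class name ahead of the object's own fields, so that a reader can rebuild the right type without knowing it in advance. Every name in it (SelfDescribingSketch, Payload, ToyCluster, toBytes, fromBytes) is hypothetical and invented for this example; it is not Mahout's ClusterWritable and does not use Hadoop's Writable or SequenceFile classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical illustration only; not Mahout's ClusterWritable.
public class SelfDescribingSketch {

  /** Minimal Writable-like contract (assumed shape, defined here for the sketch). */
  public interface Payload {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
  }

  /** Toy cluster: an id plus a dense center vector. */
  public static class ToyCluster implements Payload {
    public int id;
    public double[] center = new double[0];

    public ToyCluster() { } // no-arg constructor needed for reflective creation

    public ToyCluster(int id, double[] center) {
      this.id = id;
      this.center = center.clone();
    }

    @Override
    public void write(DataOutput out) throws IOException {
      out.writeInt(id);
      out.writeInt(center.length);
      for (double v : center) {
        out.writeDouble(v);
      }
    }

    @Override
    public void readFields(DataInput in) throws IOException {
      id = in.readInt();
      center = new double[in.readInt()];
      for (int i = 0; i < center.length; i++) {
        center[i] = in.readDouble();
      }
    }
  }

  /** Writes the concrete class name first, then the payload's own fields. */
  public static byte[] toBytes(Payload value) {
    try {
      ByteArrayOutputStream buffer = new ByteArrayOutputStream();
      DataOutputStream out = new DataOutputStream(buffer);
      out.writeUTF(value.getClass().getName());
      value.write(out);
      out.flush();
      return buffer.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads the class name, instantiates that class, and lets it read its own fields. */
  public static Payload fromBytes(byte[] bytes) {
    try {
      DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
      String className = in.readUTF();
      Payload value = Class.forName(className)
          .asSubclass(Payload.class)
          .getDeclaredConstructor()
          .newInstance();
      value.readFields(in);
      return value;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate stored class", e);
    }
  }

  public static void main(String[] args) {
    byte[] stored = toBytes(new ToyCluster(3, new double[] {1.0, 2.5}));
    ToyCluster restored = (ToyCluster) fromBytes(stored);
    System.out.println(restored.id + " " + Arrays.toString(restored.center));
  }
}
```

Under this assumption, a consumer such as a dumper or evaluator would only need to call fromBytes and work against the common interface, whichever algorithm produced the record.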
