Posted to dev@hama.apache.org by "Edward J. Yoon (JIRA)" <ji...@apache.org> on 2013/11/26 04:53:36 UTC

[jira] [Commented] (HAMA-821) K-Means writes only k records as a output

    [ https://issues.apache.org/jira/browse/HAMA-821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13832287#comment-13832287 ] 

Edward J. Yoon commented on HAMA-821:
-------------------------------------

I just realized that readOutput actually reads the cluster centers. So I suggest:

 * rename readOutput to readClusterCenters
 * add a new readOutput for the actual output

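I also found another bug. Because of the args.length != 7 clause, the check below rejects every invocation that does not have exactly 7 arguments, including the 4-argument form without -g: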

{code}
    if (args.length < 4 || args.length != 7) {
      System.out
          .println("USAGE: <INPUT_PATH> <OUTPUT_PATH> <MAXITERATIONS> <K (how many centers)> -g [<COUNT> <DIMENSION OF VECTORS>]");
      return;
    }
{code}
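
One possible fix, as a minimal sketch. It assumes the only intended forms are exactly 4 arguments, or 7 arguments with -g in the fifth position followed by COUNT and DIMENSION. The class and method names are hypothetical and not taken from KMeansBSP.

```java
final class KMeansArgsSketch {
  // Assumption: <INPUT_PATH> <OUTPUT_PATH> <MAXITERATIONS> <K>
  static final int REQUIRED_ARGS = 4;
  // Assumption: the four required arguments plus -g <COUNT> <DIMENSION>
  static final int GENERATE_ARGS = 7;

  // Accepts the 4-argument form, or the 7-argument form with -g at index 4.
  static boolean isValid(String[] args) {
    if (args.length == REQUIRED_ARGS) {
      return true;
    }
    return args.length == GENERATE_ARGS && "-g".equals(args[4]);
  }

  // Mirrors the quoted guard for comparison: true means "print usage and return".
  static boolean originalRejects(int argCount) {
    return argCount < 4 || argCount != 7;
  }
}
```

With this check, the caller would print the usage line and return only when isValid(args) is false.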

> K-Means writes only k records as a output
> -----------------------------------------
>
>                 Key: HAMA-821
>                 URL: https://issues.apache.org/jira/browse/HAMA-821
>             Project: Hama
>          Issue Type: Bug
>          Components: machine learning
>            Reporter: Edward J. Yoon
>            Assignee: Edward J. Yoon
>             Fix For: 0.7.0
>
>
> KMeans writes only k records because line 276 of KMeansBSP overwrites the value for the key. I'm sure this was not intended.
> Also, many people ask me about the meaning of the input and output of KMeans. We need to make the K-Means example output lines more readable, for example:
> {code}
> 13/11/25 17:34:04 INFO kmeans.KMeansBSP: Finished! Writing the results...
> [5.1, 3.5, 1.4, 0.2] belongs to cluster 2
> [4.9, 3.0, 1.4, 0.2] belongs to cluster 2
> [4.7, 3.2, 1.3, 0.2] belongs to cluster 2
> [4.6, 3.1, 1.5, 0.2] belongs to cluster 2
> [5.0, 3.6, 1.4, 0.2] belongs to cluster 2
> ....
> {code}
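
To illustrate the output format requested in the issue, here is a minimal, self-contained sketch that emits one line per input vector, so n vectors give n lines rather than k. It is not the KMeansBSP code: the class and method names are hypothetical, and it assumes squared Euclidean distance and in-memory arrays instead of Hama's readers and writers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class ClusterAssignmentSketch {

  // Returns the index of the center closest to the vector
  // (squared Euclidean distance, an assumption of this sketch).
  static int nearestCenter(double[] vector, double[][] centers) {
    int best = 0;
    double bestDistance = Double.POSITIVE_INFINITY;
    for (int i = 0; i < centers.length; i++) {
      double distance = 0.0;
      for (int d = 0; d < vector.length; d++) {
        double diff = vector[d] - centers[i][d];
        distance += diff * diff;
      }
      if (distance < bestDistance) {
        bestDistance = distance;
        best = i;
      }
    }
    return best;
  }

  // Builds one "<vector> belongs to cluster <index>" line per input vector.
  // Appending to a list avoids overwriting earlier vectors of the same cluster.
  static List<String> describe(double[][] vectors, double[][] centers) {
    List<String> lines = new ArrayList<>();
    for (double[] vector : vectors) {
      lines.add(Arrays.toString(vector) + " belongs to cluster "
          + nearestCenter(vector, centers));
    }
    return lines;
  }
}
```

Keying a map by cluster center and storing a single vector per key would keep only the last vector written for each center, which is the kind of overwrite that leaves just k records.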
