Posted to user@mahout.apache.org by Shashikant Kore <sh...@gmail.com> on 2009/11/25 09:49:49 UTC

Re: A question about the naming of the cluster and points in synthetic data cluster

Check out ClusterDumper in utils
(utils/src/main/java/org/apache/mahout/utils/clustering/ClusterDumper.java).
This utility prints each cluster ID together with the IDs of the vectors
assigned to that cluster.
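
If you only need to read names back out of text output like the sample quoted
below, a small script may be enough. The following is a minimal sketch in
Python, not part of Mahout. It assumes each record is a JSON object whose
"vector" field is itself a JSON-encoded string, as the quoted output suggests,
and that the vectors were given non-empty names when they were created. The
helper names and the sample data (including the name "point-17") are
hypothetical.

```python
import json


def parse_vector_record(record):
    """Decode one record of the form {"class": ..., "vector": "<JSON string>"}."""
    outer = json.loads(record)
    # The vector payload is a JSON document stored inside a JSON string,
    # so it has to be decoded a second time.
    inner = json.loads(outer["vector"])
    return {
        "class": outer["class"],
        "name": inner.get("name", ""),
        "indices": inner["values"]["indices"],
        "values": inner["values"]["values"],
    }


def parse_cluster_line(line):
    """Split a cluster header such as '0:name::{...}' into (id, center record)."""
    header, _, payload = line.partition("::")
    cluster_id = header.split(":", 1)[0]
    return cluster_id, parse_vector_record(payload)


# Hypothetical, shortened sample built to match the quoted format.
inner = json.dumps({
    "values": {"indices": [0, 1, 2],
               "values": [28.7812, 34.4632, 31.0],
               "numMappings": 3},
    "cardinality": 3,
    "lengthSquared": -1.0,
    "name": "point-17",  # assumed: a name set on the vector before clustering
})
point_record = json.dumps(
    {"class": "org.apache.mahout.matrix.SparseVector", "vector": inner}
)
cluster_line = "0:name::" + point_record

cluster_id, center = parse_cluster_line(cluster_line)
point = parse_vector_record(point_record)
print(cluster_id, point["name"])
```

In the quoted output the "name" field is empty, so a script like this would
only show IDs if the input vectors carry names in the first place.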

--shashi

On Wed, Nov 25, 2009 at 5:47 AM, Liang Chenmin <li...@gmail.com> wrote:
> Hi all,
>    I am a newbie to Mahout. I have a question about how to attach names to
> the clusters and points in the synthetic data cluster example.
>
>    After getting the output of the synthetic data cluster, we have 6
> clusters, and each one looks like:
>
> ###First comes the information about the cluster:
> 0:name::{"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\":[0,1,2...59],\"values\":[29.58838112577385,...],\"numMappings\":60},\"cardinality\":60,\"lengthSquared\":-1.0,\"name\":\"\"}"}
>
> ###Followed by the points belonging to this cluster:
> Points:
> {"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\":[0,1,2,...,59],\"values\":[28.7812,34.4632,......
> ],],\"numMappings\":60},\"cardinality\":60,\"lengthSquared\":-1.0,\"name\":\"\"}"},
>
> {"class":"org.apache.mahout.matrix.SparseVector","vector":"{\"values\":{\"indices\"
> ....
>
>
> Is there a way for me to specify the name of the cluster? And more
> importantly, if I actually have an ID for each point, how can I show the ID
> of each point in the final result? I want to see clearly which IDs are in each
> cluster. I have also used my own data, and the output is similar to the one
> above, although the indices are not the same because my matrix is sparse. As
> my data set is large, getting the IDs is quite important for me.
>
> Thanks,
> Mandy
>