Posted to user@mahout.apache.org by Anne Sauve <An...@hotmail.com> on 2014/12/09 22:57:40 UTC

computing the distance between 2 values from fuzzyKmeans clustering clusteredPoints

Hello there,

I have been trying for a while to compute the pairwise distance between all the 
clustered points in order to fill in a distanceMatrix which will then be used to 
compute the silhouette of my clustering.

Here is my code; I am using Mahout 0.8:

SequenceFile.Reader reader1 = new SequenceFile.Reader(fs,
    new Path("data/testdata/output8Box/clusteredPoints" + "/part-m-00000"), conf);
IntWritable key1 = new IntWritable();
WeightedVectorWritable value1 = new WeightedVectorWritable();

List<NamedVector> clusters = new ArrayList<NamedVector>();
while (reader1.next(key1, value1)) {
    System.out.println(value1.toString() + " belongs to cluster " + key1.toString());
    NamedVector cluster = (NamedVector) value1.getVector();
    clusters.add(cluster);
}
// Compute the distanceMatrix
DistanceMeasure measure = new CosineDistanceMeasure();
for (int i = 0; i < clusters.size(); i++) {
    for (int j = i + 1; j < clusters.size(); j++) {
        double d = measure.distance(clusters.get(i), clusters.get(j));
        System.out.println("dist " + i + " ; " + j + " : " + d);
    }
}
 

When I run it I am getting the following exception:

Exception in thread "main" java.lang.ClassCastException: org.apache.mahout.math.RandomAccessSparseVector cannot be cast to org.apache.mahout.math.NamedVector

How do I convert a sparse vector into a NamedVector?
Is there a better way to proceed?

Thanks a lot for your help.

Anne



Re: computing the distance between 2 values from fuzzyKmeans clustering clusteredPoints

Posted by Andrew Musselman <an...@gmail.com>.
Could you please upgrade to Mahout 0.9 or work off of trunk?  Some of the
code related to k-means results changed since 0.8.

Thanks!
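
Independent of the version question, the exception in the original post comes from the cast itself: the vectors stored in clusteredPoints are not necessarily NamedVector instances. As far as I know, DistanceMeasure.distance takes two plain Vector arguments, so holding the points in a List<Vector> without any cast should be enough, and a name can be attached afterwards with new NamedVector(vector, name) if one is needed. The rest of the plan (fill a pairwise distance matrix, then compute the silhouette) does not depend on Mahout at all. Below is a minimal sketch of that part in plain Java on double[] arrays. The class and method names (SilhouetteSketch, cosineDistance, distanceMatrix, silhouette, meanSilhouette) are made up for illustration, and the handling of zero vectors and singleton clusters is a convention chosen here, not Mahout's behaviour.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of Mahout: pairwise cosine distances and silhouette values.
final class SilhouetteSketch {

    // Cosine distance: 1 - (a . b) / (|a| * |b|).
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int k = 0; k < a.length; k++) {
            dot += a[k] * b[k];
            normA += a[k] * a[k];
            normB += b[k] * b[k];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0; // convention for this sketch: a zero vector has distance 1 to everything
        }
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    // Symmetric n x n matrix; only the upper triangle is computed, the diagonal stays 0.
    static double[][] distanceMatrix(double[][] points) {
        int n = points.length;
        double[][] dist = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                double d = cosineDistance(points[i], points[j]);
                dist[i][j] = d;
                dist[j][i] = d;
            }
        }
        return dist;
    }

    // Silhouette per point: s = (b - a) / max(a, b), where a is the mean distance to the
    // other points of the same cluster and b is the smallest mean distance to another cluster.
    static double[] silhouette(double[][] dist, int[] labels) {
        int n = labels.length;
        double[] s = new double[n];
        for (int i = 0; i < n; i++) {
            Map<Integer, Double> sum = new HashMap<>();
            Map<Integer, Integer> count = new HashMap<>();
            for (int j = 0; j < n; j++) {
                if (j == i) continue;
                sum.merge(labels[j], dist[i][j], Double::sum);
                count.merge(labels[j], 1, Integer::sum);
            }
            Integer own = count.get(labels[i]);
            if (own == null) {
                s[i] = 0.0; // singleton cluster: silhouette set to 0 by convention
                continue;
            }
            double a = sum.get(labels[i]) / own;
            double b = Double.POSITIVE_INFINITY;
            for (Map.Entry<Integer, Double> e : sum.entrySet()) {
                if (e.getKey().intValue() != labels[i]) {
                    b = Math.min(b, e.getValue() / count.get(e.getKey()));
                }
            }
            double denom = Math.max(a, b);
            // No other cluster, or all distances zero: fall back to 0.
            s[i] = (Double.isInfinite(b) || denom == 0.0) ? 0.0 : (b - a) / denom;
        }
        return s;
    }

    // Average silhouette over all points, a common single-number summary.
    static double meanSilhouette(double[][] dist, int[] labels) {
        double total = 0.0;
        for (double v : silhouette(dist, labels)) total += v;
        return labels.length == 0 ? 0.0 : total / labels.length;
    }
}
```

To use it with the clusteredPoints output, the cluster id read as the key would go into labels and the vector contents into points. Note that fuzzy k-means can assign a point to several clusters with different weights, so one label per point (for example the cluster with the highest weight) has to be picked first; that choice is an assumption of this sketch.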
