Posted to user@mahout.apache.org by Anne Sauve <An...@hotmail.com> on 2014/12/09 22:57:40 UTC
computing the distance between 2 values from fuzzyKmeans clustering clusteredPoints
Hello there,
I have been trying for a while to compute the pairwise distance between all the
clustered points in order to fill in a distanceMatrix which will then be used to
compute the silhouette of my clustering.
Here is my code; I am using Mahout 0.8:
SequenceFile.Reader reader1 = new SequenceFile.Reader(fs,
    new Path("data/testdata/output8Box/clusteredPoints" + "/part-m-00000"), conf);
IntWritable key1 = new IntWritable();
WeightedVectorWritable value1 = new WeightedVectorWritable();
List<NamedVector> clusters = new ArrayList<NamedVector>();
while (reader1.next(key1, value1)) {
    System.out.println(value1.toString() + " belongs to cluster " + key1.toString());
    NamedVector cluster = (NamedVector) value1.getVector();
    clusters.add(cluster);
}
// Compute the distanceMatrix
DistanceMeasure measure = new CosineDistanceMeasure();
for (int i = 0; i < clusters.size(); i++) {
    for (int j = i + 1; j < clusters.size(); j++) {
        double d = measure.distance(clusters.get(i), clusters.get(j));
        System.out.println("dist " + i + " ; " + j + " : " + d);
    }
}
When I run it, I get the following exception:
Exception in thread "main" java.lang.ClassCastException:
org.apache.mahout.math.RandomAccessSparseVector cannot be cast to
org.apache.mahout.math.NamedVector
How do I convert a sparse vector into a NamedVector?
Is there a better way to proceed?
Thanks a lot for your help.
Anne
Re: computing the distance between 2 values from fuzzyKmeans clustering clusteredPoints
Posted by Andrew Musselman <an...@gmail.com>.
Could you please upgrade to Mahout 0.9 or work off of trunk? Some of the
code related to k-means results changed since 0.8.
Thanks!
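For reference, the exception in the question says that the vectors stored in clusteredPoints are plain RandomAccessSparseVector instances, so the cast to NamedVector cannot succeed. Since the distance measure takes Vector arguments, holding the points as Vector would likely avoid the cast altogether. Independent of Mahout, the distance-matrix and silhouette step can be sketched as below. This is a minimal sketch under stated assumptions, not Mahout code: points are plain double[] arrays, each point has a single hard cluster label (for fuzzy k-means, e.g. the most probable cluster), and the names SilhouetteSketch, cosineDistance, distanceMatrix and meanSilhouette are hypothetical.

```java
// Hypothetical stand-alone sketch; none of these names come from Mahout.
final class SilhouetteSketch {

    // Cosine distance 1 - (a.b)/(|a||b|); zero vectors get distance 1 by convention here.
    static double cosineDistance(double[] a, double[] b) {
        double dot = 0.0, normA = 0.0, normB = 0.0;
        for (int k = 0; k < a.length; k++) {
            dot += a[k] * b[k];
            normA += a[k] * a[k];
            normB += b[k] * b[k];
        }
        if (normA == 0.0 || normB == 0.0) {
            return 1.0;
        }
        return 1.0 - dot / Math.sqrt(normA * normB);
    }

    // Fill the full symmetric matrix; the silhouette needs d[i][j] for every pair.
    static double[][] distanceMatrix(double[][] points) {
        int n = points.length;
        double[][] d = new double[n][n];
        for (int i = 0; i < n; i++) {
            for (int j = i + 1; j < n; j++) {
                d[i][j] = cosineDistance(points[i], points[j]);
                d[j][i] = d[i][j];
            }
        }
        return d;
    }

    // Mean silhouette over all points; labels[i] must lie in [0, k).
    static double meanSilhouette(double[][] d, int[] labels, int k) {
        int n = labels.length;
        int[] sizes = new int[k];
        for (int label : labels) {
            sizes[label]++;
        }
        double total = 0.0;
        for (int i = 0; i < n; i++) {
            // Sum of distances from point i to the members of each cluster.
            double[] sum = new double[k];
            for (int j = 0; j < n; j++) {
                if (j != i) {
                    sum[labels[j]] += d[i][j];
                }
            }
            int own = labels[i];
            if (sizes[own] == 1) {
                continue; // a singleton cluster contributes s(i) = 0
            }
            double a = sum[own] / (sizes[own] - 1); // mean distance within own cluster
            double b = Double.POSITIVE_INFINITY;    // mean distance to nearest other cluster
            for (int c = 0; c < k; c++) {
                if (c != own && sizes[c] > 0) {
                    b = Math.min(b, sum[c] / sizes[c]);
                }
            }
            if (Double.isInfinite(b)) {
                continue; // only one non-empty cluster: s(i) = 0
            }
            double max = Math.max(a, b);
            total += (max == 0.0) ? 0.0 : (b - a) / max;
        }
        return total / n;
    }

    public static void main(String[] args) {
        double[][] points = {{1, 0}, {2, 0}, {0, 1}, {0, 3}};
        int[] labels = {0, 0, 1, 1};
        double[][] d = distanceMatrix(points);
        System.out.println("mean silhouette = " + meanSilhouette(d, labels, 2));
    }
}
```

Note that the loop in the question only visits pairs with j > i, which is enough to print distances but leaves half of a matrix empty; the sketch mirrors each value so that every row is complete before the silhouette is computed.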