Posted to user@mahout.apache.org by "Videnova, Svetlana" <sv...@logica.com> on 2012/08/02 11:55:23 UTC

kmeans cluster V:0.7

Hello,
I'm using Mahout 0.7 and trying to run clustering, but apparently the KMeansClusterer class is no longer available in 0.7.
Can somebody please tell me which class replaces KMeansClusterer?

Thank you



Think green - keep it on the screen.

This e-mail and any attachment is for authorised use by the intended recipient(s) only. It may contain proprietary material, confidential information and/or be subject to legal privilege. It should not be copied, disclosed to, retained or used by, any other party. If you are not an intended recipient then please promptly delete this e-mail and any attachment and all copies and inform the sender. Thank you.


Re: kmeans cluster V:0.7

Posted by Paritosh Ranjan <pr...@xebia.com>.
The error says that it did not find any non-empty cluster.
See ClusterClassificationDriverTest for its proper usage.
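
A coarse way to check for this condition before calling the driver is sketched below. It assumes the clustering output sits on the local filesystem and that the final iteration is written to a directory matching clusters-*-final that holds part-* files. FinalClusterCheck and the default path "clusterOutput" are hypothetical names used for illustration, not Mahout API.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Mahout.
class FinalClusterCheck {

    // True if dir contains a "clusters-*-final" directory (assumed naming)
    // that holds at least one non-empty "part-*" file.
    static boolean hasFinalClusters(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return false;
        }
        try (DirectoryStream<Path> finals = Files.newDirectoryStream(dir, "clusters-*-final")) {
            for (Path candidate : finals) {
                if (Files.isDirectory(candidate) && hasNonEmptyPart(candidate)) {
                    return true;
                }
            }
        }
        return false;
    }

    // True if dir holds at least one regular "part-*" file with size > 0.
    private static boolean hasNonEmptyPart(Path dir) throws IOException {
        try (DirectoryStream<Path> parts = Files.newDirectoryStream(dir, "part-*")) {
            for (Path part : parts) {
                if (Files.isRegularFile(part) && Files.size(part) > 0) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        // "clusterOutput" is only a placeholder default.
        Path dir = Paths.get(args.length > 0 ? args[0] : "clusterOutput");
        System.out.println(dir + " has final clusters: " + hasFinalClusters(dir));
    }
}
```

A non-empty part file does not guarantee that it holds non-empty clusters, so this only rules out the case where the clustering output is missing altogether.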




RE: kmeans cluster V:0.7

Posted by "Videnova, Svetlana" <sv...@logica.com>.
Thank you for your answer.


I'm using this:
ClusterClassificationDriver.run(new Path("vector"), new Path("clusterOutput"), new Path("cluster"), 0.5, false, false);

My vector file looks like:
SEQ__org.apache.hadoop.io.Text_org.apache.hadoop.io.Text______t€ðàó^æVG²RŸ˜Õ_________Ž__P(0):{15:1.4650986194610596,14:0.9997141361236572,11:0.9997141361236572,10:0.9997141361236572,9:0.9997141361236572,8:1.4650986194610596,7:1.4650986194610596,6:1.4650986194610596,5:0.9997141361236572,4:1.4650986194610596,2:3.1613736152648926,1:1.4650986194610596,0:0.9997141361236572}_________Ž__P(1):{15:1.4650986194610596,14:0.9997141361236572,11:0.9997141361236572,10:0.9997141361236572,9:0.9997141361236572,8:1.4650986194610596,7:1.4650986194610596,6:1.4650986194610596,5:0.9997141361236572,4:1.4650986194610596,2:3.1613736152648926,1:1.4650986194610596,0:0.9997141361236572}_________Ž__P(2):{ [… and others]


My error:



java.lang.ArrayIndexOutOfBoundsException: 0
      at org.apache.mahout.clustering.classify.ClusterClassificationMapper.populateClusterModels(ClusterClassificationMapper.java:129)
      at org.apache.mahout.clustering.classify.ClusterClassificationMapper.setup(ClusterClassificationMapper.java:74)
      at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
      at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:621)
      at org.apache.hadoop.mapred.MapTask.run(MapTask.java:305)
      at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
12/08/02 14:53:31 INFO mapred.JobClient:  map 0% reduce 0%
12/08/02 14:53:32 INFO mapred.JobClient: Job complete: job_local_0001
12/08/02 14:53:32 INFO mapred.JobClient: Counters: 0
Exception in thread "main" java.lang.InterruptedException: Cluster Classification Driver Job failed processing vector
      at org.apache.mahout.clustering.classify.ClusterClassificationDriver.classifyClusterMR(ClusterClassificationDriver.java:276)
      at org.apache.mahout.clustering.classify.ClusterClassificationDriver.run(ClusterClassificationDriver.java:135)
      at main.LuceneDemo.main(LuceneDemo.java:210)


Can anyone please help me?
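
The raw dump in this message starts with a SequenceFile header, which names the key and value classes stored in the file. Mahout's clustering jobs generally read their input vectors as VectorWritable values, so it can be worth confirming what a file actually declares. The sketch below reads just those two names without Hadoop on the classpath. It assumes the header layout used by SequenceFile versions 4 and later (the bytes SEQ, one version byte, then two length-prefixed class names) and handles only one-byte length prefixes. SeqHeaderPeek is a hypothetical name, not a Hadoop or Mahout class.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of Hadoop or Mahout.
class SeqHeaderPeek {

    // Returns {keyClassName, valueClassName} read from the start of the stream.
    static String[] readClassNames(InputStream raw) throws IOException {
        DataInputStream in = new DataInputStream(raw);
        byte[] magic = new byte[3];
        in.readFully(magic);
        if (magic[0] != 'S' || magic[1] != 'E' || magic[2] != 'Q') {
            throw new IOException("not a SequenceFile: missing SEQ magic");
        }
        int version = in.readByte();
        if (version < 4) {
            // Older versions are assumed to encode the names differently.
            throw new IOException("unsupported header version " + version);
        }
        return new String[] { readShortString(in), readShortString(in) };
    }

    // Reads a length-prefixed UTF-8 string; only a one-byte length (0..127) is handled.
    private static String readShortString(DataInputStream in) throws IOException {
        int length = in.readByte();
        if (length < 0) {
            throw new IOException("multi-byte length prefix not supported in this sketch");
        }
        byte[] bytes = new byte[length];
        in.readFully(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: SeqHeaderPeek <path to a local part file>");
            return;
        }
        try (InputStream in = Files.newInputStream(Paths.get(args[0]))) {
            String[] names = readClassNames(in);
            System.out.println("key class:   " + names[0]);
            System.out.println("value class: " + names[1]);
        }
    }
}
```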




Re: kmeans cluster V:0.7

Posted by Paritosh Ranjan <pr...@xebia.com>.
KMeansClusterer has been removed.

Clustering is now done through ClusterClassificationDriver, in a different and more generic way.
KMeansDriver's run method is all you need to run k-means clustering.

