You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@mahout.apache.org by "Paul Hubenig (JIRA)" <ji...@apache.org> on 2012/09/25 23:56:07 UTC
[jira] [Created] (MAHOUT-1077) apparent spectral kmeans bug
Paul Hubenig created MAHOUT-1077:
------------------------------------
Summary: apparent spectral kmeans bug
Key: MAHOUT-1077
URL: https://issues.apache.org/jira/browse/MAHOUT-1077
Project: Mahout
Issue Type: Bug
Components: Clustering
Affects Versions: 0.7
Reporter: Paul Hubenig
Using example data at: https://cwiki.apache.org/MAHOUT/spectral-clustering.html
0,0,0
0,1,0.8
0,2,0.5
1,0,0.8
1,1,0
1,2,0.9
2,0,0.5
2,1,0.9
2,2,0
Using 0.7 distribution.
mahout spectralkmeans -i file:///Users/phubenig/affExGraph.txt -o file:///Users/phubenig/spectralEx -k 2 -d 3 -x 30 -cd 0.01 -ow
12/09/05 16:14:00 INFO mapred.JobClient: Combine output records=1
12/09/05 16:14:00 INFO mapred.JobClient: Reduce output records=1
12/09/05 16:14:00 INFO mapred.JobClient: Map output records=1
12/09/05 16:14:00 INFO lanczos.LanczosSolver: 2 passes through the corpus so far...
Exception in thread "main" org.apache.mahout.math.IndexException: Index 2 is outside allowable range of [0,2)
at org.apache.mahout.math.AbstractMatrix.set(AbstractMatrix.java:479)
at org.apache.mahout.math.decomposer.lanczos.LanczosSolver.solve(LanczosSolver.java:132)
at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.runJob(DistributedLanczosSolver.java:73)
at org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.run(SpectralKMeansDriver.java:148)
at org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.run(SpectralKMeansDriver.java:86)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
at org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.main(SpectralKMeansDriver.java:53)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (MAHOUT-1077) apparent spectral kmeans bug
Posted by "Daniel Davies (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/MAHOUT-1077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13481067#comment-13481067 ]
Daniel Davies commented on MAHOUT-1077:
---------------------------------------
I'm seeing exactly the same bug on a dataset with 936 points. The LanczosSolver crashes attempting to access an array element that's one larger than the desired number of clusters specified by the -k switch. The problem was also noted by Dan Brinkley as noted in this mailing list message:
From Dan Brickley <da...@danbri.org>
Subject Spectral Kmeans wiki category data test - can you confirm if you ran it to completion?
Date Thu, 21 Jun 2012 19:54:58 GMT
> apparent spectral kmeans bug
> ----------------------------
>
> Key: MAHOUT-1077
> URL: https://issues.apache.org/jira/browse/MAHOUT-1077
> Project: Mahout
> Issue Type: Bug
> Components: Clustering
> Affects Versions: 0.7
> Reporter: Paul Hubenig
>
> Using example data at: https://cwiki.apache.org/MAHOUT/spectral-clustering.html
> 0,0,0
> 0,1,0.8
> 0,2,0.5
> 1,0,0.8
> 1,1,0
> 1,2,0.9
> 2,0,0.5
> 2,1,0.9
> 2,2,0
> Using 0.7 distribution.
> mahout spectralkmeans -i file:///Users/phubenig/affExGraph.txt -o file:///Users/phubenig/spectralEx -k 2 -d 3 -x 30 -cd 0.01 -ow
> 12/09/05 16:14:00 INFO mapred.JobClient: Combine output records=1
> 12/09/05 16:14:00 INFO mapred.JobClient: Reduce output records=1
> 12/09/05 16:14:00 INFO mapred.JobClient: Map output records=1
> 12/09/05 16:14:00 INFO lanczos.LanczosSolver: 2 passes through the corpus so far...
> Exception in thread "main" org.apache.mahout.math.IndexException: Index 2 is outside allowable range of [0,2)
> at org.apache.mahout.math.AbstractMatrix.set(AbstractMatrix.java:479)
> at org.apache.mahout.math.decomposer.lanczos.LanczosSolver.solve(LanczosSolver.java:132)
> at org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.runJob(DistributedLanczosSolver.java:73)
> at org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.run(SpectralKMeansDriver.java:148)
> at org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.run(SpectralKMeansDriver.java:86)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
> at org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.main(SpectralKMeansDriver.java:53)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
> at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
> at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira