Posted to user@mahout.apache.org by Ken Williams <zo...@hotmail.com> on 2011/04/13 13:48:40 UTC
20NewsGroups Error: Illegal Capacity: -40
Hi All,
I'm having trouble getting the 20News-Groups
(https://cwiki.apache.org/confluence/display/MAHOUT/Twenty+Newsgroups,
and https://cwiki.apache.org/MAHOUT/twenty-newsgroups.html)
example to run.
I've downloaded the data and tried to train the Naive Bayes classifier,
but when I ran the 'trainclassifier' command I got this error message:
hadoop@kdevlinux:/usr/local/mahout$ mahout trainclassifier -i examples/bin/work/20news-bydate/bayes-train-input -o examples/bin/work/20news-bydate/bayes-model -type bayes -ng 1 -source hdfs
Running on hadoop, using HADOOP_HOME=/usr/local/hadoop
No HADOOP_CONF_DIR set, using /usr/local/hadoop/src/conf
11/04/13 09:16:29 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.utils.eval.InMemoryFactorizationEvaluator
11/04/13 09:16:29 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.utils.eval.ParallelFactorizationEvaluator
11/04/13 09:16:29 WARN driver.MahoutDriver: Unable to add class: org.apache.mahout.utils.eval.DatasetSplitter
11/04/13 09:16:29 INFO bayes.TrainClassifier: Training Bayes Classifier
11/04/13 09:16:29 INFO bayes.BayesDriver: Reading features...
11/04/13 09:16:30 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/04/13 09:16:31 INFO mapred.FileInputFormat: Total input paths to process : 20
Exception in thread "main" java.lang.IllegalArgumentException: Illegal Capacity: -40
    at java.util.ArrayList.<init>(ArrayList.java:110)
    at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:216)
    at org.apache.hadoop.mapred.JobClient.writeOldSplits(JobClient.java:810)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:781)
    at org.apache.hadoop.mapred.JobClient.submitJob(JobClient.java:730)
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1249)
    at org.apache.mahout.classifier.bayes.mapreduce.common.BayesFeatureDriver.runJob(BayesFeatureDriver.java:63)
    at org.apache.mahout.classifier.bayes.mapreduce.bayes.BayesDriver.runJob(BayesDriver.java:47)
    at org.apache.mahout.classifier.bayes.TrainClassifier.trainNaiveBayes(TrainClassifier.java:54)
    at org.apache.mahout.classifier.bayes.TrainClassifier.main(TrainClassifier.java:162)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
    at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
    at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
I thought I might have entered the command incorrectly, but then I found the
'build-20news-bayes.sh' shell script, and running that gives the same
exception.
Until now I've been running Hadoop 0.20.2 smoothly on a 4-node cluster. All
nodes are Debian machines using the sun-java6-* packages, and I'm running
Mahout trunk, checked out of the svn repository
(svn co http://svn.apache.org/repos/asf/mahout/trunk) today.
All the <newsgroup>.txt files seem to have been created and uploaded
to HDFS correctly ('hadoop dfs -lsr examples/bin/work').
I'm not sure what to try next. Any help would be very welcome.
Ken
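For reference, the top two frames of the trace show that the exception is raised by the java.util.ArrayList constructor, called from FileInputFormat.getSplits with a negative initial capacity. The minimal Java sketch below reproduces only that JDK-level behaviour. The class name IllegalCapacityDemo, the helper tryAllocate, and the parameter name numSplits are assumptions made up for illustration; this is not Mahout or Hadoop code, and it does not show where the -40 comes from.

```java
import java.util.ArrayList;
import java.util.List;

public class IllegalCapacityDemo {

    // Hypothetical helper: allocates a list sized by a requested count,
    // and returns either a short status string or the exception message.
    static String tryAllocate(int numSplits) {
        try {
            // ArrayList(int) rejects a negative initial capacity.
            List<String> splits = new ArrayList<String>(numSplits);
            return "ok: " + splits.size();
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryAllocate(20));   // non-negative capacity is accepted
        System.out.println(tryAllocate(-40));  // negative capacity is rejected
    }
}
```

If this reading is right, the thing to look for is whatever supplies a negative count to the job before the splits are computed, rather than the input files themselves.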
Re: 20NewsGroups Error: Illegal Capacity: -40
Posted by Ted Dunning <te...@gmail.com>.
I filed https://issues.apache.org/jira/browse/MAHOUT-669 for this.
Anyone who would like to is welcome to file a patch fixing one or more of
the scripts.
On Wed, Apr 13, 2011 at 9:34 AM, Ken Williams <zo...@hotmail.com> wrote:
> Ted Dunning <ted.dunning <at> gmail.com> writes:
>
> > This may be a bit of regression.
>
> Thanks for the reply.
>
> Just out of interest, I also reckon your 'build-cluster-syntheticcontrol.sh'
> script should be a bash script (#!/bin/bash) rather than a standard
> shell (#!/bin/sh) script.
>
> $ trunk/examples/bin/build-cluster-syntheticcontrol.sh
> trunk/examples/bin/build-cluster-syntheticcontrol.sh: 28: Syntax error: "(" unexpected (expecting "fi")
> $
>
> Regards,
>
> Ken
>
> > On Wed, Apr 13, 2011 at 4:48 AM, Ken Williams <zoo9000 <at> hotmail.com> wrote:
> >
> > > I'm not sure what to try next. Any help would be very welcome.
Re: 20NewsGroups Error: Illegal Capacity: -40
Posted by Ted Dunning <te...@gmail.com>.
Very good idea.
On Wed, Apr 13, 2011 at 9:49 AM, Frank Scholten <sc...@gmail.com> wrote:
> This sh error also occurred for the reuters script but has been fixed.
> Maybe good to update all scripts to bash?
>
> On Apr 13, 2011, at 18:34, Ken Williams <zo...@hotmail.com> wrote:
>
> > Ted Dunning <ted.dunning <at> gmail.com> writes:
> >
> >> This may be a bit of regression.
> >
> > Thanks for the reply.
> >
> > Just out of interest, I also reckon your 'build-cluster-syntheticcontrol.sh'
> > script should be a bash script (#!/bin/bash) rather than a standard
> > shell (#!/bin/sh) script.
> >
> > $ trunk/examples/bin/build-cluster-syntheticcontrol.sh
> > trunk/examples/bin/build-cluster-syntheticcontrol.sh: 28: Syntax error: "(" unexpected (expecting "fi")
> > $
> >
> > Regards,
> >
> > Ken
> >
> >> On Wed, Apr 13, 2011 at 4:48 AM, Ken Williams <zoo9000 <at> hotmail.com> wrote:
> >>
> >>> I'm not sure what to try next. Any help would be very welcome.
Re: 20NewsGroups Error: Illegal Capacity: -40
Posted by Frank Scholten <sc...@gmail.com>.
This sh error also occurred for the reuters script but has been fixed. Maybe good to update all scripts to bash?
On Apr 13, 2011, at 18:34, Ken Williams <zo...@hotmail.com> wrote:
> Ted Dunning <ted.dunning <at> gmail.com> writes:
>
>>
>> This may be a bit of regression.
>
> Thanks for the reply.
>
> Just out of interest, I also reckon your 'build-cluster-syntheticcontrol.sh'
> script should be a bash script (#!/bin/bash) rather than a standard
> shell (#!/bin/sh) script.
>
>
> $ trunk/examples/bin/build-cluster-syntheticcontrol.sh
> trunk/examples/bin/build-cluster-syntheticcontrol.sh: 28: Syntax error: "(" unexpected (expecting "fi")
> $
>
>
> Regards,
>
> Ken
>
>
>>
>> On Wed, Apr 13, 2011 at 4:48 AM, Ken Williams <zoo9000 <at> hotmail.com> wrote:
>>
>>> I'm not sure what to try next. Any help would be very welcome.
Re: 20NewsGroups Error: Illegal Capacity: -40
Posted by Ken Williams <zo...@hotmail.com>.
Ted Dunning <ted.dunning <at> gmail.com> writes:
>
> This may be a bit of regression.
Thanks for the reply.
Just out of interest, I also reckon your 'build-cluster-syntheticcontrol.sh'
script should be a bash script (#!/bin/bash) rather than a standard
shell (#!/bin/sh) script.
$ trunk/examples/bin/build-cluster-syntheticcontrol.sh
trunk/examples/bin/build-cluster-syntheticcontrol.sh: 28: Syntax error: "(" unexpected (expecting "fi")
$
Regards,
Ken
>
> On Wed, Apr 13, 2011 at 4:48 AM, Ken Williams <zoo9000 <at> hotmail.com> wrote:
>
> > I'm not sure what to try next. Any help would be very welcome.
> >
>
Re: 20NewsGroups Error: Illegal Capacity: -40
Posted by Ted Dunning <te...@gmail.com>.
This may be a bit of a regression.
On Wed, Apr 13, 2011 at 4:48 AM, Ken Williams <zo...@hotmail.com> wrote:
> I'm not sure what to try next. Any help would be very welcome.
>