Posted to user@mahout.apache.org by xenlee - Zerg <sc...@gmail.com> on 2014/08/07 10:28:53 UTC

Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

Hi,

I am following the Apache Mahout Cookbook tutorials and tried to run a
classifier on 20newsGroup. I managed to convert the files to SequenceFiles
(SF), run the TF-IDF algorithm, and split the data into train/test sets. But
when I finally build my model with trainnb, I get the error below.
Has anyone run into this before?

Regards,
xenlee -


[mapr@fb-mapr1 new]$ mahout trainnb -i /input/new/20news-train-vectors -el -o /input/new/model -li /input/new/labelindex -ow
No MAHOUT_CONF_DIR found
MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
Running on hadoop, using /opt/mapr/hadoop/hadoop-0.20.2/bin/hadoop and HADOOP_CONF_DIR=/opt/mapr/hadoop/hadoop-0.20.2/conf
MAHOUT-JOB: /opt/mapr/mahout/mahout-0.9/mahout-examples-0.9-mapr-job.jar
14/08/07 08:22:44 WARN driver.MahoutDriver: No trainnb.props found on classpath, will use command-line arguments only
14/08/07 08:22:44 INFO common.AbstractJob: Command line arguments: {--alphaI=[1.0], --endPhase=[2147483647], --extractLabels=null, --input=[/input/new/20news-train-vectors], --labelIndex=[/input/new/labelindex], --output=[/input/new/model], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
14/08/07 08:22:44 INFO zlib.ZlibFactory: Successfully loaded & initialized native-zlib library
14/08/07 08:22:44 INFO compress.CodecPool: Got brand-new decompressor
Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 1
        at org.apache.mahout.classifier.naivebayes.BayesUtils.writeLabelIndex(BayesUtils.java:123)
        at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.createLabelIndex(TrainNaiveBayesJob.java:180)
        at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.run(TrainNaiveBayesJob.java:94)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
        at org.apache.mahout.classifier.naivebayes.training.TrainNaiveBayesJob.main(TrainNaiveBayesJob.java:64)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

Re: Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

Posted by Ted Dunning <te...@gmail.com>.
Piero,

We all sympathize with that crunch time feeling.

Good luck!




On Tue, Aug 12, 2014 at 12:50 PM, <pg...@gmail.com> wrote:

> Dear Ted,
>
> I have had many requests to update the code on GitHub.
>
> Unfortunately I am in a very busy period.
>
> However, as soon as I have some updates I will push them to GitHub.
>
> Piero

Re: Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

Posted by pg...@gmail.com.
Dear Ted,

I have had many requests to update the code on GitHub.

Unfortunately I am in a very busy period.

However, as soon as I have some updates I will push them to GitHub.

Piero

-----Original message-----
From: Ted Dunning [mailto:ted.dunning@gmail.com]
Sent: 08 August 2014 20:56
To: user@mahout.apache.org
Subject: Re: Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

Piero,

It might help if you put your examples with updates on github so that you can point people to that.


Re: Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

Posted by Ted Dunning <te...@gmail.com>.
Piero,

It might help if you put your examples with updates on github so that you
can point people to that.




On Thu, Aug 7, 2014 at 2:30 AM, Piero Giacomelli <pg...@gmail.com> wrote:

> OK, nice. If you have more problems, please do not hesitate to ask me.
>
> Piero Giacomelli

Re: Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

Posted by Piero Giacomelli <pg...@gmail.com>.
OK, nice. If you have more problems, please do not hesitate to ask me.

Piero Giacomelli



2014-08-07 11:29 GMT+02:00 xenlee - Zerg <sc...@gmail.com>:

> I solved my problem: I didn't split the right file.



-- 
Piero Giacomelli, Italia
phone:+39 34  71 02 42 95
e-mail: pgiacome@gmail.com
skype: pgiacome
twitter: @pierogiacomelli
my books:
Apache Mahout Cookbook <http://www.packtpub.com/apache-mahout-cookbook/book>
HornetQ Messaging Developer's Guide
<http://www.amazon.com/dp/1849518408/?tag=packtpubli-20>

Re: Problem with Mahout Text Classifier following Apache Mahout Cookbook examples

Posted by xenlee - Zerg <sc...@gmail.com>.
I solved my problem: I didn't split the right file.
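
For anyone who hits the same trace: the exception is raised in BayesUtils.writeLabelIndex while trainnb runs with -el (--extractLabels), which suggests the label is read from each vector's key. Assuming the label is the second "/"-separated field of the key (as in "/alt.atheism/51060"), a key without any "/" would fail with exactly "ArrayIndexOutOfBoundsException: 1". That rule is an assumption about Mahout 0.9 internals, not something confirmed in this thread. The Python sketch below only mimics it on made-up keys; label_from_key, keys_without_label and the sample keys are hypothetical names for illustration, not Mahout APIs.

```python
from typing import Iterable, List, Optional


def label_from_key(key: str) -> Optional[str]:
    """Return the second "/"-separated field of a key, or None if it is missing.

    Assumption: this mirrors how the label is extracted when -el is used.
    """
    parts = key.split("/")
    if len(parts) < 2 or not parts[1]:
        return None
    return parts[1]


def keys_without_label(keys: Iterable[str]) -> List[str]:
    """Collect the keys from which no label could be extracted."""
    return [k for k in keys if label_from_key(k) is None]


# Hypothetical keys: two in the "/<label>/<doc id>" layout, one without a label.
sample_keys = ["/alt.atheism/51060", "/sci.space/60151", "51060"]

bad = keys_without_label(sample_keys)
print(bad)  # ['51060']
```

If the input handed to trainnb has keys like the last one, it is probably not the output of the vectorization and split steps on the labelled data. Dumping a few keys with mahout seqdumper before training is one way to check.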
