Posted to user@mahout.apache.org by mahout-newbie <ra...@gmail.com> on 2012/05/14 03:33:23 UTC

Exception running 20newsgroups example

When I try to run the 20 newsgroups example in local mode under Cygwin,
Mahout creates a work directory under /tmp, but then an exception is thrown
indicating that the training data set is not found. It is looking for it
with a DOS-style path. I can see the directory and data when I navigate to
that directory.

Can someone please shed some light on what I'm missing here? The output is
below:

Thanks

$ sh classify-20newsgroups.sh
Please select a number to choose the corresponding task to run
1. naivebayes
2. sgd
3. clean -- cleans up the work area in /tmp/mahout-work-Administrator
Enter your choice : 1
ok. You chose 1 and we'll use naivebayes
*creating work directory at /tmp/mahout-work-Administrator*
Preparing Training Data
MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
no HADOOP_HOME set, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/C:/cygwin/usr/local/mahout/examples/target/mahout-examples-0.6-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/cygwin/usr/local/mahout/examples/target/dependency/slf4j-jcl-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/C:/cygwin/usr/local/mahout/examples/target/dependency/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
12/05/13 20:26:46 WARN driver.MahoutDriver: No org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups.props found on classpath, will use command-line arguments only
Exception in thread "main" java.io.FileNotFoundException: *Can't find input directory \tmp\mahout-work-Administrator\20news-bydate\20news-bydate-train*
        at org.apache.mahout.classifier.bayes.PrepareTwentyNewsgroups.main(PrepareTwentyNewsgroups.java:92)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:188)

--
View this message in context: http://lucene.472066.n3.nabble.com/Exception-running-20newsgroups-example-tp3983590.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Exception running 20newsgroups example

Posted by Lance Norskog <go...@gmail.com>.
In this case it is looking for C:\tmp. Do you have one? It does not
come standard with Windows; you have to create it.

This particular code path works, since bin/mahout does not run any
Cygwin programs, only Java. I have used it a lot.
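
The mismatch can be illustrated with a small sketch. It uses Python's
pathlib only because PureWindowsPath applies Windows path rules on any
platform; it is not Mahout code, and it assumes (from the jar paths in
the log) that Cygwin is installed in C:\cygwin and that the JVM's current
drive is C:. The variable names are made up for the illustration.

```python
from pathlib import PureWindowsPath

# Path the example script hands to Java (taken from the log output).
posix_style = "/tmp/mahout-work-Administrator/20news-bydate/20news-bydate-train"

# Parsed under Windows rules the path is rooted but has no drive letter,
# which is why the exception message shows it with backslashes.
as_windows = PureWindowsPath(posix_style)

# Assumptions: current drive is C:, Cygwin lives in C:\cygwin.
current_drive = PureWindowsPath("C:/")
cygwin_root = PureWindowsPath("C:/cygwin")

# A drive-less rooted path picks up the drive of the current directory.
seen_by_java = current_drive / as_windows

# Cygwin instead maps "/" onto its own install directory.
seen_by_cygwin = cygwin_root / posix_style.lstrip("/")

print(as_windows)
print(seen_by_java)
print(seen_by_cygwin)
```

Under those assumptions the script creates the data below C:\cygwin\tmp,
while the Java process looks below C:\tmp, so the directory is visible
from the Cygwin shell but not where Java searches for it.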

On Mon, May 14, 2012 at 9:19 AM, Ted Dunning <te...@gmail.com> wrote:
> What you are missing is a Linux compatible environment.  Running programs under Cygwin can be pretty difficult because of the path name insanity that often ensues.



-- 
Lance Norskog
goksron@gmail.com

Re: Exception running 20newsgroups example

Posted by Ted Dunning <te...@gmail.com>.
What you are missing is a Linux-compatible environment. Running programs under Cygwin can be pretty difficult because of the path name insanity that often ensues.

Sent from my iPhone
