Posted to dev@mahout.apache.org by "Mishari SH (JIRA)" <ji...@apache.org> on 2014/12/09 20:16:12 UTC

[jira] [Created] (MAHOUT-1632) Please help me im stuck on using 20 newsgroups example on Windows

Mishari SH created MAHOUT-1632:
----------------------------------

             Summary: Please help me im stuck on using 20 newsgroups example on Windows
                 Key: MAHOUT-1632
                 URL: https://issues.apache.org/jira/browse/MAHOUT-1632
             Project: Mahout
          Issue Type: Question
            Reporter: Mishari SH


Hello there, I've been using Hadoop and Mahout on Windows. I started the Hadoop cluster first so that Mahout could use it, then ran Mahout's 20newsgroups example, but it throws an exception saying the path is "not a valid DFS filename". The full console session is shown below from the beginning:

Microsoft Windows [Version 6.1.7601]
Copyright (c) 2009 Microsoft Corporation.  All rights reserved.

C:\Users\Admin>cd\

C:\>cd mahout

C:\mahout>cd examples

C:\mahout\examples>cd bin

C:\mahout\examples\bin>classify-20newsgroups.sh
Welcome to Git (version 1.9.4-preview20140815)


Run 'git help git' to display the help index.
Run 'git help <command>' to display help for specific commands.
Please select a number to choose the corresponding task to run
1. cnaivebayes
2. naivebayes
3. sgd
4. clean -- cleans up the work area in /tmp/mahout-work-
Enter your choice : 2
ok. You chose 2 and we'll use naivebayes
creating work directory at /tmp/mahout-work-
+ echo 'Preparing 20newsgroups data'
Preparing 20newsgroups data
+ rm -rf /tmp/mahout-work-/20news-all
+ mkdir /tmp/mahout-work-/20news-all
+ cp -R /tmp/mahout-work-/20news-bydate/20news-bydate-test/alt.atheism /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.graphics /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.os.ms-windows.misc /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.sys.ibm.pc.hardware /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.sys.mac.hardware /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.windows.x /tmp/mahout-work-/20news-bydate/20news-bydate-test/misc.forsale /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.autos /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.motorcycles /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.sport.baseball /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.sport.hockey /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.crypt /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.electronics /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.med /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.space /tmp/mahout-work-/20news-bydate/20news-bydate-test/soc.religion.christian /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.politics.guns /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.politics.mideast /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.politics.misc /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.religion.misc /tmp/mahout-work-/20news-bydate/20news-bydate-train/alt.atheism /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.graphics /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.os.ms-windows.misc /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.sys.ibm.pc.hardware /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.sys.mac.hardware /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.windows.x /tmp/mahout-work-/20news-bydate/20news-bydate-train/misc.forsale /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.autos /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.motorcycles /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.sport.baseball /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.sport.hockey /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.crypt /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.electronics /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.med /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.space /tmp/mahout-work-/20news-bydate/20news-bydate-train/soc.religion.christian /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.politics.guns /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.politics.mideast /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.politics.misc /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.religion.misc /tmp/mahout-work-/20news-all
+ '[' 'C:\hadp' '!=' '' ']'
+ '[' '' == '' ']'
+ echo 'Copying 20newsgroups data to HDFS'
Copying 20newsgroups data to HDFS
+ set +e
+ 'C:\hadp/bin/hadoop' dfs -rmr /tmp/mahout-work-/20news-all
/c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

/c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
rmr: DEPRECATED: Please use 'rm -r' instead.
-rmr: Pathname /C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all from hdfs://localhost:9000/C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all is not a valid DFS filename.
Usage: hadoop fs [generic options] -rmr
+ set -e
+ 'C:\hadp/bin/hadoop' dfs -put /tmp/mahout-work-/20news-all /tmp/mahout-work-/20news-all
/c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

/c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
-put: Pathname /C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all from hdfs://localhost:9000/C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all is not a valid DFS filename.
Usage: hadoop fs [generic options] -put [-f] [-p] <localsrc> ... <dst>
+ echo 'Creating sequence files from 20newsgroups data'
Creating sequence files from 20newsgroups data
+ ./bin/mahout seqdirectory -i /tmp/mahout-work-/20news-all -o /tmp/mahout-work-/20news-seq -ow
/c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
Running on hadoop, using \hadp/bin/hadoop and HADOOP_CONF_DIR=
MAHOUT-JOB: /c/mahout/examples/target/mahout-examples-0.9-job.jar
/c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
14/12/09 21:48:57 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all], --keyPrefix=[], --method=[mapreduce], --output=[C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-seq], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
Exception in thread "main" java.lang.IllegalArgumentException: Pathname /C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-seq from C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-seq is not a valid DFS filename.
        at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:187)
        at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:101)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1068)
        at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064)
        at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
        at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1064)
        at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
        at org.apache.mahout.common.HadoopUtil.delete(HadoopUtil.java:192)
        at org.apache.mahout.common.HadoopUtil.delete(HadoopUtil.java:200)
        at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:84)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
        at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:65)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
        at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
        at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
        at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:212)

C:\mahout\examples\bin>
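
A note on what the log seems to show: the script passes /tmp/mahout-work-/20news-all, but the path that reaches Hadoop is /C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all. A likely reading is that the Git bash shell (see the "Welcome to Git" banner) rewrote /tmp into the Windows temp directory before Hadoop saw it, and HDFS then rejected the name because of the drive-letter colon. The sketch below is only a simplified Python paraphrase of that kind of path check, not Hadoop's actual code; the function name is_valid_dfs_name is hypothetical and the exact rules HDFS applies are an assumption here.

```python
def is_valid_dfs_name(src: str) -> bool:
    """Simplified, assumed approximation of an HDFS path-name check."""
    # The path must be absolute (start with "/").
    if not src.startswith("/"):
        return False
    # Reject any component that is "." or "..", or that contains ":".
    for component in src.split("/"):
        if component in (".", "..") or ":" in component:
            return False
    return True


# Both paths are copied from the console session above.
path_in_script = "/tmp/mahout-work-/20news-all"
path_seen_by_hadoop = "/C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all"

print(is_valid_dfs_name(path_in_script))       # True
print(is_valid_dfs_name(path_seen_by_hadoop))  # False: "C:" contains a colon
```

If that reading is right, the paths would need to reach Hadoop without being rewritten to a C:/ form for the -rmr, -put and seqdirectory steps to get past this check.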

Please help me. I'm new to big data tools and I need this issue resolved as soon as possible.

Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)