Posted to dev@mahout.apache.org by "Andrew Palumbo (JIRA)" <ji...@apache.org> on 2015/03/06 02:36:38 UTC

[jira] [Updated] (MAHOUT-1632) Please help me, I'm stuck using the 20 newsgroups example on Windows

     [ https://issues.apache.org/jira/browse/MAHOUT-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Palumbo updated MAHOUT-1632:
-----------------------------------
    Labels: legacy  (was: )

> Please help me, I'm stuck using the 20 newsgroups example on Windows
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-1632
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1632
>             Project: Mahout
>          Issue Type: Question
>            Reporter: Mishari SH
>              Labels: legacy
>
> Hello there. I've been running Hadoop and Mahout on Windows. I started the Hadoop cluster first so that Mahout could use it, then ran the 20 newsgroups example, but it throws a "not a valid DFS filename" exception, as shown below in detail from the beginning:
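A likely mechanism for the mangled paths in the log below (an assumption based on the Git Bash banner, not something the log states): MSYS rewrites POSIX-style arguments such as /tmp/... into Windows paths like C:/Users/Admin/AppData/Local/Temp/... before they reach the hadoop command, and HDFS then rejects the drive letter. As an illustration only, a hypothetical helper (`strip_drive` is not part of Mahout or Hadoop) that removes such a prefix:

```shell
# strip_drive is an illustrative helper, not a Mahout/Hadoop tool:
# it removes a leading Windows drive prefix ("/C:" or "C:") from a path,
# which is the part HDFS refuses to accept.
strip_drive() {
  printf '%s\n' "$1" | sed -E 's#^/?[A-Za-z]:##'
}

strip_drive "/C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all"
# -> /Users/Admin/AppData/Local/Temp/mahout-work-/20news-all
```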
> Microsoft Windows [Version 6.1.7601]
> Copyright (c) 2009 Microsoft Corporation.  All rights reserved.
> C:\Users\Admin>cd\
> C:\>cd mahout
> C:\mahout>cd examples
> C:\mahout\examples>cd bin
> C:\mahout\examples\bin>classify-20newsgroups.sh
> Welcome to Git (version 1.9.4-preview20140815)
> Run 'git help git' to display the help index.
> Run 'git help <command>' to display help for specific commands.
> Please select a number to choose the corresponding task to run
> 1. cnaivebayes
> 2. naivebayes
> 3. sgd
> 4. clean -- cleans up the work area in /tmp/mahout-work-
> Enter your choice : 2
> ok. You chose 2 and we'll use naivebayes
> creating work directory at /tmp/mahout-work-
> + echo 'Preparing 20newsgroups data'
> Preparing 20newsgroups data
> + rm -rf /tmp/mahout-work-/20news-all
> + mkdir /tmp/mahout-work-/20news-all
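The trailing hyphen in /tmp/mahout-work- suggests the script builds its work directory as /tmp/mahout-work-${USER} and that USER was empty in this Git Bash session. That is an inference from the log, not something it states; a minimal sketch of the suspected expansion:

```shell
# Sketch of the suspected expansion; "user" stands in for the script's
# ${USER}. With an empty value the directory name ends in a bare hyphen,
# matching the "creating work directory at /tmp/mahout-work-" line above.
user=""
work_dir="/tmp/mahout-work-${user}"
echo "$work_dir"

# With the variable set, each user would get a distinct work area:
user="admin"
echo "/tmp/mahout-work-${user}"
```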
> + cp -R /tmp/mahout-work-/20news-bydate/20news-bydate-test/alt.atheism
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.graphics
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.os.ms-windows.misc
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.sys.ibm.pc.hardware
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.sys.mac.hardware
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/comp.windows.x
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/misc.forsale
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.autos
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.motorcycles
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.sport.baseball
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/rec.sport.hockey
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.crypt
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.electronics
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.med
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/sci.space
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/soc.religion.christian
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.politics.guns
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.politics.mideast
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.politics.misc
> /tmp/mahout-work-/20news-bydate/20news-bydate-test/talk.religion.misc
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/alt.atheism
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.graphics
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.os.ms-windows.misc
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.sys.ibm.pc.hardware
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.sys.mac.hardware
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/comp.windows.x
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/misc.forsale
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.autos
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.motorcycles
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.sport.baseball
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/rec.sport.hockey
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.crypt
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.electronics
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.med
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/sci.space
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/soc.religion.christian
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.politics.guns
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.politics.mideast
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.politics.misc
> /tmp/mahout-work-/20news-bydate/20news-bydate-train/talk.religion.misc
> /tmp/mahout-work-/20news-all
> + '[' 'C:\hadp' '!=' '' ']'
> + '[' '' == '' ']'
> + echo 'Copying 20newsgroups data to HDFS'
> Copying 20newsgroups data to HDFS
> + set +e
> + 'C:\hadp/bin/hadoop' dfs -rmr /tmp/mahout-work-/20news-all
> /c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> /c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
> rmr: DEPRECATED: Please use 'rm -r' instead.
> -rmr: Pathname /C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all from hdfs://localhost:9000/C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all is not a valid DFS filename.
> Usage: hadoop fs [generic options] -rmr
> + set -e
> + 'C:\hadp/bin/hadoop' dfs -put /tmp/mahout-work-/20news-all /tmp/mahout-work-/20news-all
> /c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
> DEPRECATED: Use of this script to execute hdfs command is deprecated.
> Instead use the hdfs command for it.
> /c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
> -put: Pathname /C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all from hdfs://localhost:9000/C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all is not a valid DFS filename.
> Usage: hadoop fs [generic options] -put [-f] [-p] <localsrc> ... <dst>
> + echo 'Creating sequence files from 20newsgroups data'
> Creating sequence files from 20newsgroups data
> + ./bin/mahout seqdirectory -i /tmp/mahout-work-/20news-all -o /tmp/mahout-work-/20news-seq -ow
> /c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
> Running on hadoop, using \hadp/bin/hadoop and HADOOP_CONF_DIR=
> MAHOUT-JOB: /c/mahout/examples/target/mahout-examples-0.9-job.jar
> /c/hadp/etc/hadoop/hadoop-env.sh: line 103: /c/hadp/bin: is a directory
> 14/12/09 21:48:57 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all], --keyPrefix=[], --method=[mapreduce], --output=[C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-seq], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
> Exception in thread "main" java.lang.IllegalArgumentException: Pathname /C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-seq from C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-seq is not a valid DFS filename.
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getPathName(DistributedFileSystem.java:187)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.access$000(DistributedFileSystem.java:101)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1068)
>         at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1064)
>         at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1064)
>         at org.apache.hadoop.fs.FileSystem.exists(FileSystem.java:1398)
>         at org.apache.mahout.common.HadoopUtil.delete(HadoopUtil.java:192)
>         at org.apache.mahout.common.HadoopUtil.delete(HadoopUtil.java:200)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:84)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:84)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:65)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:72)
>         at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:145)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:153)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>         at java.lang.reflect.Method.invoke(Method.java:597)
>         at org.apache.hadoop.util.RunJar.main(RunJar.java:212)
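The "not a valid DFS filename" errors above all come down to HDFS refusing paths whose components contain a colon, as Windows drive letters do. The real check lives inside Hadoop (behind DistributedFileSystem.getPathName in the stack trace); the shell function below is only a simplified, stand-alone approximation of its effect, not the actual implementation:

```shell
# Illustrative approximation of HDFS filename validation: a DFS path must
# be absolute, and no part of it may contain ':' (so "C:" can never appear).
is_valid_dfs_name() {
  case "$1" in
    /*) : ;;          # absolute: ok so far
    *)  return 1 ;;   # relative paths are rejected
  esac
  case "$1" in
    *:*) return 1 ;;  # a drive letter like "C:" makes it invalid
  esac
  return 0
}

is_valid_dfs_name "/tmp/mahout-work-/20news-all" && echo "valid"
is_valid_dfs_name "/C:/Users/Admin/AppData/Local/Temp/mahout-work-/20news-all" || echo "rejected"
```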
> C:\mahout\examples\bin>
> Please help me; I'm new to the big data tools and I need this issue resolved as soon as possible.
> Thank you.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)