Posted to user@mahout.apache.org by jeanbabyxu <je...@aexp.com> on 2012/03/21 19:11:47 UTC

Error Running mahout-core-0.5-job.jar

I tried to run mahout in Hadoop using the following command, 

[jxu13@lppma692 hadoop-0.20.2]$ bin/hadoop jar
/opt/mapr/mahout/mahout-0.5/core/target/mahout-core-0.5-job.jar
org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.input.dir=input/input.txt
--Dmapred.output.dir=output --usersFile input/users.txt --booleanData

But got the following error msg:
12/03/21 10:55:59 ERROR common.AbstractJob: Unexpected
--Dmapred.output.dir=output while processing Job-Specific Options:
usage: <command> [Generic Options] [Job-Specific Options]

I have copied in the input and users files to HDFS:
[jxu13@lppma692 hadoop-0.20.2]$ bin/hadoop fs -ls /user/jxu13/input
Found 2 items
-rwxrwxrwx   3 jxu13 rimegg          1 2012-03-20 12:08
/user/jxu13/input/users.txt
-rwxrwxrwx   3 jxu13 rimegg       1732 2012-03-20 11:58
/user/jxu13/input/input.txt

What is the default output directory: is it /user/jxu13/output?

[jxu13@lppma692 hadoop-0.20.2]$ bin/hadoop fs -ls /user/jxu13/output
ls: Cannot access /user/jxu13/output: No such file or directory.

Is this a Hadoop set-up/configuration problem?

Any advice would be appreciated.


--
View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3846385.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Error Running mahout-core-0.5-job.jar

Posted by Sean Owen <sr...@gmail.com>.
It's -Dmapred.output.dir=output, not --Dmapred.output.dir=output (one dash,
not two), but that's not even the real problem.
I don't think you can specify -D options this way, as they are JVM
arguments. You need to configure these in Hadoop's config files.
This is not specific to Mahout.

On Wed, Mar 21, 2012 at 6:11 PM, jeanbabyxu <je...@aexp.com> wrote:

> I tried to run mahout in Hadoop using the following command,
>
> [jxu13@lppma692 hadoop-0.20.2]$ bin/hadoop jar
> /opt/mapr/mahout/mahout-0.5/core/target/mahout-core-0.5-job.jar
> org.apache.mahout.cf.taste.hadoop.item.RecommenderJob -Dmapred.input.dir=input/input.txt
> --Dmapred.output.dir=output --usersFile input/users.txt --booleanData
>
> But got the following error msg:
> 12/03/21 10:55:59 ERROR common.AbstractJob: Unexpected
> --Dmapred.output.dir=output while processing Job-Specific Options:
> usage: <command> [Generic Options] [Job-Specific Options]
>
> I have copied in the input and users files to HDFS:
> [jxu13@lppma692 hadoop-0.20.2]$ bin/hadoop fs -ls /user/jxu13/input
> Found 2 items
> -rwxrwxrwx   3 jxu13 rimegg          1 2012-03-20 12:08
> /user/jxu13/input/users.txt
> -rwxrwxrwx   3 jxu13 rimegg       1732 2012-03-20 11:58
> /user/jxu13/input/input.txt
>
> What is the default output directory: is it /user/jxu13/output?
>
> [jxu13@lppma692 hadoop-0.20.2]$ bin/hadoop fs -ls /user/jxu13/output
> ls: Cannot access /user/jxu13/output: No such file or directory.
>
> Is this a Hadoop set-up/configuration problem?
>
> Any advice would be appreciated.
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3846385.html
> Sent from the Mahout User List mailing list archive at Nabble.com.
>

Re: Error Running mahout-core-0.5-job.jar

Posted by Sean Owen <sr...@gmail.com>.
That pretty much means what it says: delete the temp directory.

On Thu, Mar 22, 2012 at 6:06 PM, jeanbabyxu <je...@aexp.com> wrote:
> Thanks so much tianwild for pointing out the typo. Now it's running but I got
> a different error msg:
>
> Exception in thread "main"
> org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
> temp/itemIDIndex already exists
>
> Any idea how to resolve this issue?
>
> Many thanks.
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3849301.html
> Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Error Running mahout-core-0.5-job.jar

Posted by Isabel Drost <is...@apache.org>.
On 22.03.2012 Paritosh Ranjan wrote:
> You can also use the HadoopUtil.delete(conf, paths) API or the -ow
> (overwrite) flag (if available for that job).

If that flag isn't available for the job you are looking at, that might be a 
good chance to submit a bug report and mark it as "suitable for beginners" - 
just mark it as MAHOUT_INTRO_CONTRIBUTE  in JIRA.

Isabel

Re: Error Running mahout-core-0.5-job.jar

Posted by Sean Owen <sr...@gmail.com>.
Yes. This prevents accidental overwrite, and mimics how Hadoop/HDFS
generally act.

On Thu, Mar 22, 2012 at 6:58 PM, jeanbabyxu <je...@aexp.com> wrote:
> I was able to manually clear out the output directory by using
>
> bin/hadoop dfs -rmr output.
>
> But do we have to remove all content in the output directory manually every
> time we run mahout?
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3849480.html
> Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Error Running mahout-core-0.5-job.jar

Posted by Paritosh Ranjan <pr...@xebia.com>.
You can also use the HadoopUtil.delete(conf, paths) API or the -ow
(overwrite) flag (if available for that job).
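For instance, if the job does register that flag, a rerun could look like this (a hypothetical invocation, not verbatim from the thread; check the job's --help output first, since not every Mahout job accepts --overwrite):

```shell
# --overwrite (-ow) asks the job to clear its output directory before running;
# fall back to an explicit "hadoop fs -rmr" if the job rejects the flag.
bin/hadoop jar mahout-core-0.5-job.jar \
  org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
  -Dmapred.input.dir=input/input.txt \
  -Dmapred.output.dir=output \
  --usersFile input/users.txt --booleanData --overwrite
```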

On 23-03-2012 00:28, jeanbabyxu wrote:
> I was able to manually clear out the output directory by using
>
> bin/hadoop dfs -rmr output.
>
> But do we have to remove all content in the output directory manually every
> time we run mahout?
>
> --
> View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3849480.html
> Sent from the Mahout User List mailing list archive at Nabble.com.


Re: Error Running mahout-core-0.5-job.jar

Posted by tianwild <ti...@hotmail.com>.
Yes, I do this every time, and clear the temp/ folder as well:

hadoop fs -rmr /user/**/output
hadoop fs -rmr /user/**/temp/*

--
View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3860869.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Error Running mahout-core-0.5-job.jar

Posted by jeanbabyxu <je...@aexp.com>.
I was able to manually clear out the output directory by using 

bin/hadoop dfs -rmr output.

But do we have to remove all content in the output directory manually every
time we run mahout? 

--
View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3849480.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Error Running mahout-core-0.5-job.jar

Posted by jeanbabyxu <je...@aexp.com>.
Thanks so much tianwild for pointing out the typo. Now it's running but I got
a different error msg:

Exception in thread "main"
org.apache.hadoop.mapred.FileAlreadyExistsException: Output directory
temp/itemIDIndex already exists

Any idea how to resolve this issue?

Many thanks.

--
View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3849301.html
Sent from the Mahout User List mailing list archive at Nabble.com.

Re: Error Running mahout-core-0.5-job.jar

Posted by tianwild <ti...@hotmail.com>.
the correct option is -Dmapred.output.dir=output, not --Dmapred.output.dir=output (a single dash, not two)
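With the single dash in place, the full command from the original post would look something like this (same jar path and class as in the question; only the leading dashes on the -D options change):

```shell
bin/hadoop jar \
  /opt/mapr/mahout/mahout-0.5/core/target/mahout-core-0.5-job.jar \
  org.apache.mahout.cf.taste.hadoop.item.RecommenderJob \
  -Dmapred.input.dir=input/input.txt \
  -Dmapred.output.dir=output \
  --usersFile input/users.txt --booleanData
```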

--
View this message in context: http://lucene.472066.n3.nabble.com/Error-Running-mahout-core-0-5-job-jar-tp3846385p3847789.html
Sent from the Mahout User List mailing list archive at Nabble.com.