Posted to dev@mahout.apache.org by "Andrew Palumbo (JIRA)" <ji...@apache.org> on 2015/03/18 01:53:38 UTC

[jira] [Commented] (MAHOUT-1592) bin/mahout's seqdirectory doesn't work when MAHOUT_LOCAL is non-empty

    [ https://issues.apache.org/jira/browse/MAHOUT-1592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14366420#comment-14366420 ] 

Andrew Palumbo commented on MAHOUT-1592:
----------------------------------------

I can't reproduce this, and I have not seen any other issues with it.

{code}
$ mahout seqdirectory -i /tmp/mahout-work-andy/20news-all -o /tmp/mahout-work-andy/20news-seq -ow
MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
MAHOUT_LOCAL is set, running locally
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/home/andy/sandbox/mahout/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/home/andy/sandbox/mahout/examples/target/dependency/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
log4j:WARN No appenders could be found for logger (org.apache.mahout.common.AbstractJob).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
{code}
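
For anyone trying to reproduce this, a small pre-flight check can rule out environment differences before running seqdirectory. The sketch below is not part of bin/mahout: the function name check_seqdir_input is hypothetical, and it assumes only what the log output above shows, namely that a non-empty MAHOUT_LOCAL is treated as a request to run locally. It checks the local filesystem only and does not query HDFS.

```shell
# Hypothetical pre-flight helper; NOT part of bin/mahout.
# Assumption (taken from the log output above): a non-empty MAHOUT_LOCAL
# means the job is expected to run against the local filesystem.
check_seqdir_input() {
  input_dir="$1"

  # Report which mode the environment suggests.
  if [ -n "${MAHOUT_LOCAL:-}" ]; then
    echo "MAHOUT_LOCAL is non-empty: expecting a local run"
  else
    echo "MAHOUT_LOCAL is empty or unset: expecting a Hadoop run"
  fi

  # Only the local filesystem is checked here; HDFS is not queried.
  if [ -d "$input_dir" ]; then
    echo "input directory exists locally: $input_dir"
  else
    echo "input directory not found locally: $input_dir"
    return 1
  fi
}
```

Calling check_seqdir_input with the same path passed to -i (for example /tmp/mahout-work-andy/20news-all) and attaching its output to the issue would show whether the input is visible locally and which mode the environment implies.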

> bin/mahout's seqdirectory doesn't work when MAHOUT_LOCAL is non-empty
> ---------------------------------------------------------------------
>
>                 Key: MAHOUT-1592
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1592
>             Project: Mahout
>          Issue Type: Bug
>          Components: Integration
>    Affects Versions: 0.9
>         Environment: Linux
>            Reporter: Alex Ott
>            Priority: Minor
>              Labels: legacy
>
> Trying to run seqdirectory with MAHOUT_LOCAL set to a non-empty value leads to the following error:
> {noformat}
> >mahout seqdirectory -i ${WORK_DIR}/20news-all -o ${WORK_DIR}/20news-seq -ow
> MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
> MAHOUT_LOCAL is set, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/home/ott/work/mahout-head/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/ott/work/mahout-head/examples/target/dependency/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 14/07/08 13:50:39 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/home/ott/work/exps/mh/20news-all], --keyPrefix=[], --method=[mapreduce], --output=[/home/ott/work/exps/mh/20news-seq], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
> 14/07/08 13:50:39 INFO common.HadoopUtil: Deleting /home/ott/work/exps/mh/20news-seq
> Exception in thread "main" java.io.FileNotFoundException: File does not exist: /home/ott/work/url-cat-exps/mh/20news-all
>         at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:558)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.runMapReduce(SequenceFilesFromDirectory.java:162)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.run(SequenceFilesFromDirectory.java:91)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>         at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>         at org.apache.mahout.text.SequenceFilesFromDirectory.main(SequenceFilesFromDirectory.java:65)
>         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>         at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>         at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:195)
> {noformat}
> But the directory exists at the specified location:
> {noformat}
> ott@mercury:work/exps/mh\>ls -lsd 20news-all
> 4 drwxrwxr-x 22 ott ott 4096 Jul  8 08:49 20news-all/
> {noformat}
> If I explicitly specify the {{-xm sequential}} flag, there is no error, but the task isn't performed at all:
> {noformat}
> MAHOUT_LOCAL is set, so we don't add HADOOP_CONF_DIR to classpath.
> MAHOUT_LOCAL is set, running locally
> SLF4J: Class path contains multiple SLF4J bindings.
> SLF4J: Found binding in [jar:file:/home/ott/work/mahout-head/examples/target/mahout-examples-1.0-SNAPSHOT-job.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: Found binding in [jar:file:/home/ott/work/mahout-head/examples/target/dependency/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]
> SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
> SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
> 14/07/08 13:54:19 INFO common.AbstractJob: Command line arguments: {--charset=[UTF-8], --chunkSize=[64], --endPhase=[2147483647], --fileFilterClass=[org.apache.mahout.text.PrefixAdditionFilter], --input=[/home/ott/work/exps/mh/20news-all], --keyPrefix=[], --method=[sequential], --output=[/home/ott/work/exps/mh/20news-seq], --overwrite=null, --startPhase=[0], --tempDir=[temp]}
> 14/07/08 13:54:19 INFO driver.MahoutDriver: Program took 548 ms (Minutes: 0.009133333333333334)
> {noformat}


