Posted to common-user@hadoop.apache.org by amit handa <am...@gmail.com> on 2009/06/08 07:43:14 UTC

Problem specifying archives with Hadoop Streaming

Hi,

I am new to Hadoop. I am planning to run a sample Hadoop Streaming job to
test a custom InputFormat with streaming.

I use the following command:

hadoop jar $HADOOP_HOME/hadoop-streaming.jar \
    -input "/user/ahanda/input/images" \
    -output "/user/ahanda/output" \
    -mapper "org.apache.hadoop.mapred.lib.IdentityMapper" \
    -reducer NONE \
    -inputformat "MyFileInputFormat" \
    -outputformat "org.apache.hadoop.mapred.SequenceFileAsBinaryOutputFormat" \
    -archives hdfs://localhost:9000/user/ahanda/lib/myfile.jar#wholefile

I get the following output:

09/06/07 22:26:35 ERROR streaming.StreamJob: Unexpected -archives while
processing
-input|-output|-mapper|-combiner|-reducer|-file|-dfs|-jt|-additionalconfspec|-inputformat|-outputformat|-partitioner|-numReduceTasks|-inputreader|-mapdebug|-reducedebug|||-cacheFile|-cacheArchive|-verbose|-info|-debug|-inputtagged|-help
Usage: $HADOOP_HOME/bin/hadoop jar \
          $HADOOP_HOME/hadoop-streaming.jar [options]
<rest of the output about specifying options>

MyFileInputFormat is my own custom input format, which I have put in
myfile.jar and copied to the /user/ahanda/lib directory on HDFS.
I am using Hadoop 0.20.0. Can somebody help? Am I missing something?
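
For context, the input format is roughly along these lines (a simplified
sketch of a whole-file input format against the 0.20 mapred API; the
WholeFileRecordReader name and the Text/BytesWritable key/value choice here
are illustrative, not necessarily exactly what my real class uses):

import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.BytesWritable;
import org.apache.hadoop.io.IOUtils;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileSplit;
import org.apache.hadoop.mapred.InputSplit;
import org.apache.hadoop.mapred.JobConf;
import org.apache.hadoop.mapred.RecordReader;
import org.apache.hadoop.mapred.Reporter;

// Each input file becomes a single record: key = file path, value = file bytes.
public class MyFileInputFormat extends FileInputFormat<Text, BytesWritable> {

  @Override
  protected boolean isSplitable(FileSystem fs, Path filename) {
    return false; // never split, so one map task sees the whole file
  }

  @Override
  public RecordReader<Text, BytesWritable> getRecordReader(
      InputSplit split, JobConf job, Reporter reporter) throws IOException {
    return new WholeFileRecordReader((FileSplit) split, job);
  }

  // Illustrative reader name; reads the entire split (one file) as one record.
  static class WholeFileRecordReader implements RecordReader<Text, BytesWritable> {
    private final FileSplit split;
    private final JobConf conf;
    private boolean processed = false;

    WholeFileRecordReader(FileSplit split, JobConf conf) {
      this.split = split;
      this.conf = conf;
    }

    public boolean next(Text key, BytesWritable value) throws IOException {
      if (processed) {
        return false;
      }
      Path file = split.getPath();
      FileSystem fs = file.getFileSystem(conf);
      byte[] contents = new byte[(int) split.getLength()];
      FSDataInputStream in = fs.open(file);
      try {
        IOUtils.readFully(in, contents, 0, contents.length);
      } finally {
        IOUtils.closeStream(in);
      }
      key.set(file.toString());
      value.set(contents, 0, contents.length);
      processed = true;
      return true;
    }

    public Text createKey() { return new Text(); }
    public BytesWritable createValue() { return new BytesWritable(); }
    public long getPos() { return processed ? split.getLength() : 0; }
    public float getProgress() { return processed ? 1.0f : 0.0f; }
    public void close() { }
  }
}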

Thanks,
Amit