Posted to common-user@hadoop.apache.org by Shrish <s....@lancaster.ac.uk> on 2011/07/22 17:18:49 UTC
Problem with Hadoop Streaming -file option for Java class files
I am struggling with an issue with the "-file" option in Hadoop streaming.
First I tried the very basic example in streaming:
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -mapper org.apache.hadoop.mapred.lib.IdentityMapper \
    -reducer /bin/wc -inputformat KeyValueTextInputFormat \
    -input gutenberg/* -output gutenberg-outputtstchk22
which worked absolutely fine.
Then I copied the IdentityMapper.java source code and compiled it. Then I placed
this class file in the /home/hadoop folder and executed the following in the
terminal.
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -file ~/IdentityMapper.class -mapper IdentityMapper.class \
    -reducer /bin/wc -inputformat KeyValueTextInputFormat \
    -input gutenberg/* -output gutenberg-outputtstch6
The execution failed with the following error in the stderr file:
java.io.IOException: Cannot run program "IdentityMapper.class":
java.io.IOException: error=2, No such file or directory
Then I tried again after copying the IdentityMapper.class file into the Hadoop
installation directory, and executed the following:
hadoop@ubuntu:/usr/local/hadoop$ bin/hadoop jar contrib/streaming/hadoop-streaming-0.20.203.0.jar \
    -file IdentityMapper.class -mapper IdentityMapper.class \
    -reducer /bin/wc -inputformat KeyValueTextInputFormat \
    -input gutenberg/* -output gutenberg-outputtstch5
But unfortunately again I got the same error.
It would be great if you can help me with it as I cannot move any further
without overcoming this.
***I am trying this because an earlier hadoop-streaming run with a different
class file failed, and I want to find out whether the problem is with that class
file itself or with the way I am using it.
Thanking you in anticipation
Re: Problem with Hadoop Streaming -file option for Java class files
Posted by Robert Evans <ev...@YAHOO-INC.COM>.
From a practical standpoint, if you just leave off the -mapper option, streaming will run an IdentityMapper for you. I don't believe that -mapper will understand something.class as a class file that should be loaded and used as the mapper. I think you need to specify the class name, including the package, to get it to load, as you did with org.apache.hadoop.mapred.lib.IdentityMapper. I am not sure what changes you made to IdentityMapper.java before recompiling, but to get it on the classpath you probably need to ship it as a jar, not as a single file. I believe that you can use -libjars to ship it and add it to the classpath of the JVM, but I am not positive of that.
--Bobby Evans
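
To illustrate why the error message says "Cannot run program", here is a minimal, self-contained sketch of the kind of fallback described above: an argument that does not load as a Java class is treated as an external command. This is not Hadoop's code. MapperArgDemo and resolve are hypothetical names, and the assumption that streaming falls back in roughly this way is inferred from the error text, not taken from the streaming source.

```java
import java.io.IOException;

public class MapperArgDemo {
    // Hypothetical helper (not part of Hadoop): try the argument as a
    // fully qualified class name first, then as an external command.
    static String resolve(String mapperArg) {
        try {
            Class<?> c = Class.forName(mapperArg);
            return "class: " + c.getName();
        } catch (ClassNotFoundException e) {
            // Not loadable as a class; fall through and try it as a command.
        }
        try {
            Process p = new ProcessBuilder(mapperArg).start();
            p.destroy(); // we only care whether the program could be launched
            return "command: " + mapperArg;
        } catch (IOException e) {
            // ProcessBuilder reports a missing executable as an IOException.
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A class name that is on the classpath resolves as a class.
        System.out.println(resolve("java.lang.String"));
        // A file name such as IdentityMapper.class is neither a loadable
        // class name nor an executable found on the PATH.
        System.out.println(resolve("IdentityMapper.class"));
    }
}
```

On a typical Linux system the second call should report a message beginning with "Cannot run program", much like the stderr output in the original post. Under this reading, passing the fully qualified class name and putting the compiled class on the task classpath (for example inside a jar) is what lets the first branch succeed.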