Posted to mapreduce-dev@hadoop.apache.org by HU Wenjing A <We...@alcatel-sbell.com.cn> on 2012/06/14 12:01:01 UTC

try to fix hadoop streaming bug

Hi all,

   I am trying to fix the Hadoop streaming bug in version 0.21.0 (streaming overrides the user-given output key and value types). I found useful information about this issue at https://issues.apache.org/jira/browse/MAPREDUCE-1888?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel and modified the code following the patch file, then compiled it. It seems that only about thirteen .java files need to be modified. But when I tried to replace the old .class files with the new ones, I could only find StreamJob.class, in ${hadoop_home}/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar (that is, /root/hadoop-0.21.0/mapred/contrib/streaming/hadoop-0.21.0-streaming.jar). The other twelve modified classes could not be found in any jar file under the ${hadoop_home} directory.
Then I ran the command "bin/hadoop jar mapred/contrib/streaming/hadoop-0.21.0-streaming.jar -mapper org.apache.hadoop.mapred.lib.IdentityMapper -reducer NONE -input input -output output" with the modified streaming jar and got the following error:

Exception in thread "main" java.lang.ClassNotFoundException: -mapper
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:306)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:247)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.util.RunJar.main(RunJar.java:185)


I think this error has something to do with my modification of StreamJob.java. However, I have seen others say that they fixed the streaming override issue using the patch.
Could anyone give me some suggestions about this issue, or point me to another way to fix the bug?

Thanks in advance!  : )

Thanks & best regards,
Wenjing
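
One way to check which jars actually contain a given class is to scan the installation directory for jar entries with the matching name. The following is only a minimal sketch: the class name FindClassInJars and the method jarsContaining are hypothetical names chosen for illustration, not part of Hadoop.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarFile;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper (not part of Hadoop): lists the jars under a directory
// that contain a given fully qualified class name.
public class FindClassInJars {

    static List<Path> jarsContaining(Path root, String className) throws IOException {
        // A class org.example.Foo is stored in a jar as org/example/Foo.class.
        String entryName = className.replace('.', '/') + ".class";

        // Collect every regular file ending in .jar below the root directory.
        List<Path> jars;
        try (Stream<Path> paths = Files.walk(root)) {
            jars = paths
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".jar"))
                    .collect(Collectors.toList());
        }

        List<Path> hits = new ArrayList<>();
        for (Path jarPath : jars) {
            try (JarFile jar = new JarFile(jarPath.toFile())) {
                if (jar.getEntry(entryName) != null) {
                    hits.add(jarPath);
                }
            } catch (IOException e) {
                // Skip files that cannot be opened as jars.
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java FindClassInJars <directory> <class name>");
            return;
        }
        for (Path jar : jarsContaining(Paths.get(args[0]), args[1])) {
            System.out.println(jar);
        }
    }
}
```

For example, "java FindClassInJars /root/hadoop-0.21.0 org.apache.hadoop.streaming.StreamJob" would print each jar below that directory that holds StreamJob.class. If a modified class is found in no jar, it may live in a jar outside the scanned directory or only in a build output directory.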


Re: try to fix hadoop streaming bug

Posted by Robert Evans <ev...@yahoo-inc.com>.
It looks like your jar's MANIFEST file is missing the Main-Class attribute.  It may have something to do with how you created the updated jar you are using.  Hadoop is trying to run the jar, and because it did not find a Main-Class in the jar's manifest, it assumes you are supplying the main class as the next argument. It therefore tries to load a class named "-mapper", which obviously does not exist.  You can either update the MANIFEST when you build the jar, or supply the main class on the command line, like this:

hadoop jar <path>/hadoop-streaming.jar org.apache.hadoop.streaming.HadoopStreaming -mapper ...

--Bobby Evans
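
The fallback described in the reply above can be illustrated with a small sketch. It is not RunJar's actual source, only an approximation of the behavior described: use the manifest's Main-Class if present, otherwise treat the first argument after the jar as the class name. The class name MainClassCheck and both method names are hypothetical.

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical sketch (not RunJar itself) of how the class to run is chosen.
public class MainClassCheck {

    // Returns the Main-Class attribute from the jar's manifest, or null if
    // the jar has no manifest or the attribute is missing.
    static String manifestMainClass(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            Manifest manifest = jar.getManifest();
            if (manifest == null) {
                return null;
            }
            return manifest.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        }
    }

    // Picks the class name to load: the manifest value if present, otherwise
    // the first argument after the jar path (null if there is none).
    static String resolveMainClass(String manifestMainClass, String[] argsAfterJar) {
        if (manifestMainClass != null) {
            return manifestMainClass.trim();
        }
        return argsAfterJar.length > 0 ? argsAfterJar[0] : null;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.err.println("usage: java MainClassCheck <jar> [args...]");
            return;
        }
        String fromManifest = manifestMainClass(args[0]);
        String[] rest = Arrays.copyOfRange(args, 1, args.length);
        System.out.println("Main-Class in manifest: " + fromManifest);
        System.out.println("Class that would be loaded: " + resolveMainClass(fromManifest, rest));
    }
}
```

Running it against the rebuilt streaming jar with the same arguments as the failing command would show whether the manifest still carries a Main-Class. With no Main-Class, the sketch resolves to "-mapper", which matches the ClassNotFoundException in the stack trace.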

