Posted to mapreduce-user@hadoop.apache.org by Jingui Lee <le...@gmail.com> on 2011/12/20 14:14:19 UTC

Map and Reduce process hang out at 0%

Hi, all,

I am running Hadoop 0.23 on 5 nodes.

I could run any YARN application or MapReduce job on this cluster before.

But after I moved the ResourceManager from node4 to node5 (and updated the
relevant properties in the configuration files), the map and reduce
progress hangs at 0% whenever I run an application, until I kill it.

I don't know why.
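
For reference, the ResourceManager endpoints are all set in yarn-site.xml and
have to point at the new host on every node; a minimal sketch of the relevant
properties (the host and ports below are illustrative, not necessarily what
this cluster actually uses):

    <property>
      <name>yarn.resourcemanager.address</name>
      <value>node5:8032</value>
    </property>
    <property>
      <name>yarn.resourcemanager.scheduler.address</name>
      <value>node5:8030</value>
    </property>
    <property>
      <name>yarn.resourcemanager.resource-tracker.address</name>
      <value>node5:8031</value>
    </property>
    <property>
      <name>yarn.resourcemanager.admin.address</name>
      <value>node5:8033</value>
    </property>
    <property>
      <name>yarn.resourcemanager.webapp.address</name>
      <value>node5:8088</value>
    </property>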

terminal output:

bin/hadoop jar hadoop-mapreduce-examples-0.23.0.jar wordcount
/share/stdinput/1k /testread/hao
11/12/20 20:20:29 INFO mapreduce.Cluster: Cannot pick
org.apache.hadoop.mapred.LocalClientProtocolProvider as the
ClientProtocolProvider - returned null protocol
11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for
org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connecting to
ResourceManager at dn5/192.168.3.204:50010
11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc
proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connected to
ResourceManager at dn5/192.168.3.204:50010
11/12/20 20:20:29 WARN conf.Configuration: fs.default.name is deprecated.
Instead, use fs.defaultFS
11/12/20 20:20:29 WARN conf.Configuration: mapred.used.genericoptionsparser
is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
11/12/20 20:20:29 INFO input.FileInputFormat: Total input paths to process
: 1
11/12/20 20:20:29 INFO util.NativeCodeLoader: Loaded the native-hadoop
library
11/12/20 20:20:29 WARN snappy.LoadSnappy: Snappy native library not loaded
11/12/20 20:20:29 INFO mapreduce.JobSubmitter: number of splits:1
11/12/20 20:20:29 INFO mapred.YARNRunner: AppMaster capability = memory:
2048
11/12/20 20:20:29 INFO mapred.YARNRunner: Command to launch container for
ApplicationMaster is : $JAVA_HOME/bin/java
-Dlog4j.configuration=container-log4j.properties
-Dyarn.app.mapreduce.container.log.dir=<LOG_DIR>
-Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
-Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout
2><LOG_DIR>/stderr
11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Submitted application
application_1324372145692_0004 to ResourceManager
11/12/20 20:20:29 INFO mapred.ClientCache: Connecting to HistoryServer at:
dn5:10020
11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for
org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
11/12/20 20:20:29 INFO mapred.ClientCache: Connected to HistoryServer at:
dn5:10020
11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc
proxy for protocol interface
org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
11/12/20 20:20:30 INFO mapreduce.Job: Running job: job_1324372145692_0004
11/12/20 20:20:31 INFO mapreduce.Job:  map 0% reduce 0%

Re: Map and Reduce process hang out at 0%

Posted by Jingui Lee <le...@gmail.com>.
Shuffle is configured, and I could run MR jobs on this 5-node cluster before
I moved the ResourceManager from dn4 to dn5.


After analyzing the logs under $HADOOP_LOG_DIR and reviewing the terminal
output, the most likely reason the map tasks hang at 0% seems to be this:
the NodeManager does not start correctly because its registration call
(made over the Google Protocol Buffers RPC) gets "connection refused", so
the slaves in the cluster cannot communicate with the master
(ResourceManager) and the job never runs.


By the way, I compiled and installed (make / make install) Protocol Buffers
on all 5 nodes at the same time using a parallel SSH tool. Maybe something
is wrong in the dn5 environment.
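
If it helps, a quick way to check whether that parallel protobuf install ended
up consistent is to compare the compiler and library on every node; a small
sketch (the dn1..dn5 hostnames are assumed from the names in the logs):

    # compare protoc and libprotobuf across all nodes
    for h in dn1 dn2 dn3 dn4 dn5; do
      echo "== $h =="
      ssh $h 'which protoc; protoc --version; ldconfig -p | grep libprotobuf'
    done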

Thanks a lot!



The most important part of the NodeManager log output is here:
2011-12-21 21:23:21,142 ERROR service.CompositeService
(CompositeService.java:start(72)) - Error starting services
org.apache.hadoop.yarn.server.nodemanager.NodeManager
org.apache.avro.AvroRuntimeException:
java.lang.reflect.UndeclaredThrowableException
    at
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:132)
    at
org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
    at
org.apache.hadoop.yarn.server.nodemanager.NodeManager.start(NodeManager.java:163)
    at
org.apache.hadoop.yarn.server.nodemanager.NodeManager.main(NodeManager.java:231)
Caused by: java.lang.reflect.UndeclaredThrowableException
    at
org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:66)
    at
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.registerWithRM(NodeStatusUpdaterImpl.java:161)
    at
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.start(NodeStatusUpdaterImpl.java:128)
    ... 3 more
Caused by: com.google.protobuf.ServiceException: java.net.ConnectException:
Call From dn3/192.168.3.227 to dn4:50030 failed on connection exception:
java.net.ConnectException: Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
    at
org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:139)
    at $Proxy14.registerNodeManager(Unknown Source)
    at
org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.registerNodeManager(ResourceTrackerPBClientImpl.java:59)
    ... 5 more
Caused by: java.net.ConnectException: Call From dn3/192.168.3.227 to
dn4:50030 failed on connection exception: java.net.ConnectException:
Connection refused; For more details see:
http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:617)
    at org.apache.hadoop.ipc.Client.call(Client.java:1089)
    at
org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:136)
    ... 7 more
Caused by: java.net.ConnectException: Connection refused
    at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
    at
sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592)
    at
org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
    at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:419)
    at
org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:460)
    at
org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:557)
    at org.apache.hadoop.ipc.Client$Connection.access$2000(Client.java:205)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1195)
    at org.apache.hadoop.ipc.Client.call(Client.java:1065)
    ... 8 more
2011-12-21 21:23:21,143 INFO  event.AsyncDispatcher
(AsyncDispatcher.java:run(71)) - AsyncDispatcher thread interrupted
java.lang.InterruptedException
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:2017)
    at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2052)
    at
java.util.concurrent.LinkedBlockingQueue.take(LinkedBlockingQueue.java:386)
    at
org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:69)
    at java.lang.Thread.run(Thread.java:636)
2011-12-21 21:23:21,144 INFO  service.AbstractService
(AbstractService.java:stop(75)) - Service:Dispatcher is stopped.
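
One detail worth noting in the trace above: the NodeManager on dn3 is still
trying to register with dn4:50030, so it is also worth double-checking that
yarn-site.xml on every slave points the resource-tracker address at the new
ResourceManager host. A sketch (the port is illustrative; it just has to match
whatever the RM on dn5 actually listens on):

    <property>
      <name>yarn.resourcemanager.resource-tracker.address</name>
      <value>dn5:50030</value>
    </property>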

Re: Map and Reduce process hang out at 0%

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
Actually, after MAPREDUCE-2652, any task that gets a container on a node
with no shuffle service will fail with an "invalid shuffle port" error, so
the job should eventually fail.

Maybe the OP was running into something else; we'll wait for him to return
with more information.

Thanks,
+Vinod

On Tue, Dec 20, 2011 at 11:29 AM, Robert Evans <ev...@yahoo-inc.com> wrote:

>  Should we file a JIRA so that the MapReduce AM blows up in a more
> obvious fashion if Shuffle is not configured?
>
> --Bobby Evans
>
>
> On 12/20/11 12:30 PM, "Vinod Kumar Vavilapalli" <vi...@hortonworks.com>
> wrote:
>
>
> I guess you don't have shuffle configured. Can you look at the application
> master (AM) logs and paste logs from there? There will be a link to the AM
> logs on the application page of the RM web UI.
>
> You can also check and see if shuffle is configured. From the INSTALL file
> (
> http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/INSTALL
> ),
>
> Step 7) Setup config: for running mapreduce applications, which now are in
> user land, you need to setup nodemanager with the following configuration
> in your yarn-site.xml before you start the nodemanager.
>     <property>
>       <name>yarn.nodemanager.aux-services</name>
>       <value>mapreduce.shuffle</value>
>     </property>
>
>     <property>
>       <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
>       <value>org.apache.hadoop.mapred.ShuffleHandler</value>
>     </property>
>
> Step 8) Modify mapred-site.xml to use yarn framework
>     <property>
>       <name>mapreduce.framework.name</name>
>       <value>yarn</value>
>     </property>
>
>
> +Vinod
>
>
> On Tue, Dec 20, 2011 at 8:12 AM, Arun C Murthy <ac...@hortonworks.com>
> wrote:
>
> Can you look at the /nodes web-page to see how many nodes you have?
>
> Also, do you see any exceptions in the ResourceManager logs on dn5?
>
> Arun
>
> On Dec 20, 2011, at 5:14 AM, Jingui Lee wrote:
>
> Hi, all,
>
> I am running Hadoop 0.23 on 5 nodes.
>
> I could run any YARN application or MapReduce job on this cluster before.
>
> But after I moved the ResourceManager from node4 to node5 (and updated the
> relevant properties in the configuration files), the map and reduce
> progress hangs at 0% whenever I run an application, until I kill it.
>
> I don't know why.
>
> terminal output:
>
> bin/hadoop jar hadoop-mapreduce-examples-0.23.0.jar wordcount
> /share/stdinput/1k /testread/hao
> 11/12/20 20:20:29 INFO mapreduce.Cluster: Cannot pick
> org.apache.hadoop.mapred.LocalClientProtocolProvider as the
> ClientProtocolProvider - returned null protocol
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connecting to
> ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc
> proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connected to
> ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 WARN conf.Configuration: fs.default.name is deprecated.
> Instead, use fs.defaultFS
>
> 11/12/20 20:20:29 WARN conf.Configuration:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
> 11/12/20 20:20:29 INFO input.FileInputFormat: Total input paths to process
> : 1
> 11/12/20 20:20:29 INFO util.NativeCodeLoader: Loaded the native-hadoop
> library
> 11/12/20 20:20:29 WARN snappy.LoadSnappy: Snappy native library not loaded
> 11/12/20 20:20:29 INFO mapreduce.JobSubmitter: number of splits:1
> 11/12/20 20:20:29 INFO mapred.YARNRunner: AppMaster capability = memory:
> 2048
> 11/12/20 20:20:29 INFO mapred.YARNRunner: Command to launch container for
> ApplicationMaster is : $JAVA_HOME/bin/java
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR>
> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout
> 2><LOG_DIR>/stderr
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Submitted application
> application_1324372145692_0004 to ResourceManager
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connecting to HistoryServer at:
> dn5:10020
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connected to HistoryServer at:
> dn5:10020
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc
> proxy for protocol interface
> org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/12/20 20:20:30 INFO mapreduce.Job: Running job: job_1324372145692_0004
> 11/12/20 20:20:31 INFO mapreduce.Job:  map 0% reduce 0%
>
>
>
>
>
>

Re: Map and Reduce process hang out at 0%

Posted by Robert Evans <ev...@yahoo-inc.com>.
Should we file a JIRA so that the MapReduce AM blows up in a more obvious fashion if Shuffle is not configured?

--Bobby Evans

On 12/20/11 12:30 PM, "Vinod Kumar Vavilapalli" <vi...@hortonworks.com> wrote:


I guess you don't have shuffle configured. Can you look at the application master (AM) logs and paste logs from there? There will be a link to the AM logs on the application page of the RM web UI.

You can also check and see if shuffle is configured. From the INSTALL file (http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/INSTALL),
Step 7) Setup config: for running mapreduce applications, which now are in user land, you need to setup nodemanager with the following configuration in your yarn-site.xml before you start the nodemanager.
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce.shuffle</value>
    </property>

    <property>
      <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

Step 8) Modify mapred-site.xml to use yarn framework
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>

+Vinod


On Tue, Dec 20, 2011 at 8:12 AM, Arun C Murthy <ac...@hortonworks.com> wrote:
Can you look at the /nodes web-page to see how many nodes you have?

Also, do you see any exceptions in the ResourceManager logs on dn5?

Arun

On Dec 20, 2011, at 5:14 AM, Jingui Lee wrote:

Hi, all,

I am running Hadoop 0.23 on 5 nodes.

I could run any YARN application or MapReduce job on this cluster before.

But after I moved the ResourceManager from node4 to node5 (and updated the relevant properties in the configuration files), the map and reduce progress hangs at 0% whenever I run an application, until I kill it.

I don't know why.

terminal output:

bin/hadoop jar hadoop-mapreduce-examples-0.23.0.jar wordcount /share/stdinput/1k /testread/hao
11/12/20 20:20:29 INFO mapreduce.Cluster: Cannot pick org.apache.hadoop.mapred.LocalClientProtocolProvider as the ClientProtocolProvider - returned null protocol
11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connecting to ResourceManager at dn5/192.168.3.204:50010
11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connected to ResourceManager at dn5/192.168.3.204:50010
11/12/20 20:20:29 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
11/12/20 20:20:29 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
11/12/20 20:20:29 INFO input.FileInputFormat: Total input paths to process : 1
11/12/20 20:20:29 INFO util.NativeCodeLoader: Loaded the native-hadoop library
11/12/20 20:20:29 WARN snappy.LoadSnappy: Snappy native library not loaded
11/12/20 20:20:29 INFO mapreduce.JobSubmitter: number of splits:1
11/12/20 20:20:29 INFO mapred.YARNRunner: AppMaster capability = memory: 2048
11/12/20 20:20:29 INFO mapred.YARNRunner: Command to launch container for ApplicationMaster is : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr
11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Submitted application application_1324372145692_0004 to ResourceManager
11/12/20 20:20:29 INFO mapred.ClientCache: Connecting to HistoryServer at: dn5:10020
11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
11/12/20 20:20:29 INFO mapred.ClientCache: Connected to HistoryServer at: dn5:10020
11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
11/12/20 20:20:30 INFO mapreduce.Job: Running job: job_1324372145692_0004
11/12/20 20:20:31 INFO mapreduce.Job:  map 0% reduce 0%






Re: Map and Reduce process hang out at 0%

Posted by Vinod Kumar Vavilapalli <vi...@hortonworks.com>.
I guess you don't have shuffle configured. Can you look at the application
master (AM) logs and paste logs from there? There will be a link to the AM
logs on the application page of the RM web UI.

You can also check and see if shuffle is configured. From the INSTALL file (
http://svn.apache.org/repos/asf/hadoop/common/trunk/hadoop-mapreduce-project/INSTALL
),

Step 7) Setup config: for running mapreduce applications, which now are in
user land, you need to setup nodemanager with the following configuration
in your yarn-site.xml before you start the nodemanager.
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce.shuffle</value>
    </property>

    <property>
      <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
      <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>

Step 8) Modify mapred-site.xml to use yarn framework
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
    </property>
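
After adding these, each NodeManager needs to be restarted so the shuffle
aux-service is actually loaded; with the stock scripts that is roughly the
following (the script's location depends on your install layout):

    yarn-daemon.sh stop nodemanager
    yarn-daemon.sh start nodemanager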


+Vinod


On Tue, Dec 20, 2011 at 8:12 AM, Arun C Murthy <ac...@hortonworks.com> wrote:

> Can you look at the /nodes web-page to see how many nodes you have?
>
> Also, do you see any exceptions in the ResourceManager logs on dn5?
>
> Arun
>
> On Dec 20, 2011, at 5:14 AM, Jingui Lee wrote:
>
> Hi, all,
>
> I am running Hadoop 0.23 on 5 nodes.
>
> I could run any YARN application or MapReduce job on this cluster before.
>
> But after I moved the ResourceManager from node4 to node5 (and updated the
> relevant properties in the configuration files), the map and reduce
> progress hangs at 0% whenever I run an application, until I kill it.
>
> I don't know why.
>
> terminal output:
>
> bin/hadoop jar hadoop-mapreduce-examples-0.23.0.jar wordcount
> /share/stdinput/1k /testread/hao
> 11/12/20 20:20:29 INFO mapreduce.Cluster: Cannot pick
> org.apache.hadoop.mapred.LocalClientProtocolProvider as the
> ClientProtocolProvider - returned null protocol
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connecting to
> ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc
> proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connected to
> ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 WARN conf.Configuration: fs.default.name is deprecated.
> Instead, use fs.defaultFS
> 11/12/20 20:20:29 WARN conf.Configuration:
> mapred.used.genericoptionsparser is deprecated. Instead, use
> mapreduce.client.genericoptionsparser.used
> 11/12/20 20:20:29 INFO input.FileInputFormat: Total input paths to process
> : 1
> 11/12/20 20:20:29 INFO util.NativeCodeLoader: Loaded the native-hadoop
> library
> 11/12/20 20:20:29 WARN snappy.LoadSnappy: Snappy native library not loaded
> 11/12/20 20:20:29 INFO mapreduce.JobSubmitter: number of splits:1
> 11/12/20 20:20:29 INFO mapred.YARNRunner: AppMaster capability = memory:
> 2048
> 11/12/20 20:20:29 INFO mapred.YARNRunner: Command to launch container for
> ApplicationMaster is : $JAVA_HOME/bin/java
> -Dlog4j.configuration=container-log4j.properties
> -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR>
> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA
> -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout
> 2><LOG_DIR>/stderr
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Submitted application
> application_1324372145692_0004 to ResourceManager
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connecting to HistoryServer at:
> dn5:10020
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for
> org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connected to HistoryServer at:
> dn5:10020
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc
> proxy for protocol interface
> org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/12/20 20:20:30 INFO mapreduce.Job: Running job: job_1324372145692_0004
> 11/12/20 20:20:31 INFO mapreduce.Job:  map 0% reduce 0%
>
>
>
>

Re: Map and Reduce process hang out at 0%

Posted by Arun C Murthy <ac...@hortonworks.com>.
Can you look at the /nodes web-page to see how many nodes you have?

Also, do you see any exceptions in the ResourceManager logs on dn5?
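
For reference, assuming the usual defaults (the web UI port and the log file
naming are assumptions; both depend on the local configuration), those checks
would look something like:

    # nodes page of the ResourceManager web UI
    http://dn5:8088/cluster/nodes

    # scan the ResourceManager log on dn5 for exceptions
    grep -i exception $HADOOP_LOG_DIR/yarn-*-resourcemanager-*.log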

Arun

On Dec 20, 2011, at 5:14 AM, Jingui Lee wrote:

> Hi, all,
> 
> I am running Hadoop 0.23 on 5 nodes.
> 
> I could run any YARN application or MapReduce job on this cluster before.
> 
> But after I moved the ResourceManager from node4 to node5 (and updated the relevant properties in the configuration files), the map and reduce progress hangs at 0% whenever I run an application, until I kill it.
> 
> I don't know why.
> 
> terminal output:
> 
> bin/hadoop jar hadoop-mapreduce-examples-0.23.0.jar wordcount /share/stdinput/1k /testread/hao
> 11/12/20 20:20:29 INFO mapreduce.Cluster: Cannot pick org.apache.hadoop.mapred.LocalClientProtocolProvider as the ClientProtocolProvider - returned null protocol
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connecting to ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ClientRMProtocol
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Connected to ResourceManager at dn5/192.168.3.204:50010
> 11/12/20 20:20:29 WARN conf.Configuration: fs.default.name is deprecated. Instead, use fs.defaultFS
> 11/12/20 20:20:29 WARN conf.Configuration: mapred.used.genericoptionsparser is deprecated. Instead, use mapreduce.client.genericoptionsparser.used
> 11/12/20 20:20:29 INFO input.FileInputFormat: Total input paths to process : 1
> 11/12/20 20:20:29 INFO util.NativeCodeLoader: Loaded the native-hadoop library
> 11/12/20 20:20:29 WARN snappy.LoadSnappy: Snappy native library not loaded
> 11/12/20 20:20:29 INFO mapreduce.JobSubmitter: number of splits:1
> 11/12/20 20:20:29 INFO mapred.YARNRunner: AppMaster capability = memory: 2048
> 11/12/20 20:20:29 INFO mapred.YARNRunner: Command to launch container for ApplicationMaster is : $JAVA_HOME/bin/java -Dlog4j.configuration=container-log4j.properties -Dyarn.app.mapreduce.container.log.dir=<LOG_DIR> -Dyarn.app.mapreduce.container.log.filesize=0 -Dhadoop.root.logger=INFO,CLA -Xmx1536m org.apache.hadoop.mapreduce.v2.app.MRAppMaster 1><LOG_DIR>/stdout 2><LOG_DIR>/stderr 
> 11/12/20 20:20:29 INFO mapred.ResourceMgrDelegate: Submitted application application_1324372145692_0004 to ResourceManager
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connecting to HistoryServer at: dn5:10020
> 11/12/20 20:20:29 INFO ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
> 11/12/20 20:20:29 INFO mapred.ClientCache: Connected to HistoryServer at: dn5:10020
> 11/12/20 20:20:29 INFO ipc.HadoopYarnRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.mapreduce.v2.api.MRClientProtocol
> 11/12/20 20:20:30 INFO mapreduce.Job: Running job: job_1324372145692_0004
> 11/12/20 20:20:31 INFO mapreduce.Job:  map 0% reduce 0%
> 
>