Posted to yarn-issues@hadoop.apache.org by "Bibin A Chundatt (JIRA)" <ji...@apache.org> on 2015/07/29 08:55:04 UTC

[jira] [Commented] (YARN-3990) AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected

    [ https://issues.apache.org/jira/browse/YARN-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14645567#comment-14645567 ] 

Bibin A Chundatt commented on YARN-3990:
----------------------------------------

[~rohithsharma]

{code}
2015-07-29 19:39:03,409 | INFO  | ResourceManager Event Processor | Added node host-7:26009 clusterResource: <memory:178400, vCores:64> | CapacityScheduler.java:1358
2015-07-29 19:39:03,409 | INFO  | AsyncDispatcher event handler | Size of event-queue is 3000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,409 | DEBUG | Socket Reader #1 for port 26003 |  got #2125 | Server.java:1790
2015-07-29 19:39:03,409 | DEBUG | IPC Server handler 7 on 26003 | IPC Server handler 7 on 26003: org.apache.hadoop.yarn.server.api.ResourceTrackerPB.nodeHeartbeat from 172.168.100.7:24999 Call#2125 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER | Server.java:2058
2015-07-29 19:39:03,410 | DEBUG | IPC Server handler 7 on 26003 | PrivilegedAction as:mapred/hadoop.hadoop.com@HADOOP.COM (auth:KERBEROS) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082) | UserGroupInformation.java:1696
2015-07-29 19:39:03,410 | INFO  | AsyncDispatcher event handler | Size of event-queue is 4000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,410 | INFO  | AsyncDispatcher event handler | Size of event-queue is 5000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,411 | INFO  | AsyncDispatcher event handler | Size of event-queue is 6000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,412 | INFO  | AsyncDispatcher event handler | Size of event-queue is 7000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,412 | INFO  | IPC Server handler 7 on 26003 | Size of event-queue is 7000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,412 | INFO  | AsyncDispatcher event handler | Size of event-queue is 8000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,413 | INFO  | AsyncDispatcher event handler | Size of event-queue is 9000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,414 | INFO  | AsyncDispatcher event handler | Size of event-queue is 10000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,414 | INFO  | AsyncDispatcher event handler | Size of event-queue is 11000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,415 | DEBUG | IPC Server handler 7 on 26003 | Served: nodeHeartbeat queueTime= 1 procesingTime= 5 | ProtobufRpcEngine.java:631
2015-07-29 19:39:03,415 | INFO  | AsyncDispatcher event handler | Size of event-queue is 12000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,416 | DEBUG | IPC Server handler 7 on 26003 | Adding saslServer wrapped token of size 100 as call response. | Server.java:2460
2015-07-29 19:39:03,416 | DEBUG | IPC Server handler 7 on 26003 | IPC Server handler 7 on 26003: responding to org.apache.hadoop.yarn.server.api.ResourceTrackerPB.nodeHeartbeat from 172.168.100.7:24999 Call#2125 Retry#0 | Server.java:994
2015-07-29 19:39:03,416 | INFO  | AsyncDispatcher event handler | Size of event-queue is 13000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,416 | DEBUG | IPC Server handler 7 on 26003 | IPC Server handler 7 on 26003: responding to org.apache.hadoop.yarn.server.api.ResourceTrackerPB.nodeHeartbeat from 172.168.100.7:24999 Call#2125 Retry#0 Wrote 118 bytes. | Server.java:1013
2015-07-29 19:39:03,416 | INFO  | AsyncDispatcher event handler | Size of event-queue is 14000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,417 | INFO  | AsyncDispatcher event handler | Size of event-queue is 15000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,418 | INFO  | AsyncDispatcher event handler | Size of event-queue is 16000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,419 | INFO  | AsyncDispatcher event handler | Size of event-queue is 17000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,419 | INFO  | AsyncDispatcher event handler | Size of event-queue is 18000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,420 | INFO  | AsyncDispatcher event handler | Size of event-queue is 19000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,421 | INFO  | AsyncDispatcher event handler | Size of event-queue is 20000 | AsyncDispatcher.java:235
2015-07-29 19:39:03,421 | DEBUG | AsyncDispatcher event handler | Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType: NODE_UPDATE | AsyncDispatcher.java:166
2015-07-29 19:39:03,421 | DEBUG | AsyncDispatcher event handler | Processing event for application_1438101193238_224125 of type NODE_UPDATE | RMAppImpl.java:741
2015-07-29 19:39:03,421 | DEBUG | AsyncDispatcher event handler | Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType: NODE_UPDATE | AsyncDispatcher.java:166
2015-07-29 19:39:03,421 | DEBUG | AsyncDispatcher event handler | Processing event for application_1438101193238_224126 of type NODE_UPDATE | RMAppImpl.java:741
2015-07-29 19:39:03,422 | DEBUG | AsyncDispatcher event handler | Dispatching the event org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppNodeUpdateEvent.EventType: NODE_UPDATE | AsyncDispatcher.

{code}
I was able to reproduce the issue: the event queue grows from 3000 to 20000 within about 12 ms of the node being added. Attaching logs.

> AsyncDispatcher may overloaded with RMAppNodeUpdateEvent when Node is connected/disconnected
> --------------------------------------------------------------------------------------------
>
>                 Key: YARN-3990
>                 URL: https://issues.apache.org/jira/browse/YARN-3990
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>            Reporter: Rohith Sharma K S
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>
> Whenever a node is added or removed, NodesListManager sends an RMAppNodeUpdateEvent to every application in the rmContext. These events are not required for finished/killed/failed applications. An additional check for whether the app is finished/killed/failed would minimize the unnecessary events.
> {code}
>   public void handle(NodesListManagerEvent event) {
>     RMNode eventNode = event.getNode();
>     switch (event.getType()) {
>     case NODE_UNUSABLE:
>       LOG.debug(eventNode + " reported unusable");
>       unusableRMNodesConcurrentSet.add(eventNode);
>       for(RMApp app: rmContext.getRMApps().values()) {
>         this.rmContext
>             .getDispatcher()
>             .getEventHandler()
>             .handle(
>                 new RMAppNodeUpdateEvent(app.getApplicationId(), eventNode,
>                     RMAppNodeUpdateType.NODE_UNUSABLE));
>       }
>       break;
>     case NODE_USABLE:
>       if (unusableRMNodesConcurrentSet.contains(eventNode)) {
>         LOG.debug(eventNode + " reported usable");
>         unusableRMNodesConcurrentSet.remove(eventNode);
>       }
>       for (RMApp app : rmContext.getRMApps().values()) {
>         this.rmContext
>             .getDispatcher()
>             .getEventHandler()
>             .handle(
>                 new RMAppNodeUpdateEvent(app.getApplicationId(), eventNode,
>                     RMAppNodeUpdateType.NODE_USABLE));
>       }
>       break;
>     default:
>       LOG.error("Ignoring invalid eventtype " + event.getType());
>     }
>   }
> {code}
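
A minimal, self-contained sketch of the check proposed in the description, not the actual patch. The AppState enum, the TERMINAL set, and appsToNotify are hypothetical stand-ins for the ResourceManager types; the only idea taken from the issue is that applications in a finished/killed/failed state are skipped before an RMAppNodeUpdateEvent would be created for them.

```java
import java.util.ArrayList;
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class NodeUpdateFilterSketch {

    // Hypothetical stand-in for the RM application state; only a few states are modeled.
    enum AppState { ACCEPTED, RUNNING, FINISHED, KILLED, FAILED }

    // States for which a node update is assumed to be unnecessary (per the issue description).
    static final Set<AppState> TERMINAL =
        EnumSet.of(AppState.FINISHED, AppState.KILLED, AppState.FAILED);

    // Returns the ids of the applications that should still receive a node update.
    // In the handler quoted above, this filter would sit inside the loop over
    // rmContext.getRMApps().values(), before the event is handed to the dispatcher.
    static List<String> appsToNotify(Map<String, AppState> apps) {
        List<String> targets = new ArrayList<>();
        for (Map.Entry<String, AppState> entry : apps.entrySet()) {
            if (!TERMINAL.contains(entry.getValue())) {
                targets.add(entry.getKey());
            }
        }
        return targets;
    }

    public static void main(String[] args) {
        Map<String, AppState> apps = new LinkedHashMap<>();
        apps.put("app_1", AppState.RUNNING);
        apps.put("app_2", AppState.FINISHED);
        apps.put("app_3", AppState.KILLED);
        apps.put("app_4", AppState.ACCEPTED);
        apps.put("app_5", AppState.FAILED);

        // Only the two live applications remain: prints [app_1, app_4]
        System.out.println(appsToNotify(apps));
    }
}
```

With a check like this, the number of events queued per node change would scale with the number of live applications rather than with every application the RM still remembers, which is what the log above suggests is flooding the queue.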


