Posted to mapreduce-issues@hadoop.apache.org by "Sandy Ryza (JIRA)" <ji...@apache.org> on 2013/01/08 01:30:14 UTC

[jira] [Commented] (MAPREDUCE-4922) Request with multiple data local nodes can cause NPE in AppSchedulingInfo

    [ https://issues.apache.org/jira/browse/MAPREDUCE-4922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546446#comment-13546446 ] 

Sandy Ryza commented on MAPREDUCE-4922:
---------------------------------------

2012-12-20 05:45:12,554 FATAL org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error in handling event type NODE_UPDATE to the scheduler
java.lang.NullPointerException
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocateNodeLocal(AppSchedulingInfo.java:262)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.AppSchedulingInfo.allocate(AppSchedulingInfo.java:223)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSSchedulerApp.allocate(FSSchedulerApp.java:551)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:250)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.AppSchedulable.assignContainer(AppSchedulable.java:318)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FSLeafQueue.assignContainer(FSLeafQueue.java:180)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.nodeUpdate(FairScheduler.java:780)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:843)
        at org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler.handle(FairScheduler.java:98)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager$SchedulerEventDispatcher$EventProcessor.run(ResourceManager.java:330)
        at java.lang.Thread.run(Thread.java:662)
2012-12-20 05:45:12,556 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Exiting, bbye..
                
> Request with multiple data local nodes can cause NPE in AppSchedulingInfo
> -------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4922
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4922
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: applicationmaster, mr-am, mrv2, scheduler
>    Affects Versions: 2.0.2-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>
> With the way that the schedulers work, each request for a container on a node must consist of 3 ResourceRequests - one on the node, one on the rack, and one with *.
> AppSchedulingInfo tracks the outstanding requests.  When a node is assigned a node-local container, allocateNodeLocal decrements the outstanding requests at each level - node, rack, and *.  If the rack requests reach 0, it removes the mapping.
> A MapReduce task with multiple data-local nodes submits multiple container requests, one for each node.  It also submits one for each unique rack, and one for *.  If there are fewer unique racks than data-local nodes, fewer rack-local ResourceRequests are submitted than node-local ResourceRequests.  The rack-local mapping is therefore deleted before all the node-local requests are allocated, and an NPE is thrown the next time a node-local request from that rack is allocated.
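
The sequence described above can be sketched in a few lines of Java. This is only a simplified model for illustration, not the real AppSchedulingInfo: the class OutstandingRequestsSketch, its addRequest method, and the node1/node2/rack1 names are all hypothetical, and the map of plain counters stands in for the actual ResourceRequest bookkeeping. It assumes one task whose data lives on two nodes of the same rack, so two node-level requests share a single rack-level request.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the bookkeeping described above; not the real AppSchedulingInfo.
class OutstandingRequestsSketch {
    static final String ANY = "*";

    // Resource name (node, rack, or "*") -> number of outstanding containers.
    private final Map<String, Integer> outstanding = new HashMap<>();

    void addRequest(String resourceName, int numContainers) {
        outstanding.put(resourceName, numContainers);
    }

    // Decrement node, rack, and "*"; drop the rack mapping once it reaches 0.
    void allocateNodeLocal(String node, String rack) {
        outstanding.put(node, outstanding.get(node) - 1);

        // Unboxing throws NullPointerException if the rack mapping is already gone.
        int rackRemaining = outstanding.get(rack) - 1;
        if (rackRemaining == 0) {
            outstanding.remove(rack);
        } else {
            outstanding.put(rack, rackRemaining);
        }

        outstanding.put(ANY, outstanding.get(ANY) - 1);
    }

    public static void main(String[] args) {
        OutstandingRequestsSketch info = new OutstandingRequestsSketch();
        // One task whose data lives on two nodes of the same rack.
        info.addRequest("node1", 1);
        info.addRequest("node2", 1);
        info.addRequest("rack1", 1);
        info.addRequest(ANY, 1);

        // First node-local allocation: rack1 reaches 0 and its mapping is removed.
        info.allocateNodeLocal("node1", "rack1");
        try {
            // Second node-local allocation on the same rack finds no rack mapping.
            info.allocateNodeLocal("node2", "rack1");
        } catch (NullPointerException e) {
            System.out.println("NPE on second node-local allocation");
        }
    }
}
```

In this model the failure goes away if the rack-level count matches the number of node-level requests on that rack, or if the lookup tolerates a missing rack mapping; which of these the actual fix should take is not stated here.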
