Posted to yarn-issues@hadoop.apache.org by "Sunil G (JIRA)" <ji...@apache.org> on 2014/04/24 14:49:20 UTC

[jira] [Commented] (YARN-1980) Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy

    [ https://issues.apache.org/jira/browse/YARN-1980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13979662#comment-13979662 ] 

Sunil G commented on YARN-1980:
-------------------------------

The problem is that when the RM builds the preemption message, it always passes null for strictContainers, the 3rd parameter of Allocation (in FiCaSchedulerApp).

    return new Allocation(allocation.getContainerList(), getHeadroom(), null,
      currentContPreemption, Collections.singletonList(rr),
      allocation.getNMTokenList());

I can see the same code in trunk as well, so the problem is probably still there even with a newer version.

On the AM side, KillAMPreemptionPolicy runs the loop below directly, without any null check:

    for (PreemptionContainer c :
        preemptionRequests.getStrictContract().getContainers()) {
      killContainer(ctxt, c);
    }

Here preemptionRequests.getStrictContract() appears to return null.

Also, getStrictContract() can return null according to the code in PreemptionMessagePBImpl.
So I think we should add a null check for safety in such cases and go straight on to the normal container list.
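
To illustrate the kind of null check suggested above, here is a minimal, self-contained sketch. It does not use the Hadoop classes: Contract, Message and preempt are hypothetical stand-ins for the preemption contract, the preemption message and the policy's preempt method, and "killing" a container is modelled as adding its id to a list. It is only meant to show the shape of the guard, not the content of the attached patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class NullSafePreemptSketch {

    // Hypothetical stand-in for a preemption contract: a list of container ids.
    static final class Contract {
        private final List<String> containers;

        Contract(List<String> containers) {
            this.containers = containers;
        }

        List<String> getContainers() {
            return containers;
        }
    }

    // Hypothetical stand-in for the preemption message; either contract may be null.
    static final class Message {
        private final Contract strict;
        private final Contract normal;

        Message(Contract strict, Contract normal) {
            this.strict = strict;
            this.normal = normal;
        }

        Contract getStrictContract() {
            return strict;
        }

        Contract getContract() {
            return normal;
        }
    }

    // Returns the ids of the containers that would be killed.
    // Each contract is checked for null before its containers are read.
    static List<String> preempt(Message msg) {
        List<String> killed = new ArrayList<>();

        Contract strict = msg.getStrictContract();
        if (strict != null) {
            killed.addAll(strict.getContainers());
        }

        Contract normal = msg.getContract();
        if (normal != null) {
            killed.addAll(normal.getContainers());
        }
        return killed;
    }

    public static void main(String[] args) {
        // Mirrors the reported case: no strict contract, one normal container.
        Message msg = new Message(null,
            new Contract(Collections.singletonList("container_1")));
        System.out.println(preempt(msg));
    }
}
```

With a null strict contract, the sketch skips the first loop instead of throwing and still processes the normal container list.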

I am attaching a patch for this. Please review.

> Possible NPE in KillAMPreemptionPolicy related to ProportionalCapacityPreemptionPolicy
> --------------------------------------------------------------------------------------
>
>                 Key: YARN-1980
>                 URL: https://issues.apache.org/jira/browse/YARN-1980
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager
>    Affects Versions: 2.3.0
>            Reporter: Sunil G
>         Attachments: Yarn-1980.1.patch
>
>
> I configured KillAMPreemptionPolicy for my Application Master and tried to check preemption of queues.
> In one scenario I saw the following NPE in my AM:
> 2014-04-24 15:11:08,860 ERROR [RMCommunicator Allocator] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: ERROR IN CONTACTING RM.
> java.lang.NullPointerException
> 	at org.apache.hadoop.mapreduce.v2.app.rm.preemption.KillAMPreemptionPolicy.preempt(KillAMPreemptionPolicy.java:57)
> 	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.getResources(RMContainerAllocator.java:662)
> 	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.heartbeat(RMContainerAllocator.java:246)
> 	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator$1.run(RMCommunicator.java:267)
> 	at java.lang.Thread.run(Thread.java:662)
> I was using 2.2.0 and merged MAPREDUCE-5189 to see how AM preemption works.



--
This message was sent by Atlassian JIRA
(v6.2#6252)