You are viewing a plain text version of this content. The canonical link for it is here.
Posted to yarn-issues@hadoop.apache.org by "Varun Saxena (JIRA)" <ji...@apache.org> on 2014/12/31 07:37:13 UTC

[jira] [Commented] (YARN-2978) ResourceManager crashes with NPE while getting queue info

    [ https://issues.apache.org/jira/browse/YARN-2978?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14261983#comment-14261983 ] 

Varun Saxena commented on YARN-2978:
------------------------------------

This issue is due to lack of synchronization. NPE happens in following piece of code while calling the line {noformat}getApplications(i).isInitialized(){noformat}. 
getApplications(int i) gets object at index i from the underlying applications list. This means List#get returns null object from the index location specified.
{code:title=YarnProtos.java}
public final boolean isInitialized() {
  ....
  for (int i = 0; i < getApplicationsCount(); i++) {
        if (!getApplications(i).isInitialized()) {
          memoizedIsInitialized = 0;
          return false;
        }
      }
   ....
}
{code}

This is because multiple threads can access the same QueueInfo object in case of Capacity scheduler and this can lead to underlying array list which stores applications being modified by different threads, which in turn can lead to NPE. YARN-2979 is also due to lack of synchronization

> ResourceManager crashes with NPE while getting queue info
> ---------------------------------------------------------
>
>                 Key: YARN-2978
>                 URL: https://issues.apache.org/jira/browse/YARN-2978
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.5.1
>            Reporter: Jason Tufo
>            Assignee: Varun Saxena
>
>  java.lang.NullPointerException
> 	at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto.isInitialized(YarnProtos.java:29625)
> 	at org.apache.hadoop.yarn.proto.YarnProtos$QueueInfoProto$Builder.build(YarnProtos.java:29939)
> 	at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.mergeLocalToProto(QueueInfoPBImpl.java:290)
> 	at org.apache.hadoop.yarn.api.records.impl.pb.QueueInfoPBImpl.getProto(QueueInfoPBImpl.java:157)
> 	at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.convertToProtoFormat(GetQueueInfoResponsePBImpl.java:128)
> 	at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToBuilder(GetQueueInfoResponsePBImpl.java:104)
> 	at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.mergeLocalToProto(GetQueueInfoResponsePBImpl.java:111)
> 	at org.apache.hadoop.yarn.api.protocolrecords.impl.pb.GetQueueInfoResponsePBImpl.getProto(GetQueueInfoResponsePBImpl.java:53)
> 	at org.apache.hadoop.yarn.api.impl.pb.service.ApplicationClientProtocolPBServiceImpl.getQueueInfo(ApplicationClientProtocolPBServiceImpl.java:235)
> 	at org.apache.hadoop.yarn.proto.ApplicationClientProtocol$ApplicationClientProtocolService$2.callBlockingMethod(ApplicationClientProtocol.java:333)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:585)
> 	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:928)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
> 	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
> 	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)