Posted to mapreduce-dev@hadoop.apache.org by "Chris Riccomini (JIRA)" <ji...@apache.org> on 2011/09/21 00:34:08 UTC

[jira] [Created] (MAPREDUCE-3053) YARN Protobuf RPC Failures in RM

YARN Protobuf RPC Failures in RM
--------------------------------

                 Key: MAPREDUCE-3053
                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3053
             Project: Hadoop Map/Reduce
          Issue Type: Bug
          Components: resourcemanager
    Affects Versions: 0.24.0
         Environment: URL: http://svn.apache.org/repos/asf/hadoop/common/trunk
Repository Root: http://svn.apache.org/repos/asf
Repository UUID: 13f79535-47bb-0310-9956-ffa450edef68
Revision: 1173401
Node Kind: directory
Schedule: normal
Last Changed Author: vinodkv
Last Changed Rev: 1173130
Last Changed Date: 2011-09-20 06:11:12 -0700 (Tue, 20 Sep 2011)

            Reporter: Chris Riccomini


When I try to register my ApplicationMaster with YARN's RM, it fails.

In my ApplicationMaster's logs:

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
	at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:108)
	at kafka.yarn.util.ApplicationMasterHelper.registerWithResourceManager(YarnHelper.scala:48)
	at kafka.yarn.ApplicationMaster$.main(ApplicationMaster.scala:32)
	at kafka.yarn.ApplicationMaster.main(ApplicationMaster.scala)
Caused by: com.google.protobuf.ServiceException: java.lang.NullPointerException: java.lang.NullPointerException
	at org.apache.hadoop.yarn.proto.ClientRMProtocol$ClientRMProtocolService$2.getRequestPrototype(ClientRMProtocol.java:186)
	at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:323)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1489)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1485)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1483)

	at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:130)
	at $Proxy6.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:101)
	... 3 more
Caused by: java.lang.NullPointerException: java.lang.NullPointerException
	at org.apache.hadoop.yarn.proto.ClientRMProtocol$ClientRMProtocolService$2.getRequestPrototype(ClientRMProtocol.java:186)
	at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:323)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1489)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1485)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1483)

	at org.apache.hadoop.ipc.Client.call(Client.java:1084)
	at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Invoker.invoke(ProtoOverHadoopRpcEngine.java:127)
	... 5 more


In the ResourceManager's logs:

2011-09-20 15:11:20,973 INFO  ipc.Server (Server.java:run(1497)) - IPC Server handler 2 on 8040, call: org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$ProtoSpecificRequestWritable@455dd32a from 127.0.0.1:33793, error: 
java.lang.NullPointerException
	at org.apache.hadoop.yarn.proto.ClientRMProtocol$ClientRMProtocolService$2.getRequestPrototype(ClientRMProtocol.java:186)
	at org.apache.hadoop.yarn.ipc.ProtoOverHadoopRpcEngine$Server.call(ProtoOverHadoopRpcEngine.java:323)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1489)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:1485)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:396)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1135)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:1483)

My registration code:

    val appId = args(0).toInt
    val attemptId = args(1).toInt
    val timestamp = args(2).toLong

    // these are our application master's parameters
    val streamerClass = args(3)
    val tasks = args(4).toInt

    // TODO log params here

    // start the application master helper
    val conf = new Configuration
    val applicationMasterHelper = new ApplicationMasterHelper(appId, attemptId, timestamp, conf)
      .registerWithResourceManager

  .....

  val rpc = YarnRPC.create(conf)
  val appId = Records.newRecord(classOf[ApplicationId])
  val appAttemptId = Records.newRecord(classOf[ApplicationAttemptId])
  val rmAddress = NetUtils.createSocketAddr(conf.get(YarnConfiguration.RM_ADDRESS, YarnConfiguration.DEFAULT_RM_ADDRESS))
  val resourceManager = rpc.getProxy(classOf[AMRMProtocol], rmAddress, conf).asInstanceOf[AMRMProtocol]
  var requestId = 0

  appId.setClusterTimestamp(lTimestamp)
  appId.setId(iAppId)
  appAttemptId.setApplicationId(appId)
  appAttemptId.setAttemptId(iAppAttemptId)

  def registerWithResourceManager(): ApplicationMasterHelper = {
    val req = Records.newRecord(classOf[RegisterApplicationMasterRequest])
    req.setApplicationAttemptId(appAttemptId)
    // TODO not sure why these are blank; this is how Spark does it
    req.setHost("")
    req.setRpcPort(1)
    req.setTrackingUrl("")
    resourceManager.registerApplicationMaster(req)
    this
  }

My params receive the expected app ID, attempt ID, and cluster timestamp:

app - 1
attempt - 1
timestamp - 1316556657998


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Resolved] (MAPREDUCE-3053) YARN Protobuf RPC Failures in RM

Posted by "Chris Riccomini (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chris Riccomini resolved MAPREDUCE-3053.
----------------------------------------

    Resolution: Fixed

Fixed with:

  val rmAddress = NetUtils.createSocketAddr(conf.get(YarnConfiguration.RM_SCHEDULER_ADDRESS, YarnConfiguration.DEFAULT_RM_SCHEDULER_ADDRESS))

See comment as to why.
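
The referenced comment is not part of this message, so the following is only one reading of the stack traces above. The NullPointerException is raised inside the ClientRMProtocol service on port 8040, even though the call being made is AMRMProtocol.registerApplicationMaster. That suggests RM_ADDRESS names the client-facing endpoint, while the ApplicationMaster protocol is served at the scheduler address. A minimal sketch of that idea, in which Protocol, Endpoints, and the key strings are hypothetical stand-ins and not Hadoop classes or real configuration keys:

```scala
// Illustrative model only: nothing here is part of Hadoop.
sealed trait Protocol
case object ClientRM extends Protocol // spoken by job clients
case object AMRM extends Protocol     // spoken by ApplicationMasters

object Endpoints {
  // Each protocol is served at its own address, so each one maps to
  // its own configuration key (key names are made up for this sketch).
  def addressKey(p: Protocol): String = p match {
    case ClientRM => "rm.address"
    case AMRM     => "rm.scheduler.address"
  }

  // Look up the address for the protocol the caller actually speaks.
  def resolve(conf: Map[String, String], p: Protocol): Option[String] =
    conf.get(addressKey(p))
}
```

Under this model, building an AMRM proxy from the ClientRM address would send registerApplicationMaster to a service that does not define it, which matches the failure reported above.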


        

[jira] [Reopened] (MAPREDUCE-3053) YARN Protobuf RPC Failures in RM

Posted by "Vinod Kumar Vavilapalli (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/MAPREDUCE-3053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vinod Kumar Vavilapalli reopened MAPREDUCE-3053:
------------------------------------------------

      Assignee: Vinod Kumar Vavilapalli

I think it makes sense to fix the PB-RPC layer to check for wrong methods and throw a better error message.
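
As a rough illustration of the kind of check suggested here, the sketch below looks up the requested method before dispatching and fails with a descriptive message instead of letting a null lookup turn into a NullPointerException. It is not the ProtoOverHadoopRpcEngine code: Dispatcher, UnknownMethodException, and the handler map are hypothetical names introduced for this example.

```scala
// Hypothetical model of a server-side method lookup; not the Hadoop code.
final class UnknownMethodException(msg: String) extends RuntimeException(msg)

final class Dispatcher(protocolName: String,
                       handlers: Map[String, String => String]) {

  // Dispatch a call by method name, rejecting names this protocol
  // does not define with an error that says what went wrong.
  def call(method: String, payload: String): String =
    handlers.get(method) match {
      case Some(handler) => handler(payload)
      case None =>
        val known = handlers.keys.toSeq.sorted.mkString(", ")
        throw new UnknownMethodException(
          s"Protocol $protocolName has no method '$method'; known methods: $known")
    }
}
```

With a check like this, a client that reaches the wrong service would see the protocol name and the offending method name in the error, which points much more directly at a misconfigured address.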
