Posted to yarn-issues@hadoop.apache.org by "panlijie (Jira)" <ji...@apache.org> on 2020/02/18 02:12:00 UTC

[jira] [Commented] (YARN-9693) When AMRMProxyService is enabled RMCommunicator will register with failure

    [ https://issues.apache.org/jira/browse/YARN-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038735#comment-17038735 ] 

panlijie commented on YARN-9693:
--------------------------------

We configured the NodeManager with the configuration below:
{code:java}
yarn.nodemanager.amrmproxy.enabled                       true
yarn.nodemanager.amrmproxy.interceptor-class.pipeline    org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor
{code}
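For reference, a minimal sketch of how those two properties would typically be expressed in the NodeManager's yarn-site.xml (the property names and values are the ones listed above; the surrounding XML is just the standard Hadoop configuration layout, shown here for illustration only):
{code:xml}
<!-- Minimal sketch of the NodeManager yarn-site.xml entries listed above. -->
<configuration>
  <!-- Enable the AMRMProxy on the NodeManager. -->
  <property>
    <name>yarn.nodemanager.amrmproxy.enabled</name>
    <value>true</value>
  </property>
  <!-- Route AM-to-RM traffic through the FederationInterceptor pipeline. -->
  <property>
    <name>yarn.nodemanager.amrmproxy.interceptor-class.pipeline</name>
    <value>org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor</value>
  </property>
</configuration>
{code}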

With this configuration, the Spark job submission fails with the log below:
{code:java}
[hdfs@rbf jars]$ spark-submit --class org.apache.spark.examples.SparkPi --master yarn --driver-memory 1g --executor-cores 2 --queue default spark-examples_2.11-2.3.1.3.0.1.0-187.jar 10
20/01/07 17:01:04 INFO SparkContext: Running Spark version 2.3.1.3.0.1.0-187
20/01/07 17:01:04 INFO SparkContext: Submitted application: Spark Pi
20/01/07 17:01:04 INFO SecurityManager: Changing view acls to: hdfs
20/01/07 17:01:04 INFO SecurityManager: Changing modify acls to: hdfs
20/01/07 17:01:04 INFO SecurityManager: Changing view acls groups to: 
20/01/07 17:01:04 INFO SecurityManager: Changing modify acls groups to: 
20/01/07 17:01:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hdfs); groups with view permissions: Set(); users  with modify permissions: Set(hdfs); groups with modify permissions: Set()
20/01/07 17:01:04 INFO Utils: Successfully started service 'sparkDriver' on port 45941.
20/01/07 17:01:04 INFO SparkEnv: Registering MapOutputTracker
20/01/07 17:01:04 INFO SparkEnv: Registering BlockManagerMaster
20/01/07 17:01:04 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/01/07 17:01:04 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/01/07 17:01:04 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-498de21a-a616-4826-b839-a9ca32a9272f
20/01/07 17:01:04 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
20/01/07 17:01:05 INFO SparkEnv: Registering OutputCommitCoordinator
20/01/07 17:01:05 INFO log: Logging initialized @1604ms
20/01/07 17:01:05 INFO Server: jetty-9.3.z-SNAPSHOT, build timestamp: 2018-06-06T01:11:56+08:00, git hash: 84205aa28f11a4f31f2a3b86d1bba2cc8ab69827
20/01/07 17:01:05 INFO Server: Started @1676ms
20/01/07 17:01:05 INFO AbstractConnector: Started ServerConnector@2e8ab815{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/01/07 17:01:05 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7c18432b{/jobs,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@14bb2297{/jobs/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@69adf72c{/jobs/job,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@57f791c6{/jobs/job/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@51650883{/stages,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6c4f9535{/stages/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5bd1ceca{/stages/stage,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@596df867{/stages/stage/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@c1fca1e{/stages/pool,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@241a53ef{/stages/pool/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@344344fa{/storage,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@2db2cd5{/storage/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@70e659aa{/storage/rdd,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@615f972{/storage/rdd/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@285f09de{/environment,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@73393584{/environment/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@31500940{/executors,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1827a871{/executors/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@48e64352{/executors/threadDump,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7249dadf{/executors/threadDump/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4362d7df{/static,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@600b0b7{/,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@345e5a17{/api,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1734f68{/jobs/job/kill,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@77b7ffa4{/stages/stage/kill,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://rbf.master:4040
20/01/07 17:01:05 INFO SparkContext: Added JAR file:/usr/hdp/3.0.1.0-187/spark2/examples/jars/spark-examples_2.11-2.3.1.3.0.1.0-187.jar at spark://rbf.master:45941/jars/spark-examples_2.11-2.3.1.3.0.1.0-187.jar with timestamp 1578387665225
20/01/07 17:01:05 INFO RMProxy: Connecting to ResourceManager at /100.7.51.155:8050
20/01/07 17:01:06 INFO Client: Requesting a new application from cluster with 3 NodeManagers
20/01/07 17:01:06 INFO Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.1.0-187/0/resource-types.xml
20/01/07 17:01:06 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (11520 MB per container)
20/01/07 17:01:06 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
20/01/07 17:01:06 INFO Client: Setting up container launch context for our AM
20/01/07 17:01:06 INFO Client: Setting up the launch environment for our AM container
20/01/07 17:01:06 INFO Client: Preparing resources for our AM container
20/01/07 17:01:07 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-yarn-archive.tar.gz
20/01/07 17:01:07 INFO Client: Source and destination file systems are the same. Not copying hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-yarn-archive.tar.gz
20/01/07 17:01:07 INFO Client: Distribute hdfs cache file as spark.sql.hive.metastore.jars for HDP, hdfsCacheFile:hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-hive-archive.tar.gz
20/01/07 17:01:07 INFO Client: Source and destination file systems are the same. Not copying hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-hive-archive.tar.gz
20/01/07 17:01:07 INFO Client: Uploading resource file:/tmp/spark-71f3d871-28fe-466c-97e1-b176bfe45347/__spark_conf__5541745671501623981.zip -> hdfs://ns-fed/user/hdfs/.sparkStaging/application_1578387107489_0003/__spark_conf__.zip
20/01/07 17:01:07 INFO SecurityManager: Changing view acls to: hdfs
20/01/07 17:01:07 INFO SecurityManager: Changing modify acls to: hdfs
20/01/07 17:01:07 INFO SecurityManager: Changing view acls groups to: 
20/01/07 17:01:07 INFO SecurityManager: Changing modify acls groups to: 
20/01/07 17:01:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(hdfs); groups with view permissions: Set(); users  with modify permissions: Set(hdfs); groups with modify permissions: Set()
20/01/07 17:01:07 INFO Client: Submitting application application_1578387107489_0003 to ResourceManager
20/01/07 17:01:07 INFO YarnClientImpl: Submitted application application_1578387107489_0003
20/01/07 17:01:07 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1578387107489_0003 and attemptId None
20/01/07 17:01:08 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:08 INFO Client: 
	 client token: N/A
	 diagnostics: AM container is launched, waiting for AM container to Register with RM
	 ApplicationMaster host: N/A
	 ApplicationMaster RPC port: -1
	 queue: default
	 start time: 1578387667128
	 final status: UNDEFINED
	 tracking URL: http://work1.rbf1:8088/proxy/application_1578387107489_0003/
	 user: hdfs
20/01/07 17:01:09 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:10 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> work1.rbf1,work2.rbf1, PROXY_URI_BASES -> http://work1.rbf1:8088/proxy/application_1578387107489_0003,http://work2.rbf1:8088/proxy/application_1578387107489_0003, RM_HA_URLS -> work1.rbf1:8088,work2.rbf1:8088), /proxy/application_1578387107489_0003
20/01/07 17:01:10 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
20/01/07 17:01:10 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:11 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:12 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:13 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> work1.rbf1,work2.rbf1, PROXY_URI_BASES -> http://work1.rbf1:8088/proxy/application_1578387107489_0003,http://work2.rbf1:8088/proxy/application_1578387107489_0003, RM_HA_URLS -> work1.rbf1:8088,work2.rbf1:8088), /proxy/application_1578387107489_0003
20/01/07 17:01:13 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
20/01/07 17:01:13 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:14 INFO Client: Application report for application_1578387107489_0003 (state: FAILED)
20/01/07 17:01:14 INFO Client: 
	 client token: N/A
	 diagnostics: Application application_1578387107489_0003 failed 2 times due to AM Container for appattempt_1578387107489_0003_000002 exited with  exitCode: 13
Failing this attempt.Diagnostics: [2020-01-07 17:01:13.267]Exception from container-launch.
Container id: container_e76_1578387107489_0003_02_000001
Exit code: 13

[2020-01-07 17:01:13.272]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
ntiateIOException(RPCUtil.java:80)
	at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy15.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:247)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:234)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:214)
	at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:75)
	at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:462)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:534)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:347)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
	at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:869)
	at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1578387107489_0003_000002
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
	at org.apache.hadoop.ipc.Client.call(Client.java:1443)
	at org.apache.hadoop.ipc.Client.call(Client.java:1353)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy14.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
	... 29 more
20/01/07 17:01:12 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1578387107489_0003_000002)
20/01/07 17:01:12 INFO ApplicationMaster: Deleting staging directory hdfs://ns-fed/user/hdfs/.sparkStaging/application_1578387107489_0003
20/01/07 17:01:12 INFO ShutdownHookManager: Shutdown hook called


[2020-01-07 17:01:13.273]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
ntiateIOException(RPCUtil.java:80)
	at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:498)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
	at com.sun.proxy.$Proxy15.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:247)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:234)
	at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:214)
	at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:75)
	at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:462)
	at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:534)
	at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:347)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
	at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
	at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
	at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
	at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
	at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:869)
	at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1578387107489_0003_000002
	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
	at org.apache.hadoop.ipc.Client.call(Client.java:1443)
	at org.apache.hadoop.ipc.Client.call(Client.java:1353)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
	at com.sun.proxy.$Proxy14.registerApplicationMaster(Unknown Source)
	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
	... 29 more
20/01/07 17:01:12 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1578387107489_0003_000002)
20/01/07 17:01:12 INFO ApplicationMaster: Deleting staging directory hdfs://ns-fed/user/hdfs/.sparkStaging/application_1578387107489_0003
20/01/07 17:01:12 INFO ShutdownHookManager: Shutdown hook called
{code}

> When AMRMProxyService is enabled RMCommunicator will register with failure
> --------------------------------------------------------------------------
>
>                 Key: YARN-9693
>                 URL: https://issues.apache.org/jira/browse/YARN-9693
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: federation
>    Affects Versions: 3.1.2
>            Reporter: zhoukang
>            Assignee: zhoukang
>            Priority: Major
>         Attachments: YARN-9693.001.patch
>
>
> When we enable the AMRMProxy service, the RMCommunicator fails to register with the error below:
> {code:java}
> 2019-07-23 17:12:44,794 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
> 2019-07-23 17:12:44,794 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1563872237585_0001_000002
> 	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:186)
> 	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:123)
> 	at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
> 	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:986)
> 	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> 	at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1300)
> 	at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1768)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:422)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1716)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1764)
> 	at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1698)
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1563872237585_0001_000002
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> 	at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> 	at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> 	at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
> 	at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> 	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> 	at com.sun.proxy.$Proxy93.registerApplicationMaster(Unknown Source)
> 	at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:170)
> 	... 14 more
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1563872237585_0001_000002
> 	at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1541)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1487)
> 	at org.apache.hadoop.ipc.Client.call(Client.java:1397)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> 	at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> 	at com.sun.proxy.$Proxy92.registerApplicationMaster(Unknown Source)
> 	at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
> {code}
> We configured the NodeManager with the configuration below:
> {code:java}
> yarn.nodemanager.amrmproxy.enabled	true
> yarn.nodemanager.amrmproxy.interceptor-class.pipeline 	org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor
> {code}


