Posted to yarn-issues@hadoop.apache.org by "panlijie (Jira)" <ji...@apache.org> on 2020/02/18 02:12:00 UTC
[jira] [Commented] (YARN-9693) When AMRMProxyService is enabled
RMCommunicator will register with failure
[ https://issues.apache.org/jira/browse/YARN-9693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038735#comment-17038735 ]
panlijie commented on YARN-9693:
--------------------------------
We configured the NM with the settings below:
{code:java}
yarn.nodemanager.amrmproxy.enabled true
yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor
{code}
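For reference, the same two properties in yarn-site.xml form (a sketch of our NM config; the property names are the ones above, the XML layout is the standard Hadoop configuration format):
{code:xml}
<!-- Enable the AMRMProxy in the NodeManager -->
<property>
  <name>yarn.nodemanager.amrmproxy.enabled</name>
  <value>true</value>
</property>
<!-- Route AM-to-RM traffic through the FederationInterceptor -->
<property>
  <name>yarn.nodemanager.amrmproxy.interceptor-class.pipeline</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor</value>
</property>
{code}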
but the submission fails with the error log below:
{code:java}
[hdfs@rbf jars]$ spark-submit --class org.apache.spark.examples.SparkPi --master yarn --driver-memory 1g --executor-cores 2 --queue default spark-examples_2.11-2.3.1.3.0.1.0-187.jar 10
20/01/07 17:01:04 INFO SparkContext: Running Spark version 2.3.1.3.0.1.0-187
20/01/07 17:01:04 INFO SparkContext: Submitted application: Spark Pi
20/01/07 17:01:04 INFO SecurityManager: Changing view acls to: hdfs
20/01/07 17:01:04 INFO SecurityManager: Changing modify acls to: hdfs
20/01/07 17:01:04 INFO SecurityManager: Changing view acls groups to:
20/01/07 17:01:04 INFO SecurityManager: Changing modify acls groups to:
20/01/07 17:01:04 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs); groups with view permissions: Set(); users with modify permissions: Set(hdfs); groups with modify permissions: Set()
20/01/07 17:01:04 INFO Utils: Successfully started service 'sparkDriver' on port 45941.
20/01/07 17:01:04 INFO SparkEnv: Registering MapOutputTracker
20/01/07 17:01:04 INFO SparkEnv: Registering BlockManagerMaster
20/01/07 17:01:04 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
20/01/07 17:01:04 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
20/01/07 17:01:04 INFO DiskBlockManager: Created local directory at /tmp/blockmgr-498de21a-a616-4826-b839-a9ca32a9272f
20/01/07 17:01:04 INFO MemoryStore: MemoryStore started with capacity 366.3 MB
20/01/07 17:01:05 INFO SparkEnv: Registering OutputCommitCoordinator
20/01/07 17:01:05 INFO log: Logging initialized @1604ms
20/01/07 17:01:05 INFO Server: jetty-9.3.z-SNAPSHOT, build timestamp: 2018-06-06T01:11:56+08:00, git hash: 84205aa28f11a4f31f2a3b86d1bba2cc8ab69827
20/01/07 17:01:05 INFO Server: Started @1676ms
20/01/07 17:01:05 INFO AbstractConnector: Started ServerConnector@2e8ab815{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/01/07 17:01:05 INFO Utils: Successfully started service 'SparkUI' on port 4040.
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7c18432b{/jobs,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@14bb2297{/jobs/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@69adf72c{/jobs/job,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@57f791c6{/jobs/job/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@51650883{/stages,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@6c4f9535{/stages/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@5bd1ceca{/stages/stage,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@596df867{/stages/stage/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@c1fca1e{/stages/pool,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@241a53ef{/stages/pool/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@344344fa{/storage,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@2db2cd5{/storage/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@70e659aa{/storage/rdd,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@615f972{/storage/rdd/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@285f09de{/environment,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@73393584{/environment/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@31500940{/executors,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1827a871{/executors/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@48e64352{/executors/threadDump,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@7249dadf{/executors/threadDump/json,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@4362d7df{/static,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@600b0b7{/,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@345e5a17{/api,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@1734f68{/jobs/job/kill,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO ContextHandler: Started o.s.j.s.ServletContextHandler@77b7ffa4{/stages/stage/kill,null,AVAILABLE,@Spark}
20/01/07 17:01:05 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://rbf.master:4040
20/01/07 17:01:05 INFO SparkContext: Added JAR file:/usr/hdp/3.0.1.0-187/spark2/examples/jars/spark-examples_2.11-2.3.1.3.0.1.0-187.jar at spark://rbf.master:45941/jars/spark-examples_2.11-2.3.1.3.0.1.0-187.jar with timestamp 1578387665225
20/01/07 17:01:05 INFO RMProxy: Connecting to ResourceManager at /100.7.51.155:8050
20/01/07 17:01:06 INFO Client: Requesting a new application from cluster with 3 NodeManagers
20/01/07 17:01:06 INFO Configuration: found resource resource-types.xml at file:/etc/hadoop/3.0.1.0-187/0/resource-types.xml
20/01/07 17:01:06 INFO Client: Verifying our application has not requested more than the maximum memory capability of the cluster (11520 MB per container)
20/01/07 17:01:06 INFO Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
20/01/07 17:01:06 INFO Client: Setting up container launch context for our AM
20/01/07 17:01:06 INFO Client: Setting up the launch environment for our AM container
20/01/07 17:01:06 INFO Client: Preparing resources for our AM container
20/01/07 17:01:07 INFO Client: Use hdfs cache file as spark.yarn.archive for HDP, hdfsCacheFile:hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-yarn-archive.tar.gz
20/01/07 17:01:07 INFO Client: Source and destination file systems are the same. Not copying hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-yarn-archive.tar.gz
20/01/07 17:01:07 INFO Client: Distribute hdfs cache file as spark.sql.hive.metastore.jars for HDP, hdfsCacheFile:hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-hive-archive.tar.gz
20/01/07 17:01:07 INFO Client: Source and destination file systems are the same. Not copying hdfs://ns-fed/hdp/apps/3.0.1.0-187/spark2/spark2-hdp-hive-archive.tar.gz
20/01/07 17:01:07 INFO Client: Uploading resource file:/tmp/spark-71f3d871-28fe-466c-97e1-b176bfe45347/__spark_conf__5541745671501623981.zip -> hdfs://ns-fed/user/hdfs/.sparkStaging/application_1578387107489_0003/__spark_conf__.zip
20/01/07 17:01:07 INFO SecurityManager: Changing view acls to: hdfs
20/01/07 17:01:07 INFO SecurityManager: Changing modify acls to: hdfs
20/01/07 17:01:07 INFO SecurityManager: Changing view acls groups to:
20/01/07 17:01:07 INFO SecurityManager: Changing modify acls groups to:
20/01/07 17:01:07 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(hdfs); groups with view permissions: Set(); users with modify permissions: Set(hdfs); groups with modify permissions: Set()
20/01/07 17:01:07 INFO Client: Submitting application application_1578387107489_0003 to ResourceManager
20/01/07 17:01:07 INFO YarnClientImpl: Submitted application application_1578387107489_0003
20/01/07 17:01:07 INFO SchedulerExtensionServices: Starting Yarn extension services with app application_1578387107489_0003 and attemptId None
20/01/07 17:01:08 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:08 INFO Client:
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1578387667128
final status: UNDEFINED
tracking URL: http://work1.rbf1:8088/proxy/application_1578387107489_0003/
user: hdfs
20/01/07 17:01:09 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:10 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> work1.rbf1,work2.rbf1, PROXY_URI_BASES -> http://work1.rbf1:8088/proxy/application_1578387107489_0003,http://work2.rbf1:8088/proxy/application_1578387107489_0003, RM_HA_URLS -> work1.rbf1:8088,work2.rbf1:8088), /proxy/application_1578387107489_0003
20/01/07 17:01:10 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
20/01/07 17:01:10 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:11 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:12 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:13 INFO YarnClientSchedulerBackend: Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> work1.rbf1,work2.rbf1, PROXY_URI_BASES -> http://work1.rbf1:8088/proxy/application_1578387107489_0003,http://work2.rbf1:8088/proxy/application_1578387107489_0003, RM_HA_URLS -> work1.rbf1:8088,work2.rbf1:8088), /proxy/application_1578387107489_0003
20/01/07 17:01:13 INFO JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /jobs, /jobs/json, /jobs/job, /jobs/job/json, /stages, /stages/json, /stages/stage, /stages/stage/json, /stages/pool, /stages/pool/json, /storage, /storage/json, /storage/rdd, /storage/rdd/json, /environment, /environment/json, /executors, /executors/json, /executors/threadDump, /executors/threadDump/json, /static, /, /api, /jobs/job/kill, /stages/stage/kill.
20/01/07 17:01:13 INFO Client: Application report for application_1578387107489_0003 (state: ACCEPTED)
20/01/07 17:01:14 INFO Client: Application report for application_1578387107489_0003 (state: FAILED)
20/01/07 17:01:14 INFO Client:
client token: N/A
diagnostics: Application application_1578387107489_0003 failed 2 times due to AM Container for appattempt_1578387107489_0003_000002 exited with exitCode: 13
Failing this attempt.Diagnostics: [2020-01-07 17:01:13.267]Exception from container-launch.
Container id: container_e76_1578387107489_0003_02_000001
Exit code: 13
[2020-01-07 17:01:13.272]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
ntiateIOException(RPCUtil.java:80)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy15.registerApplicationMaster(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:247)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:234)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:214)
at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:75)
at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:462)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:534)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:347)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:869)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1578387107489_0003_000002
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
at org.apache.hadoop.ipc.Client.call(Client.java:1443)
at org.apache.hadoop.ipc.Client.call(Client.java:1353)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy14.registerApplicationMaster(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
... 29 more
20/01/07 17:01:12 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1578387107489_0003_000002)
20/01/07 17:01:12 INFO ApplicationMaster: Deleting staging directory hdfs://ns-fed/user/hdfs/.sparkStaging/application_1578387107489_0003
20/01/07 17:01:12 INFO ShutdownHookManager: Shutdown hook called
[2020-01-07 17:01:13.273]Container exited with a non-zero exit code 13. Error file: prelaunch.err.
Last 4096 bytes of prelaunch.err :
Last 4096 bytes of stderr :
ntiateIOException(RPCUtil.java:80)
at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
at com.sun.proxy.$Proxy15.registerApplicationMaster(Unknown Source)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:247)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:234)
at org.apache.hadoop.yarn.client.api.impl.AMRMClientImpl.registerApplicationMaster(AMRMClientImpl.java:214)
at org.apache.spark.deploy.yarn.YarnRMClient.register(YarnRMClient.scala:75)
at org.apache.spark.deploy.yarn.ApplicationMaster.registerAM(ApplicationMaster.scala:462)
at org.apache.spark.deploy.yarn.ApplicationMaster.runExecutorLauncher(ApplicationMaster.scala:534)
at org.apache.spark.deploy.yarn.ApplicationMaster.org$apache$spark$deploy$yarn$ApplicationMaster$$runImpl(ApplicationMaster.scala:347)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply$mcV$sp(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$2.apply(ApplicationMaster.scala:260)
at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$5.run(ApplicationMaster.scala:815)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
at org.apache.spark.deploy.yarn.ApplicationMaster.doAsUser(ApplicationMaster.scala:814)
at org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:259)
at org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:839)
at org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:869)
at org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1578387107489_0003_000002
at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1497)
at org.apache.hadoop.ipc.Client.call(Client.java:1443)
at org.apache.hadoop.ipc.Client.call(Client.java:1353)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
at com.sun.proxy.$Proxy14.registerApplicationMaster(Unknown Source)
at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
... 29 more
20/01/07 17:01:12 INFO ApplicationMaster: Final app status: FAILED, exitCode: 13, (reason: Uncaught exception: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1578387107489_0003_000002)
20/01/07 17:01:12 INFO ApplicationMaster: Deleting staging directory hdfs://ns-fed/user/hdfs/.sparkStaging/application_1578387107489_0003
20/01/07 17:01:12 INFO ShutdownHookManager: Shutdown hook called
{code}
> When AMRMProxyService is enabled RMCommunicator will register with failure
> --------------------------------------------------------------------------
>
> Key: YARN-9693
> URL: https://issues.apache.org/jira/browse/YARN-9693
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: federation
> Affects Versions: 3.1.2
> Reporter: zhoukang
> Assignee: zhoukang
> Priority: Major
> Attachments: YARN-9693.001.patch
>
>
> When we enable the AMRMProxy service, RMCommunicator registration fails as below:
> {code:java}
> 2019-07-23 17:12:44,794 INFO [TaskHeartbeatHandler PingChecker] org.apache.hadoop.mapreduce.v2.app.TaskHeartbeatHandler: TaskHeartbeatHandler thread interrupted
> 2019-07-23 17:12:44,794 ERROR [main] org.apache.hadoop.mapreduce.v2.app.MRAppMaster: Error starting MRAppMaster
> org.apache.hadoop.yarn.exceptions.YarnRuntimeException: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1563872237585_0001_000002
> at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:186)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:123)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:986)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:121)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1300)
> at org.apache.hadoop.service.AbstractService.start(AbstractService.java:194)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1768)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1716)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1764)
> at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1698)
> Caused by: org.apache.hadoop.security.token.SecretManager$InvalidToken: Invalid AMRMToken from appattempt_1563872237585_0001_000002
> at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
> at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
> at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
> at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateException(RPCUtil.java:53)
> at org.apache.hadoop.yarn.ipc.RPCUtil.instantiateIOException(RPCUtil.java:80)
> at org.apache.hadoop.yarn.ipc.RPCUtil.unwrapAndThrowException(RPCUtil.java:119)
> at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:109)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:422)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeMethod(RetryInvocationHandler.java:165)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invoke(RetryInvocationHandler.java:157)
> at org.apache.hadoop.io.retry.RetryInvocationHandler$Call.invokeOnce(RetryInvocationHandler.java:95)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:359)
> at com.sun.proxy.$Proxy93.registerApplicationMaster(Unknown Source)
> at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:170)
> ... 14 more
> Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): Invalid AMRMToken from appattempt_1563872237585_0001_000002
> at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1541)
> at org.apache.hadoop.ipc.Client.call(Client.java:1487)
> at org.apache.hadoop.ipc.Client.call(Client.java:1397)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
> at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
> at com.sun.proxy.$Proxy92.registerApplicationMaster(Unknown Source)
> at org.apache.hadoop.yarn.api.impl.pb.client.ApplicationMasterProtocolPBClientImpl.registerApplicationMaster(ApplicationMasterProtocolPBClientImpl.java:107)
> {code}
> We configured the NM with the settings below:
> {code:java}
> yarn.nodemanager.amrmproxy.enabled true
> yarn.nodemanager.amrmproxy.interceptor-class.pipeline org.apache.hadoop.yarn.server.nodemanager.amrmproxy.FederationInterceptor
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)