Posted to yarn-dev@hadoop.apache.org by "caozhiqiang (JIRA)" <ji...@apache.org> on 2019/06/12 05:58:00 UTC

[jira] [Created] (YARN-9619) Transfer error AM host/ip when launching app using docker container with bridge network

caozhiqiang created YARN-9619:
---------------------------------

             Summary: Transfer error AM host/ip when launching app using docker container with bridge network
                 Key: YARN-9619
                 URL: https://issues.apache.org/jira/browse/YARN-9619
             Project: Hadoop YARN
          Issue Type: Bug
          Components: yarn
    Affects Versions: 3.3.0
            Reporter: caozhiqiang


When an application is launched in Docker containers using a bridge network on an overlay network, the client polls the ApplicationMaster for application progress using the wrong host/IP. The client connects to the NodeManager's hostname/IP rather than the IP of the Docker container in which the AM is actually running. The error output is below (the server hadoop3-1/192.168.2.105 is the NM's address, not the AM container's IP, so the connection fails):

2019-05-11 08:28:46,361 INFO ipc.Client: Retrying connect to server: hadoop3-1/192.168.2.105:37963. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2019-05-11 08:28:47,363 INFO ipc.Client: Retrying connect to server: hadoop3-1/192.168.2.105:37963. Already tried 1 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2019-05-11 08:28:48,365 INFO ipc.Client: Retrying connect to server: hadoop3-1/192.168.2.105:37963. Already tried 2 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=3, sleepTime=1000 MILLISECONDS)
2019-05-10 08:34:40,235 INFO mapred.ClientServiceDelegate: Application state is completed. FinalApplicationStatus=FAILED. Redirecting to job history server
2019-05-10 08:35:00,408 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:12020. Already tried 8 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2019-05-10 08:35:00,408 INFO ipc.Client: Retrying connect to server: 0.0.0.0/0.0.0.0:12020. Already tried 9 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
java.io.IOException: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort
 at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:345)
 at org.apache.hadoop.mapred.ClientServiceDelegate.getJobStatus(ClientServiceDelegate.java:430)
 at org.apache.hadoop.mapred.YARNRunner.getJobStatus(YARNRunner.java:871)
 at org.apache.hadoop.mapreduce.Job$1.run(Job.java:331)
 at org.apache.hadoop.mapreduce.Job$1.run(Job.java:328)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1730)
 at org.apache.hadoop.mapreduce.Job.updateStatus(Job.java:328)
 at org.apache.hadoop.mapreduce.Job.isComplete(Job.java:612)
 at org.apache.hadoop.mapreduce.Job.monitorAndPrintJob(Job.java:1629)
 at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1591)
 at org.apache.hadoop.examples.QuasiMonteCarlo.estimatePi(QuasiMonteCarlo.java:307)
 at org.apache.hadoop.examples.QuasiMonteCarlo.run(QuasiMonteCarlo.java:360)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
 at org.apache.hadoop.examples.QuasiMonteCarlo.main(QuasiMonteCarlo.java:368)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:71)
 at org.apache.hadoop.util.ProgramDriver.run(ProgramDriver.java:144)
 at org.apache.hadoop.examples.ExampleDriver.main(ExampleDriver.java:74)
 at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
 at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.util.RunJar.run(RunJar.java:323)
 at org.apache.hadoop.util.RunJar.main(RunJar.java:236)
Caused by: java.net.ConnectException: Your endpoint configuration is wrong; For more details see: http://wiki.apache.org/hadoop/UnsetHostnameOrPort
 at sun.reflect.GeneratedConstructorAccessor20.newInstance(Unknown Source)
 at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
 at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
 at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:831)
 at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:751)
 at org.apache.hadoop.ipc.Client.getRpcResponse(Client.java:1515)
 at org.apache.hadoop.ipc.Client.call(Client.java:1457)
 at org.apache.hadoop.ipc.Client.call(Client.java:1367)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:228)
 at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:116)
 at com.sun.proxy.$Proxy14.getJobReport(Unknown Source)
 at org.apache.hadoop.mapreduce.v2.api.impl.pb.client.MRClientProtocolPBClientImpl.getJobReport(MRClientProtocolPBClientImpl.java:133)
 at sun.reflect.GeneratedMethodAccessor3.invoke(Unknown Source)
 at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 at java.lang.reflect.Method.invoke(Method.java:498)
 at org.apache.hadoop.mapred.ClientServiceDelegate.invoke(ClientServiceDelegate.java:326)
 ... 28 more
Caused by: java.net.ConnectException: Connection refused
 at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
 at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
 at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
 at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
 at org.apache.hadoop.ipc.Client$Connection.setupConnection(Client.java:690)
 at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:794)
 at org.apache.hadoop.ipc.Client$Connection.access$3700(Client.java:411)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1572)
 at org.apache.hadoop.ipc.Client.call(Client.java:1403)
 ... 37 more

 

In the code where the AM registers with the RM, RMCommunicator::register(), I tried using "request.setHost(InetAddress.getLocalHost().getHostAddress());" to report the Docker container's IP, but that does not work either.
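
As a diagnostic aid for the attempt above, here is a minimal sketch that could be run inside the AM's container to compare what InetAddress.getLocalHost() resolves to with the addresses actually bound to the container's network interfaces. InetAddress.getLocalHost() resolves the local hostname through the name service, so its result depends on the container's hostname and hosts configuration. The class name LocalAddressProbe and the method candidateAddresses() are hypothetical and not part of Hadoop; this only shows which addresses the JVM sees and is not a proposed fix for this issue.

```java
import java.net.Inet4Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper; not part of Hadoop.
public class LocalAddressProbe {

    // Collect IPv4 addresses bound to interfaces that are up and not loopback.
    static List<InetAddress> candidateAddresses() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return result; // no interfaces visible to the JVM
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (!nic.isUp() || nic.isLoopback()) {
                continue;
            }
            for (InetAddress addr : Collections.list(nic.getInetAddresses())) {
                if (addr instanceof Inet4Address && !addr.isLoopbackAddress()) {
                    result.add(addr);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) throws SocketException {
        // What the call quoted in this report would hand to request.setHost(...).
        try {
            System.out.println("getLocalHost(): "
                + InetAddress.getLocalHost().getHostAddress());
        } catch (UnknownHostException e) {
            System.out.println("getLocalHost() failed: " + e.getMessage());
        }
        // What is actually bound to the interfaces inside the container.
        for (InetAddress addr : candidateAddresses()) {
            System.out.println("interface address: " + addr.getHostAddress());
        }
    }
}
```

If the two outputs differ, or if neither address is routable from the client host, that may help narrow down whether the problem is in the address the AM reports or in network reachability between the client and the container.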



