Posted to issues@flink.apache.org by "Dian Fu (Jira)" <ji...@apache.org> on 2020/08/23 01:39:00 UTC

[jira] [Commented] (FLINK-18117) "Kerberized YARN per-job on Docker test" fails with "Could not start hadoop cluster."

    [ https://issues.apache.org/jira/browse/FLINK-18117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17182556#comment-17182556 ] 

Dian Fu commented on FLINK-18117:
---------------------------------

Instance on master: [https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=5788&view=logs&j=91bf6583-3fb2-592f-e4d4-d79d79c3230a&t=3425d8ba-5f03-540a-c64b-51b8481bf7d6]

> "Kerberized YARN per-job on Docker test" fails with "Could not start hadoop cluster."
> -------------------------------------------------------------------------------------
>
>                 Key: FLINK-18117
>                 URL: https://issues.apache.org/jira/browse/FLINK-18117
>             Project: Flink
>          Issue Type: Bug
>          Components: Deployment / YARN
>    Affects Versions: 1.11.0
>            Reporter: Robert Metzger
>            Priority: Critical
>              Labels: test-stability
>
> https://dev.azure.com/apache-flink/apache-flink/_build/results?buildId=2683&view=logs&j=c88eea3b-64a0-564d-0031-9fdcd7b8abee&t=1e2bbe5b-4657-50be-1f07-d84bfce5b1f5
> {code}
> 2020-06-04T06:03:53.2844296Z Creating slave1 ... done
> 2020-06-04T06:03:53.4981251Z Waiting for hadoop cluster to come up. We have been trying for 0 seconds, retrying ...
> 2020-06-04T06:03:58.5980181Z Waiting for hadoop cluster to come up. We have been trying for 5 seconds, retrying ...
> 2020-06-04T06:04:03.6997087Z Waiting for hadoop cluster to come up. We have been trying for 10 seconds, retrying ...
> 2020-06-04T06:04:08.7910791Z Waiting for hadoop cluster to come up. We have been trying for 15 seconds, retrying ...
> 2020-06-04T06:04:13.8921621Z Waiting for hadoop cluster to come up. We have been trying for 20 seconds, retrying ...
> 2020-06-04T06:04:18.9648844Z Waiting for hadoop cluster to come up. We have been trying for 25 seconds, retrying ...
> 2020-06-04T06:04:24.0381851Z Waiting for hadoop cluster to come up. We have been trying for 31 seconds, retrying ...
> 2020-06-04T06:04:29.1220264Z Waiting for hadoop cluster to come up. We have been trying for 36 seconds, retrying ...
> 2020-06-04T06:04:34.1882187Z Waiting for hadoop cluster to come up. We have been trying for 41 seconds, retrying ...
> 2020-06-04T06:04:39.2784948Z Waiting for hadoop cluster to come up. We have been trying for 46 seconds, retrying ...
> 2020-06-04T06:04:44.3843337Z Waiting for hadoop cluster to come up. We have been trying for 51 seconds, retrying ...
> 2020-06-04T06:04:49.4703561Z Waiting for hadoop cluster to come up. We have been trying for 56 seconds, retrying ...
> 2020-06-04T06:04:54.5463207Z Waiting for hadoop cluster to come up. We have been trying for 61 seconds, retrying ...
> 2020-06-04T06:04:59.6650405Z Waiting for hadoop cluster to come up. We have been trying for 66 seconds, retrying ...
> 2020-06-04T06:05:04.7500168Z Waiting for hadoop cluster to come up. We have been trying for 71 seconds, retrying ...
> 2020-06-04T06:05:09.8177904Z Waiting for hadoop cluster to come up. We have been trying for 76 seconds, retrying ...
> 2020-06-04T06:05:14.9751297Z Waiting for hadoop cluster to come up. We have been trying for 81 seconds, retrying ...
> 2020-06-04T06:05:20.0336417Z Waiting for hadoop cluster to come up. We have been trying for 87 seconds, retrying ...
> 2020-06-04T06:05:25.1627704Z Waiting for hadoop cluster to come up. We have been trying for 92 seconds, retrying ...
> 2020-06-04T06:05:30.2583315Z Waiting for hadoop cluster to come up. We have been trying for 97 seconds, retrying ...
> 2020-06-04T06:05:35.3283678Z Waiting for hadoop cluster to come up. We have been trying for 102 seconds, retrying ...
> 2020-06-04T06:05:40.4184029Z Waiting for hadoop cluster to come up. We have been trying for 107 seconds, retrying ...
> 2020-06-04T06:05:45.5388372Z Waiting for hadoop cluster to come up. We have been trying for 112 seconds, retrying ...
> 2020-06-04T06:05:50.6155334Z Waiting for hadoop cluster to come up. We have been trying for 117 seconds, retrying ...
> 2020-06-04T06:05:55.7225186Z Command: start_hadoop_cluster failed. Retrying...
> 2020-06-04T06:05:55.7237999Z Starting Hadoop cluster
> 2020-06-04T06:05:56.5188293Z kdc is up-to-date
> 2020-06-04T06:05:56.5292716Z master is up-to-date
> 2020-06-04T06:05:56.5301735Z slave2 is up-to-date
> 2020-06-04T06:05:56.5306179Z slave1 is up-to-date
> 2020-06-04T06:05:56.6800566Z Waiting for hadoop cluster to come up. We have been trying for 0 seconds, retrying ...
> 2020-06-04T06:06:01.7668291Z Waiting for hadoop cluster to come up. We have been trying for 5 seconds, retrying ...
> 2020-06-04T06:06:06.8620265Z Waiting for hadoop cluster to come up. We have been trying for 10 seconds, retrying ...
> 2020-06-04T06:06:11.9753596Z Waiting for hadoop cluster to come up. We have been trying for 15 seconds, retrying ...
> 2020-06-04T06:06:17.0402846Z Waiting for hadoop cluster to come up. We have been trying for 21 seconds, retrying ...
> 2020-06-04T06:06:22.1650005Z Waiting for hadoop cluster to come up. We have been trying for 26 seconds, retrying ...
> 2020-06-04T06:06:27.2500179Z Waiting for hadoop cluster to come up. We have been trying for 31 seconds, retrying ...
> 2020-06-04T06:06:32.3133809Z Waiting for hadoop cluster to come up. We have been trying for 36 seconds, retrying ...
> 2020-06-04T06:06:37.4432923Z Waiting for hadoop cluster to come up. We have been trying for 41 seconds, retrying ...
> 2020-06-04T06:06:42.5658250Z Waiting for hadoop cluster to come up. We have been trying for 46 seconds, retrying ...
> 2020-06-04T06:06:47.6682536Z Waiting for hadoop cluster to come up. We have been trying for 51 seconds, retrying ...
> 2020-06-04T06:06:52.7810371Z Waiting for hadoop cluster to come up. We have been trying for 56 seconds, retrying ...
> 2020-06-04T06:06:57.8860269Z Waiting for hadoop cluster to come up. We have been trying for 61 seconds, retrying ...
> 2020-06-04T06:07:03.0337979Z Waiting for hadoop cluster to come up. We have been trying for 67 seconds, retrying ...
> 2020-06-04T06:07:08.1080310Z Waiting for hadoop cluster to come up. We have been trying for 72 seconds, retrying ...
> 2020-06-04T06:07:13.2297578Z Waiting for hadoop cluster to come up. We have been trying for 77 seconds, retrying ...
> 2020-06-04T06:07:18.3779034Z Waiting for hadoop cluster to come up. We have been trying for 82 seconds, retrying ...
> 2020-06-04T06:07:23.4789495Z Waiting for hadoop cluster to come up. We have been trying for 87 seconds, retrying ...
> 2020-06-04T06:07:28.6063062Z Waiting for hadoop cluster to come up. We have been trying for 92 seconds, retrying ...
> 2020-06-04T06:07:33.8220409Z Waiting for hadoop cluster to come up. We have been trying for 97 seconds, retrying ...
> 2020-06-04T06:07:38.9439231Z Waiting for hadoop cluster to come up. We have been trying for 102 seconds, retrying ...
> 2020-06-04T06:07:44.0193849Z Waiting for hadoop cluster to come up. We have been trying for 108 seconds, retrying ...
> 2020-06-04T06:07:49.1241642Z Waiting for hadoop cluster to come up. We have been trying for 113 seconds, retrying ...
> 2020-06-04T06:07:54.2425087Z Waiting for hadoop cluster to come up. We have been trying for 118 seconds, retrying ...
> 2020-06-04T06:07:59.3835321Z Command: start_hadoop_cluster failed. Retrying...
> 2020-06-04T06:07:59.3847275Z Starting Hadoop cluster
> 2020-06-04T06:08:00.1959109Z kdc is up-to-date
> 2020-06-04T06:08:00.1968717Z master is up-to-date
> 2020-06-04T06:08:00.1982811Z slave1 is up-to-date
> 2020-06-04T06:08:00.1988143Z slave2 is up-to-date
> 2020-06-04T06:08:00.4014781Z Waiting for hadoop cluster to come up. We have been trying for 0 seconds, retrying ...
> 2020-06-04T06:08:05.5168483Z Waiting for hadoop cluster to come up. We have been trying for 5 seconds, retrying ...
> 2020-06-04T06:08:10.6759355Z Waiting for hadoop cluster to come up. We have been trying for 10 seconds, retrying ...
> 2020-06-04T06:08:15.8307550Z Waiting for hadoop cluster to come up. We have been trying for 15 seconds, retrying ...
> 2020-06-04T06:08:21.0143341Z Waiting for hadoop cluster to come up. We have been trying for 21 seconds, retrying ...
> 2020-06-04T06:08:26.0932297Z Waiting for hadoop cluster to come up. We have been trying for 26 seconds, retrying ...
> 2020-06-04T06:08:31.2526775Z Waiting for hadoop cluster to come up. We have been trying for 31 seconds, retrying ...
> 2020-06-04T06:08:36.4356124Z Waiting for hadoop cluster to come up. We have been trying for 36 seconds, retrying ...
> 2020-06-04T06:08:41.5607530Z Waiting for hadoop cluster to come up. We have been trying for 41 seconds, retrying ...
> 2020-06-04T06:08:46.6407963Z Waiting for hadoop cluster to come up. We have been trying for 46 seconds, retrying ...
> 2020-06-04T06:08:51.8464789Z Waiting for hadoop cluster to come up. We have been trying for 51 seconds, retrying ...
> 2020-06-04T06:08:56.9735817Z Waiting for hadoop cluster to come up. We have been trying for 56 seconds, retrying ...
> 2020-06-04T06:09:02.1023842Z Waiting for hadoop cluster to come up. We have been trying for 62 seconds, retrying ...
> 2020-06-04T06:09:07.2390427Z Waiting for hadoop cluster to come up. We have been trying for 67 seconds, retrying ...
> 2020-06-04T06:09:12.4433329Z Waiting for hadoop cluster to come up. We have been trying for 72 seconds, retrying ...
> 2020-06-04T06:09:17.5390800Z Waiting for hadoop cluster to come up. We have been trying for 77 seconds, retrying ...
> 2020-06-04T06:09:22.7020537Z Waiting for hadoop cluster to come up. We have been trying for 82 seconds, retrying ...
> 2020-06-04T06:09:27.8754909Z Waiting for hadoop cluster to come up. We have been trying for 87 seconds, retrying ...
> 2020-06-04T06:09:33.0447274Z Waiting for hadoop cluster to come up. We have been trying for 93 seconds, retrying ...
> 2020-06-04T06:09:38.1804596Z Waiting for hadoop cluster to come up. We have been trying for 98 seconds, retrying ...
> 2020-06-04T06:09:43.3636590Z Waiting for hadoop cluster to come up. We have been trying for 103 seconds, retrying ...
> 2020-06-04T06:09:48.4975410Z Waiting for hadoop cluster to come up. We have been trying for 108 seconds, retrying ...
> 2020-06-04T06:09:53.6117328Z Waiting for hadoop cluster to come up. We have been trying for 113 seconds, retrying ...
> 2020-06-04T06:09:58.7785946Z Waiting for hadoop cluster to come up. We have been trying for 118 seconds, retrying ...
> 2020-06-04T06:10:03.9748663Z Command: start_hadoop_cluster failed. Retrying...
> 2020-06-04T06:10:03.9808244Z Command: start_hadoop_cluster failed 3 times.
> 2020-06-04T06:10:03.9823071Z ERROR: Could not start hadoop cluster. Aborting...
> {code}
> Suspicious messages that appear frequently in the logs:
> {code}
> 2020-06-04T06:10:04.5032658Z 20/06/04 06:05:42 WARN ipc.Client: Failed to connect to server: master.docker-hadoop-cluster-network/172.19.0.3:9000: try once and fail.
> 2020-06-04T06:10:04.5033211Z java.net.ConnectException: Connection refused
> ...
> 2020-06-04T06:10:04.6867876Z 20/06/04 06:04:11 ERROR namenode.NameNode: Failed to start namenode.
> 2020-06-04T06:10:04.6868640Z java.net.BindException: Port in use: 0.0.0.0:50470
> 2020-06-04T06:10:04.6869062Z 	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:998)
> 2020-06-04T06:10:04.6869702Z 	at org.apache.hadoop.http.HttpServer2.start(HttpServer2.java:935)
> 2020-06-04T06:10:04.6870199Z 	at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:171)
> 2020-06-04T06:10:04.6870740Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:842)
> 2020-06-04T06:10:04.6871235Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:693)
> 2020-06-04T06:10:04.6871728Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:906)
> 2020-06-04T06:10:04.6872202Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:885)
> 2020-06-04T06:10:04.6872699Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1626)
> 2020-06-04T06:10:04.6873701Z 	at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1694)
> 2020-06-04T06:10:04.6874100Z Caused by: java.net.BindException: Address already in use
> 2020-06-04T06:10:04.6901805Z 	at sun.nio.ch.Net.bind0(Native Method)
> 2020-06-04T06:10:04.6902168Z 	at sun.nio.ch.Net.bind(Net.java:433)
> 2020-06-04T06:10:04.6902478Z 	at sun.nio.ch.Net.bind(Net.java:425)
> 2020-06-04T06:10:04.6902847Z 	at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> 2020-06-04T06:10:04.6903296Z 	at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
> 2020-06-04T06:10:04.6903744Z 	at org.mortbay.jetty.nio.SelectChannelConnector.open(SelectChannelConnector.java:216)
> 2020-06-04T06:10:04.6904395Z 	at org.apache.hadoop.http.HttpServer2.openListeners(HttpServer2.java:993)
> 2020-06-04T06:10:04.6904727Z 	... 8 more
> 2020-06-04T06:10:04.6905005Z 20/06/04 06:04:11 INFO util.ExitUtil: Exiting with status 1
> 2020-06-04T06:10:04.6905401Z 20/06/04 06:04:11 INFO namenode.NameNode: SHUTDOWN_MSG: 
> {code}
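
The BindException in the log above says that something already held 0.0.0.0:50470 when the NameNode tried to start its HTTP server. The following is a minimal sketch, not part of the Flink test scripts, of how one might check whether a port can still be bound before starting the cluster. It assumes Python 3 is available where the check runs, and `port_is_free` is a hypothetical helper name. The check is conservative: it does not set SO_REUSEADDR, so it may report a port as busy in cases where a server that does set it could still bind.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            # bind() raises OSError (EADDRINUSE) when another socket
            # already holds the address, which mirrors the
            # "Address already in use" cause in the stack trace.
            sock.bind((host, port))
        except OSError:
            return False
        return True


if __name__ == "__main__":
    # 50470 is the port named in the NameNode log above.
    port = 50470
    state = "free" if port_is_free(port) else "already in use"
    print(f"port {port} is {state}")
```

Running this inside the master container before the NameNode starts might help tell a leftover process from an earlier attempt apart from a conflict with another service, though the result is only a snapshot and the port can be taken between the check and the actual start.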


