Posted to issues@flink.apache.org by "ChuanHaiTan (JIRA)" <ji...@apache.org> on 2018/11/05 08:13:00 UTC
[jira] [Created] (FLINK-10775) Quarantined address
[akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not
been restarted. Keeping it quarantined.
ChuanHaiTan created FLINK-10775:
-----------------------------------
Summary: Quarantined address [akka.tcp://flink@flink-jobmanager:6123] is still unreachable or has not been restarted. Keeping it quarantined.
Key: FLINK-10775
URL: https://issues.apache.org/jira/browse/FLINK-10775
Project: Flink
Issue Type: Bug
Components: ResourceManager
Affects Versions: 1.4.2
Environment: k8s+docker
standalone (1 jobmanager + 5 taskmanagers)
taskmanager.slotnum=4
Reporter: ChuanHaiTan
Attachments: logs-from-flink-jobmanager-in-flink-jobmanager-65c8d85f4f-5fm2d.txt, logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-7lw82.txt, logs-from-flink-taskmanager-in-flink-taskmanager-758575577d-qbj9g.txt, 微信图片_20181031171312.png, 微信图片_20181031171316.png
In a k8s+docker environment, 1 jobmanager container and 5 taskmanager containers run as a standalone cluster.
{color:#FF0000}For some reason the jobmanager was rebooted, and two of the five taskmanagers were also rebooted. Two of the remaining three taskmanagers then failed to reconnect to the jobmanager, so jobs fail with insufficient slot resource errors.{color}
The attachments are the jobmanager log, the logs of the two disconnected taskmanagers, and screenshots of all available and unavailable taskmanagers in Flink at the time.
It is strange that the two rebooted taskmanagers can connect to the jobmanager, while only one of the three taskmanagers that were not rebooted can connect.
Why? Can the cause of the restart be determined from the logs? Thank you.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)