Posted to issues@flink.apache.org by "TisonKun (Jira)" <ji...@apache.org> on 2019/09/18 04:57:00 UTC

[jira] [Comment Edited] (FLINK-14112) Removing zookeeper state should cause the task manager and job managers to restart

    [ https://issues.apache.org/jira/browse/FLINK-14112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16932058#comment-16932058 ] 

TisonKun edited comment on FLINK-14112 at 9/18/19 4:56 AM:
-----------------------------------------------------------

Hi [~aaronlevin], thanks for creating this JIRA. Generally, I think Flink owns its znodes, so the precondition here, "delete all the znodes within {{/flink}}", should not happen in the first place.

However, I can see your concern. Could you share the "massive amount of logging" so we can see what could be improved on the logging side?


was (Author: tison):
Hi [~aaronlevin], thanks for creating this JIRA. Generally, I think Flink owns its znodes, so the precondition here, "delete all the znodes within {{/flink}}", should not happen in the first place.

However, I can see your concern. Could you share the "massive amount of logging" so we can see what could be improved on the logging side? Besides, I agree that it would be good for the JM and TM to crash if ZK is in an uncertain state.

> Removing zookeeper state should cause the task manager and job managers to restart
> ----------------------------------------------------------------------------------
>
>                 Key: FLINK-14112
>                 URL: https://issues.apache.org/jira/browse/FLINK-14112
>             Project: Flink
>          Issue Type: Wish
>          Components: Runtime / Coordination
>    Affects Versions: 1.8.1, 1.9.0
>            Reporter: Aaron Levin
>            Priority: Minor
>
> Suppose you have a Flink application running on a cluster with the following configuration:
> {noformat}
> high-availability.zookeeper.path.root: /flink
> {noformat}
> Now suppose you delete all the znodes within {{/flink}}. I experienced the following:
>  * massive amount of logging
>  * application did not restart
>  * task manager did not crash or restart
>  * job manager did not crash or restart
> From this state, I had to restart all the task managers and all the job managers for the Flink application to recover.
> It would be desirable for the Task Managers and Job Managers to crash if the znode is not available (though perhaps you all have thought about this more deeply than I!)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)