Posted to issues@flink.apache.org by "Zhanghao Chen (Jira)" <ji...@apache.org> on 2024/04/17 14:29:00 UTC
[jira] [Created] (FLINK-35145) Add timeout for cluster termination
Zhanghao Chen created FLINK-35145:
-------------------------------------
Summary: Add timeout for cluster termination
Key: FLINK-35145
URL: https://issues.apache.org/jira/browse/FLINK-35145
Project: Flink
Issue Type: Improvement
Components: Runtime / Coordination
Affects Versions: 1.20.0
Reporter: Zhanghao Chen
Fix For: 1.20.0
Currently, cluster termination may block forever because it has no timeout. For example, in an Application cluster with ZK HA enabled, when the ZK cluster is down, the cluster reaches termination status, but the termination process then blocks while trying to clean up HA data on ZK. A similar phenomenon can be observed when an HDFS/S3 outage occurs.
I propose adding a timeout for the cluster termination process in the ClusterEntryPoint#shutDownAsync method.
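
As a rough illustration of the proposal, the sketch below bounds an asynchronous cleanup step with a timeout using only the JDK. It is not Flink code: the class TerminationTimeoutSketch, the helper withTimeout, and the never-completing future standing in for the blocked HA cleanup are all hypothetical names chosen for this example, and the actual change to shutDownAsync may look quite different.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical sketch, not part of Flink.
final class TerminationTimeoutSketch {

    // Returns a future that mirrors `cleanup`, but fails with a
    // TimeoutException if `cleanup` has not completed within the timeout.
    static <T> CompletableFuture<T> withTimeout(
            CompletableFuture<T> cleanup,
            long timeout,
            TimeUnit unit,
            ScheduledExecutorService timer) {
        CompletableFuture<T> result = new CompletableFuture<>();

        // Fail the result once the timeout elapses (no-op if already completed).
        ScheduledFuture<?> timeoutTask = timer.schedule(
                () -> result.completeExceptionally(
                        new TimeoutException("Cluster termination timed out")),
                timeout,
                unit);

        // Forward the real outcome and cancel the pending timeout task.
        cleanup.whenComplete((value, error) -> {
            timeoutTask.cancel(false);
            if (error != null) {
                result.completeExceptionally(error);
            } else {
                result.complete(value);
            }
        });
        return result;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        try {
            // Stand-in for an HA cleanup that never returns (e.g. ZK is down).
            CompletableFuture<Void> stuckCleanup = new CompletableFuture<>();
            CompletableFuture<Void> termination =
                    withTimeout(stuckCleanup, 100, TimeUnit.MILLISECONDS, timer);
            try {
                termination.get();
                System.out.println("Termination completed");
            } catch (ExecutionException e) {
                System.out.println("Termination failed: " + e.getCause());
            }
        } finally {
            timer.shutdownNow();
        }
    }
}
```

Note that in this sketch the timeout only stops the caller from waiting; the underlying cleanup is not cancelled and may leave HA data behind, so a real implementation would need to decide how to report or retry that.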