Posted to issues@flink.apache.org by "Yangze Guo (Jira)" <ji...@apache.org> on 2023/09/14 02:27:00 UTC
[jira] [Assigned] (FLINK-33053) Watcher leak in Zookeeper HA mode
[ https://issues.apache.org/jira/browse/FLINK-33053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Yangze Guo reassigned FLINK-33053:
----------------------------------
Assignee: Yangze Guo
> Watcher leak in Zookeeper HA mode
> ---------------------------------
>
> Key: FLINK-33053
> URL: https://issues.apache.org/jira/browse/FLINK-33053
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Coordination
> Affects Versions: 1.17.0, 1.18.0, 1.17.1
> Reporter: Yangze Guo
> Assignee: Yangze Guo
> Priority: Blocker
> Attachments: 26.dump.zip, 26.log, taskmanager_flink-native-test-117-taskmanager-1-9_thread_dump (1).json
>
>
> We observe a watcher leak in our OLAP stress test when Zookeeper HA mode is enabled. The TaskManager's watches on the JobMaster leader are not removed after the job finishes.
> Here is how we reproduce this issue:
> - Start a session cluster and enable Zookeeper HA mode.
> - Continuously and concurrently submit short queries, e.g. WordCount, to the cluster.
> - Run echo -n wchp | nc {zk host} {zk port} to get the current watches.
> We can see a lot of watches on /flink/{cluster_name}/leader/{job_id}/connection_info.
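
The last reproduction step can be automated to track the watch count over time. The following is a minimal sketch, not part of the original report. It assumes the wchp reply lists each znode path on its own line, followed by one indented session id per watching session, and that the server permits the wchp four-letter command. The function names are hypothetical.

```python
import socket
from collections import Counter


def send_four_letter_word(host: str, port: int, command: str = "wchp",
                          timeout: float = 10.0) -> str:
    """Send a ZooKeeper four-letter command and return the raw text reply.

    Equivalent to: echo -n wchp | nc {zk host} {zk port}
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(command.encode("ascii"))
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # the server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def count_watches_by_path(wchp_output: str,
                          suffix: str = "/connection_info") -> Counter:
    """Count session entries listed under each path ending in `suffix`.

    Assumption: a path line starts in column 0 and the session ids that
    watch it follow on indented lines.
    """
    counts = Counter()
    current_path = None
    for line in wchp_output.splitlines():
        if not line.strip():
            continue
        if line[0] in " \t":
            # Indented line: one watching session for the current path.
            if current_path is not None and current_path.endswith(suffix):
                counts[current_path] += 1
        else:
            current_path = line.strip()
    return counts
```

Calling count_watches_by_path(send_four_letter_word(host, port)) periodically during the stress test and summing the counts gives a single number to monitor. If that total keeps growing after the submitted jobs have finished, it is consistent with the leak described above.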
--
This message was sent by Atlassian Jira
(v8.20.10#820010)