Posted to issues@flink.apache.org by "Rui Fan (Jira)" <ji...@apache.org> on 2022/11/24 06:56:00 UTC

[jira] [Commented] (FLINK-30184) Save TM/JM thread stack periodically

    [ https://issues.apache.org/jira/browse/FLINK-30184?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17638128#comment-17638128 ] 

Rui Fan commented on FLINK-30184:
---------------------------------

Hi [~xtsong], could you please take a look when you have time? If it makes sense, please assign it to me. Thanks!

> Save TM/JM thread stack periodically
> ------------------------------------
>
>                 Key: FLINK-30184
>                 URL: https://issues.apache.org/jira/browse/FLINK-30184
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Web Frontend
>            Reporter: Rui Fan
>            Priority: Major
>             Fix For: 1.17.0
>
>
> After FLINK-14816, FLINK-25398, and FLINK-25372, Flink users can view the thread stacks of the TM/JM in the Flink Web UI.
> This helps users find out why a Flink job is stuck or why processing is slow, which is very useful for troubleshooting.
> However, sometimes Flink tasks get stuck or process slowly, but by the time the user troubleshoots the problem, the job has already resumed. It is then difficult to find out what happened to the Flink job at that time and why it was slow.
>  
> So, could we periodically save the thread stack of the TM or JM in the TM log directory?
> This would need some configuration options, for example:
> cluster.thread-dump.interval=1min
> cluster.thread-dump.cleanup-time=48 hours
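
To make the proposal above more concrete, here is a minimal sketch of a periodic thread dumper in plain Java. It is only an illustration under stated assumptions, not Flink code: the class PeriodicThreadDumper, its methods, and the "thread-dump-*.txt" file naming are hypothetical, and the interval and retention arguments merely stand in for the proposed cluster.thread-dump.interval and cluster.thread-dump.cleanup-time options, which do not exist in Flink as of this issue.

```java
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper for illustration only; not part of Flink. */
final class PeriodicThreadDumper {

    /** Writes the stacks of all live threads to a new file in logDir and returns that file. */
    public static Path dumpOnce(Path logDir) throws IOException {
        StringBuilder sb = new StringBuilder();
        // (false, false): skip lock details so the call works on every JVM.
        for (ThreadInfo info : ManagementFactory.getThreadMXBean().dumpAllThreads(false, false)) {
            sb.append('"').append(info.getThreadName()).append("\" ")
              .append(info.getThreadState()).append('\n');
            // Print frames manually; ThreadInfo.toString() truncates deep stacks.
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\tat ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        // Assumed naming scheme: thread-dump-<epoch millis>.txt
        Path file = logDir.resolve("thread-dump-" + System.currentTimeMillis() + ".txt");
        return Files.write(file, sb.toString().getBytes(StandardCharsets.UTF_8));
    }

    /** Deletes dump files older than the retention period; returns how many were deleted. */
    public static int cleanup(Path logDir, Duration retention) throws IOException {
        Instant cutoff = Instant.now().minus(retention);
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(logDir, "thread-dump-*.txt")) {
            for (Path f : files) {
                if (Files.getLastModifiedTime(f).toInstant().isBefore(cutoff)) {
                    Files.delete(f);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    /** Starts a daemon thread that dumps and cleans up once per interval. */
    public static ScheduledExecutorService start(Path logDir, Duration interval, Duration retention) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "thread-dump-writer");
            t.setDaemon(true);
            return t;
        });
        executor.scheduleAtFixedRate(() -> {
            try {
                dumpOnce(logDir);
                cleanup(logDir, retention);
            } catch (Exception e) {
                // An uncaught exception would cancel all future runs, so log and continue.
                e.printStackTrace();
            }
        }, 0, interval.toMillis(), TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

With the values proposed above, this would be started as PeriodicThreadDumper.start(logDir, Duration.ofMinutes(1), Duration.ofHours(48)). Running the cleanup in the same scheduled task keeps the sketch to a single thread; a real implementation would probably also need to bound the total disk usage, since one dump per minute for 48 hours is 2880 files per process.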



--
This message was sent by Atlassian Jira
(v8.20.10#820010)