Posted to issues@flink.apache.org by "harold.miao (Jira)" <ji...@apache.org> on 2020/08/21 03:12:00 UTC

[jira] [Commented] (FLINK-19010) Add a system metric to show the checkpoint restore time

    [ https://issues.apache.org/jira/browse/FLINK-19010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17181557#comment-17181557 ] 

harold.miao commented on FLINK-19010:
-------------------------------------

Yes, this metric is clearly valuable. We develop a platform that serves many users, and they often ask me when a job failed and when it was restored from a checkpoint.

> Add a system metric to show the checkpoint restore time
> -------------------------------------------------------
>
>                 Key: FLINK-19010
>                 URL: https://issues.apache.org/jira/browse/FLINK-19010
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Metrics
>    Affects Versions: 1.11.1
>            Reporter: Zhinan Cheng
>            Priority: Trivial
>
> Currently, the system metrics only show the downtime when a failure happens. It would be useful to also see the time taken to restore from the checkpoint, so users can better understand the bottleneck of failure recovery.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)