Posted to issues@flink.apache.org by "Yanfei Lei (Jira)" <ji...@apache.org> on 2023/04/23 04:50:00 UTC

[jira] [Commented] (FLINK-19010) Add a system metric to show the checkpoint restore time

    [ https://issues.apache.org/jira/browse/FLINK-19010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715362#comment-17715362 ] 

Yanfei Lei commented on FLINK-19010:
------------------------------------

[~zncheng] Do you mean the total restore time of the whole job? I'd be glad to add this. Could you please assign it to me?

> Add a system metric to show the checkpoint restore time
> -------------------------------------------------------
>
>                 Key: FLINK-19010
>                 URL: https://issues.apache.org/jira/browse/FLINK-19010
>             Project: Flink
>          Issue Type: Improvement
>          Components: Runtime / Checkpointing, Runtime / Metrics
>            Reporter: Zhinan Cheng
>            Priority: Minor
>
> Currently, the system metrics only show the downtime when a failure happens. It would be interesting to also see the time taken to restore from the checkpoint, so that users can better understand the bottlenecks in failure recovery.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)