Posted to issues@flink.apache.org by "Gyula Fora (Jira)" <ji...@apache.org> on 2022/05/11 08:29:00 UTC

[jira] [Commented] (FLINK-27572) Verify HA Metadata present before performing last-state restore

    [ https://issues.apache.org/jira/browse/FLINK-27572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17534762#comment-17534762 ] 

Gyula Fora commented on FLINK-27572:
------------------------------------

cc [~wangyang0918] [~matyas] 

> Verify HA Metadata present before performing last-state restore
> ---------------------------------------------------------------
>
>                 Key: FLINK-27572
>                 URL: https://issues.apache.org/jira/browse/FLINK-27572
>             Project: Flink
>          Issue Type: Bug
>          Components: Kubernetes Operator
>            Reporter: Gyula Fora
>            Priority: Blocker
>             Fix For: kubernetes-operator-1.0.0
>
>
> When we restore a job using the last-state logic, we need to verify that the HA metadata has not been deleted. If it is missing, we should simply throw an error, because this situation requires manual user intervention.
> This only applies when the FlinkDeployment is not already in a suspended state with recorded last-state information.
> The problem can be reproduced easily in 1.14 by triggering a fatal job error (for example, turn off the restart strategy and kill a TM). In these cases the HA metadata will be removed, and the next last-state upgrade should throw an error instead of restoring from a completely empty state.
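
For illustration only, the check described above could look roughly like the sketch below. It is not the operator's actual code: the class, the `Deployment` view, the `HaMetadataStore` interface and the exception type are all hypothetical names introduced here, and the real lookup of HA metadata (for example in Kubernetes ConfigMaps) is hidden behind the assumed interface.

```java
// Minimal sketch, assuming hypothetical types; not the operator's real API.
final class LastStateRestoreCheck {

    /** Hypothetical, simplified view of a FlinkDeployment's status. */
    static final class Deployment {
        final String clusterId;
        final boolean suspended;
        final boolean lastStateRecorded;

        Deployment(String clusterId, boolean suspended, boolean lastStateRecorded) {
            this.clusterId = clusterId;
            this.suspended = suspended;
            this.lastStateRecorded = lastStateRecorded;
        }
    }

    /** Hypothetical abstraction over wherever the HA metadata is stored. */
    interface HaMetadataStore {
        boolean hasMetadata(String clusterId);
    }

    /** Signals that manual user intervention is required. */
    static final class MissingHaMetadataException extends RuntimeException {
        MissingHaMetadataException(String message) {
            super(message);
        }
    }

    /**
     * Throws if a last-state restore would start from an empty state.
     * The check is skipped when the deployment is already suspended
     * and its last-state information has been recorded.
     */
    static void validateLastStateRestore(Deployment deployment, HaMetadataStore store) {
        if (deployment.suspended && deployment.lastStateRecorded) {
            return; // last state is already known, nothing to verify
        }
        if (!store.hasMetadata(deployment.clusterId)) {
            throw new MissingHaMetadataException(
                "HA metadata not found for cluster '" + deployment.clusterId
                    + "'; last-state restore requires manual intervention");
        }
    }

    private LastStateRestoreCheck() {}
}
```

A caller would run `validateLastStateRestore` before starting the upgrade and surface the exception to the user rather than silently redeploying the job with empty state.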



--
This message was sent by Atlassian Jira
(v8.20.7#820007)