Posted to issues@flink.apache.org by "Gyula Fora (Jira)" <ji...@apache.org> on 2022/05/27 13:05:00 UTC

[jira] [Closed] (FLINK-27804) Do not observe cluster/job mid upgrade

     [ https://issues.apache.org/jira/browse/FLINK-27804?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gyula Fora closed FLINK-27804.
------------------------------
    Resolution: Fixed

Merged

main: aa4f1d64d223d2dfa434edcd4c2ae8a9b54d0fdf
release-1.0: fbaad0f48cb5bf3a4ca2b685846dab9c072083e0

> Do not observe cluster/job mid upgrade
> --------------------------------------
>
>                 Key: FLINK-27804
>                 URL: https://issues.apache.org/jira/browse/FLINK-27804
>             Project: Flink
>          Issue Type: Improvement
>          Components: Kubernetes Operator
>    Affects Versions: kubernetes-operator-1.0.0
>            Reporter: Gyula Fora
>            Assignee: Gyula Fora
>            Priority: Critical
>              Labels: pull-request-available
>
> In some corner cases, when we observe the FINISHED job (stopped with savepoint) during an upgrade, the recorded last snapshot is incorrect (we still need to investigate whether this is due to a Flink problem or something else). This can lead to upgrade errors.
> This can be avoided by skipping the observe step when the reconciliation status is UPGRADING, because at that point we know that the job was already shut down and its state was recorded correctly in the savepoint info.
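
A minimal sketch of the guard described above, for illustration only. It is not the merged change: ReconciliationState and ObserverSketch are hypothetical stand-ins, not the operator's actual classes. It only shows the idea of returning early from the observe step while the reconciliation status is UPGRADING.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the reconciliation status; names are assumptions.
enum ReconciliationState { DEPLOYED, UPGRADING }

// Hypothetical observer; the real operator's observer API may differ.
final class ObserverSketch {
    private final List<String> observed = new ArrayList<>();

    /** Returns true if the resource was observed, false if the step was skipped. */
    boolean observe(String resourceName, ReconciliationState state) {
        if (state == ReconciliationState.UPGRADING) {
            // Mid upgrade: the job was already shut down and the savepoint info
            // recorded, so do not observe (and risk overwriting the last snapshot).
            return false;
        }
        // Placeholder for querying the cluster/job and updating the status.
        observed.add(resourceName);
        return true;
    }

    List<String> observedResources() {
        return observed;
    }
}
```

With this guard, a resource in the UPGRADING state is left untouched by the observer, and observation resumes once the status moves on.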



--
This message was sent by Atlassian Jira
(v8.20.7#820007)