Posted to dev@flink.apache.org by "Gyula Fora (Jira)" <ji...@apache.org> on 2022/02/14 16:32:00 UTC

[jira] [Created] (FLINK-26140) Add basic handling mechanism to deal with job upgrade errors

Gyula Fora created FLINK-26140:
----------------------------------

             Summary: Add basic handling mechanism to deal with job upgrade errors
                 Key: FLINK-26140
                 URL: https://issues.apache.org/jira/browse/FLINK-26140
             Project: Flink
          Issue Type: Sub-task
          Components: Deployment / Kubernetes
            Reporter: Gyula Fora


A stateful job upgrade can fail in several ways, for example:
- Failure/timeout during savepoint
- Incompatible state
- Corrupted / not-found checkpoint
- Error after restart

We should let the user declare a strategy for handling each error scenario (for example, rolling back to an earlier state) and specify which errors should be treated as fatal.
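
As a rough illustration of the idea only, here is a minimal Python sketch of a user-declared mapping from failure scenario to handling strategy. All names here (UpgradeFailure, Action, resolve_action, user_policy) are hypothetical and are not part of Flink or any existing API. The sketch also assumes that a scenario the user has not configured is treated as fatal.

```python
from enum import Enum


class UpgradeFailure(Enum):
    """Failure scenarios from the list above (hypothetical names)."""
    SAVEPOINT_FAILED = "savepoint-failed"              # failure/timeout during savepoint
    INCOMPATIBLE_STATE = "incompatible-state"
    CHECKPOINT_UNAVAILABLE = "checkpoint-unavailable"  # corrupted or not found
    ERROR_AFTER_RESTART = "error-after-restart"


class Action(Enum):
    """Handling strategies named in the issue (hypothetical names)."""
    ROLLBACK = "rollback"  # roll back to an earlier state
    FATAL = "fatal"        # treat as a fatal error


def resolve_action(policy, failure, default=Action.FATAL):
    """Return the user-declared action for a failure.

    Assumption: scenarios missing from the policy fall back to `default`.
    """
    return policy.get(failure, default)


# Example user declaration: roll back on savepoint problems and on
# errors after restart; everything else stays fatal by default.
user_policy = {
    UpgradeFailure.SAVEPOINT_FAILED: Action.ROLLBACK,
    UpgradeFailure.ERROR_AFTER_RESTART: Action.ROLLBACK,
}
```

With this declaration, resolve_action(user_policy, UpgradeFailure.INCOMPATIBLE_STATE) falls through to the fatal default, while a savepoint failure resolves to a rollback.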


