Posted to dev@flink.apache.org by "zlzhang0122 (Jira)" <ji...@apache.org> on 2021/06/30 05:19:00 UTC

[jira] [Created] (FLINK-23189) Count and fail the task when the disk is error on JobManager

zlzhang0122 created FLINK-23189:
-----------------------------------

             Summary: Count and fail the task when the disk is error on JobManager
                 Key: FLINK-23189
                 URL: https://issues.apache.org/jira/browse/FLINK-23189
             Project: Flink
          Issue Type: Improvement
    Affects Versions: 1.13.1, 1.12.2
            Reporter: zlzhang0122


When the JobManager disk fails, triggerCheckpoint throws an IOException and fails. This results in a TRIGGER_CHECKPOINT_FAILURE, but that failure does not cause the job to fail. Users can hardly notice the error unless they check the JobManager logs. To avoid this, I propose that we identify these IOException cases and increment the failureCounter, so that repeated failures eventually fail the job.
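
A rough sketch of the idea, to make the proposal concrete. This is not Flink's actual code: the class TriggerFailureCounter, its methods, and the tolerableFailures threshold are hypothetical names used only for illustration. It assumes that trigger failures caused by an IOException anywhere in the cause chain are counted, that a successful checkpoint resets the count, and that the job should fail once the count exceeds the threshold.

```java
import java.io.IOException;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for illustration; not part of Flink. */
final class TriggerFailureCounter {
    private final int tolerableFailures;
    private final AtomicInteger continuousFailures = new AtomicInteger();

    TriggerFailureCounter(int tolerableFailures) {
        if (tolerableFailures < 0) {
            throw new IllegalArgumentException("tolerableFailures must be >= 0");
        }
        this.tolerableFailures = tolerableFailures;
    }

    /**
     * Records a failed checkpoint trigger.
     * Returns true when the job should be failed, i.e. the number of
     * IOException-caused failures since the last success exceeds the threshold.
     */
    boolean onTriggerFailure(Throwable cause) {
        if (!hasIOExceptionInChain(cause)) {
            // Other trigger failures are ignored in this sketch.
            return false;
        }
        return continuousFailures.incrementAndGet() > tolerableFailures;
    }

    /** A completed checkpoint resets the counter. */
    void onCheckpointSuccess() {
        continuousFailures.set(0);
    }

    int currentCount() {
        return continuousFailures.get();
    }

    /** Walks the cause chain looking for an IOException. */
    private static boolean hasIOExceptionInChain(Throwable t) {
        for (Throwable c = t; c != null; c = c.getCause()) {
            if (c instanceof IOException) {
                return true;
            }
        }
        return false;
    }
}
```

With a threshold of 2, the first two IOException-caused trigger failures would only be counted, and the third would signal that the job should fail, which makes the disk problem visible without reading the JobManager logs.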



--
This message was sent by Atlassian Jira
(v8.3.4#803005)