Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/04/01 08:22:01 UTC

[GitHub] [flink] Thesharing commented on pull request #19275: [FLINK-24491][runtime] Make the job termination wait until the archiving of ExecutionGraphInfo finishes

Thesharing commented on pull request #19275:
URL: https://github.com/apache/flink/pull/19275#issuecomment-1085589841


   Thank you for the detailed analysis, @XComp. I agree that cleaning up and archiving are two different things. Integrating the archiving into the resource cleanup may confuse users.
   
   I know that if the resource cleanup is not completed, the JobResultEntry will not be marked as clean, and a retry of the resource cleanup will be triggered. Would it try to archive the ExecutionGraph again before it triggers another resource cleanup? If so, I think waiting for the archiving before triggering the cleanup may be better. However, I'm worried about the worst case: the archiving fails over and over again (due to a busy disk or a slow network), which would make the job retry many times. If we run the archiving and the cleanup concurrently, a failure of the archiving won't make the job retry over and over again. We would just lose the archive, which is the same as what happens today.
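   
   To make the concurrent option concrete, here is a minimal sketch using plain `CompletableFuture`s. It is not the actual Dispatcher code: the class `TerminationSketch`, the method `terminate`, and the parameters `archiveFuture` and `cleanupFuture` are hypothetical names I made up for illustration. The idea is that job termination waits for both steps, but only a cleanup failure is propagated.
   
   ```java
   import java.util.concurrent.CompletableFuture;
   
   // Hypothetical sketch, not Flink's implementation.
   public class TerminationSketch {
   
       // Returns a future that completes once both archiving and cleanup are done.
       // An archiving failure is swallowed (the archive is simply lost);
       // a cleanup failure still fails the returned future.
       static CompletableFuture<Void> terminate(
               CompletableFuture<Void> archiveFuture,
               CompletableFuture<Void> cleanupFuture) {
           // Map an exceptional archiving result to a normal (null) completion.
           CompletableFuture<Void> tolerantArchive =
                   archiveFuture.exceptionally(throwable -> null);
           return CompletableFuture.allOf(tolerantArchive, cleanupFuture);
       }
   }
   ```
   
   With this shape, a failed archiving attempt would not by itself cause another retry, while termination would still wait until the archiving has finished one way or the other.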
   
   As for FLINK-26772, I'm wondering whether we could make sure that `shutdownFuture` is not completed until all of the `jobTerminationFutures` are completed.
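   
   Roughly, I am thinking of something like the sketch below. It only illustrates the ordering with plain `CompletableFuture`s; the class `ShutdownSketch` and the method `shutdownAfter` are hypothetical, and I am assuming the termination futures are available as a collection of `CompletableFuture<Void>`.
   
   ```java
   import java.util.Collection;
   import java.util.concurrent.CompletableFuture;
   
   // Hypothetical sketch, not Flink's implementation.
   public class ShutdownSketch {
   
       // Returns a future that completes only after every job termination
       // future has completed. If any of them failed, the returned future
       // completes exceptionally, but still only after all have finished.
       static CompletableFuture<Void> shutdownAfter(
               Collection<CompletableFuture<Void>> jobTerminationFutures) {
           return CompletableFuture.allOf(
                   jobTerminationFutures.toArray(new CompletableFuture<?>[0]));
       }
   }
   ```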

