You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/04/01 06:41:20 UTC

[GitHub] [flink] XComp commented on pull request #19275: [FLINK-24491][runtime] Make the job termination wait until the archiving of ExecutionGraphInfo finishes

XComp commented on pull request #19275:
URL: https://github.com/apache/flink/pull/19275#issuecomment-1085493232

That's a good point. Thanks for the detailed explanation. I was thinking about it: Essentially, the question is whether we consider the `HistoryServer` archiving being part of the job (which means that we want to finish it before the cleanup phase starts and the `JobManagerRunner` is removed) or we want it to run concurrently to the cleanup logic. For the latter case, I even thought of integrating it into the cleanup phase by making it implement the `GloballyCleanableResource` interface. But I'm not a fan of it because semantically, cleaning up and archiving are two different things (it would have the benefit of getting retries out-of-the-box in case of failure, though). This PR proposes a hybrid approach (i.e. triggering the archiving before the cleanup phase but letting it run concurrently to it) which contributes to the code becoming more complex. Hence, I'd propose going for one of the options I described depending on the usecase we want to cover.

I still tend to lean towards the first approach (waiting for the archiving before triggering the cleanup) because that's where the job is considered finished and the user should expect a result in the `HistoryServer`.

That said, keep in mind that if we go for the latter option (i.e. archiving concurrently to the cleanup), it doesn't guarantee that the cluster waits for it to finish. We have a ticket for that: FLINK-26772 The reason is that the `Dispatcher.shutdownCluster` methods isn't waiting for the `Dispatcher.jobTerminationFutures` to complete.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org