Posted to issues@flink.apache.org by "tao wang (Jira)" <ji...@apache.org> on 2020/08/03 03:11:00 UTC
[jira] [Comment Edited] (FLINK-18748) Savepoint would be queued
unexpected if pendingCheckpoints less than maxConcurrentCheckpoints
[ https://issues.apache.org/jira/browse/FLINK-18748?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17169597#comment-17169597 ]
tao wang edited comment on FLINK-18748 at 8/3/20, 3:10 AM:
-----------------------------------------------------------
Hi, [~klion26] [~roman_khachatryan]: I have created a pull request. Would you like to review it: [https://github.com/apache/flink/pull/13045]?
Thanks.
was (Author: wayland):
Hi, [~klion26] [~roman_khachatryan]: I have created a pull request. Would you like to review it?
Thanks.
> Savepoint would be queued unexpected if pendingCheckpoints less than maxConcurrentCheckpoints
> ---------------------------------------------------------------------------------------------
>
> Key: FLINK-18748
> URL: https://issues.apache.org/jira/browse/FLINK-18748
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Checkpointing
> Affects Versions: 1.11.0, 1.11.1
> Reporter: Congxian Qiu(klion26)
> Assignee: tao wang
> Priority: Major
> Labels: pull-request-available
>
> Inspired by a [user-zh email|http://apache-flink.147419.n8.nabble.com/flink-1-11-rest-api-saveppoint-td5497.html]
> After FLINK-17342, when triggering a checkpoint/savepoint, we check whether the request can be triggered in {{CheckpointRequestDecider#chooseRequestToExecute}}. The logic is as follows:
> {code:java}
> Preconditions.checkState(Thread.holdsLock(lock));
> // 1. Nothing to choose while a trigger is in progress or the queue is empty.
> if (isTriggering || queuedRequests.isEmpty()) {
>     return Optional.empty();
> }
> // 2. Too many ongoing checkpoints/savepoints: only a forced request may proceed.
> if (pendingCheckpointsSizeSupplier.get() >= maxConcurrentCheckpointAttempts) {
>     return Optional.of(queuedRequests.first())
>         .filter(CheckpointTriggerRequest::isForce)
>         .map(unused -> queuedRequests.pollFirst());
> }
> // 3. Check the timestamp of the last completed checkpoint.
> long nextTriggerDelayMillis = nextTriggerDelayMillis(lastCompletionMs);
> if (nextTriggerDelayMillis > 0) {
>     return onTooEarly(nextTriggerDelayMillis);
> }
> return Optional.of(queuedRequests.pollFirst());
> {code}
> However, if {{pendingCheckpointsSizeSupplier.get()}} < {{maxConcurrentCheckpointAttempts}} and the request is a savepoint, the savepoint still has to wait in step 3.
> I think we should trigger the savepoint immediately if {{pendingCheckpointsSizeSupplier.get()}} < {{maxConcurrentCheckpointAttempts}}.
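One possible shape of the proposed behavior is sketched below. It is a simplified, self-contained illustration and not the code from the pull request: the class {{DeciderSketch}}, its {{Request}} type, the {{periodic}} flag, the {{minPauseMillis}} parameter, and the plain FIFO queue are all assumptions made for the example. The idea is that the delay check in step 3 applies only to periodic checkpoints, so a savepoint is returned at once whenever the concurrency limit in step 2 has not been reached.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

/** Hypothetical, simplified stand-in for CheckpointRequestDecider; not Flink's actual class. */
final class DeciderSketch {

    /** Hypothetical request type; a savepoint is modeled as a non-periodic request. */
    static final class Request {
        final String name;
        final boolean periodic;
        final boolean force;

        Request(String name, boolean periodic, boolean force) {
            this.name = name;
            this.periodic = periodic;
            this.force = force;
        }
    }

    // Assumption: a FIFO queue is enough for the illustration.
    private final Deque<Request> queuedRequests = new ArrayDeque<>();
    private final int maxConcurrentCheckpointAttempts;
    private final long minPauseMillis; // assumed name for the pause enforced in step 3

    // Plain fields replace the suppliers and the lock used in the quoted code.
    boolean isTriggering;
    int pendingCheckpoints;
    long lastCompletionMs;

    DeciderSketch(int maxConcurrentCheckpointAttempts, long minPauseMillis) {
        this.maxConcurrentCheckpointAttempts = maxConcurrentCheckpointAttempts;
        this.minPauseMillis = minPauseMillis;
    }

    void enqueue(Request request) {
        queuedRequests.addLast(request);
    }

    Optional<Request> chooseRequestToExecute(long nowMs) {
        // 1. Nothing to choose while a trigger is in progress or the queue is empty.
        if (isTriggering || queuedRequests.isEmpty()) {
            return Optional.empty();
        }
        Request first = queuedRequests.peekFirst();
        // 2. At the concurrency limit, only a forced request may proceed.
        if (pendingCheckpoints >= maxConcurrentCheckpointAttempts) {
            return first.force ? Optional.of(queuedRequests.pollFirst()) : Optional.empty();
        }
        // 3. Only periodic checkpoints wait for the pause; savepoints skip this check.
        if (first.periodic) {
            long nextTriggerDelayMillis = lastCompletionMs + minPauseMillis - nowMs;
            if (nextTriggerDelayMillis > 0) {
                return Optional.empty();
            }
        }
        return Optional.of(queuedRequests.pollFirst());
    }
}
```

With a limit of 1, a pause of 1000 ms, and a checkpoint completed at 500 ms, a savepoint queued at 600 ms is returned immediately in this sketch, while a periodic checkpoint queued at the same time is held back until the pause has elapsed.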
--
This message was sent by Atlassian Jira
(v8.3.4#803005)