Posted to user@flink.apache.org by John O <so...@samsung.com> on 2018/08/29 18:37:21 UTC

checkpoint timeout

I have a Flink job whose state is large enough that checkpointing takes a long time (~70 seconds).
I have configured the checkpoint timeout to 180 seconds (setCheckpointTimeout(180000)).
But as you can see from the following logs, the timeout seems to be ~60 seconds.

Is there another timeout configuration I need to set?

2018-08-29 11:41:03,883 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Triggering checkpoint 61 @ 1535542863734 for job aae565f4f6efba50d5252fc1afd7c255.
2018-08-29 11:42:03,883 INFO  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Checkpoint 61 of job aae565f4f6efba50d5252fc1afd7c255 expired before completing.
2018-08-29 11:42:13,955 WARN  org.apache.flink.runtime.checkpoint.CheckpointCoordinator     - Received late message for now expired checkpoint attempt 61 from f022ce60f0f5da2a34290574305813d8 of job aae565f4f6efba50d5252fc1afd7c255.
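
As a quick sanity check on the log excerpt above, the gap between the trigger and expiry lines can be computed directly from the timestamps. The sketch below uses only java.time; the class and method names (CheckpointLogTiming, elapsedMillis) are made up for illustration and are not part of Flink. It only does the arithmetic on the three quoted timestamps and does not explain why the checkpoint expired early.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for illustration only; not a Flink class.
final class CheckpointLogTiming {
    // Timestamp layout used in the quoted log lines, e.g. "2018-08-29 11:41:03,883".
    private static final DateTimeFormatter LOG_TIME =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    /** Milliseconds elapsed between two log timestamps. */
    static long elapsedMillis(String from, String to) {
        LocalDateTime start = LocalDateTime.parse(from, LOG_TIME);
        LocalDateTime end = LocalDateTime.parse(to, LOG_TIME);
        return Duration.between(start, end).toMillis();
    }

    public static void main(String[] args) {
        String triggered = "2018-08-29 11:41:03,883"; // "Triggering checkpoint 61"
        String expired = "2018-08-29 11:42:03,883";   // "expired before completing"
        String lateAck = "2018-08-29 11:42:13,955";   // "Received late message"
        long configuredTimeoutMs = 180_000L;          // value passed to setCheckpointTimeout

        long untilExpiry = elapsedMillis(triggered, expired);
        long untilLateAck = elapsedMillis(triggered, lateAck);

        System.out.println("trigger -> expired:  " + untilExpiry + " ms");
        System.out.println("trigger -> late ack: " + untilLateAck + " ms");
        // Would the late acknowledgement have arrived within the configured timeout?
        System.out.println("late ack within configured timeout: "
                + (untilLateAck <= configuredTimeoutMs));
    }
}
```

The expiry lands exactly 60000 ms after the trigger, and the late acknowledgement arrives 70072 ms after it, which matches the ~70 second checkpoint duration and is well inside the configured 180000 ms.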


Jo

Re: checkpoint timeout

Posted by Till Rohrmann <tr...@apache.org>.
Hi John,

Which version of Flink are you using? I just tried it out with the current
snapshot version and I could configure the checkpoint timeout via

CheckpointConfig checkpointConfig = env.getCheckpointConfig();
checkpointConfig.setCheckpointTimeout(1337L);

Could you provide us with the logs and the application code you are running?

Cheers,
Till

On Thu, Aug 30, 2018 at 4:23 AM vino yang <ya...@gmail.com> wrote:

> Hi John,
>
> This API is the right way to set the checkpoint timeout. The default
> timeout for checkpoints is 10 minutes [1], not one minute. So, I think it
> must be something else.
> You can set the log level of the JM and TM to DEBUG to see more
> checkpoint details. If that is not enough to analyze the problem, you can
> share your logs on this mailing list.
>
> [1]:
> https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/environment/CheckpointConfig.html
>
> Thanks, vino.
>

Re: checkpoint timeout

Posted by vino yang <ya...@gmail.com>.
Hi John,

This API is the right way to set the checkpoint timeout. The default timeout
for checkpoints is 10 minutes [1], not one minute. So, I think it must be
something else.
You can set the log level of the JM and TM to DEBUG to see more
checkpoint details. If that is not enough to analyze the problem, you can
share your logs on this mailing list.

[1]:
https://ci.apache.org/projects/flink/flink-docs-master/api/java/org/apache/flink/streaming/api/environment/CheckpointConfig.html

Thanks, vino.
