You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by "Yan Zhou [FDS Science]" <yz...@coupang.com> on 2018/08/08 01:06:34 UTC
checkpoint recovery behavior when kafka source is set to start from
timestamp
Hi Experts,
In my application, the kafka source is set to start from a specified timestamp, by calling method FlinkKafkaConsumer010#setStartFromTimestamp(long startupOffsetsTimestamp).
If the application have run a while and then recover from a checkpoint because of failure, what's the offset will the kafka source to read from? I suppose it will read from the offset that has been committed before the failure. Is it right?
I am going to verify it, however some clarification is good in case my test result doesn't meet my assumption.
Best
Yan
Re: checkpoint recovery behavior when kafka source is set to start
from timestamp
Posted by "Yan Zhou [FDS Science]" <yz...@coupang.com>.
Thank you Vino. It is very helpful.
________________________________
From: vino yang <ya...@gmail.com>
Sent: Tuesday, August 7, 2018 7:22:50 PM
To: Yan Zhou [FDS Science]
Cc: user
Subject: Re: checkpoint recovery behavior when kafka source is set to start from timestamp
Hi Yan Zhou:
I think the java doc of the setStartFromTimestamp method has been explained very clearly, posted here:
/**
* Specify the consumer to start reading partitions from a specified timestamp.
* The specified timestamp must be before the current timestamp.
* This lets the consumer ignore any committed group offsets in Zookeeper / Kafka brokers.
*
* <p>The consumer will look up the earliest offset whose timestamp is greater than or equal
* to the specific timestamp from Kafka. If there's no such offset, the consumer will use the
* latest offset to read data from kafka.
*
* <p>This method does not affect where partitions are read from when the consumer is restored
* from a checkpoint or savepoint. When the consumer is restored from a checkpoint or
* savepoint, only the offsets in the restored state will be used.
*
* @param startupOffsetsTimestamp timestamp for the startup offsets, as milliseconds from epoch.
*
* @return The consumer object, to allow function chaining.
*/
Thanks, vino.
Yan Zhou [FDS Science] <yz...@coupang.com>> 于2018年8月8日周三 上午9:06写道:
Hi Experts,
In my application, the kafka source is set to start from a specified timestamp, by calling method FlinkKafkaConsumer010#setStartFromTimestamp(long startupOffsetsTimestamp).
If the application have run a while and then recover from a checkpoint because of failure, what's the offset will the kafka source to read from? I suppose it will read from the offset that has been committed before the failure. Is it right?
I am going to verify it, however some clarification is good in case my test result doesn't meet my assumption.
Best
Yan
Re: checkpoint recovery behavior when kafka source is set to start
from timestamp
Posted by vino yang <ya...@gmail.com>.
Hi Yan Zhou:
I think the java doc of the setStartFromTimestamp method has been explained
very clearly, posted here:
*/***
** Specify the consumer to start reading partitions from a specified
timestamp.*
** The specified timestamp must be before the current timestamp.*
** This lets the consumer ignore any committed group offsets in Zookeeper /
Kafka brokers.*
***
** <p>The consumer will look up the earliest offset whose timestamp is
greater than or equal*
** to the specific timestamp from Kafka. If there's no such offset, the
consumer will use the*
** latest offset to read data from kafka.*
***
** <p>This method does not affect where partitions are read from when the
consumer is restored*
** from a checkpoint or savepoint. When the consumer is restored from a
checkpoint or*
** savepoint, only the offsets in the restored state will be used.*
***
** @param startupOffsetsTimestamp timestamp for the startup offsets, as
milliseconds from epoch.*
***
** @return The consumer object, to allow function chaining.*
**/*
Thanks, vino.
Yan Zhou [FDS Science] <yz...@coupang.com> 于2018年8月8日周三 上午9:06写道:
> Hi Experts,
>
>
> In my application, the kafka source is set to start from a specified
> timestamp, by calling method FlinkKafkaConsumer010#setStartFromTimestamp(long
> startupOffsetsTimestamp).
>
>
> If the application have run a while and then recover from a checkpoint
> because of failure, what's the offset will the kafka source to read from? I
> suppose it will read from the offset that has been committed before the
> failure. Is it right?
>
>
> I am going to verify it, however some clarification is good in case my
> test result doesn't meet my assumption.
>
>
> Best
>
> Yan
>
>
>