You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@flink.apache.org by Janardhan Reddy <ja...@olacabs.com> on 2016/08/03 09:26:10 UTC

Reading from latest offset in kafka consumer on restart

Hi,

Is there a way to read from latest offset in kafka consumer on restart.
Or can we somehow start flink ignoring previous checkpointed data.

Thanks

Re: Reading from latest offset in kafka consumer on restart

Posted by Stephan Ewen <se...@apache.org>.
Ah, you probably use the same consumer group ID.

Flink participates in Kafka's consumer groups (writing offsets for that
group to ZooKeeper/Kafka). If you resume a job, it initially looks for the
current offsets of that consumer group.
If you want to restart without such an offset, you need to set a random "
group.id" in the properties of the FlinkKafkaConsumer.

We are thinking about changing the configuration a bit to make that more
easy. In the next versions, it should be explicit if the FlinkKafkaConsumer
would participate in the consumer group.

On Wed, Aug 3, 2016 at 11:48 AM, Janardhan Reddy <
janardhan.reddy@olacabs.com> wrote:

> thanks.
> We are using kafka flink consumer 0.8.2_11 ,I have set "auto.offset.reset"
> to "largest"
> On cancel and restart the consumer is reading from where it left off
> instead of current offset, i tried both largest and latest in
> auto.offset.reset
>
>
>
> On Wed, Aug 3, 2016 at 3:12 PM, Stephan Ewen <se...@apache.org> wrote:
>
>> Checkpointing starts the consumer where it left off in case the job fails
>> and recovers.
>> If you explicitly cancel a job and start a new job (same jar), the new
>> job will not start from a checkpoint, but from blank state.
>>
>>
>> On Wed, Aug 3, 2016 at 11:37 AM, Janardhan Reddy <
>> janardhan.reddy@olacabs.com> wrote:
>>
>>> I mean in case of chekpointing, won't the consumer start from where it
>>> previously left ?
>>>
>>> On Wed, Aug 3, 2016 at 3:06 PM, Janardhan Reddy <
>>> janardhan.reddy@olacabs.com> wrote:
>>>
>>>> How would checkpointing affect the offset.
>>>>
>>>> On Wed, Aug 3, 2016 at 3:03 PM, Stephan Ewen <se...@apache.org> wrote:
>>>>
>>>>> When you cancel and restart a Flink job (without a savepoint), it does
>>>>> not use the checkpoint data, and uses the behavior you defined in the Kafka
>>>>> consumer to decide where to start from (consumer group, latest, or
>>>>> earliest).
>>>>>
>>>>> On Wed, Aug 3, 2016 at 11:26 AM, Janardhan Reddy <
>>>>> janardhan.reddy@olacabs.com> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> Is there a way to read from latest offset in kafka consumer on
>>>>>> restart.
>>>>>> Or can we somehow start flink ignoring previous checkpointed data.
>>>>>>
>>>>>> Thanks
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Reading from latest offset in kafka consumer on restart

Posted by Janardhan Reddy <ja...@olacabs.com>.
thanks.
We are using kafka flink consumer 0.8.2_11 ,I have set "auto.offset.reset"
to "largest"
On cancel and restart the consumer is reading from where it left off
instead of current offset, i tried both largest and latest in
auto.offset.reset



On Wed, Aug 3, 2016 at 3:12 PM, Stephan Ewen <se...@apache.org> wrote:

> Checkpointing starts the consumer where it left off in case the job fails
> and recovers.
> If you explicitly cancel a job and start a new job (same jar), the new job
> will not start from a checkpoint, but from blank state.
>
>
> On Wed, Aug 3, 2016 at 11:37 AM, Janardhan Reddy <
> janardhan.reddy@olacabs.com> wrote:
>
>> I mean in case of chekpointing, won't the consumer start from where it
>> previously left ?
>>
>> On Wed, Aug 3, 2016 at 3:06 PM, Janardhan Reddy <
>> janardhan.reddy@olacabs.com> wrote:
>>
>>> How would checkpointing affect the offset.
>>>
>>> On Wed, Aug 3, 2016 at 3:03 PM, Stephan Ewen <se...@apache.org> wrote:
>>>
>>>> When you cancel and restart a Flink job (without a savepoint), it does
>>>> not use the checkpoint data, and uses the behavior you defined in the Kafka
>>>> consumer to decide where to start from (consumer group, latest, or
>>>> earliest).
>>>>
>>>> On Wed, Aug 3, 2016 at 11:26 AM, Janardhan Reddy <
>>>> janardhan.reddy@olacabs.com> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Is there a way to read from latest offset in kafka consumer on
>>>>> restart.
>>>>> Or can we somehow start flink ignoring previous checkpointed data.
>>>>>
>>>>> Thanks
>>>>>
>>>>
>>>>
>>>
>>
>

Re: Reading from latest offset in kafka consumer on restart

Posted by Stephan Ewen <se...@apache.org>.
Checkpointing starts the consumer where it left off in case the job fails
and recovers.
If you explicitly cancel a job and start a new job (same jar), the new job
will not start from a checkpoint, but from blank state.


On Wed, Aug 3, 2016 at 11:37 AM, Janardhan Reddy <
janardhan.reddy@olacabs.com> wrote:

> I mean in case of chekpointing, won't the consumer start from where it
> previously left ?
>
> On Wed, Aug 3, 2016 at 3:06 PM, Janardhan Reddy <
> janardhan.reddy@olacabs.com> wrote:
>
>> How would checkpointing affect the offset.
>>
>> On Wed, Aug 3, 2016 at 3:03 PM, Stephan Ewen <se...@apache.org> wrote:
>>
>>> When you cancel and restart a Flink job (without a savepoint), it does
>>> not use the checkpoint data, and uses the behavior you defined in the Kafka
>>> consumer to decide where to start from (consumer group, latest, or
>>> earliest).
>>>
>>> On Wed, Aug 3, 2016 at 11:26 AM, Janardhan Reddy <
>>> janardhan.reddy@olacabs.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> Is there a way to read from latest offset in kafka consumer on restart.
>>>> Or can we somehow start flink ignoring previous checkpointed data.
>>>>
>>>> Thanks
>>>>
>>>
>>>
>>
>

Re: Reading from latest offset in kafka consumer on restart

Posted by Janardhan Reddy <ja...@olacabs.com>.
I mean in case of chekpointing, won't the consumer start from where it
previously left ?

On Wed, Aug 3, 2016 at 3:06 PM, Janardhan Reddy <janardhan.reddy@olacabs.com
> wrote:

> How would checkpointing affect the offset.
>
> On Wed, Aug 3, 2016 at 3:03 PM, Stephan Ewen <se...@apache.org> wrote:
>
>> When you cancel and restart a Flink job (without a savepoint), it does
>> not use the checkpoint data, and uses the behavior you defined in the Kafka
>> consumer to decide where to start from (consumer group, latest, or
>> earliest).
>>
>> On Wed, Aug 3, 2016 at 11:26 AM, Janardhan Reddy <
>> janardhan.reddy@olacabs.com> wrote:
>>
>>> Hi,
>>>
>>> Is there a way to read from latest offset in kafka consumer on restart.
>>> Or can we somehow start flink ignoring previous checkpointed data.
>>>
>>> Thanks
>>>
>>
>>
>

Re: Reading from latest offset in kafka consumer on restart

Posted by Janardhan Reddy <ja...@olacabs.com>.
How would checkpointing affect the offset.

On Wed, Aug 3, 2016 at 3:03 PM, Stephan Ewen <se...@apache.org> wrote:

> When you cancel and restart a Flink job (without a savepoint), it does not
> use the checkpoint data, and uses the behavior you defined in the Kafka
> consumer to decide where to start from (consumer group, latest, or
> earliest).
>
> On Wed, Aug 3, 2016 at 11:26 AM, Janardhan Reddy <
> janardhan.reddy@olacabs.com> wrote:
>
>> Hi,
>>
>> Is there a way to read from latest offset in kafka consumer on restart.
>> Or can we somehow start flink ignoring previous checkpointed data.
>>
>> Thanks
>>
>
>

Re: Reading from latest offset in kafka consumer on restart

Posted by Stephan Ewen <se...@apache.org>.
When you cancel and restart a Flink job (without a savepoint), it does not
use the checkpoint data, and uses the behavior you defined in the Kafka
consumer to decide where to start from (consumer group, latest, or
earliest).

On Wed, Aug 3, 2016 at 11:26 AM, Janardhan Reddy <
janardhan.reddy@olacabs.com> wrote:

> Hi,
>
> Is there a way to read from latest offset in kafka consumer on restart.
> Or can we somehow start flink ignoring previous checkpointed data.
>
> Thanks
>