Posted to users@kafka.apache.org by Tomas Niño Kehoe <to...@gmail.com> on 2015/07/21 23:24:17 UTC

Retrieving "lost" messages produced while the consumer was down.

Hi,

We've been using Kafka for a couple of months, and now we're trying to
write a simple application using the ConsumerGroup to fully understand
Kafka.

With the producer continually writing data, our consumer occasionally
needs to be restarted. However, once the program is brought back up,
messages which were produced during that period of time are not being
read. Instead, the consumer (this is a single consumer inside a consumer
group) only reads the messages produced after it was brought back up. Its
configuration doesn't change at all.

For example, using the simple consumer/producer apps:

Produced 1, 2, 3, 4, 5
Consumed 1, 2, 3, 4, 5

[Stop the consumer]
Produce 20, 21, 22, 23

When the consumer is brought back up, I'd like to get 20, 21, 22, 23, but I
only get either the new messages or, using --from-beginning, all of the
messages.

Is there a way of achieving this programmatically without, for example,
writing an offset into the ZooKeeper node? Is the OffsetCommitRequest the
way to go?

Thanks in advance


Tomás

Re: Retrieving "lost" messages produced while the consumer was down.

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Since you mentioned consumer groups, I'm assuming you're using the
high-level consumer? Do you have auto.commit.enable set to true?

It sounds like when you start up you are always getting the
auto.offset.reset behavior, which indicates you don't have any offsets
committed. By default, that behavior is to consume from the latest offset
(which would only get messages produced after the consumer starts).
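
The rule described above can be sketched as a small stand-alone model. This
is not Kafka code: the class and method names are hypothetical, the offsets
assume the five consumed messages sit at offsets 0-4 and the four missed
ones at offsets 5-8, and "smallest"/"largest" are the auto.offset.reset
values of the old high-level consumer. It also ignores the case where a
committed offset is out of range.

```java
import java.util.OptionalLong;

// Illustrative model only (hypothetical names, not part of the Kafka API).
public class StartOffsetSketch {

    // Decide where a consumer group would start reading a partition.
    public static long startOffset(OptionalLong committed,
                                   String autoOffsetReset,
                                   long earliest,
                                   long logEnd) {
        // A committed offset always wins: resume where the group left off.
        if (committed.isPresent()) {
            return committed.getAsLong();
        }
        // No committed offset: fall back to auto.offset.reset.
        if ("smallest".equals(autoOffsetReset)) {
            return earliest;   // re-read everything still in the log
        }
        if ("largest".equals(autoOffsetReset)) {
            return logEnd;     // only messages produced from now on
        }
        throw new IllegalArgumentException(
                "unsupported auto.offset.reset: " + autoOffsetReset);
    }

    public static void main(String[] args) {
        long earliest = 0L; // offset of message "1"
        long logEnd = 9L;   // next offset after message "23"

        // Offsets were committed after consuming 1..5 (offsets 0..4).
        System.out.println(
                startOffset(OptionalLong.of(5L), "largest", earliest, logEnd));
        // Nothing committed, default reset policy.
        System.out.println(
                startOffset(OptionalLong.empty(), "largest", earliest, logEnd));
        // Nothing committed, reset to the beginning of the log.
        System.out.println(
                startOffset(OptionalLong.empty(), "smallest", earliest, logEnd));
    }
}
```

Under these assumptions, only the first case starts at the offset of
message 20, which is the behavior being asked for.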

To get the behavior you're looking for, you should make sure to commit
offsets when you're shutting down your consumer so it will resume where you
left off the next time you start it. Unless you are using the
SimpleConsumer, you shouldn't need to explicitly make any requests yourself.
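
As a rough sketch of the configuration side, assuming the ZooKeeper-based
high-level consumer, the properties could be built like this. The class
name, the host/port, the group name and the one-second commit interval are
placeholders chosen for illustration; the property keys are those of the
old high-level consumer.

```java
import java.util.Properties;

// Hypothetical helper: builds settings for the old high-level consumer.
public class HighLevelConsumerProps {

    public static Properties build(String zookeeperConnect, String groupId) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zookeeperConnect);
        // Keep the group id stable across restarts; committed offsets
        // are stored per consumer group.
        props.setProperty("group.id", groupId);
        // Let the consumer commit its position periodically.
        props.setProperty("auto.commit.enable", "true");
        // Assumed value for illustration: commit roughly once per second.
        props.setProperty("auto.commit.interval.ms", "1000");
        // Optional fallback, used only when no committed offset exists.
        props.setProperty("auto.offset.reset", "smallest");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder connection string and group name.
        build("localhost:2181", "demo-group").list(System.out);
    }
}
```

With the high-level consumer, properties like these would typically be
wrapped in a ConsumerConfig and passed to
Consumer.createJavaConsumerConnector(...). Calling commitOffsets() and then
shutdown() on the resulting connector before the process exits (for
example from a JVM shutdown hook) is one way to make sure the last position
is saved.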


-- 
Thanks,
Ewen