Posted to users@kafka.apache.org by Corbin Hoenes <co...@tynt.com> on 2012/06/12 18:14:21 UTC

consumer may lose data

Hi there, this is my first experience with Kafka. We've deployed it in production (soft release) and are using it to create a real-time stream of data; we love it so far.

Running in production, we see messages like these every once in a while:

[2012-06-09 09:02:36,051] ERROR [pool-1-thread-4] (ConsumerIterator.scala 74) - consumed offset: 22013667532 doesn't match fetch offset: 21008498593 for firehose:1-23: fetched offset = 22013667532: consumed offset = 22013667532;
 Consumer may lose data
[2012-06-09 09:22:48,520] ERROR [pool-1-thread-4] (ConsumerIterator.scala 74) - consumed offset: 21013192930 doesn't match fetch offset: 21475567914 for firehose:1-3: fetched offset = 22021419503: consumed offset = 21013192930;
 Consumer may lose data
[2012-06-09 09:42:34,342] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21017992042 doesn't match fetch offset: 21477363255 for firehose:1-5: fetched offset = 22029075985: consumed offset = 21017992042;
 Consumer may lose data
[2012-06-09 09:46:50,599] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21017498912 doesn't match fetch offset: 21476883323 for firehose:1-7: fetched offset = 22022716494: consumed offset = 21017498912;
 Consumer may lose data
[2012-06-09 09:50:54,912] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21016428723 doesn't match fetch offset: 21475750245 for firehose:1-4: fetched offset = 22027573299: consumed offset = 21016428723;
 Consumer may lose data
[2012-06-09 09:58:29,709] ERROR [pool-1-thread-1] (ConsumerIterator.scala 74) - consumed offset: 21017643906 doesn't match fetch offset: 21477006308 for firehose:1-6: fetched offset = 22025778964: consumed offset = 21017643906;
 Consumer may lose data
[2012-06-09 09:59:04,622] ERROR [pool-1-thread-4] (ConsumerIterator.scala 74) - consumed offset: 21017419393 doesn't match fetch offset: 21476749439 for firehose:1-23: fetched offset = 22025584690: consumed offset = 21017419393;
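
To compare these entries side by side, it can help to pull the numbers out of each line. The following is only a sketch: the regular expression is inferred from the log lines quoted above, not from the Kafka source, and the names `MISMATCH` and `parse_mismatch` are hypothetical. The `gap` field is simply fetch offset minus consumed offset; no interpretation of its sign is implied.

```python
import re

# Pattern inferred from the log lines in this thread (an assumption, not
# something taken from ConsumerIterator.scala itself).
MISMATCH = re.compile(
    r"consumed offset: (?P<consumed>\d+) doesn't match fetch offset: (?P<fetch>\d+)"
    r" for (?P<topic>[^:\s]+):(?P<partition>[\w-]+): fetched offset = (?P<fetched>\d+)"
)


def parse_mismatch(line):
    """Return the offsets from one mismatch line, or None if it does not match."""
    m = MISMATCH.search(line)
    if m is None:
        return None
    consumed = int(m.group("consumed"))
    fetch = int(m.group("fetch"))
    return {
        "topic": m.group("topic"),
        "partition": m.group("partition"),
        "consumed": consumed,
        "fetch": fetch,
        "fetched": int(m.group("fetched")),
        "gap": fetch - consumed,  # difference only; the sign is not interpreted here
    }


if __name__ == "__main__":
    sample = (
        "[2012-06-09 09:22:48,520] ERROR [pool-1-thread-4] (ConsumerIterator.scala 74) - "
        "consumed offset: 21013192930 doesn't match fetch offset: 21475567914 "
        "for firehose:1-3: fetched offset = 22021419503: consumed offset = 21013192930;"
    )
    print(parse_mismatch(sample))
```

Running every ERROR line through `parse_mismatch` gives one record per partition, which makes it easier to see how far apart the two offsets are in each case.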

I am a bit unsure what Kafka does when the consumed offset doesn't match the fetch offset. We are using a pool of threads to consume each stream created by ConsumerConnector.createMessageStreams(). Is this kosher?
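
For readers unfamiliar with the pattern, the threading arrangement described above can be sketched roughly as follows. This is not the poster's code and it does not use the Kafka API: each stream is stood in for by a plain `queue.Queue`, and `drain`, `consume_all`, `make_stream` and the `_END` sentinel are hypothetical names. It also assumes the description means one thread per stream, so that no stream is read by two threads at once.

```python
import queue
from concurrent.futures import ThreadPoolExecutor

_END = object()  # sentinel marking the end of a stand-in stream


def make_stream(messages):
    """Build a stand-in stream: a queue pre-filled with messages plus the sentinel."""
    q = queue.Queue()
    for msg in messages:
        q.put(msg)
    q.put(_END)
    return q


def drain(stream):
    """Read one stream to the end on a single thread, keeping message order."""
    seen = []
    while True:
        msg = stream.get()
        if msg is _END:
            return seen
        seen.append(msg)


def consume_all(streams):
    """Run one worker per stream so that no stream is shared between threads."""
    with ThreadPoolExecutor(max_workers=len(streams)) as pool:
        return list(pool.map(drain, streams))


if __name__ == "__main__":
    streams = [make_stream([1, 2, 3]), make_stream(["a", "b"])]
    print(consume_all(streams))
```

The pool is sized to the number of streams so that every stream gets its own worker; a smaller pool would still work in this sketch, but streams would then be drained one after another rather than concurrently.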


Re: consumer may lose data

Posted by Jun Rao <ju...@gmail.com>.
Corbin,

This indicates a bug in Kafka. Which version of Kafka are you using? Is
there an easy way to reproduce the problem? Your usage of the Kafka
consumer seems normal.

Thanks,

Jun
