Posted to users@kafka.apache.org by "Sybrandy, Casey" <Ca...@Six3Systems.com> on 2013/07/18 17:55:09 UTC

Duplicate Messages on the Consumer

Hello,

We recently started seeing duplicate messages appearing at our consumers.  Thankfully, the database is set up so that we don't store the dupes, but it is annoying.  It's not every message, only about 1% of them.  We are running 0.7.0 for the broker with ZooKeeper 3.3.4 from Cloudera and 0.7.0 for the producer and consumer.  We tried upgrading the consumer to 0.7.2 to see if that would help, but we're still seeing the dupes.  Do we have to upgrade the broker as well to resolve this?  Is there something we can check to see what's going on?  We're not seeing anything unusual in the logs.  I suspected that there might be significant rebalancing, but that does not appear to be the case at all.
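
As a rough sketch of the kind of database-side guard described above (not code from this setup): a unique key on a message identifier lets the database reject redeliveries, so an at-least-once consumer can simply ignore duplicate-key errors.  The table and column names below are invented for illustration.

    // Sketch only: assumes a hypothetical "messages" table with a UNIQUE
    // constraint on message_key; the names are made up, and some JDBC drivers
    // signal duplicate keys with a plain SQLException plus a vendor error code
    // instead of the subclass caught here.
    import java.sql.Connection;
    import java.sql.PreparedStatement;
    import java.sql.SQLIntegrityConstraintViolationException;

    public class IdempotentWriter {
        private final Connection conn;

        public IdempotentWriter(Connection conn) {
            this.conn = conn;
        }

        /** Returns true if the row was stored, false if it was a duplicate. */
        public boolean store(String messageKey, byte[] payload) throws Exception {
            String sql = "INSERT INTO messages (message_key, payload) VALUES (?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                ps.setString(1, messageKey);
                ps.setBytes(2, payload);
                ps.executeUpdate();
                return true;
            } catch (SQLIntegrityConstraintViolationException duplicate) {
                // A redelivered message hit the unique key; drop it silently.
                return false;
            }
        }
    }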

Casey Sybrandy


Re: Duplicate Messages on the Consumer

Posted by Jun Rao <ju...@gmail.com>.
In 0.7.x, if the messages are compressed, there can be duplicate
messages during a consumer rebalance. This is because we can only checkpoint
the consumer offset at compressed-unit boundaries. You may want to see whether
you have unnecessary rebalances (see
https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-Whyaretheremanyrebalancesinmyconsumerlog%3F).
In 0.8, there won't be duplicate messages even when compression is enabled.
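
A minimal sketch of the kind of consumer tuning that FAQ entry points at: the 0.7-style property names below are recalled from the docs, not taken from this thread, and both the names and values should be verified against the ConsumerConfig of the version actually deployed.

    // Sketch only: 0.7-style consumer settings aimed at reducing spurious
    // rebalances, which are commonly triggered by GC pauses long enough to
    // expire the ZooKeeper session. Property names and values are assumptions;
    // check them against the deployed version's ConsumerConfig.
    import java.util.Properties;

    public class ConsumerTuning {
        public static Properties rebalanceFriendlyProps() {
            Properties props = new Properties();
            props.put("zk.connect", "zk1:2181,zk2:2181,zk3:2181"); // placeholder hosts
            props.put("groupid", "example-group");                 // placeholder group
            props.put("zk.sessiontimeout.ms", "12000");  // survive longer GC pauses
            props.put("zk.synctime.ms", "2000");
            props.put("rebalance.retries.max", "8");     // retry longer before giving up
            props.put("autocommit.interval.ms", "1000"); // commit offsets more often to
                                                         // shrink the redelivery window
            return props;
        }
    }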

Thanks,

Jun


On Fri, Jul 19, 2013 at 1:16 PM, Sybrandy, Casey <
Casey.Sybrandy@six3systems.com> wrote:

> Hello,
>
> No, we couldn't check the broker logs because the data is obfuscated, so
> we can't just look at the files and tell.  It looks like our dev system may
> be experiencing the same issue, so I did turn off the obfuscation and we'll
> monitor it.  However, the production system, where we were seeing the
> errors more often, appears to have had ZooKeeper misconfigured, so we're
> thinking that may be the issue.
>
> Casey
>
> -----Original Message-----
> From: Philip O'Toole [mailto:philip@loggly.com]
> Sent: Thursday, July 18, 2013 3:29 PM
> To: users@kafka.apache.org
> Cc: kafka-users@incubator.apache.org
> Subject: Re: Duplicate Messages on the Consumer
>
> Have you actually examined the Kafka files on disk, to make sure those
> dupes are really there? Or is this a case of reading the same message more
> than once?
>
> Philip
>
> On Thu, Jul 18, 2013 at 8:55 AM, Sybrandy, Casey <
> Casey.Sybrandy@six3systems.com> wrote:
> > Hello,
> >
> > We recently started seeing duplicate messages appearing at our
> > consumers.  Thankfully, the database is set up so that we don't store
> > the dupes, but it is annoying.  It's not every message, only about 1% of
> > them.  We are running 0.7.0 for the broker with ZooKeeper 3.3.4 from
> > Cloudera and 0.7.0 for the producer and consumer.  We tried upgrading
> > the consumer to 0.7.2 to see if that would help, but we're still seeing
> > the dupes.  Do we have to upgrade the broker as well to resolve this?
> > Is there something we can check to see what's going on?  We're not
> > seeing anything unusual in the logs.  I suspected that there might be
> > significant rebalancing, but that does not appear to be the case at all.
> >
> > Casey Sybrandy
> >
>

RE: Duplicate Messages on the Consumer

Posted by "Sybrandy, Casey" <Ca...@Six3Systems.com>.
Hello,

No, we couldn't check the broker logs because the data is obfuscated, so we can't just look at the files and tell.  It looks like our dev system may be experiencing the same issue, so I did turn off the obfuscation and we'll monitor it.  However, the production system, where we were seeing the errors more often, appears to have had ZooKeeper misconfigured, so we're thinking that may be the issue.
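
One way to sanity-check a suspected ZooKeeper misconfiguration (a sketch, not something proposed in the thread) is to connect with the exact zk.connect string the consumers use and confirm that the brokers and every consumer instance register under the same ensemble and chroot.  The znode paths below follow the 0.7-era layout as remembered; adjust them if the deployment differs.

    // Sketch only: lists the broker and consumer registrations visible from a
    // given ZooKeeper connection string. The paths /brokers/ids and
    // /consumers/<group>/ids are assumptions based on the 0.7-era layout.
    import java.util.List;
    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class ZkRegistrationCheck {
        public static void main(String[] args) throws Exception {
            String connect = args[0]; // the exact zk.connect string the consumers use
            String group = args[1];   // the consumer group to inspect
            ZooKeeper zk = new ZooKeeper(connect, 10000, new Watcher() {
                public void process(WatchedEvent event) { /* no-op */ }
            });
            try {
                List<String> brokers = zk.getChildren("/brokers/ids", false);
                List<String> consumers = zk.getChildren("/consumers/" + group + "/ids", false);
                System.out.println("Brokers registered:   " + brokers);
                System.out.println("Consumers registered: " + consumers);
                // If some consumers point at a different ensemble or chroot,
                // one of these lists will be empty or missing entries.
            } finally {
                zk.close();
            }
        }
    }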

Casey

-----Original Message-----
From: Philip O'Toole [mailto:philip@loggly.com] 
Sent: Thursday, July 18, 2013 3:29 PM
To: users@kafka.apache.org
Cc: kafka-users@incubator.apache.org
Subject: Re: Duplicate Messages on the Consumer

Have you actually examined the Kafka files on disk, to make sure those dupes are really there? Or is this a case of reading the same message more than once?

Philip

On Thu, Jul 18, 2013 at 8:55 AM, Sybrandy, Casey <Ca...@six3systems.com> wrote:
> Hello,
>
> We recently started seeing duplicate messages appearing at our consumers.  Thankfully, the database is set up so that we don't store the dupes, but it is annoying.  It's not every message, only about 1% of them.  We are running 0.7.0 for the broker with ZooKeeper 3.3.4 from Cloudera and 0.7.0 for the producer and consumer.  We tried upgrading the consumer to 0.7.2 to see if that would help, but we're still seeing the dupes.  Do we have to upgrade the broker as well to resolve this?  Is there something we can check to see what's going on?  We're not seeing anything unusual in the logs.  I suspected that there might be significant rebalancing, but that does not appear to be the case at all.
>
> Casey Sybrandy
>

Re: Duplicate Messages on the Consumer

Posted by Philip O'Toole <ph...@loggly.com>.
Have you actually examined the Kafka files on disk, to make sure those
dupes are really there? Or is this a case of reading the same message
more than once?
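
One way to answer that when the payloads can't be eyeballed (a sketch, not something proposed in the thread): re-consume the topic from the beginning with a fresh consumer group and run every payload through a checksum check like the one below.  If duplicates still appear on a single clean pass over what is stored, they are on disk; if not, the original consumers were re-reading messages.  The detector is independent of any particular consumer API.

    // Sketch only: feed each payload from one complete pass over the topic
    // into record(); it reports payloads whose checksum was already seen.
    // CRC32 can collide, so treat a hit as "worth inspecting", not proof.
    import java.util.HashSet;
    import java.util.Set;
    import java.util.zip.CRC32;

    public class DuplicateDetector {
        private final Set<Long> seen = new HashSet<Long>();
        private long duplicates = 0;

        /** Returns true if this payload's checksum has been seen before. */
        public boolean record(byte[] payload) {
            CRC32 crc = new CRC32();
            crc.update(payload);
            boolean dup = !seen.add(crc.getValue());
            if (dup) {
                duplicates++;
            }
            return dup;
        }

        public long duplicateCount() {
            return duplicates;
        }
    }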

Philip

On Thu, Jul 18, 2013 at 8:55 AM, Sybrandy, Casey
<Ca...@six3systems.com> wrote:
> Hello,
>
> We recently started seeing duplicate messages appearing at our consumers.  Thankfully, the database is set up so that we don't store the dupes, but it is annoying.  It's not every message, only about 1% of them.  We are running 0.7.0 for the broker with ZooKeeper 3.3.4 from Cloudera and 0.7.0 for the producer and consumer.  We tried upgrading the consumer to 0.7.2 to see if that would help, but we're still seeing the dupes.  Do we have to upgrade the broker as well to resolve this?  Is there something we can check to see what's going on?  We're not seeing anything unusual in the logs.  I suspected that there might be significant rebalancing, but that does not appear to be the case at all.
>
> Casey Sybrandy
>