Posted to users@kafka.apache.org by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com> on 2016/08/05 15:15:48 UTC

Kafka consumer getting duplicate message

Hi there,

We are using Kafka 1.0.0.M2 with Spring, and we see a lot of duplicate messages being received by the listener's onMessage() method.
We configured:

enable.auto.commit=false
session.timeout.ms=15000
factory.getContainerProperties().setSyncCommits(true);
factory.setConcurrency(5);

So what could be the reason for the duplicate messages?

Thanks
Achintya

RE: Kafka consumer getting duplicate message

Posted by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com>.
Thank you, Ewen, for your response.
Actually, we are using the Spring Kafka 1.0.0.M2 release, which uses the Kafka 0.9 release.
Yes, we see a lot of duplicates, and our producer and consumer settings are below. We don't see any duplication at the producer end: if we send 1000 messages to a particular topic, the topic receives exactly 1000 messages (sometimes fewer).

But when we consume the messages at the consumer level, we see a lot of messages with the same offset and the same partition, so please let us know what tweaking is needed to avoid the duplication.

We have three types of topics, and each topic has a replication factor of 3 and 10 partitions.

Producer Configuration:

bootstrap.producer.servers=provisioningservices-aq-dev.g.comcast.net:80
acks=1
retries=3
batch.size=16384
linger.ms=5
buffer.memory=33554432
request.timeout.ms=60000
timeout.ms=60000
key.serializer=org.apache.kafka.common.serialization.StringSerializer
value.serializer=com.comcast.provisioning.provisioning_.kafka.CustomMessageSer

Consumer Configuration:

bootstrap.consumer.servers=provisioningservices-aqr-dev.g.comcast.net:80
group.id=ps-consumer-group
enable.auto.commit=false
auto.commit.interval.ms=100
session.timeout.ms=15000
key.deserializer=org.apache.kafka.common.serialization.StringDeserializer
value.deserializer=com.comcast.provisioning.provisioning_.kafka.CustomMessageDeSer

factory.getContainerProperties().setSyncCommits(true);
factory.setConcurrency(5);

Thanks
Achintya


-----Original Message-----
From: Ewen Cheslack-Postava [mailto:ewen@confluent.io] 
Sent: Saturday, August 06, 2016 1:45 AM
To: users@kafka.apache.org
Cc: dev@kafka.apache.org
Subject: Re: Kafka consumer getting duplicate message

Achintya,

1.0.0.M2 is not an official release, so this version number is not particularly meaningful to people on this list. What platform/distribution are you using and how does this map to actual Apache Kafka releases?

In general, it is not possible for any system to guarantee exactly once semantics because those semantics rely on the source and destination systems coordinating -- the source provides some sort of retry semantics, and the destination system needs to do some sort of deduplication or similar to only "deliver" the data one time.

That said, duplicates should usually only be generated in the face of failures. If you're seeing a lot of duplicates, that probably means shutdown/failover is not being handled correctly. If you can provide more info about your setup, we might be able to suggest tweaks that will avoid these situations.

-Ewen

On Fri, Aug 5, 2016 at 8:15 AM, Ghosh, Achintya (Contractor) < Achintya_Ghosh@comcast.com> wrote:

> Hi there,
>
> We are using Kafka 1.0.0.M2 with Spring and we see a lot of duplicate 
> message is getting received by the Listener onMessage() method .
> We configured :
>
> enable.auto.commit=false
> session.timeout.ms=15000
> factory.getContainerProperties().setSyncCommits(true);
> factory.setConcurrency(5);
>
> So what could be the reason to get the duplicate messages?
>
> Thanks
> Achintya
>



--
Thanks,
Ewen

RE: Kafka consumer getting duplicate message

Posted by "Ghosh, Achintya (Contractor)" <Ac...@comcast.com>.
Can anyone please check this one?

Thanks
Achintya


Re: Kafka consumer getting duplicate message

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Achintya,

1.0.0.M2 is not an official release, so this version number is not
particularly meaningful to people on this list. What platform/distribution
are you using and how does this map to actual Apache Kafka releases?

In general, it is not possible for any system to guarantee exactly once
semantics because those semantics rely on the source and destination
systems coordinating -- the source provides some sort of retry semantics,
and the destination system needs to do some sort of deduplication or
similar to only "deliver" the data one time.
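
To illustrate the deduplication mentioned above, here is a minimal sketch of a consumer-side filter keyed on topic, partition, and offset. The class name OffsetDeduplicator and the method markIfNew are hypothetical and are not part of Kafka or Spring Kafka. The sketch assumes that offsets within one partition are handled in increasing order and that in-memory state is acceptable; a real deployment would likely need to persist this state together with the processing result.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical helper (not a Kafka or Spring Kafka class): remembers the
 * highest offset already handled for each topic-partition.
 */
class OffsetDeduplicator {
    // Key is "topic-partition", value is the highest offset handled so far.
    private final Map<String, Long> lastProcessed = new HashMap<>();

    /**
     * Returns true the first time an offset is seen for a partition and
     * false when the same or an earlier offset is delivered again.
     */
    synchronized boolean markIfNew(String topic, int partition, long offset) {
        String key = topic + "-" + partition;
        Long last = lastProcessed.get(key);
        if (last != null && offset <= last) {
            return false; // redelivery: this offset was already handled
        }
        lastProcessed.put(key, offset);
        return true;
    }
}
```

A listener could call markIfNew with the record's topic, partition, and offset at the start of onMessage() and skip the record when it returns false.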

That said, duplicates should usually only be generated in the face of
failures. If you're seeing a lot of duplicates, that probably means
shutdown/failover is not being handled correctly. If you can provide more
info about your setup, we might be able to suggest tweaks that will avoid
these situations.
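
As a rough illustration of how a failure before the offset commit leads to redelivery, the toy model below simulates a single partition in plain Java. It does not use the Kafka client; RedeliveryDemo, poll, and commit are made-up names for this sketch. The only behavior it models is that a consumer resumes from the last committed offset, so anything processed but not committed is delivered again.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one partition; not the Kafka client API. */
class RedeliveryDemo {
    // Next offset handed out after a restart or a partition reassignment.
    private long committed = 0;

    /** Returns `count` consecutive offsets starting at the committed position. */
    List<Long> poll(int count) {
        List<Long> batch = new ArrayList<>();
        for (long offset = committed; offset < committed + count; offset++) {
            batch.add(offset);
        }
        return batch;
    }

    /** Records that everything up to and including lastOffset is done. */
    void commit(long lastOffset) {
        committed = lastOffset + 1;
    }

    public static void main(String[] args) {
        RedeliveryDemo partition = new RedeliveryDemo();
        // Offsets 0, 1, 2 are processed, but the consumer loses the
        // partition before commit(2) is called.
        List<Long> first = partition.poll(3);
        // The next owner starts from the last committed offset, which is still 0.
        List<Long> second = partition.poll(3);
        System.out.println(first + " then " + second);
    }
}
```

In this model the second batch repeats the first one exactly, which matches the symptom of seeing the same partition and offset more than once. If every batch were committed before the partition changed hands, no offsets would repeat.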

-Ewen

On Fri, Aug 5, 2016 at 8:15 AM, Ghosh, Achintya (Contractor) <
Achintya_Ghosh@comcast.com> wrote:

> Hi there,
>
> We are using Kafka 1.0.0.M2 with Spring and we see a lot of duplicate
> message is getting received by the Listener onMessage() method .
> We configured :
>
> enable.auto.commit=false
> session.timeout.ms=15000
> factory.getContainerProperties().setSyncCommits(true);
> factory.setConcurrency(5);
>
> So what could be the reason to get the duplicate messages?
>
> Thanks
> Achintya
>



-- 
Thanks,
Ewen
