Posted to users@kafka.apache.org by Raffaele Esposito <ra...@gmail.com> on 2020/05/15 18:47:45 UTC

idempotency issue and transactions on producer

Regarding producer idempotency, the Kafka documentation says:


... Since each new instance of a producer is assigned a new, unique, PID,
we can only guarantee idempotent production within a single producer
session.
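
The limitation quoted above can be illustrated with a toy model. This is only
a sketch, assuming the broker deduplicates by remembering the last sequence
number it accepted per producer ID (PID); the class ToyPartitionLog and its
methods are hypothetical and are not Kafka code. Real brokers also reject
out-of-order sequence numbers, which this model ignores.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of one partition's deduplication state; not Kafka's broker code.
final class ToyPartitionLog {
    private final List<String> records = new ArrayList<>();
    // Highest sequence number accepted so far, keyed by producer ID (PID).
    private final Map<Long, Integer> lastSequenceByPid = new HashMap<>();

    /** Appends the record unless this PID already wrote this sequence number. */
    boolean append(long pid, int sequence, String value) {
        Integer last = lastSequenceByPid.get(pid);
        if (last != null && sequence <= last) {
            return false; // duplicate caused by a retry within the same session
        }
        lastSequenceByPid.put(pid, sequence);
        records.add(value);
        return true;
    }

    int size() {
        return records.size();
    }

    public static void main(String[] args) {
        ToyPartitionLog log = new ToyPartitionLog();
        log.append(1L, 0, "row-42"); // first send in the session with PID 1
        log.append(1L, 0, "row-42"); // retry in the same session: dropped
        log.append(2L, 0, "row-42"); // resend after a restart, new PID 2: appended again
        System.out.println(log.size()); // prints 2
    }
}
```

The third call shows the gap the documentation describes: a restarted producer
gets a new PID, so the model has no way to recognise the record as a resend.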


Does this mean that when we build a producer that sources its data from some
external system (MySQL, for example), we need to handle this problem
explicitly and put effective countermeasures in place, such as checking which
message was last successfully produced to Kafka?


What if we use transactions, even if we don't really need transactional
logic for writing messages to different partitions in an all-or-nothing
fashion?

Would using transactions in our producer get rid of the idempotency issue?



Thanks!

Re: idempotency issue and transactions on producer

Posted by Raffaele Esposito <ra...@gmail.com>.
To be clearer: does using transactions and providing a transactional.id
make it possible for a producer to be idempotent even across sessions?

Thanks :)

Re: idempotency issue and transactions on producer

Posted by Raffaele Esposito <ra...@gmail.com>.
Hi Boyang,
Yes, what you wrote is clear to me; that's why I asked this question.

What I don't understand is whether, when we build a producer that sources its
data from some external system (MySQL, for example), we need to deal
explicitly with the per-session idempotency problem mentioned above.

Do we need to take care of this ourselves, or is it already taken care of
when using transactions?

Re: idempotency issue and transactions on producer

Posted by Boyang Chen <re...@gmail.com>.
Hey Raffaele,

The producer ID is assigned on the broker side when the broker receives the
producer.initTransactions() call. It guarantees the uniqueness of a producer
for its current lifecycle, and you don't have to configure it manually.

The transactional API, on the other hand, does include idempotent produce.
You need to provide a unique transactional.id, though. For a higher-level
overview, see this blog post:
https://www.confluent.io/blog/transactions-apache-kafka/
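
As a rough illustration of the settings mentioned above, here is a minimal
sketch that only builds the configuration with the standard library. The
class name TransactionalProducerConfig, the broker address, and the
transactional.id value are assumptions made for the example; the property
keys are standard producer configuration names. Whether this is enough for a
given MySQL-sourced pipeline depends on how the application tracks what it
has already read from the source.

```java
import java.util.Properties;

// Hypothetical helper; it only assembles settings and does not talk to Kafka.
final class TransactionalProducerConfig {
    /** Builds producer settings; both arguments are supplied by the caller. */
    static Properties build(String bootstrapServers, String transactionalId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        // Transactions require idempotence and acks=all.
        props.setProperty("enable.idempotence", "true");
        props.setProperty("acks", "all");
        // Must be unique per logical producer and stable across restarts.
        props.setProperty("transactional.id", transactionalId);
        return props;
    }

    public static void main(String[] args) {
        // "localhost:9092" and "mysql-source-producer-1" are made-up values.
        Properties props = build("localhost:9092", "mysql-source-producer-1");
        System.out.println(props.getProperty("transactional.id"));
        // With kafka-clients on the classpath, these properties (plus key and
        // value serializers) would be passed to new KafkaProducer<>(props),
        // followed by producer.initTransactions() and then beginTransaction(),
        // send(), and commitTransaction() around each batch.
    }
}
```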

Boyang