Posted to dev@kafka.apache.org by nitin agarwal <ni...@gmail.com> on 2021/01/27 19:35:26 UTC

Impact of setting max.in.flight.requests.per.connection > 1 in Kafka Connect

Hi All,

I see that max.in.flight.requests.per.connection is explicitly set to 1 in
Kafka Connect, but there is a way to override it. I want to understand the
impact of setting its value > 1.
As per my understanding, it will lead to data loss in some cases. Is that
correct?


Thank you,
Nitin

Re: Impact of setting max.in.flight.requests.per.connection > 1 in Kafka Connect

Posted by "Matthias J. Sax" <mj...@apache.org>.
I guess you would get duplicates if you crash after data was written
into the topics but before offsets were committed.

So there is no data loss or reordering in this case, only duplication.
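
A minimal sketch of that failure mode, assuming a toy model in which the
source is a list, the topic is a list, and the committed source offset is a
plain integer. The function `run_source_task` and its parameters are
hypothetical and are not part of Connect; they only illustrate why a crash
between the write and the offset flush produces duplicates rather than loss.

```python
def run_source_task(source, topic, committed_offset, crash_before_commit=False):
    """Read records after committed_offset, write them to topic, then commit.

    Returns the committed offset after the run. If crash_before_commit is
    True, the records are written but the offset is not advanced, mimicking
    a crash between the producer write and the offset flush.
    """
    batch = source[committed_offset:]
    topic.extend(batch)                      # records reach the topic
    if crash_before_commit:
        return committed_offset              # progress was never persisted
    return committed_offset + len(batch)     # progress is persisted


source = ["a", "b", "c"]
topic = []

# First attempt: data is written, then the task dies before the offset flush.
offset = run_source_task(source, topic, 0, crash_before_commit=True)

# Restart: the task resumes from the last committed offset, which is still 0.
offset = run_source_task(source, topic, offset)

print(topic)   # ['a', 'b', 'c', 'a', 'b', 'c']
```

Every source record is present and in order within each attempt; the only
effect of the crash is that the first batch appears twice.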


-Matthias

On 1/28/21 11:20 AM, nitin agarwal wrote:
> Hi,
> 
> By committing the offsets, I meant tracking the progress of how much data
> is read from the upstream system. In Kafka Connect this is being referred
> as committing the offsets.
> This is the method I was talking about
> https://github.com/a0x8o/kafka/blob/master/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L462-L567
> 
> My doubt is that what if the connector gets restarted or the node on which
> connector is running goes down just before flushing the offsets
> <https://github.com/a0x8o/kafka/blob/master/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L521>
> .
> 
> Thank you,
> Nitin

Re: Impact of setting max.in.flight.requests.per.connection > 1 in Kafka Connect

Posted by nitin agarwal <ni...@gmail.com>.
Hi,

By committing the offsets, I meant tracking the progress of how much data
has been read from the upstream system. In Kafka Connect this is referred
to as committing the offsets.
This is the method I was talking about:
https://github.com/a0x8o/kafka/blob/master/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L462-L567

My question is: what happens if the connector gets restarted, or the node on
which the connector is running goes down, just before flushing the offsets
(https://github.com/a0x8o/kafka/blob/master/connect/runtime/src/main/java/org/apache/kafka/connect/runtime/WorkerSourceTask.java#L521)?

Thank you,
Nitin



On Thu, Jan 28, 2021 at 9:54 PM Matthias J. Sax <mj...@apache.org> wrote:

> I don't know all details of Connect...
>
> However, not sure what you mean by "committing offsets"?
>
> A source connector takes data from an external data source and writes it
> into a Kafka topic. Thus, there should not be any offsets to be
> committed. (Committing offsets only applies if you read from a topic.)
>
> Instead, the "progress" how much data from the upstream system is read
> needs to be tracked. If done right (what I assume Connect does -- not
> sure if there might be a concrete connector dependency?) there should
> not be out-of-order data.
>
> But I hope that some Connect expert can chime in...
>
>
> -Matthias

Re: Impact of setting max.in.flight.requests.per.connection > 1 in Kafka Connect

Posted by "Matthias J. Sax" <mj...@apache.org>.
I don't know all the details of Connect...

However, I am not sure what you mean by "committing offsets".

A source connector takes data from an external data source and writes it
into a Kafka topic. Thus, there should not be any offsets to be
committed. (Committing offsets only applies if you read from a topic.)

Instead, the "progress" (how much data has been read from the upstream
system) needs to be tracked. If done right (which I assume Connect does; I am
not sure whether this depends on the concrete connector), there should
not be out-of-order data.

But I hope that some Connect expert can chime in...


-Matthias

On 1/28/21 12:24 AM, nitin agarwal wrote:
> Assuming the configurations are as follows:
> max.in.flight.requests.per.connection=1
> enable.idempotence=false
> 
> Thanks,
> Nitin

Re: Impact of setting max.in.flight.requests.per.connection > 1 in Kafka Connect

Posted by nitin agarwal <ni...@gmail.com>.
Assuming the configurations are as follows:
max.in.flight.requests.per.connection=1
enable.idempotence=false

Thanks,
Nitin


On Thu, Jan 28, 2021 at 1:53 PM nitin agarwal <ni...@gmail.com>
wrote:

> Thanks for quick reply, I have understood this behaviour now.
> I have another follow up question.
>
> Can the Source connector write out of order messages in a case where there
> is a failure in committing the offset and the connector is restarted at the
> same time?
>
> Thanks,
> Nitin

Re: Impact of setting max.in.flight.requests.per.connection > 1 in Kafka Connect

Posted by nitin agarwal <ni...@gmail.com>.
Thanks for the quick reply; I understand this behaviour now.
I have a follow-up question.

Can the source connector write out-of-order messages in a case where there
is a failure in committing the offset and the connector is restarted at the
same time?

Thanks,
Nitin

On Thu, Jan 28, 2021 at 8:06 AM Matthias J. Sax <mj...@apache.org> wrote:

> There should not be any data loss.
>
> However, if a request fails and is retried, it may lead to reordering of
> sends. Thus, records would not be ordered based on the `send()` calls
> any longer.
>
> If you would enable idempotent writes, ordering is guaranteed even with
> multiple in-flight requests per connection though.
>
>
>
> -Matthias

Re: Impact of setting max.in.flight.requests.per.connection > 1 in Kafka Connect

Posted by "Matthias J. Sax" <mj...@apache.org>.
There should not be any data loss.

However, if a request fails and is retried while other requests are in
flight, sends may be reordered. Records would then no longer be ordered
according to the `send()` calls.
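
The effect can be sketched with a small simulation. This is a toy model, not
the producer's actual implementation: the function `deliver` and its
parameters `max_in_flight` and `fail_once` are hypothetical, and each "round"
stands for a group of requests sent without waiting for acknowledgements.

```python
def deliver(batches, max_in_flight, fail_once):
    """Simulate a non-idempotent producer sending batches to one partition.

    batches: batch ids in send() order.
    max_in_flight: number of unacknowledged requests allowed at a time.
    fail_once: set of batch ids whose first attempt fails and is retried.
    Returns the order in which batches end up in the partition log.
    """
    log = []
    pending = list(batches)
    failed_before = set()
    while pending:
        window = pending[:max_in_flight]     # requests sent without waiting
        pending = pending[max_in_flight:]
        retries = []
        for batch in window:
            if batch in fail_once and batch not in failed_before:
                failed_before.add(batch)
                retries.append(batch)        # first attempt failed; retry later
            else:
                log.append(batch)            # broker appended the batch
        pending = retries + pending          # retries go out in the next round
    return log


print(deliver([1, 2, 3, 4], max_in_flight=1, fail_once={1}))  # [1, 2, 3, 4]
print(deliver([1, 2, 3, 4], max_in_flight=2, fail_once={1}))  # [2, 1, 3, 4]
```

With one in-flight request, the failed batch is retried before anything else
is sent, so order is preserved. With two, batch 2 is already appended by the
time batch 1 is retried.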

If you enable idempotent writes, however, ordering is guaranteed even with
multiple in-flight requests per connection.
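
A rough sketch of why idempotence helps, assuming a simplified model in which
every batch carries a sequence number and the broker only accepts the next
expected sequence. The function `deliver_idempotent` and its parameters are
hypothetical, and the real protocol has more moving parts (producer ids,
epochs, per-partition state) that are left out here.

```python
def deliver_idempotent(batches, max_in_flight, fail_once):
    """Simulate a producer whose batches carry sequence numbers.

    The broker appends a batch only if its sequence number is the next one
    it expects; anything else is rejected and resent in the original order.
    fail_once: set of batch ids whose first attempt fails transiently.
    Returns the order in which batches end up in the partition log.
    """
    log = []
    next_seq = 0                              # sequence the broker expects
    pending = list(enumerate(batches))        # (sequence number, batch id)
    failed_before = set()
    while pending:
        window = pending[:max_in_flight]
        pending = pending[max_in_flight:]
        retries = []
        for seq, batch in window:
            transient = batch in fail_once and batch not in failed_before
            if transient:
                failed_before.add(batch)
            if transient or seq != next_seq:
                retries.append((seq, batch))  # rejected; resend in order
            else:
                log.append(batch)
                next_seq += 1
        pending = retries + pending
    return log


print(deliver_idempotent([1, 2, 3, 4], max_in_flight=2, fail_once={1}))
# [1, 2, 3, 4]
```

Because batch 2 arrives with a sequence number the broker does not yet
expect, it is rejected along with the failed batch 1, and both are resent in
their original order.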



-Matthias

On 1/27/21 11:35 AM, nitin agarwal wrote:
> Hi All,
> 
> I see that max.in.flight.requests.per.connection is explicitly set to 1 in
> Kafka Connect, but there is a way to override it. I want to understand the
> impact of setting its value > 1.
> As per my understanding, it will lead to data loss in some cases. Is that
> correct?
> 
> 
> Thank you,
> Nitin
>