Posted to jira@kafka.apache.org by "Gabriel Giussi (Jira)" <ji...@apache.org> on 2022/08/26 13:49:00 UTC

[jira] [Updated] (KAFKA-14185) Broker allows transactions with generation.id -1 and could lead to duplicates

     [ https://issues.apache.org/jira/browse/KAFKA-14185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gabriel Giussi updated KAFKA-14185:
-----------------------------------
    Description: 
We were incorrectly getting a reference to ConsumerGroupMetadata before the first call to poll and holding that reference for the whole lifecycle of the consumer/producer pair we were creating. Since that reference was obtained before calling poll, the consumer had not joined the group yet, so the metadata carried a generation.id of -1.

The producer allows offsets to be sent with this generation.id, which can lead to duplicates because the fencing never happens.
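
A minimal sketch of why the cached reference stays at -1. This is a toy model written for illustration, not the Kafka client: `ToyConsumer`, `GroupMetadataSnapshot`, and the generation value 5 are hypothetical names and numbers, and the only assumption carried over from the report is that the metadata object is a snapshot taken at the moment it is requested.

```java
// Toy model only: these classes are hypothetical stand-ins, not Kafka classes.
final class GroupMetadataSnapshot {
    static final int UNKNOWN_GENERATION_ID = -1;
    final String groupId;
    final int generationId;

    GroupMetadataSnapshot(String groupId, int generationId) {
        this.groupId = groupId;
        this.generationId = generationId;
    }
}

final class ToyConsumer {
    private final String groupId;
    // Before the first poll the consumer has not joined the group.
    private int generationId = GroupMetadataSnapshot.UNKNOWN_GENERATION_ID;

    ToyConsumer(String groupId) {
        this.groupId = groupId;
    }

    // Simulates the first poll: the consumer joins and learns the generation.
    void poll(int generationFromCoordinator) {
        this.generationId = generationFromCoordinator;
    }

    // Returns an immutable snapshot of the state at call time.
    GroupMetadataSnapshot groupMetadata() {
        return new GroupMetadataSnapshot(groupId, generationId);
    }
}

final class StaleMetadataDemo {
    public static void main(String[] args) {
        ToyConsumer consumer = new ToyConsumer("my-group");

        // The mistake from the report: snapshot taken before the first poll.
        GroupMetadataSnapshot cached = consumer.groupMetadata();

        consumer.poll(5); // hypothetical generation assigned on join

        // Taking the snapshot after the join reflects the real generation.
        GroupMetadataSnapshot fresh = consumer.groupMetadata();

        System.out.println("cached generation.id = " + cached.generationId); // -1
        System.out.println("fresh generation.id  = " + fresh.generationId);  // 5
    }
}
```

With the real client, the equivalent fix is to call consumer.groupMetadata() right before each sendOffsetsToTransaction call instead of reusing a reference obtained at startup.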

The scenario I reproduced locally is the following:
1. Client A starts the consumer and the producer and holds a reference to the current groupMetadata, which has generation.id -1 since the consumer hasn't joined the group yet.
2. Client A joins the group and gets assigned partitions 0 and 1.
3. Client A polls a message with offset X from partition 1, produces to the output topic, and enters a long GC pause (before calling sendOffsetsToTransaction).
4. Client B starts the consumer and the producer, also getting a reference to groupMetadata with generation.id -1.
5. Client B joins the group and gets assigned partition 1.
6. Client B polls a message with offset X from partition 1, produces to the output topic, sends offsets with generation.id -1, and commits successfully.
7. Client A comes back, sends offsets with generation.id -1, and commits successfully.
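
The steps above can be sketched with a toy coordinator. This is not broker code: `LenientCoordinator`, the offset value 42 standing in for X, and the rule that a commit carrying generation.id -1 skips the generation check are assumptions taken from the behaviour described in this report.

```java
import java.util.ArrayList;
import java.util.List;

// Toy coordinator (hypothetical). Assumption from the report: a commit that
// carries generation.id -1 is accepted without any generation check.
final class LenientCoordinator {
    static final int UNKNOWN_GENERATION_ID = -1;
    private int currentGeneration = 0;
    private final List<String> committed = new ArrayList<>();

    // Every join triggers a rebalance and bumps the generation.
    int rebalance() {
        return ++currentGeneration;
    }

    boolean commitOffsets(String client, int generationId, long offset) {
        boolean fenced = generationId != UNKNOWN_GENERATION_ID
                && generationId != currentGeneration;
        if (fenced) {
            return false; // stale generation: the commit is rejected
        }
        committed.add(client + "@" + offset);
        return true;
    }

    List<String> committed() {
        return committed;
    }
}

final class DuplicateScenarioDemo {
    // Replays steps 1-7; returns the commits the coordinator accepted.
    static List<String> run(boolean useStaleMetadata) {
        LenientCoordinator coordinator = new LenientCoordinator();
        long offsetX = 42L; // hypothetical value for offset X

        // Steps 1-2: A caches -1, then joins (generation 1).
        int generationA = coordinator.rebalance();
        int sentByA = useStaleMetadata ? -1 : generationA;
        // Step 3: A processes offset X and pauses before sending offsets.

        // Steps 4-5: B caches -1, then joins (generation 2).
        int generationB = coordinator.rebalance();
        int sentByB = useStaleMetadata ? -1 : generationB;

        // Step 6: B processes offset X and commits.
        coordinator.commitOffsets("B", sentByB, offsetX);
        // Step 7: A wakes up and commits the same offset.
        coordinator.commitOffsets("A", sentByA, offsetX);

        return coordinator.committed();
    }

    public static void main(String[] args) {
        System.out.println(run(true));  // [B@42, A@42] -> duplicate output
        System.out.println(run(false)); // [B@42]       -> A is fenced
    }
}
```

When both clients send their real generation, A's late commit carries generation 1 while the group is at generation 2, so it is fenced. With the cached -1, both commits go through and the message at offset X is processed twice.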

Original thread in the mailing list: https://lists.apache.org/thread/hgmrxvx3f4kjxxcll2jhdb6zpzcvznx3

I think it would be nice to prevent this scenario by rejecting requests with a generation.id of -1, ideally in the broker.
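
A rough sketch of what such a check could look like, assuming a single validation step that sees the generation in the request and the group's current generation. `StrictGenerationCheck` and the `UNKNOWN_GENERATION_REJECTED` result are hypothetical names invented for illustration; this is not how the broker is implemented, and a real change would also have to consider callers that legitimately have no generation to send.

```java
// Hypothetical validation sketch, not broker code.
final class StrictGenerationCheck {
    static final int UNKNOWN_GENERATION_ID = -1;

    enum Result { OK, UNKNOWN_GENERATION_REJECTED, ILLEGAL_GENERATION }

    static Result validate(int requestGeneration, int currentGeneration) {
        // Proposed rule: never let -1 bypass the generation check.
        if (requestGeneration == UNKNOWN_GENERATION_ID) {
            return Result.UNKNOWN_GENERATION_REJECTED;
        }
        // Usual fencing: the request must match the group's generation.
        if (requestGeneration != currentGeneration) {
            return Result.ILLEGAL_GENERATION;
        }
        return Result.OK;
    }

    public static void main(String[] args) {
        System.out.println(validate(-1, 2)); // UNKNOWN_GENERATION_REJECTED
        System.out.println(validate(1, 2));  // ILLEGAL_GENERATION
        System.out.println(validate(2, 2));  // OK
    }
}
```

Under this rule, both commits in the scenario above would fail immediately, which would have surfaced the stale reference instead of silently producing duplicates.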

  was:
We were incorrectly getting a reference to ConsumerGroupMetadata before the first call to poll and holding that reference for the whole lifecycle of the consumer/producer pair we were creating. Since that reference was obtained before calling poll, the consumer had not joined the group yet, so the metadata carried a groupId of -1.

The producer allows offsets to be sent with this groupId, which can lead to duplicates because the fencing never happens.

The scenario I reproduced locally is the following:
1. Client A starts the consumer and the producer and holds a reference to the current groupMetadata, which has generation.id -1 since the consumer hasn't joined the group yet.
2. Client A joins the group and gets assigned partitions 0 and 1.
3. Client A polls a message with offset X from partition 1, produces to the output topic, and enters a long GC pause (before calling sendOffsetsToTransaction).
4. Client B starts the consumer and the producer, also getting a reference to groupMetadata with generation.id -1.
5. Client B joins the group and gets assigned partition 1.
6. Client B polls a message with offset X from partition 1, produces to the output topic, sends offsets with generation.id -1, and commits successfully.
7. Client A comes back, sends offsets with generation.id -1, and commits successfully.

Original thread in the mailing list: https://lists.apache.org/thread/hgmrxvx3f4kjxxcll2jhdb6zpzcvznx3

I think it would be nice to prevent this scenario by rejecting requests with a groupId of -1, ideally in the broker.


> Broker allows transactions with generation.id -1 and could lead to duplicates
> -----------------------------------------------------------------------------
>
>                 Key: KAFKA-14185
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14185
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 2.8.0
>            Reporter: Gabriel Giussi
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)