Posted to jira@kafka.apache.org by "Chetan (Jira)" <ji...@apache.org> on 2022/11/08 07:35:00 UTC

[jira] [Updated] (KAFKA-14366) Kafka consumer rebalance issue, offsets points back to very old committed offset

     [ https://issues.apache.org/jira/browse/KAFKA-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chetan updated KAFKA-14366:
---------------------------
    Description: 
Hi All,

We are facing an issue when our consumer clients restart (not every restart triggers it): during the resulting rebalance, the offset of one of the partitions sometimes jumps back a long way behind the committed offset.

Scenario :

Assume we have 4 consumer instances that are restarted one after the other.
 # When the restarts begin, assume the offset on partition 10 of the consumed topic points to 50000 (the last offset of the topic, with 0 lag).
 # Once the restarts (and the rebalancing) start, the offset suddenly points to 20000.
 # While the restarts are still in progress, the consumer assigned to that partition starts reading from 20000 and carries on.
 # Once all rebalancing has completed, all messages from offset 20000 to 50000 (where consumption had originally stopped) have been read again.
We end up with around 30K duplicates.
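
To make the arithmetic in the scenario concrete, here is a minimal sketch in plain Python. It is not Kafka client code: the function name and the two input dictionaries are hypothetical, and it assumes the committed offsets before the restart and the positions after the rebalance have already been collected by some other means. It only flags partitions whose position moved backwards and counts how many records would be read again.

```python
def find_offset_regressions(committed_before, position_after):
    """Return {partition: re_read_count} for partitions whose position moved backwards.

    committed_before: {partition: committed offset before the restarts} (hypothetical input)
    position_after:   {partition: offset the consumer resumed from after the rebalance}
    """
    regressions = {}
    for partition, committed in committed_before.items():
        position = position_after.get(partition)
        # A partition missing from position_after is skipped rather than guessed at.
        if position is not None and position < committed:
            regressions[partition] = committed - position
    return regressions


# Example numbers from the scenario: partition 10 was at 50000 and resumed
# from 20000. Partition 11 is a made-up partition that did not move.
before = {10: 50000, 11: 1200}
after = {10: 20000, 11: 1200}
regressions = find_offset_regressions(before, after)
print(regressions)
```

With the example numbers, only partition 10 is reported, with 50000 - 20000 = 30000 records read again, which matches the roughly 30K duplicates described.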

(The numbers here are just an example. In production we see a huge number of duplicates: roughly two out of every 10 restart exercises end in a rebalance that produces such duplicates. Not all partitions are affected; only one or two partitions behave this way, seemingly at random.)

This seems to be a bug. I am attaching all screenshots for reference as well.

Can someone kindly help out here?

  was:
Hi All,

We are facing an issue when our consumer clients restart (not every restart triggers it): during the resulting rebalance, the offset of one of the partitions sometimes jumps back a long way behind the committed offset.

Scenario :

Assume we have 4 consumer instances that are restarted one after the other.
 # When the restarts begin, assume the offset on partition 10 of the consumed topic points to 50000 (the last offset of the topic).
 # Once the restarts start, the offset suddenly points to 20000.
 # While the restarts are still in progress, the consumer assigned to that partition starts reading from 20000 and carries on.
 # Once all rebalancing has completed, all messages from offset 20000 to 50000 (where consumption had originally stopped) have been read again.
We end up with around 30K duplicates.

(The numbers here are just an example. In production we see a huge number of duplicates: roughly two out of every 10 restart exercises end in a rebalance that produces such duplicates. Not all partitions are affected; only one or two partitions behave this way, seemingly at random.)

This seems to be a bug. I am attaching all screenshots for reference as well.

Can someone kindly help out here?


> Kafka consumer rebalance issue, offsets points back to very old committed offset
> --------------------------------------------------------------------------------
>
>                 Key: KAFKA-14366
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14366
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer, offset manager
>    Affects Versions: 2.8.1
>         Environment: Production
>            Reporter: Chetan
>            Priority: Major
>         Attachments: rebalance issue.docx



--
This message was sent by Atlassian Jira
(v8.20.10#820010)