Posted to jira@kafka.apache.org by "Sergejs Andrejevs (Jira)" <ji...@apache.org> on 2021/07/23 19:43:00 UTC

[jira] [Updated] (KAFKA-13131) Consumer offsets lost during partition reassignment

     [ https://issues.apache.org/jira/browse/KAFKA-13131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sergejs Andrejevs updated KAFKA-13131:
--------------------------------------
    Description: 
While reassigning the replicas of a *___consumer_offsets_* partition from one set of brokers to another, the consumer group offsets were lost (they appear to have been reset to earliest).
 
offsets.retention.minutes: 10080
Consumers are reading constantly and commit offsets regularly.
Initial setup:
 __consumer_offsets-18 
 Replicas: 9,7,6

Desired setup:
 __consumer_offsets-18
 Replicas: 11,10,5
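
To help identify which consumer groups are stored on the affected partition: as far as I understand, the broker maps a group to a __consumer_offsets partition as abs(groupId.hashCode) % offsets.topic.num.partitions. The following is a minimal sketch of that mapping, not the broker's code. It assumes the default of 50 partitions (the actual partition count of this cluster is not stated in this report), and the function names and group ids are hypothetical.

```python
def java_string_hash(s):
    """Replicate java.lang.String.hashCode() as a signed 32-bit integer."""
    h = 0
    # Java hashes UTF-16 code units, so encode first to cover non-BMP characters.
    data = s.encode("utf-16-be")
    for i in range(0, len(data), 2):
        unit = (data[i] << 8) | data[i + 1]
        h = (31 * h + unit) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value to Java's signed int range.
    return h - 0x100000000 if h >= 0x80000000 else h


def offsets_partition_for(group_id, num_partitions=50):
    """Assumed mapping: abs(hashCode) % partition count (50 is the Kafka default)."""
    h = java_string_hash(group_id)
    # Integer.MIN_VALUE has no positive counterpart in Java; treat it as 0.
    positive = 0 if h == -0x80000000 else abs(h)
    return positive % num_partitions


# Hypothetical group ids; replace with the real ones from the cluster.
candidate_groups = ["orders-service", "billing-service", "audit-service"]
affected = [g for g in candidate_groups if offsets_partition_for(g) == 18]
print(affected)
```

Groups that map to partition 18 are the ones whose offsets would be loaded by the new coordinator after the reassignment.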

File_with_desired_state:
{code:java}
{
  "version": 1,
  "partitions": [
    {
      "topic": "__consumer_offsets",
      "partition": 18,
      "replicas": [
        11,
        10,
        5
      ],
      "log_dirs": [
        "/path_replica_1",
        "/path_replica_2",
        "/path_replica_3"
      ]
    }
  ]
}
{code}
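Before running the reassignment, a file like the one above can be sanity-checked. The sketch below is only an illustration under my own assumptions: the helper name check_reassignment is hypothetical, and the only rules it checks are that the version is 1, that broker ids are not repeated, and that "log_dirs", when present, has one entry per replica.

```python
import json

# The reassignment file from this report, with its placeholder log directories.
PLAN_JSON = """
{
  "version": 1,
  "partitions": [
    {
      "topic": "__consumer_offsets",
      "partition": 18,
      "replicas": [11, 10, 5],
      "log_dirs": ["/path_replica_1", "/path_replica_2", "/path_replica_3"]
    }
  ]
}
"""


def check_reassignment(doc):
    """Return a list of human-readable problems found in a reassignment plan."""
    problems = []
    if doc.get("version") != 1:
        problems.append("unexpected version: %r" % doc.get("version"))
    for entry in doc.get("partitions", []):
        name = "%s-%s" % (entry.get("topic"), entry.get("partition"))
        replicas = entry.get("replicas", [])
        if len(set(replicas)) != len(replicas):
            problems.append("%s: duplicate broker ids in replicas" % name)
        log_dirs = entry.get("log_dirs")
        # log_dirs is optional, but when given it needs one entry per replica.
        if log_dirs is not None and len(log_dirs) != len(replicas):
            problems.append("%s: log_dirs and replicas differ in length" % name)
    return problems


plan = json.loads(PLAN_JSON)
print(check_reassignment(plan))  # prints []
```

An empty list means the file passed these basic checks; it says nothing about whether the target brokers or directories exist.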
Reassignment command:
{code:java}
/opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --execute --reassignment-json-file File_with_desired_state --throttle 104857600 --replica-alter-log-dirs-throttle 104857600
{code}
The error in logs at the broker:
{code:java}
[2021-07-22 05:28:11,777] ERROR [GroupMetadataManager brokerId=11] Error loading offsets from __consumer_offsets-18 (kafka.coordinator.group.GroupMetadataManager)
java.lang.NullPointerException
        at kafka.log.OffsetIndex.$anonfun$lookup$1(OffsetIndex.scala:90)
        at kafka.log.AbstractIndex.maybeLock(AbstractIndex.scala:338)
        at kafka.log.OffsetIndex.lookup(OffsetIndex.scala:89)
        at kafka.log.LogSegment.translateOffset(LogSegment.scala:274)
        at kafka.log.LogSegment.read(LogSegment.scala:298)
        at kafka.log.Log.$anonfun$read$2(Log.scala:1522)
        at kafka.log.Log.read(Log.scala:2340)
        at kafka.coordinator.group.GroupMetadataManager.loadGroupsAndOffsets(GroupMetadataManager.scala:589)
        at kafka.coordinator.group.GroupMetadataManager.$anonfun$scheduleLoadGroupAndOffsets$2(GroupMetadataManager.scala:537)
        at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{code}
We have tried to reproduce this in test environments, but so far without success.

Let me know if any other configuration, parameters, or details should be added.

  was:
While reassigning the replicas of a *___consumer_offsets_* partition from one set of brokers to another, the consumer group offsets were lost (they appear to have been reset to earliest).

Initial setup:
 __consumer_offsets-18 
 Replicas: 9,7,6

Desired setup:
 __consumer_offsets-18
 Replicas: 11,10,5

File_with_desired_state:
{code:java}
{
  "version": 1,
  "partitions": [
    {
      "topic": "__consumer_offsets",
      "partition": 18,
      "replicas": [
        11,
        10,
        5
      ],
      "log_dirs": [
        "/path_replica_1",
        "/path_replica_2",
        "/path_replica_3"
      ]
    }
  ]
}
{code}
Reassignment command:
{code:java}
/opt/kafka/bin/kafka-reassign-partitions.sh --bootstrap-server localhost:9092 --execute --reassignment-json-file File_with_desired_state --throttle 104857600 --replica-alter-log-dirs-throttle 104857600
{code}
The error in logs at the broker:
{code:java}
[2021-07-22 05:28:11,777] ERROR [GroupMetadataManager brokerId=11] Error loading offsets from __consumer_offsets-18 (kafka.coordinator.group.GroupMetadataManager)
java.lang.NullPointerException
        at kafka.log.OffsetIndex.$anonfun$lookup$1(OffsetIndex.scala:90)
        at kafka.log.AbstractIndex.maybeLock(AbstractIndex.scala:338)
        at kafka.log.OffsetIndex.lookup(OffsetIndex.scala:89)
        at kafka.log.LogSegment.translateOffset(LogSegment.scala:274)
        at kafka.log.LogSegment.read(LogSegment.scala:298)
        at kafka.log.Log.$anonfun$read$2(Log.scala:1522)
        at kafka.log.Log.read(Log.scala:2340)
        at kafka.coordinator.group.GroupMetadataManager.loadGroupsAndOffsets(GroupMetadataManager.scala:589)
        at kafka.coordinator.group.GroupMetadataManager.$anonfun$scheduleLoadGroupAndOffsets$2(GroupMetadataManager.scala:537)
        at kafka.utils.KafkaScheduler.$anonfun$schedule$2(KafkaScheduler.scala:114)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
{code}
We have tried to reproduce this in test environments, but so far without success.

Let me know if any other configuration, parameters, or details should be added.


> Consumer offsets lost during partition reassignment
> ---------------------------------------------------
>
>                 Key: KAFKA-13131
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13131
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 2.6.0
>            Reporter: Sergejs Andrejevs
>            Priority: Major


