Posted to dev@kafka.apache.org by "Anna Povzner (JIRA)" <ji...@apache.org> on 2018/05/31 18:36:00 UTC
[jira] [Created] (KAFKA-6975) AdminClient.deleteRecords() may cause
replicas unable to fetch from beginning
Anna Povzner created KAFKA-6975:
-----------------------------------
Summary: AdminClient.deleteRecords() may cause replicas unable to fetch from beginning
Key: KAFKA-6975
URL: https://issues.apache.org/jira/browse/KAFKA-6975
Project: Kafka
Issue Type: Bug
Affects Versions: 1.1.0
Reporter: Anna Povzner
Assignee: Anna Povzner
AdminClient.deleteRecords(beforeOffset(offset)) sets the log start offset to the requested offset. If the requested offset falls in the middle of a batch, a replica will not be able to fetch from that offset, because the batch the leader returns for it has a base offset below the log start offset.
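
To make the mid-batch condition concrete, here is a minimal sketch in plain Java. It is a toy model, not Kafka code: the class BatchAlignmentSketch, the Batch type, and the sample offsets are all assumptions made up for illustration. It only shows that a fetch at an offset inside a batch is answered with a batch whose base offset is lower than the offset requested.

```java
import java.util.List;

/** Toy model of a partition log made of record batches; hypothetical, not Kafka's classes. */
final class BatchAlignmentSketch {

    /** A record batch covering offsets [baseOffset, lastOffset]. */
    static final class Batch {
        final long baseOffset;
        final long lastOffset;

        Batch(long baseOffset, long lastOffset) {
            this.baseOffset = baseOffset;
            this.lastOffset = lastOffset;
        }
    }

    /** Returns the batch that contains the given offset, or null if no batch does. */
    static Batch batchContaining(List<Batch> log, long offset) {
        for (Batch b : log) {
            if (offset >= b.baseOffset && offset <= b.lastOffset) {
                return b;
            }
        }
        return null;
    }

    /** True when the offset lies inside a batch but is not that batch's base offset. */
    static boolean isMidBatch(List<Batch> log, long offset) {
        Batch b = batchContaining(log, offset);
        return b != null && b.baseOffset < offset;
    }

    public static void main(String[] args) {
        // Assumed sample log: three batches of ten records each.
        List<Batch> log = List.of(new Batch(0, 9), new Batch(10, 19), new Batch(20, 29));

        // As if deleteRecords(beforeOffset(15)) had moved the log start offset to 15.
        long logStartOffset = 15;

        Batch served = batchContaining(log, logStartOffset);
        System.out.println("fetch at " + logStartOffset
                + " is served from the batch with base offset " + served.baseOffset);
        System.out.println("mid-batch: " + isMidBatch(log, logStartOffset));
    }
}
```

In this model, a log start offset of 10 or 20 would be batch-aligned and isMidBatch would return false, which corresponds to the case where the problem does not appear.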
One use case where this could cause problems is replica re-assignment. Suppose we have a topic partition with 3 initial replicas, and at some point the user issues AdminClient.deleteRecords() for an offset that falls in the middle of a batch. That offset now becomes the log start offset for this topic partition. Suppose that at some later time the user starts partition re-assignment to 3 new replicas. The new replicas (followers) will start with HW = 0 and will try to fetch from 0, then get "out of order offset" because 0 < log start offset (LSO). The follower will be able to reset its offset to the LSO of the leader and fetch from LSO, but the leader will respond with a batch whose base offset is < LSO, which will cause "out of order offset" on the follower and stop the fetcher thread. The end result is that the new replicas will not be able to start fetching unless LSO moves to an offset that is not in the middle of a batch, and the re-assignment will be stuck for a possibly very long time.
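
The sequence described above can be walked through with a small simulation. This is a simplified sketch under stated assumptions, not the actual replica fetcher logic: ReassignmentSketch, runFollower, and the Outcome enum are hypothetical names, batches are modeled as {baseOffset, lastOffset} pairs, and the follower's validation is reduced to "reject a batch whose base offset is below the current fetch offset".

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified simulation of a new follower catching up; hypothetical, not Kafka's fetcher. */
final class ReassignmentSketch {

    enum Outcome { CAUGHT_UP, FETCHER_STOPPED }

    /**
     * Simulates a new follower fetching from a leader.
     * Each entry of leaderBatches is a {baseOffset, lastOffset} pair.
     */
    static Outcome runFollower(long[][] leaderBatches, long leaderLogStartOffset, List<String> trace) {
        long fetchOffset = 0; // a brand-new replica starts fetching from offset 0

        if (fetchOffset < leaderLogStartOffset) {
            trace.add("fetch at " + fetchOffset + " rejected: below leader log start offset "
                    + leaderLogStartOffset);
            fetchOffset = leaderLogStartOffset; // follower resets to the leader's log start offset
        }

        for (long[] batch : leaderBatches) {
            long base = batch[0];
            long last = batch[1];
            if (last < fetchOffset) {
                continue; // batch lies entirely before the fetch position
            }
            // The leader returns whole batches, so the batch may start before fetchOffset.
            if (base < fetchOffset) {
                trace.add("batch " + base + "-" + last + " rejected: base offset below "
                        + fetchOffset + ", fetcher stops");
                return Outcome.FETCHER_STOPPED;
            }
            trace.add("appended batch " + base + "-" + last);
            fetchOffset = last + 1;
        }
        return Outcome.CAUGHT_UP;
    }

    public static void main(String[] args) {
        long[][] batches = {{0, 9}, {10, 19}, {20, 29}}; // assumed sample data

        List<String> trace = new ArrayList<>();
        Outcome stuck = runFollower(batches, 15, trace); // log start offset in the middle of 10-19
        trace.forEach(System.out::println);
        System.out.println("log start offset 15: " + stuck);

        Outcome ok = runFollower(batches, 10, new ArrayList<>()); // batch-aligned log start offset
        System.out.println("log start offset 10: " + ok);
    }
}
```

With the sample data, a log start offset of 15 ends in FETCHER_STOPPED while a batch-aligned log start offset of 10 ends in CAUGHT_UP, which mirrors the observation that the new replicas stay stuck until the log start offset moves to a batch boundary.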
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)