Posted to dev@kafka.apache.org by "Calvin Liu (Jira)" <ji...@apache.org> on 2023/07/19 23:09:00 UTC
[jira] [Created] (KAFKA-15221) Potential race condition between requests from rebooted followers
Calvin Liu created KAFKA-15221:
----------------------------------
Summary: Potential race condition between requests from rebooted followers
Key: KAFKA-15221
URL: https://issues.apache.org/jira/browse/KAFKA-15221
Project: Kafka
Issue Type: Bug
Affects Versions: 3.5.0
Reporter: Calvin Liu
Assignee: Calvin Liu
Fix For: 3.6.0, 3.5.1
When the leader processes a fetch request, it does not acquire a lock when updating the replica fetch state. As a result, a fetch request sent by a follower before it rebooted can race with one sent after the reboot:
T0, broker 1 sends a fetch request to broker 0 (the leader). At this moment, broker 1 is not in the ISR.
T1, broker 1 crashes.
T2, broker 1 comes back online and receives a new broker epoch. It then sends a new fetch request.
T3, broker 0 receives the old (pre-crash) fetch request and decides to expand the ISR.
T4, right before broker 0 starts to fill in the AlterPartition request, the new fetch request arrives and overwrites the fetch state. Broker 0 then uses the new broker epoch in the AlterPartition request.
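In this way, the AlterPartition request carries a broker epoch that looks current even though the decision to expand the ISR was based on the pre-crash fetch, so it gets around the stale broker epoch check from KIP-903 and wrongly updates the ISR.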
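
The interleaving at T3/T4 can be reproduced deterministically with a small model. The sketch below is illustrative only: IsrRaceSketch, FetchState, Replica, racyEpoch and snapshotEpoch are hypothetical names, not Kafka classes, and the "snapshot" variant is just one possible way to close the window, not necessarily the fix that was merged. It assumes the leader keeps one mutable fetch state per follower and reads the broker epoch from it when building the AlterPartition request.

```java
import java.util.concurrent.atomic.AtomicReference;

// Illustrative model only; none of these names are Kafka classes.
class IsrRaceSketch {

    // Immutable view of what the leader last recorded for one follower.
    public static final class FetchState {
        public final long brokerEpoch;
        public final long logEndOffset;

        public FetchState(long brokerEpoch, long logEndOffset) {
            this.brokerEpoch = brokerEpoch;
            this.logEndOffset = logEndOffset;
        }
    }

    // Leader-side record for one follower; every fetch overwrites the state.
    public static final class Replica {
        private final AtomicReference<FetchState> state =
                new AtomicReference<>(new FetchState(-1L, 0L));

        public void updateFetchState(long brokerEpoch, long logEndOffset) {
            state.set(new FetchState(brokerEpoch, logEndOffset));
        }

        public FetchState current() {
            return state.get();
        }
    }

    // Racy variant: the state is read again when the request is built, so a
    // fetch that lands in between (T4) changes the epoch that gets sent.
    public static long racyEpoch(Replica follower, Runnable concurrentFetch) {
        follower.current();                      // T3: decide to expand the ISR
        concurrentFetch.run();                   // T4: new fetch overwrites the state
        return follower.current().brokerEpoch;   // epoch no longer matches the decision
    }

    // Snapshot variant: the epoch is captured together with the decision,
    // so a later overwrite cannot change what the request carries.
    public static long snapshotEpoch(Replica follower, Runnable concurrentFetch) {
        FetchState decidedOn = follower.current(); // T3: decision and epoch captured together
        concurrentFetch.run();                     // T4: new fetch overwrites the state
        return decidedOn.brokerEpoch;              // still the pre-crash epoch
    }

    public static void main(String[] args) {
        Replica racy = new Replica();
        racy.updateFetchState(1L, 100L);           // old fetch, sent before the crash
        System.out.println(racyEpoch(racy, () -> racy.updateFetchState(2L, 0L)));

        Replica safe = new Replica();
        safe.updateFetchState(1L, 100L);
        System.out.println(snapshotEpoch(safe, () -> safe.updateFetchState(2L, 0L)));
    }
}
```

In this model the racy variant returns the post-reboot epoch (2), which a controller would accept as current, while the snapshot variant returns the pre-crash epoch (1), which a stale-epoch check would be able to reject.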
--
This message was sent by Atlassian Jira
(v8.20.10#820010)