Posted to dev@kafka.apache.org by "A. Sophie Blee-Goldman (Jira)" <ji...@apache.org> on 2020/10/30 01:49:00 UTC
[jira] [Created] (KAFKA-10664) Streams fails to overwrite corrupted
offsets leading to infinite OffsetOutOfRangeException loop
A. Sophie Blee-Goldman created KAFKA-10664:
----------------------------------------------
Summary: Streams fails to overwrite corrupted offsets leading to infinite OffsetOutOfRangeException loop
Key: KAFKA-10664
URL: https://issues.apache.org/jira/browse/KAFKA-10664
Project: Kafka
Issue Type: Bug
Components: streams
Affects Versions: 2.7.0
Reporter: A. Sophie Blee-Goldman
Assignee: A. Sophie Blee-Goldman
Fix For: 2.7.0
In KAFKA-10391 we fixed an issue where Streams could get stuck in an infinite loop of OffsetOutOfRangeException/TaskCorruptedException due to re-initializing the corrupted offsets from the checkpoint after each revival. The fix we applied was to remove the corrupted offsets from the state manager and then force it to write a new checkpoint file without those offsets during revival.
Unfortunately we missed that there's an optimization in OffsetCheckpoint#write that just returns without writing anything when there are no offsets. So if a task doesn't have any offsets that _aren't_ corrupted, it will skip overwriting the corrupted checkpoint, and the corrupted offsets will be re-initialized from the old file on the next revival.
Probably we should just fix the optimization in OffsetCheckpoint so that it deletes the current checkpoint file when there are no offsets to write.
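
A minimal sketch of the proposed behavior, for illustration only. CheckpointSketch is a hypothetical, simplified stand-in and not the real OffsetCheckpoint class; the string keys, the one-entry-per-line file format, and the temp-file naming are all assumptions. The only point it makes is that an empty offsets map deletes any existing file instead of returning early and leaving stale contents behind.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified stand-in for a checkpoint file.
// This is NOT the real Kafka Streams OffsetCheckpoint implementation.
final class CheckpointSketch {
    private final Path file;

    CheckpointSketch(Path file) {
        this.file = file;
    }

    // Writes the given offsets, or deletes the existing file when there is
    // nothing to write. An early "return" without the delete would leave a
    // previously written (possibly corrupted) checkpoint on disk.
    synchronized void write(Map<String, Long> offsets) throws IOException {
        if (offsets.isEmpty()) {
            Files.deleteIfExists(file);
            return;
        }
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Long> entry : offsets.entrySet()) {
            lines.add(entry.getKey() + " " + entry.getValue());
        }
        // Write to a temporary sibling first, then swap it into place.
        Path tmp = file.resolveSibling(file.getFileName() + ".tmp");
        Files.write(tmp, lines, StandardCharsets.UTF_8);
        Files.move(tmp, file, StandardCopyOption.REPLACE_EXISTING);
    }

    // Returns the stored offsets, or an empty map when no file exists.
    synchronized Map<String, Long> read() throws IOException {
        Map<String, Long> offsets = new HashMap<>();
        if (!Files.exists(file)) {
            return offsets;
        }
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            int idx = line.lastIndexOf(' ');
            offsets.put(line.substring(0, idx), Long.parseLong(line.substring(idx + 1)));
        }
        return offsets;
    }
}
```

With this shape, a task whose offsets were all removed as corrupted ends up with no checkpoint file at all, so a later read() yields an empty map rather than the stale offsets.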
--
This message was sent by Atlassian Jira
(v8.3.4#803005)