You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/02/26 13:59:48 UTC

[GitHub] [kafka] ijuma commented on a change in pull request #8812: KAFKA-10101: Fix edge cases in Log.recoverLog and LogManager.loadLogs

ijuma commented on a change in pull request #8812:
URL: https://github.com/apache/kafka/pull/8812#discussion_r583656545



##########
File path: core/src/main/scala/kafka/log/Log.scala
##########
@@ -826,8 +832,16 @@ class Log(@volatile private var _dir: File,
         preallocate = config.preallocate))
     }
 
-    recoveryPoint = activeSegment.readNextOffset

Review comment:
       @purplefox To clarify, the case you are talking about is:
   
   1. A catastrophic scenario where a partition is offline (i.e. all replicas are down)
   2. The OS was not shutdown
   3. The OS did not flush the data to disk (this typically happens irrespective of our flushes due to OS configurations)
   4. The replica that was the last member of the ISR comes back up, registers, to ZK and the Controller makes it the
   leader (since it was the last member of the ISR, if it was a different replica, it won't be given leadership without
   unclean leader election)
   5. The hw is beyond the flushed segments
   5. Consumer fetches the data beyond the flushed segments
   6. OS hard dies
   
   This is an interesting edge case, it seems incredibly unlikely, but possible if the hw can be beyond the flushed segments. @junrao & @hachikuji are we missing any detail that protects us against this?
   
   The following has a similar impact:
   1. Leader accepts a write
   2. Write is replicated, hw is incremented, but data is not flushed
   3. All replicas die, but the ISR is not shrunk yet
   4. Leader receives a write, accepts it, replication doesn't happen since replicas are gone
   5. ISR is shrunk, hw is incremented
   6. Producer won't receive a successful ack given min.isr=2, but the consumer reads data that is only in the leader
   7. Leader crashes and the unflushed data is gone (or hard disk dies and all the data in the leader is gone)
   
   Flushing the segments during recovery helps on some scenarios, but not the one I just mentioned (assuming I am not missing anything). @hachikuji had a "strict min isr" proposal where the ISR is never allowed to shrink below `min.isr`. I haven't thought about all the details, but perhaps that covers both issues. Thoughts @hachikuji?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org