Posted to jira@kafka.apache.org by GitBox <gi...@apache.org> on 2021/11/20 04:03:23 UTC

[GitHub] [kafka] functioner opened a new pull request #11520: KAFKA-13468: Consumers may hang because IOException in Log# does not trigger KafkaStorageException

functioner opened a new pull request #11520:
URL: https://github.com/apache/kafka/pull/11520


   A patch for [KAFKA-13468](https://issues.apache.org/jira/browse/KAFKA-13468)
   
   P.S. We found and described this issue in Kafka 2.8.0 and confirmed that it affects versions 2.8.0 through 3.0.0. On the trunk branch, `core/src/main/scala/kafka/log/Log.scala` has been renamed to `core/src/main/scala/kafka/log/UnifiedLog.scala`, with some small code changes, but the issue is still present. This pull request therefore targets trunk, and its fix differs slightly from the fix for 2.8.0. We also propose applying the fix to versions 2.8.0, 3.0.0, etc. in a separate pull request.
   
   An open question: currently we only catch the IOException and throw a new KafkaStorageException. We are considering whether we should also call `logDirFailureChannel.maybeAddOfflineLogDir` to handle the IOException, as is done in https://github.com/apache/kafka/blob/ebb1d6e21cc9213071ee1c6a15ec3411fc215b81/core/src/main/scala/kafka/server/checkpoints/CheckpointFile.scala#L92-L120 and https://github.com/apache/kafka/blob/ebb1d6e21cc9213071ee1c6a15ec3411fc215b81/core/src/main/scala/kafka/server/checkpoints/CheckpointFile.scala#L126-L139
   If so, `logDirFailureChannel.maybeAddOfflineLogDir` would crash the node according to the protocol in https://github.com/apache/kafka/blob/ebb1d6e21cc9213071ee1c6a15ec3411fc215b81/core/src/main/scala/kafka/server/ReplicaManager.scala#L268-L277 and https://github.com/apache/kafka/blob/ebb1d6e21cc9213071ee1c6a15ec3411fc215b81/core/src/main/scala/kafka/server/ReplicaManager.scala#L327-L332
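
To make the two options concrete, here is a minimal, self-contained Java sketch of the pattern under discussion: catch the `IOException`, report the directory to a failure channel, and rethrow as a storage exception. It is not the patch itself and does not use Kafka's classes. `StorageException`, `FailureChannel`, `IoAction`, and `handleIo` are hypothetical stand-ins, and the three-argument shape of `maybeAddOfflineLogDir` here is an assumption made for illustration.

```java
import java.io.IOException;
import java.util.Set;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

class LogDirFailureSketch {
    /** Hypothetical stand-in for KafkaStorageException; not the real class. */
    static class StorageException extends RuntimeException {
        StorageException(String message, Throwable cause) {
            super(message, cause);
        }
    }

    /** Hypothetical stand-in for LogDirFailureChannel: records each failed dir once. */
    static class FailureChannel {
        private final Set<String> offlineDirs = ConcurrentHashMap.newKeySet();
        private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

        // Assumed signature, for illustration only.
        void maybeAddOfflineLogDir(String logDir, String msg, IOException e) {
            // Enqueue the directory only the first time it is reported.
            if (offlineDirs.add(logDir)) {
                queue.add(logDir);
            }
        }

        boolean isOffline(String logDir) {
            return offlineDirs.contains(logDir);
        }

        int pending() {
            return queue.size();
        }
    }

    /** An I/O operation that may throw IOException. */
    @FunctionalInterface
    interface IoAction<T> {
        T run() throws IOException;
    }

    /** Runs the action; on IOException, reports the dir and rethrows as StorageException. */
    static <T> T handleIo(FailureChannel channel, String logDir, String msg, IoAction<T> action) {
        try {
            return action.run();
        } catch (IOException e) {
            channel.maybeAddOfflineLogDir(logDir, msg, e);
            throw new StorageException(msg, e);
        }
    }

    public static void main(String[] args) {
        FailureChannel channel = new FailureChannel();
        try {
            handleIo(channel, "/tmp/kafka-logs", "Error while reading segment", () -> {
                throw new IOException("simulated disk failure");
            });
        } catch (StorageException e) {
            System.out.println("caught: " + e.getMessage());
        }
        System.out.println("offline: " + channel.isOffline("/tmp/kafka-logs"));
    }
}
```

Dropping the `maybeAddOfflineLogDir` call from `handleIo` gives the catch-and-rethrow-only variant, so the question above amounts to whether the failure should also be recorded for whatever component consumes the channel.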


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscribe@kafka.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [kafka] functioner commented on pull request #11520: KAFKA-13468: Consumers may hang because IOException in Log# does not trigger KafkaStorageException

Posted by GitBox <gi...@apache.org>.
functioner commented on pull request #11520:
URL: https://github.com/apache/kafka/pull/11520#issuecomment-978812208


   @dajac do you have time to review this patch?
   I think it is an exception-handling bug similar to https://github.com/apache/kafka/pull/11504, which was merged yesterday.





[GitHub] [kafka] hachikuji commented on pull request #11520: KAFKA-13468: Consumers may hang because IOException in Log# does not trigger KafkaStorageException

Posted by GitBox <gi...@apache.org>.
hachikuji commented on pull request #11520:
URL: https://github.com/apache/kafka/pull/11520#issuecomment-1034437206


   @functioner Good find. It does indeed look like the `IOException` raised from `LogManager.getOrCreateLog` is not caught anywhere. In addition to catching it, we probably also need to add the dir to `LogDirFailureChannel`, as you suggested.

