
Posted to jira@kafka.apache.org by "Jason Gustafson (JIRA)" <ji...@apache.org> on 2018/02/26 23:11:00 UTC

[jira] [Commented] (KAFKA-6593) Coordinator disconnect in heartbeat thread can cause commitSync to block indefinitely

    [ https://issues.apache.org/jira/browse/KAFKA-6593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16377764#comment-16377764 ] 

Jason Gustafson commented on KAFKA-6593:
----------------------------------------

I was wrong about the explanation. The {{ConsumerNetworkClient}} already has some protection from this kind of scenario (e.g. it limits the maximum poll time to 5 seconds). So something else is going on in the case that I'm looking at.

> Coordinator disconnect in heartbeat thread can cause commitSync to block indefinitely
> -------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6593
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6593
>             Project: Kafka
>          Issue Type: Bug
>          Components: consumer
>    Affects Versions: 1.0.0, 0.11.0.2
>            Reporter: Jason Gustafson
>            Assignee: Jason Gustafson
>            Priority: Major
>             Fix For: 1.1.0
>
>
> If a coordinator disconnect is observed in the heartbeat thread, it can cause a pending offset commit to be cancelled just before the foreground thread begins waiting on its response in poll(). Since the poll timeout is Long.MAX_VALUE, the consumer effectively hangs until some other network event causes poll() to return. We try to protect against this case with a poll condition on the future, but this isn't bulletproof since the future can be completed outside of the lock.
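
The race described above, together with the kind of bounded-wait protection mentioned in the comment, can be sketched in plain Java. This is only an illustration under stated assumptions, not the Kafka client code: the names BoundedPollSketch, completeOutsideLock and awaitDone are hypothetical, a plain monitor stands in for the network poll, and the 100 ms cap stands in for the 5 second limit. The background thread completes the "future" without taking the lock or notifying anyone, so a waiter that had already checked the flag would never be woken by it; capping each wait means the flag gets re-checked anyway.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch; none of these names come from the Kafka code base.
public class BoundedPollSketch {
    private final Object lock = new Object();
    private final AtomicBoolean done = new AtomicBoolean(false);

    // Stands in for the heartbeat thread cancelling the pending commit:
    // the flag is set WITHOUT holding the lock and WITHOUT a notify.
    public void completeOutsideLock() {
        done.set(true);
    }

    // Stands in for the foreground thread waiting on the commit response.
    // Each individual wait is capped at maxWaitMs, so a completion that
    // never triggers a wakeup is still observed on the next loop pass.
    // Returns false if deadlineMs elapses before the flag is set.
    public boolean awaitDone(long maxWaitMs, long deadlineMs) throws InterruptedException {
        if (maxWaitMs <= 0) {
            // wait(0) would block indefinitely, which is the hang being avoided
            throw new IllegalArgumentException("maxWaitMs must be positive");
        }
        long deadline = System.nanoTime() + deadlineMs * 1_000_000L;
        synchronized (lock) {
            while (!done.get()) {
                long remainingMs = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMs <= 0) {
                    return false;
                }
                lock.wait(Math.min(maxWaitMs, remainingMs));
            }
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        BoundedPollSketch sketch = new BoundedPollSketch();
        Thread heartbeat = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            sketch.completeOutsideLock();
        });
        heartbeat.start();
        boolean completed = sketch.awaitDone(100, 5_000);
        heartbeat.join();
        System.out.println("completed=" + completed);
    }
}
```

If the loop called lock.wait() with no timeout instead, the same program would be expected to hang, since nothing ever notifies the waiter after the flag is set.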


