Posted to dev@kafka.apache.org by "Colin P. McCabe (JIRA)" <ji...@apache.org> on 2017/05/03 17:23:04 UTC

[jira] [Commented] (KAFKA-5004) poll() timeout not enforced when connecting to 0.10.0 broker

    [ https://issues.apache.org/jira/browse/KAFKA-5004?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15995257#comment-15995257 ] 

Colin P. McCabe commented on KAFKA-5004:
----------------------------------------

Thanks for filing this, [~mjsax].  I think the severity is mitigated somewhat by the fact that there has to be a client-side bug (polling thread dies) to trigger the bad behavior.

bq. IMHO, a "clean" solution would be to disable the heartbeat thread if the client connects to a 0.10.0 broker and to send heartbeats on poll() as the 0.10.0 consumer does. Not sure how complex this would be to do, though.

I think this would be a bit risky since we'd be adding code that only ever gets used in a very obscure error path when talking to 0.10.0 brokers.  It's not likely to be well-tested.

bq. [~cmccabe] had the idea to set a "flag" on the heartbeat thread each time poll() is called, and let the heartbeat thread stop if max.poll.interval.ms has passed and the flag has not been "renewed".

Yeah, this might be a good option.
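
To make the flag idea a bit more concrete, here is a rough sketch. The class and method names are hypothetical and are not existing Kafka client code; it only assumes that the polling thread can record a timestamp on each poll() and that the heartbeat thread can check it before sending a heartbeat.

```java
import java.util.concurrent.atomic.AtomicLong;

/**
 * Hypothetical helper (not part of the Kafka client): tracks when poll()
 * was last called so the heartbeat thread can stop itself once
 * max.poll.interval.ms has elapsed without a new poll().
 */
final class PollLivenessTracker {
    private final long maxPollIntervalMs;
    // Written by the polling thread, read by the heartbeat thread.
    private final AtomicLong lastPollMs;

    PollLivenessTracker(long maxPollIntervalMs, long nowMs) {
        this.maxPollIntervalMs = maxPollIntervalMs;
        this.lastPollMs = new AtomicLong(nowMs);
    }

    /** Called from poll(): "renews" the flag. */
    void recordPoll(long nowMs) {
        lastPollMs.set(nowMs);
    }

    /** Called by the heartbeat thread before each heartbeat. */
    boolean shouldHeartbeat(long nowMs) {
        return nowMs - lastPollMs.get() <= maxPollIntervalMs;
    }
}
```

The heartbeat loop would call shouldHeartbeat(System.currentTimeMillis()) before each heartbeat and stop sending once it returns false. A 0.10.0 broker would then expire the member through the session timeout it already understands, which triggers the rebalance.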

> poll() timeout not enforced when connecting to 0.10.0 broker
> ------------------------------------------------------------
>
>                 Key: KAFKA-5004
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5004
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 0.10.2.0
>            Reporter: Matthias J. Sax
>
> In 0.10.1, the heartbeat thread and the new poll timeout {{max.poll.interval.ms}} were introduced via KIP-62. In 0.10.2, we added client-broker backward compatibility.
> Now, if a 0.10.2 client connects to a 0.10.0 broker, the broker only understands the heartbeat timeout but not the poll timeout, while the client is still using the heartbeat background thread. Thus, the new client config {{max.poll.interval.ms}} is ignored.
> In the worst case, the polling thread might die while the heartbeat thread is still up. The broker would then not time out the client and no rebalance would be triggered, while the client is effectively dead and makes no progress on its assigned partitions.


