Posted to dev@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2015/12/02 18:57:11 UTC

[jira] [Comment Edited] (KAFKA-1894) Avoid long or infinite blocking in the consumer

    [ https://issues.apache.org/jira/browse/KAFKA-1894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036256#comment-15036256 ] 

Guozhang Wang edited comment on KAFKA-1894 at 12/2/15 5:56 PM:
---------------------------------------------------------------

[~BigAndy] You can indeed interrupt the poll() call by calling wakeup() from another thread (search for the paragraph of "multi-threaded processing"):

http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html

This ticket addresses a different problem: poll(timeout) can block longer than the specified timeout if the broker is unavailable and no other thread calls wakeup().


was (Author: guozhang):
[~BigAndy] You can indeed interrupt the poll() call by calling wakeup() from another thread (search for the paragraph of "multi-threaded processing"):

http://kafka.apache.org/090/javadoc/index.html?org/apache/kafka/clients/consumer/KafkaConsumer.html

> Avoid long or infinite blocking in the consumer
> -----------------------------------------------
>
>                 Key: KAFKA-1894
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1894
>             Project: Kafka
>          Issue Type: Sub-task
>          Components: consumer
>            Reporter: Jay Kreps
>            Assignee: Jason Gustafson
>             Fix For: 0.10.0.0
>
>
> The new consumer has a lot of loops that look something like
> {code}
>   while(!isThingComplete())
>     client.poll();
> {code}
> This occurs both in KafkaConsumer and in NetworkClient.completeAll. These retry loops are mostly the behavior we want, but there are several cases where they may cause problems:
>  - In the case of a hard failure we may hang for a long time or indefinitely before realizing the connection is lost.
>  - In the case where the cluster is malfunctioning or down we may retry forever.
> It would probably be better to give a timeout to these. The proposed approach would be to add something like retry.time.ms=60000 and only continue retrying for that period of time.
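
A minimal sketch of the bounded retry loop proposed above, assuming a deadline derived from a retry.time.ms-style value. The class BoundedRetrySketch, the method pollUntil, and its parameters are hypothetical names used for illustration only; they are not part of the Kafka client API, and this is not necessarily how the fix was implemented.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of the Kafka client API.
final class BoundedRetrySketch {

    /**
     * Runs pollOnce repeatedly until isComplete reports true or
     * retryTimeMs milliseconds have elapsed.
     *
     * @return true if the condition completed, false if the deadline passed first
     */
    static boolean pollUntil(BooleanSupplier isComplete, Runnable pollOnce, long retryTimeMs) {
        // Use the monotonic clock so wall-clock adjustments do not affect the deadline.
        long deadlineNanos = System.nanoTime() + retryTimeMs * 1_000_000L;
        while (!isComplete.getAsBoolean()) {
            if (System.nanoTime() - deadlineNanos >= 0) {
                return false; // give up instead of retrying forever
            }
            pollOnce.run();
        }
        return true;
    }

    public static void main(String[] args) {
        int[] polls = {0};
        // Stand-ins for isThingComplete() and client.poll() from the description.
        boolean done = pollUntil(() -> polls[0] >= 3, () -> polls[0]++, 60_000L);
        System.out.println("completed=" + done + " after " + polls[0] + " polls");
    }
}
```

Note that the deadline is only checked between iterations, so this sketch bounds the loop but not a single blocking call. Each pollOnce invocation would still need its own timeout for the overall wait to stay close to the configured limit.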


