Posted to dev@kafka.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/04/13 12:56:41 UTC

[jira] [Commented] (KAFKA-5065) AbstractCoordinator.ensureCoordinatorReady() stuck in loop if absent any bootstrap servers

    [ https://issues.apache.org/jira/browse/KAFKA-5065?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15967548#comment-15967548 ] 

ASF GitHub Bot commented on KAFKA-5065:
---------------------------------------

GitHub user porshkevich opened a pull request:

    https://github.com/apache/kafka/pull/2850

    KAFKA-5065; AbstractCoordinator.ensureCoordinatorReady() stuck in loop if absent any bootstrap servers

    Add a consumer config, "max.block.ms", which defaults to 60000 ms.
    When it is specified, the default ensureCoordinatorReady() call blocks for at most "max.block.ms".
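
    For illustration only, a minimal sketch of how a caller might supply the proposed setting. The key "max.block.ms" and its 60000 ms default come from this pull request; whether a given consumer release recognizes the key is an assumption, and the class name, method name, and example addresses below are hypothetical.

```java
import java.util.Properties;

final class ConsumerPropsSketch {

    /**
     * Hypothetical helper: builds consumer properties that include the
     * "max.block.ms" key proposed in this pull request.
     */
    static Properties consumerProps(String bootstrapServers, String groupId, long maxBlockMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Proposed in this PR; assumed, not confirmed, to be honored by the consumer.
        props.setProperty("max.block.ms", Long.toString(maxBlockMs));
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "demo-group", 60000L);
        System.out.println(props.getProperty("max.block.ms"));
    }
}
```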

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/porshkevich/kafka KAFKA-5065

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/kafka/pull/2850.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2850
    
----
commit 99004de30a5400b2d8554b4a4469039498e033d4
Author: Vladimir Porshkevich <ne...@inbox.ru>
Date:   2017-04-13T12:41:31Z

    Add max.block.ms to allow timing out ensureCoordinatorReady check.

----


> AbstractCoordinator.ensureCoordinatorReady() stuck in loop if absent any bootstrap servers 
> -------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-5065
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5065
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 0.10.0.0, 0.10.0.1, 0.10.1.0, 0.10.1.1, 0.10.2.0
>            Reporter: Vladimir Porshkevich
>              Labels: newbie
>   Original Estimate: 4m
>  Remaining Estimate: 4m
>
> If a Consumer is started with wrong bootstrap servers, or with no valid servers at all, and a thread calls Consumer.poll(timeout) with any timeout, the thread gets stuck in a loop with debug logs like
> {noformat}
> org.apache.kafka.common.network.Selector - Connection with /172.31.1.100 disconnected
> java.net.ConnectException: Connection timed out: no further information
> 	at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> 	at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> 	at org.apache.kafka.common.network.PlaintextTransportLayer.finishConnect(PlaintextTransportLayer.java:51)
> 	at org.apache.kafka.common.network.KafkaChannel.finishConnect(KafkaChannel.java:81)
> 	at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:335)
> 	at org.apache.kafka.common.network.Selector.poll(Selector.java:303)
> 	at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:349)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:226)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:203)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitMetadataUpdate(ConsumerNetworkClient.java:138)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:216)
> 	at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureCoordinatorReady(AbstractCoordinator.java:193)
> 	at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.poll(ConsumerCoordinator.java:275)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1030)
> 	at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:995)
> 	at com.example.SccSpringCloudDemoApplication.main(SccSpringCloudDemoApplication.java:46)
> {noformat}
> The problem is in the AbstractCoordinator.ensureCoordinatorReady() method:
> it uses Long.MAX_VALUE as the timeout.
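
The difference between an effectively unbounded wait and a bounded one can be sketched in plain Java. This is not the Kafka client code and not the patch in the pull request; the class BoundedWaitSketch, the method awaitReady, and its parameters are hypothetical names chosen for illustration. The sketch only shows the general technique of retrying a readiness check until a deadline derived from a caller-supplied limit.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

final class BoundedWaitSketch {

    /**
     * Hypothetical helper: retries the readiness check until it succeeds or
     * maxBlockMs has elapsed. Returns true if the check succeeded in time,
     * false on timeout or if the waiting thread was interrupted.
     */
    static boolean awaitReady(BooleanSupplier ready, long maxBlockMs, long retryBackoffMs) {
        long deadlineNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(maxBlockMs);
        while (!ready.getAsBoolean()) {
            long remainingNanos = deadlineNanos - System.nanoTime();
            if (remainingNanos <= 0) {
                return false; // limit reached: give control back to the caller
            }
            long remainingMs = TimeUnit.NANOSECONDS.toMillis(remainingNanos);
            long sleepMs = Math.max(1L, Math.min(retryBackoffMs, remainingMs));
            try {
                Thread.sleep(sleepMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // A check that never succeeds stands in for an unreachable coordinator.
        boolean ok = awaitReady(() -> false, 200L, 50L);
        System.out.println("ready=" + ok);
    }
}
```

With a limit such as 60000 ms the caller regains control after about a minute, whereas a limit of Long.MAX_VALUE makes the same loop run for as long as the check keeps failing, which matches the behaviour described in the issue.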


