Posted to jira@kafka.apache.org by "Cheng Tan (Jira)" <ji...@apache.org> on 2020/05/18 08:05:00 UTC

[jira] [Updated] (KAFKA-9893) Configurable TCP connection timeout and improve the initial metadata fetch

     [ https://issues.apache.org/jira/browse/KAFKA-9893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheng Tan updated KAFKA-9893:
-----------------------------
    Description: 
This issue has two parts:
 # Support the transport-layer connection timeout described in KIP-601
 # Optimize the logic for NetworkClient.leastLoadedNode()

Changes:
 # Added a new common client configuration parameter, socket.connection.setup.timeout.ms, to the NetworkClient. A potential transport-layer timeout is handled the same way as a potential request timeout.
 # When no connected channel exists, leastLoadedNode() now provides the disconnected node that has the fewest failed attempts.
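
A minimal sketch of the first change, under stated assumptions: the configuration key socket.connection.setup.timeout.ms and the standard bootstrap.servers key are real, but the class, its methods, and the bookkeeping map are hypothetical and are not the NetworkClient implementation. It only illustrates treating a connection that stays in the "connecting" state too long the way an overdue request would be treated.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical illustration only; not the actual NetworkClient code.
final class ConnectionSetupTimeoutSketch {
    // Config key named in this issue.
    static final String SETUP_TIMEOUT_CONFIG = "socket.connection.setup.timeout.ms";

    // Builds client properties that carry the new timeout setting.
    static Properties clientProps(String bootstrapServers, long setupTimeoutMs) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty(SETUP_TIMEOUT_CONFIG, Long.toString(setupTimeoutMs));
        return props;
    }

    // Assumed bookkeeping: node id -> time (ms) at which the connection attempt started.
    private final Map<String, Long> connectingSince = new HashMap<>();
    private final long setupTimeoutMs;

    ConnectionSetupTimeoutSketch(long setupTimeoutMs) {
        this.setupTimeoutMs = setupTimeoutMs;
    }

    void connecting(String nodeId, long nowMs) {
        connectingSince.put(nodeId, nowMs);
    }

    void connected(String nodeId) {
        connectingSince.remove(nodeId);
    }

    // Returns the nodes whose connection setup has exceeded the timeout,
    // mirroring how overdue in-flight requests would be collected.
    List<String> timedOutNodes(long nowMs) {
        List<String> timedOut = new ArrayList<>();
        for (Map.Entry<String, Long> entry : connectingSince.entrySet()) {
            if (nowMs - entry.getValue() > setupTimeoutMs) {
                timedOut.add(entry.getKey());
            }
        }
        return timedOut;
    }
}
```

A caller would check timedOutNodes() on each poll and close those connections so that another node can be tried.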

  was:
This issue has two parts:
 # Support TCP connection timeout described in KIP-601
 # Currently, the LeastLoadedNodeProvider might provide an offline/invalid node when none of the nodes given in the --bootstrap-server option is connected. The Cluster class shuffles the nodes to balance the initial pressure (I guess), and the LeastLoadedNodeProvider always provides the same node, which is the last node after shuffling. Consequently, even though we may provide several bootstrap servers, we might hit a timeout if any of the servers is shut down.

The implementation strategy for 1 is described in KIP-601.

The solution for 2 is to implement round-robin candidate node selection when every node is unconnected. We can either
 # shuffle the nodes every time we hit the "no node connected" status, or
 # track how many times each node has been tried and reset the counts after any of the nodes gets connected.
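
The second option above can be sketched roughly as follows. This is an illustration under assumptions, not the code that was merged: the class name, method names, and the use of plain string node ids are all hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of "track try counts, reset on connect".
final class LeastAttemptedNodeSketch {
    private final List<String> nodes;
    // Node id -> number of connection attempts since the last reset.
    private final Map<String, Integer> attempts = new HashMap<>();

    LeastAttemptedNodeSketch(List<String> nodes) {
        this.nodes = new ArrayList<>(nodes);
    }

    // Picks the node with the fewest attempts (ties go to the earliest node
    // in the list) and records the new attempt. Returns null if there are no nodes.
    String nextCandidate() {
        String best = null;
        int bestAttempts = Integer.MAX_VALUE;
        for (String node : nodes) {
            int count = attempts.getOrDefault(node, 0);
            if (count < bestAttempts) {
                best = node;
                bestAttempts = count;
            }
        }
        if (best != null) {
            attempts.merge(best, 1, Integer::sum);
        }
        return best;
    }

    // Once any node connects, all attempt counts are cleared.
    void onConnected(String node) {
        attempts.clear();
    }
}
```

With three unconnected nodes, successive calls cycle through all of them before retrying the first, so a single unreachable bootstrap server no longer absorbs every attempt.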

 


> Configurable TCP connection timeout and improve the initial metadata fetch
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-9893
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9893
>             Project: Kafka
>          Issue Type: New Feature
>            Reporter: Cheng Tan
>            Assignee: Cheng Tan
>            Priority: Major
>
> This issue has two parts:
>  # Support the transport-layer connection timeout described in KIP-601
>  # Optimize the logic for NetworkClient.leastLoadedNode()
> Changes:
>  # Added a new common client configuration parameter, socket.connection.setup.timeout.ms, to the NetworkClient. A potential transport-layer timeout is handled the same way as a potential request timeout.
>  # When no connected channel exists, leastLoadedNode() now provides the disconnected node that has the fewest failed attempts.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)