Posted to dev@kafka.apache.org by Json Tu <ka...@126.com> on 2017/10/22 16:01:26 UTC

Get Broker metrics timeout

Hi all,
	We have a cluster of 10 brokers running Kafka 0.9.0.1. To monitor the cluster, we poll each broker every 2 minutes for metric data such as the offlinePartition metric.
However, requests to some of the brokers occasionally time out, which leads to false alarms.
	For example, we may get an exception like the one below:
	error: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: 10.11.12.13; nested exception is: 
	java.net.ConnectException: Connection timed out]
	
	We also see that our TcpExt.TCPBacklogDrop value fluctuates repeatedly; maybe this is the root cause. If it is the problem, how can I optimize it?
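
	To make the fluctuation concrete, the counter can be sampled directly on a broker host. Below is a minimal sketch, assuming a Linux host that exposes /proc/net/netstat in its usual layout (a header line of counter names followed by a line of values, per prefix). The function names parse_netstat and read_backlog_drop are illustrative only and are not part of any Kafka tooling.

```python
def parse_netstat(text):
    """Parse /proc/net/netstat-style text into {prefix: {counter: value}}.

    Assumes the usual layout: lines come in pairs, a header line of
    counter names and then a line of values, both starting with the
    same prefix (for example "TcpExt:").
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    stats = {}
    for header, values in zip(lines[0::2], lines[1::2]):
        prefix, names = header.split(":", 1)
        value_prefix, nums = values.split(":", 1)
        if prefix != value_prefix:
            raise ValueError(
                "mismatched line pair: %r vs %r" % (prefix, value_prefix)
            )
        stats[prefix] = dict(zip(names.split(), (int(n) for n in nums.split())))
    return stats


def read_backlog_drop(path="/proc/net/netstat"):
    """Return the current TcpExt TCPBacklogDrop value, or None if absent."""
    with open(path) as f:
        stats = parse_netstat(f.read())
    return stats.get("TcpExt", {}).get("TCPBacklogDrop")


if __name__ == "__main__":
    import time

    # Sample twice to see how much the counter moved in the interval.
    # The 10 second interval is an arbitrary choice for illustration.
    before = read_backlog_drop()
    time.sleep(10)
    after = read_backlog_drop()
    if before is None or after is None:
        print("TCPBacklogDrop is not reported on this host")
    else:
        print("TCPBacklogDrop increased by", after - before)
```

	Comparing the increase per interval with the times at which the metric fetch timed out would show whether the two actually line up.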

	Any suggestions are appreciated. Thanks in advance.