Posted to jira@kafka.apache.org by "RivenSun (Jira)" <ji...@apache.org> on 2022/01/05 08:41:00 UTC

[jira] [Commented] (KAFKA-13576) Processor.ConnectionQueueSize provides configuration & metrics, SelectorMetrics adds connection-register related metrics

    [ https://issues.apache.org/jira/browse/KAFKA-13576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17469124#comment-17469124 ] 

RivenSun commented on KAFKA-13576:
----------------------------------

Hi [~showuon]  [~guozhang] 
Please help review this Jira. Thanks.

> Processor.ConnectionQueueSize provides configuration & metrics, SelectorMetrics adds connection-register related metrics
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13576
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13576
>             Project: Kafka
>          Issue Type: Improvement
>          Components: metrics, network
>    Affects Versions: 3.0.0
>            Reporter: RivenSun
>            Assignee: Luke Chen
>            Priority: Major
>
> h1. Problem:
> After all client machines were switched to the company's private BYOIP addresses, producers that send messages frequently saw a significant increase in send latency. Producers that send messages infrequently often threw exceptions because fetching metadata timed out while sending. Everything was normal before the switch.
> h1. Root cause:
> 1. The clients' BYOIP addresses lack PTR records.
> 2. When a port uses the SASL_SSL protocol, SaslChannelBuilder#buildTransportLayer, which runs underneath Processor#configureNewConnections, calls socketChannel.socket().getInetAddress().getHostName(), which triggers a reverse DNS lookup. If the client IP has no PTR record, getHostName() becomes slow.
> 3. Several steps in the processor's run method are executed serially. If configureNewConnections is slow, completed responses are not sent to clients in time, which increases the ack latency that producers see when sending messages.
> 4. A slow configureNewConnections also means that elements in Processor.newConnections are not removed in time, which makes Acceptor#assignNewConnection slower. assignNewConnection can even block in newConnections.put(socketChannel), at which point the Acceptor thread may reject any new TCP connection request.
> h1. Solution:
> 1. Add PTR records for the clients' BYOIP addresses.
> 2. Newer Kafka versions have already fixed this problem:
> https://issues.apache.org/jira/browse/KAFKA-8562
> https://github.com/apache/kafka/pull/10059
> 3. Add *connection-register* related metrics to the SelectorMetrics of each processor’s selector.
> These metrics would be updated in Selector#register(String id, SocketChannel socketChannel). The metric type is expected to be a histogram (newHistogram), similar to the *responseQueueTimeMs* field.
> 4.
> 1) Make the queue size of Processor.newConnections configurable.
> Source code:
> {code:java}
> private[kafka] object Processor {
>   val IdlePercentMetricName = "IdlePercent"
>   val NetworkProcessorMetricTag = "networkProcessor"
>   val ListenerMetricTag = "listener"
>   val ConnectionQueueSize = 20
> }{code}
> The value is currently hard-coded to 20, perhaps for design reasons, but it would still be better to make it configurable: either a single *queued.max.connections* setting that applies to the processors of all ports,
> or an independent setting for the processors of each listener port:
> *listener.name.\{listenerName}.queued.max.connections*
> 2) Provide a metric for the size of each processor’s newConnections queue, {*}ConnectionQueueSize{*}, which can be modeled on the *ResponseQueueSize* metric maintained in RequestChannel.
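To illustrate the proposal in item 4 above, here is a minimal Java sketch of a bounded connection queue whose capacity comes from configuration and whose current size can be read as a gauge. It is not Kafka's code: the class `ConnectionQueueSketch`, its methods, and the use of `String` in place of `SocketChannel` are all hypothetical names chosen for the example. It uses `offer` so the full-queue condition is visible without blocking; the `put` call described in the issue would block at that point instead.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class ConnectionQueueSketch {
    // Hypothetical stand-in for Processor.newConnections; the real queue holds SocketChannel objects.
    private final BlockingQueue<String> newConnections;

    // queuedMaxConnections plays the role of the proposed queued.max.connections setting (assumed name).
    public ConnectionQueueSketch(int queuedMaxConnections) {
        this.newConnections = new ArrayBlockingQueue<>(queuedMaxConnections);
    }

    // Non-blocking hand-off: returns false when the queue is full.
    // A blocking put() would stall the calling (acceptor) thread here instead.
    public boolean tryAccept(String connectionId) {
        return newConnections.offer(connectionId);
    }

    // What the processor would do when it drains a pending connection; null if none is queued.
    public String takeNext() {
        return newConnections.poll();
    }

    // Value a ConnectionQueueSize gauge could report.
    public int connectionQueueSize() {
        return newConnections.size();
    }

    public static void main(String[] args) {
        ConnectionQueueSketch processor = new ConnectionQueueSketch(20);
        int accepted = 0;
        for (int i = 0; i < 25; i++) {
            if (processor.tryAccept("conn-" + i)) {
                accepted++;
            }
        }
        // With a capacity of 20 and nothing draining the queue, 5 of the 25 hand-offs fail.
        System.out.println("accepted=" + accepted + " queueSize=" + processor.connectionQueueSize());
    }
}
```

With a gauge like `connectionQueueSize()` exported per processor, a queue that stays at its capacity would point to slow connection setup before producers start timing out.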

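The reverse-lookup cost described in the root-cause section can be seen from the JDK API alone. The sketch below is an illustration, not the change made in the linked fix: `PeerAddressSketch` and `labelForRawAddress` are hypothetical names. It shows that `InetAddress#getHostAddress()` returns the textual IP without consulting DNS, whereas `getHostName()` on an address created without a host name may perform a reverse lookup and wait on the resolver when no PTR record exists.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

public class PeerAddressSketch {
    // Builds an InetAddress from raw bytes (no name service is consulted)
    // and returns its textual form, which also needs no DNS query.
    static String labelForRawAddress(byte[] rawAddress) {
        try {
            InetAddress address = InetAddress.getByAddress(rawAddress);
            return address.getHostAddress();
        } catch (UnknownHostException e) {
            // Thrown only when the byte array has an illegal length.
            throw new IllegalArgumentException("not a valid IPv4/IPv6 address", e);
        }
    }

    public static void main(String[] args) {
        // 192.0.2.1 belongs to the TEST-NET-1 documentation range.
        byte[] client = {(byte) 192, 0, 2, 1};
        System.out.println(labelForRawAddress(client));
        // Calling getHostName() on the same address could trigger a reverse DNS lookup,
        // which is the slow path the issue describes for client IPs without PTR records.
    }
}
```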


--
This message was sent by Atlassian Jira
(v8.20.1#820001)