Posted to jira@kafka.apache.org by "huxihx (JIRA)" <ji...@apache.org> on 2017/12/14 00:25:00 UTC

[jira] [Commented] (KAFKA-6345) NetworkClient.inFlightRequestCount() is not thread safe, causing ConcurrentModificationExceptions when sensors are read

    [ https://issues.apache.org/jira/browse/KAFKA-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16290140#comment-16290140 ] 

huxihx commented on KAFKA-6345:
-------------------------------

An easy solution is to add a thread-safe count method that is used only for the JMX metrics. The safe version would take a snapshot of the map by deep-copying each of its entries, and count from that snapshot.
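
A rough sketch of that idea, to make the proposal concrete. This is not the actual Kafka code: InFlightTracker, add(), completeNext() and safeInFlightRequestCount() are hypothetical names, and requests are plain strings. It also assumes the copy itself has to be taken under a lock shared with the writer, since copying a HashMap while another thread mutates it would hit the same problem.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for InFlightRequests; not the real Kafka class.
class InFlightTracker {
    // node id -> queue of in-flight requests, backed by a plain HashMap
    private final Map<String, Deque<String>> requests = new HashMap<>();
    // Assumption: writers and the snapshot share this lock.
    private final Object lock = new Object();

    // Called from the client I/O thread when a request is sent.
    void add(String node, String request) {
        synchronized (lock) {
            requests.computeIfAbsent(node, k -> new ArrayDeque<>()).addFirst(request);
        }
    }

    // Called from the client I/O thread when the oldest request completes.
    String completeNext(String node) {
        synchronized (lock) {
            Deque<String> queue = requests.get(node);
            return queue == null ? null : queue.pollLast();
        }
    }

    // Safe count meant only for metrics readers (e.g. a JMX thread).
    int safeInFlightRequestCount() {
        Map<String, Deque<String>> snapshot = new HashMap<>();
        synchronized (lock) {
            // Deep copy: each queue is copied too, not just the map.
            for (Map.Entry<String, Deque<String>> e : requests.entrySet()) {
                snapshot.put(e.getKey(), new ArrayDeque<>(e.getValue()));
            }
        }
        // Counting happens on the private copy, outside the lock.
        int total = 0;
        for (Deque<String> queue : snapshot.values()) {
            total += queue.size();
        }
        return total;
    }
}
```

The trade-off is that every metrics read pays for a full copy, and the I/O thread now takes a lock on each add. If only the count is needed, keeping a separate atomic counter updated by the I/O thread would likely be cheaper, but that is a different approach from the snapshot described above.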

> NetworkClient.inFlightRequestCount() is not thread safe, causing ConcurrentModificationExceptions when sensors are read
> -----------------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6345
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6345
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients
>    Affects Versions: 1.0.0
>            Reporter: radai rosenblatt
>
> example stack trace (code is ~0.10.2.*)
> {code}
> java.util.ConcurrentModificationException: java.util.ConcurrentModificationException
> 	at java.util.HashMap$HashIterator.nextNode(HashMap.java:1429)
> 	at java.util.HashMap$ValueIterator.next(HashMap.java:1458)
> 	at org.apache.kafka.clients.InFlightRequests.inFlightRequestCount(InFlightRequests.java:109)
> 	at org.apache.kafka.clients.NetworkClient.inFlightRequestCount(NetworkClient.java:382)
> 	at org.apache.kafka.clients.producer.internals.Sender$SenderMetrics$1.measure(Sender.java:480)
> 	at org.apache.kafka.common.metrics.KafkaMetric.value(KafkaMetric.java:61)
> 	at org.apache.kafka.common.metrics.KafkaMetric.value(KafkaMetric.java:52)
> 	at org.apache.kafka.common.metrics.JmxReporter$KafkaMbean.getAttribute(JmxReporter.java:183)
> 	at org.apache.kafka.common.metrics.JmxReporter$KafkaMbean.getAttributes(JmxReporter.java:193)
> 	at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.getAttributes(DefaultMBeanServerInterceptor.java:709)
> 	at com.sun.jmx.mbeanserver.JmxMBeanServer.getAttributes(JmxMBeanServer.java:705)
> {code}
> Looking at the latest trunk, the code is still vulnerable:
> # NetworkClient.inFlightRequestCount() eventually iterates over InFlightRequests.requests.values(), which is backed by a (non-thread-safe) HashMap
> # this is called from the "requests-in-flight" sensor's measure() method (Sender.java line ~765, in the SenderMetrics constructor), which is driven by whichever thread reads the JMX values
> # the HashMap in question is also updated by a client I/O thread calling NetworkClient.doSend(), which calls into InFlightRequests.add()
> I guess the only upside is that this exception will always happen on the thread reading the JMX values and never on the actual client I/O thread ...
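
For reference, the failure mode in the quoted report can be shown on a single thread. This is only an illustration with a hypothetical CmeDemo class and made-up keys, not Kafka code: the iterator plays the part of the JMX reader and the put() plays the part of the I/O thread. With two real threads the interleaving is timing-dependent, so the exception shows up only intermittently.

```java
import java.util.ConcurrentModificationException;
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class; names and values are illustrative only.
class CmeDemo {
    // Returns true if the iterator fails after the map is structurally modified.
    static boolean failsWhenModifiedDuringIteration() {
        Map<String, Integer> requests = new HashMap<>();
        requests.put("node-1", 1);
        requests.put("node-2", 2);

        Iterator<Integer> reader = requests.values().iterator();
        reader.next();               // the "metrics reader" starts iterating
        requests.put("node-3", 3);   // the "I/O thread" adds a new entry meanwhile

        try {
            reader.next();           // fail-fast check sees the modification
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }
}
```

Note that the exception is thrown from the iterator, i.e. on the reading side, which matches the observation that the JMX thread is the one that fails.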



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)