Posted to jira@kafka.apache.org by "Rajini Sivaram (JIRA)" <ji...@apache.org> on 2018/07/05 20:31:00 UTC

[jira] [Created] (KAFKA-7136) PushHttpMetricsReporter may deadlock when processing metrics changes

Rajini Sivaram created KAFKA-7136:
-------------------------------------

             Summary: PushHttpMetricsReporter may deadlock when processing metrics changes
                 Key: KAFKA-7136
                 URL: https://issues.apache.org/jira/browse/KAFKA-7136
             Project: Kafka
          Issue Type: Bug
          Components: metrics
    Affects Versions: 1.1.0, 2.0.0
            Reporter: Rajini Sivaram
            Assignee: Rajini Sivaram
             Fix For: 2.0.0


We noticed a deadlock in {{PushHttpMetricsReporter}}. Locking for metrics was changed under KAFKA-6765 to avoid {{NullPointerException}} in metrics reporters caused by concurrent reads and updates. {{PushHttpMetricsReporter}} acquires its own lock to process metric registrations, and registration is invoked while the sensor lock is held. When reporting, it also reads metric values, which attempts to acquire the sensor lock while holding its own lock (the inverse order). This resulted in the deadlock below.

{quote}
Found one Java-level deadlock:
Java stack information for the threads listed above:
===================================================
"StreamThread-7":
        at org.apache.kafka.tools.PushHttpMetricsReporter.metricChange(PushHttpMetricsReporter.java:144)
        - waiting to lock <0x0000000655a54310> (a java.lang.Object)
        at org.apache.kafka.common.metrics.Metrics.registerMetric(Metrics.java:563)
        - locked <0x0000000655a44a28> (a org.apache.kafka.common.metrics.Metrics)
        at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:236)
        - locked <0x000000065629c170> (a org.apache.kafka.common.metrics.Sensor)
        at org.apache.kafka.common.metrics.Sensor.add(Sensor.java:217)
        at org.apache.kafka.common.network.Selector$SelectorMetrics.maybeRegisterConnectionMetrics(Selector.java:1016)
        at org.apache.kafka.common.network.Selector.pollSelectionKeys(Selector.java:462)
        at org.apache.kafka.common.network.Selector.poll(Selector.java:425)
        at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:510)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:271)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:242)
        at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:218)
        at org.apache.kafka.clients.consumer.internals.Fetcher.getTopicMetadata(Fetcher.java:274)
        at org.apache.kafka.clients.consumer.internals.Fetcher.getAllTopicMetadata(Fetcher.java:254)
        at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1820)
        at org.apache.kafka.clients.consumer.KafkaConsumer.listTopics(KafkaConsumer.java:1798)
        at org.apache.kafka.streams.processor.internals.StoreChangelogReader.refreshChangelogInfo(StoreChangelogReader.java:224)
        at org.apache.kafka.streams.processor.internals.StoreChangelogReader.initialize(StoreChangelogReader.java:121)
        at org.apache.kafka.streams.processor.internals.StoreChangelogReader.restore(StoreChangelogReader.java:74)
        at org.apache.kafka.streams.processor.internals.TaskManager.updateNewAndRestoringTasks(TaskManager.java:317)
        at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:824)
        at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:767)
        at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:736)
"pool-17-thread-1":
        at org.apache.kafka.common.metrics.KafkaMetric.measurableValue(KafkaMetric.java:82)
        - waiting to lock <0x000000065629c170> (a org.apache.kafka.common.metrics.Sensor)
        at org.apache.kafka.common.metrics.KafkaMetric.value(KafkaMetric.java:58)
        at org.apache.kafka.tools.PushHttpMetricsReporter$HttpReporter.run(PushHttpMetricsReporter.java:177)
        - locked <0x0000000655a54310> (a java.lang.Object)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)

Found 1 deadlock.
{quote}
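
The trace above shows a classic lock-order inversion: "StreamThread-7" holds the sensor lock and waits for the reporter lock, while "pool-17-thread-1" holds the reporter lock and waits for the sensor lock. One common way to break such a cycle is sketched below: the reporting thread copies the registered metrics while holding the reporter lock, releases it, and only then reads values (which take the sensor lock). This is a minimal illustration, not the actual Kafka code or necessarily the committed fix; {{LockOrderSketch}}, {{SketchMetric}} and {{SketchReporter}} are hypothetical names.

```java
import java.util.HashMap;
import java.util.Map;

class LockOrderSketch {

    // Hypothetical stand-in for a metric whose value is guarded by a sensor lock.
    static final class SketchMetric {
        private final Object sensorLock;
        private final double value;

        SketchMetric(Object sensorLock, double value) {
            this.sensorLock = sensorLock;
            this.value = value;
        }

        double value() {
            synchronized (sensorLock) { // reading a value takes the sensor lock
                return value;
            }
        }
    }

    // Hypothetical stand-in for a reporter that guards its metric map with its own lock.
    static final class SketchReporter {
        private final Object lock = new Object();
        private final Map<String, SketchMetric> metrics = new HashMap<>();

        // Called by the registering thread, which already holds the sensor lock.
        void metricChange(String name, SketchMetric metric) {
            synchronized (lock) {
                metrics.put(name, metric);
            }
        }

        // Called by the reporting thread.
        Map<String, Double> report() {
            Map<String, SketchMetric> snapshot;
            synchronized (lock) {
                // Copy only; do not call value() while holding the reporter lock.
                snapshot = new HashMap<>(metrics);
            }
            Map<String, Double> values = new HashMap<>();
            for (Map.Entry<String, SketchMetric> e : snapshot.entrySet()) {
                // The sensor lock is taken here with no reporter lock held,
                // so the two locks are never nested in the inverse order.
                values.put(e.getKey(), e.getValue().value());
            }
            return values;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Object sensorLock = new Object();
        SketchReporter reporter = new SketchReporter();

        Thread registrar = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                synchronized (sensorLock) { // sensor lock first, then reporter lock
                    reporter.metricChange("m" + i, new SketchMetric(sensorLock, i));
                }
            }
        });
        Thread pusher = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                reporter.report();
            }
        });

        registrar.start();
        pusher.start();
        registrar.join();
        pusher.join();
        System.out.println("reported " + reporter.report().size() + " metrics");
    }
}
```

If {{report()}} instead called {{value()}} inside the {{synchronized (lock)}} block, the two threads in {{main}} could block each other in the same way as in the trace. The trade-off of the snapshot approach is that a report may miss a metric registered after the copy was taken.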



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)