You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Manoj Ramakrishnan (JIRA)" <ji...@apache.org> on 2016/05/19 17:26:13 UTC

[jira] [Commented] (KAFKA-493) High CPU usage on inactive server

    [ https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15291584#comment-15291584 ] 

Manoj Ramakrishnan commented on KAFKA-493:
------------------------------------------

[~cosmin.marginean] We are running into something similar. Whenever we have consumer connected, we see the CPU bumps to about 100% and stays there. When consumers are stopped, the CPU goes down. We see similar the following stack as some others have noted. To make sure that is not a jdk issue, what version of java jdk are you running on ? 

> High CPU usage on inactive server
> ---------------------------------
>
>                 Key: KAFKA-493
>                 URL: https://issues.apache.org/jira/browse/KAFKA-493
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.8.0
>            Reporter: Jay Kreps
>         Attachments: 0.9.0.1-upgrade.png, Kafka-2014-11-10.snapshot.zip, Kafka-sampling1.zip, Kafka-sampling2.zip, Kafka-sampling3.zip, Kafka-trace1.zip, Kafka-trace2.zip, Kafka-trace3.zip, backtraces.txt, stacktrace.txt
>
>
> > I've been playing with the 0.8 branch of Kafka and noticed that idle CPU usage is fairly high (13% of a 
> > core). Is that to be expected? I did look at the stack, but didn't see anything obvious. A background 
> > task?
> > I wanted to mention how I am getting into this state. I've set up two machines with the latest 0.8 
> > code base and am using a replication factor of 2. On starting the brokers there is no idle CPU activity. 
> > Then I run a test that essential does 10k publish operations followed by immediate consume operations 
> > (I was measuring latency). Once this has run the kafka nodes seem to consistently be consuming CPU 
> > essentially forever.
> hprof results:
> THREAD START (obj=500003ae, id = 200004, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200005, name="RMI TCP Accept-9999", group="system")
> THREAD START (obj=500003ae, id = 200006, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200001, name="main", group="main")
> THREAD START (obj=500003ae, id = 200007, name="Thread-2", group="main")
> THREAD START (obj=500003ae, id = 200008, name="Thread-3", group="main")
> THREAD START (obj=500003ae, id = 200009, name="kafka-processor-9092-0", group="main")
> THREAD START (obj=500003ae, id = 200010, name="kafka-processor-9092-1", group="main")
> THREAD START (obj=500003ae, id = 200011, name="kafka-acceptor", group="main")
> THREAD START (obj=5000074b, id = 200012, name="ZkClient-EventThread-20-localhost:2181", group="main")
> THREAD START (obj=5000076e, id = 200014, name="main-SendThread()", group="main")
> THREAD START (obj=5000076d, id = 200013, name="main-EventThread", group="main")
> THREAD START (obj=500003ae, id = 200015, name="metrics-meter-tick-thread-1", group="main")
> THREAD START (obj=500003ae, id = 200016, name="metrics-meter-tick-thread-2", group="main")
> THREAD START (obj=500003ae, id = 200017, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200018, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200019, name="kafka-request-handler-0", group="main")
> THREAD START (obj=500003ae, id = 200020, name="kafka-request-handler-1", group="main")
> THREAD START (obj=500003ae, id = 200021, name="Thread-6", group="main")
> THREAD START (obj=500003ae, id = 200022, name="Thread-7", group="main")
> THREAD START (obj=50000899, id = 200023, name="ReplicaFetcherThread-0-2 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200024, name="ReplicaFetcherThread-0-3 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200025, name="ReplicaFetcherThread-0-0 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200026, name="ReplicaFetcherThread-0-1 on broker 1, ", group="main")
> THREAD START (obj=500003ae, id = 200028, name="SIGINT handler", group="system")
> THREAD START (obj=500003ae, id = 200029, name="Thread-5", group="main")
> THREAD START (obj=5000074b, id = 200030, name="Thread-1", group="main")
> THREAD START (obj=5000074b, id = 200031, name="Thread-0", group="main")
> THREAD END (id = 200031)
> THREAD END (id = 200029)
> THREAD END (id = 200020)
> THREAD END (id = 200019)
> THREAD END (id = 200008)
> THREAD END (id = 200021)
> THREAD END (id = 200007)
> THREAD END (id = 200022)
> THREAD END (id = 200018)
> THREAD END (id = 200017)
> THREAD END (id = 200012)
> THREAD END (id = 200013)
> THREAD END (id = 200014)
> THREAD END (id = 200025)
> THREAD END (id = 200023)
> THREAD END (id = 200026)
> THREAD END (id = 200024)
> THREAD END (id = 200011)
> THREAD END (id = 200009)
> THREAD END (id = 200010)
> THREAD END (id = 200030)
> THREAD END (id = 200028)
> TRACE 301281:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218)
>         sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
>         java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
>         kafka.utils.Utils$.read(Utils.scala:453)
>         kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
>         kafka.network.Receive$class.readCompletely(Transmission.scala:56)
>         kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
>         kafka.network.BlockingChannel.receive(BlockingChannel.scala:92)
>         kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:67)
>         kafka.consumer.SimpleConsumer.sendRequest(SimpleConsumer.scala:65)
>         kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:95)
>         kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:71)
>         kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
> TRACE 300427:
>         java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
>         java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>         java.net.ServerSocket.implAccept(ServerSocket.java:522)
>         java.net.ServerSocket.accept(ServerSocket.java:490)
>         sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
>         sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 300963:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         kafka.network.Processor.run(SocketServer.scala:202)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 300558:
>         java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
>         java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
>         java.net.ServerSocket.implAccept(ServerSocket.java:522)
>         java.net.ServerSocket.accept(ServerSocket.java:490)
>         sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
>         sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
>         sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 300978:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         kafka.network.Acceptor.run(SocketServer.scala:142)
>         java.lang.Thread.run(Thread.java:722)
> TRACE 301024:
>         sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
>         sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
>         sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
>         sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
>         sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
>         org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1134)
> TRACE 301830:
>         sun.misc.Unsafe.park(Unsafe.java:Unknown line)
>         java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
>         java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
>         java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
>         kafka.server.RequestPurgatory$ExpiredRequestReaper.pollExpired(RequestPurgatory.scala:282)
>         kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:235)
>         java.lang.Thread.run(Thread.java:722)
> CPU SAMPLES BEGIN (total = 1878003) Thu Aug 30 22:26:05 2012
> rank   self  accum   count trace method
>    1 36.22% 36.22%  680139 301281 sun.nio.ch.EPollArrayWrapper.epollWait
>    2 18.15% 54.36%  340770 300427 java.net.PlainSocketImpl.socketAccept
>    3 18.13% 72.49%  340458 300963 sun.nio.ch.EPollArrayWrapper.epollWait
>    4  9.07% 81.56%  170382 300558 java.net.PlainSocketImpl.socketAccept
>    5  9.07% 90.63%  170340 300978 sun.nio.ch.EPollArrayWrapper.epollWait
>    6  9.06% 99.69%  170139 301024 sun.nio.ch.EPollArrayWrapper.epollWait
>    7  0.18% 99.88%    3429 301830 sun.misc.Unsafe.park
> CPU SAMPLES END



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)