You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Jay Kreps (JIRA)" <ji...@apache.org> on 2012/08/31 18:42:08 UTC
[jira] [Created] (KAFKA-493) High CPU usage on inactive server
Jay Kreps created KAFKA-493:
-------------------------------
Summary: High CPU usage on inactive server
Key: KAFKA-493
URL: https://issues.apache.org/jira/browse/KAFKA-493
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.8
Reporter: Jay Kreps
Fix For: 0.8
> I've been playing with the 0.8 branch of Kafka and noticed that idle CPU usage is fairly high (13% of a
> core). Is that to be expected? I did look at the stack, but didn't see anything obvious. A background
> task?
> I wanted to mention how I am getting into this state. I've set up two machines with the latest 0.8
> code base and am using a replication factor of 2. On starting the brokers there is no idle CPU activity.
> Then I run a test that essential does 10k publish operations followed by immediate consume operations
> (I was measuring latency). Once this has run the kafka nodes seem to consistently be consuming CPU
> essentially forever.
hprof results:
THREAD START (obj=500003ae, id = 200004, name="RMI TCP Accept-0", group="system")
THREAD START (obj=500003ae, id = 200005, name="RMI TCP Accept-9999", group="system")
THREAD START (obj=500003ae, id = 200006, name="RMI TCP Accept-0", group="system")
THREAD START (obj=500003ae, id = 200001, name="main", group="main")
THREAD START (obj=500003ae, id = 200007, name="Thread-2", group="main")
THREAD START (obj=500003ae, id = 200008, name="Thread-3", group="main")
THREAD START (obj=500003ae, id = 200009, name="kafka-processor-9092-0", group="main")
THREAD START (obj=500003ae, id = 200010, name="kafka-processor-9092-1", group="main")
THREAD START (obj=500003ae, id = 200011, name="kafka-acceptor", group="main")
THREAD START (obj=5000074b, id = 200012, name="ZkClient-EventThread-20-localhost:2181", group="main")
THREAD START (obj=5000076e, id = 200014, name="main-SendThread()", group="main")
THREAD START (obj=5000076d, id = 200013, name="main-EventThread", group="main")
THREAD START (obj=500003ae, id = 200015, name="metrics-meter-tick-thread-1", group="main")
THREAD START (obj=500003ae, id = 200016, name="metrics-meter-tick-thread-2", group="main")
THREAD START (obj=500003ae, id = 200017, name="request-expiration-task", group="main")
THREAD START (obj=500003ae, id = 200018, name="request-expiration-task", group="main")
THREAD START (obj=500003ae, id = 200019, name="kafka-request-handler-0", group="main")
THREAD START (obj=500003ae, id = 200020, name="kafka-request-handler-1", group="main")
THREAD START (obj=500003ae, id = 200021, name="Thread-6", group="main")
THREAD START (obj=500003ae, id = 200022, name="Thread-7", group="main")
THREAD START (obj=50000899, id = 200023, name="ReplicaFetcherThread-0-2 on broker 1, ", group="main")
THREAD START (obj=50000899, id = 200024, name="ReplicaFetcherThread-0-3 on broker 1, ", group="main")
THREAD START (obj=50000899, id = 200025, name="ReplicaFetcherThread-0-0 on broker 1, ", group="main")
THREAD START (obj=50000899, id = 200026, name="ReplicaFetcherThread-0-1 on broker 1, ", group="main")
THREAD START (obj=500003ae, id = 200028, name="SIGINT handler", group="system")
THREAD START (obj=500003ae, id = 200029, name="Thread-5", group="main")
THREAD START (obj=5000074b, id = 200030, name="Thread-1", group="main")
THREAD START (obj=5000074b, id = 200031, name="Thread-0", group="main")
THREAD END (id = 200031)
THREAD END (id = 200029)
THREAD END (id = 200020)
THREAD END (id = 200019)
THREAD END (id = 200008)
THREAD END (id = 200021)
THREAD END (id = 200007)
THREAD END (id = 200022)
THREAD END (id = 200018)
THREAD END (id = 200017)
THREAD END (id = 200012)
THREAD END (id = 200013)
THREAD END (id = 200014)
THREAD END (id = 200025)
THREAD END (id = 200023)
THREAD END (id = 200026)
THREAD END (id = 200024)
THREAD END (id = 200011)
THREAD END (id = 200009)
THREAD END (id = 200010)
THREAD END (id = 200030)
THREAD END (id = 200028)
TRACE 301281:
sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218)
sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
kafka.utils.Utils$.read(Utils.scala:453)
kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
kafka.network.Receive$class.readCompletely(Transmission.scala:56)
kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
kafka.network.BlockingChannel.receive(BlockingChannel.scala:92)
kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:67)
kafka.consumer.SimpleConsumer.sendRequest(SimpleConsumer.scala:65)
kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:95)
kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:71)
kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
TRACE 300427:
java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
java.net.ServerSocket.implAccept(ServerSocket.java:522)
java.net.ServerSocket.accept(ServerSocket.java:490)
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
java.lang.Thread.run(Thread.java:722)
TRACE 300963:
sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
kafka.network.Processor.run(SocketServer.scala:202)
java.lang.Thread.run(Thread.java:722)
TRACE 300558:
java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
java.net.ServerSocket.implAccept(ServerSocket.java:522)
java.net.ServerSocket.accept(ServerSocket.java:490)
sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
java.lang.Thread.run(Thread.java:722)
TRACE 300978:
sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
kafka.network.Acceptor.run(SocketServer.scala:142)
java.lang.Thread.run(Thread.java:722)
TRACE 301024:
sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1134)
TRACE 301830:
sun.misc.Unsafe.park(Unsafe.java:Unknown line)
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
kafka.server.RequestPurgatory$ExpiredRequestReaper.pollExpired(RequestPurgatory.scala:282)
kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:235)
java.lang.Thread.run(Thread.java:722)
CPU SAMPLES BEGIN (total = 1878003) Thu Aug 30 22:26:05 2012
rank self accum count trace method
1 36.22% 36.22% 680139 301281 sun.nio.ch.EPollArrayWrapper.epollWait
2 18.15% 54.36% 340770 300427 java.net.PlainSocketImpl.socketAccept
3 18.13% 72.49% 340458 300963 sun.nio.ch.EPollArrayWrapper.epollWait
4 9.07% 81.56% 170382 300558 java.net.PlainSocketImpl.socketAccept
5 9.07% 90.63% 170340 300978 sun.nio.ch.EPollArrayWrapper.epollWait
6 9.06% 99.69% 170139 301024 sun.nio.ch.EPollArrayWrapper.epollWait
7 0.18% 99.88% 3429 301830 sun.misc.Unsafe.park
CPU SAMPLES END
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-493) High CPU usage on inactive server
Posted by "Sam Meder (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447955#comment-13447955 ]
Sam Meder commented on KAFKA-493:
---------------------------------
Attached a stacktrace, looks like there definitely are som runnable threads...
> High CPU usage on inactive server
> ---------------------------------
>
> Key: KAFKA-493
> URL: https://issues.apache.org/jira/browse/KAFKA-493
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8
> Reporter: Jay Kreps
> Fix For: 0.8
>
> Attachments: stacktrace.txt
>
>
> > I've been playing with the 0.8 branch of Kafka and noticed that idle CPU usage is fairly high (13% of a
> > core). Is that to be expected? I did look at the stack, but didn't see anything obvious. A background
> > task?
> > I wanted to mention how I am getting into this state. I've set up two machines with the latest 0.8
> > code base and am using a replication factor of 2. On starting the brokers there is no idle CPU activity.
> > Then I run a test that essential does 10k publish operations followed by immediate consume operations
> > (I was measuring latency). Once this has run the kafka nodes seem to consistently be consuming CPU
> > essentially forever.
> hprof results:
> THREAD START (obj=500003ae, id = 200004, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200005, name="RMI TCP Accept-9999", group="system")
> THREAD START (obj=500003ae, id = 200006, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200001, name="main", group="main")
> THREAD START (obj=500003ae, id = 200007, name="Thread-2", group="main")
> THREAD START (obj=500003ae, id = 200008, name="Thread-3", group="main")
> THREAD START (obj=500003ae, id = 200009, name="kafka-processor-9092-0", group="main")
> THREAD START (obj=500003ae, id = 200010, name="kafka-processor-9092-1", group="main")
> THREAD START (obj=500003ae, id = 200011, name="kafka-acceptor", group="main")
> THREAD START (obj=5000074b, id = 200012, name="ZkClient-EventThread-20-localhost:2181", group="main")
> THREAD START (obj=5000076e, id = 200014, name="main-SendThread()", group="main")
> THREAD START (obj=5000076d, id = 200013, name="main-EventThread", group="main")
> THREAD START (obj=500003ae, id = 200015, name="metrics-meter-tick-thread-1", group="main")
> THREAD START (obj=500003ae, id = 200016, name="metrics-meter-tick-thread-2", group="main")
> THREAD START (obj=500003ae, id = 200017, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200018, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200019, name="kafka-request-handler-0", group="main")
> THREAD START (obj=500003ae, id = 200020, name="kafka-request-handler-1", group="main")
> THREAD START (obj=500003ae, id = 200021, name="Thread-6", group="main")
> THREAD START (obj=500003ae, id = 200022, name="Thread-7", group="main")
> THREAD START (obj=50000899, id = 200023, name="ReplicaFetcherThread-0-2 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200024, name="ReplicaFetcherThread-0-3 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200025, name="ReplicaFetcherThread-0-0 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200026, name="ReplicaFetcherThread-0-1 on broker 1, ", group="main")
> THREAD START (obj=500003ae, id = 200028, name="SIGINT handler", group="system")
> THREAD START (obj=500003ae, id = 200029, name="Thread-5", group="main")
> THREAD START (obj=5000074b, id = 200030, name="Thread-1", group="main")
> THREAD START (obj=5000074b, id = 200031, name="Thread-0", group="main")
> THREAD END (id = 200031)
> THREAD END (id = 200029)
> THREAD END (id = 200020)
> THREAD END (id = 200019)
> THREAD END (id = 200008)
> THREAD END (id = 200021)
> THREAD END (id = 200007)
> THREAD END (id = 200022)
> THREAD END (id = 200018)
> THREAD END (id = 200017)
> THREAD END (id = 200012)
> THREAD END (id = 200013)
> THREAD END (id = 200014)
> THREAD END (id = 200025)
> THREAD END (id = 200023)
> THREAD END (id = 200026)
> THREAD END (id = 200024)
> THREAD END (id = 200011)
> THREAD END (id = 200009)
> THREAD END (id = 200010)
> THREAD END (id = 200030)
> THREAD END (id = 200028)
> TRACE 301281:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218)
> sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> kafka.utils.Utils$.read(Utils.scala:453)
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> kafka.network.BlockingChannel.receive(BlockingChannel.scala:92)
> kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:67)
> kafka.consumer.SimpleConsumer.sendRequest(SimpleConsumer.scala:65)
> kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:95)
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:71)
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
> TRACE 300427:
> java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> java.net.ServerSocket.implAccept(ServerSocket.java:522)
> java.net.ServerSocket.accept(ServerSocket.java:490)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300963:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> kafka.network.Processor.run(SocketServer.scala:202)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300558:
> java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> java.net.ServerSocket.implAccept(ServerSocket.java:522)
> java.net.ServerSocket.accept(ServerSocket.java:490)
> sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300978:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> kafka.network.Acceptor.run(SocketServer.scala:142)
> java.lang.Thread.run(Thread.java:722)
> TRACE 301024:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1134)
> TRACE 301830:
> sun.misc.Unsafe.park(Unsafe.java:Unknown line)
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
> kafka.server.RequestPurgatory$ExpiredRequestReaper.pollExpired(RequestPurgatory.scala:282)
> kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:235)
> java.lang.Thread.run(Thread.java:722)
> CPU SAMPLES BEGIN (total = 1878003) Thu Aug 30 22:26:05 2012
> rank self accum count trace method
> 1 36.22% 36.22% 680139 301281 sun.nio.ch.EPollArrayWrapper.epollWait
> 2 18.15% 54.36% 340770 300427 java.net.PlainSocketImpl.socketAccept
> 3 18.13% 72.49% 340458 300963 sun.nio.ch.EPollArrayWrapper.epollWait
> 4 9.07% 81.56% 170382 300558 java.net.PlainSocketImpl.socketAccept
> 5 9.07% 90.63% 170340 300978 sun.nio.ch.EPollArrayWrapper.epollWait
> 6 9.06% 99.69% 170139 301024 sun.nio.ch.EPollArrayWrapper.epollWait
> 7 0.18% 99.88% 3429 301830 sun.misc.Unsafe.park
> CPU SAMPLES END
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (KAFKA-493) High CPU usage on inactive server
Posted by "Sam Meder (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Sam Meder updated KAFKA-493:
----------------------------
Attachment: stacktrace.txt
> High CPU usage on inactive server
> ---------------------------------
>
> Key: KAFKA-493
> URL: https://issues.apache.org/jira/browse/KAFKA-493
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8
> Reporter: Jay Kreps
> Fix For: 0.8
>
> Attachments: stacktrace.txt
>
>
> > I've been playing with the 0.8 branch of Kafka and noticed that idle CPU usage is fairly high (13% of a
> > core). Is that to be expected? I did look at the stack, but didn't see anything obvious. A background
> > task?
> > I wanted to mention how I am getting into this state. I've set up two machines with the latest 0.8
> > code base and am using a replication factor of 2. On starting the brokers there is no idle CPU activity.
> > Then I run a test that essential does 10k publish operations followed by immediate consume operations
> > (I was measuring latency). Once this has run the kafka nodes seem to consistently be consuming CPU
> > essentially forever.
> hprof results:
> THREAD START (obj=500003ae, id = 200004, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200005, name="RMI TCP Accept-9999", group="system")
> THREAD START (obj=500003ae, id = 200006, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200001, name="main", group="main")
> THREAD START (obj=500003ae, id = 200007, name="Thread-2", group="main")
> THREAD START (obj=500003ae, id = 200008, name="Thread-3", group="main")
> THREAD START (obj=500003ae, id = 200009, name="kafka-processor-9092-0", group="main")
> THREAD START (obj=500003ae, id = 200010, name="kafka-processor-9092-1", group="main")
> THREAD START (obj=500003ae, id = 200011, name="kafka-acceptor", group="main")
> THREAD START (obj=5000074b, id = 200012, name="ZkClient-EventThread-20-localhost:2181", group="main")
> THREAD START (obj=5000076e, id = 200014, name="main-SendThread()", group="main")
> THREAD START (obj=5000076d, id = 200013, name="main-EventThread", group="main")
> THREAD START (obj=500003ae, id = 200015, name="metrics-meter-tick-thread-1", group="main")
> THREAD START (obj=500003ae, id = 200016, name="metrics-meter-tick-thread-2", group="main")
> THREAD START (obj=500003ae, id = 200017, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200018, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200019, name="kafka-request-handler-0", group="main")
> THREAD START (obj=500003ae, id = 200020, name="kafka-request-handler-1", group="main")
> THREAD START (obj=500003ae, id = 200021, name="Thread-6", group="main")
> THREAD START (obj=500003ae, id = 200022, name="Thread-7", group="main")
> THREAD START (obj=50000899, id = 200023, name="ReplicaFetcherThread-0-2 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200024, name="ReplicaFetcherThread-0-3 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200025, name="ReplicaFetcherThread-0-0 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200026, name="ReplicaFetcherThread-0-1 on broker 1, ", group="main")
> THREAD START (obj=500003ae, id = 200028, name="SIGINT handler", group="system")
> THREAD START (obj=500003ae, id = 200029, name="Thread-5", group="main")
> THREAD START (obj=5000074b, id = 200030, name="Thread-1", group="main")
> THREAD START (obj=5000074b, id = 200031, name="Thread-0", group="main")
> THREAD END (id = 200031)
> THREAD END (id = 200029)
> THREAD END (id = 200020)
> THREAD END (id = 200019)
> THREAD END (id = 200008)
> THREAD END (id = 200021)
> THREAD END (id = 200007)
> THREAD END (id = 200022)
> THREAD END (id = 200018)
> THREAD END (id = 200017)
> THREAD END (id = 200012)
> THREAD END (id = 200013)
> THREAD END (id = 200014)
> THREAD END (id = 200025)
> THREAD END (id = 200023)
> THREAD END (id = 200026)
> THREAD END (id = 200024)
> THREAD END (id = 200011)
> THREAD END (id = 200009)
> THREAD END (id = 200010)
> THREAD END (id = 200030)
> THREAD END (id = 200028)
> TRACE 301281:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218)
> sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> kafka.utils.Utils$.read(Utils.scala:453)
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> kafka.network.BlockingChannel.receive(BlockingChannel.scala:92)
> kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:67)
> kafka.consumer.SimpleConsumer.sendRequest(SimpleConsumer.scala:65)
> kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:95)
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:71)
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
> TRACE 300427:
> java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> java.net.ServerSocket.implAccept(ServerSocket.java:522)
> java.net.ServerSocket.accept(ServerSocket.java:490)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300963:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> kafka.network.Processor.run(SocketServer.scala:202)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300558:
> java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> java.net.ServerSocket.implAccept(ServerSocket.java:522)
> java.net.ServerSocket.accept(ServerSocket.java:490)
> sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300978:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> kafka.network.Acceptor.run(SocketServer.scala:142)
> java.lang.Thread.run(Thread.java:722)
> TRACE 301024:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1134)
> TRACE 301830:
> sun.misc.Unsafe.park(Unsafe.java:Unknown line)
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
> kafka.server.RequestPurgatory$ExpiredRequestReaper.pollExpired(RequestPurgatory.scala:282)
> kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:235)
> java.lang.Thread.run(Thread.java:722)
> CPU SAMPLES BEGIN (total = 1878003) Thu Aug 30 22:26:05 2012
> rank self accum count trace method
> 1 36.22% 36.22% 680139 301281 sun.nio.ch.EPollArrayWrapper.epollWait
> 2 18.15% 54.36% 340770 300427 java.net.PlainSocketImpl.socketAccept
> 3 18.13% 72.49% 340458 300963 sun.nio.ch.EPollArrayWrapper.epollWait
> 4 9.07% 81.56% 170382 300558 java.net.PlainSocketImpl.socketAccept
> 5 9.07% 90.63% 170340 300978 sun.nio.ch.EPollArrayWrapper.epollWait
> 6 9.06% 99.69% 170139 301024 sun.nio.ch.EPollArrayWrapper.epollWait
> 7 0.18% 99.88% 3429 301830 sun.misc.Unsafe.park
> CPU SAMPLES END
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (KAFKA-493) High CPU usage on inactive server
Posted by "Jay Kreps (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/KAFKA-493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13447991#comment-13447991 ]
Jay Kreps commented on KAFKA-493:
---------------------------------
I think the root cause is this jdk bug:
http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6403933
Basically the select loop in NIO can end up with a socket registered in some kind of error state so that the select loop always returns immediately, which results in a kind of busy wait for something that should be blocking.
I suspect that all our NIO usage has the possibility of this occurring, but something about the fetcher thread seems to make it happen much more easily.
I think this is the same root cause issue as KAFKA-479.
> High CPU usage on inactive server
> ---------------------------------
>
> Key: KAFKA-493
> URL: https://issues.apache.org/jira/browse/KAFKA-493
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8
> Reporter: Jay Kreps
> Fix For: 0.8
>
> Attachments: stacktrace.txt
>
>
> > I've been playing with the 0.8 branch of Kafka and noticed that idle CPU usage is fairly high (13% of a
> > core). Is that to be expected? I did look at the stack, but didn't see anything obvious. A background
> > task?
> > I wanted to mention how I am getting into this state. I've set up two machines with the latest 0.8
> > code base and am using a replication factor of 2. On starting the brokers there is no idle CPU activity.
> > Then I run a test that essential does 10k publish operations followed by immediate consume operations
> > (I was measuring latency). Once this has run the kafka nodes seem to consistently be consuming CPU
> > essentially forever.
> hprof results:
> THREAD START (obj=500003ae, id = 200004, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200005, name="RMI TCP Accept-9999", group="system")
> THREAD START (obj=500003ae, id = 200006, name="RMI TCP Accept-0", group="system")
> THREAD START (obj=500003ae, id = 200001, name="main", group="main")
> THREAD START (obj=500003ae, id = 200007, name="Thread-2", group="main")
> THREAD START (obj=500003ae, id = 200008, name="Thread-3", group="main")
> THREAD START (obj=500003ae, id = 200009, name="kafka-processor-9092-0", group="main")
> THREAD START (obj=500003ae, id = 200010, name="kafka-processor-9092-1", group="main")
> THREAD START (obj=500003ae, id = 200011, name="kafka-acceptor", group="main")
> THREAD START (obj=5000074b, id = 200012, name="ZkClient-EventThread-20-localhost:2181", group="main")
> THREAD START (obj=5000076e, id = 200014, name="main-SendThread()", group="main")
> THREAD START (obj=5000076d, id = 200013, name="main-EventThread", group="main")
> THREAD START (obj=500003ae, id = 200015, name="metrics-meter-tick-thread-1", group="main")
> THREAD START (obj=500003ae, id = 200016, name="metrics-meter-tick-thread-2", group="main")
> THREAD START (obj=500003ae, id = 200017, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200018, name="request-expiration-task", group="main")
> THREAD START (obj=500003ae, id = 200019, name="kafka-request-handler-0", group="main")
> THREAD START (obj=500003ae, id = 200020, name="kafka-request-handler-1", group="main")
> THREAD START (obj=500003ae, id = 200021, name="Thread-6", group="main")
> THREAD START (obj=500003ae, id = 200022, name="Thread-7", group="main")
> THREAD START (obj=50000899, id = 200023, name="ReplicaFetcherThread-0-2 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200024, name="ReplicaFetcherThread-0-3 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200025, name="ReplicaFetcherThread-0-0 on broker 1, ", group="main")
> THREAD START (obj=50000899, id = 200026, name="ReplicaFetcherThread-0-1 on broker 1, ", group="main")
> THREAD START (obj=500003ae, id = 200028, name="SIGINT handler", group="system")
> THREAD START (obj=500003ae, id = 200029, name="Thread-5", group="main")
> THREAD START (obj=5000074b, id = 200030, name="Thread-1", group="main")
> THREAD START (obj=5000074b, id = 200031, name="Thread-0", group="main")
> THREAD END (id = 200031)
> THREAD END (id = 200029)
> THREAD END (id = 200020)
> THREAD END (id = 200019)
> THREAD END (id = 200008)
> THREAD END (id = 200021)
> THREAD END (id = 200007)
> THREAD END (id = 200022)
> THREAD END (id = 200018)
> THREAD END (id = 200017)
> THREAD END (id = 200012)
> THREAD END (id = 200013)
> THREAD END (id = 200014)
> THREAD END (id = 200025)
> THREAD END (id = 200023)
> THREAD END (id = 200026)
> THREAD END (id = 200024)
> THREAD END (id = 200011)
> THREAD END (id = 200009)
> THREAD END (id = 200010)
> THREAD END (id = 200030)
> THREAD END (id = 200028)
> TRACE 301281:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:218)
> sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> kafka.utils.Utils$.read(Utils.scala:453)
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> kafka.network.BlockingChannel.receive(BlockingChannel.scala:92)
> kafka.consumer.SimpleConsumer.liftedTree1$1(SimpleConsumer.scala:67)
> kafka.consumer.SimpleConsumer.sendRequest(SimpleConsumer.scala:65)
> kafka.consumer.SimpleConsumer.fetch(SimpleConsumer.scala:95)
> kafka.server.AbstractFetcherThread.doWork(AbstractFetcherThread.scala:71)
> kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:50)
> TRACE 300427:
> java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> java.net.ServerSocket.implAccept(ServerSocket.java:522)
> java.net.ServerSocket.accept(ServerSocket.java:490)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300963:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> kafka.network.Processor.run(SocketServer.scala:202)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300558:
> java.net.PlainSocketImpl.socketAccept(PlainSocketImpl.java:Unknown line)
> java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:398)
> java.net.ServerSocket.implAccept(ServerSocket.java:522)
> java.net.ServerSocket.accept(ServerSocket.java:490)
> sun.management.jmxremote.LocalRMIServerSocketFactory$1.accept(LocalRMIServerSocketFactory.java:52)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.executeAcceptLoop(TCPTransport.java:387)
> sun.rmi.transport.tcp.TCPTransport$AcceptLoop.run(TCPTransport.java:359)
> java.lang.Thread.run(Thread.java:722)
> TRACE 300978:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> kafka.network.Acceptor.run(SocketServer.scala:142)
> java.lang.Thread.run(Thread.java:722)
> TRACE 301024:
> sun.nio.ch.EPollArrayWrapper.epollWait(EPollArrayWrapper.java:Unknown line)
> sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:228)
> sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:81)
> sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:87)
> sun.nio.ch.SelectorImpl.select(SelectorImpl.java:98)
> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1134)
> TRACE 301830:
> sun.misc.Unsafe.park(Unsafe.java:Unknown line)
> java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:226)
> java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2082)
> java.util.concurrent.DelayQueue.take(DelayQueue.java:220)
> kafka.server.RequestPurgatory$ExpiredRequestReaper.pollExpired(RequestPurgatory.scala:282)
> kafka.server.RequestPurgatory$ExpiredRequestReaper.run(RequestPurgatory.scala:235)
> java.lang.Thread.run(Thread.java:722)
> CPU SAMPLES BEGIN (total = 1878003) Thu Aug 30 22:26:05 2012
> rank self accum count trace method
> 1 36.22% 36.22% 680139 301281 sun.nio.ch.EPollArrayWrapper.epollWait
> 2 18.15% 54.36% 340770 300427 java.net.PlainSocketImpl.socketAccept
> 3 18.13% 72.49% 340458 300963 sun.nio.ch.EPollArrayWrapper.epollWait
> 4 9.07% 81.56% 170382 300558 java.net.PlainSocketImpl.socketAccept
> 5 9.07% 90.63% 170340 300978 sun.nio.ch.EPollArrayWrapper.epollWait
> 6 9.06% 99.69% 170139 301024 sun.nio.ch.EPollArrayWrapper.epollWait
> 7 0.18% 99.88% 3429 301830 sun.misc.Unsafe.park
> CPU SAMPLES END
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira