Posted to user@cassandra.apache.org by Joseph Mermelstein <jo...@liveperson.com> on 2010/09/16 11:03:46 UTC

busy thread on IncomingStreamReader

Hi - has anyone made any progress with this issue? We are having the same
problem with our Cassandra nodes in production. At some point a node (and
sometimes all 3) will jump to 100% CPU usage and stay there for hours until
restarted. Stack traces reveal several threads in a seemingly endless loop
doing this:

"Thread-21770" - Thread t@25278
   java.lang.Thread.State: RUNNABLE
        at sun.nio.ch.FileChannelImpl.size0(Native Method)
        at sun.nio.ch.FileChannelImpl.size(Unknown Source)
        - locked java.lang.Object@7a2c843d
        at sun.nio.ch.FileChannelImpl.transferFrom(Unknown Source)
        at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
        at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
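
For anyone trying to catch the same symptom, here is a minimal sketch of ranking threads by CPU time with the standard java.lang.management API, so the spinning threads can be picked out of a large dump. The class and method names (HotThreads, topByCpu) are hypothetical, and the code only inspects the JVM it runs in; pointing it at a Cassandra node would mean obtaining the same ThreadMXBean over a remote JMX connection, which is not shown.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class HotThreads {
    /** Returns up to 'limit' live threads, ordered by consumed CPU time (highest first). */
    static List<ThreadInfo> topByCpu(int limit, int maxDepth) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<ThreadInfo> result = new ArrayList<>();
        if (!mx.isThreadCpuTimeSupported() || !mx.isThreadCpuTimeEnabled()) {
            return result; // CPU time measurement is unavailable on this JVM
        }
        // Snapshot CPU time per thread id; -1 means the thread is no longer alive.
        Map<Long, Long> cpu = new HashMap<>();
        for (long id : mx.getAllThreadIds()) {
            cpu.put(id, mx.getThreadCpuTime(id));
        }
        List<Long> ids = new ArrayList<>(cpu.keySet());
        ids.sort((a, b) -> Long.compare(cpu.get(b), cpu.get(a)));
        for (Long id : ids) {
            if (result.size() >= limit) break;
            ThreadInfo info = mx.getThreadInfo(id, maxDepth);
            if (info != null) result.add(info); // null if the thread died meanwhile
        }
        return result;
    }

    public static void main(String[] args) {
        for (ThreadInfo info : topByCpu(3, 8)) {
            System.out.println("\"" + info.getThreadName() + "\" " + info.getThreadState());
            for (StackTraceElement frame : info.getStackTrace()) {
                System.out.println("        at " + frame);
            }
        }
    }
}
```

Running it twice a few seconds apart and comparing the top entries gives a rough idea of which threads are burning CPU continuously rather than just having accumulated time earlier.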


My understanding from reading the code is that this trace shows a thread
belonging to the StreamingService which is writing an incoming stream to
disk. There seems to be some kind of bizarre problem which is causing the
FileChannel.size() method to spin with high CPU.
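
To make the suspicion concrete, here is a minimal sketch of one way a receive loop built on FileChannel.transferFrom could spin. It is not taken from the Cassandra source; it assumes the reader simply loops until an expected byte count has arrived, and every name in it (TransferLoopSketch, receive, maxStalls) is hypothetical. transferFrom returns the number of bytes transferred, which may be zero, for example when the source channel has reached end-of-stream. A loop that only adds that return value to a counter would then call transferFrom (and, judging by the frames above, FileChannel.size() inside it) forever. The sketch adds a stall counter so the loop gives up instead.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.ReadableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class TransferLoopSketch {
    /**
     * Copies up to expectedBytes from source into file, giving up once
     * transferFrom has made no progress maxStalls times in a row.
     * Without the stall check, a source that ends early keeps this loop
     * running at full CPU because transferFrom keeps returning 0.
     */
    static long receive(ReadableByteChannel source, FileChannel file,
                        long expectedBytes, int chunkSize, int maxStalls)
            throws IOException {
        long bytesRead = 0;
        int stalls = 0;
        while (bytesRead < expectedBytes) {
            long n = file.transferFrom(source, bytesRead, chunkSize);
            if (n == 0) {
                if (++stalls >= maxStalls) break; // no progress: stop spinning
            } else {
                stalls = 0;
                bytesRead += n;
            }
        }
        return bytesRead;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("stream", ".dat");
        // The source delivers only 1000 bytes although 4096 are expected,
        // standing in for a sender that went away mid-transfer.
        try (FileChannel file = FileChannel.open(tmp, StandardOpenOption.WRITE);
             ReadableByteChannel source =
                     Channels.newChannel(new ByteArrayInputStream(new byte[1000]))) {
            long got = receive(source, file, 4096, 256, 3);
            System.out.println("received " + got + " of 4096 expected bytes");
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The sketch uses java.nio.file, so it needs Java 7 or later. Whether this is what actually happens inside IncomingStreamReader is only a guess based on the stack trace; a real fix would more likely treat a stalled transfer as an error than silently return a short count.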

Also, this problem is not easy to replicate - so I would appreciate any
information on how the StreamingService works and what triggers it to
transfer these file streams.

Thanks,

Joseph Mermelstein
LivePerson http://solutions.liveperson.com



> Hi all,
>
> We set up two nodes and simply set replication factor=2 for a test run.
>
> After both nodes, say node A and node B, had been serving for several hours,
> we found that node A always stays at 300% CPU usage
> (the other node is under 100% CPU, which is normal).
>
> A thread dump on node A shows that there are 3 busy threads related to
> IncomingStreamReader:
>
> ==========================
>
> "Thread-66" prio=10 tid=0x00002aade4018800 nid=0x69e7 runnable [0x000000004030a000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.misc.Unsafe.setMemory(Native Method)
>         at sun.nio.ch.Util.erase(Util.java:202)
>         at sun.nio.ch.FileChannelImpl.transferFromArbitraryChannel(FileChannelImpl.java:560)
>         at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:603)
>         at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
>         at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
>
> "Thread-65" prio=10 tid=0x00002aade4017000 nid=0x69e6 runnable [0x000000004d44b000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.misc.Unsafe.setMemory(Native Method)
>         at sun.nio.ch.Util.erase(Util.java:202)
>         at sun.nio.ch.FileChannelImpl.transferFromArbitraryChannel(FileChannelImpl.java:560)
>         at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:603)
>         at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
>         at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
>
> "Thread-62" prio=10 tid=0x00002aade4014800 nid=0x4150 runnable [0x000000004d34a000]
>    java.lang.Thread.State: RUNNABLE
>         at sun.nio.ch.FileChannelImpl.size0(Native Method)
>         at sun.nio.ch.FileChannelImpl.size(FileChannelImpl.java:309)
>         - locked <0x00002aaac450dcd0> (a java.lang.Object)
>         at sun.nio.ch.FileChannelImpl.transferFrom(FileChannelImpl.java:597)
>         at org.apache.cassandra.streaming.IncomingStreamReader.read(IncomingStreamReader.java:62)
>         at org.apache.cassandra.net.IncomingTcpConnection.run(IncomingTcpConnection.java:66)
>
> ===========================
>
> Has anyone experienced a similar issue?
>
> Environment:
>
> OS   --- CentOS 5.4, Linux 2.6.18-164.15.1.el5 SMP x86_64 GNU/Linux
> Java --- build 1.6.0_16-b01, Java HotSpot(TM) 64-Bit Server VM (build 14.2-b01, mixed mode)
> Cassandra --- 0.6.0
> Node configuration --- node A and node B; both nodes use node A as seed
> Client --- Java Thrift clients pick one node at random for reads and writes
>
> --
> Ingram Chen
> online share order: http://dinbendon.net
> blog: http://www.javaworld.com.tw/roller/page/ingramchen

Re: busy thread on IncomingStreamReader

Posted by Jonathan Ellis <jb...@gmail.com>.
Are you on the most recent version of the JVM?  There have been bugs
fixed in FileChannel over the 1.6 lifespan.




-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com