Posted to user@cassandra.apache.org by Keith Wright <kw...@nanigans.com> on 2013/08/21 02:32:50 UTC

Nodes get stuck

Hi all,

    We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior recently where 3 of our nodes get locked up under high load in what appears to be a GC spiral while the rest of the cluster (7 total nodes) appears fine.  When I run tpstats, I see the following (assuming tpstats returns at all) and top shows cassandra pegged at 2000%.  Obviously we have a large number of blocked reads.  In the past I could explain this due to unexpectedly wide rows, however we have handled that.  When the cluster starts to melt down like this it's hard to get visibility into what's going on and what triggered the issue, as everything starts to pile up.  OpsCenter becomes unusable, and because the affected nodes are under GC pressure, getting any data via nodetool or JMX is also difficult.  What do people do to handle these situations?  We are going to start graphing reads/writes/sec/CF to Ganglia in the hopes that it helps.

Thanks

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                       256       381     1245117434         0                 0
RequestResponseStage              0         0     1161495947         0                 0
MutationStage                     8         8      481721887         0                 0
ReadRepairStage                   0         0       85770600         0                 0
ReplicateOnWriteStage             0         0       21896804         0                 0
GossipStage                       0         0        1546196         0                 0
AntiEntropyStage                  0         0           5009         0                 0
MigrationStage                    0         0           1082         0                 0
MemtablePostFlusher               0         0          10178         0                 0
FlushWriter                       0         0           6081         0              2075
MiscStage                         0         0             57         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               0         0              0         0                 0
InternalResponseStage             0         0              6         0                 0
HintedHandoff                     1         1            246         0                 0

Message type           Dropped
RANGE_SLICE                482
READ_REPAIR                  0
BINARY                       0
READ                    515762
MUTATION                    39
_TRACE                       0
REQUEST_RESPONSE            29
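
One low-tech way to keep some visibility when OpsCenter and JMX become unresponsive is to snapshot nodetool output to local files on a timer, so there is a record to inspect after the fact. A minimal sketch, assuming nodetool is on the PATH (the output directory and interval are arbitrary):

#!/bin/sh
# Capture thread-pool stats and compaction activity every 30 seconds so a
# local record survives even if JMX later stops responding.
OUT=/var/tmp/cassandra-diag
mkdir -p "$OUT"
while true; do
    TS=$(date +%Y%m%d-%H%M%S)
    nodetool tpstats > "$OUT/tpstats-$TS.txt" 2>&1
    nodetool compactionstats > "$OUT/compactionstats-$TS.txt" 2>&1
    sleep 30
done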


Re: Nodes get stuck

Posted by Robert Coli <rc...@eventbrite.com>.
On Wed, Aug 21, 2013 at 10:47 AM, Robert Coli <rc...@eventbrite.com> wrote:

> The most obvious answer is that somehow the problem nodes hit a magical
> threshold which makes them "thrash" with GC.
>
> If you restart the affected nodes, does the error condition return? If
> so, how quickly?
>

Lol, missed the rest of the responses on thread. NVMD. :D

=Rob

Re: Nodes get stuck

Posted by Robert Coli <rc...@eventbrite.com>.
On Tue, Aug 20, 2013 at 11:35 PM, Keith Wright <kw...@nanigans.com> wrote:

> Still looking for help!  We have stopped almost ALL traffic to the cluster
> and still some nodes are showing almost 1000% CPU for cassandra with no
> iostat activity.   We were running cleanup on one of the nodes that was not
> showing load spikes however now when I attempt to stop cleanup there via
> nodetool stop cleanup the java task for stopping cleanup itself is at 1500%
> and has not returned after 2 minutes.  This is VERY odd behavior.  Any
> ideas?  Hardware failure?  Network?  We are not seeing anything there but
> wanted to get ideas.
>

The most obvious answer is that somehow the problem nodes hit a magical
threshold which makes them "thrash" with GC.

If you restart the affected nodes, does the error condition return? If so,
how quickly?

=Rob

Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
Cfstats from an unaffected node:

Column Family: global_user
SSTable count: 8611
SSTables in each level: [1, 10, 107/100, 479, 8014, 0, 0]
Space used (live): 92459971645
Space used (total): 92525877061
Number of Keys (estimate): 605913600
Memtable Columns Count: 140236
Memtable Data Size: 71389685
Memtable Switch Count: 232
Read Count: 67335340
Read Latency: 1.523 ms.
Write Count: 50975221
Write Latency: 0.026 ms.
Pending Tasks: 0
Bloom Filter False Positives: 2981447
Bloom Filter False Ratio: 0.41085
Bloom Filter Space Used: 416240904
Compacted row minimum size: 73
Compacted row maximum size: 4866323
Compacted row mean size: 325

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:00 AM
To: Cassandra Users <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck


That is a huge number of sstables. Does cfstats show a similar count on the other nodes?

On Aug 21, 2013 7:32 AM, "Keith Wright" <kw...@nanigans.com> wrote:
Thank you for responding.  I did a quick look and my mutation stage threads are currently in TIMED_WAITING (as expected since tpstats shows no active or pending), however most of my read stage threads are RUNNABLE with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone has any ideas, since we are currently in a production-down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)
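
For reference, a rough way to tally read stage thread states from a saved jstack dump like the ones above (the dump file path is an assumption):

# jstack prints each thread's state on the line after its header, so count
# ReadStage threads by state:
grep -A 1 '"ReadStage:' /tmp/cassandra-threads.txt \
    | grep 'java.lang.Thread.State' | sort | uniq -c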


From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machines that has suspiciously high CPU might help figure out what is taking all that CPU.
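
For example, a minimal sketch of grabbing such a dump (the PID discovery and output path are assumptions; run as the user that owns the Cassandra process):

# Attach to the Cassandra JVM and write all thread stacks to a file.
PID=$(pgrep -f CassandraDaemon | head -1)
jstack "$PID" > /tmp/cassandra-threads.txt
# If jstack cannot attach under heavy GC pressure, SIGQUIT makes the JVM
# print the thread dump to its own stdout log instead:
kill -3 "$PID"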


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
Some last-minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (we can usually do closer to 20K reads at least) and seeing CPU load as follows for the nodes (with some ParNew figures to give an idea of GC):

001 – 1200%   (Par New at 120 ms / sec)
002 – 6% (Par New at 0)
003 – 600% (Par New at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%
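
As a cross-check on those ParNew figures, sampling GC activity directly on a node is one option; a sketch, assuming the standard JDK tools are installed:

# Print GC/heap-occupancy columns once per second; rapidly climbing YGC/YGCT
# (young collections) or FGC/FGCT (full collections) points at a GC spiral.
PID=$(pgrep -f CassandraDaemon | head -1)
jstat -gcutil "$PID" 1000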

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below; note that we are using LCS.  I have brought the odd cfhistograms behavior to the thread before and am not sure what's going on there.  We are in a production-down situation right now, so any help would be much appreciated!!!

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more





Re: Nodes get stuck

Posted by Nate McCall <na...@thelastpickle.com>.
That is a huge number of sstables. Does cfstats show a similar count on the
other nodes?
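
One quick way to spot-check that across the cluster might be a loop like the following (the hostnames are placeholders, and the grep keys off the cfstats output shown earlier in the thread):

# Pull the SSTable count for the suspect column family from every node.
for h in lxpcas001 lxpcas002 lxpcas003; do
    echo "== $h =="
    nodetool -h "$h" cfstats | grep -A 1 'Column Family: global_user' \
        | grep 'SSTable count'
done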

Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
Hi all.  FYI we upgraded our cluster to 1.2.8 and things are now stable.  Thank you all for your assistance!!!

From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 11:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck


    In other words, is it expected that the same interval tree issue would occur during compactions?

Yep.


Thanks

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:48 AM

To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Are many people running 1.2.8?  Any issues?  Just nervous about running on the latest; I would prefer to be a couple of versions behind, as new bugs do tend to pop up.

Thanks all

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:33 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

We use 128m and gc_grace of 300 on a CF with highly transient data (it holds values for client locking via a wait-chain algorithm implementation in Hector).

Upgrades within the same minor version should be painless and doable with no downtime.
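For reference, applying those two settings to the CF discussed in this thread would look something like the following in cqlsh (a minimal sketch using Nate's values, assuming the table is manageable via CQL3; tune the numbers to your own workload):

    cqlsh> ALTER TABLE users.global_user
       ... WITH gc_grace_seconds = 300
       ... AND compaction = {'class': 'LeveledCompactionStrategy',
       ...                   'sstable_size_in_mb': 128};

Lowering gc_grace_seconds only pays off once compaction actually runs over the tombstoned data, so it pairs with the compaction discussion below.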


On Wed, Aug 21, 2013 at 8:28 AM, Keith Wright <kw...@nanigans.com> wrote:
We have our LCS sstable size at 64 MB and gc grace at 86400.  May I ask what values you use?  I saw that in 2.0 they are setting the LCS default sstable size to 160 MB.

Does anyone see any risk in upgrading from 1.2.4 to 1.2.8?  The upgrade steps do not appear to mention any required actions, and it sounds like a rolling upgrade should be safe.
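For what it's worth, a rolling minor-version upgrade is typically just the following loop, one node at a time (a sketch; service names and package steps are assumptions about your particular install):

    nodetool drain               # flush memtables; node stops accepting writes
    sudo service cassandra stop
    # swap in the 1.2.8 package / tarball here
    sudo service cassandra start
    # watch the log, confirm the node is back Up/Normal, then do the next one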

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:07 AM
To: Cassandra Users <us...@cassandra.apache.org>

Subject: Re: Nodes get stuck


Hit send before I saw your update. Yeah, turn gc grace way down. You can turn your level size up a lot as well.

On Aug 21, 2013 7:55 AM, "Keith Wright" <kw...@nanigans.com> wrote:
So the stack appears to be related to walking tombstones for a fetch.  Can you please give me your take on whether this is a plausible explanation:

 *   Given our data model, we can experience wide rows.  We protect against these by randomly reading a portion on write and, if the size is beyond a certain threshold, deleting data
 *   This worked VERY well for some time, however perhaps we hit a row that we deleted and that has many tombstones.  The row is being requested frequently, so Cassandra is working very hard to process through all of its tombstones (currently RF-many nodes are at high load, which again suggests this).

The question is what to do about it?  This is an LCS table with gc grace seconds at 86400.  I assume my only options are to force a major compaction via nodetool compact or to run upgradesstables?  How can I validate that this is the cause?  How can I prevent it going forward?  Set the gc grace seconds to a much lower value for that table?
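To make the options concrete, they map to nodetool roughly as follows (a sketch against the 1.2-era nodetool, using the keyspace/CF names that appear later in this thread):

    # Compact just the hot CF so tombstones older than gc_grace_seconds
    # become eligible for removal:
    nodetool compact users global_user

    # Or rewrite each SSTable in place:
    nodetool upgradesstables users global_user

Keep in mind that LCS has no classic major compaction the way size-tiered does, so tombstone purging can still be gradual.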

Thanks all!

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 8:31 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Thank you for responding.  I did a quick look and my mutation stage threads are currently in TIMED_WAITING (as expected, since tpstats shows no active or pending tasks), however most of my read stage threads are RUNNABLE with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas, since we are currently in a production-down state.
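As an aside, once a dump is saved to a file, a quick tally of thread states is handy (a sketch that assumes jstack-style output in a hypothetical td.txt, where each thread-name line is followed by a java.lang.Thread.State line):

    grep -A1 '^"ReadStage' td.txt | grep 'java.lang.Thread.State' | sort | uniq -c | sort -rn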

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)


From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machines with suspiciously high CPU might help figure out what it is that is taking all that CPU.
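For example (a sketch; it assumes the JDK's jstack is on the PATH and that the Cassandra process is identifiable by its main class):

    # take a few dumps a couple of seconds apart to spot threads that stay busy
    pid=$(pgrep -f CassandraDaemon | head -1)
    for i in 1 2 3; do jstack -l "$pid" > /tmp/cassandra-td.$i.txt; sleep 2; done

If the JVM is too unresponsive for jstack, kill -3 <pid> will write the dump to Cassandra's stdout log instead.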


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
Some last-minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (we can usually do at least 20K reads) and are seeing CPU load as follows for the nodes (with some ParNew numbers to give an idea of GC):

001 – 1200%   (Par New at 120 ms / sec)
002 – 6% (Par New at 0)
003 – 600% (Par New at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%

There are no compactions running on 001, however I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below; note that we are using LCS.  I have brought the odd cfhistograms behavior to the thread before and am not sure what's going on there.  We are in a production-down situation right now so any help would be much appreciated!!!
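One way to pull just that CF's block out of the full cfstats output on each node is a plain grep (a sketch; size the -A count to cover the stats block shown below):

    nodetool cfstats | grep -A 20 'Column Family: global_user'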

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more


[...]





Re: Nodes get stuck

Posted by Sylvain Lebresne <sy...@datastax.com>.
>     In other words, is it expected that the same interval tree issue would
> occur during compactions?
>

Yep.


>
> Thanks
> [...]
>>>>
>>>> From: Keith Wright <kw...@nanigans.com>
>>>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>> Date: Wednesday, August 21, 2013 2:35 AM
>>>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>> Subject: Re: Nodes get stuck
>>>>
>>>> Still looking for help!  We have stopped almost ALL traffic to the
>>>> cluster and still some nodes are showing almost 1000% CPU for cassandra
>>>> with no iostat activity.   We were running cleanup on one of the nodes that
>>>> was not showing load spikes however now when I attempt to stop cleanup
>>>> there via nodetool stop cleanup the java task for stopping cleanup itself
>>>> is at 1500% and has not returned after 2 minutes.  This is VERY odd
>>>> behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing
>>>> anything there but wanted to get ideas.
>>>>
>>>> Thanks
>>>>
>>>> From: Keith Wright <kw...@nanigans.com>
>>>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>> Date: Tuesday, August 20, 2013 8:32 PM
>>>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>>> Subject: Nodes get stuck
>>>>
>>>> Hi all,
>>>>
>>>>     We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior
>>>> recently where 3 of our nodes get locked up in high load in what appears to
>>>> be a GC spiral while the rest of the cluster (7 total nodes) appears fine.
>>>>  When I run a tpstats, I see the following (assuming tpstats returns at
>>>> all) and top shows cassandra pegged at 2000%.  Obviously we have a large
>>>> number of blocked reads.  In the past I could explain this due to
>>>> unexpectedly wide rows however we have handled that.  When the cluster
>>>> starts to meltdown like this its hard to get visibility into what's going
>>>> on and what triggered the issue as everything starts to pile on.  Opscenter
>>>> becomes unusable and because the effected nodes are in GC pressure, getting
>>>> any data via nodetool or JMX is also difficult.  What do people do to
>>>> handle these situations?  We are going to start graphing
>>>> reads/writes/sec/CF to Ganglia in the hopes that it helps.
>>>>
>>>> Thanks
>>>>
>>>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>>>> ReadStage                       256       381     1245117434         0                 0
>>>> RequestResponseStage              0         0     1161495947         0                 0
>>>> MutationStage                     8         8      481721887         0                 0
>>>> ReadRepairStage                   0         0       85770600         0                 0
>>>> ReplicateOnWriteStage             0         0       21896804         0                 0
>>>> GossipStage                       0         0        1546196         0                 0
>>>> AntiEntropyStage                  0         0           5009         0                 0
>>>> MigrationStage                    0         0           1082         0                 0
>>>> MemtablePostFlusher               0         0          10178         0                 0
>>>> FlushWriter                       0         0           6081         0              2075
>>>> MiscStage                         0         0             57         0                 0
>>>> commitlog_archiver                0         0              0         0                 0
>>>> AntiEntropySessions               0         0              0         0                 0
>>>> InternalResponseStage             0         0              6         0                 0
>>>> HintedHandoff                     1         1            246         0                 0
>>>>
>>>> Message type           Dropped
>>>> RANGE_SLICE                482
>>>> READ_REPAIR                  0
>>>> BINARY                       0
>>>> READ                    515762
>>>> MUTATION                    39
>>>> _TRACE                       0
>>>> REQUEST_RESPONSE            29
>>>>
>>>>
>>>
>

Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
Hi all,

    One more question for you related to this issue.   We are no longer reading from the global_user table, which showed the very high latency times via cfstats, but we are still writing to it and therefore still compacting.  I still see high load on the affected nodes (and tpstats shows pending ReadStage actions), which I assume is due to compactions running against the global_user table.  In other words, is it expected that the same interval-tree issue would occur during compactions?  I assume yes but wanted to make sure.
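
A quick way to check is sketched below (a minimal sketch; both commands exist in 1.2, and the keyspace/CF names are the ones from the cfhistograms output quoted further down in this thread):

    # Are compactions on this CF what's driving the load?
    nodetool compactionstats

    # Re-check the per-CF read latency distribution as compactions proceed:
    nodetool cfhistograms users global_user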

Thanks

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:48 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Are many people running at 1.2.8?  Any issues?  Just nervous about running on the latest; I would prefer to be a couple of versions behind, as new bugs do tend to pop up.

Thanks all

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:33 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

We use 128m and gc_grace of 300 on a CF with highly transient data (it holds values for client locking via a wait-chain algorithm implementation in Hector).
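
For reference, a rough sketch of applying settings like those from cqlsh; the keyspace/table name below is purely illustrative, not the actual CF:

    # gc_grace of 300 plus a 128 MB LCS sstable size (CQL3 on C* 1.2):
    echo "ALTER TABLE locks.wait_chain
          WITH gc_grace_seconds = 300
          AND compaction = {'class': 'LeveledCompactionStrategy',
                            'sstable_size_in_mb': 128};" | cqlsh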

Same-minor-version upgrades should be painless and doable with no downtime.


On Wed, Aug 21, 2013 at 8:28 AM, Keith Wright <kw...@nanigans.com> wrote:
We have our LCS sstable size at 64 MB and gc grace at 86400.  May I ask what values you use?  I saw that in 2.0 they are setting the default LCS sstable size to 160 MB.

Does anyone see any risk in upgrading from 1.2.4 to 1.2.8?  The upgrade steps do not appear to mention any actions required, and it sounds like a rolling upgrade should be safe.
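
For what it's worth, a rolling upgrade is roughly the loop below, one node at a time (a sketch only; the service/package handling is an assumption about the install, not from the upgrade notes):

    nodetool drain                # flush memtables, stop accepting traffic
    sudo service cassandra stop
    # ... swap in the 1.2.8 package/binaries ...
    sudo service cassandra start
    nodetool version              # should report ReleaseVersion: 1.2.8
    nodetool tpstats              # let the node settle before the next one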

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:07 AM
To: Cassandra Users <us...@cassandra.apache.org>

Subject: Re: Nodes get stuck


Hit send before I saw your update.  Yeah, turn gc grace way down.  You can turn your level size up a lot as well.

On Aug 21, 2013 7:55 AM, "Keith Wright" <kw...@nanigans.com> wrote:
So the stack appears to be related to walking tombstones for a fetch.  Can you please give me your take on whether this is a plausible explanation:

 *   Given our data model, we can experience wide rows.  We protect against these by randomly reading a portion on write and, if the size is beyond a certain threshold, deleting data
 *   This has worked VERY well for some time now; however, perhaps we hit a row that we deleted and that has many tombstones.  The row is being requested frequently, so Cassandra is working very hard to process through all of its tombstones (currently a replication factor's worth of nodes are at high load, which again suggests this).

The question is what to do about it.  This is an LCS table with gc grace seconds at 86400.  I assume my only options are to force a major compaction via nodetool compact or to run upgradesstables (both sketched below)?  How can I validate that this is the cause?  How can I prevent it going forward?  Set gc grace seconds to a much lower value for that table?
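
Roughly, those options look like the sketch below (keyspace/CF names taken from the cfhistograms output quoted further down; the gc_grace value is illustrative and should still cover your repair cycle):

    # Option 1: force a major compaction of the CF:
    nodetool compact users global_user
    # Option 2: rewrite each sstable in place instead:
    nodetool upgradesstables users global_user

    # Lower gc_grace_seconds for just this table, from cqlsh:
    echo "ALTER TABLE users.global_user WITH gc_grace_seconds = 3600;" | cqlsh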

Thanks all!

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 8:31 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Thank you for responding.  I did a quick look: my MutationStage threads are currently in TIMED_WAITING (as expected, since tpstats shows nothing active or pending), but most of my ReadStage threads are RUNNABLE with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas, since we are currently in a production-down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)


From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machines that has suspiciously high CPU might help figure out what it is that is taking all that CPU.
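
Two standard ways to grab one, assuming the JDK tools are on the node (a sketch; the pgrep pattern is an assumption about how the daemon appears in the process list):

    # Dump all thread stacks to a file:
    jstack -l $(pgrep -f CassandraDaemon) > /tmp/cassandra-threads.txt

    # Or send SIGQUIT so the dump lands on Cassandra's stdout log:
    kill -3 $(pgrep -f CassandraDaemon)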


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
Some last-minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (we can usually do at least 20K reads) and are seeing CPU load as follows for the nodes (with some ParNew numbers to give an idea of GC):

001 – 1200%   (Par New at 120 ms / sec)
002 – 6% (Par New at 0)
003 – 600% (Par New at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below; note that we are using LCS.  I have brought the odd cfhistograms behavior to the list before and am not sure what's going on there.  We are in a production-down situation right now, so any help would be much appreciated!!!

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more


From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 2:35 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Still looking for help!  We have stopped almost ALL traffic to the cluster and still some nodes are showing almost 1000% CPU for cassandra with no iostat activity.   We were running cleanup on one of the nodes that was not showing load spikes however now when I attempt to stop cleanup there via nodetool stop cleanup the java task for stopping cleanup itself is at 1500% and has not returned after 2 minutes.  This is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing anything there but wanted to get ideas.

Thanks

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Tuesday, August 20, 2013 8:32 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Nodes get stuck

Hi all,

    We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior recently where 3 of our nodes get locked up in high load in what appears to be a GC spiral while the rest of the cluster (7 total nodes) appears fine.  When I run a tpstats, I see the following (assuming tpstats returns at all) and top shows cassandra pegged at 2000%.  Obviously we have a large number of blocked reads.  In the past I could explain this due to unexpectedly wide rows however we have handled that.  When the cluster starts to meltdown like this its hard to get visibility into what's going on and what triggered the issue as everything starts to pile on.  Opscenter becomes unusable and because the effected nodes are in GC pressure, getting any data via nodetool or JMX is also difficult.  What do people do to handle these situations?  We are going to start graphing reads/writes/sec/CF to Ganglia in the hopes that it helps.

Thanks

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                       256       381     1245117434         0                 0
RequestResponseStage              0         0     1161495947         0                 0
MutationStage                     8         8      481721887         0                 0
ReadRepairStage                   0         0       85770600         0                 0
ReplicateOnWriteStage             0         0       21896804         0                 0
GossipStage                       0         0        1546196         0                 0
AntiEntropyStage                  0         0           5009         0                 0
MigrationStage                    0         0           1082         0                 0
MemtablePostFlusher               0         0          10178         0                 0
FlushWriter                       0         0           6081         0              2075
MiscStage                         0         0             57         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               0         0              0         0                 0
InternalResponseStage             0         0              6         0                 0
HintedHandoff                     1         1            246         0                 0

Message type           Dropped
RANGE_SLICE                482
READ_REPAIR                  0
BINARY                       0
READ                    515762
MUTATION                    39
_TRACE                       0
REQUEST_RESPONSE            29




Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
Are many people running at 1.2.8?  Any issues?  Just nervous about running on the latest; I would prefer to be a couple of versions behind, as new bugs do tend to pop up.

Thanks all

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:33 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

We use 128m and gc_grace of 300 on a CF with highly transient data (it holds values for client locking via a wait-chain algorithm implementation in Hector).

Same-minor-version upgrades should be painless and doable with no downtime.


On Wed, Aug 21, 2013 at 8:28 AM, Keith Wright <kw...@nanigans.com> wrote:
We have our LCS sstable size at 64 MB and gc grace at 86400.  May I ask what values you use?  I saw that in 2.0 they are setting the default LCS sstable size to 160 MB.

Does anyone see any risk in upgrading from 1.2.4 to 1.2.8?  The upgrade steps do not appear to mention any actions required, and it sounds like a rolling upgrade should be safe.

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:07 AM
To: Cassandra Users <us...@cassandra.apache.org>

Subject: Re: Nodes get stuck


Hit send before I saw your update.  Yeah, turn gc grace way down.  You can turn your level size up a lot as well.

On Aug 21, 2013 7:55 AM, "Keith Wright" <kw...@nanigans.com> wrote:
So the stack appears to be related to walking tombstones for a fetch.  Can you please give me your take on whether this is a plausible explanation:

 *   Given our data model, we can experience wide rows.  We protect against these by randomly reading a portion on write and, if the size is beyond a certain threshold, deleting data
 *   This has worked VERY well for some time now; however, perhaps we hit a row that we deleted and that has many tombstones.  The row is being requested frequently, so Cassandra is working very hard to process through all of its tombstones (currently a replication factor's worth of nodes are at high load, which again suggests this).

The question is what to do about it.  This is an LCS table with gc grace seconds at 86400.  I assume my only options are to force a major compaction via nodetool compact or to run upgradesstables?  How can I validate that this is the cause?  How can I prevent it going forward?  Set gc grace seconds to a much lower value for that table?

Thanks all!

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 8:31 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Thank you for responding.  I did a quick look: my MutationStage threads are currently in TIMED_WAITING (as expected, since tpstats shows nothing active or pending), but most of my ReadStage threads are RUNNABLE with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas, since we are currently in a production-down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)


From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machines that has suspiciously high CPU might help figure out what it is that is taking all that CPU.


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
Some last-minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (we can usually do at least 20K reads) and are seeing CPU load as follows for the nodes (with some ParNew numbers to give an idea of GC):

001 – 1200%   (Par New at 120 ms / sec)
002 – 6% (Par New at 0)
003 – 600% (Par New at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below; note that we are using LCS.  I have brought the odd cfhistograms behavior to the list before and am not sure what's going on there.  We are in a production-down situation right now, so any help would be much appreciated!!!

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more


From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 2:35 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Still looking for help!  We have stopped almost ALL traffic to the cluster, and some nodes are still showing almost 1000% CPU for Cassandra with no iostat activity.  We were running cleanup on one of the nodes that was not showing load spikes; however, when I attempt to stop that cleanup via nodetool stop cleanup, the java task for stopping cleanup is itself at 1500% and has not returned after 2 minutes.  This is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing anything there but wanted to get ideas.
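
For anyone following along, this is roughly what we are running (<node> is a placeholder for the box the cleanup was started on):

nodetool -h <node> compactionstats   # check which compaction-type tasks are active
nodetool -h <node> stop CLEANUP      # ask the compaction manager to abort the cleanup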

Thanks

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Tuesday, August 20, 2013 8:32 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Nodes get stuck

Hi all,

    We are using C* 1.2.4 with vnodes and SSDs.  We have seen behavior recently where 3 of our nodes get locked up under high load in what appears to be a GC spiral while the rest of the cluster (7 total nodes) appears fine.  When I run tpstats, I see the following (assuming tpstats returns at all), and top shows Cassandra pegged at 2000%.  Obviously we have a large number of blocked reads.  In the past I could explain this due to unexpectedly wide rows; however, we have handled that.  When the cluster starts to melt down like this, it's hard to get visibility into what's going on and what triggered the issue, as everything starts to pile up.  OpsCenter becomes unusable, and because the affected nodes are in GC pressure, getting any data via nodetool or JMX is also difficult.  What do people do to handle these situations?  We are going to start graphing reads/writes/sec per CF to Ganglia in the hope that it helps.

Thanks

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                       256       381     1245117434         0                 0
RequestResponseStage              0         0     1161495947         0                 0
MutationStage                     8         8      481721887         0                 0
ReadRepairStage                   0         0       85770600         0                 0
ReplicateOnWriteStage             0         0       21896804         0                 0
GossipStage                       0         0        1546196         0                 0
AntiEntropyStage                  0         0           5009         0                 0
MigrationStage                    0         0           1082         0                 0
MemtablePostFlusher               0         0          10178         0                 0
FlushWriter                       0         0           6081         0              2075
MiscStage                         0         0             57         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               0         0              0         0                 0
InternalResponseStage             0         0              6         0                 0
HintedHandoff                     1         1            246         0                 0

Message type           Dropped
RANGE_SLICE                482
READ_REPAIR                  0
BINARY                       0
READ                    515762
MUTATION                    39
_TRACE                       0
REQUEST_RESPONSE            29




Re: Nodes get stuck

Posted by Nate McCall <na...@thelastpickle.com>.
We use 128m and a gc_grace of 300 on a CF with highly transient data (it
holds values for client locking via a wait-chain algorithm implementation
in Hector).
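
If it helps, the equivalent schema change from cqlsh looks roughly like this
(the table name is made up, and the CQL form is just one way to set it;
adjust to your schema):

echo "ALTER TABLE locks WITH gc_grace_seconds = 300
  AND compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 128};" | cqlsh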

Same-minor-version upgrades should be painless and doable with no
downtime.


On Wed, Aug 21, 2013 at 8:28 AM, Keith Wright <kw...@nanigans.com> wrote:

> We have our LCS sstable size at 64 MB and gc grace at 86400.  May I ask
> what values you use?  I saw that in 2.0 they are setting LCS default
> sstable size to 160 MB.
>
> Does anyone see any risk in upgrading from 1.2.4 to 1.2.8?  The upgrade
> steps do not appear to mention any required actions, and it sounds like a
> rolling upgrade should be safe
>
> From: Nate McCall <na...@thelastpickle.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, August 21, 2013 9:07 AM
> To: Cassandra Users <us...@cassandra.apache.org>
>
> Subject: Re: Nodes get stuck
>
> Hit send before I saw your update. Yeah, turn gc_grace way down. You can
> turn your level size up a lot as well.
> On Aug 21, 2013 7:55 AM, "Keith Wright" <kw...@nanigans.com> wrote:
>
>> So the stack appears to be related to walking tombstones for a fetch.
>>  Can you please give me your take on whether this is a plausible explanation:
>>
>>    - Given our data model, we can experience wide rows.  We protect
>>    against these by randomly reading a portion on write and if the size is
>>    beyond a certain threshold, we delete data
>>    - This worked VERY well for some time now however perhaps we hit a
>>    row that we deleted and has many tombstones.  The row is being requested
>>    frequently so Cassandra is working very hard to process through all of its
>>    tombstones (currently the RF # of nodes are at high load which again
>>    suggests this).
>>
>> Question is what to do about it?  This is an LCS table with gc grace
>> seconds at 86400.  I assume my only options are to force a major compaction
>> via nodetool compact or upgradesstables?  How can I validate this is
>> the cause?  How can I prevent it going forward?  Set the gc grace seconds
>> to a much lower value for that table?
>>
>> Thanks all!
>>
>> From: Keith Wright <kw...@nanigans.com>
>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Date: Wednesday, August 21, 2013 8:31 AM
>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Subject: Re: Nodes get stuck
>>
>> Thank you for responding.  I did a quick look and my mutation stage
>> threads are currently in TIMED_WAITING (as expected since tpstats shows no
>> active or pending) however most of my read stage threads are Runnable with
>> the stack traces below.  I haven't dug into them yet but thought I would
>> put them out there to see if anyone had any ideas since we are currently in
>> a production down state.
>>
>> Thanks all!
>>
>> Most have the first stack:
>>
>> java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
>> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
>> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
>> java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
>> java.util.TimSort.sort(TimSort.java:203)
>> java.util.TimSort.sort(TimSort.java:173)
>> java.util.Arrays.sort(Arrays.java:659)
>> java.util.Collections.sort(Collections.java:217)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
>> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
>> org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
>> org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
>> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
>> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
>> org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
>> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
>> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
>> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
>> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
>> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
>> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
>> org.apache.cassandra.db.Table.getRow(Table.java:347)
>> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
>> org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> java.lang.Thread.run(Thread.java:722)
>>
>> Name: ReadStage:1719
>> State: RUNNABLE
>> Total blocked: 1,005  Total waited: 913
>>
>> Stack trace:
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
>> org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
>> org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
>> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
>> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
>> org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
>> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
>> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
>> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
>> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
>> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
>> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
>> org.apache.cassandra.db.Table.getRow(Table.java:347)
>> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
>> org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> java.lang.Thread.run(Thread.java:722)
>>
>> Name: ReadStage:1722
>> State: RUNNABLE
>> Total blocked: 1,001  Total waited: 897
>>
>> Stack trace:
>> org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
>> org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
>> org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
>> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
>> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
>> java.util.TimSort.binarySort(TimSort.java:265)
>> java.util.TimSort.sort(TimSort.java:208)
>> java.util.TimSort.sort(TimSort.java:173)
>> java.util.Arrays.sort(Arrays.java:659)
>> java.util.Collections.sort(Collections.java:217)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
>> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
>> org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
>> org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
>> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
>> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
>> org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
>> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
>> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
>> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
>> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
>> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
>> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
>> org.apache.cassandra.db.Table.getRow(Table.java:347)
>> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
>> org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
>> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>> java.lang.Thread.run(Thread.java:722)
>>
>>
>> From: Sylvain Lebresne <sy...@datastax.com>
>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Date: Wednesday, August 21, 2013 6:21 AM
>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Subject: Re: Nodes get stuck
>>
>> A thread dump on one of the machines that has a suspiciously high CPU
>> might help figure out what it is that is taking all that CPU.
>>
>>
>> On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
>>
>>> Some last minute info on this to hopefully enlighten.  We are doing ~200
>>> reads and writes across our 7 node SSD cluster right now (usually can do
>>> closer to 20K reads at least) and seeing CPU load as follows for the nodes
>>> (with some par new to give an idea of GC):
>>>
>>> 001 – 1200%   (Par New at 120 ms / sec)
>>> 002 – 6% (Par New at 0)
>>> 003 – 600% (Par New at 45 ms / sec)
>>> 004 – 900%
>>> 005 – 500%
>>> 006 – 10%
>>> 007 – 130%
>>>
>>> There are no compactions running on 001 however I did see a broken pipe
>>> error in the logs there (see below).  Netstats for 001 shows nothing
>>> pending.  It appears that all of the load/latency is related to one column
>>> family.  You can see cfstats & cfhistograms output below and note that we
>>> are using LCS.  I have brought the odd cfhistograms behavior to the thread
>>> before and am not sure what's going on there.  We are in a production down
>>> situation right now so any help would be much appreciated!!!
>>>
>>> Column Family: global_user
>>> SSTable count: 7546
>>> SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
>>> Space used (live): 83848742562
>>> Space used (total): 83848742562
>>> Number of Keys (estimate): 549792896
>>> Memtable Columns Count: 526746
>>> Memtable Data Size: 117408252
>>> Memtable Switch Count: 0
>>> Read Count: 11673
>>> Read Latency: 1950.062 ms.
>>> Write Count: 118588
>>> Write Latency: 0.080 ms.
>>> Pending Tasks: 0
>>> Bloom Filter False Positives: 4322
>>> Bloom Filter False Ratio: 0.84066
>>> Bloom Filter Space Used: 383507440
>>> Compacted row minimum size: 73
>>> Compacted row maximum size: 2816159
>>> Compacted row mean size: 324
>>>
>>> [kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
>>> users/global_user histograms
>>> Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
>>> 1               8866                 0                 0                 0              3420
>>> 2               1001                 0                 0                 0          99218975
>>> 3               1249                 0                 0                 0         319713048
>>> 4               1074                 0                 0                 0          25073893
>>> 5                132                 0                 0                 0          15359199
>>> 6                  0                 0                 0                 0          27794925
>>> 7                  0                12                 0                 0           7954974
>>> 8                  0                23                 0                 0           7733934
>>> 10                 0               184                 0                 0          13276275
>>> 12                 0               567                 0                 0           9077508
>>> 14                 0              1098                 0                 0           5879292
>>> 17                 0              2722                 0                 0           5693471
>>> 20                 0              4379                 0                 0           3204131
>>> 24                 0              8928                 0                 0           2614995
>>> 29                 0             13525                 0                 0           1824584
>>> 35                 0             16759                 0                 0           1265911
>>> 42                 0             17048                 0                 0            868075
>>> 50                 0             14162                 5                 0            596417
>>> 60                 0             11806                15                 0            467747
>>> 72                 0              8569               108                 0            354276
>>> 86                 0              7042               276               227            269987
>>> 103                0              5936               372              2972            218931
>>> 124                0              4538               577               157            181360
>>> 149                0              2981              1076           7388090            144298
>>> 179                0              1929              1529          90535838            116628
>>> 215                0              1081              1450         182701876             93378
>>> 258                0               499              1125         141393480             74052
>>> 310                0               124               756          18883224             58617
>>> 372                0                31               460          24599272             45453
>>> 446                0                25               247          23516772             34310
>>> 535                0                10               146          13987584             26168
>>> 642                0                20               194          12091458             19965
>>> 770                0                 8               196           9269197             14649
>>> 924                0                 9               340           8082898             11015
>>> 1109               0                 9               225           4762865              8058
>>> 1331               0                 9               154           3330110              5866
>>> 1597               0                 8               144           2367615              4275
>>> 1916               0                 1               188           1633608              3087
>>> 2299               0                 4               216           1139820              2196
>>> 2759               0                 5               201            819019              1456
>>> 3311               0                 4               194            600522              1135
>>> 3973               0                 6               181            454566               786
>>> 4768               0                13               136            353886               587
>>> 5722               0                 6               152            280630               400
>>> 6866               0                 5                80            225545               254
>>> 8239               0                 6               112            183285               138
>>> 9887               0                 0                68            149820               109
>>> 11864              0                 5                99            121722                66
>>> 14237              0                57                86             98352                50
>>> 17084              0                18                99             79085                35
>>> 20501              0                 1                93             62423                11
>>> 24601              0                 0                61             49471                 9
>>> 29521              0                 0                69             37395                 5
>>> 35425              0                 4                56             28611                 6
>>> 42510              0                 0                57             21876                 1
>>> 51012              0                 9                60             16105                 0
>>> 61214              0                 0                52             11996                 0
>>> 73457              0                 0                50              8791                 0
>>> 88148              0                 0                38              6430                 0
>>> 105778             0                 0                25              4660                 0
>>> 126934             0                 0                15              3308                 0
>>> 152321             0                 0                 2              2364                 0
>>> 182785             0                 0                 0              1631                 0
>>> 219342             0                 0                 0              1156                 0
>>> 263210             0                 0                 0               887                 0
>>> 315852             0                 0                 0               618                 0
>>> 379022             0                 0                 0               427                 0
>>> 454826             0                 0                 0               272                 0
>>> 545791             0                 0                 0               168                 0
>>> 654949             0                 0                 0               115                 0
>>> 785939             0                 0                 0                61                 0
>>> 943127             0                 0                 0                58                 0
>>> 1131752            0                 0                 0                34                 0
>>> 1358102            0                 0                 0                19                 0
>>> 1629722            0                 0                 0                 9                 0
>>> 1955666            0                 0                 0                 4                 0
>>> 2346799            0                 0                 0                 5                 0
>>> 2816159            0                 0                 0                 2                 0
>>> 3379391            0                 0                 0                 0                 0
>>> 4055269            0                 0                 0                 0                 0
>>> 4866323            0                 0                 0                 0                 0
>>> 5839588            0                 0                 0                 0                 0
>>> 7007506            0                 0                 0                 0                 0
>>> 8409007            0                 0                 0                 0                 0
>>> 10090808           0                 0                 0                 0                 0
>>> 12108970           0                 0                 0                 0                 0
>>> 14530764           0                 0                 0                 0                 0
>>> 17436917           0                 0                 0                 0                 0
>>> 20924300           0                 0                 0                 0                 0
>>> 25109160           0                 0                 0                 0                 0
>>>
>>> ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
>>> java.lang.RuntimeException: java.io.IOException: Broken pipe
>>> at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
>>> at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
>>> at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
>>> at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
>>> at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
>>> at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
>>> at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
>>> at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
>>> at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
>>> at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
>>> Caused by: java.io.IOException: Broken pipe
>>> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>>> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>>> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
>>> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>>> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
>>> at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
>>> at java.nio.channels.Channels.writeFully(Channels.java:98)
>>> at java.nio.channels.Channels.access$000(Channels.java:61)
>>> at java.nio.channels.Channels$1.write(Channels.java:174)
>>> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>>> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
>>> at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
>>> at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
>>> at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
>>> at java.io.DataOutputStream.write(DataOutputStream.java:107)
>>> at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
>>> at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
>>> at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
>>> ... 9 more
>>>
>>>
>>> From: Keith Wright <kw...@nanigans.com>
>>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>> Date: Wednesday, August 21, 2013 2:35 AM
>>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>> Subject: Re: Nodes get stuck
>>>
>>> Still looking for help!  We have stopped almost ALL traffic to the
>>> cluster and still some nodes are showing almost 1000% CPU for cassandra
>>> with no iostat activity.   We were running cleanup on one of the nodes that
>>> was not showing load spikes however now when I attempt to stop cleanup
>>> there via nodetool stop cleanup the java task for stopping cleanup itself
>>> is at 1500% and has not returned after 2 minutes.  This is VERY odd
>>> behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing
>>> anything there but wanted to get ideas.
>>>
>>> Thanks
>>>
>>> From: Keith Wright <kw...@nanigans.com>
>>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>> Date: Tuesday, August 20, 2013 8:32 PM
>>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>>> Subject: Nodes get stuck
>>>
>>> Hi all,
>>>
>>>     We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior
>>> recently where 3 of our nodes get locked up in high load in what appears to
>>> be a GC spiral while the rest of the cluster (7 total nodes) appears fine.
>>>  When I run tpstats, I see the following (assuming tpstats returns at
>>> all), and top shows Cassandra pegged at 2000%.  Obviously we have a large
>>> number of blocked reads.  In the past I could explain this due to
>>> unexpectedly wide rows; however, we have handled that.  When the cluster
>>> starts to melt down like this, it's hard to get visibility into what's going
>>> on and what triggered the issue, as everything starts to pile up.  OpsCenter
>>> becomes unusable, and because the affected nodes are in GC pressure, getting
>>> any data via nodetool or JMX is also difficult.  What do people do to
>>> handle these situations?  We are going to start graphing
>>> reads/writes/sec/CF to Ganglia in the hopes that it helps.
>>>
>>> Thanks
>>>
>>> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
>>> ReadStage                       256       381     1245117434         0                 0
>>> RequestResponseStage              0         0     1161495947         0                 0
>>> MutationStage                     8         8      481721887         0                 0
>>> ReadRepairStage                   0         0       85770600         0                 0
>>> ReplicateOnWriteStage             0         0       21896804         0                 0
>>> GossipStage                       0         0        1546196         0                 0
>>> AntiEntropyStage                  0         0           5009         0                 0
>>> MigrationStage                    0         0           1082         0                 0
>>> MemtablePostFlusher               0         0          10178         0                 0
>>> FlushWriter                       0         0           6081         0              2075
>>> MiscStage                         0         0             57         0                 0
>>> commitlog_archiver                0         0              0         0                 0
>>> AntiEntropySessions               0         0              0         0                 0
>>> InternalResponseStage             0         0              6         0                 0
>>> HintedHandoff                     1         1            246         0                 0
>>>
>>> Message type           Dropped
>>> RANGE_SLICE                482
>>> READ_REPAIR                  0
>>> BINARY                       0
>>> READ                    515762
>>> MUTATION                    39
>>> _TRACE                       0
>>> REQUEST_RESPONSE            29
>>>
>>>
>>

Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
We have our LCS sstable size at 64 MB and gc_grace at 86400.  May I ask what values you use?  I saw that in 2.0 they are setting the LCS default sstable size to 160 MB.

Does anyone see any risk in upgrading from 1.2.4 to 1.2.8?  The upgrade steps do not appear to mention any required actions, and it sounds like a rolling upgrade should be safe.
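
For reference, the per-node sequence we are assuming for a rolling upgrade is
roughly this (the service/package commands depend on how you installed;
<node> is a placeholder):

nodetool -h <node> drain     # flush memtables; the node stops accepting writes
sudo service cassandra stop
# swap in the 1.2.8 package/binaries here
sudo service cassandra start
nodetool -h <node> version   # confirm the node now reports 1.2.8

one node at a time, waiting for the node to show Up in nodetool status before
moving on.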

From: Nate McCall <na...@thelastpickle.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:07 AM
To: Cassandra Users <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck


Hit send before I saw your update. Yeah, turn gc_grace way down. You can turn your level size up a lot as well.

On Aug 21, 2013 7:55 AM, "Keith Wright" <kw...@nanigans.com> wrote:
So the stack appears to be related to walking tombstones for a fetch.  Can you please give me your take on whether this is a plausible explanation:

 *   Given our data model, we can experience wide rows.  We protect against these by randomly reading a portion of the row on write; if the size is beyond a certain threshold, we delete data
 *   This has worked VERY well for some time now; however, perhaps we hit a row that we deleted and that has many tombstones.  The row is being requested frequently, so Cassandra is working very hard to process through all of its tombstones (currently the RF # of nodes are at high load, which again suggests this).

The question is what to do about it.  This is an LCS table with gc_grace_seconds at 86400.  I assume my only options are to force a major compaction via nodetool compact or to run upgradesstables?  How can I validate that this is the cause?  How can I prevent it going forward?  Set gc_grace_seconds to a much lower value for that table?
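
In command form, the options I am considering look roughly like this (the
gc_grace value is only an example):

nodetool -h lxpcas001 upgradesstables users global_user   # rewrite each sstable in place
nodetool -h lxpcas001 compact users global_user           # or force a major compaction (heavy under LCS)
echo "ALTER TABLE users.global_user WITH gc_grace_seconds = 3600;" | cqlsh lxpcas001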

Thanks all!

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 8:31 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Thank you for responding.  I took a quick look: my mutation stage threads are currently in TIMED_WAITING (as expected, since tpstats shows no active or pending tasks); however, most of my read stage threads are RUNNABLE with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas, since we are currently in a production-down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)


From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machines that has a suspiciously high CPU might help figure out what it is that is taking all that CPU.
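
Something along these lines, assuming the JDK tools are installed on the box:

jstack <cassandra-pid> > /tmp/cassandra-threads.txt
kill -3 <cassandra-pid>   # fallback if jstack can't attach; dumps threads to the process's stdout log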


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
Some last-minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (we can usually do at least 20K reads) and seeing CPU load as follows for the nodes (with some ParNew times to give an idea of GC):

001 – 1200%   (Par New at 120 ms / sec)
002 – 6% (Par New at 0)
003 – 600% (Par New at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%
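
(For reference, the ParNew numbers come from watching the young generation on
each node, roughly:

jstat -gcutil <cassandra-pid> 5s   # YGC/YGCT columns give the ParNew count and time

though a GC log or JMX would show the same picture.)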

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see the cfstats & cfhistograms output below; note that we are using LCS.  I have brought the odd cfhistograms behavior to the list before and am not sure what's going on there.  We are in a production-down situation right now, so any help would be much appreciated!!!
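
Concretely, those checks were along the lines of:

nodetool -h lxpcas001 compactionstats   # no compactions running
nodetool -h lxpcas001 netstats          # nothing pending
nodetool -h lxpcas001 cfstats | grep -A 20 'Column Family: global_user'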

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0
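(Reading the tail of this histogram: in 1.2, cfhistograms reports latencies
in microseconds, Row Size in bytes, and per-bucket counts, so the count of 2
at offset 2816159 means two compacted rows of roughly 2.8 MB -- matching the
compacted row maximum size in the cfstats above -- and the couple hundred
reads landing in the 50 ms-and-up latency buckets are the slow fetches
discussed below.)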

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more
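
One cheap way to see what the pegged threads are actually doing, even when
JMX and nodetool are slow to respond, is a local thread dump; a minimal
sketch, assuming the Cassandra JVM is the only java process matching
CassandraDaemon:

    pid=$(pgrep -f CassandraDaemon)
    jstack -l "$pid" > /tmp/cassandra-threads-1.txt
    sleep 10
    jstack -l "$pid" > /tmp/cassandra-threads-2.txt    # a second dump shows what persists
    grep -c ReadStage /tmp/cassandra-threads-1.txt     # rough count of captured read threads

Two dumps a few seconds apart make it clear whether the same stacks stay hot
or the threads are making progress.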



Re: Nodes get stuck

Posted by Nate McCall <na...@thelastpickle.com>.
Hit send before I saw your update. Yeah, turn gc grace way down. You can
turn your level size (the LCS sstable size) up a lot as well.
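
In CQL3 terms those two knobs would look roughly like this; a sketch only --
the keyspace and table come from the cfhistograms output above, while the
3600-second grace period and 160 MB sstable size are illustrative values,
and gc_grace_seconds must stay longer than your repair interval or deletes
can be resurrected:

    cqlsh> ALTER TABLE users.global_user
       ...   WITH gc_grace_seconds = 3600
       ...   AND compaction = {'class': 'LeveledCompactionStrategy',
       ...                     'sstable_size_in_mb': 160};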
 On Aug 21, 2013 7:55 AM, "Keith Wright" <kw...@nanigans.com> wrote:

> So the stack appears to be related to walking tombstones for a fetch.  Can
> you please give me your take on whether this is a plausible explanation:
>
>    - Given our data model, we can experience wide rows.  We protect
>    against these by randomly reading a portion of the row on write and, if
>    the size is beyond a certain threshold, deleting data.
>    - This has worked VERY well for some time now, however perhaps we hit a
>    row that we deleted and that has many tombstones.  The row is being
>    requested frequently, so Cassandra is working very hard to process
>    through all of its tombstones (currently the RF-many replica nodes are
>    at high load, which again suggests this).
>
> Question is what to do about it?  This is an LCS table with gc grace
> seconds at 86400.  I assume my only options are to force a major compaction
> via nodetool compact or upgradesstables?  How can I validate that this is
> the cause?  How can I prevent it going forward?  Set gc grace seconds to a
> much lower value for that table?
>
> Thanks all!
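
To the "how can I validate / what to do" questions just above, a rough
sketch; the sstable path is hypothetical and the JSON keys that sstable2json
emits vary by version, so treat the grep as a heuristic:

    # hypothetical path -- substitute one of global_user's actual data files
    sstable2json /var/lib/cassandra/data/users/global_user/users-global_user-ic-12345-Data.db \
        | grep -c deletionInfo      # rough count of row-level deletion markers

    # the remediation options mentioned above, scoped to the one column family
    nodetool compact users global_user            # force a major compaction
    nodetool upgradesstables users global_user    # rewrite sstables in place; some
                                                  # versions need a flag to include
                                                  # current-format sstables

Bear in mind that with gc_grace_seconds at 86400, tombstones younger than a
day survive any compaction, so lowering the grace period is what lets them
be purged sooner.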

Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
That looks very encouraging!  Thank you

From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 9:01 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

It seems you are running into https://issues.apache.org/jira/browse/CASSANDRA-5677. The proper fix would be to upgrade to 1.2.8.

--
Sylvain
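
For a minor-version jump like 1.2.4 to 1.2.8 the usual shape is a rolling
restart, one node at a time; a hedged sketch, assuming a package install and
an init service named cassandra:

    nodetool drain                  # flush memtables; node stops accepting writes
    sudo service cassandra stop
    # install the 1.2.8 package here (apt/yum/tarball, per your environment)
    sudo service cassandra start
    nodetool version                # confirm ReleaseVersion: 1.2.8 before the next node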


On Wed, Aug 21, 2013 at 2:54 PM, Keith Wright <kw...@nanigans.com>> wrote:
So the stack appears to be related to walking tombstones for a fetch.  Can you please give me your take on if this is a plausible explanation:

 *   Given our data model, we can experience wide rows.  We protect against these by randomly reading a portion on write and if the size is beyond a certain threshold, we delete data
 *   This worked VERY well for some time now however perhaps we hit a row that we deleted and has many tombstones.  The row is being requests frequently so Cassandra is working very hard to process through all of its tombstones (currently the RF # of nodes are at high load which again suggests this).

Question is what to do about it?  This is an LCS table with gc grace seconds at 86400.  I assume my only options are to force a major compaction via nodetool compaction or upgrades stables?  How can I validate this is the cause?  How can I prevent it going forward?  Set the gc grace seconds to a much lower value for that table?

Thanks all!

From: Keith Wright <kw...@nanigans.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, August 21, 2013 8:31 AM

To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: Nodes get stuck

Thank you for responding.  I did a quick look and my mutation stage threads are currently in TIMED_WAITING (as expected since tpstats shows no active or pending) however most of my read stage threads are Runnable with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas since we are currently in a production down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)


From: Sylvain Lebresne <sy...@datastax.com>>
Reply-To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org<ma...@cassandra.apache.org>" <us...@cassandra.apache.org>>
Subject: Re: Nodes get stuck

A thread dump on one of the machine that has a suspiciously high CPU might help figuring out what it is that is taking all that CPU.


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com>> wrote:
Some last minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7 node SSD cluster right now (usually can do closer to 20K reads at least) and seeing CPU load as follows for the nodes (with some par new to give an idea of GC):

001 – 1200%   (Par New at 120 ms / sec)
002 – 6% (Par New at 0)
003 – 600% (Par New at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%

There are no compactions running on 001 however I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below and note that we are using LCS.  I have brought the odd cfhistograms behavior to the thread before and am not sure what's going on there.  We are in a production down situation right now so any help would be much appreciated!!!

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more


From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 2:35 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Still looking for help!  We have stopped almost ALL traffic to the cluster and still some nodes are showing almost 1000% CPU for Cassandra with no iostat activity.  We were running cleanup on one of the nodes that was not showing load spikes; however, now when I attempt to stop cleanup there via nodetool stop cleanup, the Java task for stopping cleanup itself is at 1500% and has not returned after 2 minutes.  This is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing anything there but wanted to get ideas.

Thanks

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Tuesday, August 20, 2013 8:32 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Nodes get stuck

Hi all,

    We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior recently where 3 of our nodes get locked up in high load in what appears to be a GC spiral while the rest of the cluster (7 total nodes) appears fine.  When I run tpstats, I see the following (assuming tpstats returns at all) and top shows Cassandra pegged at 2000%.  Obviously we have a large number of blocked reads.  In the past I could explain this due to unexpectedly wide rows; however, we have handled that.  When the cluster starts to melt down like this it's hard to get visibility into what's going on and what triggered the issue, as everything starts to pile up.  OpsCenter becomes unusable, and because the affected nodes are under GC pressure, getting any data via nodetool or JMX is also difficult.  What do people do to handle these situations?  We are going to start graphing reads/writes/sec/CF to Ganglia in the hopes that it helps.
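
For the Ganglia graphing just mentioned: per-CF read/write counters are already exposed over JMX, so a small poller can feed them in.  Below is a rough Java sketch of reading them; the MBean name and the ReadCount/WriteCount attributes are the 1.2-era ones as I understand them, and the users/global_user names are just this cluster's, so verify all of it against your own JMX tree:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class CfThroughput {
    public static void main(String[] args) throws Exception {
        // 7199 is Cassandra's default JMX port
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi");
        JMXConnector jmxc = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = jmxc.getMBeanServerConnection();
            ObjectName cf = new ObjectName(
                    "org.apache.cassandra.db:type=ColumnFamilies,keyspace=users,columnfamily=global_user");
            // Cumulative counters: sample twice and diff to get per-second rates.
            long reads = (Long) mbs.getAttribute(cf, "ReadCount");
            long writes = (Long) mbs.getAttribute(cf, "WriteCount");
            System.out.println("reads=" + reads + " writes=" + writes);
        } finally {
            jmxc.close();
        }
    }
}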

Thanks

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                       256       381     1245117434         0                 0
RequestResponseStage              0         0     1161495947         0                 0
MutationStage                     8         8      481721887         0                 0
ReadRepairStage                   0         0       85770600         0                 0
ReplicateOnWriteStage             0         0       21896804         0                 0
GossipStage                       0         0        1546196         0                 0
AntiEntropyStage                  0         0           5009         0                 0
MigrationStage                    0         0           1082         0                 0
MemtablePostFlusher               0         0          10178         0                 0
FlushWriter                       0         0           6081         0              2075
MiscStage                         0         0             57         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               0         0              0         0                 0
InternalResponseStage             0         0              6         0                 0
HintedHandoff                     1         1            246         0                 0

Message type           Dropped
RANGE_SLICE                482
READ_REPAIR                  0
BINARY                       0
READ                    515762
MUTATION                    39
_TRACE                       0
REQUEST_RESPONSE            29




Re: Nodes get stuck

Posted by Sylvain Lebresne <sy...@datastax.com>.
It seems you are running into
https://issues.apache.org/jira/browse/CASSANDRA-5677. The proper fix would
be to upgrade to 1.2.8.
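
For illustration: the read-stage stacks in this thread bottom out in IntervalTree.build called from DeletionInfo.add, i.e. the interval tree over range tombstones is rebuilt from scratch for every tombstone met while collating the read.  A minimal, self-contained Java sketch of that pattern (emphatically not Cassandra's actual code) shows why it degrades roughly quadratically as tombstones pile up, which is the pathology the 5677 fix addresses:

import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

public class RebuildPerAddDemo {
    public static void main(String[] args) {
        for (int n : new int[] { 1000, 10000, 100000 }) {
            long start = System.nanoTime();
            List<Integer> intervals = new ArrayList<Integer>();
            for (int i = 0; i < n; i++) {
                intervals.add(n - i);         // one new "tombstone"
                Collections.sort(intervals);  // full rebuild on every add
            }
            System.out.printf("n=%d rebuild-per-add took %d ms%n",
                    n, (System.nanoTime() - start) / 1000000);
        }
    }
}

Each add touches the whole collection, so n adds cost O(n^2) work in total instead of O(n log n).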

--
Sylvain


On Wed, Aug 21, 2013 at 2:54 PM, Keith Wright <kw...@nanigans.com> wrote:

> So the stack appears to be related to walking tombstones for a fetch.  Can
> you please give me your take on whether this is a plausible explanation:
>
>    - Given our data model, we can experience wide rows.  We protect
>    against these by randomly reading a portion on write and, if the size
>    is beyond a certain threshold, deleting data
>    - This worked VERY well for some time now; however, perhaps we hit a
>    row that we deleted that has many tombstones.  The row is being
>    requested frequently, so Cassandra is working very hard to process
>    through all of its tombstones (currently the RF # of nodes are at high
>    load, which again suggests this).
>
> Question is what to do about it?  This is an LCS table with
> gc_grace_seconds at 86400.  I assume my only options are to force a major
> compaction via nodetool compact or to run nodetool upgradesstables?  How
> can I validate this is the cause?  How can I prevent it going forward?
> Set gc_grace_seconds to a much lower value for that table?
>
> Thanks all!
>
> From: Keith Wright <kw...@nanigans.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, August 21, 2013 8:31 AM
>
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: Nodes get stuck
>
> Thank you for responding.  I did a quick look and my mutation stage
> threads are currently in TIMED_WAITING (as expected, since tpstats shows
> no active or pending); however, most of my read stage threads are Runnable
> with the stack traces below.  I haven't dug into them yet but thought I
> would put them out there to see if anyone had any ideas, since we are
> currently in a production-down state.
>
> Thanks all!
>
> Most have the first stack:
>
> java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
>
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
>
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
> java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
> java.util.TimSort.sort(TimSort.java:203)
> java.util.TimSort.sort(TimSort.java:173)
> java.util.Arrays.sort(Arrays.java:659)
> java.util.Collections.sort(Collections.java:217)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
> org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
> org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
>
> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
>
> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
> org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
>
> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>
> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
>
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
>
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
>
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
> org.apache.cassandra.db.Table.getRow(Table.java:347)
>
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
> org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
>
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> java.lang.Thread.run(Thread.java:722)
>
> Name: ReadStage:1719
> State: RUNNABLE
> Total blocked: 1,005  Total waited: 913
>
> Stack trace:
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
> org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
> org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
>
> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
>
> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
> org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
>
> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>
> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
>
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
>
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
>
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
> org.apache.cassandra.db.Table.getRow(Table.java:347)
>
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
> org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
>
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> java.lang.Thread.run(Thread.java:722)
>
> Name: ReadStage:1722
> State: RUNNABLE
> Total blocked: 1,001  Total waited: 897
>
> Stack trace:
> org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
> org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
>
> org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
>
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
>
> org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
> java.util.TimSort.binarySort(TimSort.java:265)
> java.util.TimSort.sort(TimSort.java:208)
> java.util.TimSort.sort(TimSort.java:173)
> java.util.Arrays.sort(Arrays.java:659)
> java.util.Collections.sort(Collections.java:217)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
>
> org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
> org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
> org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
> org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
>
> org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
>
> org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
> org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
> org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
> org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
>
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
>
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
>
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
>
> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
>
> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
>
> org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
>
> org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
>
> org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
>
> org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
>
> org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
>
> org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
>
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
>
> org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
> org.apache.cassandra.db.Table.getRow(Table.java:347)
>
> org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
> org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
>
> org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
>
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
> java.lang.Thread.run(Thread.java:722)
>
>
> From: Sylvain Lebresne <sy...@datastax.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, August 21, 2013 6:21 AM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: Nodes get stuck
>
> A thread dump on one of the machines that has a suspiciously high CPU
> might help figure out what it is that is taking all that CPU.
>
>
> On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
>
>> Some last-minute info on this that will hopefully shed some light.  We
>> are doing ~200 reads and writes across our 7-node SSD cluster right now
>> (usually we can do at least 20K reads) and seeing CPU load as follows for
>> the nodes (with some ParNew times to give an idea of GC):
>>
>> 001 – 1200%   (Par New at 120 ms / sec)
>> 002 – 6% (Par New at 0)
>> 003 – 600% (Par New at 45 ms / sec)
>> 004 – 900%
>> 005 – 500%
>> 006 – 10%
>> 007 – 130%
>>
>> There are no compactions running on 001; however, I did see a broken pipe
>> error in the logs there (see below).  Netstats for 001 shows nothing
>> pending.  It appears that all of the load/latency is related to one column
>> family.  You can see cfstats & cfhistograms output below; note that we
>> are using LCS.  I have brought the odd cfhistograms behavior to this list
>> before and am not sure what's going on there.  We are in a production-down
>> situation right now, so any help would be much appreciated!!!
>>
>> Column Family: global_user
>> SSTable count: 7546
>> SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
>> Space used (live): 83848742562
>> Space used (total): 83848742562
>> Number of Keys (estimate): 549792896
>> Memtable Columns Count: 526746
>> Memtable Data Size: 117408252
>> Memtable Switch Count: 0
>> Read Count: 11673
>> Read Latency: 1950.062 ms.
>> Write Count: 118588
>> Write Latency: 0.080 ms.
>> Pending Tasks: 0
>> Bloom Filter False Positives: 4322
>> Bloom Filter False Ratio: 0.84066
>> Bloom Filter Space Used: 383507440
>> Compacted row minimum size: 73
>> Compacted row maximum size: 2816159
>> Compacted row mean size: 324
>>
>> [kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
>> users/global_user histograms
>> Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
>> 1               8866                 0                 0                 0              3420
>> 2               1001                 0                 0                 0          99218975
>> 3               1249                 0                 0                 0         319713048
>> 4               1074                 0                 0                 0          25073893
>> 5                132                 0                 0                 0          15359199
>> 6                  0                 0                 0                 0          27794925
>> 7                  0                12                 0                 0           7954974
>> 8                  0                23                 0                 0           7733934
>> 10                 0               184                 0                 0          13276275
>> 12                 0               567                 0                 0           9077508
>> 14                 0              1098                 0                 0           5879292
>> 17                 0              2722                 0                 0           5693471
>> 20                 0              4379                 0                 0           3204131
>> 24                 0              8928                 0                 0           2614995
>> 29                 0             13525                 0                 0           1824584
>> 35                 0             16759                 0                 0           1265911
>> 42                 0             17048                 0                 0            868075
>> 50                 0             14162                 5                 0            596417
>> 60                 0             11806                15                 0            467747
>> 72                 0              8569               108                 0            354276
>> 86                 0              7042               276               227            269987
>> 103                0              5936               372              2972            218931
>> 124                0              4538               577               157            181360
>> 149                0              2981              1076           7388090            144298
>> 179                0              1929              1529          90535838            116628
>> 215                0              1081              1450         182701876             93378
>> 258                0               499              1125         141393480             74052
>> 310                0               124               756          18883224             58617
>> 372                0                31               460          24599272             45453
>> 446                0                25               247          23516772             34310
>> 535                0                10               146          13987584             26168
>> 642                0                20               194          12091458             19965
>> 770                0                 8               196           9269197             14649
>> 924                0                 9               340           8082898             11015
>> 1109               0                 9               225           4762865              8058
>> 1331               0                 9               154           3330110              5866
>> 1597               0                 8               144           2367615              4275
>> 1916               0                 1               188           1633608              3087
>> 2299               0                 4               216           1139820              2196
>> 2759               0                 5               201            819019              1456
>> 3311               0                 4               194            600522              1135
>> 3973               0                 6               181            454566               786
>> 4768               0                13               136            353886               587
>> 5722               0                 6               152            280630               400
>> 6866               0                 5                80            225545               254
>> 8239               0                 6               112            183285               138
>> 9887               0                 0                68            149820               109
>> 11864              0                 5                99            121722                66
>> 14237              0                57                86             98352                50
>> 17084              0                18                99             79085                35
>> 20501              0                 1                93             62423                11
>> 24601              0                 0                61             49471                 9
>> 29521              0                 0                69             37395                 5
>> 35425              0                 4                56             28611                 6
>> 42510              0                 0                57             21876                 1
>> 51012              0                 9                60             16105                 0
>> 61214              0                 0                52             11996                 0
>> 73457              0                 0                50              8791                 0
>> 88148              0                 0                38              6430                 0
>> 105778             0                 0                25              4660                 0
>> 126934             0                 0                15              3308                 0
>> 152321             0                 0                 2              2364                 0
>> 182785             0                 0                 0              1631                 0
>> 219342             0                 0                 0              1156                 0
>> 263210             0                 0                 0               887                 0
>> 315852             0                 0                 0               618                 0
>> 379022             0                 0                 0               427                 0
>> 454826             0                 0                 0               272                 0
>> 545791             0                 0                 0               168                 0
>> 654949             0                 0                 0               115                 0
>> 785939             0                 0                 0                61                 0
>> 943127             0                 0                 0                58                 0
>> 1131752            0                 0                 0                34                 0
>> 1358102            0                 0                 0                19                 0
>> 1629722            0                 0                 0                 9                 0
>> 1955666            0                 0                 0                 4                 0
>> 2346799            0                 0                 0                 5                 0
>> 2816159            0                 0                 0                 2                 0
>> 3379391            0                 0                 0                 0                 0
>> 4055269            0                 0                 0                 0                 0
>> 4866323            0                 0                 0                 0                 0
>> 5839588            0                 0                 0                 0                 0
>> 7007506            0                 0                 0                 0                 0
>> 8409007            0                 0                 0                 0                 0
>> 10090808           0                 0                 0                 0                 0
>> 12108970           0                 0                 0                 0                 0
>> 14530764           0                 0                 0                 0                 0
>> 17436917           0                 0                 0                 0                 0
>> 20924300           0                 0                 0                 0                 0
>> 25109160           0                 0                 0                 0                 0
>>
>> ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450
>> OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
>> java.lang.RuntimeException: java.io.IOException: Broken pipe
>> at
>> org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
>> at
>> org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
>> at
>> org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
>> at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
>> at
>> org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
>> at
>> org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
>> at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
>> at
>> org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
>> at
>> org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
>> at
>> org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
>> Caused by: java.io.IOException: Broken pipe
>> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
>> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
>> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
>> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
>> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
>> at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
>> at java.nio.channels.Channels.writeFully(Channels.java:98)
>> at java.nio.channels.Channels.access$000(Channels.java:61)
>> at java.nio.channels.Channels$1.write(Channels.java:174)
>> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
>> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
>> at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
>> at
>> org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
>> at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
>> at java.io.DataOutputStream.write(DataOutputStream.java:107)
>> at
>> org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
>> at
>> org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
>> at
>> org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
>> ... 9 more
>>
>>
>> From: Keith Wright <kw...@nanigans.com>
>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Date: Wednesday, August 21, 2013 2:35 AM
>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Subject: Re: Nodes get stuck
>>
>> Still looking for help!  We have stopped almost ALL traffic to the
>> cluster and still some nodes are showing almost 1000% CPU for Cassandra
>> with no iostat activity.  We were running cleanup on one of the nodes
>> that was not showing load spikes; however, now when I attempt to stop
>> cleanup there via nodetool stop cleanup, the Java task for stopping
>> cleanup itself is at 1500% and has not returned after 2 minutes.  This
>> is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are
>> not seeing anything there but wanted to get ideas.
>>
>> Thanks
>>
>> From: Keith Wright <kw...@nanigans.com>
>> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Date: Tuesday, August 20, 2013 8:32 PM
>> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
>> Subject: Nodes get stuck
>>
>> Hi all,
>>
>>     We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior
>> recently where 3 of our nodes get locked up in high load in what appears
>> to be a GC spiral while the rest of the cluster (7 total nodes) appears
>> fine.  When I run tpstats, I see the following (assuming tpstats returns
>> at all) and top shows Cassandra pegged at 2000%.  Obviously we have a
>> large number of blocked reads.  In the past I could explain this due to
>> unexpectedly wide rows; however, we have handled that.  When the cluster
>> starts to melt down like this it's hard to get visibility into what's
>> going on and what triggered the issue, as everything starts to pile up.
>> OpsCenter becomes unusable, and because the affected nodes are under GC
>> pressure, getting any data via nodetool or JMX is also difficult.  What
>> do people do to handle these situations?  We are going to start graphing
>> reads/writes/sec/CF to Ganglia in the hopes that it helps.
>>
>> Thanks
>>
>> Pool Name                    Active   Pending      Completed   Blocked
>>  All time blocked
>> ReadStage                       256       381     1245117434         0
>>               0
>> RequestResponseStage              0         0     1161495947         0
>>               0
>> MutationStage                     8         8      481721887         0
>>               0
>> ReadRepairStage                   0         0       85770600         0
>>               0
>> ReplicateOnWriteStage             0         0       21896804         0
>>               0
>> GossipStage                       0         0        1546196         0
>>               0
>> AntiEntropyStage                  0         0           5009         0
>>               0
>> MigrationStage                    0         0           1082         0
>>               0
>> MemtablePostFlusher               0         0          10178         0
>>               0
>> FlushWriter                       0         0           6081         0
>>            2075
>> MiscStage                         0         0             57         0
>>               0
>> commitlog_archiver                0         0              0         0
>>               0
>> AntiEntropySessions               0         0              0         0
>>               0
>> InternalResponseStage             0         0              6         0
>>               0
>> HintedHandoff                     1         1            246         0
>>               0
>>
>> Message type           Dropped
>> RANGE_SLICE                482
>> READ_REPAIR                  0
>> BINARY                       0
>> READ                    515762
>> MUTATION                    39
>> _TRACE                       0
>> REQUEST_RESPONSE            29
>>
>>
>

Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
So the stack appears to be related to walking tombstones for a fetch.  Can you please give me your take on whether this is a plausible explanation:

 *   Given our data model, we can experience wide rows.  We protect against these by randomly reading a portion on write and, if the size is beyond a certain threshold, deleting data
 *   This worked VERY well for some time now; however, perhaps we hit a row that we deleted that has many tombstones.  The row is being requested frequently, so Cassandra is working very hard to process through all of its tombstones (currently the RF # of nodes are at high load, which again suggests this).
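
For concreteness, here is a sketch of the width guard described above.  UserStore, the 1% sample rate, and the 10000-column threshold are hypothetical stand-ins (the real client code and values aren't shown in this thread):

import java.util.Random;

public class WidthGuard {
    // Hypothetical stand-in for whatever client API the application uses.
    interface UserStore {
        long columnCount(long userId);  // e.g. a CQL count(*) capped by a LIMIT
        void deleteRow(long userId);    // the delete that leaves tombstones behind
    }

    private static final double SAMPLE_RATE = 0.01;  // check ~1% of writes (assumed)
    private static final long MAX_COLUMNS = 10000;   // width threshold (assumed)
    private static final Random RAND = new Random();

    static void afterWrite(UserStore store, long userId) {
        if (RAND.nextDouble() >= SAMPLE_RATE)
            return;  // most writes skip the check entirely
        if (store.columnCount(userId) > MAX_COLUMNS) {
            // The delete itself is cheap, but every subsequent read of this row
            // must walk the resulting tombstones until compaction purges them
            // after gc_grace_seconds, which is the suspected cost here.
            store.deleteRow(userId);
        }
    }
}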

Question is what to do about it?  This is an LCS table with gc_grace_seconds at 86400.  I assume my only options are to force a major compaction via nodetool compact or to run nodetool upgradesstables?  How can I validate this is the cause?  How can I prevent it going forward?  Set gc_grace_seconds to a much lower value for that table?

Thanks all!

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 8:31 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Thank you for responding.  I did a quick look and my mutation stage threads are currently in TIMED_WAITING (as expected, since tpstats shows no active or pending); however, most of my read stage threads are Runnable with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas, since we are currently in a production-down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)


From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machines that has a suspiciously high CPU might help figure out what it is that is taking all that CPU.
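
(jstack <pid>, or kill -3 <pid> to dump to Cassandra's stdout log, are the usual ways to get one.  The same information is also available programmatically through the standard management API; a minimal in-process sketch, adaptable to run over JMX:)

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class DumpThreads {
    public static void main(String[] args) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // true/true also requests locked monitors and ownable synchronizers
        for (ThreadInfo info : mx.dumpAllThreads(true, true))
            System.out.print(info);  // thread name, state, and leading stack frames
    }
}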


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
Some last-minute info on this that will hopefully shed some light.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (usually we can do at least 20K reads) and seeing CPU load as follows for the nodes (with some ParNew times to give an idea of GC):

001 – 1200%   (Par New at 120 ms / sec)
002 – 6% (Par New at 0)
003 – 600% (Par New at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%
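
(Those ParNew ms/sec figures can be sampled from the standard GC MXBeans.  A minimal sketch, assuming the young-gen collector is named "ParNew" as under -XX:+UseParNewGC; run it in-process or adapt it to go over JMX:)

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class ParNewRate {
    public static void main(String[] args) throws InterruptedException {
        GarbageCollectorMXBean parNew = null;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans())
            if (gc.getName().equals("ParNew"))
                parNew = gc;
        if (parNew == null) {
            System.out.println("No ParNew collector in this JVM");
            return;
        }
        long t0 = parNew.getCollectionTime();  // cumulative ms spent in ParNew so far
        Thread.sleep(10000);                   // sample window of 10 seconds
        System.out.println("ParNew: "
                + (parNew.getCollectionTime() - t0) / 10 + " ms/sec");
    }
}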

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below; note that we are using LCS.  I have brought the odd cfhistograms behavior to this list before and am not sure what's going on there.  We are in a production-down situation right now, so any help would be much appreciated!!!

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more
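
A broken pipe simply means the peer closed the connection mid-write, but it is worth checking whether sockets toward that peer are churning.  A minimal check, assuming the default storage_port of 7000 (the peer IP comes from the log line above):

    # Count TCP sessions to the peer named in the error, grouped by state.
    netstat -tan | awk '$5 == "10.8.44.98:7000" { state[$6]++ }
                        END { for (s in state) print s, state[s] }'
    # Cross-check that streams/hints are not backing up toward the peer.
    nodetool netstats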


From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 2:35 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Still looking for help!  We have stopped almost ALL traffic to the cluster and still some nodes are showing almost 1000% CPU for cassandra with no iostat activity.  We were running cleanup on one of the nodes that was not showing load spikes; however, now when I attempt to stop cleanup there via nodetool stop cleanup, the java task for stopping cleanup itself is at 1500% and has not returned after 2 minutes.  This is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing anything there but wanted to get ideas.

Thanks

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Tuesday, August 20, 2013 8:32 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Nodes get stuck

Hi all,

    We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior recently where 3 of our nodes get locked up in high load in what appears to be a GC spiral while the rest of the cluster (7 total nodes) appears fine.  When I run tpstats, I see the following (assuming tpstats returns at all) and top shows cassandra pegged at 2000%.  Obviously we have a large number of blocked reads.  In the past I could explain this due to unexpectedly wide rows; however, we have handled that.  When the cluster starts to melt down like this, it's hard to get visibility into what's going on and what triggered the issue as everything starts to pile on.  Opscenter becomes unusable, and because the affected nodes are in GC pressure, getting any data via nodetool or JMX is also difficult.  What do people do to handle these situations?  We are going to start graphing reads/writes/sec/CF to Ganglia in the hopes that it helps.
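
One low-cost way to keep some of that visibility when nodes stop answering is to snapshot nodetool output on a timer with a hard timeout, so there is history to inspect after a meltdown.  A minimal sketch; the paths, the 60 s interval, and the use of coreutils timeout are illustrative:

    # Periodic forensic snapshots; the timeout keeps one hung nodetool
    # call (e.g. during a GC spiral) from wedging the whole loop.
    while sleep 60; do
      ts=$(date +%Y%m%dT%H%M%S)
      timeout 30 nodetool tpstats > "/var/log/cassandra/tpstats.$ts" 2>&1
      timeout 30 nodetool cfstats > "/var/log/cassandra/cfstats.$ts" 2>&1
    done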

Thanks

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                       256       381     1245117434         0                 0
RequestResponseStage              0         0     1161495947         0                 0
MutationStage                     8         8      481721887         0                 0
ReadRepairStage                   0         0       85770600         0                 0
ReplicateOnWriteStage             0         0       21896804         0                 0
GossipStage                       0         0        1546196         0                 0
AntiEntropyStage                  0         0           5009         0                 0
MigrationStage                    0         0           1082         0                 0
MemtablePostFlusher               0         0          10178         0                 0
FlushWriter                       0         0           6081         0              2075
MiscStage                         0         0             57         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               0         0              0         0                 0
InternalResponseStage             0         0              6         0                 0
HintedHandoff                     1         1            246         0                 0

Message type           Dropped
RANGE_SLICE                482
READ_REPAIR                  0
BINARY                       0
READ                    515762
MUTATION                    39
_TRACE                       0
REQUEST_RESPONSE            29



Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
Thank you for responding.  I did a quick look and my mutation stage threads are currently in TIMED_WAITING (as expected, since tpstats shows no active or pending tasks); however, most of my read stage threads are RUNNABLE with the stack traces below.  I haven't dug into them yet but thought I would put them out there to see if anyone had any ideas, since we are currently in a production-down state.

Thanks all!

Most have the first stack:

java.nio.HeapByteBuffer.duplicate(HeapByteBuffer.java:107)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:69)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.countRunAndMakeAscending(TimSort.java:329)
java.util.TimSort.sort(TimSort.java:203)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1719
State: RUNNABLE
Total blocked: 1,005  Total waited: 913

Stack trace:
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:252)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)

Name: ReadStage:1722
State: RUNNABLE
Total blocked: 1,001  Total waited: 897

Stack trace:
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:58)
org.apache.cassandra.db.marshal.Int32Type.compare(Int32Type.java:26)
org.apache.cassandra.db.marshal.AbstractType.compareCollectionMembers(AbstractType.java:229)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:81)
org.apache.cassandra.db.marshal.AbstractCompositeType.compare(AbstractCompositeType.java:31)
java.util.TimSort.binarySort(TimSort.java:265)
java.util.TimSort.sort(TimSort.java:208)
java.util.TimSort.sort(TimSort.java:173)
java.util.Arrays.sort(Arrays.java:659)
java.util.Collections.sort(Collections.java:217)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:255)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:281)
org.apache.cassandra.utils.IntervalTree$IntervalNode.<init>(IntervalTree.java:280)
org.apache.cassandra.utils.IntervalTree.<init>(IntervalTree.java:72)
org.apache.cassandra.utils.IntervalTree.build(IntervalTree.java:81)
org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:175)
org.apache.cassandra.db.AbstractThreadUnsafeSortedColumns.delete(AbstractThreadUnsafeSortedColumns.java:40)
org.apache.cassandra.db.AbstractColumnContainer.delete(AbstractColumnContainer.java:51)
org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:224)
org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:182)
org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:154)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:143)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:122)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:96)
com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143)
com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138)
org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:157)
org.apache.cassandra.db.filter.QueryFilter.collateColumns(QueryFilter.java:136)
org.apache.cassandra.db.filter.QueryFilter.collateOnDiskAtom(QueryFilter.java:84)
org.apache.cassandra.db.CollationController.collectAllData(CollationController.java:293)
org.apache.cassandra.db.CollationController.getTopLevelColumns(CollationController.java:65)
org.apache.cassandra.db.ColumnFamilyStore.getTopLevelColumns(ColumnFamilyStore.java:1357)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1214)
org.apache.cassandra.db.ColumnFamilyStore.getColumnFamily(ColumnFamilyStore.java:1126)
org.apache.cassandra.db.Table.getRow(Table.java:347)
org.apache.cassandra.db.SliceFromReadCommand.getRow(SliceFromReadCommand.java:70)
org.apache.cassandra.db.ReadVerbHandler.doVerb(ReadVerbHandler.java:44)
org.apache.cassandra.net.MessageDeliveryTask.run(MessageDeliveryTask.java:56)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
java.lang.Thread.run(Thread.java:722)
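
One way to tie RUNNABLE stacks like these back to the CPU numbers from top is to match top's per-thread TIDs against the hexadecimal nid= values in a jstack dump.  A rough sketch; the awk field positions vary across top versions, so treat them as assumptions:

    # Grab a full dump once, then map the hottest native threads onto it.
    PID=$(pgrep -f CassandraDaemon | head -1)
    jstack "$PID" > /tmp/stacks.txt
    # $1 = native thread id, $9 = %CPU in many top builds.
    top -H -b -n 1 -p "$PID" | awk 'NR > 7 && $9 > 0 { print $1, $9 }' |
      sort -k2 -rn | head -5 |
      while read -r tid cpu; do
        printf '== tid %s at %s%% CPU (nid=0x%x) ==\n' "$tid" "$cpu" "$tid"
        grep -A 15 "nid=0x$(printf '%x' "$tid")" /tmp/stacks.txt
      done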


From: Sylvain Lebresne <sy...@datastax.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 6:21 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

A thread dump on one of the machine that has a suspiciously high CPU might help figuring out what it is that is taking all that CPU.


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:
Some last minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (usually can do closer to 20K reads at least) and seeing CPU load as follows for the nodes (with some ParNew to give an idea of GC):

001 – 1200%   (ParNew at 120 ms / sec)
002 – 6% (ParNew at 0)
003 – 600% (ParNew at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below and note that we are using LCS.  I have brought up the odd cfhistograms behavior on this list before and am not sure what's going on there.  We are in a production-down situation right now so any help would be much appreciated!!!

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more


From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 2:35 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Still looking for help!  We have stopped almost ALL traffic to the cluster and still some nodes are showing almost 1000% CPU for cassandra with no iostat activity.  We were running cleanup on one of the nodes that was not showing load spikes; however, now when I attempt to stop cleanup there via nodetool stop cleanup, the java task for stopping cleanup itself is at 1500% and has not returned after 2 minutes.  This is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing anything there but wanted to get ideas.

Thanks

From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Tuesday, August 20, 2013 8:32 PM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Nodes get stuck

Hi all,

    We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior recently where 3 of our nodes get locked up in high load in what appears to be a GC spiral while the rest of the cluster (7 total nodes) appears fine.  When I run tpstats, I see the following (assuming tpstats returns at all) and top shows cassandra pegged at 2000%.  Obviously we have a large number of blocked reads.  In the past I could explain this due to unexpectedly wide rows; however, we have handled that.  When the cluster starts to melt down like this, it's hard to get visibility into what's going on and what triggered the issue as everything starts to pile on.  Opscenter becomes unusable, and because the affected nodes are in GC pressure, getting any data via nodetool or JMX is also difficult.  What do people do to handle these situations?  We are going to start graphing reads/writes/sec/CF to Ganglia in the hopes that it helps.

Thanks

Pool Name                    Active   Pending      Completed   Blocked  All time blocked
ReadStage                       256       381     1245117434         0                 0
RequestResponseStage              0         0     1161495947         0                 0
MutationStage                     8         8      481721887         0                 0
ReadRepairStage                   0         0       85770600         0                 0
ReplicateOnWriteStage             0         0       21896804         0                 0
GossipStage                       0         0        1546196         0                 0
AntiEntropyStage                  0         0           5009         0                 0
MigrationStage                    0         0           1082         0                 0
MemtablePostFlusher               0         0          10178         0                 0
FlushWriter                       0         0           6081         0              2075
MiscStage                         0         0             57         0                 0
commitlog_archiver                0         0              0         0                 0
AntiEntropySessions               0         0              0         0                 0
InternalResponseStage             0         0              6         0                 0
HintedHandoff                     1         1            246         0                 0

Message type           Dropped
RANGE_SLICE                482
READ_REPAIR                  0
BINARY                       0
READ                    515762
MUTATION                    39
_TRACE                       0
REQUEST_RESPONSE            29



Re: Nodes get stuck

Posted by Sylvain Lebresne <sy...@datastax.com>.
A thread dump on one of the machine that has a suspiciously high CPU might
help figuring out what it is that is taking all that CPU.
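
A minimal way to grab a few dumps from a hot node, assuming a full JDK with jstack on the PATH; kill -3 is the fallback and writes the dump to the JVM's stdout log:

    # Three dumps, ten seconds apart, from the Cassandra JVM.
    PID=$(pgrep -f CassandraDaemon | head -1)
    for i in 1 2 3; do
      jstack -l "$PID" > "/tmp/threads.$PID.$(date +%s).txt" || kill -3 "$PID"
      sleep 10
    done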


On Wed, Aug 21, 2013 at 8:57 AM, Keith Wright <kw...@nanigans.com> wrote:

> Some last minute info on this to hopefully enlighten.  We are doing ~200
> reads and writes across our 7-node SSD cluster right now (usually can do
> closer to 20K reads at least) and seeing CPU load as follows for the nodes
> (with some ParNew to give an idea of GC):
>
> 001 – 1200%   (ParNew at 120 ms / sec)
> 002 – 6% (ParNew at 0)
> 003 – 600% (ParNew at 45 ms / sec)
> 004 – 900%
> 005 – 500%
> 006 – 10%
> 007 – 130%
>
> There are no compactions running on 001; however, I did see a broken pipe
> error in the logs there (see below).  Netstats for 001 shows nothing
> pending.  It appears that all of the load/latency is related to one column
> family.  You can see cfstats & cfhistograms output below and note that we
> are using LCS.  I have brought the odd cfhistograms behavior to the thread
> before and am not sure what's going on there.  We are in a production-down
> situation right now so any help would be much appreciated!!!
>
> Column Family: global_user
> SSTable count: 7546
> SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
> Space used (live): 83848742562
> Space used (total): 83848742562
> Number of Keys (estimate): 549792896
> Memtable Columns Count: 526746
> Memtable Data Size: 117408252
> Memtable Switch Count: 0
> Read Count: 11673
> Read Latency: 1950.062 ms.
> Write Count: 118588
> Write Latency: 0.080 ms.
> Pending Tasks: 0
> Bloom Filter False Positives: 4322
> Bloom Filter False Ratio: 0.84066
> Bloom Filter Space Used: 383507440
> Compacted row minimum size: 73
> Compacted row maximum size: 2816159
> Compacted row mean size: 324
>
> [kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
> users/global_user histograms
> Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
> 1               8866                 0                 0                 0              3420
> 2               1001                 0                 0                 0          99218975
> 3               1249                 0                 0                 0         319713048
> 4               1074                 0                 0                 0          25073893
> 5                132                 0                 0                 0          15359199
> 6                  0                 0                 0                 0          27794925
> 7                  0                12                 0                 0           7954974
> 8                  0                23                 0                 0           7733934
> 10                 0               184                 0                 0          13276275
> 12                 0               567                 0                 0           9077508
> 14                 0              1098                 0                 0           5879292
> 17                 0              2722                 0                 0           5693471
> 20                 0              4379                 0                 0           3204131
> 24                 0              8928                 0                 0           2614995
> 29                 0             13525                 0                 0           1824584
> 35                 0             16759                 0                 0           1265911
> 42                 0             17048                 0                 0            868075
> 50                 0             14162                 5                 0            596417
> 60                 0             11806                15                 0            467747
> 72                 0              8569               108                 0            354276
> 86                 0              7042               276               227            269987
> 103                0              5936               372              2972            218931
> 124                0              4538               577               157            181360
> 149                0              2981              1076           7388090            144298
> 179                0              1929              1529          90535838            116628
> 215                0              1081              1450         182701876             93378
> 258                0               499              1125         141393480             74052
> 310                0               124               756          18883224             58617
> 372                0                31               460          24599272             45453
> 446                0                25               247          23516772             34310
> 535                0                10               146          13987584             26168
> 642                0                20               194          12091458             19965
> 770                0                 8               196           9269197             14649
> 924                0                 9               340           8082898             11015
> 1109               0                 9               225           4762865              8058
> 1331               0                 9               154           3330110              5866
> 1597               0                 8               144           2367615              4275
> 1916               0                 1               188           1633608              3087
> 2299               0                 4               216           1139820              2196
> 2759               0                 5               201            819019              1456
> 3311               0                 4               194            600522              1135
> 3973               0                 6               181            454566               786
> 4768               0                13               136            353886               587
> 5722               0                 6               152            280630               400
> 6866               0                 5                80            225545               254
> 8239               0                 6               112            183285               138
> 9887               0                 0                68            149820               109
> 11864              0                 5                99            121722                66
> 14237              0                57                86             98352                50
> 17084              0                18                99             79085                35
> 20501              0                 1                93             62423                11
> 24601              0                 0                61             49471                 9
> 29521              0                 0                69             37395                 5
> 35425              0                 4                56             28611                 6
> 42510              0                 0                57             21876                 1
> 51012              0                 9                60             16105                 0
> 61214              0                 0                52             11996                 0
> 73457              0                 0                50              8791                 0
> 88148              0                 0                38              6430                 0
> 105778             0                 0                25              4660                 0
> 126934             0                 0                15              3308                 0
> 152321             0                 0                 2              2364                 0
> 182785             0                 0                 0              1631                 0
> 219342             0                 0                 0              1156                 0
> 263210             0                 0                 0               887                 0
> 315852             0                 0                 0               618                 0
> 379022             0                 0                 0               427                 0
> 454826             0                 0                 0               272                 0
> 545791             0                 0                 0               168                 0
> 654949             0                 0                 0               115                 0
> 785939             0                 0                 0                61                 0
> 943127             0                 0                 0                58                 0
> 1131752            0                 0                 0                34                 0
> 1358102            0                 0                 0                19                 0
> 1629722            0                 0                 0                 9                 0
> 1955666            0                 0                 0                 4                 0
> 2346799            0                 0                 0                 5                 0
> 2816159            0                 0                 0                 2                 0
> 3379391            0                 0                 0                 0                 0
> 4055269            0                 0                 0                 0                 0
> 4866323            0                 0                 0                 0                 0
> 5839588            0                 0                 0                 0                 0
> 7007506            0                 0                 0                 0                 0
> 8409007            0                 0                 0                 0                 0
> 10090808           0                 0                 0                 0                 0
> 12108970           0                 0                 0                 0                 0
> 14530764           0                 0                 0                 0                 0
> 17436917           0                 0                 0                 0                 0
> 20924300           0                 0                 0                 0                 0
> 25109160           0                 0                 0                 0                 0
>
> ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
> java.lang.RuntimeException: java.io.IOException: Broken pipe
> at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
> at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
> at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
> at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
> at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
> at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
> at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
> at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
> at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
> at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
> Caused by: java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
> at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
> at java.nio.channels.Channels.writeFully(Channels.java:98)
> at java.nio.channels.Channels.access$000(Channels.java:61)
> at java.nio.channels.Channels$1.write(Channels.java:174)
> at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
> at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
> at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
> at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
> at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
> at java.io.DataOutputStream.write(DataOutputStream.java:107)
> at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
> at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
> at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
> ... 9 more
>
>
> From: Keith Wright <kw...@nanigans.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Wednesday, August 21, 2013 2:35 AM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Re: Nodes get stuck
>
> Still looking for help!  We have stopped almost ALL traffic to the cluster
> and still some nodes are showing almost 1000% CPU for cassandra with no
> iostat activity.   We were running cleanup on one of the nodes that was not
> showing load spikes; however, now when I attempt to stop cleanup there via
> nodetool stop cleanup the java task for stopping cleanup itself is at 1500%
> and has not returned after 2 minutes.  This is VERY odd behavior.  Any
> ideas?  Hardware failure?  Network?  We are not seeing anything there but
> wanted to get ideas.
>
> Thanks
>
> From: Keith Wright <kw...@nanigans.com>
> Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Date: Tuesday, August 20, 2013 8:32 PM
> To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
> Subject: Nodes get stuck
>
> Hi all,
>
>     We are using C* 1.2.4 with Vnodes and SSD.  We have seen behavior
> recently where 3 of our nodes get locked up in high load in what appears to
> be a GC spiral while the rest of the cluster (7 total nodes) appears fine.
>  When I run tpstats, I see the following (assuming tpstats returns at
> all) and top shows cassandra pegged at 2000%.  Obviously we have a large
> number of blocked reads.  In the past I could explain this due to
> unexpectedly wide rows; however, we have handled that.  When the cluster
> starts to melt down like this, it's hard to get visibility into what's going
> on and what triggered the issue as everything starts to pile on.  Opscenter
> becomes unusable, and because the affected nodes are in GC pressure, getting
> any data via nodetool or JMX is also difficult.  What do people do to
> handle these situations?  We are going to start graphing
> reads/writes/sec/CF to Ganglia in the hopes that it helps.
>
> Thanks
>
> Pool Name                    Active   Pending      Completed   Blocked  All time blocked
> ReadStage                       256       381     1245117434         0                 0
> RequestResponseStage              0         0     1161495947         0                 0
> MutationStage                     8         8      481721887         0                 0
> ReadRepairStage                   0         0       85770600         0                 0
> ReplicateOnWriteStage             0         0       21896804         0                 0
> GossipStage                       0         0        1546196         0                 0
> AntiEntropyStage                  0         0           5009         0                 0
> MigrationStage                    0         0           1082         0                 0
> MemtablePostFlusher               0         0          10178         0                 0
> FlushWriter                       0         0           6081         0              2075
> MiscStage                         0         0             57         0                 0
> commitlog_archiver                0         0              0         0                 0
> AntiEntropySessions               0         0              0         0                 0
> InternalResponseStage             0         0              6         0                 0
> HintedHandoff                     1         1            246         0                 0
>
> Message type           Dropped
> RANGE_SLICE                482
> READ_REPAIR                  0
> BINARY                       0
> READ                    515762
> MUTATION                    39
> _TRACE                       0
> REQUEST_RESPONSE            29
>
>

Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
Some last minute info on this to hopefully enlighten.  We are doing ~200 reads and writes across our 7-node SSD cluster right now (usually can do closer to 20K reads at least) and seeing CPU load as follows for the nodes (with some ParNew to give an idea of GC; a way to pull these numbers from the GC log is sketched after the list):

001 – 1200%   (ParNew at 120 ms / sec)
002 – 6% (ParNew at 0)
003 – 600% (ParNew at 45 ms / sec)
004 – 900%
005 – 500%
006 – 10%
007 – 130%
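
The ParNew figures above can be approximated from the GC log, assuming GC logging was enabled via the flags that ship commented out in cassandra-env.sh; the grep/sed patterns are illustrative of the JDK 6/7 ParNew log line format:

    # Sum the wall-clock time of recent ParNew collections.
    grep ParNew /var/log/cassandra/gc.log | tail -1000 |
      sed -n 's/.*real=\([0-9.]*\) secs.*/\1/p' |
      awk '{ s += $1; n++ }
           END { if (n) printf "%d ParNew pauses, %.0f ms total\n", n, s * 1000 }'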

There are no compactions running on 001; however, I did see a broken pipe error in the logs there (see below).  Netstats for 001 shows nothing pending.  It appears that all of the load/latency is related to one column family.  You can see cfstats & cfhistograms output below and note that we are using LCS.  I have brought up the odd cfhistograms behavior on this list before and am not sure what's going on there.  We are in a production-down situation right now so any help would be much appreciated!!!
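
To double-check that one column family really is the outlier, one option is to rank every CF by the read latency cfstats reports.  A rough sketch over plain nodetool cfstats output; the awk field positions are assumed from the blocks shown below:

    nodetool cfstats | awk '
      /Column Family:/            { cf = $3 }
      /Read Latency:/ && cf != "" { print $3, cf; cf = "" }' |
      sort -rn | head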

Column Family: global_user
SSTable count: 7546
SSTables in each level: [2, 10, 106/100, 453, 6975, 0, 0]
Space used (live): 83848742562
Space used (total): 83848742562
Number of Keys (estimate): 549792896
Memtable Columns Count: 526746
Memtable Data Size: 117408252
Memtable Switch Count: 0
Read Count: 11673
Read Latency: 1950.062 ms.
Write Count: 118588
Write Latency: 0.080 ms.
Pending Tasks: 0
Bloom Filter False Positives: 4322
Bloom Filter False Ratio: 0.84066
Bloom Filter Space Used: 383507440
Compacted row minimum size: 73
Compacted row maximum size: 2816159
Compacted row mean size: 324

[kwright@lxpcas001 ~]$ nodetool cfhistograms users global_user
users/global_user histograms
Offset      SSTables     Write Latency      Read Latency          Row Size      Column Count
1               8866                 0                 0                 0              3420
2               1001                 0                 0                 0          99218975
3               1249                 0                 0                 0         319713048
4               1074                 0                 0                 0          25073893
5                132                 0                 0                 0          15359199
6                  0                 0                 0                 0          27794925
7                  0                12                 0                 0           7954974
8                  0                23                 0                 0           7733934
10                 0               184                 0                 0          13276275
12                 0               567                 0                 0           9077508
14                 0              1098                 0                 0           5879292
17                 0              2722                 0                 0           5693471
20                 0              4379                 0                 0           3204131
24                 0              8928                 0                 0           2614995
29                 0             13525                 0                 0           1824584
35                 0             16759                 0                 0           1265911
42                 0             17048                 0                 0            868075
50                 0             14162                 5                 0            596417
60                 0             11806                15                 0            467747
72                 0              8569               108                 0            354276
86                 0              7042               276               227            269987
103                0              5936               372              2972            218931
124                0              4538               577               157            181360
149                0              2981              1076           7388090            144298
179                0              1929              1529          90535838            116628
215                0              1081              1450         182701876             93378
258                0               499              1125         141393480             74052
310                0               124               756          18883224             58617
372                0                31               460          24599272             45453
446                0                25               247          23516772             34310
535                0                10               146          13987584             26168
642                0                20               194          12091458             19965
770                0                 8               196           9269197             14649
924                0                 9               340           8082898             11015
1109               0                 9               225           4762865              8058
1331               0                 9               154           3330110              5866
1597               0                 8               144           2367615              4275
1916               0                 1               188           1633608              3087
2299               0                 4               216           1139820              2196
2759               0                 5               201            819019              1456
3311               0                 4               194            600522              1135
3973               0                 6               181            454566               786
4768               0                13               136            353886               587
5722               0                 6               152            280630               400
6866               0                 5                80            225545               254
8239               0                 6               112            183285               138
9887               0                 0                68            149820               109
11864              0                 5                99            121722                66
14237              0                57                86             98352                50
17084              0                18                99             79085                35
20501              0                 1                93             62423                11
24601              0                 0                61             49471                 9
29521              0                 0                69             37395                 5
35425              0                 4                56             28611                 6
42510              0                 0                57             21876                 1
51012              0                 9                60             16105                 0
61214              0                 0                52             11996                 0
73457              0                 0                50              8791                 0
88148              0                 0                38              6430                 0
105778             0                 0                25              4660                 0
126934             0                 0                15              3308                 0
152321             0                 0                 2              2364                 0
182785             0                 0                 0              1631                 0
219342             0                 0                 0              1156                 0
263210             0                 0                 0               887                 0
315852             0                 0                 0               618                 0
379022             0                 0                 0               427                 0
454826             0                 0                 0               272                 0
545791             0                 0                 0               168                 0
654949             0                 0                 0               115                 0
785939             0                 0                 0                61                 0
943127             0                 0                 0                58                 0
1131752            0                 0                 0                34                 0
1358102            0                 0                 0                19                 0
1629722            0                 0                 0                 9                 0
1955666            0                 0                 0                 4                 0
2346799            0                 0                 0                 5                 0
2816159            0                 0                 0                 2                 0
3379391            0                 0                 0                 0                 0
4055269            0                 0                 0                 0                 0
4866323            0                 0                 0                 0                 0
5839588            0                 0                 0                 0                 0
7007506            0                 0                 0                 0                 0
8409007            0                 0                 0                 0                 0
10090808           0                 0                 0                 0                 0
12108970           0                 0                 0                 0                 0
14530764           0                 0                 0                 0                 0
17436917           0                 0                 0                 0                 0
20924300           0                 0                 0                 0                 0
25109160           0                 0                 0                 0                 0
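
Output shaped like the histogram above is typically what nodetool cfhistograms prints in C* 1.2: the first column is the bucket offset (boundaries grow roughly 20% per bucket), and the remaining columns count samples per bucket for SSTables per read, write latency (us), read latency (us), row size (bytes), and column count. A minimal sketch for collecting it, with hypothetical keyspace and column family names:

# Hypothetical names; substitute the real keyspace and column family.
# A count of N at offset X means N samples fell between the previous
# bucket boundary and X.
nodetool -h localhost cfhistograms my_keyspace my_cf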

ERROR [WRITE-/10.8.44.98] 2013-08-21 06:50:25,450 OutboundTcpConnection.java (line 197) error writing to /10.8.44.98
java.lang.RuntimeException: java.io.IOException: Broken pipe
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:59)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:30)
at org.apache.cassandra.db.ColumnFamilySerializer.serialize(ColumnFamilySerializer.java:73)
at org.apache.cassandra.db.Row$RowSerializer.serialize(Row.java:62)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:78)
at org.apache.cassandra.db.ReadResponseSerializer.serialize(ReadResponse.java:69)
at org.apache.cassandra.net.MessageOut.serialize(MessageOut.java:131)
at org.apache.cassandra.net.OutboundTcpConnection.write(OutboundTcpConnection.java:221)
at org.apache.cassandra.net.OutboundTcpConnection.writeConnected(OutboundTcpConnection.java:186)
at org.apache.cassandra.net.OutboundTcpConnection.run(OutboundTcpConnection.java:144)
Caused by: java.io.IOException: Broken pipe
at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
at sun.nio.ch.IOUtil.write(IOUtil.java:65)
at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
at java.nio.channels.Channels.writeFullyImpl(Channels.java:78)
at java.nio.channels.Channels.writeFully(Channels.java:98)
at java.nio.channels.Channels.access$000(Channels.java:61)
at java.nio.channels.Channels$1.write(Channels.java:174)
at java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:82)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:126)
at org.xerial.snappy.SnappyOutputStream.dump(SnappyOutputStream.java:297)
at org.xerial.snappy.SnappyOutputStream.rawWrite(SnappyOutputStream.java:244)
at org.xerial.snappy.SnappyOutputStream.write(SnappyOutputStream.java:99)
at java.io.DataOutputStream.write(DataOutputStream.java:107)
at org.apache.cassandra.utils.ByteBufferUtil.write(ByteBufferUtil.java:328)
at org.apache.cassandra.utils.ByteBufferUtil.writeWithLength(ByteBufferUtil.java:315)
at org.apache.cassandra.db.ColumnSerializer.serialize(ColumnSerializer.java:55)
... 9 more
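
A broken pipe here means the remote end (/10.8.44.98) dropped the TCP connection mid-write, which is consistent with a peer stalled in GC rather than a fault on the sending node. A hedged sketch for checking both sides; the address comes from the log above, while port 7000 is only the default inter-node storage port and may differ in cassandra.yaml:

# On the sending node: is there still an established storage-port
# connection to the peer named in the error?
ss -tn state established '( dport = :7000 or sport = :7000 )' | grep 10.8.44.98

# On the peer: sample GC activity. An O (old gen) column pinned near
# 100 with the FGC count climbing on every sample points at a GC spiral.
jstat -gcutil $(pgrep -f CassandraDaemon) 1000 5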


From: Keith Wright <kw...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Date: Wednesday, August 21, 2013 2:35 AM
To: "user@cassandra.apache.org" <us...@cassandra.apache.org>
Subject: Re: Nodes get stuck

Still looking for help!  We have stopped almost ALL traffic to the cluster and still some nodes are showing almost 1000% CPU for cassandra with no iostat activity.  We were running cleanup on one of the nodes that was not showing load spikes; however, now when I attempt to stop cleanup there via nodetool stop cleanup, the java task for stopping cleanup itself is at 1500% and has not returned after 2 minutes.  This is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing anything there but wanted to get ideas.

Thanks
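
One way to see where CPU like that is going, assuming the JDK tools are installed on the node, is to map the hottest native threads onto Java stacks; a minimal sketch:

# Find the Cassandra pid and the busiest native threads.
pid=$(pgrep -f CassandraDaemon)
top -H -b -n 1 -p "$pid" | head -40

# Dump Java stacks; top prints thread ids in decimal, jstack prints
# nid= in hex, so convert with: printf '%x\n' <tid>
jstack -l "$pid" > /tmp/cassandra-$(date +%s).jstack

If jstack itself hangs (common when the JVM is buried in GC), kill -3 "$pid" asks the JVM to write the same dump to Cassandra's stdout log instead.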


Re: Nodes get stuck

Posted by Keith Wright <kw...@nanigans.com>.
Still looking for help!  We have stopped almost ALL traffic to the cluster and still some nodes are showing almost 1000% CPU for cassandra with no iostat activity.  We were running cleanup on one of the nodes that was not showing load spikes; however, now when I attempt to stop cleanup there via nodetool stop cleanup, the java task for stopping cleanup itself is at 1500% and has not returned after 2 minutes.  This is VERY odd behavior.  Any ideas?  Hardware failure?  Network?  We are not seeing anything there but wanted to get ideas.

Thanks
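
For situations like this one, where even nodetool stops returning, GC logs written by the JVM itself are often the only usable record of what happened. A hedged example of enabling them in conf/cassandra-env.sh, using standard HotSpot flags for the Java 6/7 JVMs current at the time; the log path is an assumption:

JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"   # assumed path
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"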
