Posted to user@cassandra.apache.org by Patrik Modesto <pa...@gmail.com> on 2011/12/06 10:50:17 UTC

Cassandra not suitable?

Hi,

I'm quite desperate about Cassandra's performance in our production
cluster. We have 8 physical nodes, each with a 32-core CPU, 32 GB of
memory and 4 disks in RAID 10, running Cassandra 0.8.8 with RF=3, plus
Hadoop.
We have four keyspaces; one is the large one, with 2 CFs: one is a kind
of index, the other holds the data. There are about 7 million rows with
a mean row size of 7 kB. We run several mapreduce tasks; most of them
just read from Cassandra and write to HDFS, but one fetches rows from
Cassandra, computes something and writes it back: for each row we
compute three new JSON values, about 1 kB each (they get overwritten
the next round).
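
To give a picture of the write-back job, it is wired up roughly like
this (a trimmed sketch, not our real code; the class and CF names are
made up, and the job-setup calls are the 0.8 Hadoop integration as I
remember it):

    import java.nio.ByteBuffer;
    import java.util.List;
    import java.util.SortedMap;

    import org.apache.cassandra.db.IColumn;
    import org.apache.cassandra.thrift.Mutation;
    import org.apache.hadoop.mapreduce.Mapper;

    // One map() call per Cassandra row: compute the three ~1kB JSON
    // values from the row's columns and emit mutations that overwrite
    // them in the data CF (ColumnFamilyOutputFormat applies them).
    public class RecomputeMapper extends
            Mapper<ByteBuffer, SortedMap<ByteBuffer, IColumn>,
                   ByteBuffer, List<Mutation>> {
        @Override
        public void map(ByteBuffer key,
                        SortedMap<ByteBuffer, IColumn> columns,
                        Context context) {
            // build three Mutations carrying the new JSON columns,
            // then context.write(key, mutations); (elided here)
        }
    }

    // Job setup, roughly (0.8-era ConfigHelper, from memory):
    //   ConfigHelper.setInputColumnFamily(conf, "OurKeyspace", "data");
    //   ConfigHelper.setOutputColumnFamily(conf, "OurKeyspace", "data");
    //   ConfigHelper.setRpcPort(conf, "9160");
    //   ConfigHelper.setInitialAddress(conf, "10.2.54.91");
    //   ConfigHelper.setPartitioner(conf,
    //       "org.apache.cassandra.dht.RandomPartitioner");
    //   job.setInputFormatClass(ColumnFamilyInputFormat.class);
    //   job.setOutputFormatClass(ColumnFamilyOutputFormat.class);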

We get lots and lots of timeout exceptions, and LiveSSTableCount is
over 100. Repair doesn't finish even in 24 hours, and reads from the
other keyspaces time out as well. We set
compaction_throughput_mb_per_sec: 0 but it didn't help.

Did we choose the wrong DB for our use case?

Regards,
Patrik

This is from one node:

 INFO 10:28:40,035 Pool Name                    Active   Pending   Blocked
 INFO 10:28:40,036 ReadStage                        96       695         0
 INFO 10:28:40,037 RequestResponseStage              0         0         0
 INFO 10:28:40,037 ReadRepairStage                   0         0         0
 INFO 10:28:40,037 MutationStage                     1         1         0
 INFO 10:28:40,038 ReplicateOnWriteStage             0         0         0
 INFO 10:28:40,038 GossipStage                       0         0         0
 INFO 10:28:40,038 AntiEntropyStage                  0         0         0
 INFO 10:28:40,039 MigrationStage                    0         0         0
 INFO 10:28:40,039 StreamStage                       0         0         0
 INFO 10:28:40,040 MemtablePostFlusher               0         0         0
 INFO 10:28:40,040 FlushWriter                       0         0         0
 INFO 10:28:40,040 MiscStage                         0         0         0
 INFO 10:28:40,041 FlushSorter                       0         0         0
 INFO 10:28:40,041 InternalResponseStage             0         0         0
 INFO 10:28:40,041 HintedHandoff                     1         5         0
 INFO 10:28:40,042 CompactionManager               n/a        27
 INFO 10:28:40,042 MessagingService                n/a   0,16559

And here is the nodetool ring output:

10.2.54.91      NG          RAC1        Up     Normal  118.04 GB
12.50%  0
10.2.54.92      NG          RAC1        Up     Normal  102.74 GB
12.50%  21267647932558653966460912964485513216
10.2.54.93      NG          RAC1        Up     Normal  76.95 GB
12.50%  42535295865117307932921825928971026432
10.2.54.94      NG          RAC1        Up     Normal  56.97 GB
12.50%  63802943797675961899382738893456539648
10.2.54.95      NG          RAC1        Up     Normal  75.55 GB
12.50%  85070591730234615865843651857942052864
10.2.54.96      NG          RAC1        Up     Normal  102.57 GB
12.50%  106338239662793269832304564822427566080
10.2.54.97      NG          RAC1        Up     Normal  68.03 GB
12.50%  127605887595351923798765477786913079296
10.2.54.98      NG          RAC1        Up     Normal  194.6 GB
12.50%  148873535527910577765226390751398592512

Re: Cassandra not suitable?

Posted by Peter Schuller <pe...@infidyne.com>.
> I'm quite desperate about Cassandra's performance in our production
> cluster. We have 8 physical nodes, each with a 32-core CPU, 32 GB of
> memory and 4 disks in RAID 10, running Cassandra 0.8.8 with RF=3, plus
> Hadoop.
> We have four keyspaces; one is the large one, with 2 CFs: one is a kind
> of index, the other holds the data. There are about 7 million rows with
> a mean row size of 7 kB. We run several mapreduce tasks; most of them
> just read from Cassandra and write to HDFS, but one fetches rows from
> Cassandra, computes something and writes it back: for each row we
> compute three new JSON values, about 1 kB each (they get overwritten
> the next round).
>
> We get lots and lots of timeout exceptions, and LiveSSTableCount is
> over 100. Repair doesn't finish even in 24 hours, and reads from the
> other keyspaces time out as well. We set
> compaction_throughput_mb_per_sec: 0 but it didn't help.

Exactly why you're seeing timeouts would depend on quite a few
factors. In general, however, my observation is that you have ~100 GB
per node on nodes with 32 GB of memory, *and* you say you're running
mapreduce jobs.

In general, I would expect that any performance problems you have are
probably due to cache misses and simply bottlenecking on disk I/O.
What to do about it depends very much on the situation and it's
difficult to give a concrete suggestion without more context.

Some things that might mitigate the effects include using the row cache
for the hot data set (if you have a very small hot data set, that
should work well, since the row cache is unaffected by e.g. mapreduce
jobs), selecting a different compaction strategy (leveled can be
better, depending), or running mapreduce against a separate DC that
takes writes but is separated from the live cluster that takes reads
(unless you're only doing batch requests).

But those are just some random things thrown in the air; do not take
them as concrete suggestions for your particular case.
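
Purely as illustration of the first two knobs, in cassandra-cli syntax
("data" stands in for your data CF, the number is a placeholder, and
leveled compaction would mean upgrading to 1.0+):

    use YourKeyspace;
    update column family data with rows_cached=10000;
    update column family data with compaction_strategy=
        'org.apache.cassandra.db.compaction.LeveledCompactionStrategy';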

The key is understanding the access pattern and what the bottlenecks
are, in combination with how the database works - and figuring out what
the most cost-effective solution is.

Note that if you're bottlenecking on disk I/O, it's not surprising at
all that repairing ~100 GB of data takes more than 24 hours.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)

Re: Cassandra not suitable?

Posted by Jonathan Ellis <jb...@gmail.com>.
Sounds like you're simply throwing more seq scans at it via m/r than
your disk can handle.  iostat could confirm that disk is the
bottleneck.  But "real" monitoring would be better.
http://www.datastax.com/products/opscenter
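
For example, on each node while the jobs run:

    $ iostat -x 5

If %util is pinned near 100 and await climbs whenever the M/R jobs are
running, the disks are the bottleneck.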

On Thu, Dec 8, 2011 at 1:02 AM, Patrik Modesto <pa...@gmail.com> wrote:
> Hi Jake,
>
> I see the timeouts in the mappers as well as in the random-access
> backend daemons (for web services). There are now 10 mappers and 2
> reducers on each node. There is one big 4-disk RAID 10 array on each
> node, on which Cassandra lives together with HDFS. We store just a few
> GB of files on HDFS; otherwise we don't use it.
>
> Regards,
> P.
>
> On Wed, Dec 7, 2011 at 15:33, Jake Luciani <ja...@gmail.com> wrote:
>> Where do you see the timeout exceptions? In the mappers?
>>
>> How many mapper/reducer slots are you using?  What does your disk setup
>> look like? Do you have HDFS on the same disks as the Cassandra data dir?
>>
>> -Jake
>>
>>
>> On Tue, Dec 6, 2011 at 4:50 AM, Patrik Modesto <pa...@gmail.com>
>> wrote:
>>> [original message quoted in full; snipped]
>>
>>
>>
>>
>> --
>> http://twitter.com/tjake



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

Re: Cassandra not suitable?

Posted by Patrik Modesto <pa...@gmail.com>.
Hi Jake,

I see the timeouts in the mappers as well as in the random-access
backend daemons (for web services). There are now 10 mappers and 2
reducers on each node. There is one big 4-disk RAID 10 array on each
node, on which Cassandra lives together with HDFS. We store just a few
GB of files on HDFS; otherwise we don't use it.

Regards,
P.

On Wed, Dec 7, 2011 at 15:33, Jake Luciani <ja...@gmail.com> wrote:
> Where do you see the timeout exceptions? In the mappers?
>
> How many mapper/reducer slots are you using?  What does your disk setup
> look like? Do you have HDFS on the same disks as the Cassandra data dir?
>
> -Jake
>
>
> On Tue, Dec 6, 2011 at 4:50 AM, Patrik Modesto <pa...@gmail.com>
> wrote:
>> [original message quoted in full; snipped]
>
>
>
>
> --
> http://twitter.com/tjake

Re: Cassandra not suitable?

Posted by Jake Luciani <ja...@gmail.com>.
Where do you see the timeout exceptions? In the mappers?

How many mapper/reducer slots are you using?  What does your disk setup
look like? Do you have HDFS on the same disks as the Cassandra data dir?

-Jake

On Tue, Dec 6, 2011 at 4:50 AM, Patrik Modesto <pa...@gmail.com> wrote:

> [original message quoted in full; snipped]



-- 
http://twitter.com/tjake

Re: Cassandra not suitable?

Posted by Patrik Modesto <pa...@gmail.com>.
Thank you Jeremy, I've already changed the max.*.failures settings to
20; it helps jobs finish but doesn't solve the source of the timeouts.
I'll try the other tips.

Regards,
Patrik

On Wed, Dec 7, 2011 at 17:29, Jeremy Hanna <je...@gmail.com> wrote:
> If you're getting lots of timeout exceptions with mapreduce, you might take a look at http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
> We saw that and tweaked a variety of things - all of which are listed there.  Ultimately we also boosted Hadoop's tolerance for them, so that it could retry more, and it was just fine.  A coworker had the same experience running Hadoop over Elasticsearch - having to up that tolerance.  An example configuration for modifying that is shown in the link above.
>
> Hopefully that will help for your mapreduce jobs at least.  We've had good luck with MR/Pig over Cassandra, but only after some lessons learned wrt the configuration of both Cassandra and Hadoop.
>
> On Dec 6, 2011, at 3:50 AM, Patrik Modesto wrote:
>
>> [original message quoted in full; snipped]
>

Re: Cassandra not suitable?

Posted by Jeremy Hanna <je...@gmail.com>.
If you're getting lots of timeout exceptions with mapreduce, you might take a look at http://wiki.apache.org/cassandra/HadoopSupport#Troubleshooting
We saw that and tweaked a variety of things - all of which are listed there.  Ultimately we also boosted Hadoop's tolerance for them, so that it could retry more, and it was just fine.  A coworker had the same experience running Hadoop over Elasticsearch - having to up that tolerance.  An example configuration for modifying that is shown in the link above.
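
For instance, bumping the retry ceiling looks something like this in the job configuration (property names are from 0.20-era mapred; check them against your Hadoop version):

    <property>
      <name>mapred.map.max.attempts</name>
      <value>20</value>
    </property>
    <property>
      <name>mapred.reduce.max.attempts</name>
      <value>20</value>
    </property>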

Hopefully that will help for your mapreduce jobs at least.  We've had good luck with MR/Pig over Cassandra, but only after some lessons learned wrt the configuration of both Cassandra and Hadoop.

On Dec 6, 2011, at 3:50 AM, Patrik Modesto wrote:

> [original message quoted in full; snipped]